AlgoMaster Logo

SortedSet<T>

Last Updated: May 17, 2026

18 min read

SortedSet<T> is a collection of unique elements that stays sorted at all times. It's backed by a self-balancing binary search tree (a red-black tree), so insertions, removals, and lookups all run in O(log n) and iteration visits items in sorted order. The trade-off compared to HashSet<T> is speed for order: every operation costs a logarithm instead of a constant, but you get sorted enumeration, range queries, and instant min/max for free.

Why a Sorted Set Exists

HashSet<T> is the right tool when all you need is "is this item in the collection?" with O(1) lookups. The price you pay is that iteration order is undefined and there's no concept of a smallest or largest element. As soon as you want to ask "what are the five cheapest products?" or "show me all prices between $20 and $50," a hash set falls over.

SortedSet<T> is the answer when you need both uniqueness and order. The set keeps items in ascending order according to a comparer, and exposes that order through properties like Min and Max, a Reverse() enumeration, and a GetViewBetween method that returns a live, sorted slice of the set.

Here's the smallest useful example. A catalog needs to track unique product prices that customers have seen, and show them in order:

Two things to notice. First, the duplicate 29.99m was silently ignored, just like with a HashSet<T>. Second, the iteration came out sorted ascending, even though the insertion order was scrambled. The set did the sort work as items were added, not when they were read, so the foreach is a simple in-order walk of the tree.

This is the shape of every SortedSet<T> interaction. You add items in whatever order they arrive, the set takes care of placement, and you read them back in order whenever you ask.

The Underlying Tree

The data structure inside SortedSet<T> is a red-black tree, a kind of self-balancing binary search tree. Every node holds one value, two child pointers (left and right), and a color (red or black). The tree maintains an invariant that no path from the root to a leaf is more than twice as long as any other path, which is what guarantees the O(log n) bound for all operations.

For a set holding the seven prices 9.99, 14.99, 19.99, 24.99, 29.99, 34.99, 49.99, the tree looks like this:

Reading the tree in order means an in-order traversal: left subtree, root, right subtree, recursively. That visit order is exactly 9.99, 14.99, 19.99, 24.99, 29.99, 34.99, 49.99, which is why foreach over a SortedSet<T> always gives you sorted output. The Min property is the leftmost node (9.99), and Max is the rightmost (49.99). The runtime walks pointers to find them, which is also O(log n) in the worst case but constant if the set caches them.

You don't need to know red-black tree mechanics to use the class. What matters is the cost profile that comes out of the design: O(log n) for Add, Remove, and Contains; sorted iteration without a separate sort step; and cheap access to the ends of the order.

Creating a SortedSet

The default constructor produces an empty set that sorts using the type's default comparer:

For built-in types like decimal, int, string, and DateTime, the default comparer is Comparer<T>.Default, which calls the type's IComparable<T>.CompareTo method. For numbers that means ascending numeric order. For strings it means ordinal-style culture-sensitive ordering (the same as string.Compare(a, b) with default settings).

You can seed the set from any IEnumerable<T>, which is useful when you already have data in a list or an array:

The constructor walks the source once, adding each element. Duplicates collapse, and the final set holds four distinct values in sorted order. The cost is O(n log n) for n source items, which matches the cost of sorting plus deduplication.

When the default order isn't what you want, supply your own IComparer<T>. Here's a working example that sorts strings by length first and then alphabetically:

The comparer decides every placement. "Home" and "Toys" are both length 4, so they're compared alphabetically (Home before Toys). "Books" is length 5. "Electronics" is length 11. The output reflects the custom order exactly.

The comparer is stored as a property on the set, available as prices.Comparer. Once a set is created with a particular comparer, that comparer is used for every subsequent operation: Add, Remove, Contains, GetViewBetween, and all the set operations. You can't swap it out later. If you need a different order, create a new set with a different comparer and copy the elements over.

You can also combine both options, seed from an existing collection and pass a comparer:

The duplicate "Toys" is removed during construction, and the rest are sorted according to the custom comparer.

Add, Remove, and Contains

Add(T item) returns a bool. true means the item was new and is now in the set, false means the set already contained an equal item and nothing changed. This is the same shape as HashSet<T>.Add, but the equality check goes through the comparer rather than IEqualityComparer<T>.

Two items are considered equal under a comparer when Compare(a, b) == 0. For decimal with the default comparer, that means strict numeric equality. For the custom LengthThenAlpha comparer above, two different strings with the same length and same characters would be equal, but more interesting cases pop up with culture-sensitive comparisons where different Unicode strings can compare equal under some collations.

Remove(T item) also returns a bool. true means the item was found and removed, false means it wasn't in the set:

The first call locates 19.99m in O(log n) by walking the tree, removes the node, and rebalances. The second call walks the tree, doesn't find a match, and returns false.

Contains(T item) is the pure-lookup version. No mutation, just a yes/no answer in O(log n):

The tree walk for Contains follows the same path the tree took during Add. At each node, the comparer decides whether to go left, right, or stop. A miss bottoms out at a null child pointer.

Count is a property, not a method, and it runs in O(1). The set maintains the count as a field and updates it on every add and remove:

The constructor saw four items but one was a duplicate, so the resulting count is three.

Min and Max

Two properties make sorted sets especially useful for "give me the cheapest" or "give me the latest" queries: Min and Max. Both return the value at the corresponding end of the order, without enumerating the rest of the set.

Internally, Min walks the tree from the root, always taking the left child until there are no more left children. Max does the mirror walk on the right. Both touch O(log n) nodes in the worst case.

On an empty set, both properties return default(T). For reference types and nullable value types, that's null. For non-nullable value types like int or decimal, that's 0, which can be ambiguous with a real value of zero:

A Min of $0.00 doesn't mean the set contains zero. It means the set is empty and you got the type's default value back. Check Count > 0 before reading Min or Max when an empty set is a real possibility.

These two properties are what makes a SortedSet<T> natural for "top-K" style work. To find the cheapest or priciest item among a stream of incoming prices, keep them in a SortedSet<T> and read Min or Max whenever you need the answer:

The pattern is "add the new item, then drop the smallest if the set has grown past K." At any moment, the set holds the K largest values seen so far. The cost of each iteration is O(log K), which is much cheaper than re-sorting a list of all prices.

A symmetric pattern works for "K smallest": drop Max instead of Min when the size exceeds K.

GetViewBetween

GetViewBetween(lower, upper) returns a SortedSet<T> that represents a slice of the original set. The slice is inclusive on both ends and contains exactly the elements x where lower <= x <= upper. The crucial detail is that the returned set is not a copy. It's a live view that shares storage with the source.

The view returned by GetViewBetween is itself a SortedSet<decimal>, with the same API. You can iterate it, ask for Min, Max, Count, or call Contains on it. All those operations are restricted to the slice. midRange.Min is $14.99, not $4.99.

The "live view" part means two things. Mutations to the original set show up in the view, and mutations through the view show up in the original set:

24.99m falls inside [10, 30], so adding it through the original makes it visible in the view immediately. Removing 14.99m through the view removes the actual node from the underlying tree, so the original no longer contains it.

Trying to add a value outside the view's range through the view throws ArgumentOutOfRangeException:

The view enforces its boundaries on writes. You can only add through the view what fits in the view's range, which makes the abstraction safe to compose with other code.

Two useful patterns build on GetViewBetween. The first is range queries against a stable source: find all prices in a band, all orders in a date range, all leaderboard scores between two thresholds. The second is bulk removal of a range:

You can't safely modify a SortedSet<T> while iterating over it (or any view of it). The pattern above copies the view into a list first, then removes each element from the original. Snapshot, iterate, mutate.

Reverse

Reverse() returns an IEnumerable<T> that yields the set's elements in descending order. It does not modify the set. It's an iteration helper that walks the tree in reverse in-order (right, root, left):

The first loop walks the tree right-to-left, the second walks it left-to-right. The set itself doesn't change between the two. There's no allocation of a reversed list, the enumerator just visits nodes in the opposite order.

Pairing Reverse() with LINQ's Take gives a clean "top K largest" without manual size management:

Note that Reverse() on a SortedSet<T> is its own method, distinct from the LINQ Enumerable.Reverse<T>() extension. The LINQ version works on any IEnumerable<T> but materializes the whole sequence into a buffer first before yielding it backwards. The instance method on SortedSet<T> doesn't need to buffer because the tree can be walked in either direction natively. It's faster and allocates less.

Same output, different mechanics. For sorted sets, prefer the instance method.

Set Operations

SortedSet<T> implements all the set algebra methods that HashSet<T> does, with the same names and semantics. The _HashSet&lt;T&gt;_ lesson covered these in depth, so this section is a quick recap with sorted-set specifics, plus the order-aware extras.

The mutating operations:

MethodWhat it does
UnionWith(other)Adds every element from other that isn't already in the set.
IntersectWith(other)Removes every element from the set that isn't also in other.
ExceptWith(other)Removes every element from the set that is in other.
SymmetricExceptWith(other)Keeps elements that are in one set or the other, but not both.

The query operations (return bool, don't mutate):

MethodWhat it asks
IsSubsetOf(other)Every element in this set is also in other.
IsSupersetOf(other)Every element in other is also in this set.
IsProperSubsetOf(other)Subset, and other has at least one extra element.
IsProperSupersetOf(other)Superset, and this set has at least one extra element.
Overlaps(other)At least one element is shared.
SetEquals(other)Same elements, ignoring order.

A worked example, applied to product categories a customer has shown interest in:

Notice how every result is sorted. That's the order-aware part: where HashSet<T> returns the same logical set but in undefined enumeration order, SortedSet<T> enumerates the result in the comparer's order. Same algebra, ordered output.

The semantics of "equal" inside these operations are decided by the comparer, not by Equals/GetHashCode. Two strings that compare equal under a case-insensitive comparer would be treated as the same element during a UnionWith, even if they differ in case. This is one place where the choice of comparer at construction time has lasting consequences.

The order-aware extras worth knowing are not extra methods, they're the iteration guarantees. UnionWith, IntersectWith, ExceptWith, and SymmetricExceptWith all leave the resulting set in sorted order without any extra work, because the underlying tree never accepts an unsorted state. There's no OrderBy step needed after a set operation, which is one of the small ergonomic wins of using SortedSet<T> for problems that involve both algebra and presentation.

The query operations don't depend on order at all. IsSubsetOf returns the same bool whether you ask it on a HashSet<T> or a SortedSet<T> with the same elements. They're convenience over the same underlying truth.

There's one mild gotcha. The set algebra methods accept any IEnumerable<T> as the "other" argument, not just another SortedSet<T>. If the other side is something like a List<T> that may contain duplicates, the methods do the right thing (the set treats the input as a logical set and ignores duplicates), but the work is O((n + m) log n) where m is the count of the input. Passing in another SortedSet<T> with the same comparer enables some fast paths internally, but the asymptotic cost is the same.

RemoveWhere and Bulk Removal

RemoveWhere(Predicate<T>) removes every element for which the predicate returns true, and returns the number removed:

The method walks the tree and removes each matching node, rebalancing as it goes. Total cost is O(n log n) in the worst case for a set of size n, because each removal is O(log n) and you may remove up to n elements.

RemoveWhere is the only safe way to mutate a SortedSet<T> during a single logical scan. If you tried to iterate the set with foreach and call Remove from inside the loop, you'd get InvalidOperationException: Collection was modified. Modifying a tree while an enumerator is walking it invalidates the enumerator. RemoveWhere handles this correctly by snapshotting matches internally before mutating.

A typical use is purging stale entries from a leaderboard. Suppose scores are stored as decimal (with a custom comparer so the highest score sorts first, which we'll wire up shortly), and you want to drop everyone below some threshold:

The set still iterates in ascending order, which is the natural order for int. For a "highest first" leaderboard, you'd either use Reverse() or build the set with a descending comparer (covered in the _A Leaderboard With a Custom Comparer_ section below).

The bulk-removal alternative is to use GetViewBetween plus a snapshot+iterate dance, but RemoveWhere is simpler when the criterion is a free-form predicate rather than a contiguous range.

A Leaderboard With a Custom Comparer

The pieces come together in a small leaderboard example. The set holds player records, sorted descending by score so that Min gives the lowest score and Max gives the top scorer. Two records with the same score are broken by name ascending.

Two details worth pointing out. First, Min is the top scorer here because the comparer sorts descending by score. The set has no concept of "high" or "low" in absolute terms, only "leftmost" and "rightmost" in the comparer's order, and the comparer puts the highest score on the left. Second, Alice and Eve both have a score of 1200, but they coexist in the set because the comparer breaks the tie by name. If the comparer treated them as equal (returning 0), only the first inserted would have stayed and the second Add would have returned false.

That last point is the load-bearing one for tie-breaking. SortedSet<T> decides "is this a duplicate?" by asking the comparer for zero. If your comparer ignores some field, items differing only in that field are considered duplicates. Always include enough discriminators in the comparer to make every distinct logical entity compare non-zero against every other.

Top-K and bottom-K work the same way:

Take(3) walks the tree forward (which is descending by score because of the comparer), Reverse().Take(2) walks it backward. Both are O(K log n) for K results, which is much cheaper than sorting the whole list each time you need a snapshot.

SortedSet vs HashSet

Both types implement ISet<T>. The choice between them comes down to whether order matters and what cost you can afford on the common operations.

AspectHashSet<T>SortedSet<T>
Underlying structureHash tableRed-black tree
AddO(1) average, O(n) worstO(log n)
RemoveO(1) averageO(log n)
ContainsO(1) averageO(log n)
CountO(1)O(1)
Iteration orderUndefinedAscending by comparer
Min / MaxNot providedO(log n)
GetViewBetweenNot providedO(1) construction, live view
Reverse()Not providedO(1) construction, O(n) full walk
Equality modelIEqualityComparer<T> (Equals + GetHashCode)IComparer<T> (Compare returns 0)
Memory per elementLower (one slot in a bucket array)Higher (tree node with pointers and color)
Best forMembership tests, dedupOrdered enumeration, range queries, top-K

The rule of thumb is: if all you ask is "is x in the set?" use HashSet<T>. The moment you ask "what's the smallest?", "what's between A and B?", or "give me everything in order," reach for SortedSet<T> and accept the log-factor cost.

A small mental model helps. A HashSet<T> is a wide, shallow lookup. It puts items in approximately random slots in a big array, and Contains is a hash-and-jump. A SortedSet<T> is a deep, balanced lookup. It puts items in tree nodes that maintain order, and Contains is a guided walk from the root.

There are two surprises that catch people. The first is the equality model. HashSet<T> uses Equals and GetHashCode, so two records that hash the same but compare differently under a comparer might collide differently than expected. SortedSet<T> uses only Compare, so equality is whatever returns zero. Mixing the two models on the same custom type is a common source of subtle bugs.

The second is the constant factor. The O(1) advertised for HashSet<T> is amortized average. A bad hash function or pathological collision pattern can degrade individual operations significantly. SortedSet<T> has no such failure mode (the worst case is the same as the average case, O(log n)), but the constant factor on each operation is higher because tree walks involve more pointer-chasing than a single array index.

In practice, HashSet<T> wins for raw throughput on uniform data. SortedSet<T> wins for predictable performance and order-aware queries. If you're not sure, start with HashSet<T> and switch only when you find yourself sorting the results or asking range questions.

Iteration and Enumerator Behavior

SortedSet<T> implements IEnumerable<T>, so foreach works the way it does for every other collection. The enumerator walks the tree in in-order, which produces ascending output:

The enumerator is invalidated if the set changes during iteration. Calling Add, Remove, Clear, UnionWith, IntersectWith, ExceptWith, SymmetricExceptWith, or RemoveWhere while an enumerator is live throws InvalidOperationException the next time you advance the enumerator:

The right pattern when you need to remove during a logical scan is either RemoveWhere (which handles the bookkeeping internally) or a snapshot + iterate + mutate pattern:

Snapshot the values you want to delete during the iteration. Mutate the set after the iteration finishes. The two phases never overlap.

Iteration through a view returned by GetViewBetween follows the same rules. Mutating the original set during a view enumeration invalidates the view's enumerator just like a direct enumeration:

The first iteration sees $14.99, the Remove invalidates the underlying tree's version counter, and the next MoveNext on the view throws.

The lesson is the same as for any in-place mutation: separate the read phase from the write phase, or use a method like RemoveWhere that's built to handle both at once.

Common Use Cases

The use cases where SortedSet<T> is the right choice cluster around three patterns: ordered uniqueness, range queries, and top-K maintenance.

Ordered uniqueness comes up whenever you need to deduplicate a stream of values and display them in order. Price points seen during a session, distinct order amounts, unique tags applied to products, the set of dates orders were placed: all natural fits.

The set absorbs the duplicates and presents the unique dates in chronological order, all in one step.

Range queries are the second pattern. The catalog wants to show all products with prices between two values, or all orders from the last seven days, or all users with scores in some bracket. GetViewBetween does the work in O(1) for the construction and O(K) for iterating the K matching elements:

Compare this to a List<decimal> approach, which would require either sorting first or scanning every element to find matches. The sorted set has already done the sort work, so the view is just an in-order walk from the lower bound to the upper bound.

Top-K maintenance is the third pattern. A leaderboard, a "K most popular products," a "five cheapest shipping options," a sliding window of recent prices: any time you want to keep a small ordered slice of an arbitrarily-large stream:

The set's capacity is bounded by K. Each new incoming price costs O(log K) for the add and possibly another O(log K) for the eviction. Total cost across N inputs is O(N log K), which scales gracefully even when K is small and N is large.

There's a fourth scenario where SortedSet<T> is the wrong answer despite looking like a fit: when you have many duplicates by some "key" but want to keep them all. A leaderboard with possibly-identical scores from different players (which is what the earlier example showed) only works because the comparer breaks ties by name. If two true duplicates exist by every measure you care about, a SortedSet<T> will silently collapse them. For those cases, reach for a SortedDictionary<TKey, List<TValue>> or a SortedList<TKey, List<TValue>> to keep all the duplicates indexed under each key.

Performance Summary

To wrap the cost model into one place, here are the operations you'll touch most often along with their complexity:

OperationCostNotes
Add(item)O(log n)Walks the tree, inserts, may rebalance. Returns bool.
Remove(item)O(log n)Walks, removes, may rebalance. Returns bool.
Contains(item)O(log n)Tree walk, no mutation.
CountO(1)Maintained as a field.
Min / MaxO(log n)Walk to leftmost or rightmost node.
GetViewBetween(lo, hi)O(1)Stores bounds, no copy.
view.CountO(k)Walks the slice to count.
Reverse() (instance)O(1) to construct, O(n) full walkTree walk in reverse in-order, no buffering.
Iteration with foreachO(n) totalAmortized O(1) per MoveNext.
UnionWith(other)O((n + m) log (n + m))Adds each element of other.
IntersectWith(other)O((n + m) log n)
ExceptWith(other)O(m log n)
RemoveWhere(pred)O(n log n) worst caseCost of n removals.
Constructor from IEnumerable<T>O(n log n)Each insert is O(log n).
Memory per element~5x a HashSet<T> entryTree node has color + 2 pointers + parent.

The numbers all assume the comparer runs in O(1), which is true for primitive types and most simple comparers. If the comparer itself is expensive (for example, a culture-sensitive string compare on long strings), multiply every cost by the comparer's actual cost. A red-black tree of n strings with average length L and a culture-sensitive compare is O(L log n) per operation, not O(log n).

The big-picture takeaway is that everything in SortedSet<T> is bounded by O(log n), there are no hidden O(n) operations or pathological worst cases, and there's no allocation per Contains or per Add (the tree nodes are pre-allocated as part of the data structure). It's a predictable, well-behaved container at the cost of being a small constant factor slower than a hash set on lookups.

Summary

  • SortedSet<T> stores unique elements in sorted order using a self-balancing red-black tree. Iteration always yields elements in ascending order according to the set's comparer.
  • Add, Remove, and Contains all run in O(log n). Count is O(1). Min and Max walk to the ends of the tree in O(log n) and return the smallest and largest elements.
  • GetViewBetween(lo, hi) returns a live, inclusive slice of the set in O(1). Mutations propagate in both directions between the view and the original. Reading view.Count is O(k) in the slice size, not O(1).
  • Reverse() (the instance method) returns the elements in descending order by walking the tree backwards, without buffering. It's faster than the LINQ Enumerable.Reverse extension on SortedSet<T>.
  • All HashSet<T> set operations (UnionWith, IntersectWith, ExceptWith, SymmetricExceptWith, IsSubsetOf, IsSupersetOf, Overlaps, SetEquals) are available, with results always in sorted order. Equality of elements is decided by the comparer's zero return, not by Equals and GetHashCode.
  • RemoveWhere(predicate) is the safe way to mutate during a logical scan; mutating the set inside an external foreach throws InvalidOperationException.
  • Pick SortedSet<T> when order, range queries, or top-K access matter. Pick HashSet<T> when only membership testing matters; the hash-based version is faster on average (O(1) vs O(log n)) but unordered.
  • Custom ordering goes through IComparer<T> passed at construction time. The comparer is fixed for the life of the set; switching orders means building a new set.