Last Updated: May 22, 2026
A collection is any object whose job is to hold a group of other objects and let you work with the group as a unit. Arrays are the simplest example, but arrays are fixed in size and limited in operations, so .NET ships a whole library of richer collection types built on top of them. This lesson maps the family: where the types live, how they relate, what each one is good at, and how to pick between them. The chapters that follow each take one collection and drill in. This one stays at the survey level on purpose.
An array is one shape, fixed length, integer index, contiguous in memory. That shape is fast and predictable, but it's wrong for half the things real code needs to do. A cart that grows as the user adds products doesn't have a known length up front. A lookup from product ID to product doesn't want integer indices, it wants string or Guid keys. A wishlist of unique items shouldn't allow duplicates. A line of pending orders should let you peel items off the front in arrival order. None of those map cleanly onto a raw T[].
You could build all of those out of arrays yourself, growing them, copying them, scanning them, deduplicating them. People did, for years, and the code was tedious and bug-prone. The Base Class Library (the standard library shipped with .NET) provides ready-made collection types so you don't have to. Each one is tuned for a different access pattern.
Here's what the same task looks like with a raw array versus a List<T>, which is the most common collection in C#:
List<T> wraps a private array and grows it as needed. You call Add, Remove, Insert, and the list reshapes its internal storage automatically. That convenience is what most collection types give you: a higher-level shape over a lower-level array, with the bookkeeping hidden.
The Base Class Library, usually abbreviated BCL, is the standard set of types that ships with .NET. It includes String, Math, Console, and most of what you've used so far. Collections live in two namespaces inside the BCL: System.Collections for the old, untyped collections from .NET 1.0, and System.Collections.Generic for the typed collections introduced in .NET 2.0. Generic collections are the default. You'll write using System.Collections.Generic; at the top of almost every file that uses them.
On modern projects (.NET 6 and later), System.Collections.Generic is part of the implicit usings, which means the import is added for you by the SDK. You'll still see the explicit using directive in older code, sample snippets, and any file where implicit usings are turned off. The runtime behaviour is identical either way; it's purely a question of whether the import is written in the file or supplied by the project.
List<T> doesn't allocate a new array on every Add. It doubles its internal capacity when it runs out of room, so the average cost of Add is O(1) (amortized). The doubling does mean a single Add can occasionally trigger a copy of every existing element, which matters in tight loops. Capacity tuning is covered in the List<T> chapter.
System.Collections.Generic is the namespace that holds the modern, type-safe collection types. The <T> in the namespace name is the same <T> you see on types like List<T> and Dictionary<TKey, TValue>. It marks them as generic, which means they take a type parameter at declaration time and remember it. A List<string> only stores strings. A List<Product> only stores Product objects. The compiler enforces this at compile time, and the runtime stores the items in their actual form without boxing.
Compare a generic list with the older, untyped ArrayList (which lives in the non-generic System.Collections namespace):
The ArrayList accepts anything because every item is stored as object. Getting items back requires a cast, and the cast can fail at runtime. The List<string> rejects the int at compile time. Generics give you safety without runtime checks, and they avoid the overhead of boxing value types into object on the way in.
This isn't just an ergonomic win. Generic collections store value types like int, double, and decimal without boxing, which means a List<int> of one million elements uses roughly four megabytes of int data, while an ArrayList of one million ints uses far more because each value is wrapped in a heap-allocated object. The non-generic types from System.Collections are still in the BCL for compatibility with old code, but new code should use the generic versions. The _Non-Generic Collections_ lesson covers the old types for the occasional time you'll run into them.
A few other related namespaces:
| Namespace | Purpose |
|---|---|
System.Collections.Generic | The modern, type-safe collections. The default. |
System.Collections | The original .NET 1.0 untyped collections (ArrayList, Hashtable). |
System.Collections.Concurrent | Thread-safe collections for concurrent code. |
System.Collections.Immutable | Collections that can't be modified once built. |
System.Collections.ObjectModel | Wrappers like ReadOnlyCollection<T> and ObservableCollection<T>. |
Most of this section lives in the first namespace.
An array isn't part of System.Collections.Generic. It's a built-in language feature, declared with bracket syntax (int[], string[]), and the compiler has direct knowledge of it. But arrays do implement several of the same interfaces as the generic collections, so they fit into the same family in a loose sense. Knowing where they differ helps you pick correctly.
Arrays are fixed-size. Once you create a string[5], it holds exactly five slots, forever. To "grow" an array, you allocate a new bigger one and copy. Collections like List<T> hide that copy behind their public API, but the underlying mechanic is the same.
Two other small differences come up often. Arrays use a property called Length, while almost every collection uses Count. The reason is historical: arrays use Length because they expose a fixed allocation size, and collections use Count because they expose how many items are currently in the container, which can change.
The other difference is what they specialise in. Arrays are best when you know the size in advance and never need to resize, or when you specifically want the lowest memory overhead. Collections are better when the size changes or when you need richer operations like search by key, ordering, or deduplication. For most application code (carts, order lists, product catalogues), collections win.
The next table summarises the trade-off:
| Feature | Array (T[]) | List<T> |
|---|---|---|
| Size at creation | Fixed | Grows automatically |
| Size property | Length | Count |
| Insert in middle | Manual copy required | Insert(index, item) |
| Remove from middle | Manual copy required | RemoveAt(index) |
| Sort | Array.Sort() | list.Sort() |
| Search by key | Linear scan | Linear scan (use Dictionary<TKey, TValue> for keys) |
| Memory overhead | Lowest | Slightly higher (capacity + count) |
Arrays still have a real place. Multi-dimensional arrays (int[,]), Span<T>-friendly buffers, and the lowest-overhead hot paths are all array territory. But for everyday "I need a bag of things that might grow," use a collection.
One other practical detail. The new[] { ... } array initialiser and the new List<T> { ... } collection initialiser look similar, but they produce different things. The array version produces a T[], the collection version produces a List<T>. Choose based on what the rest of the code wants to do with it.
Both forms accept a comma-separated list of initial values. C# 12 added a third unified form (collection expressions, the [1, 2, 3] syntax) that works for any collection type.
Most generic collections sit on top of a small set of interfaces. An interface in C# is a contract that says "any type that implements me must provide these members." Multiple concrete types can satisfy the same interface, which means code written against the interface works with all of them. For collections, the interface hierarchy looks roughly like this:
Read it from the top down. IEnumerable<T> is the most basic contract: "you can walk through my items one at a time." Anything you can foreach over implements IEnumerable<T>, including arrays and every collection in the namespace. ICollection<T> adds a Count property and the ability to add, remove, and check containment. The three sub-interfaces (IList<T>, ISet<T>, IDictionary<TKey, TValue>) each add a more specialised contract. Concrete types like List<T>, HashSet<T>, and Dictionary<TKey, TValue> sit at the leaves.
A few types don't fit cleanly into the IList/ISet/IDictionary split. Queue<T>, Stack<T>, and PriorityQueue<TElement, TPriority> implement ICollection<T> (or in PriorityQueue's case, only IEnumerable<T>) without claiming to be one of the three specialised interfaces, because their access patterns (FIFO, LIFO, priority-based) don't match. They're still part of the family, just on a different branch.
The interface layer matters because most well-written .NET methods take the broadest interface that does the job. A method that just iterates accepts IEnumerable<T>. A method that needs to count or check containment accepts ICollection<T>. A method that needs random access by index takes IList<T>. This means you can hand the method a List<T>, an array, or a custom type, and it works with all three.
A small example of the principle in action. The method below sums the values in any sequence of decimals, regardless of which concrete collection produced them.
SumPrices doesn't care what flavour of collection it gets. The signature IEnumerable<decimal> says "anything I can foreach over." That's the broadest contract the method can express, and it makes the method reusable across every callable shape. Accepting interfaces instead of concrete types is one of the easiest wins in collection-heavy code.
The rest of this section drills into each type one at a time. Here, each gets a paragraph: what it is, the access pattern it's tuned for, and when to use it. Treat this as a directory. If a type sounds right for a problem, the linked chapter has the details.
A List<T> is an ordered, indexable, growable collection backed by an internal array. You add items to the end, you remove from anywhere, you index with list[i], and you iterate with foreach. It permits duplicates. Random access by integer index is O(1), append is amortized O(1), and search by value is O(n). It's the default collection in C#. If you don't know which one to use, start with List<T>. Use it when you have a sequence of items in some meaningful order (cart items, recent orders, search results) and you need to add, remove, or iterate. Covered in the _List<T>_ lesson.
A LinkedList<T> is a doubly-linked list. Each node holds a value plus pointers to the previous and next nodes. You can't index into it with list[i]. To reach the third element, you walk from the head three times. Inserting and removing at known nodes is O(1), which is the appealing property, but you'd need to already have a reference to the node to take advantage of that. In practice, List<T> is faster for almost every realistic workload because the contiguous memory in an array plays nicely with the CPU cache, while linked-list nodes are scattered across the heap. Use LinkedList<T> only when you have a clear need to splice nodes in or out of the middle of a long sequence using existing node references. Covered in the _LinkedList<T>_ lesson.
A Dictionary<TKey, TValue> maps keys to values with O(1) average lookup, insert, and remove. Internally it uses a hash table: it hashes each key to a bucket, and most operations only touch that one bucket. Keys must be unique. The order of items isn't guaranteed; if you iterate, you get them in some internal order that you shouldn't rely on. This is the appropriate type when you need to look something up by a non-integer key: product ID to product, customer ID to customer, coupon code to discount. Use it whenever you'd otherwise write a loop that scans a list looking for a match. Covered in the _Dictionary<TKey, TValue>_ lesson.
(SKU here means "the product's identifier string." Different stores use different schemes.)
These are the two ordered cousins of Dictionary<TKey, TValue>. Both store keys in sorted order, so iterating gives you items in key order instead of an unpredictable bucket order. SortedDictionary uses a self-balancing binary tree internally and has O(log n) lookup. SortedList uses two parallel arrays and is more memory-efficient but slower at insertion when the new key isn't at the end. Use either when you specifically need keys in sorted order (a leaderboard by score, orders by timestamp) and accept the higher per-operation cost in exchange.
A HashSet<T> is a hash table of unique values. It stores items without duplicates, and it answers "is this item in the set?" in O(1). There's no order, and you can't index into it. Think of it as a Dictionary<TKey, TValue> where the value side is missing and the keys are the data. Use it when you need to track uniqueness (which products has this user viewed, which email addresses have already received a confirmation) or do set operations like union, intersection, and difference. Covered in the _HashSet<T>_ lesson.
A SortedSet<T> is to HashSet<T> what SortedDictionary is to Dictionary. It stores unique values in sorted order, using a self-balancing tree internally. Lookups are O(log n) instead of O(1), and iteration walks the items in sorted order. Use it when you need uniqueness and ordering at the same time (a sorted list of distinct customer ratings, a top-N leaderboard). Covered in the _SortedSet<T>_ lesson.
A Queue<T> is a first-in, first-out (FIFO) buffer. You Enqueue items at the back and Dequeue items from the front. Both operations are O(1) amortized. This is the appropriate shape for "a line of work to process in arrival order": orders waiting to be shipped, customer service tickets, jobs in a queue. Use it whenever the access pattern is "handle the oldest item next." Covered in the _Queue<T>_ lesson.
A Stack<T> is the opposite of a Queue<T>: last-in, first-out (LIFO). You Push items on top and Pop items off the top. Both are O(1). The classic uses are undo histories ("undo the most recent action first"), depth-first traversal, and parsing nested structures (parentheses, brackets, function calls). Use it when the most recent item is the one you want next. Covered in the _Stack<T>_ lesson.
A PriorityQueue<TElement, TPriority> is a queue where each item carries a priority value and the dequeue order is determined by priority, not arrival time. Lower priority value comes out first by default. Internally it's a binary heap, so Enqueue and Dequeue are O(log n). This is the appropriate type for "do the most important thing next" scenarios: a shipping queue where priority orders go first, an alert system where critical errors get handled before warnings, a task scheduler. Added in .NET 6, so older codebases won't have it. Covered in the _PriorityQueue<T, TPriority>_ lesson.
The express order leaves first because its priority value is the smallest, even though it was enqueued second.
PriorityQueue does not maintain a stable order for equal priorities. The two standard orders above could come out in either order on different runs (in practice they tend to be insertion order, but you shouldn't rely on it). If stability matters, encode a tiebreaker into the priority itself.
The collections we just covered handle most cases, but the BCL has three more groups.
The non-generic collections (ArrayList, Hashtable, Queue, Stack from System.Collections) are the original .NET 1.0 versions. They store everything as object, which means boxing for value types and casts on the way out. They exist for backwards compatibility with old code. Don't use them in new code.
The concurrent collections (ConcurrentDictionary<TKey, TValue>, ConcurrentQueue<T>, ConcurrentBag<T>, BlockingCollection<T>) are thread-safe. Multiple threads can read and write the same instance without locking from the caller. Most of the time, single-threaded collections are correct and faster, so use these only when you actually have multiple threads sharing state.
The immutable collections (ImmutableList<T>, ImmutableDictionary<TKey, TValue>, and so on, from System.Collections.Immutable) can't be modified after creation. Operations like Add return a new collection instead of mutating the existing one. This is useful for shared state across threads, functional-style code, and snapshots that mustn't change after being handed to a caller.
The read-only wrappers (ReadOnlyCollection<T>, ReadOnlyDictionary<TKey, TValue>, plus the read-only interfaces IReadOnlyList<T> and friends) wrap an existing mutable collection and expose only the read members. The underlying collection can still be modified by code that holds a reference to it, but the wrapper itself can't. This is the standard way to return a "view" of a collection from a method without letting the caller mutate it.
A short comparison:
| Group | Thread-safe? | Mutable? | When to use it |
|---|---|---|---|
Generic (List<T>, Dictionary<TKey, TValue>, ...) | No | Yes | Default for single-threaded code |
Concurrent (ConcurrentDictionary<TKey, TValue>, ...) | Yes | Yes | Shared across threads |
Immutable (ImmutableList<T>, ...) | Yes (because immutable) | No | Shared snapshots, functional style |
Read-only wrappers (ReadOnlyCollection<T>, ...) | Depends on inner | No (via the wrapper) | Exposing collections from public APIs |
Picking the right collection is mostly about answering three questions: do you need keys (lookup by key) or only values? Do you care about order? Do you need uniqueness? The decision tree falls out of those three:
The same logic in a table, including average-case complexity for the operations that usually drive the choice:
| Need | Collection | Lookup | Insert | Remove | Order | Duplicates |
|---|---|---|---|---|---|---|
| Indexed sequence | List<T> | O(n) by value, O(1) by index | O(1) amortized at end | O(n) | Insertion | Yes |
| Map by key | Dictionary<TKey, TValue> | O(1) | O(1) | O(1) | None | No (keys) |
| Map by sorted key | SortedDictionary<TKey, TValue> | O(log n) | O(log n) | O(log n) | Sorted by key | No (keys) |
| Set of unique values | HashSet<T> | O(1) | O(1) | O(1) | None | No |
| Sorted unique values | SortedSet<T> | O(log n) | O(log n) | O(log n) | Sorted | No |
| FIFO queue | Queue<T> | (peek front only) | O(1) at back | O(1) from front | FIFO | Yes |
| LIFO stack | Stack<T> | (peek top only) | O(1) at top | O(1) from top | LIFO | Yes |
| Priority queue | PriorityQueue<TElement, TPriority> | (peek smallest only) | O(log n) | O(log n) | By priority | Yes |
| Splice-heavy chain | LinkedList<T> | O(n) | O(1) at known node | O(1) at known node | Insertion | Yes |
A few rules of thumb to overlay on the table:
List<T> (indexed sequence) or Dictionary<TKey, TValue> (lookup by key), the choice usually becomes obvious from the three questions above. Most real bugs in collection choice are picking List<T> and scanning it instead of using Dictionary<TKey, TValue> for O(1) lookup.HashSet<T> for uniqueness with a List<T> for order, or you use a LinkedHashSet-style implementation from a NuGet package. We'll come back to this in the HashSet chapter.Dictionary<TKey, TValue> with terrible hash collisions becomes O(n), for example), but the average is what you should plan around.List<T>, switch when there's a reason" is a useful starting heuristic but a poor stopping point. The reason to switch shows up the first time the code does list.Any(x => x.Key == something), list.Distinct().ToList(), or list.OrderBy(...).First() in a hot path. Each of those is a hint that another collection type would do the same job in fewer operations.The other thing the table understates is that picking the right collection affects readability as much as performance. A Dictionary<string, Product> signals "this is a lookup table" at a glance, the same way a HashSet<string> signals "this is a uniqueness set" and a Queue<Job> signals "this is a pipeline of pending work." A List<T> of any of those things compiles fine but tells the reader less about intent. Treat the choice as a comment about what the data is for, not only a performance tuning knob.
The remaining lessons in this section break down into four loose groups:
The interface lessons cover the contracts the concrete types implement. IEnumerable<T> and ICollection<T> are the two most important for everyday code. IComparer<T> and IEqualityComparer<T> are how you tell a sorted or hash-based collection how to compare items when the default equality and ordering aren't what you want (sorting customers by total spend, treating SKUs as case-insensitive).
The concrete collection lessons each take one of the types we surveyed above and go deep: API, internal structure, performance characteristics, common pitfalls. Those lessons cover not only which one to pick, but how to use each one well.
The alternative-flavour lessons cover the variants for old code, multi-threaded code, immutable data, and read-only views. These are situational, not daily-driver topics, but they show up often enough that you should be able to recognise them.
The language features lessons cover the C# language constructs built around collections. yield lets you write iterators (methods that produce a sequence lazily, one item at a time, without building a whole list in memory). Collection expressions are the modern syntax for building any collection from a literal [1, 2, 3], introduced in C# 12.
The section ends with a capstone that builds a small e-commerce module pulling several of the collections together: a product catalogue keyed by SKU, a cart of items, a wishlist of unique products, a priority shipping queue, and an undo history for cart edits. That's what "I know the collections" looks like in practice.
Three behaviours apply to almost every collection in the namespace. The detailed coverage is in the specific lessons, but the survey wouldn't be complete without them.
First, most collections are reference types. A List<T> is a class, allocated on the heap. Assigning one variable to another copies the reference, not the contents. The two names point at the same list, and modifications through one are visible through the other.
Both names see two items, because they refer to the same List<string> instance. If you want an independent copy, you have to explicitly create one with new List<T>(cart) or cart.ToList(). Assuming "I copied the variable, I have two separate carts" is a common collection bug.
Second, modifying a collection while iterating over it usually throws. Every generic collection in System.Collections.Generic tracks an internal version number, and the enumerator created by foreach checks the version each step. If you Add or Remove during the loop, the next iteration throws InvalidOperationException.
The fix is to iterate over a snapshot (foreach (var order in orders.ToList())), or to use a for loop with explicit indices and walk backwards when removing, or to collect the items you want to remove first and call RemoveAll afterwards. The concurrent and immutable collections have different rules here, which is part of why they exist.
Third, `foreach` works on every collection in the namespace because they all implement IEnumerable<T>. You can iterate a List<T>, a Dictionary<TKey, TValue>, a HashSet<T>, a Queue<T>, or a Stack<T> the same way, and the same is true for arrays. The shape of the loop variable changes, of course (key-value pairs for dictionaries, plain values for the rest), but the syntax doesn't.
That uniformity is the whole point of IEnumerable<T>. Any code you write that takes IEnumerable<T> works with every collection in the BCL, plus arrays, plus LINQ query results, plus any custom iterator you build yourself.
Iterating a Dictionary<TKey, TValue> yields KeyValuePair<TKey, TValue> structs by value. That's fine for normal use, but if you only need the keys, iterate dict.Keys instead of pulling each pair to access .Key. The values-only equivalent is dict.Values. Both are slightly faster and read more clearly at the call site.
A fourth behaviour, even though the details belong to later chapters, is how equality is decided. Hash-based collections (Dictionary, HashSet, and so on) call Equals and GetHashCode on the items or keys to decide whether two values are the same. For value types like int, string, and decimal, the default implementations behave as expected: two equal values are treated as equal. For reference types like custom classes, the default implementations compare references, not contents, which means two distinct Product objects with the same fields are considered different items.
Two Product instances with identical content end up as two items in the HashSet, because the default equality on a class checks reference identity, not field-by-field equality. The fix is either to override Equals and GetHashCode on Product, to declare it as a record (records have built-in value equality), or to pass a custom IEqualityComparer<Product> when creating the set. For this lesson the takeaway is: hash-based collections do not automatically compare by content. If you put your own class in one, plan for how equality should work.