Dictionary<TKey, TValue> is the hash-based key-value store in the .NET base class library. It maps keys to values with average O(1) lookup, insert, and remove, which makes it a strong fit for looking something up "by id" instead of "by position." This lesson also covers the hashing basics that other collections in this section (HashSet, ConcurrentDictionary, ImmutableDictionary) reuse.
A List<T> is good for an ordered sequence walked from front to back. To answer "what's the product with id 42?" on a list of ten thousand products requires scanning until the match is found. That scan is O(n), which means doubling the catalog doubles the lookup time. A Dictionary<int, Product> answers the same question in roughly the same time whether the catalog has ten products or ten million.
The trade is that the dictionary doesn't preserve a meaningful order, doesn't allow duplicate keys, and uses extra memory for the bucket table. When those costs don't matter and lookup speed does, Dictionary is the default tool.
A dictionary keyed by product id:
The dictionary holds three entries. The lookup catalog[101] returns the Product whose key is 101 without scanning the other entries. The number 101 is hashed, the hash points to a bucket, and the runtime reads the product from that bucket.
The type parameters are the key type and the value type. The key is what the lookup uses. The value is what gets returned. They can be any types, but the key type's choice has consequences (equality, hashing) that the rest of this lesson works through.
Dictionary uses more memory than a list of the same size. Each entry stores the key, the value, the hash code, and a "next" link, and the bucket array has slack to keep load low. Use List for ten items where a linear scan is fine. Use Dictionary when lookup performance matters or when "by id" is the natural access pattern.
There are several ways to build a dictionary, and each has a real use case.
The simplest is the no-argument constructor:
This creates an empty dictionary with default initial capacity. The bucket array starts small and resizes as you add entries.
If the approximate entry count is known, pass a capacity hint:
The constructor allocates enough buckets up front to hold about that many entries without resizing. This matters when filling a dictionary in a loop with a known size, because each resize copies every entry into a larger bucket array, and copying ten thousand entries one resize after another is wasted work.
Resizing is O(n) at the moment it happens. Amortized over many Add calls, the cost per add is still O(1), but avoiding resizes by sizing up front is preferable.
The constructor also accepts an IEqualityComparer<TKey> to compare keys differently from the default. The most common reason is case-insensitive string keys, covered later in this chapter.
A dictionary can be built from any IEnumerable<KeyValuePair<TKey, TValue>>. That's how to copy one dictionary into another, or convert a sequence into a dictionary using LINQ's ToDictionary:
ToDictionary(p => p.Id) walks the list, picks the Id of each product as the key, and uses the whole product as the value. That's the idiomatic way to turn a list loaded from a database or a file into a fast lookup table.
The collection initializer is a compact literal form when the entries are known at compile time:
Each { key, value } pair calls Add(key, value). Because it's Add and not the indexer, two entries with the same key in a collection initializer throw at runtime.
There's also a newer syntax, the dictionary initializer, that uses indexer notation:
The two initializers look similar but compile to different calls. { key, value } calls Add. [key] = value calls the indexer setter. The practical difference is duplicate handling: Add throws on duplicates, the indexer overwrites. The next section covers why that matters.
This is a common stumbling block with Dictionary, so it deserves its own section.
Add(key, value) throws ArgumentException if the key is already present:
The original value 50 is unchanged because the failed Add doesn't touch the existing entry. This is the correct behavior when duplicates are unexpected and should raise an error. A catalog import that's supposed to have unique product ids should use Add, so a duplicate row in the source file becomes a loud failure.
The indexer assignment is the opposite. It overwrites without warning:
The first assignment creates the entry. The second replaces the value with 75. No exception, no warning, no record that the old value ever existed. This is the correct behavior when the intent is "set the current value, whatever it was", like updating a stock count after a sale.
The rule is short. Use Add when "this key is new" is part of the contract. Use the indexer when "this key now maps to this value" is the contract. Don't mix the two and expect one to behave like the other.
TryAdd (added in .NET Core 2.0) is the third option. It adds the entry if the key is new and returns true, or does nothing and returns false if the key already exists:
TryAdd fits "insert if absent" without writing the if (!dict.ContainsKey(...)) dict.Add(...) dance, and without paying the cost of an exception when the duplicate case is normal.
| Operation | If key absent | If key present | Throws on duplicate? |
|---|---|---|---|
Add(k, v) | inserts | does nothing to dictionary | yes |
dict[k] = v | inserts | overwrites | no |
TryAdd(k, v) | inserts, returns true | leaves value, returns false | no |
The indexer is the most direct way to read a value. It throws when the key isn't there:
KeyNotFoundException is raised by the indexer for a key that isn't in the dictionary. The exception is fine when the key really should always be present and missing means a bug. It's not fine when missing is a normal case, because using exceptions for normal control flow is slow and noisy.
For "the key might or might not be there," use TryGetValue:
TryGetValue returns true and writes the value into the out parameter when the key is present, or returns false and leaves the parameter at its default when the key is absent. No exception either way. This is the idiomatic shape for "look up the value if it exists" in C#.
GetValueOrDefault is shorter when there's no need to distinguish "missing" from "present but default":
The one-argument overload uses the type's default (zero for numbers, null for reference types). The two-argument overload uses the provided value. This is useful when the dictionary is conceptually "a count per id" and a missing key means zero.
A subtle point: GetValueOrDefault can hide bugs if the difference between "key absent" and "key present with the default value" matters. If a stock count of zero means "out of stock" but a missing key means "we don't carry this product at all," GetValueOrDefault collapses both into the same answer. Use TryGetValue when that distinction matters.
All three reads (indexer, TryGetValue, GetValueOrDefault) are O(1) average. They do the same hash-and-probe work internally. The choice is about what happens on a miss, not about speed.
ContainsKey answers "is this key in the dictionary?" in O(1) time on average:
Use ContainsKey when only the existence check matters and the value isn't needed. When both are needed, prefer TryGetValue, which does the lookup once instead of twice.
ContainsValue answers "does any entry have this value?" in O(n) time:
The O(n) cost is the important part. The bucket table indexes by key, not by value, so finding a value requires walking every entry and comparing. Frequent ContainsValue calls on a large dictionary suggest a need for a second dictionary keyed by the value, or a different data shape entirely.
ContainsKey is O(1) average, the same as a normal lookup. ContainsValue is O(n), which means a full scan of every entry. Avoid putting ContainsValue inside a hot loop on a large dictionary without considering the cost.
Remove(key) removes the entry with the given key and returns true if it was there, false if it wasn't:
There's also an overload that returns the removed value through an out parameter, useful to act on the value during removal:
Clear() empties the dictionary in one call:
Clear keeps the underlying bucket array allocated. If the dictionary is about to be refilled with similar size, this avoids a resize. When the dictionary is no longer needed, let it go out of scope and the garbage collector reclaims the memory.
A dictionary exposes two read-only views: Keys and Values. They're not copies. They're live views over the underlying data, so they reflect changes to the dictionary in real time:
The order in which keys come out is not guaranteed to be insertion order, and it's not sorted either. It depends on the bucket layout, which depends on the hash codes. For a specific order, sort the keys explicitly with Keys.OrderBy(...) or use a different collection like SortedDictionary.
To iterate keys and values together, foreach the dictionary directly. Each element is a KeyValuePair<TKey, TValue>:
Since C# 7, KeyValuePair supports deconstruction, the form most modern C# code uses:
var (id, name) unpacks the pair into two named variables. It reads better than pair.Key and pair.Value, and the IL it compiles to is the same.
One important rule: a dictionary cannot be modified while iterating it. Adding, removing, or replacing entries inside a foreach invalidates the enumerator, and the next iteration step throws InvalidOperationException with the message "Collection was modified."
The fix is to collect the keys to remove first, then remove them after the loop ends:
The first line builds a temporary list of keys to remove. The loop then removes them outside the iteration of the dictionary, so the enumerator is never invalidated.
The dictionary stores entries in an array of buckets. To find an entry, it hashes the key, mods the hash by the bucket count, and looks at that bucket. If two keys land in the same bucket (a collision), they're chained together in a linked list of entries, and the dictionary walks the chain comparing keys until it finds the right one or reaches the end.
The diagram shows a lookup of the key 102. The hash code of 102 is computed, the hash gets reduced to a bucket index (7), and bucket 7 holds a chain with three entries that all hashed to the same index. The dictionary walks the chain (entry 102, entry 218, entry 999) comparing keys with Equals until it finds 102 and returns its value.
When the chains stay short, every operation is O(1) on average: one hash, one bucket lookup, one or two equality checks. When the chains grow long (because too many keys collide, or because the bucket array hasn't been resized), operations degrade toward O(n) in the worst case. The dictionary fights this by resizing the bucket array when the load gets high, which redistributes the entries into more buckets and shortens the chains.
The flow for an Add operation:
This is what Add does. The indexer setter takes the same path but replaces the value at the "key already present" branch instead of throwing. TryAdd takes the same path but returns false at that branch instead of throwing.
The whole structure leans on two operations on the key: GetHashCode() and Equals(). If either is wrong, the dictionary stops working correctly. The next section covers that contract.
Every type in .NET inherits GetHashCode() and Equals(object) from object. The default implementations work for any type, but they may not suit the needs of a key.
The contract between these two methods is mandatory:
If a.Equals(b) is true, then a.GetHashCode() == b.GetHashCode() must also be true.
In plain words: equal objects must hash to the same code. The reverse doesn't have to hold (two unequal objects can hash to the same code, a collision, which is normal), but the forward direction is required.
Breaking this contract breaks the dictionary in subtle ways. An object just added appears "missing" because the lookup hashes to a different bucket than the insert did. Two equal keys can end up as two separate entries because they hash differently.
A broken example to make the point concrete:
a and b are equal because Equals was overridden to compare by Id. But GetHashCode wasn't overridden, so it uses the default reference-based hash, which is different for two different Customer instances. The dictionary stores a in one bucket. When asked about b, it hashes b to a different bucket and doesn't find anything there, so it reports the key is missing.
The fix is to override GetHashCode to be consistent with Equals:
Now a and b are equal and hash to the same code, so the dictionary treats them as the same key. Looking up by b finds the entry that was inserted with a.
A record type generates both overrides automatically based on its declared properties, which is one of the reasons records have become popular as dictionary keys:
Records generate value-based Equals and GetHashCode that compare the declared properties. As long as the listed properties match the meaning of "what makes two keys the same," this works directly.
For value types (int, string, Guid, DateTime, custom structs), the BCL already provides correct implementations. Strings hash by their character content. Integers hash by their bit pattern. Guids hash by their underlying bytes. These work as keys with no extra work.
Reference types defined by a user (plain class) use reference equality by default, which is the source of the bug above. For a class used as a dictionary key with value semantics, override both methods, or make it a record.
Hashing a key is called on every read, write, and check. If GetHashCode is slow (it loops through a thousand items, or it allocates), every dictionary operation pays that cost. For complex keys, cache the hash code in a field computed once if the type is immutable.
Sometimes the key type isn't yours to change, or the same type should behave one way in one dictionary and a different way in another. For example, string keys compared case-insensitively in one place and case-sensitively in another.
The dictionary constructor accepts an IEqualityComparer<TKey> for this case. The comparer provides the Equals and GetHashCode the dictionary uses, instead of the ones on the key type.
The BCL ships a set of ready-made string comparers in StringComparer. The most common is StringComparer.OrdinalIgnoreCase, which compares strings byte-by-byte but ignoring ASCII case differences:
The dictionary was constructed with StringComparer.OrdinalIgnoreCase, so all key comparisons ignore case. Looking up "MUG" finds the entry that was inserted as "mug", and looking up "notebook" finds the one inserted as "Notebook".
The default for Dictionary<string, T> is case-sensitive comparison. Without specifying a comparer, dict["MUG"] throws KeyNotFoundException even when dict["mug"] exists.
StringComparer has several variants:
| Comparer | Behavior |
|---|---|
StringComparer.Ordinal | Case-sensitive, byte-by-byte. The fastest. |
StringComparer.OrdinalIgnoreCase | Case-insensitive, byte-by-byte. Fast. |
StringComparer.InvariantCulture | Culture-aware but uses the invariant culture. |
StringComparer.InvariantCultureIgnoreCase | Culture-aware, case-insensitive, invariant. |
StringComparer.CurrentCulture | Uses the running thread's current culture. Avoid for stable keys. |
StringComparer.CurrentCultureIgnoreCase | Same, case-insensitive. Avoid. |
For most dictionary key scenarios where the strings are identifiers, codes, or values that aren't shown to users, the ordinal variants are the correct choice. They're the fastest and they don't change behavior based on the installed user language.
For other key types, a custom IEqualityComparer<T> works the same way: pass a comparer to the constructor and the dictionary defers all key comparisons to it.
The dictionary's bucket array grows as entries are added. Each growth roughly doubles the array and rehashes every entry into the new buckets, which is O(n) work at that moment.
If the total entry count is known up front, two options avoid the resizes:
EnsureCapacity (.NET Core 2.1+) sets the bucket array to a size that can hold at least the given number of entries without resizing. It returns the actual capacity, which is typically the next prime number at or above the request. The method does nothing if the existing capacity is already enough.
Output (typical):
The exact number depends on the runtime's internal sizing logic. The contract is "at least n," which is what callers need.
TrimExcess() does the opposite: it shrinks the bucket array to fit the current count, which can save memory after many entries have been removed. It's rarely necessary because dictionaries don't usually live long enough for that to matter, but it exists for the cases that need it.
Dictionary<TKey, TValue> is not thread-safe. Reading and writing it from multiple threads at the same time, or even writing from one thread while another reads, can corrupt the bucket array and produce undefined behavior, including infinite loops inside the dictionary's own code.
Reading from multiple threads at the same time is safe, but only if no thread is writing. As soon as a single thread writes, all readers and writers need to coordinate, either through a lock or by switching to a thread-safe collection.
For multi-threaded scenarios, use ConcurrentDictionary<TKey, TValue> from System.Collections.Concurrent. It provides atomic operations like AddOrUpdate, GetOrAdd, and TryUpdate, and its design assumes multiple threads will hit it at once.
ConcurrentDictionary is slower than Dictionary for single-threaded use because it pays the cost of internal synchronization even when only one thread is active. Use it only when concurrent access is needed. For single-threaded code, plain Dictionary is faster and uses less memory.
The short rule: one thread, Dictionary. Multiple threads with shared mutation, ConcurrentDictionary. Multiple threads reading a snapshot that never changes after construction, Dictionary is fine but consider ImmutableDictionary for clarity.
10 quizzes