String Immutability & Interning

Last Updated: May 22, 2026

In C#, a string is immutable. Once a string object exists on the heap, its characters never change. Every method that looks like it "edits" a string actually returns a brand new string, and the original sits unchanged until the garbage collector cleans it up. This chapter walks through what immutability really means, why C# chose it, what it costs in tight loops, and a related optimization called string interning that lets identical literals share one heap entry.

What Immutability Means

A string in C# is a reference type whose contents are fixed at the moment it's created. There's no string.SetChar(i, c) method, no way to assign into name[2], and no built-in mutation API at all. Every operation that seems to change a string returns a new string and leaves the original alone.
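A minimal sketch of this behavior, assuming a variable named brand that holds "apple" and a second variable named upper for the result:

```csharp
using System;

string brand = "apple";
string upper = brand.ToUpper();   // builds and returns a new string

// brand[0] = 'A';                // does not compile: the string indexer is read-only

Console.WriteLine(brand);                                 // apple
Console.WriteLine(upper);                                 // APPLE
Console.WriteLine(object.ReferenceEquals(brand, upper));  // False: two separate objects
```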

brand.ToUpper() did not turn brand into "APPLE". It returned a new string "APPLE" and assigned it to upper. The original "apple" is still on the heap, still spelled in lowercase, still pointed to by brand.

The same is true for Replace, Substring, Trim, Insert, Remove, PadLeft, and every other "transforming" method on string. They all build a new string. If you don't capture the return value, the work is lost.
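A short sketch of the discarded-result mistake, assuming a productName variable with padding around an illustrative value:

```csharp
using System;

string productName = "  Laptop  ";

productName.Trim();                      // result is discarded; productName is unchanged
Console.WriteLine($"[{productName}]");   // [  Laptop  ]

string trimmed = productName.Trim();     // capture the returned string
Console.WriteLine($"[{trimmed}]");       // [Laptop]
```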

The first call to Trim() did the work, allocated a trimmed copy on the heap, and then threw the result away because nothing captured it. productName itself was never touched. A common mistake is calling a method on a string and expecting the variable to change.

The same idea with Replace, drawn with a memory diagram for clarity:
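A minimal sketch, assuming the variables customerName and fixedName and an illustrative Replace call that turns "Ashley" into "Ashleigh":

```csharp
using System;

string customerName = "Ashley";
string fixedName = customerName.Replace("ley", "leigh");   // builds a second string

Console.WriteLine(customerName);   // Ashley
Console.WriteLine(fixedName);      // Ashleigh
Console.WriteLine(object.ReferenceEquals(customerName, fixedName));   // False
```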

What's on the heap after this code runs:

Two separate string objects on the heap. customerName still points at the original "Ashley". fixedName points at a new "Ashleigh". Replace did not edit the first object; it built the second one. The first object stays alive as long as something holds a reference to it.

If you reassign the original variable, you change where the variable points, but you still don't change the existing heap object:
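A small sketch of reassignment. The extra variable original is a hypothetical alias added here only so the old object can still be observed after customerName moves on:

```csharp
using System;

string customerName = "Ashley";
string original = customerName;   // second reference to the same object

customerName = customerName.Replace("ley", "leigh");   // the variable now points elsewhere

Console.WriteLine(customerName);   // Ashleigh
Console.WriteLine(original);       // Ashley: the first object was never edited
```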

The variable customerName now points to the new object "Ashleigh". The original "Ashley" is still on the heap until the garbage collector notices no variable refers to it. Reassignment changes the arrow, not the box.

Why C# Made Strings Immutable

Immutability is a deliberate design choice. It costs allocations on every "modification," so why pay that price? Four reasons drive it.

Thread safety. Multiple threads can read the same string at the same time without any locking. There's no risk of one thread seeing a half-modified value because the contents never change. In a multi-threaded program, an immutable type is automatically safe to share. A mutable string would force callers to coordinate access, which is a constant source of bugs.

Stable hash codes. A string is a very common key type for Dictionary<TKey, TValue> and HashSet<T>. Both rely on the hash code of the key to find the right bucket. If a string's contents could change after it was inserted, its hash code would change, the bucket would be wrong, and the dictionary would lose the entry. Because strings can't change, their hash codes are stable for the lifetime of the object.

Security. A string that holds a file path, a URL, or a username can be passed to a security check and then to the function that uses it. With an immutable string, there's no risk that another thread (or a hidden reference) modifies the value between the check and the use. That kind of race, called time-of-check to time-of-use (TOCTOU), is a classic security flaw. Immutability rules it out for the string value itself, although the resource the string names (a file, for example) can still change between the check and the use.

Intern-pool sharing. Because strings can't change, identical literals can safely share the same heap object. Two variables holding "Apple" can both point to the same memory, since neither can modify it without producing a new object anyway. This is the foundation of string interning, covered later in this lesson.

Neither choice is free. With mutable strings you pay for locking and defensive copies. With immutable strings you pay for new allocations on edits. C# (and Java, and many other languages) decided that the second cost is the easier one to manage, especially with a tool like StringBuilder available for the rare cases where you really do need to mutate.

| Property | Immutable strings | Mutable strings |
| --- | --- | --- |
| Safe to share between threads | Yes, no locks needed | Only with synchronization |
| Safe as dictionary key | Yes, hash never changes | Only if you never mutate after insertion |
| Edit cost | Allocates a new string | In-place, no allocation |
| Aliasing risk | None, two refs are safe | Caller may see surprise changes |
| Memory dedup (interning) | Possible | Not safe |

The Allocation Cost

Every "edit" on a string returns a new heap object. For a single operation, that's one small allocation, roughly proportional to the length of the new string, and the garbage collector handles cleanup. The cost only becomes a problem when string operations happen in a loop, because each iteration allocates a fresh string that includes everything from the previous one.

Consider building a comma-separated list of product names by concatenation:
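A minimal sketch of that loop. The array name cart and every product name after "Apple" and "Banana" are assumptions chosen for illustration:

```csharp
using System;

string[] cart = { "Apple", "Banana", "Cherry", "Mango", "Grape" };

string result = "";
foreach (string item in cart)
{
    result += item + ", ";   // allocates a new, longer string on every pass
}

Console.WriteLine($"[{result}]");   // [Apple, Banana, Cherry, Mango, Grape, ]
```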

Here is what each iteration does. Iteration 1 builds "Apple, ". Iteration 2 takes the existing "Apple, " and builds "Apple, Banana, ", which requires copying all of "Apple, " plus the new pieces into a fresh allocation. Iteration 3 copies "Apple, Banana, " again into a longer buffer, and so on. By the end, the program has built five strings, each one longer than the last, and thrown away all but the final one.

Concatenating strings in a loop of N items allocates O(N) intermediate strings, but the total number of characters copied is O(N^2) because each step copies the running result. For N = 1000 pieces of about six characters each, that's roughly three million character copies for a final string of about six thousand characters.

The same operation with the wasteful pattern and a counter:
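A sketch with a running counter. The four-letter brand names are assumptions, picked so that each piece (name plus ", ") is exactly six characters long:

```csharp
using System;

string[] brands = { "Sony", "Dell", "Asus", "Acer", "Vivo" };

string result = "";
int copied = 0;   // characters written across all the strings built so far

foreach (string brand in brands)
{
    result += brand + ", ";    // new string holding everything so far
    copied += result.Length;   // the whole new string had to be written
    Console.WriteLine($"length {result.Length}");   // 6, 12, 18, 24, 30
}

Console.WriteLine($"final length: {result.Length}");   // final length: 30
Console.WriteLine($"characters copied: {copied}");     // characters copied: 90
```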

Five strings, lengths 6, 12, 18, 24, 30: four intermediates plus the final result. Total characters copied across all steps is 6 + 12 + 18 + 24 + 30 = 90, which is three times the length of the final string. The pattern is quadratic. Push it to a thousand iterations and you're copying about three million characters for a result that's only six thousand long.

For one-off concatenations like "Hello, " + name, the cost is negligible. For loops, batch joins, or building text incrementally, use StringBuilder, which holds a mutable buffer and only allocates a final string when you call ToString(). The _StringBuilder_ lesson covers it in detail.

A quick preview of the difference:
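A minimal sketch of the StringBuilder version, assuming the same illustrative cart array of five product names:

```csharp
using System;
using System.Text;

string[] cart = { "Apple", "Banana", "Cherry", "Mango", "Grape" };

var sb = new StringBuilder();
foreach (string item in cart)
{
    sb.Append(item).Append(", ");   // writes into the builder's internal buffer
}

string result = sb.ToString();      // the one string built for the whole list
Console.WriteLine($"[{result}]");   // [Apple, Banana, Cherry, Mango, Grape, ]
```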

Same result, but the loop never allocated an intermediate string. StringBuilder wrote into an internal character buffer, adding capacity when needed. The final ToString() is the only string allocation in the whole operation.

string.Join(", ", cart) is the standard fix for this exact pattern. It computes the final length up front, allocates one buffer, and fills it in a single pass. Use string.Join when you have a collection ready; use StringBuilder when you're building incrementally.
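A short sketch of the string.Join version, again assuming an illustrative cart array. Note that Join puts the separator only between items, so there is no trailing ", " to clean up:

```csharp
using System;

string[] cart = { "Apple", "Banana", "Cherry", "Mango", "Grape" };

string joined = string.Join(", ", cart);
Console.WriteLine(joined);   // Apple, Banana, Cherry, Mango, Grape
```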

String Interning

String interning is a memory-saving optimization built into the CLR (the Common Language Runtime, the part of .NET that runs your code). The CLR maintains an internal table called the intern pool. Every string in the pool is unique by content. If you ask the pool whether "Apple" is already there and it is, you get back the existing reference instead of a new one.

The point of the pool: if a million places in your program use the literal "Apple" as a product brand, you don't want a million separate "Apple" strings on the heap. One copy is enough, since strings are immutable and can't diverge.

The CLR runs the intern pool automatically for string literals. Every literal string that appears in your compiled code is added to the pool by the runtime before the code that uses it runs. Two literals with the same content end up pointing to the same heap object:
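A minimal sketch, assuming two variables named a and b that are both assigned the literal "Apple" in the same program:

```csharp
using System;

string a = "Apple";
string b = "Apple";

Console.WriteLine(a == b);                        // True: same content
Console.WriteLine(object.ReferenceEquals(a, b));  // True: same object from the intern pool
```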

ReferenceEquals checks whether two variables point to the exact same heap object, not whether they have the same content. The fact that it returns True here means a and b are literally the same object in memory. The compiler emitted the literal "Apple" once, and both a and b got the same reference out of the intern pool.

A diagram of the layout:

The three variables holding the literal "Apple" all share one heap object inside the intern pool. The fourth variable, built at runtime, points to a separate, unpooled "Apple" on the heap. Same characters, different objects.

The pool is keyed by content, so equal literals collapse to one. A more thorough demonstration with multiple values:
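A sketch with two different literals, each used twice. The variable names are assumptions for illustration:

```csharp
using System;

string apple1 = "Apple";
string apple2 = "Apple";
string samsung1 = "Samsung";
string samsung2 = "Samsung";

Console.WriteLine(object.ReferenceEquals(apple1, apple2));      // True
Console.WriteLine(object.ReferenceEquals(samsung1, samsung2));  // True
Console.WriteLine(object.ReferenceEquals(apple1, samsung1));    // False: different content
```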

Equal literals share. Different literals don't. Two "Apple" strings live as one heap object, two "Samsung" strings live as one heap object, and the two pool entries are different objects from each other.

Runtime Strings Are Not Interned

Only literals are interned automatically. Strings that the program builds at runtime, by concatenation, by Substring, by ToString() on a number, by reading user input, are not added to the pool by default. They go on the regular heap and are subject to normal garbage collection.
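A minimal sketch comparing one literal with two runtime-built strings. The variable names literal, fromChars, and concatenated are assumptions for illustration:

```csharp
using System;

string literal = "Apple";                                          // from the intern pool
string fromChars = new string(new[] { 'A', 'p', 'p', 'l', 'e' });  // built at runtime
string concatenated = "Ap" + new string('p', 1) + "le";            // runtime piece in the middle

Console.WriteLine(literal.Equals(fromChars));                         // True
Console.WriteLine(object.ReferenceEquals(literal, fromChars));        // False
Console.WriteLine(literal.Equals(concatenated));                      // True
Console.WriteLine(object.ReferenceEquals(literal, concatenated));     // False
```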

Three strings, all spelling "Apple". The first one came from the intern pool. The second was built from a character array at runtime, so the CLR allocated a fresh heap object for it. The third was built by concatenation involving a runtime piece (new string('p', 1)), so it's also a fresh heap object. ReferenceEquals is False for both runtime-built strings even though Equals is True. The contents match; the objects don't.

One nuance: if every piece of a concatenation is a literal known at compile time, the C# compiler folds them into a single literal, and that single literal goes into the intern pool. So "Ap" + "ple" compiles to the literal "Apple", which is interned. The runtime case above breaks that optimization by mixing in new string('p', 1), which is built at runtime.
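A short sketch of the compile-time case. The names folded and Prefix are hypothetical; a const string counts as a compile-time constant, so it folds the same way a literal does:

```csharp
using System;

const string Prefix = "Ap";

string literal = "Apple";
string folded = "Ap" + "ple";         // constant expression, emitted as the literal "Apple"
string alsoFolded = Prefix + "ple";   // const + literal is still a constant expression

Console.WriteLine(object.ReferenceEquals(literal, folded));      // True
Console.WriteLine(object.ReferenceEquals(literal, alsoFolded));  // True
```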

A ReferenceEquals check is O(1) and faster than Equals for long strings, because it's a pointer comparison. But it only returns True when both strings share an object. Don't use it as a substitute for Equals unless you've explicitly arranged for interning.

Manual Interning

When you have a lot of duplicate runtime strings, the BCL gives you two helpers to use the intern pool manually: string.Intern(s) and string.IsInterned(s). Both are static methods on the string class.

string.Intern(s) checks the pool for a string with the same content as s. If it's there, it returns the pool reference. If it isn't, it adds s to the pool and returns s itself. Either way, the returned reference is interned.
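A minimal sketch, assuming variables named literal, runtime, and pooled:

```csharp
using System;

string literal = "Apple";
string runtime = new string(new[] { 'A', 'p', 'p', 'l', 'e' });   // separate heap object

string pooled = string.Intern(runtime);   // returns the pool's reference for this content

Console.WriteLine(object.ReferenceEquals(literal, runtime));  // False
Console.WriteLine(object.ReferenceEquals(literal, pooled));   // True
Console.WriteLine(object.ReferenceEquals(runtime, pooled));   // False: runtime still points at its own copy
```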

The original runtime variable was a separate heap object. string.Intern(runtime) returned the pool reference, which is the same object as literal. And runtime itself was not modified. The local variable still points at the unpooled copy. Interning doesn't reach into existing references and rewrite them; it gives you a new reference to the pool entry.

string.IsInterned(s) is the read-only check. It returns the pool reference if the string is already in the pool, or null if it isn't:
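A sketch of both outcomes, assuming a literal a and a runtime-built copy b. The variable unseen is a hypothetical string whose content deliberately never appears as a literal in this program, so nothing has pooled it:

```csharp
using System;

string a = "Cherry";
string b = new string(new[] { 'C', 'h', 'e', 'r', 'r', 'y' });

// The lookup is by content, so it finds the entry created for the literal.
var found = string.IsInterned(b);
Console.WriteLine(object.ReferenceEquals(found, a));   // True
Console.WriteLine(object.ReferenceEquals(found, b));   // False: b itself is not the pooled object

var unseen = new string(new[] { 'K', 'i', 'w', 'i' });
Console.WriteLine(string.IsInterned(unseen) == null);  // True: no pool entry with this content

string.Intern(unseen);                                 // adds unseen itself to the pool
Console.WriteLine(object.ReferenceEquals(string.IsInterned(unseen), unseen));   // True
```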

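The literal "Cherry" was interned automatically by the runtime. The runtime-built copy b is a separate object that is not itself in the pool. Both Intern and IsInterned look strings up by content, not by reference: because the pool already holds an entry with content "Cherry", IsInterned(b) returns that entry (the object a points at), and string.Intern(b) returns the same entry without adding anything. IsInterned returns null only when no string with that content has been pooled.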

Consider a program that reads ten thousand product reviews from a file, each with a brand name. Most brands repeat. Without interning, every duplicate is a separate heap allocation:
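A scaled-down sketch with seven brand names. The helper ReadBrand is hypothetical: it stands in for file reading by building a fresh runtime string on every call:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for reading a brand name from a file.
static string ReadBrand(string text) => new string(text.ToCharArray());

string[] raw =
{
    ReadBrand("Apple"), ReadBrand("Samsung"), ReadBrand("Apple"), ReadBrand("Sony"),
    ReadBrand("Samsung"), ReadBrand("Apple"), ReadBrand("Sony")
};

// Without interning, duplicates are separate heap objects.
Console.WriteLine(object.ReferenceEquals(raw[0], raw[2]));   // False

var brands = new List<string>();
foreach (string name in raw)
{
    brands.Add(string.Intern(name));   // keep the pooled reference instead
}

Console.WriteLine(object.ReferenceEquals(brands[0], brands[2]));   // True (Apple)
Console.WriteLine(object.ReferenceEquals(brands[1], brands[4]));   // True (Samsung)
Console.WriteLine(object.ReferenceEquals(brands[3], brands[6]));   // True (Sony)
```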

All three checks come back True because string.Intern collapsed every duplicate to the same pool reference. The list of seven brand names now points to only three distinct heap objects ("Apple", "Samsung", "Sony") instead of seven. For a few items the savings are tiny. For a feed of ten million reviews where only a few hundred brands ever appear, the difference between a few hundred strings and ten million strings is large.

When to Use Manual Interning, and When Not To

Interning is not free. The pool itself takes memory, and pool entries are special: they live for the lifetime of the application domain and are never garbage collected. That's the trade-off. You save memory on duplicates, but you can't reclaim the memory of a pooled string when you stop using it.

| Situation | Should you intern? |
| --- | --- |
| Many duplicates of a small known set (brands, statuses, country codes) | Yes, big win |
| Unique values per item (user IDs, order IDs, timestamps) | No, you'd flood the pool |
| Short-lived strings used once and thrown away | No, the pool keeps them alive forever |
| Strings used as dictionary keys with Equals comparison | Usually no, Equals is fine |
| Hot path comparing the same set of strings millions of times | Yes, ReferenceEquals becomes a single pointer compare |

Treat interning as a targeted optimization. The common case is to not call string.Intern at all and let the CLR handle literals on its own. When you're loading a large dataset with known repeated values, that's the moment to consider it.

Pooled strings are not garbage collected for the life of the app. Interning a million distinct strings is a memory leak that grows over time. Only intern when the set of distinct values is small and stable.