String Basics

Last Updated: May 17, 2026

10 min read

A string is the type C# uses for any piece of text: a product name in a cart, a customer email, an order ID printed on a receipt. Almost every program touches strings, so getting comfortable with how they work, what they actually are under the hood, and what you can do with the most basic operations is foundational. This chapter covers what a string is in C#, how to declare and initialize one, how to read characters out of it, and the small set of gotchas that catch beginners on day one.

What a String Is

A string in C# is a sequence of characters. The type is a reference type, so the characters live in an object on the heap and a variable holds a reference to that object. Once a string is created, you cannot change its contents. Every operation that looks like it modifies a string actually returns a brand new string.

A concrete example. A customer's name as a string:
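A minimal sketch of such a declaration. The name "Alice Walker" is an assumed placeholder, chosen because it has eleven letters and one space:

```csharp
using System;

// Assumed example name: eleven letters plus one space.
string customerName = "Alice Walker";

Console.WriteLine(customerName);        // Alice Walker
Console.WriteLine(customerName.Length); // 12
```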

Twelve characters: the eleven letters plus the space. The variable customerName holds a reference to a heap object that contains those twelve characters in order.

Four properties of C# strings are worth pinning down up front. Almost everything else in this section builds on these:

  • Reference type. A string variable holds a reference to a string object on the heap, not the characters inline.
  • Immutable. A string's contents never change once it's created. Methods like Replace or ToUpper return new strings; they don't edit the original. The full story of why and what that means for performance is in the _String Immutability & Interning_ lesson.
  • Sequence of `char`. Each position is a char, which is a 16-bit UTF-16 code unit. Index 0 is the first char, index Length - 1 is the last.
  • UTF-16 encoded. Every char is two bytes. Most everyday text fits in one char per character, but some symbols (certain emoji, rare scripts) take two char slots. This usually doesn't matter for typical e-commerce text.
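The last two points can be checked directly. A small sketch, with an emoji outside the 16-bit range as an illustrative choice; it is written with the \U escape (eight hex digits) so the source stays ASCII-only:

```csharp
using System;

string plain = "Cart";
string emoji = "\U0001F600"; // grinning face, U+1F600, does not fit in one char

Console.WriteLine(plain.Length);                   // 4: one char per character
Console.WriteLine(emoji.Length);                   // 2: one symbol, two char slots
Console.WriteLine(char.IsHighSurrogate(emoji[0])); // True
Console.WriteLine(sizeof(char));                   // 2 (bytes per char)
```

Length counts char slots, not visible symbols, which is why the single emoji reports 2.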

A picture helps. Two string variables, one heap object each:

Each variable lives on the stack (when it's a local) and stores the address of a string object on the heap. The string object itself holds the length plus the sequence of char values.

The keyword string is a C# alias for the type System.String in the Base Class Library. The two are exactly the same type. Writing string customerName and String customerName compiles to identical IL. By convention, the lowercase string is used for variable declarations (it's a keyword like int or bool), and the uppercase String is used when you call static methods like String.IsNullOrEmpty or String.Concat. Both styles work; most modern C# code uses string everywhere, including for static members (string.IsNullOrEmpty).
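A quick way to see the alias in action, as a sketch with an assumed product name:

```csharp
using System;

string lower = "Wireless Mouse";
String upper = "Wireless Mouse";

Console.WriteLine(typeof(string) == typeof(String)); // True
Console.WriteLine(lower.GetType().FullName);         // System.String
Console.WriteLine(string.IsNullOrEmpty(lower));      // False
Console.WriteLine(String.IsNullOrEmpty(upper));      // False
```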

Declaring and Initializing Strings

There are several ways to create a string. The simplest is a string literal, which is text wrapped in double quotes:
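A minimal sketch with a few literals; the product name and coupon code are assumed sample values:

```csharp
using System;

string productName = "Wireless Mouse";
string couponCode = "SAVE10";
string notes = ""; // empty, but still a real string object

Console.WriteLine(productName);           // Wireless Mouse
Console.WriteLine(couponCode.Length);     // 6
Console.WriteLine(notes.Length);          // 0
Console.WriteLine(notes.ToUpper() == ""); // True
```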

The empty string "" is a real, valid string object. It has zero characters, so its Length is 0, but it is not null. You can call methods on it without throwing.

A few more ways to create strings that come up in everyday code:
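A sketch of those forms side by side. The variable names are illustrative, and the string? annotation simply marks a variable that is allowed to hold null:

```csharp
using System;

string giftMessage = string.Empty;      // same content as ""
string? middleName = null;              // no string object at all
string divider = new string('*', 5);    // "*****"
char[] letters = { 'S', 'A', 'L', 'E' };
string fromChars = new string(letters); // "SALE"

Console.WriteLine(giftMessage == "");   // True
Console.WriteLine(middleName is null);  // True
Console.WriteLine(divider);             // *****
Console.WriteLine(fromChars);           // SALE
```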

string.Empty is a static field that holds "". The two are interchangeable in behavior; string.Empty is mostly a readability choice. Some teams prefer it because it makes the intent clearer ("this is deliberately empty, not a forgotten value"). Others stick with "" because it's shorter. Either is fine.

null is different. A string declared as null doesn't reference any string object at all. Calling any method on it (including reading .Length) throws NullReferenceException. The _Null vs Empty vs Whitespace_ section below deals with the difference more carefully.

new string('*', 5) is the one constructor you'll actually use from time to time. It builds a string of five asterisks. There are a handful of other string constructors (for char[] arrays and pointers), but most are rare in everyday code.

Null vs Empty vs Whitespace

Three states often get mixed up: null, "", and " " (just spaces). They look similar in casual conversation but behave very differently in code.

| Value | Is null? | Length | Safe to call .ToUpper()? |
| --- | --- | --- | --- |
| null | yes | throws | throws |
| "" (empty) | no | 0 | yes, returns "" |
| "   " (three spaces) | no | 3 | yes, returns "   " |
| "Alice" | no | 5 | yes, returns "ALICE" |

A short demonstration:
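A minimal sketch covering the same four values. The `!` after `missing` only silences the compiler's null warning so that the runtime failure can be observed:

```csharp
using System;

string? missing = null;
string empty = "";
string spaces = "   "; // three spaces
string name = "Alice";

Console.WriteLine(empty.Length);   // 0
Console.WriteLine(spaces.Length);  // 3
Console.WriteLine(name.ToUpper()); // ALICE

try
{
    // There is no object behind 'missing', so this line throws.
    Console.WriteLine(missing!.Length);
}
catch (NullReferenceException)
{
    Console.WriteLine("missing is null, so reading Length throws");
}
```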

The Base Class Library provides two helpers that wrap up the common checks: string.IsNullOrEmpty(s) returns true for null or "", and string.IsNullOrWhiteSpace(s) returns true for null, "", or any string of only spaces, tabs, and similar characters. These are the standard way to validate user input like a customer name or coupon code. We use them in passing in this lesson and look at them in detail in the _String Methods_ lesson.
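A short sketch of how the two helpers classify a few assumed inputs:

```csharp
using System;

string?[] inputs = { null, "", "   ", "SAVE10" };

foreach (string? input in inputs)
{
    // Show null as the word null, everything else in quotes.
    string shown = input is null ? "null" : "\"" + input + "\"";
    Console.WriteLine(shown
        + " -> IsNullOrEmpty: " + string.IsNullOrEmpty(input)
        + ", IsNullOrWhiteSpace: " + string.IsNullOrWhiteSpace(input));
}
// null -> IsNullOrEmpty: True, IsNullOrWhiteSpace: True
// "" -> IsNullOrEmpty: True, IsNullOrWhiteSpace: True
// "   " -> IsNullOrEmpty: False, IsNullOrWhiteSpace: True
// "SAVE10" -> IsNullOrEmpty: False, IsNullOrWhiteSpace: False
```

The third row is the one that separates the two helpers: a string of spaces is not empty, but it is still whitespace-only.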

Nullable reference types arrived in C# 8, and the standard project templates have enabled them by default since .NET 6. In a nullable-enabled project, the compiler treats string and string? as different. A string is meant to never be null; a string? is allowed to be null. If you write string customerName = null; in such a project, the compiler issues a warning (CS8600) because you're putting null into a slot that promised it wouldn't be null. The runtime behavior is the same; the compiler is just helping you spot null bugs before they hit production.
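A small sketch of the distinction. The `#nullable enable` directive is included so the snippet behaves the same regardless of project settings, and the variable names are assumptions:

```csharp
#nullable enable
using System;

string customerName = "Alice"; // promised to never be null
string? couponCode = null;     // explicitly allowed to be null

// string broken = null;       // would compile, but with warning CS8600

// The compiler expects a null check before a string? is dereferenced.
int couponLength = couponCode is null ? 0 : couponCode.Length;

Console.WriteLine(customerName.Length); // 5
Console.WriteLine(couponLength);        // 0
```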

Accessing Characters by Index

A string supports indexed reads. The syntax is the same as for an array: square brackets, an integer index, starting at 0.
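A minimal sketch, assuming a product named "Headphones":

```csharp
using System;

string product = "Headphones";

char first = product[0]; // 'H'
char third = product[2]; // 'a'

Console.WriteLine(first);          // H
Console.WriteLine(third);          // a
Console.WriteLine(product.Length); // 10
```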

product[0] returns a char, the type C# uses for a single 16-bit character. It's a value type, written with single quotes ('H'), not double quotes ("H" is a one-character string).

The last valid index is Length - 1. Reading product[product.Length] is one past the end and throws IndexOutOfRangeException at runtime:
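A sketch of both cases, again with an assumed product name. The exception is caught here only so the program can keep running:

```csharp
using System;

string product = "Headphones";
bool threw = false;

Console.WriteLine(product[product.Length - 1]); // s

try
{
    char pastEnd = product[product.Length]; // index 10 does not exist
    Console.WriteLine(pastEnd);
}
catch (IndexOutOfRangeException)
{
    threw = true;
    Console.WriteLine("Index is out of range");
}
```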

You cannot write through the index. Strings are immutable, so the slot assignment that arrays allow is a compile error for strings:
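A sketch of the failing assignment, left commented out so the snippet still compiles, followed by one possible way to get a changed copy. Substring is just one option chosen for illustration:

```csharp
using System;

string product = "Headphones";

// product[0] = 'h'; // does not compile: the string indexer is read-only (error CS0200)

// Build a new string instead; the original is left untouched.
string lowered = "h" + product.Substring(1);

Console.WriteLine(product); // Headphones
Console.WriteLine(lowered); // headphones
```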

The indexer on string is read-only. If you need to "change a character," you build a new string. The _String Immutability & Interning_ and _StringBuilder_ lessons get into how and why that works.

The from-end index ^1 from C# 8 works on strings too, the same way it works on arrays:
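A minimal sketch with the same assumed product name:

```csharp
using System;

string product = "Headphones";

Console.WriteLine(product[^1]); // s
Console.WriteLine(product[^2]); // e
Console.WriteLine(product[^1] == product[product.Length - 1]); // True
```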

s[^n] is equivalent to s[s.Length - n]. Useful when you just want the last character without computing the index yourself.

Iterating a String

A string is a sequence, so foreach walks it character by character:
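A minimal sketch, using an assumed coupon code:

```csharp
using System;

string code = "SAVE10";
int visited = 0;

foreach (char c in code)
{
    Console.WriteLine(c);
    visited++;
}
// Prints S, A, V, E, 1, 0 on separate lines.
```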

foreach is the cleanest way to look at every character when you don't care about the index. The loop variable type is char, since that's what each slot of the string is.

When you need the index too (say, to find which position a character is at), use a for loop with Length:
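A sketch of the indexed form, with the same assumed coupon code:

```csharp
using System;

string code = "SAVE10";
string lastLine = "";

for (int i = 0; i < code.Length; i++)
{
    lastLine = i + ": " + code[i];
    Console.WriteLine(lastLine);
}
// 0: S
// 1: A
// 2: V
// 3: E
// 4: 1
// 5: 0
```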

The pattern is the same as iterating an array: index from 0, condition is < Length, increment by 1. The for form is what you reach for when you need to compare or skip based on position.

A practical use: count how many times a particular character appears in a product name. This is a job for a foreach, since the index doesn't matter:
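A minimal sketch; the product name and target character are assumed sample values:

```csharp
using System;

string productName = "Bluetooth Headphones";
char target = 'o';
int count = 0;

foreach (char c in productName)
{
    if (c == target)
    {
        count++;
    }
}

Console.WriteLine("'" + target + "' appears " + count + " times"); // 'o' appears 3 times
```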

The standard library has helpers for this exact thing (LINQ's Count, the string method IndexOf, and friends). The point here is that a string behaves like a read-only sequence of char, and the two loop forms you already know from arrays apply directly.

Escape Sequences

Some characters can't be written literally inside a string. A double quote would end the string early. A newline would break the source file in two. For these, C# uses escape sequences: a backslash followed by one or more characters that stand for something else.

The ones that come up most often in everyday code:

| Sequence | Meaning |
| --- | --- |
| `\n` | newline (line feed) |
| `\t` | tab |
| `\r` | carriage return |
| `\\` | a literal backslash |
| `\"` | a literal double quote |
| `\'` | a literal single quote (mainly for char literals) |
| `\0` | the null character |
| `\uXXXX` | a Unicode character by its 4-digit hex code |

A small example that uses several at once:
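A sketch of such a receipt string. The item, price, and folder path are made-up values for illustration:

```csharp
using System;

string receipt = "Item: \"Wireless Mouse\"\n\tTotal: $24.99\nSaved to C:\\Receipts";

Console.WriteLine(receipt);
// Item: "Wireless Mouse"
//     Total: $24.99        (the gap before Total is a single tab)
// Saved to C:\Receipts
```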

The \n characters split the receipt over three lines. The \t inserts a tab to line up the total. The \" pair lets the inner quotes survive without ending the outer string. And \\ produces one backslash in the output; each \\ in the source is a single backslash character in the string at runtime.

For accented or non-ASCII characters, C# source files are typically UTF-8 encoded, so you can just type the character directly:
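A minimal sketch with an assumed café name typed directly into the source. Whether the console displays the accents correctly depends on its output encoding, but the string in memory is the same either way:

```csharp
using System;

string cafe = "Café Crème";

Console.WriteLine(cafe);
Console.WriteLine(cafe.Length); // 10
```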

If you need a character that's hard to type or you want the source to be ASCII-only, the \u escape spells out the Unicode code point in hex. é is U+00E9, so \u00E9 produces the same character:
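A sketch comparing the typed and escaped spellings of the same assumed word:

```csharp
using System;

string typed = "Café";
string escaped = "Caf\u00E9";

Console.WriteLine(typed == escaped); // True
Console.WriteLine(escaped.Length);   // 4
Console.WriteLine((int)escaped[3]);  // 233, which is 0xE9
```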

Both forms produce the same string in memory. The choice is a matter of source readability.

One thing escape sequences don't help with is text that is full of backslashes or runs over many lines (Windows file paths, regular expressions). For that, C# has verbatim strings (@"...") and raw string literals ("""...""", available from C# 11), both of which treat a backslash as an ordinary character.

Concatenation with +

The simplest way to combine two strings is the + operator. It produces a new string that holds the contents of the left side followed by the contents of the right side:
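A minimal sketch with assumed names, including the += shorthand:

```csharp
using System;

string firstName = "Alice";
string lastName = "Walker";

string fullName = firstName + " " + lastName;
Console.WriteLine(fullName); // Alice Walker

string greeting = "Hello, " + fullName;
greeting += "!";             // same as greeting = greeting + "!"
Console.WriteLine(greeting); // Hello, Alice Walker!
```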

The += shortcut works as well, but remember what it does under the hood: name += "!" is the same as name = name + "!". Since strings are immutable, the right side creates a new string and the left side now points to that new string. The original string isn't modified; it's just no longer referenced by name.

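You can also mix in non-string values. When one operand of + is a string, C# converts the other operand to text by calling its ToString() method (a null operand contributes an empty string).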
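A short sketch with assumed values. The last two lines show a detail worth knowing: evaluation runs left to right, so parentheses decide whether numbers are added or appended:

```csharp
using System;

string item = "Wireless Mouse";
int quantity = 3;
bool inStock = true;

string line = item + " x" + quantity + ", in stock: " + inStock;
Console.WriteLine(line); // Wireless Mouse x3, in stock: True

string appended = "Total: " + 2 + 3;   // "Total: 23"
string added = "Total: " + (2 + 3);    // "Total: 5"
Console.WriteLine(appended);
Console.WriteLine(added);
```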

For three or four pieces, + reads fine. For more pieces, or for any case where you'd like to weave variables into a sentence, string interpolation ($"...") is the modern way. The _String Interpolation_ lesson covers it.

Where + becomes a real problem is inside a loop that grows a string piece by piece. A small loop that shows the problem (don't write code like this in production):
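A minimal sketch of that pattern, kept to five iterations:

```csharp
using System;

string stars = "";

for (int i = 0; i < 5; i++)
{
    stars += "*"; // allocates a brand new string on every pass
}

Console.WriteLine(stars); // *****
```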

It works. It's also wasteful: each iteration allocates a new string, copies the old one, and throws the old one away for the garbage collector. For five iterations you won't notice. For fifty thousand, you will. The _StringBuilder_ lesson shows the right tool for the job.

Multi-line Strings with \n

The simplest way to spread a string over multiple visual lines in your output is the \n escape:
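A minimal sketch with an assumed shipping message:

```csharp
using System;

string shipping = "Order confirmed\nShipping to: Alice Walker\nArrives in 3 days";

Console.WriteLine(shipping);
// Order confirmed
// Shipping to: Alice Walker
// Arrives in 3 days
```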

One string, three lines of output. The string itself contains two \n characters; Console.WriteLine writes them out unchanged and adds a single line terminator of its own at the end.

When the text is short, \n is fine. When it's longer (say, a multi-line product description with quotes and indentation), the escapes start to pile up and the code gets hard to read. That's where the verbatim string literal @"..." and the raw string literal """...""" come in. Both let you write multi-line text without escape sequences. The _Verbatim & Raw String Literals_ lesson covers them in full.

One caveat about \n worth knowing now: on Windows, the convention for "newline" is \r\n (carriage return plus line feed). On Linux and macOS, it's just \n. When you write \n in your code, you get a single line-feed character regardless of platform. For most output to the console this works fine. For producing files that other Windows programs expect to read with proper line endings, use Environment.NewLine instead, which expands to whichever convention the current OS uses. That said, in everyday code, \n is almost always what you want.

A Worked Example

A small program that pulls several pieces together. Take a customer name and an order ID, print a banner, look at individual characters, count something, and build a one-line summary.
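A sketch of what such a program could look like. The customer name, order ID, and banner width are assumed values, not taken from a real system:

```csharp
using System;

string customerName = "Zoë Martin";
string orderId = "ORD-1042";

// Banner built with the char-repeat constructor.
string banner = new string('=', 24);
Console.WriteLine(banner);
Console.WriteLine("Order summary");
Console.WriteLine(banner);

Console.WriteLine("Name length: " + customerName.Length); // Name length: 10
Console.WriteLine("First character: " + customerName[0]); // First character: Z
Console.WriteLine("Last character: " + customerName[^1]); // Last character: n

// Count the spaces; the index is not needed, so foreach is enough.
int spaces = 0;
foreach (char c in customerName)
{
    if (c == ' ')
    {
        spaces++;
    }
}
Console.WriteLine("Spaces in name: " + spaces);           // Spaces in name: 1

string summary = "Order " + orderId + " for " + customerName
    + " (" + customerName.Length + " chars)";
Console.WriteLine(summary); // Order ORD-1042 for Zoë Martin (10 chars)
```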

Look at what the example exercises. A literal with a non-ASCII character. Length for the size. [0] and [^1] for first and last. A foreach to walk the characters and count spaces. And string concatenation with + for the summary. Every piece in this lesson showed up at least once.

What this lesson did not show: changing characters in a string (you can't, see the _String Immutability & Interning_ lesson), trimming or splitting (see _String Methods_), or using $"..." for cleaner formatting (see _String Interpolation_).

Summary

  • A string is a reference-typed, immutable, ordered sequence of UTF-16 char values. Once created, its contents never change.
  • string and System.String are the same type. Use the lowercase keyword by convention.
  • Declare with a literal ("text"), string.Empty, null, or a constructor like new string('*', 5). The empty string and null are not the same thing.
  • Read characters with s[i], which is O(1). The last valid index is Length - 1, or equivalently ^1. Strings are not assignable through the indexer.
  • Iterate with foreach (char c in s) when you only need the values, or for (int i = 0; i < s.Length; i++) when you also need the index.
  • Escape sequences like \n, \t, \\, \", and \uXXXX let you embed characters that can't be written literally inside a string.
  • Concatenation with + produces a new string each time. Fine for a few pieces; reach for StringBuilder inside loops.