Last Updated: February 1, 2026
Go makes concurrency easy with goroutines and channels, but this ease can be deceptive. Behind the scenes, modern CPUs and compilers perform aggressive optimizations that can reorder memory operations. Without understanding the memory model, you'll write code that works on your laptop but fails mysteriously in production under load.
Consider this innocent-looking code:
You might expect this to print "00", "21", or "01" if g() runs between the two assignments in f(). But it can also print "20", observing b = 2 while a still reads as 0. The compiler or CPU can reorder the writes in f() because there's no dependency between them within that goroutine.
The Go Memory Model gives you the rules to avoid these surprises. Follow the rules, and your program behaves predictably across all platforms and Go versions.
The core concept in Go's memory model is the happens-before relationship. If event A happens-before event B, then B is guaranteed to observe all of A's effects: every memory write performed by A is visible when B runs.
If neither A happens-before B nor B happens-before A, then A and B are concurrent, and there are no guarantees about which one sees the effects of the other.
Go defines several happens-before rules. Mastering these is essential for writing correct concurrent code.
Within a single goroutine, the happens-before order matches program order. If statement A comes before statement B in your code, A happens-before B.
This rule only applies within a single goroutine. It says nothing about how other goroutines see these operations.
Package initialization has a strict order:
- If package A imports package B, B's init() functions complete before A's init() functions start.
- All init() functions complete before main() starts.
- Within a package, init() functions run in the order they appear (across multiple files, alphabetically by filename).

The go statement that starts a new goroutine happens-before the goroutine's execution begins.
The write to msg happens-before the go statement, and the go statement happens-before the goroutine starts. By transitivity, the write to msg happens-before the read in the goroutine.
A goroutine's exit is not guaranteed to happen-before any event in the program. This is a common source of bugs:
You must use explicit synchronization (channels, sync primitives) to observe a goroutine's effects.
This is where Go's memory model shines. Channels provide strong synchronization guarantees:
A send on a channel happens-before the corresponding receive completes.
The send (B) happens-before the receive completes (C). Combined with program order (A happens-before B, C happens-before D), we get A happens-before D by transitivity.
The close of a channel happens-before a receive that returns a zero value because the channel is closed.
For buffered channels: a receive from a channel happens-before the corresponding send completes (if the buffer was full).
This enables a common pattern for limiting concurrency:
The kth receive on a channel with capacity C happens-before the (k+C)th send completes.
This generalizes the previous rule and is the foundation for using buffered channels as semaphores.
For sync.Mutex and sync.RWMutex:
A call to `l.Unlock()` happens-before any later call to `l.Lock()` on the same mutex returns. All writes made while holding the lock are therefore visible to the next goroutine that acquires it.
For `sync.RWMutex`: a call to `l.RUnlock()` happens-before a later `l.Lock()` (but RLock/RUnlock pairs don't synchronize with each other).
For sync.Once, the completion of the single call to f() from once.Do(f) happens-before any call to once.Do(f) returns.
This rule has an important subtlety: even if the function panics, subsequent calls won't retry. The "once" refers to execution attempts, not successful completions:
When to use sync.Once:
- `init()` won't work because you need runtime parameters

Edge case: If you need retryable initialization, use a mutex with a boolean flag instead:
wg.Done() happens-before wg.Wait() returns (when the counter reaches zero).
The key insight: Wait() only returns after all Done() calls have completed. This means all the writes that happened before each Done() are visible after Wait() returns.
Critical ordering: `Add` before `go`
The Add() call must happen before the goroutine starts, not inside it:
When to use WaitGroup: waiting for a known set of goroutines to finish when you don't need to collect results or errors from them.
When NOT to use WaitGroup:

- You need error propagation from the goroutines (use `errgroup` instead)

`cond.Broadcast()` and `cond.Signal()` happen-before the corresponding `cond.Wait()` returns.
When to use sync.Cond: Use it when multiple goroutines need to wait for a complex condition involving multiple variables. For simple "wait for completion" scenarios, channels are usually cleaner. Cond shines when you need Broadcast() to wake all waiters simultaneously, something channels can only do once with close().
The channel rules we've covered handle most synchronization needs elegantly. But sometimes you need finer-grained control over memory visibility without the overhead of channel communication. That's where atomic operations come in.
The sync/atomic package provides low-level atomic operations. These operations are indivisible, meaning no other goroutine can observe a partial state.
Atomic operations in Go provide sequentially consistent ordering: all goroutines observe atomic operations as if they executed one at a time in a single global order, consistent with each goroutine's program order.

This is stronger than the relaxed orderings some languages permit (such as C++'s `memory_order_relaxed`) and makes reasoning about correctness easier.
For storing and loading interface values atomically:
A common anti-pattern is busy-waiting on a plain boolean: one goroutine writes result and then sets done = true, while another spins in a for !done {} loop. The compiler might hoist the done read outside the loop (seeing it never changes within the loop), creating an infinite loop. Even if it doesn't, there's no happens-before relationship guaranteeing result is visible.
Fix: Use a channel or atomic:
This classic anti-pattern from other languages fails in Go too:
The first read of instance is unsynchronized. Another goroutine might see a non-nil pointer to a partially constructed Singleton. Here's how this can fail:
Step-by-step, the failure looks like this:

1. Goroutine 1 sees instance == nil, takes the lock, and starts constructing the Singleton.
2. The compiler or CPU reorders the stores so the pointer assignment becomes visible before the struct fields are fully initialized.
3. Goroutine 2 races past the nil check, skips the lock entirely, and gets a pointer to memory that hasn't been fully constructed yet.
Why sync.Once fixes this: The fix isn't just about preventing double initialization. sync.Once provides two critical guarantees:
1. f() executes at most once, even across goroutines
2. All writes inside f() happen-before Do(f) returns for any caller

This second guarantee is key. When once.Do() returns, you're guaranteed to see all the writes that happened inside the function, regardless of which goroutine actually executed it.
Fix: Use sync.Once:
Maps are not safe for concurrent access. Go will panic with "concurrent map read and map write" (if you're lucky) or silently corrupt data.
Fix: Use sync.Map or protect with mutex:
Appending to a shared slice from multiple goroutines is also a race. Even though append looks atomic, it isn't: one goroutine might read the slice header while another is writing it, losing elements or corrupting the slice.
Go provides a built-in race detector. Use it during development and testing:
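The `-race` flag works with the standard toolchain commands:

```shell
go test -race ./...    # run the test suite with the race detector
go run -race main.go   # run a single program with instrumentation
go build -race         # build an instrumented binary for staging
```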
The race detector instruments memory accesses at build time and, when a race occurs while the program runs, reports the conflicting read and write along with stack traces for the goroutines involved.
Go's mantra is "Don't communicate by sharing memory; share memory by communicating." Channels make synchronization explicit and less error-prone.
Why this works: The channel send happens-before the receive completes (Rule 5). This single operation both transfers the data and synchronizes the goroutines. No separate "done" flag needed, no race conditions possible.
Trade-offs: Channels have higher overhead than atomics or mutexes (~50-100ns vs ~25ns). For performance-critical code with millions of operations per second, the overhead matters. But for most applications, the safety and clarity of channels outweigh the performance cost.
When to break this rule: Use shared memory with mutexes when you have true shared state that multiple goroutines read and modify (like a cache). Channels are for transferring ownership, not for protecting shared state.
When you do need shared memory, use the sync package:
Why mutexes work: the mutex rule guarantees that Unlock() happens-before any subsequent Lock(). All writes protected by the mutex are visible to any goroutine that later acquires the same mutex.
Trade-offs: Mutexes can cause contention when many goroutines compete for the same lock. Solutions include sharding state across multiple mutexes, switching to sync.RWMutex for read-heavy workloads, or replacing simple counters with atomics.
For simple values where you don't need complex invariants:
Why atomics work: Atomic operations are indivisible and provide sequential consistency. No goroutine can observe a "partial" atomic operation, and the ordering is globally consistent.
Trade-offs:

- Limited to single values (whole structs need `atomic.Value` or a mutex)

When to use each:
| Scenario | Use |
|---|---|
| Simple counter | atomic.AddInt64 |
| Boolean flag | atomic.Bool (Go 1.19+) or atomic.Int32 |
| Multiple related fields | sync.Mutex |
| Replace entire config | atomic.Value |
| Read-heavy, rarely updated | sync.RWMutex |
Make it clear which fields need synchronization:
Why this matters: The race detector can find races during testing, but it can't find races that don't occur in your test scenarios. Documentation helps reviewers and future maintainers understand the synchronization design.
Trade-off: Documentation can drift from code. Complement comments with struct organization: group protected fields together after the mutex that protects them.
Add race detection to your continuous integration pipeline: run `go test -race ./...` as a required step on every change.
Why this is essential: Races are timing-dependent and may not manifest on your development machine but will appear in production under load. The race detector instruments memory accesses and catches races as they happen during tests.
Trade-offs: instrumented binaries run slower (roughly 2-20x) and use more memory (roughly 5-10x), so the race detector belongs in tests and canary deployments, not in every production binary.
Complement with: Stress testing (go test -count=100 -race), fuzzing, and code review. The race detector finds races that occur; good design prevents races from being possible.
With channels, mutexes, and atomics all providing synchronization, how do you choose? Here's a decision framework.
| Aspect | Channel | Mutex | Atomic |
|---|---|---|---|
| Use case | Transfer data between goroutines | Protect shared state | Single-value operations |
| Overhead | ~50-100ns | ~25ns (uncontended) | ~1-5ns |
| Can block | Yes (unbuffered) | Yes (contended) | No |
| Deadlock risk | Yes (circular waits) | Yes (lock ordering) | No |
| Multiple values | Yes (struct in channel) | Yes (protected section) | No (one value) |
| Composable | Yes (select) | Harder | No |
| Cancellation | Built-in (close) | Manual | N/A |
Channel: Passing work to workers
Mutex: Shared cache
RWMutex: Read-heavy config
Atomic: Metrics counter