Goroutines, channels, and sync.WaitGroup are the raw materials. This section is about the patterns commonly used in production code: pools that bound how many goroutines run at once, pipelines that chain stages together, mutexes for protecting shared data, atomics for cheap counters, and a handful of sync helpers that solve one problem each. This lesson is the map. Each subsection here is a tease for a full chapter that follows.

What Counts as a Concurrency Pattern

A concurrency pattern is a reusable shape for coordinating goroutines. It answers questions like "how do I run N tasks in parallel without spawning unbounded goroutines?" or "how do I let many readers and one writer share a cache safely?". Channels and sync.WaitGroup are the building blocks. Patterns are what you compose with them.

Go gives you two coordination styles, and most programs mix both:

Communicating sequential processes (CSP): Goroutines pass values through channels. State lives inside one goroutine at a time, and ownership moves with the message.
Shared memory with synchronization: Goroutines read and write the same variable, and a mutex, atomic, or sync.Once keeps the access safe.

Neither style is universally better. The job of this section is to give you a feel for which one fits which problem, and to introduce the standard library tools that make each style ergonomic.

The split isn't strict. A worker pool uses channels to hand out work and a mutex to update a shared counter. A pipeline might use an atomic counter to track how many items it has processed. The two styles complement each other.

Concurrency vs Parallelism

These words get used interchangeably, but they describe different things. Concurrency is a way of structuring a program so that independent tasks can be interleaved. Parallelism is the actual simultaneous execution of those tasks on multiple CPU cores. You can have concurrency without parallelism (a single core that switches between tasks), and parallelism generally relies on concurrency: multiple cores only help when the work was structured as tasks that can run independently.

Go was designed to make concurrency cheap, so you can launch thousands of goroutines without much ceremony. Whether those goroutines run in parallel depends on the number of CPU cores and the value of GOMAXPROCS, which by default matches the number of logical CPUs the runtime can see.
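To see what the runtime has to work with on a given machine, a small sketch like the following can help. The schedulerInfo helper is a hypothetical name used only for illustration; runtime.NumCPU and runtime.GOMAXPROCS are the standard library calls.

```go
package main

import (
	"fmt"
	"runtime"
)

// schedulerInfo reports the logical CPUs visible to the process and the
// current GOMAXPROCS setting. Passing 0 to runtime.GOMAXPROCS queries the
// value without changing it.
func schedulerInfo() (cpus, procs int) {
	return runtime.NumCPU(), runtime.GOMAXPROCS(0)
}

func main() {
	cpus, procs := schedulerInfo()
	fmt.Println("logical CPUs:", cpus)
	fmt.Println("GOMAXPROCS:  ", procs)
}
```

If GOMAXPROCS is 1, goroutines still run concurrently, but never at the same instant.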

On a single core, tasks take turns. The runtime switches between them so quickly that the program feels concurrent, but at any given moment only one task is actually running. On two or more cores, two tasks can run at the same instant. The structure of your program (the concurrency) is what makes parallelism possible. Without independent tasks, extra cores have nothing to do.

Here's a tiny program that runs three "tasks" as goroutines. On a multi-core machine they'll likely overlap in real time; on a single-core machine they'll interleave. From the program's point of view, the code is identical either way.
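A program of that shape might look like the sketch below. The task names and the runTasks helper are assumptions made for illustration, and the printed order will vary from run to run.

```go
package main

import (
	"fmt"
	"sync"
)

// runTasks starts one goroutine per task name and waits for all of them.
// It returns the names in the order the tasks finished, which can differ
// between runs.
func runTasks(names []string) []string {
	finished := make(chan string, len(names)) // buffered so no task blocks on send
	var wg sync.WaitGroup
	for _, name := range names {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			for step := 1; step <= 3; step++ {
				fmt.Printf("%s: step %d\n", name, step)
			}
			finished <- name
		}(name)
	}
	wg.Wait()
	close(finished)

	var order []string
	for name := range finished {
		order = append(order, name)
	}
	return order
}

func main() {
	order := runTasks([]string{"charge card", "reserve stock", "send email"})
	fmt.Println("finish order:", order)
}
```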

The interleaving is the runtime's decision, not yours. Your job is to write tasks that don't depend on a specific ordering. The pattern chapters that follow are about how to do that cleanly.

Shared Memory vs Communicating Sequential Processes

Rob Pike's line is the most quoted thing about Go's concurrency: "Don't communicate by sharing memory; share memory by communicating." The idea is that, where you can, you should pass a value through a channel instead of having two goroutines touch the same variable. The channel becomes the boundary, and only one goroutine owns the data at a time.

Here's the same logic written both ways. Two goroutines each compute a subtotal for a half of the cart, and the launcher sums both halves.

The CSP version sends each subtotal through a channel:
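A minimal sketch of that version, assuming prices are integer cents; the subtotal and totalCSP names are made up for this example.

```go
package main

import "fmt"

// subtotal adds up the prices (in cents) of one part of the cart.
func subtotal(prices []int) int {
	sum := 0
	for _, p := range prices {
		sum += p
	}
	return sum
}

// totalCSP splits the cart in half, computes each half in its own goroutine,
// and receives both subtotals over a channel. No variable is shared between
// the two workers.
func totalCSP(cart []int) int {
	mid := len(cart) / 2
	results := make(chan int)
	go func() { results <- subtotal(cart[:mid]) }()
	go func() { results <- subtotal(cart[mid:]) }()
	// Two receives, one per goroutine; arrival order does not matter for a sum.
	return <-results + <-results
}

func main() {
	cart := []int{1200, 850, 430, 999}
	fmt.Println("total (cents):", totalCSP(cart))
}
```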

The shared-memory version stores each subtotal in its own slot of a slice and uses a WaitGroup to wait until both writes have finished:
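A sketch of the same computation in that style, again assuming integer cents and hypothetical helper names. Each goroutine writes to a different index, so the slice needs no lock.

```go
package main

import (
	"fmt"
	"sync"
)

// subtotal adds up the prices (in cents) of one part of the cart.
func subtotal(prices []int) int {
	sum := 0
	for _, p := range prices {
		sum += p
	}
	return sum
}

// totalShared gives each goroutine its own slot in a shared slice.
// wg.Wait guarantees both writes are complete and visible before the
// launcher reads the slots.
func totalShared(cart []int) int {
	mid := len(cart) / 2
	halves := [][]int{cart[:mid], cart[mid:]}
	subtotals := make([]int, len(halves))

	var wg sync.WaitGroup
	for i, half := range halves {
		wg.Add(1)
		go func(i int, half []int) {
			defer wg.Done()
			subtotals[i] = subtotal(half) // distinct index per goroutine
		}(i, half)
	}
	wg.Wait()

	return subtotals[0] + subtotals[1]
}

func main() {
	cart := []int{1200, 850, 430, 999}
	fmt.Println("total (cents):", totalShared(cart))
}
```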

Both produce the same answer, and neither is "wrong". The CSP version is shorter and the data flow is obvious from the channel. The shared-memory version avoids the channel allocation and lets you scale to many slots without rebuilding the receive side. The mantra isn't a law, it's a default. When the data flow is the most important part, use a channel. When you need fast counters, low-overhead caches, or a write-once flag, use a mutex, an atomic, or sync.Once.

A reasonable rule of thumb:

Situation	Prefer
Handing work off between stages	Channels (CSP)
Many goroutines updating one number	`atomic`
Read-mostly cache or config map	`sync.RWMutex`
Expensive one-time setup	`sync.Once`
Many short-lived buffers	`sync.Pool`
Bounding goroutine count	Worker pool (channels + N workers)
First error short-circuits a group	`errgroup`

The rest of this section walks through each of those, in roughly that order.

Worker Pool

A worker pool fixes the number of goroutines that handle a stream of tasks. You launch N workers up front, feed jobs into a channel, and each worker pulls from that channel until it closes. This caps both memory and CPU use, which matters once your traffic is unpredictable. Spawning one goroutine per request works fine at low scale, but at high scale it can balloon memory and overload downstream services.

A minimal sketch:
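One way such a pool might look, with three workers and five orders. The order IDs and the processOrders helper are illustrative assumptions, not a fixed API.

```go
package main

import (
	"fmt"
	"sync"
)

// processOrders starts `workers` goroutines that all receive from one jobs
// channel. It returns how many orders each worker handled.
func processOrders(workers int, orders []int) []int {
	jobs := make(chan int)
	handled := make([]int, workers) // each worker writes only its own slot
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for order := range jobs { // loop exits once jobs is closed and drained
				fmt.Printf("worker %d processing order %d\n", id, order)
				handled[id]++
			}
		}(w)
	}

	for _, o := range orders {
		jobs <- o // blocks until some worker is free to receive
	}
	close(jobs) // signal that no more work is coming
	wg.Wait()
	return handled
}

func main() {
	handled := processOrders(3, []int{101, 102, 103, 104, 105})
	fmt.Println("orders per worker:", handled)
}
```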

Three workers, five orders, work distributed by whichever worker is free when a job is sent. The close(jobs) is what tells the workers' range loops to exit. A full worker pool implementation also collects results, propagates errors, and picks a pool size based on the workload.

Pipeline

A pipeline strings together stages of goroutines, each connected by a channel. Stage 1 reads input, transforms it, and sends the result to stage 2. Stage 2 reads from stage 1, transforms again, sends to stage 3. The shape lets each stage scale independently, and the channels carry both data and backpressure.
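A two-stage sketch of that shape follows. The stage names (applyDiscount, addTax), the 10% discount, and the 8% tax are assumptions chosen only to have something to transform.

```go
package main

import "fmt"

// applyDiscount is stage 1: it takes 10% off each price (in cents).
func applyDiscount(in <-chan int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out) // close our output once the input drains
		for cents := range in {
			out <- cents * 90 / 100
		}
	}()
	return out
}

// addTax is stage 2: it adds 8% tax to each discounted price.
func addTax(in <-chan int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for cents := range in {
			out <- cents + cents*8/100
		}
	}()
	return out
}

// runPipeline feeds prices into the two stages and collects the results.
func runPipeline(prices []int) []int {
	src := make(chan int, len(prices))
	for _, p := range prices {
		src <- p
	}
	close(src)

	var results []int
	for v := range addTax(applyDiscount(src)) {
		results = append(results, v)
	}
	return results
}

func main() {
	fmt.Println(runPipeline([]int{1000, 2500, 400}))
}
```

Because the stage channels are unbuffered, a slow second stage makes the first stage wait on its send, which is the backpressure mentioned above.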

Two stages. Each stage owns one goroutine and one output channel, and each closes its output when its input drains. Pipelines compose well because the only contract between stages is "I'm a <-chan T". Real pipelines also handle errors, cancel stuck stages, and decide where buffers help.

sync.Mutex and sync.RWMutex

When multiple goroutines need to read or write the same variable, a sync.Mutex is the simplest way to keep the access safe. Only one goroutine can hold the lock at a time. sync.RWMutex is the read-heavy variant: any number of readers can hold the lock at once, but a writer takes it exclusively.
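As a sketch, here is a map of sales totals guarded by a sync.Mutex. The salesLedger type and the product name are hypothetical, chosen to fit the store examples used in this section.

```go
package main

import (
	"fmt"
	"sync"
)

// salesLedger keeps per-product totals (in cents). The mutex guards the map.
type salesLedger struct {
	mu     sync.Mutex
	totals map[string]int
}

// add records one sale. defer releases the lock even if the body panics.
func (l *salesLedger) add(product string, cents int) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.totals[product] += cents
}

// total reads one product's running total under the same lock.
func (l *salesLedger) total(product string) int {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.totals[product]
}

// recordSales records n concurrent sales of the same product.
func recordSales(n int) int {
	ledger := &salesLedger{totals: make(map[string]int)}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			ledger.add("coffee", 350)
		}()
	}
	wg.Wait()
	return ledger.total("coffee")
}

func main() {
	fmt.Println("coffee total (cents):", recordSales(100))
}
```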

Without the mutex, the concurrent map writes would either produce wrong totals or trip the runtime's concurrent map access check, which aborts the program with "fatal error: concurrent map writes". The mutex chapter covers when to use Mutex vs RWMutex, how defer mu.Unlock() saves you from lock leaks on panic, and the cost of contention.

Atomic Operations

A mutex is general but not free. For the common case of "many goroutines incrementing or comparing a single integer", the sync/atomic package gives you a much cheaper alternative: a CPU instruction that performs the read-modify-write as a single uninterruptible step. No lock, no goroutine parking.
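A minimal counter sketch using the atomic.Int64 type, which assumes Go 1.19 or later; the countVisits name is made up for this example.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countVisits increments one shared counter from n goroutines.
// atomic.Int64 makes each Add a single indivisible read-modify-write,
// so no mutex is needed.
func countVisits(n int) int64 {
	var visits atomic.Int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			visits.Add(1)
		}()
	}
	wg.Wait()
	return visits.Load()
}

func main() {
	fmt.Println("visits:", countVisits(1000))
}
```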

The catch: atomics only work on simple values (integers, pointers, booleans). The moment you need to update two related fields together, or modify a slice or map, you need a mutex. The full atomic API includes the atomic.Int64 and atomic.Pointer[T] types added in Go 1.19, plus compare-and-swap operations (the CompareAndSwap methods on those types, or functions such as atomic.CompareAndSwapInt64) for lock-free updates. These can outperform a mutex on some hot paths, though that is worth measuring rather than assuming.

sync.Once

Sometimes you want to run a piece of setup code exactly once, no matter how many goroutines reach it. Lazy initialization of a database client, parsing a config file, computing a cached value. sync.Once makes this trivial: you give it a function, it runs the function the first time Do is called, and every later call returns immediately without re-running.
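A sketch of lazy config loading with sync.Once. The config contents and the parseCount bookkeeping variable are assumptions added so the effect is visible.

```go
package main

import (
	"fmt"
	"sync"
)

var (
	configOnce sync.Once
	config     map[string]string
	parseCount int // written only inside the Once, so it needs no extra lock
)

// loadConfig returns the shared config, parsing it on the first call only.
// Callers that arrive while the first call is still running block until it
// finishes, then see the initialized value.
func loadConfig() map[string]string {
	configOnce.Do(func() {
		fmt.Println("parsing config file")
		parseCount++
		config = map[string]string{"currency": "USD"}
	})
	return config
}

// loadFromGoroutines calls loadConfig from n goroutines at once.
func loadFromGoroutines(n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = loadConfig()
		}()
	}
	wg.Wait()
}

func main() {
	loadFromGoroutines(3)
	fmt.Println("times parsed:", parseCount)
}
```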

The "parsing config file" line prints exactly once even though three goroutines call loadConfig concurrently. Without sync.Once, you'd need a mutex plus a done flag, and you'd have to write the double-checked locking pattern correctly. sync.Once saves you that. Go 1.21 also added sync.OnceFunc and sync.OnceValue for the common "compute once, return many times" cases.

sync.Pool

When your program allocates the same kind of buffer or struct over and over in a hot path, the garbage collector ends up doing a lot of unnecessary work. sync.Pool lets you recycle short-lived objects: take one out when you need it, put it back when you're done, and the runtime keeps a per-core cache of available items.
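A sketch of a pooled bytes.Buffer used to format a receipt line. The receipt format and the formatReceipt signature are assumptions for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable buffers. New runs only when the pool is empty.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// formatReceipt borrows a buffer, writes one receipt line into it, and
// returns the buffer to the pool when done.
func formatReceipt(orderID, cents int) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()            // a recycled buffer may still hold old bytes
	defer bufPool.Put(buf) // hand the buffer back for the next caller

	fmt.Fprintf(buf, "order #%d: $%d.%02d", orderID, cents/100, cents%100)
	return buf.String() // String copies the bytes, so reuse of buf is safe
}

func main() {
	fmt.Println(formatReceipt(42, 1999))
	fmt.Println(formatReceipt(43, 505))
}
```

The Reset call matters: a buffer coming out of the pool carries whatever the previous borrower left in it.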

Each call to formatReceipt borrows a buffer from the pool, writes into it, returns the string, and puts the buffer back. Under load, the pool keeps a few buffers warm per core and skips the allocator on most calls. The catch is that sync.Pool items can be reclaimed by the garbage collector between uses, so it only works for caches you can rebuild. The pool chapter explains where the technique helps and where it hurts.

errgroup

golang.org/x/sync/errgroup lives outside the standard library but is maintained by the Go team, and it is the de facto companion to sync.WaitGroup when each goroutine can return an error. When the group is created with errgroup.WithContext, the first error cancels the shared context.Context, and Wait returns that first error to the caller. It's what most production code reaches for when running a fan-out of API calls or database queries.
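To keep the example runnable with only the standard library, the sketch below uses a hypothetical miniGroup type that imitates the behavior described above; it is not the real package. In real code you would import golang.org/x/sync/errgroup and call errgroup.WithContext, g.Go, and g.Wait, which have the same shape. The three checkout calls are also made up.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// miniGroup is an illustrative stand-in for errgroup.Group: a WaitGroup
// plus first-error capture plus context cancellation.
type miniGroup struct {
	wg      sync.WaitGroup
	errOnce sync.Once
	err     error
	cancel  context.CancelFunc
}

// withContext mirrors the shape of errgroup.WithContext.
func withContext(parent context.Context) (*miniGroup, context.Context) {
	ctx, cancel := context.WithCancel(parent)
	return &miniGroup{cancel: cancel}, ctx
}

// Go runs f in a goroutine; the first non-nil error is kept and cancels ctx.
func (g *miniGroup) Go(f func() error) {
	g.wg.Add(1)
	go func() {
		defer g.wg.Done()
		if err := f(); err != nil {
			g.errOnce.Do(func() {
				g.err = err
				g.cancel()
			})
		}
	}()
}

// Wait blocks until every function has returned, then reports the first error.
func (g *miniGroup) Wait() error {
	g.wg.Wait()
	g.cancel() // release context resources even on success
	return g.err
}

var errFraud = errors.New("fraud check failed")

// checkout fans out three calls; the failing one cancels the slow one.
func checkout(ctx context.Context) error {
	g, ctx := withContext(ctx)
	g.Go(func() error { return nil })      // inventory lookup succeeds
	g.Go(func() error { return errFraud }) // fraud check fails
	g.Go(func() error {
		<-ctx.Done() // a slow call that stops as soon as the group is cancelled
		return ctx.Err()
	})
	return g.Wait()
}

// runCheckout runs checkout with a background context.
func runCheckout() error { return checkout(context.Background()) }

func main() {
	fmt.Println("checkout error:", runCheckout())
}
```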

If you've used sync.WaitGroup, the API will look familiar. g.Go(func() error { ... }) is like wg.Add(1) followed by go func() with a deferred wg.Done(), and g.Wait() is like wg.Wait() except it also returns the first non-nil error. The chapter on errgroup covers SetLimit (which turns it into a bounded worker pool) and how errgroup.WithContext interacts with cancellation.

Race Detector

Go ships with a built-in race detector that you enable with the -race flag (go run -race, go test -race, go build -race). It instruments memory accesses and reports cases where two goroutines access the same address without synchronization and at least one of the accesses is a write. It only sees races on code paths that actually run, so test coverage matters. This is the single most useful tool in the Go concurrency toolbox, and the standard advice is: run all your concurrent tests with -race in CI.
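A deliberately broken counter makes a good first target for the detector. In this sketch the racyCount helper is a hypothetical name, and the unsynchronized count++ is the bug on purpose.

```go
package main

import (
	"fmt"
	"sync"
)

// racyCount increments a plain int from n goroutines with no
// synchronization. This is intentionally wrong: count++ is a read, an add,
// and a write, and concurrent goroutines can overwrite each other's result.
func racyCount(n int) int {
	count := 0
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			count++ // DATA RACE: unsynchronized read-modify-write
		}()
	}
	wg.Wait()
	return count
}

func main() {
	fmt.Println("count:", racyCount(1000))
}
```

Guarding the increment with a sync.Mutex, or switching the counter to an atomic.Int64, removes the race.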

Run this with go run -race main.go and you'll see a WARNING: DATA RACE report pointing at the count++ line. Run it without -race and the program will probably print a number less than 1000 (because the increments overlap and overwrite each other), but it won't tell you why. The race detector chapter walks through reading a race report, fixing the bug, and the runtime cost of leaving -race on in tests.

Cost: Programs built with -race typically run 2x to 20x slower and can use 5x to 10x more memory. Use it in tests and during local development, but ship a non-race binary to production.

Choosing the Right Tool

Here's the section's playbook, condensed into one table. Every row except the last points to a chapter in this section.

Need	Best Tool	Why
Run N tasks, cap goroutine count	Worker pool	Predictable memory and CPU use under load
Chain transformation stages	Pipeline	Each stage scales and backpressures independently
Protect a struct or map across goroutines	`sync.Mutex`	Simple, general, easy to reason about
Read-mostly shared data	`sync.RWMutex`	Many concurrent readers, occasional writer
Increment a counter or swap a pointer	`sync/atomic`	One CPU instruction, no goroutine parking
Run setup code exactly once	`sync.Once`	Lazy init that's safe under concurrent callers
Recycle short-lived buffers	`sync.Pool`	Cuts allocations in hot paths
Fan-out work, first error wins	`errgroup`	`WaitGroup` plus error propagation plus context cancel
Find unsynchronized memory access	`-race` flag	Detects races at runtime on the code paths your tests actually execute
Cancel work or set a deadline	`context.Context`	Cancellation flows through a call tree, channels don't

A common shape in real Go services pulls several of these together. A request handler creates an errgroup with a context.Context. The group runs three calls in parallel: an inventory lookup behind a sync.RWMutex-protected cache, a price calculation, and a fraud check that uses an atomic.Int64 to track a rate-limit counter. Buffers for response serialization come from a sync.Pool. The whole thing runs under -race in CI. No single primitive does the job; the patterns are how you compose them.

How the Chapters Are Ordered

The rest of this section follows a deliberate progression, introducing each tool in roughly the order it appears in a typical Go codebase.

Worker Pools. Bounded goroutines, jobs over a channel.
Pipelines. Multi-stage processing with channel-based handoffs.
sync.Mutex and sync.RWMutex. Shared-memory primitives for when channels don't fit.
Atomic Operations. Lock-free counters and pointers via sync/atomic.
sync.Once. Run-once setup, including the Go 1.21 OnceFunc and OnceValue helpers.
sync.Pool. Object recycling for hot allocation paths.
errgroup. Error-aware goroutine groups.
Race Detector. The -race flag and how to read its reports.
Concurrency Best Practices. A wrap-up of the section: when to pick what, common bugs, and how to keep concurrent code understandable.

Each chapter goes deep on one tool, covering the vocabulary and trade-offs needed to pick the right one for a given problem.
