Last Updated: February 5, 2026
A mutex (mutual exclusion) is the most fundamental synchronization primitive. It ensures that only one thread can access a critical section at a time. Once you understand mutexes, you have the building block for almost every other synchronization mechanism.
This chapter covers what a mutex is, how mutexes work, and how to use them correctly.
Think of a bank with a single key to a private safety deposit box room. When a customer arrives, they take the key from the front desk, enter the room, and do whatever they need with the boxes inside. If another customer arrives while the first is inside, they have to wait at the desk until the key is returned.
The crucial detail: the person who took the key must be the one to return it. You cannot send someone else to hand it back.
In programming terms, a mutex is a synchronization primitive that provides mutual exclusion. When a thread acquires (locks) a mutex, any other thread that tries to acquire the same mutex will block until the first thread releases (unlocks) it. The thread that locks must be the one to unlock.
The diagram above captures the lifecycle of a mutex. It starts in the Unlocked state (green), meaning no thread currently holds it. When Thread A calls lock(), the mutex transitions to Locked (red). Now if Thread B tries to acquire it, Thread B cannot proceed. Instead, it enters a Waiting state (orange), blocked until the mutex becomes available again.
Once Thread A calls unlock(), the mutex releases and Thread B can finally acquire it, transitioning back to the Locked state with a new owner. This cycle repeats as threads take turns accessing the protected resource.
The key properties of a mutex:
- Mutual exclusion: at most one thread can hold the mutex at any time.
- Ownership: the thread that locked the mutex must be the one to unlock it.
- Blocking: a thread that tries to acquire a held mutex waits until it is released.
A critical section is any piece of code that reads or writes shared state and must not be executed by more than one thread at the same time. If two threads enter that section together, you can end up with race conditions, where the final result depends on timing rather than logic.
A classic example is incrementing a shared counter.
Imagine counter = 0, and two threads execute counter++ at the same time. Even though counter++ looks like a single operation, it is actually a sequence of steps:
1. Read the current value of counter from memory.
2. Add one to that value.
3. Write the new value back to counter.
Now watch what can go wrong:
- Thread A reads counter and sees 0.
- Thread B reads counter and sees 0.
- Thread A adds one and writes back 1.
- Thread B adds one and writes back 1.
Two increments ran, but counter ends at 1 instead of 2.
Both threads started from the same old value, did correct math, and still produced the wrong result because they overwrote each other’s update. This is called a lost update, and it is one of the most common concurrency bugs.
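Here is a minimal Java demo that makes the lost update visible (the class name and loop counts are illustrative, not from the chapter's own example):

```java
public class LostUpdateDemo {
    static int counter = 0;

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < 1_000_000; i++) {
                counter++; // unsynchronized read-modify-write
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        a.join();
        b.join();
        // Almost always prints less than 2000000 because of lost updates.
        System.out.println("counter = " + counter);
    }
}
```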
A mutex (mutual exclusion lock) prevents this by ensuring only one thread can execute the critical section at a time. If Thread A holds the lock, Thread B must wait until Thread A exits and releases it. That turns the critical section into an atomic, safe operation from the perspective of other threads.
Let us now look at how a mutex works under the hood.
A mutex (mutual exclusion lock) is the simplest and most common way to protect a critical section. The idea is straightforward: at most one thread can hold the lock at a time. Everyone else must wait.
Here’s what happens when a thread wants to enter a critical section:
1. Check whether the mutex is available.
2. If it is, atomically mark it as taken and proceed into the critical section.
3. If it is not, join the wait queue and block until the holder releases the mutex.
After the critical section completes, the thread releases the mutex. This is where the wake-up mechanism kicks in: if any threads are waiting, one of them gets notified. That woken thread loops back to the availability check and attempts acquisition again.
Note that it might compete with other threads that arrived in the meantime, so waking up does not guarantee immediate acquisition, just another chance to try.
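To make the acquire/release protocol concrete, here is a toy Java lock built on a single atomic flag. It busy-waits instead of parking threads in a queue, so treat it as a sketch of the idea rather than how a production mutex is implemented:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Toy mutex: one atomic flag, busy-waiting instead of a real wait queue.
class SpinMutex {
    private final AtomicBoolean locked = new AtomicBoolean(false);

    public void lock() {
        // Atomically flip false -> true; retry while another thread holds it.
        while (!locked.compareAndSet(false, true)) {
            Thread.onSpinWait(); // hint to the runtime that we are busy-waiting
        }
    }

    public void unlock() {
        locked.set(false); // release; a spinning thread's CAS can now succeed
    }
}
```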
| Operation | Description | Blocking? |
|---|---|---|
| lock() / acquire() | Acquire the mutex; block if unavailable | Yes |
| unlock() / release() | Release the mutex; wake waiting threads | No |
| tryLock() / try_lock() | Try to acquire; return immediately with success/failure | No |
| tryLock(timeout) | Try to acquire with timeout; return success/failure | Yes (up to timeout) |
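As an example of the non-blocking and timed variants, here is a small sketch using Java's ReentrantLock (the wrapper class and method names are illustrative):

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class TimedUpdate {
    private final ReentrantLock lock = new ReentrantLock();
    private int value = 0;

    // Wait up to 100 ms for the lock instead of blocking indefinitely.
    boolean tryIncrement() throws InterruptedException {
        if (lock.tryLock(100, TimeUnit.MILLISECONDS)) {
            try {
                value++;
                return true;
            } finally {
                lock.unlock();
            }
        }
        return false; // could not acquire in time; caller picks a fallback
    }
}
```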
- Thread 1 calls lock() → acquires the mutex → enters critical section
- Thread 2 calls lock() → mutex is taken → Thread 2 blocks and waits
- Thread 1 calls unlock() → releases the mutex → wakes a waiter

What you get is serialized access: even if both threads are runnable and the machine has multiple cores, only one thread can be inside the protected code at a time. Only when Thread 1 calls unlock() does Thread 2 finally get to acquire the mutex and enter its own critical section.
The implementation patterns here apply whether you are protecting a simple counter or a complex data structure.
In Java, you typically use one of two approaches:
- synchronized (intrinsic monitor lock, built into the language)
- Lock implementations like ReentrantLock (explicit locks from java.util.concurrent.locks)

Both provide mutual exclusion. The difference is mostly about simplicity vs. control.
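A minimal sketch of both, with synchronized shown as both a method modifier and a block (class and method names are illustrative):

```java
import java.util.concurrent.locks.ReentrantLock;

// In real code you would pick one mechanism; they are shown side by side here.
class SafeCounter {
    private int count = 0;

    // Approach 1: explicit ReentrantLock.
    private final ReentrantLock lock = new ReentrantLock();

    void incrementWithLock() {
        lock.lock(); // acquire BEFORE the try; release in finally
        try {
            count++;
        } finally {
            lock.unlock();
        }
    }

    // Approach 2: synchronized method (locks `this`).
    synchronized void incrementSynchronizedMethod() {
        count++;
    }

    // Approach 3: synchronized block on a dedicated monitor object.
    private final Object monitor = new Object();

    void incrementSynchronizedBlock() {
        synchronized (monitor) {
            count++;
        }
    }
}
```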
The try-finally pattern is non-negotiable when using explicit locks. Notice that lock.lock() comes before the try block. If you put it inside, an exception during locking would trigger the finally block and try to unlock a lock you never acquired.
The synchronized keyword handles this automatically, which is why Approaches 2 and 3 look cleaner. The trade-off: ReentrantLock gives you features like tryLock() and fairness policies that synchronized cannot provide.
Let us put everything together with a complete, runnable example. We will create 10 threads, each incrementing a shared counter 1,000 times. Without a mutex, the final count would be some random number less than 10,000 due to lost updates. With the mutex, we get exactly 10,000 every single time.
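A sketch of that program, assuming a ReentrantLock as the mutex (class and variable names are illustrative):

```java
import java.util.concurrent.locks.ReentrantLock;

public class MutexDemo {
    private static final ReentrantLock lock = new ReentrantLock();
    private static int counter = 0;

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[10];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 1_000; j++) {
                    lock.lock();
                    try {
                        counter++; // the protected critical section
                    } finally {
                        lock.unlock();
                    }
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join(); // wait for every worker before reading the result
        }
        System.out.println("Final count: " + counter); // always 10000
    }
}
```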
Without the mutex, this number would be unpredictable, often landing somewhere between 5,000 and 9,000 depending on timing. The mutex ensures that each increment operation, the read-modify-write sequence, completes atomically from the perspective of other threads. Ten threads times one thousand increments equals exactly ten thousand, guaranteed.
The join() calls are equally important. They make the main thread wait until all worker threads have completed before printing the final count. Without joins, you might print the count while threads are still incrementing.
Mutexes are not free. Every lock() and unlock() adds overhead, and that overhead can explode when many threads compete for the same lock.
The goal is not to avoid locks at all costs. The goal is to understand when locking is “cheap enough” and when contention turns it into a bottleneck.
A mutex is fast when no one is competing. In that case, lock acquisition is often just a few atomic instructions.
Once there is contention, things change. The runtime may need to park threads, manage wait queues, and involve the OS scheduler.
| Operation | Typical Cost | Notes |
|---|---|---|
| Uncontended lock/unlock | 10-100 ns | Fast atomic operations |
| Contended lock (context switch) | 1-10 μs | OS scheduler involved |
| Contended lock (many waiters) | 10-100 μs | Queue management, fairness |
Treat these numbers as ballpark. The exact values vary by OS, CPU, JVM/runtime, and contention pattern. The trend is what matters: contention can be 100–1000× more expensive than an uncontended lock.
Contention occurs when multiple threads compete for the same lock. High contention causes:
- Threads spending more time blocked than doing useful work.
- Extra context switches as the OS parks and wakes waiters.
- Throughput that stops scaling with core count, because the lock serializes everything behind it.
Consider a web server with 100 threads, all accessing a shared cache protected by a single mutex. If each request needs the cache for 1ms, and requests arrive faster than 1 per millisecond, threads start queueing. With high load, 99 threads might be blocked waiting while 1 thread works.
Your system may have many cores, but this hotspot makes it behave like a single-core pipeline at the lock boundary.
This is why high-throughput systems often use more sophisticated approaches: sharded locks, lock-free data structures, or thread-local caching.
1. Minimize critical section size: Only lock what's necessary
This single change often gives the biggest win because it reduces lock hold time, which reduces contention.
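For example, here is a sketch of moving expensive work out of the critical section (the cache and the computation are hypothetical stand-ins):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

class ResultCache {
    private final Map<String, String> cache = new HashMap<>();
    private final ReentrantLock lock = new ReentrantLock();

    // BAD: the expensive computation runs while holding the lock,
    // so every other thread queues behind it.
    String computeAndStoreSlow(String key) {
        lock.lock();
        try {
            String value = expensiveComputation(key);
            cache.put(key, value);
            return value;
        } finally {
            lock.unlock();
        }
    }

    // BETTER: compute outside the lock; hold it only for the map update.
    String computeAndStoreFast(String key) {
        String value = expensiveComputation(key);
        lock.lock();
        try {
            cache.put(key, value);
            return value;
        } finally {
            lock.unlock();
        }
    }

    private String expensiveComputation(String key) {
        return key.toUpperCase(); // stand-in for real work
    }
}
```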
2. Use finer-grained locks: If two operations do not touch the same data, they should not fight over the same lock.
This reduces unnecessary blocking and improves throughput, at the cost of more complexity. Once you have multiple locks, you must also think about lock ordering to avoid deadlocks.
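A sketch of the idea, assuming server statistics and configuration that never touch each other's data (names are illustrative):

```java
import java.util.concurrent.locks.ReentrantLock;

class Server {
    private final ReentrantLock statsLock = new ReentrantLock();
    private final ReentrantLock configLock = new ReentrantLock();
    private long requestCount = 0;
    private String configValue = "default";

    void recordRequest() {
        statsLock.lock();
        try {
            requestCount++;
        } finally {
            statsLock.unlock();
        }
    }

    void updateConfig(String value) {
        configLock.lock(); // does not block recordRequest()
        try {
            configValue = value;
        } finally {
            configLock.unlock();
        }
    }
}
```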
3. Prefer atomics for simple counters: For small, well-defined operations like incrementing a counter, atomic types often beat a mutex because they avoid blocking entirely.
Atomics are not “free” either. They still use atomic instructions and memory ordering rules. But for simple patterns, they can be a clean upgrade.
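For instance, a lock-free counter sketch using java.util.concurrent.atomic:

```java
import java.util.concurrent.atomic.AtomicLong;

// AtomicLong turns lock-protected counter++ into a single hardware
// atomic increment; no thread ever blocks.
class HitCounter {
    private final AtomicLong hits = new AtomicLong();

    void record() {
        hits.incrementAndGet();
    }

    long total() {
        return hits.get();
    }
}
```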
4. Use read-write locks when reads dominate: If your workload is “many readers, few writers,” a ReadWriteLock can improve concurrency by letting readers proceed in parallel.
This helps only when:
- reads vastly outnumber writes, and
- read operations hold the lock long enough that letting readers run in parallel outweighs the read-write lock's extra overhead.
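A minimal read-write lock sketch using Java's ReentrantReadWriteLock (class and field names are illustrative):

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

class SharedConfig {
    private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
    private String value = "default";

    String read() {
        rw.readLock().lock();
        try {
            return value; // many readers can hold the read lock at once
        } finally {
            rw.readLock().unlock();
        }
    }

    void write(String newValue) {
        rw.writeLock().lock(); // exclusive: blocks readers and writers
        try {
            value = newValue;
        } finally {
            rw.writeLock().unlock();
        }
    }
}
```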
What is a mutex?
A mutex (mutual exclusion lock) is a synchronization primitive that guarantees at most one thread at a time can execute a critical section. The thread that locks it must be the one to unlock it, and any other thread that tries to acquire it blocks until it is released.