Design Thread-Safe Blocking Queue

Last Updated: February 1, 2026

Ashish Pratap Singh

The blocking queue provides two guarantees that regular queues cannot: producers block when the queue is full (backpressure), and consumers block when the queue is empty (no busy waiting).

Combined with thread safety, this makes blocking queues the backbone of producer-consumer systems, from message brokers to thread pools to request buffering. Every Java ExecutorService uses a blocking queue internally. Understanding how to build one exposes the core techniques of concurrent data structure design.

Problem Statement

What We're Building

A bounded, thread-safe queue that blocks producers when full and blocks consumers when empty. The capacity limit provides backpressure, preventing memory exhaustion. The blocking semantics eliminate busy waiting and coordinate producers with consumers naturally.

Required Operations

| Operation | Description | Expected Complexity |
|---|---|---|
| put(item) | Insert item, block if queue is full | O(1) amortized |
| take() | Remove and return head, block if empty | O(1) |
| offer(item) | Insert if space available, return false if full | O(1) |
| poll() | Remove head if available, return null if empty | O(1) |
| peek() | View head without removing, null if empty | O(1) |
| size() | Current number of elements | O(1) |

Thread-Safety Requirements

  • Multiple producer threads can call put() simultaneously without losing items
  • Multiple consumer threads can call take() simultaneously without duplicate consumption
  • A single item is delivered to exactly one consumer (no duplicates, no losses)
  • Blocking operations must respond to interruption (throw InterruptedException)
  • The queue must never enter an invalid state regardless of thread interleaving

Data Structure Fundamentals

Before adding concurrency, we need a queue that efficiently uses bounded memory. A circular buffer is the standard choice: a fixed-size array where head and tail wrap around when they reach the end.

Core Concepts

  1. Circular Array: Instead of shifting elements on dequeue (O(n)), we maintain head and tail indices that wrap around modulo capacity.
  2. Head and Tail Pointers: Head points to the next element to dequeue. Tail points to the next empty slot for enqueue. When tail reaches the end, it wraps to index 0.
  3. Full vs Empty Distinction: When head equals tail, the queue could be empty or full. We track count separately to distinguish these states.
  4. Capacity vs Size: Capacity is the fixed maximum (set at construction). Size is the current element count (0 to capacity).

Single-Threaded Implementation

Internal Structure

Consider a capacity-6 buffer holding elements A, B, C at indices 0, 1, 2. The head pointer at 0 means the next poll() returns A. The tail pointer at 3 means the next offer() places an item at index 3. When tail reaches the last index (5) and another item is added, tail wraps back to 0 (assuming dequeues have freed space). This wrap-around reuses memory without shifting elements.
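
A minimal single-threaded sketch of this structure (class and method names are my own; no synchronization yet):

```java
// Single-threaded bounded circular buffer. Not thread-safe:
// this is the baseline we add locking to in the next sections.
public class CircularBuffer<E> {
    private final Object[] buffer;
    private int head = 0;   // index of the next element to dequeue
    private int tail = 0;   // index of the next empty slot to enqueue into
    private int count = 0;  // distinguishes empty (0) from full (capacity)

    public CircularBuffer(int capacity) {
        buffer = new Object[capacity];
    }

    /** Inserts if space is available; returns false if full. */
    public boolean offer(E item) {
        if (count == buffer.length) return false;   // full
        buffer[tail] = item;
        tail = (tail + 1) % buffer.length;          // wrap around
        count++;
        return true;
    }

    /** Removes and returns the head, or null if empty. */
    @SuppressWarnings("unchecked")
    public E poll() {
        if (count == 0) return null;                // empty
        E item = (E) buffer[head];
        buffer[head] = null;                        // drop reference for GC
        head = (head + 1) % buffer.length;          // wrap around
        count--;
        return item;
    }

    public int size() { return count; }
}
```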

Concurrency Challenges

Adding thread safety to our circular buffer reveals several subtle problems. Understanding these challenges is essential for choosing the right synchronization strategy.

Challenge 1: Lost Wakeups

The most insidious bug in blocking queue implementations. A producer signals "not empty" after adding an item, but the consumer hasn't started waiting yet. The signal is lost. The consumer then waits forever, even though the queue has items.

The consumer checks the condition, finds it empty, but before it can wait, the producer adds an item and signals. The signal goes nowhere because no thread is waiting yet. The consumer then waits on a condition that's already satisfied.
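
Here is a sketch of the broken pattern, with the check and the wait split across the race window (class and field names are hypothetical):

```java
import java.util.ArrayDeque;
import java.util.Queue;

// BROKEN sketch: the empty-check and the wait are not performed
// atomically under the same lock.
class LostWakeup {
    private final Object lock = new Object();
    private final Queue<Integer> queue = new ArrayDeque<>();

    void brokenTake() throws InterruptedException {
        if (queue.isEmpty()) {       // 1. consumer sees an empty queue
            // 2. a producer can run here: enqueue an item, then call
            //    lock.notify() -- nobody is waiting yet, so the signal is lost
            synchronized (lock) {
                lock.wait();         // 3. waits for a signal that already fired
            }
        }
        // The fix: perform the check AND the wait while holding the lock,
        // and re-check in a loop (see the next challenge).
    }
}
```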

Challenge 2: Spurious Wakeups

A thread waiting on a condition variable can wake up even when no other thread signaled it. This is a documented behavior of most threading implementations (POSIX pthreads, Java, etc.). If the code assumes every wakeup means the condition is satisfied, it will dequeue from an empty queue.
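
The standard defense is to re-check the predicate in a loop around every wait. A minimal sketch with ReentrantLock and Condition (field names are illustrative):

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Sketch of the wait-in-a-loop idiom.
class WaitLoop {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notEmpty = lock.newCondition();
    private int count = 0;

    void awaitItem() throws InterruptedException {
        lock.lock();
        try {
            // A while-loop, never an if: on every wakeup, re-check the
            // predicate. This handles spurious wakeups, and also a rival
            // thread grabbing the item between the signal and this thread
            // re-acquiring the lock.
            while (count == 0) {
                notEmpty.await();
            }
            // count > 0 is guaranteed here
        } finally {
            lock.unlock();
        }
    }
}
```

The same loop resolves Challenge 3 below: a thread that wakes up but loses the race simply re-checks the predicate and waits again.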

Challenge 3: Producer-Consumer Race

Multiple producers competing for the last slot, or multiple consumers competing for the last item, can cause races if signaling isn't handled correctly.

Both producers wake up when a slot opens, but only one can use it. The other must go back to waiting. This is why the wait must be in a loop, not an if statement.

Challenge 4: Signaling Under Lock

Should you signal while holding the lock or after releasing it? Signaling while holding the lock is safer (no lost signals) but may cause "hurry up and wait": the signaled thread wakes up only to block immediately on the lock the signaler still holds. In Java the choice is largely made for you: Object.notify() and ReentrantLock's Condition.signal() must be called while holding the lock; signaling after release is an option in pthreads-style APIs.

Consistency Model Requirements

| Property | Requirement | Why It Matters |
|---|---|---|
| Atomicity | put/take must be atomic | Prevents partial updates, duplicate items |
| FIFO Order | Items dequeue in enqueue order | Expected queue semantics |
| Blocking | Block rather than busy-wait | CPU efficiency, battery life |
| Liveness | Every waiting thread eventually proceeds | No indefinite blocking when items/space exist |

Approach 1: Single Lock with Conditions

The straightforward solution: one lock protects all queue state, with two condition variables for blocking semantics. Producers wait on "not full" and signal "not empty". Consumers wait on "not empty" and signal "not full".

Implementation
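
A condensed sketch of this approach over the circular buffer from earlier (the class name is my own; the full version with offer/poll/peek appears in the Complete Implementation section):

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Single-lock blocking queue: one ReentrantLock guards all state,
// two Conditions provide the blocking semantics.
public class SingleLockBlockingQueue<E> {
    private final Object[] buffer;
    private int head = 0, tail = 0, count = 0;

    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notFull = lock.newCondition();   // producers wait here
    private final Condition notEmpty = lock.newCondition();  // consumers wait here

    public SingleLockBlockingQueue(int capacity) {
        buffer = new Object[capacity];
    }

    /** Inserts the item, blocking while the queue is full. */
    public void put(E item) throws InterruptedException {
        lock.lockInterruptibly();                  // respond to interruption
        try {
            while (count == buffer.length) {       // loop: spurious wakeups, races
                notFull.await();
            }
            buffer[tail] = item;
            tail = (tail + 1) % buffer.length;
            count++;
            notEmpty.signal();                     // wake one waiting consumer
        } finally {
            lock.unlock();
        }
    }

    /** Removes and returns the head, blocking while the queue is empty. */
    @SuppressWarnings("unchecked")
    public E take() throws InterruptedException {
        lock.lockInterruptibly();
        try {
            while (count == 0) {
                notEmpty.await();
            }
            E item = (E) buffer[head];
            buffer[head] = null;                   // drop reference for GC
            head = (head + 1) % buffer.length;
            count--;
            notFull.signal();                      // wake one waiting producer
            return item;
        } finally {
            lock.unlock();
        }
    }

    public int size() {
        lock.lock();
        try { return count; } finally { lock.unlock(); }
    }
}
```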

Analysis

| Property | Status | Explanation |
|---|---|---|
| Thread-safe | Yes | Single lock serializes all access |
| Deadlock-free | Yes | Only one lock, no circular wait |
| Starvation-free | Yes (with fair lock) | FIFO lock acquisition |
| Scalability | Limited | Producers block consumers and vice versa |

Single Lock Bottleneck

The fundamental limitation: every producer and consumer thread contends for the same lock. A producer inserting at the tail blocks a consumer removing from the head, even though they're accessing different parts of the queue.

Pros

  • Simple and correct
  • Easy to reason about
  • Good for low-throughput scenarios
  • Compound operations naturally atomic

Cons

  • Producers and consumers contend for same lock
  • No concurrent put + take
  • Lock acquisition overhead on every operation
  • Potential throughput bottleneck

When to Use:

  • Small buffer sizes
  • Low concurrency (few producers/consumers)
  • Simplicity is more important than throughput
  • As correctness baseline for testing

Approach 2: Two-Lock Queue

The key insight: in a queue, producers only touch the tail and consumers only touch the head. If we use separate locks for head and tail, a producer can insert while a consumer removes simultaneously. This is the approach used by Java's LinkedBlockingQueue.

Strategy: Separate Head and Tail Locks

The consumer acquires the take lock and removes from head. The producer acquires the put lock and appends at tail. They operate simultaneously because they hold different locks.

The Tricky Part: Tracking Count

With two locks, we can't atomically read and update count under a single lock. We need an AtomicInteger. But signaling gets complex: after a put, how do we signal waiting consumers if we only hold the put lock and the notEmpty condition belongs to the take lock?

Solution: Cascading Signals

When count goes from 0 to 1, a put operation must signal consumers. But it holds the put lock, not the take lock. We use a two-step approach:

  1. Put acquires put lock, inserts, atomically increments count
  2. If count was 0 before increment (queue was empty), acquire take lock and signal notEmpty
  3. Take acquires take lock, removes, atomically decrements count
  4. If count was capacity before decrement (queue was full), acquire put lock and signal notFull

Implementation
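
A condensed sketch modeled on LinkedBlockingQueue's design (the Node class and helper names are illustrative):

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Two-lock blocking queue over a linked list, modeled on
// java.util.concurrent.LinkedBlockingQueue.
public class TwoLockBlockingQueue<E> {
    private static class Node<E> {
        E item;
        Node<E> next;
        Node(E item) { this.item = item; }
    }

    private final int capacity;
    private final AtomicInteger count = new AtomicInteger(0);
    private Node<E> head;   // sentinel: head.item is always null
    private Node<E> tail;

    private final ReentrantLock putLock = new ReentrantLock();
    private final Condition notFull = putLock.newCondition();
    private final ReentrantLock takeLock = new ReentrantLock();
    private final Condition notEmpty = takeLock.newCondition();

    public TwoLockBlockingQueue(int capacity) {
        this.capacity = capacity;
        head = tail = new Node<>(null);            // empty queue: one sentinel
    }

    public void put(E item) throws InterruptedException {
        int c = -1;                                // count before our increment
        putLock.lockInterruptibly();
        try {
            while (count.get() == capacity) {
                notFull.await();
            }
            tail.next = new Node<>(item);          // link at tail
            tail = tail.next;
            c = count.getAndIncrement();
            if (c + 1 < capacity) {
                notFull.signal();                  // cascade to the next producer
            }
        } finally {
            putLock.unlock();
        }
        if (c == 0) {                              // queue was empty:
            signalNotEmpty();                      // consumers may be waiting
        }
    }

    public E take() throws InterruptedException {
        E item;
        int c = -1;                                // count before our decrement
        takeLock.lockInterruptibly();
        try {
            while (count.get() == 0) {
                notEmpty.await();
            }
            Node<E> first = head.next;             // first real node
            item = first.item;
            first.item = null;                     // first becomes the new sentinel
            head.next = null;                      // unlink old sentinel, help GC
            head = first;
            c = count.getAndDecrement();
            if (c > 1) {
                notEmpty.signal();                 // cascade to the next consumer
            }
        } finally {
            takeLock.unlock();
        }
        if (c == capacity) {                       // queue was full:
            signalNotFull();                       // producers may be waiting
        }
        return item;
    }

    // Cross-lock signaling: to touch the other side's condition,
    // briefly acquire the other side's lock (after releasing our own).
    private void signalNotEmpty() {
        takeLock.lock();
        try { notEmpty.signal(); } finally { takeLock.unlock(); }
    }

    private void signalNotFull() {
        putLock.lock();
        try { notFull.signal(); } finally { putLock.unlock(); }
    }

    public int size() { return count.get(); }
}
```

Note that the cross-lock signal happens after the caller has released its own lock, so no thread ever holds both locks at once.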

Analysis

| Property | Status | Explanation |
|---|---|---|
| Thread-safe | Yes | Separate locks for head and tail |
| Deadlock-free | Yes | Lock ordering not needed (disjoint resources) |
| Concurrent put+take | Yes | Different locks, different data |
| Scalability | Good | Producers and consumers don't block each other |

Pros

  • Concurrent put and take operations
  • Better throughput under high load
  • Producers don't block consumers and vice versa

Cons

  • More complex implementation
  • Slightly higher memory overhead (linked nodes vs array)
  • Cross-lock signaling adds latency
  • Harder to reason about correctness

When to Use:

  • High throughput requirements
  • Separate producer and consumer thread pools
  • Longer-running operations (amortizes lock overhead)
  • When array resizing isn't desired

Solution Comparison

| Aspect | Single Lock (ArrayBlockingQueue) | Two-Lock (LinkedBlockingQueue) | Lock-Free |
|---|---|---|---|
| Concurrent operations | None | put + take | Full |
| Lock overhead | 1 lock acquisition | 2 lock acquisitions (cascade) | No locks, CAS overhead |
| Memory layout | Contiguous array (cache-friendly) | Linked nodes (allocation) | Complex structures |
| Bounded capacity | Yes, fixed at construction | Yes, but can be unbounded | Typically unbounded |
| size() accuracy | Exact | Exact (atomic count) | Eventually consistent |
| Implementation complexity | Low | Medium | High |
| Best for | Small queues, low contention | High throughput | Extreme performance |

Decision Flowchart

Recommendation:

  • Interviews: Start with single-lock implementation. It demonstrates condition variables, spurious wakeups, and blocking semantics clearly. Mention two-lock as an optimization if asked about throughput.
  • Production Java: Use java.util.concurrent.ArrayBlockingQueue (single-lock, bounded) or LinkedBlockingQueue (two-lock, optionally bounded). Both are battle-tested.
  • Production C++: Use boost::lockfree::queue for lock-free or implement the two-lock pattern with std::mutex and std::condition_variable.
  • Production Python: Use queue.Queue (built-in, single-lock) or multiprocessing.Queue for inter-process.

Complete Implementation

This section provides a production-ready, single-lock, array-backed blocking queue, the variant most relevant for interviews. It includes all of the operations from the problem statement, proper interruption handling, and clear documentation.
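
A sketch of that implementation (the class name BoundedBlockingQueue is my own; java.util.concurrent.ArrayBlockingQueue is the battle-tested equivalent):

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Bounded, thread-safe blocking queue over a circular array.
// One fair lock guards all state; two conditions provide blocking.
public class BoundedBlockingQueue<E> {
    private final Object[] buffer;
    private int head = 0, tail = 0, count = 0;

    private final ReentrantLock lock;
    private final Condition notFull;
    private final Condition notEmpty;

    public BoundedBlockingQueue(int capacity) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        buffer = new Object[capacity];
        lock = new ReentrantLock(true);    // fair: FIFO acquisition, no starvation
        notFull = lock.newCondition();
        notEmpty = lock.newCondition();
    }

    /** Inserts item, blocking while full. Responds to interruption. */
    public void put(E item) throws InterruptedException {
        lock.lockInterruptibly();
        try {
            while (count == buffer.length) notFull.await();
            enqueue(item);
        } finally {
            lock.unlock();
        }
    }

    /** Removes and returns the head, blocking while empty. */
    public E take() throws InterruptedException {
        lock.lockInterruptibly();
        try {
            while (count == 0) notEmpty.await();
            return dequeue();
        } finally {
            lock.unlock();
        }
    }

    /** Non-blocking insert: returns false if full. */
    public boolean offer(E item) {
        lock.lock();
        try {
            if (count == buffer.length) return false;
            enqueue(item);
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Non-blocking remove: returns null if empty. */
    public E poll() {
        lock.lock();
        try {
            return (count == 0) ? null : dequeue();
        } finally {
            lock.unlock();
        }
    }

    /** Views the head without removing it: null if empty. */
    @SuppressWarnings("unchecked")
    public E peek() {
        lock.lock();
        try {
            return (count == 0) ? null : (E) buffer[head];
        } finally {
            lock.unlock();
        }
    }

    public int size() {
        lock.lock();
        try { return count; } finally { lock.unlock(); }
    }

    // Both helpers assume the caller holds the lock.
    private void enqueue(E item) {
        buffer[tail] = item;
        tail = (tail + 1) % buffer.length;
        count++;
        notEmpty.signal();
    }

    @SuppressWarnings("unchecked")
    private E dequeue() {
        E item = (E) buffer[head];
        buffer[head] = null;
        head = (head + 1) % buffer.length;
        count--;
        notFull.signal();
        return item;
    }
}
```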

Demo
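
A minimal producer-consumer demo against the BoundedBlockingQueue above (capacity 2 and a deliberately slow consumer, to exercise backpressure):

```java
public class BlockingQueueDemo {
    public static void main(String[] args) throws InterruptedException {
        BoundedBlockingQueue<Integer> queue = new BoundedBlockingQueue<>(2);

        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= 5; i++) {
                    queue.put(i);                  // blocks when the queue is full
                    System.out.println("Produced " + i);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < 5; i++) {
                    int item = queue.take();       // blocks when the queue is empty
                    System.out.println("Consumed " + item);
                    Thread.sleep(100);             // slow consumer -> backpressure
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        producer.join();
        consumer.join();
        System.out.println("Done. Final size: " + queue.size());
    }
}
```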

Expected Output:
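
One possible interleaving is shown below; the exact ordering of Produced/Consumed lines varies with thread scheduling, though items are always consumed in FIFO order:

```
Produced 1
Produced 2
Consumed 1
Produced 3
Consumed 2
Produced 4
Consumed 3
Produced 5
Consumed 4
Consumed 5
Done. Final size: 0
```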