Correlation IDs

Last Updated: January 7, 2026

Ashish Pratap Singh

A user reports that their order failed. You open your log aggregation system and search for errors around that time. You find thousands of log entries. Some are from the order service, some from payments, some from inventory. Which ones are related to this user's failed order?

Without a way to connect them, you are reduced to guessing based on timestamps. This is where correlation IDs come in.

Correlation IDs are simple in concept but transformative in practice. They turn a haystack of unrelated logs into a coherent narrative.

In this chapter, you will learn:

  • What correlation IDs are and why they matter
  • How to generate and propagate correlation IDs
  • Common patterns for implementation
  • How correlation IDs connect to distributed tracing
  • Best practices and common pitfalls

This technique works hand in hand with the logging practices we covered earlier. Structured logs become far more useful once every entry carries a correlation ID you can filter on.

The Problem: Disconnected Logs

Consider a simple request that touches multiple services:

Each service logs its activity:

But at 10:23:45, your system handled 500 requests per second. These 7 log lines are mixed with 3,000 others from the same time window. How do you know which API Gateway request led to which Order Service log, which led to which Payment failure?

Without correlation IDs, you cannot. You are left matching timestamps and hoping for the best.

What Is a Correlation ID?

A correlation ID (also called request ID, trace ID, or transaction ID) is a unique identifier assigned at the entry point of a request and propagated through all downstream services.

Now the same logs become traceable:

Query: correlation_id = "abc-123" returns exactly these 7 logs, showing the complete request flow.
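As a minimal sketch of that query, assume each log entry is a structured record with a `correlation_id` field. The records, service names, and the `find_request_logs` helper below are hypothetical and shortened for illustration; a real log aggregator would run the equivalent filter server-side.

```python
# Hypothetical structured log records; real entries would come from your log store.
LOGS = [
    {"ts": "10:23:45.100", "service": "api-gateway", "correlation_id": "abc-123", "msg": "request received"},
    {"ts": "10:23:45.101", "service": "api-gateway", "correlation_id": "def-456", "msg": "request received"},
    {"ts": "10:23:45.120", "service": "order-service", "correlation_id": "abc-123", "msg": "creating order"},
    {"ts": "10:23:45.130", "service": "order-service", "correlation_id": "def-456", "msg": "creating order"},
    {"ts": "10:23:45.180", "service": "payment-service", "correlation_id": "abc-123", "msg": "payment declined"},
]


def find_request_logs(logs, correlation_id):
    """Return every entry for one request, ordered by timestamp."""
    matches = [entry for entry in logs if entry.get("correlation_id") == correlation_id]
    return sorted(matches, key=lambda entry: entry["ts"])


for entry in find_request_logs(LOGS, "abc-123"):
    print(entry["ts"], entry["service"], entry["msg"])
```

The interleaved `def-456` entries drop out, leaving only the gateway, order, and payment lines for the one request.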

Generating Correlation IDs

Correlation IDs must be unique across all requests and easy to propagate across every hop.

UUID (Universally Unique Identifier)

Example:
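A minimal sketch of generating one in Python with the standard-library `uuid` module. The function name `new_correlation_id` is illustrative, not part of any library.

```python
import uuid


def new_correlation_id() -> str:
    """Return a random (version 4) UUID in its canonical 36-character form."""
    return str(uuid.uuid4())


correlation_id = new_correlation_id()
print(correlation_id)  # a different value on every run
```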

Pros

  • Universally unique
  • No coordination needed
  • Built into most languages

Cons

  • Long (36 characters)
  • Not time-ordered (for the common random v4 variant)
  • Random, carries no meaning

Best for: default choice when you want maximum compatibility and minimal effort.

ULID (Universally Unique Lexicographically Sortable Identifier)

Example:
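The Python standard library has no ULID generator, so the following is a minimal sketch written from the published ULID layout: a 48-bit millisecond timestamp followed by 80 random bits, encoded as 26 Crockford base32 characters. The `new_ulid` function is illustrative and skips the optional monotonic mode; in production a maintained library is likely the safer choice.

```python
import os
import time

# Crockford base32 alphabet (no I, L, O, U); its characters are in ascending ASCII order,
# which is what makes the encoded string sort the same way as the timestamp.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def new_ulid(timestamp_ms=None):
    """Build a 26-character ULID-style ID from a timestamp and 80 random bits."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    value = (timestamp_ms << 80) | int.from_bytes(os.urandom(10), "big")
    chars = []
    for _ in range(26):  # 26 characters x 5 bits covers the 128-bit value
        chars.append(CROCKFORD[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))


print(new_ulid())
```

Because the timestamp occupies the leading characters, IDs created in different milliseconds compare in creation order as plain strings.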

Pros

  • Sortable by creation time
  • Shorter than UUID
  • Can be monotonic within the same millisecond (when the generator supports it)

Cons

  • Less widely supported
  • Requires library

Best for: systems where you often sort or scan by time and want IDs that work well in logs and indexes.

Custom Formats

Example:

Typical format

Pros

  • Human readable
  • Includes timestamp
  • Easy to identify source

Cons

  • Must ensure uniqueness
  • Longer format

Best for: internal systems where readability matters and volume is moderate, or when you add a readable wrapper around a truly unique base ID.
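One hypothetical layout for the "readable wrapper" approach is a service prefix, a UTC timestamp, and a full random UUID as the unique base. The field order, separator, and the `new_readable_id` name are assumptions made for this sketch, not a standard.

```python
import uuid
from datetime import datetime, timezone


def new_readable_id(service: str) -> str:
    """Combine a readable prefix and timestamp with a UUID that guarantees uniqueness."""
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{service}-{timestamp}-{uuid.uuid4().hex}"


print(new_readable_id("orders"))
```

The prefix and timestamp are only hints for humans; uniqueness comes entirely from the UUID suffix, so two requests in the same second still get distinct IDs.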

Snowflake-style IDs

These are time-ordered, numeric IDs generated in a distributed way (commonly 18–19 digits).

Pros

  • Sortable by time
  • Compact and efficient for storage and indexing
  • Works well at very high volume

Cons

  • Less readable for humans
  • Requires an ID generation strategy and operational discipline

Best for: high-volume distributed systems that already use time-ordered numeric identifiers across data stores.
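A minimal sketch of a Snowflake-style generator, assuming the widely described layout of a 41-bit millisecond timestamp, a 10-bit worker ID, and a 12-bit per-millisecond sequence. The class name, the epoch constant, and the clock-skew handling are choices made for this example; how worker IDs get assigned is exactly the operational discipline mentioned above and is left out.

```python
import threading
import time


class SnowflakeGenerator:
    EPOCH_MS = 1288834974657  # assumed custom epoch (the one commonly cited for Twitter's scheme)
    WORKER_BITS = 10
    SEQUENCE_BITS = 12

    def __init__(self, worker_id: int):
        if not 0 <= worker_id < (1 << self.WORKER_BITS):
            raise ValueError("worker_id out of range")
        self._worker_id = worker_id
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self) -> int:
        with self._lock:
            now = int(time.time() * 1000)
            if now < self._last_ms:
                now = self._last_ms  # clock moved backwards: keep using the last timestamp
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & ((1 << self.SEQUENCE_BITS) - 1)
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = int(time.time() * 1000)
            else:
                self._sequence = 0
            self._last_ms = now
            timestamp_part = (now - self.EPOCH_MS) << (self.WORKER_BITS + self.SEQUENCE_BITS)
            return timestamp_part | (self._worker_id << self.SEQUENCE_BITS) | self._sequence


generator = SnowflakeGenerator(worker_id=1)
print(generator.next_id())
```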

Which Format to Use

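| Format | Length (characters) | Sortable | Readability | Best For |
| --- | --- | --- | --- | --- |
| UUID v4 | 36 | No | Low | General use, compatibility |
| ULID | 26 | Yes | Medium | Time-series queries |
| Custom | Variable | Optional | High | Human debugging |
| Snowflake ID | 18-19 | Yes | Low | High-volume distributed systems |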

Propagating Correlation IDs

The correlation ID must flow through every hop in your system. If one service drops the ID, your “single request story” breaks and logs become scattered again.

HTTP Headers

For synchronous service-to-service calls, HTTP headers are the most common approach.

Typical header names:

  • X-Correlation-ID: very common custom header
  • X-Request-ID: popular alternative name (often used by proxies)
  • traceparent: W3C standard for tracing context (works for tracing and can double as correlation)

Propagation Flow

A good propagation flow looks like this:

  1. Request arrives
  2. If it already has an ID, keep it
  3. If not, generate a new one
  4. Store it in request context
  5. Include it in all logs
  6. Attach it to all downstream calls
  7. Attach it to async jobs and messages
  8. Return it in the response headers

This turns correlation IDs into a system-wide “breadcrumb trail.”

Implementation Pattern

Every service should implement middleware (or filters/interceptors) that handles correlation IDs automatically. You want developers to get correlation IDs “for free,” not by remembering to add them everywhere.

Incoming request handler

Outgoing request interceptor

Response handler

That is the core loop. Once you have this in place, correlation IDs become a standard part of the request lifecycle.
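The three handlers can be sketched together as one piece of WSGI middleware plus a small helper for outgoing calls. This is an illustration under assumptions, not a drop-in implementation: the `X-Correlation-ID` header choice, the `CorrelationIdMiddleware` and `outgoing_headers` names, and the use of `contextvars` for request context are all picked for this example, and the sketch assumes the wrapped app builds its response eagerly rather than streaming it.

```python
import contextvars
import uuid

HEADER = "X-Correlation-ID"
_correlation_id = contextvars.ContextVar("correlation_id", default=None)


def current_correlation_id():
    """Read the ID for the request currently being handled (None outside a request)."""
    return _correlation_id.get()


class CorrelationIdMiddleware:
    """Incoming request handler and response handler in one WSGI wrapper."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # Incoming: keep the caller's ID if present, otherwise generate a new one.
        cid = environ.get("HTTP_X_CORRELATION_ID") or str(uuid.uuid4())
        token = _correlation_id.set(cid)

        def start_with_id(status, headers, exc_info=None):
            # Response: echo the ID back so the client can quote it later.
            return start_response(status, list(headers) + [(HEADER, cid)], exc_info)

        try:
            return self.app(environ, start_with_id)
        finally:
            _correlation_id.reset(token)


def outgoing_headers(extra=None):
    """Outgoing interceptor: headers to attach to any downstream HTTP call."""
    headers = dict(extra or {})
    cid = current_correlation_id()
    if cid:
        headers[HEADER] = cid
    return headers
```

Application code never touches the ID directly: it is wrapped once with `CorrelationIdMiddleware(app)` and builds downstream requests with `outgoing_headers()`.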

Context Propagation Patterns

Propagation is straightforward in a single synchronous thread. It gets tricky when execution hops across threads, async boundaries, or message queues.

Thread-Local Storage (synchronous)

In synchronous request handling, thread-local storage works well:

  • request arrives on Thread-1
  • middleware sets correlationId in ThreadLocal and logging MDC
  • any log call on that thread automatically includes the ID

This is why correlation IDs often “just work” in simple web apps.

Async Context Propagation (multi-threaded)

Async breaks the thread-local model because the work may resume on a different thread:

  • Thread-1 receives request and sets correlation ID
  • work is handed off to a worker thread
  • Thread-2 runs without the thread-local context
  • logs lose the correlation ID unless you explicitly carry it
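The failure mode is easy to reproduce. In this small sketch (the names are illustrative), a value stored in `threading.local` on the request thread is simply absent when the same lookup runs on a pool thread.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

_local = threading.local()


def read_correlation_id():
    """Look up the ID stored for whichever thread is running this call."""
    return getattr(_local, "correlation_id", None)


def handle_request():
    _local.correlation_id = "abc-123"  # set on the request thread
    same_thread = read_correlation_id()
    with ThreadPoolExecutor(max_workers=1) as pool:
        worker_thread = pool.submit(read_correlation_id).result()
    return same_thread, worker_thread


print(handle_request())  # ('abc-123', None)
```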

There are three common solutions.

1. Explicit parameter passing:

Simple and explicit, but easy to forget and messy in deep call chains.
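A sketch of the explicit style, with hypothetical `place_order` and `charge_card` functions: the ID is an ordinary argument, so it survives the hop to a worker thread, but every function in the chain has to accept it and pass it on.

```python
import logging
from concurrent.futures import ThreadPoolExecutor

logger = logging.getLogger("orders")


def charge_card(order_id, correlation_id):
    # The ID arrives as a plain argument, so it is available on any thread.
    logger.info("charging card for %s", order_id, extra={"correlation_id": correlation_id})
    return {"order_id": order_id, "correlation_id": correlation_id}


def place_order(order_id, correlation_id):
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Forgetting to pass correlation_id here is the classic mistake.
        return pool.submit(charge_card, order_id, correlation_id).result()


print(place_order("order-1", "abc-123"))
```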

2. Context-aware executors:

This is the clean “make it automatic” approach. The executor captures context at submission time and restores it when running the task.
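In Python, one way to sketch such an executor is to subclass `ThreadPoolExecutor` and wrap each task in a copy of the submitting thread's `contextvars` context. `ContextAwareExecutor` is a name invented for this example.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

correlation_id = contextvars.ContextVar("correlation_id", default=None)


class ContextAwareExecutor(ThreadPoolExecutor):
    """Thread pool that runs each task inside the submitter's context."""

    def submit(self, fn, /, *args, **kwargs):
        ctx = contextvars.copy_context()  # capture at submission time
        return super().submit(ctx.run, fn, *args, **kwargs)  # restore when the task runs


correlation_id.set("abc-123")
with ContextAwareExecutor(max_workers=2) as pool:
    print(pool.submit(correlation_id.get).result())  # abc-123
```

Each submission gets its own context copy, so concurrent requests sharing one pool do not see each other's IDs.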

3. Framework support:

Many modern runtimes have a built-in concept of request context:

  • Java reactive: Project Reactor Context
  • Kotlin: CoroutineContext
  • Go: context.Context
  • Node.js: AsyncLocalStorage

If your stack already uses one of these, integrate correlation IDs into that context rather than reinventing it.

Message Queues

Once you introduce queues, you no longer have HTTP headers. The same principle still applies: put the correlation ID in message metadata.

Producer

  • reads correlation ID from context
  • writes it into message headers

Consumer

  • extracts correlation ID from message headers
  • sets it into the processing context and logging MDC
  • forwards it to any downstream calls

Example message format

This way, an async payment job is still tied back to the original user request.
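A broker-agnostic sketch of both sides, assuming a JSON envelope with a `headers` object and a `payload` object. The envelope shape and the `build_message` / `handle_message` names are hypothetical; most brokers offer native message headers or attributes that would carry the ID instead.

```python
import contextvars
import json

correlation_id = contextvars.ContextVar("correlation_id", default=None)


def build_message(payload):
    """Producer side: copy the current correlation ID into the message headers."""
    envelope = {"headers": {"correlation_id": correlation_id.get()}, "payload": payload}
    return json.dumps(envelope)


def handle_message(raw, handler):
    """Consumer side: restore the ID into context before running the handler."""
    message = json.loads(raw)
    token = correlation_id.set(message["headers"].get("correlation_id"))
    try:
        return handler(message["payload"])
    finally:
        correlation_id.reset(token)  # do not leak the ID into the next message
```

Anything the handler logs or calls while it runs sees the same ID that the producer had in context when it published the message.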

Multiple ID Types

In real systems, one ID is rarely enough. Different IDs answer different questions, and mixing them up leads to confusion. A good observability design uses a small set of IDs with clear meanings and consistent propagation rules.

The key idea is: each ID has a scope. Some identify a single request. Others connect many requests into a user journey. Others exist only for tracing.

Common ID Types

Here are the IDs you will see most often:

| ID Type | Scope | Purpose |
| --- | --- | --- |
| Correlation ID | Single request across services | Link logs for debugging |
| Session ID | Multiple requests from one session | Track user journey |
| User ID | All activity by one user | User-centric debugging |
| Trace ID | Single request (tracing systems) | Distributed tracing |
| Span ID | Single operation within a trace | Trace hierarchy |
| Request ID | Single HTTP request | Per-service request tracking |

When to Use Each

Debugging a single failed request: Search by correlation ID to see all services involved.

Investigating a user's experience: Search by user ID to see all their requests over time.

Analyzing a session: Search by session ID to see the sequence of user actions.

Performance analysis: Use trace ID with distributed tracing tools for timing data.

Unified Log Entry

A good log entry often includes multiple IDs because each one helps in a different way. You are not adding noise, you are making logs searchable from multiple angles.

This lets you:

  • search all logs for a request (correlation_id)
  • jump into trace timing (trace_id)
  • find every request for a user (user_id)
  • reconstruct the session journey (session_id)
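A sketch of what building such an entry could look like as JSON. The field names follow the ones used in the list above; the `build_log_entry` helper and the sample values are made up for illustration.

```python
import json
from datetime import datetime, timezone


def build_log_entry(level, message, **ids):
    """Serialize one log line that carries every ID known for the current request."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "level": level,
        "message": message,
    }
    # Unknown IDs are omitted rather than written as null.
    entry.update({name: value for name, value in ids.items() if value is not None})
    return json.dumps(entry)


print(build_log_entry(
    "ERROR",
    "payment declined",
    correlation_id="abc-123",
    trace_id="4bf92f3577b34da6a3ce929d0e0e4736",
    user_id="user-42",
    session_id=None,  # e.g. a background job with no session
))
```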

Integration with Distributed Tracing

Correlation IDs are the foundation. They give you a shared identifier across services. Distributed tracing builds on that by adding structure and timing.

From Correlation ID to Trace

If you use W3C Trace Context (traceparent) for propagation, you get compatibility with modern tracing systems (OpenTelemetry, Jaeger, Zipkin, many managed APMs).

A practical approach many teams adopt:

  • use traceparent for tracing context propagation
  • log the trace_id and span_id automatically
  • optionally also expose an X-Correlation-ID header for clients and support workflows

Whether you keep correlation ID separate or reuse the trace ID depends on your tooling and org conventions. The most important thing is that engineers can reliably search and correlate.
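To make the `traceparent` option concrete: a version `00` header has four dash-separated lowercase hex fields (version, 32-character trace ID, 16-character parent span ID, 2-character flags). The parser below is a minimal sketch that accepts only that version-00 layout; `parse_traceparent` is an illustrative name, and real services would normally let a tracing SDK such as OpenTelemetry handle this.

```python
import re

_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def parse_traceparent(value):
    """Return the trace ID, span ID, and sampled flag, or None if the header is invalid."""
    match = _TRACEPARENT.match(value.strip())
    if not match:
        return None
    trace_id, span_id, flags = match.groups()
    if trace_id == "0" * 32 or span_id == "0" * 16:
        return None  # all-zero IDs are not valid
    return {"trace_id": trace_id, "span_id": span_id, "sampled": bool(int(flags, 16) & 0x01)}


header = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
print(parse_traceparent(header)["trace_id"])  # 4bf92f3577b34da6a3ce929d0e0e4736
```

If you reuse the trace ID as the correlation ID, the `trace_id` field is the value to put in every log entry.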

Best Practices

1. Generate at the Edge

Create the correlation ID at the first entry point:

Do not generate new IDs in downstream services. If they do not receive one, that is a bug in propagation.

2. Include in Every Log

Every log entry must include the correlation ID. Use logging framework features to automate this:
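With Python's standard `logging` module, for instance, a filter can stamp the ID onto every record so that individual log calls never mention it. This is a sketch: the `correlation_id` context variable, the `"app"` logger name, and the format string are assumptions for the example.

```python
import contextvars
import logging

correlation_id = contextvars.ContextVar("correlation_id", default="-")


class CorrelationIdFilter(logging.Filter):
    """Attach the current correlation ID to every log record."""

    def filter(self, record):
        record.correlation_id = correlation_id.get()
        return True  # never drop the record, only enrich it


def configure_logging(stream=None):
    handler = logging.StreamHandler(stream)
    handler.addFilter(CorrelationIdFilter())
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(levelname)s [%(correlation_id)s] %(name)s: %(message)s"
    ))
    logger = logging.getLogger("app")
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


logger = configure_logging()
correlation_id.set("abc-123")
logger.info("order created")  # the formatted line contains [abc-123]
```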

3. Propagate to All Downstream Calls

Include in:

  • HTTP requests to other services
  • Messages published to queues
  • Async job submissions
  • External API calls
  • Database query logs (if possible)

4. Return in Response Headers

Include the correlation ID in response headers so clients can reference it:

When a user reports an issue, they can provide this ID for faster debugging.

5. Use Standard Header Names

Prefer standard or widely used header names:

| Header | Usage |
| --- | --- |
| traceparent | W3C Trace Context (best for tracing compatibility) |
| X-Request-ID | Common convention |
| X-Correlation-ID | Common convention |
| X-B3-TraceId | Zipkin B3 format |

Avoid creating custom names unless you have a specific reason.

Common Pitfalls

Correlation IDs are simple in theory: generate once, propagate everywhere, log consistently. In practice, most failures come from a few predictable mistakes. Fixing them early saves hours during incidents.

1. Breaking the Chain

If any service fails to propagate the ID, everything downstream becomes disconnected. You will see logs that look correct inside a service, but you cannot stitch the full request together.

Fix: Add tests that verify correlation ID propagation. Use tracing tools to detect missing context.
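Such a test can be very small. In the sketch below, `call_inventory` stands in for real service code and `fake_send` for its HTTP client; both are hypothetical, and the point is only the assertion that the downstream call carries the same ID that came in.

```python
import unittest

HEADER = "X-Correlation-ID"


def call_inventory(send, incoming_headers):
    """Hypothetical service code: forward the incoming ID on a downstream call."""
    return send("/inventory/reserve", headers={HEADER: incoming_headers[HEADER]})


class PropagationTest(unittest.TestCase):
    def test_downstream_call_carries_incoming_id(self):
        sent = []

        def fake_send(path, headers):
            sent.append((path, headers))
            return 200

        call_inventory(fake_send, {HEADER: "abc-123"})
        self.assertEqual(sent[0][1][HEADER], "abc-123")
```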

2. Generating New IDs Mid-Request

This is one of the most common mistakes. Each service generates its own correlation ID, so no one can correlate across services.

Fix: Only the entry point generates the ID. All other services receive and propagate it.

3. Losing Context in Async Operations

Thread-local and request-scoped context work fine in synchronous flows. Async breaks them because work often resumes on a different thread.

Fix: Use context-aware async primitives or explicitly pass the ID.

4. Not Including in Error Responses

The correlation ID is most valuable when something goes wrong, because it gives support and engineers a handle to find the exact logs fast.

Make sure it appears in:

  • error response headers (always)
  • error response body when appropriate (especially for APIs used by external clients)
  • error logs and exception traces

If a client reports “checkout failed,” you want them to paste an ID, not a screenshot.
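A sketch of an error response that carries the ID in both places. The JSON body shape and the `error_response` helper are assumptions for illustration; adapt the field names to whatever error format your API already uses.

```python
import json


def error_response(status, message, correlation_id):
    """Build an error response that exposes the correlation ID in header and body."""
    headers = [
        ("Content-Type", "application/json"),
        ("X-Correlation-ID", correlation_id),
    ]
    body = json.dumps({"error": message, "correlation_id": correlation_id})
    return status, headers, body


status, headers, body = error_response("502 Bad Gateway", "payment provider unavailable", "abc-123")
print(body)
```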

5. High Cardinality in Metrics

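Correlation IDs should not be used as metric labels. Every distinct label value creates a separate time series, so millions of unique IDs will overwhelm your metrics system's memory and storage.

Fix: Keep correlation IDs in logs and traces, and label metrics only with low-cardinality dimensions such as service, endpoint, and status code.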

Summary

Correlation IDs transform distributed debugging from guesswork to precision:

  • What: A unique ID assigned at request entry and propagated through all services
  • Why: Links all logs for a single request, making them searchable together
  • How: HTTP headers and message metadata propagate the ID; logging frameworks include it automatically once configured

Key implementation points:

  • Generate at the edge, never mid-request
  • Store in context (thread-local or async context)
  • Include in every log entry automatically
  • Pass to all downstream calls (HTTP, queues, async jobs)
  • Return in response headers for client reference

Common ID formats: UUID, ULID, Snowflake-style IDs, or custom formats with timestamps. Use standard headers like X-Correlation-ID or traceparent for compatibility.

Pitfalls to avoid:

  • Breaking the chain in any service
  • Losing context in async operations
  • Using correlation IDs as metric labels