Denormalization

Medium Priority · 12 min read · Updated July 4, 2026
Normalized databases are great for correctness. Each fact lives in one place, an update touches one row, and integrity rules are easier to enforce.

But many high-traffic systems eventually hit read paths where normalized data is too expensive to rebuild on every request. A page may need data from five tables, a dashboard may summarize millions of rows, or a service may need data from another service that is too slow to call every time.

Denormalization means deliberately copying data to make reads faster, simpler, or less dependent on other systems.

Done carefully, denormalization is a valid tradeoff, not bad database design. Reads get faster, but writes get more complex. You use more storage, and copied data can become stale or inconsistent.

Good denormalization starts with a clear read problem and a plan for keeping copied data trustworthy.

In this chapter, we will look at when to denormalize and how to keep the copies correct enough to trust.

1. The Problem With Normalization

In a normalized relational schema, data is split into related tables to avoid unnecessary copies.

For a blog application, a clean schema might look like this:

  • users(id, name, email)
  • posts(id, user_id, title, body, created_at)
  • comments(id, post_id, user_id, text, created_at)

This design is good for writes. If a user changes their name, you update one row in users.

To display a post with its comments and each comment author's name, the database has to join all three tables.
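As a rough sketch of that read path, the snippet below builds the three tables in an in-memory SQLite database and runs the join. The use of Python's `sqlite3` module, the sample rows, and the `POST_WITH_COMMENTS` name are assumptions made for illustration; the original schema does not prescribe a database engine or an exact query.

```python
import sqlite3

# In-memory database with the blog schema from above (sample rows are made up).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER REFERENCES users(id),
                        title TEXT, body TEXT, created_at TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER REFERENCES posts(id),
                           user_id INTEGER REFERENCES users(id), text TEXT, created_at TEXT);

    INSERT INTO users VALUES (1, 'Ada', 'ada@example.com'), (2, 'Linus', 'linus@example.com');
    INSERT INTO posts VALUES (10, 1, 'Hello', 'First post', '2026-01-01');
    INSERT INTO comments VALUES (100, 10, 2, 'Nice post', '2026-01-02'),
                                (101, 10, 1, 'Thanks', '2026-01-03');
""")

# One post, its comments, and each comment author's name: two joins per request.
POST_WITH_COMMENTS = """
    SELECT p.title, c.text, u.name AS comment_author
    FROM posts p
    JOIN comments c ON c.post_id = p.id
    JOIN users u ON u.id = c.user_id
    WHERE p.id = ?
    ORDER BY c.created_at
"""

rows = conn.execute(POST_WITH_COMMENTS, (10,)).fetchall()
for title, text, author in rows:
    print(f"{title}: {author} said {text!r}")
```

Every page view repeats both joins, even though author names rarely change.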

At small scale, this is perfectly fine. Even at large scale, joins can be fine when tables have good indexes and the query returns a reasonable number of rows.

The problem appears when the read path becomes expensive or hard to run reliably.

For example, a query may join large tables and run too often. A dashboard may summarize too many rows while the user is waiting. A request may need data across shards or services. Or a page may need consistently fast responses, but the normalized query is sometimes slow.

A read path that runs far more often than its underlying data changes is also a strong signal.

Denormalization is one way to move work away from the request path.

The normalized design remains the source of truth. The denormalized copy exists because a specific read path needs data in a faster shape.
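A minimal sketch of that relationship, again assuming SQLite through Python's `sqlite3` module: the hypothetical `comment_feed` table copies each author's name next to the comment so the read needs no join, and `rebuild_comment_feed` recomputes the copy from the normalized tables. The table name, the helper functions, and the full-rebuild strategy are illustrative assumptions, not a recommended production design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized tables: the source of truth.
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER,
                           user_id INTEGER, text TEXT);

    -- Hypothetical denormalized copy: author name stored with each comment.
    CREATE TABLE comment_feed (comment_id INTEGER PRIMARY KEY, post_id INTEGER,
                               author_name TEXT, text TEXT);

    INSERT INTO users VALUES (1, 'Ada'), (2, 'Linus');
    INSERT INTO comments VALUES (100, 10, 2, 'Nice post'), (101, 10, 1, 'Thanks');
""")

def rebuild_comment_feed(conn):
    """Recompute the copy from the normalized tables in one transaction."""
    with conn:
        conn.execute("DELETE FROM comment_feed")
        conn.execute("""
            INSERT INTO comment_feed (comment_id, post_id, author_name, text)
            SELECT c.id, c.post_id, u.name, c.text
            FROM comments c
            JOIN users u ON u.id = c.user_id
        """)

def read_feed(conn, post_id):
    """Request-path read: a single-table lookup, no joins."""
    return conn.execute(
        "SELECT author_name, text FROM comment_feed "
        "WHERE post_id = ? ORDER BY comment_id",
        (post_id,),
    ).fetchall()

rebuild_comment_feed(conn)
before = read_feed(conn, 10)

# A rename touches only the source of truth, so the copy is now stale.
conn.execute("UPDATE users SET name = 'Linus T.' WHERE id = 2")
stale = read_feed(conn, 10)

# Rebuilding from the source of truth brings the copy back in line.
rebuild_comment_feed(conn)
fresh = read_feed(conn, 10)
```

The join still runs, but inside `rebuild_comment_feed` rather than on every request. Between the rename and the rebuild, readers see the old name, which is the staleness cost the copy introduces.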

2. What Denormalization Means
