Practice this topic in a realistic system design interview
Many database scaling problems start as query problems.
A missing index, an unbounded result set, an accidental table scan, or an N+1 query pattern can make a healthy database look overloaded. Before adding replicas, sharding, caching layers, or new infrastructure, the right first question is what work the query is making the database do.
Query optimization is the practice of making database reads and writes do less unnecessary work while still returning correct results.
Good optimization starts with measurement, execution plans, and a clear understanding of the access pattern.
Slow queries hurt more than the single request that triggered them.
A bad query increases page latency, consumes CPU, memory, and disk I/O, holds locks longer than necessary, evicts useful data from the buffer cache, slows down unrelated queries on the same database, and forces larger machines that increase cloud spend.
At small scale, a query that scans 50,000 rows may not matter. At high traffic, the same query running hundreds of times per second can dominate the database.
The goal is to make important queries predictable, bounded, and aligned with the way the database can access data efficiently. Cleverness is rarely the point.
Before optimizing SQL, understand the basic path a database follows.
The database checks syntax and resolves table and column names.
The optimizer chooses a plan: which indexes to use, join order, join algorithm, sort strategy, aggregation strategy, and how many rows it expects at each step.
The optimizer makes decisions using statistics. If statistics are stale or the data distribution is unusual, the plan can be poor.
The database runs the plan. This is where it reads pages, filters rows, joins tables, sorts data, aggregates results, and returns rows to the client.
Most query tuning is about improving the plan or reducing the amount of work done during execution.