MySQL is one of the most widely deployed databases in the world, which means you’ll run into it everywhere: early-stage startups, massive enterprise systems, and plenty of system design interviews.
But most engineers only learn the surface area (tables, indexes, and a few SQL queries) until something breaks in production: a deadlock during a sale, replication lag during peak traffic, or a query that suddenly goes from milliseconds to minutes.
This chapter covers the practical MySQL knowledge that matters in system design interviews: InnoDB internals, indexing strategies, replication architectures for availability, and sharding approaches for scale.
Client applications (Web Server 1..N) don’t connect to MySQL directly. They go through ProxySQL, which manages connection pooling, routing, and load balancing. This prevents “connection storms” and gives you a single place to enforce policies like read/write splitting and query rules.
From ProxySQL, writes are routed to the primary while reads can be spread across replicas.
Inside the primary, each query flows through MySQL's execution pipeline: the connection handler accepts it, the parser and optimizer produce an execution plan, and the executor calls into the storage engine to read and write rows.
For transactional workloads, the storage engine is typically InnoDB. InnoDB handles the core durability and consistency mechanics: the buffer pool caches pages in memory, the redo log makes commits durable, undo logs power MVCC, and row-level locks coordinate concurrent writes.
For replication and read scaling, the primary ships changes to replicas using the binary log (binlog) stream (shown as flowing from the log component to Replica 1/2). Replicas apply these events to stay in sync, enabling you to offload read traffic—while keeping the primary as the source of truth for writes.
Every database makes trade-offs. MySQL traded some advanced features for operational simplicity and read performance. Understanding exactly where these trade-offs pay off, and where they do not, allows you to make defensible choices in interviews.
The classic MySQL use case. A typical web application reads data far more than it writes. User profiles are written once and read thousands of times. Product catalogs change rarely but are browsed constantly.
MySQL's replication model excels here: add read replicas to scale reads linearly while keeping a single primary for writes.
MySQL has been running in production for decades. Its failure modes are well understood. Tools for backup, monitoring, and migration are mature. Engineers know how to operate it.
This operational maturity matters more than feature lists when your system needs to run reliably at 3 AM.
Users have orders. Orders have items. Items have products. When your data naturally forms relationships, relational databases like MySQL let you query across those relationships efficiently.
The alternative, denormalizing everything for a document store, creates consistency headaches.
InnoDB provides full ACID compliance. A transfer from one account to another either completes fully or rolls back entirely. An order either reserves inventory and creates the order record, or neither happens.
These guarantees are built into the storage engine, not bolted on.
Your ORM supports MySQL. Your monitoring dashboards understand MySQL metrics. Your team has MySQL expertise.
Switching databases has hidden costs beyond code changes: training, tooling, and operational knowledge. Sometimes the best database is the one your team knows.
Understanding MySQL's limitations matters as much as knowing its strengths. Proposing MySQL for the wrong problem signals inexperience.
If your queries rely heavily on CTEs, window functions, or complex JSONB operations, PostgreSQL provides a more capable query engine.
MySQL's window function support arrived late (8.0) and remains less comprehensive. Its JSON support, while functional, lacks PostgreSQL's JSONB indexing capabilities.
MySQL's replication was designed around a single-threaded SQL applier (improved in recent versions, but still a consideration).
For workloads with millions of writes per second, databases designed for write-heavy patterns like Cassandra or ScyllaDB handle the load more naturally.
MySQL scales vertically well and handles read scaling through replicas, but write scaling requires sharding.
Unlike databases that handle sharding internally (CockroachDB, TiDB, Cassandra), MySQL sharding requires external coordination through tools like Vitess or application-level logic.
PostgreSQL's support for arrays, ranges, and custom types is more mature. If your domain naturally includes these types (scheduling with ranges, tagging with arrays), PostgreSQL provides better primitives.
MySQL has spatial extensions, but PostGIS with PostgreSQL offers a more complete geospatial toolkit. For applications where location queries are central rather than incidental, PostGIS is the stronger choice.
| System | Why MySQL Works |
|---|---|
| Social Network (Facebook) | Read-heavy, mature replication |
| E-commerce (Shopify) | ACID for orders, read replicas for catalog |
| Content Management | Simple CRUD, easy scaling |
| User Authentication | Reliable, ACID transactions |
| URL Shortener | Simple schema, high read volume |
| Chat Metadata | User profiles, relationships |
In practice: When proposing MySQL in an interview, connect your choice to the specific requirements. For a social network with high read volume, mention that MySQL's replication model allows you to scale reads by adding replicas.
For an e-commerce platform, emphasize InnoDB's ACID transactions for order processing. The strength of the answer comes from matching MySQL's capabilities to the problem, not from claiming MySQL is always the best choice.
One of MySQL's distinctive architectural choices is its pluggable storage engine layer. The SQL parser, optimizer, and connection handling are separate from the component that actually stores and retrieves data. This means different tables can use different storage engines, each with different characteristics.
In practice, this flexibility mostly matters for understanding MySQL's history and the occasional legacy system. For modern applications, the answer is simple: use InnoDB. But understanding why requires knowing what InnoDB provides that alternatives do not.
InnoDB became the default storage engine in MySQL 5.5 (2010), reflecting its maturity and the industry's need for transactional guarantees. Today, there is rarely a reason to choose anything else.
Key features: row-level locking, full ACID transactions, MVCC, crash recovery via the redo log, foreign key support, and a clustered primary key index.
MyISAM was the default storage engine before MySQL 5.5 and still appears in a few legacy or niche deployments.
Key features: table-level locking, no transactions, a smaller on-disk footprint, and fast full-table scans for read-only workloads.
| Factor | InnoDB | MyISAM |
|---|---|---|
| Transactions | Yes | No |
| Locking | Row-level | Table-level |
| Foreign keys | Yes | No |
| Crash recovery | Yes | No |
| Full-text search | Yes (5.6+) | Yes |
| Concurrency | High | Low |
| Use case | Production systems | Read-only analytics, legacy |
Modern recommendation: Use InnoDB for everything. MyISAM's remaining advantages have largely been eliminated in recent MySQL versions.
Understanding how InnoDB works internally is not just academic knowledge. It explains why certain configurations matter, why some queries are fast and others slow, and what happens when things go wrong.
This understanding demonstrates depth that separates senior engineers from those who just use MySQL without understanding it.
The buffer pool is InnoDB's most important component for performance. It is an in-memory cache that holds data pages, index pages, and other metadata. When MySQL reads a row, it first checks the buffer pool. If the page is there (a "hit"), no disk I/O is needed. If not (a "miss"), it must read from disk, which is orders of magnitude slower.
Configuration: the key setting is innodb_buffer_pool_size; on a dedicated database server, 50–75% of RAM is a common starting point.
Sizing guideline: aim to hold the working set (hot data and indexes) in memory; a high miss rate under steady load means the pool is too small.
Monitoring: watch the buffer pool hit rate via the Innodb_buffer_pool_read_requests and Innodb_buffer_pool_reads status counters.
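A minimal sketch of the knobs involved (the variable and status counters are real MySQL names; the sizing value is illustrative for a dedicated 64 GB server):

```sql
-- Illustrative sizing for a dedicated 64 GB database server (resizable online in MySQL 5.7+)
SET GLOBAL innodb_buffer_pool_size = 48 * 1024 * 1024 * 1024;

-- Hit rate: requests served from memory vs. reads that had to go to disk
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read_requests';
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_reads';
-- Hit ratio ~= 1 - (Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests)
```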
How can MySQL acknowledge a commit immediately while data pages are still only in memory?
The answer is the redo log, also known as the write-ahead log (WAL). Instead of writing changed pages to their final locations on disk (which requires random I/O), MySQL writes a compact description of the change to the redo log (sequential I/O).
Why redo logs? Sequential appends are far cheaper than random page writes, so a commit can be made durable quickly; the modified pages are flushed to their final locations later by background threads, and after a crash the redo log is replayed to recover committed changes.
Configuration: the key setting is innodb_flush_log_at_trx_commit, which controls when the redo log is flushed to disk.
innodb_flush_log_at_trx_commit options:
| Value | Behavior | Durability | Performance |
|---|---|---|---|
| 1 | Flush to disk on every commit | Full | Slowest |
| 2 | Flush to OS buffer on commit | OS crash = data loss | Medium |
| 0 | Flush every second | Up to 1s data loss | Fastest |
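A minimal sketch of how this is set (the variable is a real MySQL setting; which value to choose depends on your durability requirements):

```sql
-- Full durability (the default): flush and sync the redo log on every commit
SET GLOBAL innodb_flush_log_at_trx_commit = 1;

-- Relaxed durability: write to the OS buffer on commit, sync roughly once per second;
-- acceptable only if losing up to ~1s of commits on an OS crash is tolerable
-- SET GLOBAL innodb_flush_log_at_trx_commit = 2;
```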
A common database problem: if one transaction is reading a row while another is modifying it, what should the reader see? Locking the row until the writer commits would work but kills concurrency. Letting the reader see uncommitted changes creates inconsistent reads.
InnoDB solves this with MVCC (Multi-Version Concurrency Control). When a transaction modifies a row, InnoDB keeps the old version in the undo log. Readers see the version that was committed when their transaction started, regardless of concurrent modifications. Writers see their own uncommitted changes. Neither blocks the other.
How MVCC works: each row carries the ID of the transaction that last modified it, older versions are chained in the undo log, and a transaction's read view determines which version of each row it can see.
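A minimal two-session sketch of this behavior under the default REPEATABLE READ level (the `accounts` table and values are hypothetical):

```sql
-- Session A (REPEATABLE READ, the InnoDB default)
START TRANSACTION;
SELECT balance FROM accounts WHERE id = 1;   -- returns 100

-- Session B, concurrently
START TRANSACTION;
UPDATE accounts SET balance = 50 WHERE id = 1;
COMMIT;

-- Session A again
SELECT balance FROM accounts WHERE id = 1;   -- still 100: reads its snapshot, never blocked
COMMIT;

SELECT balance FROM accounts WHERE id = 1;   -- now 50: a new statement sees the latest commit
```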
One of InnoDB's most important architectural decisions is the clustered index: the table data itself is stored as a B+tree, ordered by primary key. This is not just an index pointing to data stored elsewhere. The leaf nodes of the primary key index contain the actual row data.
Implications: primary key lookups are fast because the row data lives in the index itself; range scans over the primary key read rows in order; every secondary index stores the primary key value, so a large primary key inflates all secondary indexes; and random primary keys cause page splits and fragmentation.
Primary key best practices:
| Approach | Pros | Cons |
|---|---|---|
| Auto-increment INT | Compact, sequential inserts | Hotspot on last page |
| UUID | No hotspot, globally unique | Large (16 bytes), random inserts |
| Ordered UUID | Unique, sequential | Still 16 bytes |
| Natural key | Meaningful | May change, often large |
In practice: Primary key choice has performance implications that go beyond uniqueness. Auto-increment integers provide compact keys and sequential inserts, which minimize page splits and keep the clustered index efficient. UUIDs spread inserts randomly across the index, causing more page splits and fragmentation.
However, UUIDs may be necessary for distributed systems where generating sequential IDs requires coordination. Ordered UUIDs (like ULIDs) offer a middle ground: globally unique but roughly sequential.
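A sketch of a common compromise, assuming a hypothetical `orders` table: an internal auto-increment primary key keeps the clustered index compact and append-only, while a separate unique UUID column serves as the external identifier.

```sql
CREATE TABLE orders (
  id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,  -- compact, append-only clustered index
  public_id  BINARY(16)      NOT NULL,                 -- externally visible UUID/ULID
  user_id    BIGINT UNSIGNED NOT NULL,
  status     VARCHAR(20)     NOT NULL,
  created_at DATETIME        NOT NULL,
  PRIMARY KEY (id),
  UNIQUE KEY uk_public_id (public_id)
) ENGINE=InnoDB;
```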
The difference between a query taking 50 milliseconds and 5 seconds often comes down to indexing. A missing index forces MySQL to scan every row in a table. The right index lets MySQL jump directly to the relevant rows.
But indexes are not free. Each index slows down writes because MySQL must update both the table and every affected index. Each index consumes storage. The skill lies in creating indexes that support your query patterns without creating unnecessary overhead.
MySQL uses B+Tree indexes for all standard indexes, both primary and secondary. Understanding this structure explains why some queries use indexes and others do not.
Key properties: entries are kept in sorted order, so the same index serves equality lookups, range scans, and prefix matches; leaf nodes are linked for efficient range traversal; and the tree stays shallow (typically three or four levels), so a lookup touches only a handful of pages.
Because InnoDB stores table data in the clustered primary key index, secondary indexes work differently than you might expect. A secondary index does not contain a pointer to the row's physical location. Instead, it contains the primary key value.
Looking up a row through a secondary index requires two steps: find the primary key in the secondary index, then find the row in the primary index.
This two-step lookup explains why covering indexes (discussed below) are valuable: they eliminate the second step by including all needed columns in the secondary index itself.
When queries filter on multiple columns, a composite index on those columns can be far more effective than separate indexes. But column order matters critically.
Think of a composite index like a phone book sorted by last name, then first name. You can find all Smiths. You can find all Smiths named John. But you cannot efficiently find all Johns regardless of last name, because the data is not organized that way.
Leftmost prefix rule: an index on (a, b, c) can serve queries that filter on a, on (a, b), or on (a, b, c), but not queries that filter only on b or c.
A covering index includes every column the query needs, so MySQL can answer the query from the index alone and skip the table lookup.
EXPLAIN shows "Using index" for covered queries:
When MySQL chooses the wrong index, you can hint it toward the right one:
Use sparingly. Usually indicates a statistics problem or schema issue.
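A hedged example of the syntax, assuming a hypothetical `idx_user_created` index on `orders`:

```sql
SELECT *
FROM orders FORCE INDEX (idx_user_created)
WHERE user_id = 42
ORDER BY created_at DESC;
```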
For text search within columns:
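A sketch assuming a hypothetical `articles` table (InnoDB has supported FULLTEXT indexes since 5.6):

```sql
ALTER TABLE articles ADD FULLTEXT INDEX ft_title_body (title, body);

SELECT id, title
FROM articles
WHERE MATCH(title, body) AGAINST('replication lag' IN NATURAL LANGUAGE MODE);
```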
Choosing the right index approach comes down to understanding your query patterns:
| Query Pattern | Index Strategy |
|---|---|
| Equality (WHERE a = ?) | Index on (a) |
| Equality + Range (a = ? AND b > ?) | Index on (a, b) |
| Multiple equalities (a = ? AND b = ?) | Index on (a, b) or (b, a) |
| ORDER BY | Include sort columns in index |
| GROUP BY | Index on grouped columns |
| JOIN | Index on join columns |
| SELECT specific columns | Covering index |
| Anti-Pattern | Problem | Solution |
|---|---|---|
| Too many indexes | Slow writes | Remove unused indexes |
| Indexes on low-cardinality columns | Not selective | Index high-cardinality columns instead |
| Function on indexed column | Index not used | Rewrite query |
| Leading wildcard LIKE | Full scan | Full-text search |
| Over-indexing | Maintenance overhead | Index for actual queries |
In practice: When discussing database design, tie your indexing strategy to specific query patterns. If the primary access pattern is "get all orders for a user, sorted by date," explain that you would create an index on (user_id, created_at). If the query also selects only order_id and status, mention that including those columns creates a covering index that eliminates the table lookup entirely.
This level of specificity demonstrates that you understand indexing as targeted optimization, not a generic checkbox.
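A sketch of that example, assuming the hypothetical `orders` table from earlier:

```sql
-- "Orders for a user, newest first," selecting only the id and status
CREATE INDEX idx_user_created ON orders (user_id, created_at, status);
-- The primary key (id) is stored in every secondary index automatically,
-- so this index covers the query below

EXPLAIN
SELECT id, status
FROM orders
WHERE user_id = 42
ORDER BY created_at DESC
LIMIT 20;   -- expect ref access on idx_user_created and Extra: "Using index"
```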
Concurrency is where database systems get hard. As soon as multiple transactions touch the same rows at the same time, you risk lost updates, inconsistent reads, or overselling inventory. Locks prevent these conflicts, but locking too aggressively can destroy throughput and increase tail latency.
InnoDB sits in the middle: it offers strong correctness guarantees while still enabling high concurrency through MVCC + fine-grained locks. Understanding how InnoDB locks work is essential if you want to design systems that behave correctly under real-world contention.
Isolation levels define what one transaction is allowed to observe about another transaction’s work. Stronger isolation reduces anomalies but can increase locking and reduce concurrency.
| Level | Dirty Read | Non-Repeatable Read | Phantom Read |
|---|---|---|---|
| READ UNCOMMITTED | Yes | Yes | Yes |
| READ COMMITTED | No | Yes | Yes |
| REPEATABLE READ (default) | No | No | Yes* |
| SERIALIZABLE | No | No | No |
* InnoDB’s REPEATABLE READ prevents most phantom reads by using gap / next-key locks on range queries.
InnoDB doesn’t only lock “rows.” It can also lock gaps between rows to prevent conflicting inserts.
Record Lock: Locks a single index record.
Gap Lock: Locks the gap between index records. Prevents inserts in the range.
Next-Key Lock: A next-key lock combines both behaviors: it locks the matching records and the gaps around them. This is one of the mechanisms InnoDB uses to reduce phantom-like anomalies in REPEATABLE READ.
MySQL gives you explicit control over whether reads should acquire locks.
These are foundational tools for building correct systems: inventory deduction, job processing, seat booking, wallet transfers, and more.
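A sketch of a locking read for inventory deduction (table and values are hypothetical):

```sql
START TRANSACTION;

-- Exclusive lock: concurrent writers (and other FOR UPDATE readers) must wait
SELECT quantity FROM inventory WHERE product_id = 7 FOR UPDATE;

-- The application checks quantity > 0, then:
UPDATE inventory SET quantity = quantity - 1 WHERE product_id = 7;

COMMIT;

-- FOR SHARE (LOCK IN SHARE MODE before MySQL 8.0) takes a shared lock instead:
-- other shared readers proceed, but writers block until this transaction ends.
```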
A deadlock happens when two transactions each hold locks the other needs, so neither can progress.
InnoDB handles deadlocks well: it detects the cycle automatically and rolls back one of the transactions, which receives `ERROR 1213: Deadlock found when trying to get lock`. The application should catch this error and retry the transaction. To inspect the most recent deadlock:
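The following shows the standard way to view the most recent deadlock; `innodb_print_all_deadlocks` is optional and logs every deadlock to the error log:

```sql
-- Shows a "LATEST DETECTED DEADLOCK" section with the statements and locks involved
SHOW ENGINE INNODB STATUS\G

-- Optionally record every deadlock in the error log
SET GLOBAL innodb_print_all_deadlocks = ON;
```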
How to reduce deadlocks in practice: access rows in a consistent order across transactions, keep transactions short, index the columns you filter on so fewer rows are locked, and build retry logic for error 1213.
There are two broad approaches to handling write conflicts.
Pessimistic Locking: Lock the row first, then update it.
Optimistic Locking: Allow concurrent reads, then detect conflicts at update time using a version column.
| Approach | Best For | Trade-off |
|---|---|---|
| Pessimistic | High contention | Blocks other transactions |
| Optimistic | Low contention | Retry overhead on conflict |
A good rule of thumb: if you expect many users to compete for the same rows at the same time, lock early (pessimistic). If conflicts are rare, avoid locks and handle the occasional retry (optimistic).
In practice: The choice between pessimistic and optimistic locking depends on conflict frequency. For inventory deduction in a flash sale where many users compete for limited stock, pessimistic locking (SELECT FOR UPDATE) prevents overselling by blocking concurrent access.
For user profile updates where conflicts are rare, optimistic locking reduces database blocking at the cost of occasional retry overhead when conflicts do occur. Both approaches are valid; the context determines which is better.
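A sketch of both approaches, using hypothetical `products` and `profiles` tables:

```sql
-- Pessimistic: lock first, then update (flash-sale inventory)
START TRANSACTION;
SELECT stock FROM products WHERE id = 7 FOR UPDATE;
UPDATE products SET stock = stock - 1 WHERE id = 7 AND stock > 0;
COMMIT;

-- Optimistic: no lock; detect conflicts with a version column (profile updates)
UPDATE profiles
SET display_name = 'New Name',
    version = version + 1
WHERE user_id = 42
  AND version = 3;          -- the version the application read earlier
-- 0 affected rows means another writer won the race: re-read and retry
```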
A single MySQL server is both a performance bottleneck and a single point of failure. Replication addresses both: replicas can serve read queries, distributing the read load, and if the primary fails, a replica can be promoted to take over.
MySQL's replication has been refined over two decades. It is one of MySQL's strongest selling points, with operational tooling and community knowledge that newer databases cannot match. Understanding the various replication configurations, their trade-offs, and when to use each is essential for system design discussions.
MySQL replication works by recording changes on the primary server to a binary log, then replaying those changes on replica servers. The process involves three threads: the primary's binary log writer, the replica's I/O thread that reads from the primary, and the replica's SQL thread that applies the changes.
At a high level, the flow looks like this: the primary records each committed change in its binlog; the replica's I/O thread pulls those events and appends them to a local relay log; the replica's SQL thread then applies the relay log events to the replica's data.
Binlog events can be recorded in different formats:
| Format | Content | Pros | Cons |
|---|---|---|---|
| Statement | SQL statements | Compact | Non-deterministic issues |
| Row | Actual row changes | Deterministic | Larger logs |
| Mixed | Combination | Balance | Complexity |
In most production setups, row-based replication is the default choice because it avoids subtle correctness issues with functions like NOW(), RAND(), triggers, and non-deterministic execution paths.
Asynchronous (default): The source does not wait for replicas. It commits locally, acknowledges the client, and replicas catch up later.
Semi-synchronous: The source waits until at least one replica confirms it has received the transaction’s binlog event (not necessarily applied it) before acknowledging commit.
Replication isn’t just “primary + replicas.” The topology determines scaling limits, failure behavior, and operational complexity.
The single-primary, multi-replica topology is the most common: simple and effective for read scaling.
Chained (cascading) replication reduces load on the primary by having replicas replicate from other replicas.
Multi-source replication: one replica pulls from multiple sources.
Useful for aggregation and specialized workflows, but it raises conflict/ordering complexity.
Multi-primary (active-active): two primaries accept writes and replicate to each other.
This is usually avoided unless you have a robust conflict-resolution strategy, because write conflicts are inevitable.
MySQL Group Replication provides a higher-level replication system with built-in membership, failover, and consensus-based ordering. It can run in single-primary mode (one member accepts writes) or multi-primary mode (every member accepts writes).
Key characteristics: group membership is managed automatically, transactions are certified by group consensus before commit, and a failed primary is replaced without manual intervention.
Trade-offs: writes pay extra latency for group communication, all members need a reliable low-latency network, groups are limited in size (up to nine members), and tables must use InnoDB with a primary key.
Replication is rarely truly “instant.” Lag happens for predictable reasons: large transactions that take a long time to apply, write bursts that outpace the applier, limited applier parallelism, and replicas running on weaker hardware or competing with heavy read traffic.
Monitor lag on replicas:
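For example (the first statement uses the MySQL 8.0.22+ syntax; older versions use `SHOW SLAVE STATUS`):

```sql
SHOW REPLICA STATUS\G
-- Key field: Seconds_Behind_Source (Seconds_Behind_Master in older versions)

-- Per-worker applier detail is also available through the Performance Schema
SELECT * FROM performance_schema.replication_applier_status_by_worker\G
```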
Common strategies to deal with lag: route read-your-own-writes traffic to the primary, monitor lag and pull badly lagging replicas out of rotation, enable parallel (multi-threaded) replication appliers, and break huge transactions into smaller batches.
In practice: Replication configuration depends on your requirements. For a typical web application prioritizing performance, asynchronous replication with row-based format provides good throughput with acceptable durability (you might lose the last few transactions on primary failure).
For systems where losing any committed transaction is unacceptable, semi-synchronous replication ensures at least one replica has received the data before acknowledging the commit. For automatic failover without human intervention, Group Replication provides consensus-based high availability at the cost of write latency.
Vertical scaling eventually hits a wall. You can only add so much CPU, RAM, and I/O to a single machine before the gains flatten out or the price becomes unreasonable. Replication helps you scale reads, but it does almost nothing for writes, because every write still funnels through the primary.
When a single primary becomes your bottleneck, sharding is the next step.
Sharding splits data across multiple database instances (shards). Each shard owns a subset of the data and handles a subset of the traffic. This distributes storage and write throughput, but it introduces operational and application complexity that you should not accept lightly.
Sharding is typically justified when at least one of these becomes true: write throughput exceeds what a single primary can sustain even after tuning, the dataset no longer fits (or performs) on a single machine, or operational tasks such as backups and schema changes become unmanageable at the current size.
Range sharding: split by contiguous key ranges.
Pros: routing is simple, and range queries over the shard key stay on one shard.
Cons: prone to hotspots (for example, the newest range receives most writes) and uneven data distribution.
Hash sharding: choose the shard by hashing the key.
Pros: distributes data and load evenly.
Cons: range queries over the key must hit every shard, and resharding is disruptive unless you use consistent hashing or pre-split into many virtual shards.
Directory-based sharding: use a lookup service/table to map keys to shards.
Pros: flexible; individual keys or tenants can be moved and rebalanced without changing a hash scheme.
Cons: the directory is an extra component to operate and a potential bottleneck or single point of failure; see the sketch below.
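A minimal sketch of the directory approach (table name and columns are hypothetical; in practice the directory is cached aggressively):

```sql
CREATE TABLE shard_directory (
  user_id  BIGINT UNSIGNED NOT NULL PRIMARY KEY,
  shard_id SMALLINT UNSIGNED NOT NULL
);

-- Routing: look up the shard, then send the real query to that shard's database
SELECT shard_id FROM shard_directory WHERE user_id = 42;
```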
The shard key determines almost everything: distribution, routing, and query shape. A bad shard key can make sharding worse than a single database.
Good shard key properties: high cardinality, even distribution of reads and writes, and presence in most queries so that requests can be routed to a single shard.
Common shard keys:
| Entity | Good Shard Key | Why |
|---|---|---|
| Users | user_id | Even distribution, used in most queries |
| Orders | user_id (not order_id) | Keeps user's orders together |
| Messages | conversation_id | Messages grouped by conversation |
| Multi-tenant | tenant_id | Isolates tenant data |
A practical heuristic: shard by the key that your application most naturally routes by (“who owns this data?”).
Sharding is easy when queries are “single-key, single-shard.” It gets expensive when queries need data from many shards.
Vitess is a popular open-source system for scaling MySQL horizontally. It sits between the application and MySQL and provides a sharding-aware database layer.
It’s widely used in large-scale production environments (notably built at YouTube and adopted by many others) because it reduces how much sharding logic leaks into application code.
Sharding is powerful, but it’s a “you own the complexity forever” decision. Exhaust simpler levers first: query and index optimization, read replicas, caching, archiving cold data, and vertical scaling.
Shard only when the single-primary architecture is the hard blocker and you’re confident the access patterns justify the added complexity.
In practice: Sharding should be a last resort, not a first instinct. Before proposing sharding, explain that you would first optimize queries and indexes, then add read replicas for read scaling, then implement caching for hot data. Only when these approaches are insufficient should you consider sharding.
If sharding becomes necessary, the shard key choice is critical: it should align with your primary access patterns so that most queries hit a single shard. Cross-shard queries and transactions are expensive and should be exceptional, not normal.
When a system slows down, it’s rarely “the database” in the abstract. It’s usually a small number of expensive queries doing disproportionate damage—scanning too many rows, missing the right index, sorting huge result sets, or triggering a bad join plan. One poorly indexed query can spike CPU, saturate I/O, and cascade into timeouts across the entire application.
You can’t optimize what you don’t understand. Start by inspecting how MySQL plans to execute your query.
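A sketch, assuming the hypothetical `orders` table and index from the indexing section:

```sql
EXPLAIN
SELECT id, status
FROM orders
WHERE user_id = 42
  AND created_at >= '2024-01-01'
ORDER BY created_at DESC;

-- MySQL 8.0.18+ can also execute the query and report measured costs and timings:
-- EXPLAIN ANALYZE SELECT ...;
```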
| Column | Meaning |
|---|---|
| type | Join type (const, ref, range, index, ALL) |
| possible_keys | Indexes that could be used |
| key | Index actually used |
| rows | Estimated rows to examine |
| filtered | Percentage of rows filtered by condition |
| Extra | Additional information |
| Type | Meaning | Performance |
|---|---|---|
| const | Single row by primary key | Excellent |
| eq_ref | One row per join | Excellent |
| ref | Multiple rows by index | Good |
| range | Index range scan | Good |
| index | Full index scan | Medium |
| ALL | Full table scan | Poor |
As a first pass: avoid ALL on large tables, and be wary when rows is huge.
If the query can be answered entirely from an index, MySQL avoids extra table lookups.
A good signal in EXPLAIN is Extra: Using index (meaning the index covers the query).
Avoid `SELECT *`: it increases I/O and prevents some optimizations, especially when rows are wide.
If MySQL can walk an index in the needed order, it can stop early instead of sorting a large dataset.
Batching related writes (multi-row inserts, grouping statements into one transaction) reduces round trips and transaction overhead.
These patterns often block index usage or force expensive scans: functions applied to indexed columns, leading-wildcard LIKE patterns, implicit type conversions (for example, comparing a string column to a number), and OR conditions across differently indexed columns.
Offset pagination looks simple, but it gets slower as offsets grow because MySQL still has to walk past skipped rows.
For stable ordering with ties (common in feeds), use a composite cursor:
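A sketch of both forms, assuming a hypothetical `articles` table with an index on (created_at, id); the cursor values are whatever the previous page returned last:

```sql
-- Offset pagination: deep pages walk past every skipped row
SELECT id, title FROM articles ORDER BY created_at DESC, id DESC LIMIT 20 OFFSET 20000;

-- Keyset (cursor) pagination: resume from the last row of the previous page
SELECT id, title
FROM articles
WHERE (created_at, id) < ('2024-06-01 12:00:00', 98765)
ORDER BY created_at DESC, id DESC
LIMIT 20;
-- Supported by an index on (created_at, id)
```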
Optimization starts with visibility. The goal is to identify slow queries, quantify impact, and track regressions.
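A sketch of the usual starting points (the settings and the Performance Schema table are real MySQL features; the thresholds are illustrative):

```sql
-- Enable the slow query log (thresholds are illustrative)
SET GLOBAL slow_query_log = ON;
SET GLOBAL long_query_time = 0.5;                 -- seconds
SET GLOBAL log_queries_not_using_indexes = ON;

-- Which query patterns consume the most total time (Performance Schema digests)
SELECT digest_text,
       count_star            AS executions,
       sum_timer_wait / 1e12 AS total_seconds     -- timer values are in picoseconds
FROM performance_schema.events_statements_summary_by_digest
ORDER BY sum_timer_wait DESC
LIMIT 10;
```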
In practice: A methodical approach to query optimization starts with identifying which queries matter. The slow query log reveals queries exceeding a time threshold. Performance Schema provides aggregate statistics showing which query patterns consume the most total time.
For the top offenders, EXPLAIN reveals the execution plan: look for type = ALL (full table scan), missing index usage, and row estimates far exceeding actual needs. Then add or modify indexes, rewrite queries, or add caching as appropriate.
Interviewers frequently ask why you chose MySQL over alternatives. The answer should not be "it is what I know" or "it is the most popular." Instead, demonstrate understanding of how different databases make different trade-offs, and why those trade-offs align with your specific requirements.
This is the most common comparison. Both are mature, open-source relational databases with strong ecosystems. The choice often depends on specific requirements and team expertise.
| Aspect | MySQL | PostgreSQL |
|---|---|---|
| Replication | Mature, simpler | More options, complex |
| JSON support | JSON type | JSONB (better indexing) |
| Full-text search | Basic | More powerful |
| Window functions | Basic (8.0+) | Comprehensive |
| Extensions | Limited | Rich ecosystem |
| Learning curve | Easier | Steeper |
| Performance | Faster for simple queries | Better for complex queries |
Choose MySQL: Simpler setup, read-heavy web applications, existing MySQL expertise.
Choose PostgreSQL: Complex queries, advanced SQL features, JSONB, geospatial.
| Aspect | MySQL | MongoDB |
|---|---|---|
| Data model | Relational | Document |
| Schema | Fixed | Flexible |
| Transactions | Full ACID | ACID (since 4.0) |
| JOINs | Native | $lookup (limited) |
| Scaling | Manual sharding | Native sharding |
| Query language | SQL | MQL |
Choose MySQL: Structured data, complex relationships, strong consistency.
Choose MongoDB: Flexible schema, document-oriented data, rapid iteration.
| Aspect | MySQL | Cassandra |
|---|---|---|
| Model | Relational | Wide-column |
| Consistency | Strong | Tunable |
| Writes | Good | Excellent |
| Reads | Excellent | Good |
| Scaling | Manual | Native |
| Query flexibility | High | Low (query-first) |
Choose MySQL: Complex queries, ACID transactions, moderate scale.
Choose Cassandra: Extreme write throughput, time-series, known access patterns.
| Aspect | MySQL | TiDB |
|---|---|---|
| Compatibility | Native | MySQL protocol |
| Scaling | Manual | Automatic horizontal |
| Consistency | Single node | Distributed ACID |
| Complexity | Lower | Higher |
| Operations | Mature | Newer |
Choose MySQL: Data fits on one node, simpler operations.
Choose TiDB: Need horizontal scaling with MySQL compatibility.
MySQL excels for read-heavy web applications where operational simplicity and reliability matter more than advanced SQL features. The depth of understanding you demonstrate about MySQL's architecture and trade-offs signals to interviewers that you can make sound database decisions under real-world constraints.
MySQL's sweet spot is read-heavy web applications with structured data: social networks, e-commerce catalogs, content management. Its mature replication makes read scaling straightforward, and its operational tooling reduces the 3 AM surprises. Acknowledge its limitations (write scaling requires sharding, advanced SQL features lag PostgreSQL) and explain how you would address them.
The buffer pool determines read performance. The redo log enables durability without sacrificing speed. MVCC allows readers and writers to coexist. Clustered indexes mean primary key choice affects all queries. This understanding lets you explain why certain configurations matter and predict where bottlenecks will appear.
The leftmost prefix rule determines whether a composite index helps a query. Covering indexes eliminate the secondary-to-primary lookup. Understanding these mechanics lets you design indexes for specific query patterns rather than guessing.
Pessimistic locking (FOR UPDATE) prevents conflicts but reduces concurrency. Optimistic locking (version columns) allows concurrency but requires retry logic. Neither is universally better; the right choice depends on conflict frequency.
Asynchronous replication provides performance at the cost of potential data loss. Semi-synchronous ensures durability at the cost of latency. Group Replication provides automatic failover at the cost of complexity. Match the configuration to the system's requirements.
Sharding adds complexity that is difficult to remove. Query optimization, read replicas, and caching solve many scaling problems without the operational burden of distributed data. When sharding becomes necessary, the shard key choice determines whether the system remains manageable.
When you propose MySQL in an interview, the strength of your answer lies not in claiming MySQL is always the best choice, but in articulating exactly why it fits this particular problem, what trade-offs you are accepting, and how you would operate it as the system grows.
In a typical MySQL deployment with a proxy layer, what is a primary benefit of routing application connections through ProxySQL instead of connecting directly to MySQL?