What happens when your main database server crashes?
If you’ve designed your system with replication, nothing much, the system keeps running.
In this chapter, we'll explore why replication exists, how it improves availability and scalability, and how to choose between replication strategies like synchronous, asynchronous, and semi-synchronous.
You’ll also learn how real-world databases like MySQL, PostgreSQL, and MongoDB handle replication under the hood.
Replication is the process of creating and maintaining multiple copies of the same data on different servers, known as replicas.
The primary goal is to ensure that your system remains operational and your data remains safe, even if one or more servers go offline. This simple concept is the foundation for achieving high availability, fault tolerance, and read scalability.
However, keeping multiple copies of data perfectly synchronized across a network introduces complex challenges around consistency, latency, and failure handling.
Goal | Description | Example | 
|---|---|---|
High Availability  | Keep service running when one node fails  | Primary DB crashes → replica takes over  | 
Read Scalability  | Distribute read load  | Multiple app servers query replicas  | 
Disaster Recovery  | Maintain cross-region backups  | US → EU replication  | 
Fault Tolerance  | Survive hardware or network failures  | Leader crash → follower promotion  |