Last Updated: February 3, 2026
In the previous chapter, we explored network partitions: situations in which nodes in a distributed system cannot communicate with each other even though they are all still running. Partitions are dangerous, but the real disaster happens when both sides of a partition believe they are in charge. This is the split-brain problem.
Imagine a database cluster with a primary node and two replicas. The primary handles all writes, coordinating with replicas to maintain consistency. Now a network partition isolates the primary from the replicas. The replicas cannot see the primary, so they elect one of themselves as the new primary.
Meanwhile, the original primary is still running, still accepting writes, unaware that it has been replaced. You now have two primaries, both accepting writes, creating conflicting data that will be nearly impossible to reconcile.
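The scenario above can be sketched as a small simulation. This is a minimal illustration rather than a real replication protocol: the `Node` class, the node names, and the `balance` key are all hypothetical, and the election is reduced to a single assignment.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical database node; the name and fields are illustrative."""
    name: str
    is_primary: bool = False
    data: dict = field(default_factory=dict)

    def write(self, key, value):
        # A node accepts writes only while it believes it is the primary.
        if not self.is_primary:
            raise RuntimeError(f"{self.name} is not primary")
        self.data[key] = value


def simulate_split_brain():
    primary = Node("A", is_primary=True)
    replica_b, replica_c = Node("B"), Node("C")

    # Partition: A is cut off from B and C. The replicas cannot see A,
    # so they elect B. A never hears about the election and stays primary.
    replica_b.is_primary = True

    # Clients on each side of the partition write to the same key.
    primary.write("balance", 100)
    replica_b.write("balance", 250)

    primaries = [n.name for n in (primary, replica_b, replica_c) if n.is_primary]
    # Keys that both primaries wrote with different values.
    conflicts = {
        key: (primary.data[key], replica_b.data[key])
        for key in primary.data.keys() & replica_b.data.keys()
        if primary.data[key] != replica_b.data[key]
    }
    return primaries, conflicts
```

Calling `simulate_split_brain()` reports two nodes that consider themselves primary and one key with two different values. Neither side has any information that says which value should win.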
Split-brain is one of the most feared failure modes in distributed systems. It can cause data corruption, duplicate transactions, and inconsistencies that persist long after the partition heals. Preventing split-brain requires careful design, and even then, safety cannot be guaranteed under all circumstances.
In this chapter, you will learn: