Your service is running smoothly in a single data center. Then an earthquake hits, the power grid fails, or a fiber cable gets cut. Suddenly, millions of users cannot access your application.
This is not hypothetical. In February 2017, an Amazon S3 outage in us-east-1 took down a significant portion of the internet. Companies relying solely on that region were offline for hours. In October 2021, a faulty configuration change caused Facebook's BGP routes to be withdrawn, taking WhatsApp, Instagram, and Facebook itself offline globally for about six hours.
Multi-region architecture reduces the chance that one regional outage becomes a total service outage. Deploying across multiple geographic regions limits the blast radius of a regional failure, but you still have to design routing, data replication, failover, and operations carefully.
But going multi-region is not just about disaster recovery. It is also about latency. Physics is unforgiving: light travels through fiber at about 200,000 km/s, and a round trip from Tokyo to Virginia can take around 150 ms before any application work is counted.
A user in Tokyo should not have to wait that long for every request if the workload can be served from a nearby region.
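To make the latency arithmetic concrete, here is a minimal sketch that computes the propagation floor for a round trip over fiber. The 200,000 km/s figure comes from the text above. The 11,000 km one-way distance between Tokyo and Virginia is a rough assumption for illustration, and the function and constant names are hypothetical.

```python
# Approximate speed of light in fiber, as stated in the text.
FIBER_SPEED_KM_PER_S = 200_000


def min_round_trip_ms(distance_km: float,
                      speed_km_per_s: float = FIBER_SPEED_KM_PER_S) -> float:
    """Lower bound on round-trip time (ms) for a one-way fiber path of distance_km."""
    if distance_km < 0:
        raise ValueError("distance_km must be non-negative")
    # Out and back, converted from seconds to milliseconds.
    return 2 * distance_km / speed_km_per_s * 1000


# ASSUMPTION: rough one-way distance. Real cable routes are longer than
# the straight-line path, so measured latency will be higher than this floor.
TOKYO_TO_VIRGINIA_KM = 11_000

floor_ms = min_round_trip_ms(TOKYO_TO_VIRGINIA_KM)
print(f"Propagation floor: {floor_ms:.0f} ms per round trip")
```

Under these assumptions the floor is roughly 110 ms, and indirect cable routes plus router and switch hops account for the rest of the gap up to the observed figure. The cost also multiplies: a request that needs several sequential round trips, such as connection setup followed by the request itself, pays this floor each time.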
The challenge is that multi-region is hard. Data may need to exist in multiple places, which means dealing with replication lag, consistency trade-offs, conflict resolution, and failover drills. Get it wrong, and you can end up with data loss, split-brain, or a system that is slower and less reliable than a single-region deployment.
In this chapter, we will cover the core patterns for multi-region architecture: why you might need it, the main architectural approaches, how to handle data replication, traffic routing strategies, failure handling, and the trade-offs you should be able to explain in an interview.
When interviewers ask "How would you make this globally available?" or "What happens if an entire region goes down?", they are probing for your understanding of multi-region strategies. The answer is never just "deploy in multiple regions." They want to know which pattern you would use, how you would handle data consistency, and what trade-offs you are making.