Last Updated: June 8, 2026
Microservices run multiple instances for capacity and redundancy. A load balancer spreads traffic across them so no single instance becomes overloaded while others sit idle.
This sounds simple, but real systems add complications: instances scale up and down, unhealthy instances must be skipped, some requests need sticky routing (repeat requests from one client must reach the same instance), and some instances have more capacity than others.
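To make two of these ideas concrete, here is a minimal sketch of round-robin selection that skips unhealthy instances. The `Instance` and `RoundRobinBalancer` names, the `healthy` flag, and the addresses are assumptions made up for illustration; a real balancer would set health from active or passive checks and would need to be safe under concurrent use.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instance:
    # Hypothetical record for one service instance (illustration only).
    address: str
    healthy: bool = True


class RoundRobinBalancer:
    def __init__(self, instances: List[Instance]) -> None:
        self._instances = instances
        self._next = 0  # index of the next instance to try

    def pick(self) -> Optional[Instance]:
        # Try each instance at most once, starting where the last pick left off.
        for _ in range(len(self._instances)):
            candidate = self._instances[self._next]
            self._next = (self._next + 1) % len(self._instances)
            if candidate.healthy:
                return candidate
        return None  # no healthy instance is available


# Example pool: the second instance is marked unhealthy and should be skipped.
pool = [
    Instance("10.0.0.1:8080"),
    Instance("10.0.0.2:8080", healthy=False),
    Instance("10.0.0.3:8080"),
]
balancer = RoundRobinBalancer(pool)
picks = [balancer.pick().address for _ in range(4)]
```

In this sketch, four consecutive picks alternate between the first and third instances, because the unhealthy second instance is passed over each time. It treats all healthy instances as equal, so it does not address differing capacity or sticky routing.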
This chapter covers common load-balancing algorithms, Layer 4 versus Layer 7 balancing, health checks, consistent hashing, and sticky sessions.