A rate limiter is a system component that controls the rate at which clients can make requests to a service within a specified time window.
The core idea is simple: track how many requests a client has made and reject or throttle additional requests once they exceed a predefined threshold. This prevents abuse, protects backend resources, and ensures fair usage across all clients.
Loading simulation...
Example: If a system allows a maximum of 100 requests per minute, any request beyond that limit within the same minute would either be throttled (delayed) or rejected outright, often with an HTTP 429 Too Many Requests response.
In this article, we will dive into the system design of a distributed rate limiter, and explore the the 5 most commonly used rate limiting algorithms with examples, pros and cons.
This problem is a common choice in system design interviews because it touches on distributed systems, caching, algorithms, and real-time decision making. It is also a fundamental building block that appears in almost every large-scale system.
Let’s begin by clarifying the requirements.