
Design Thread-Safe Rate Limiter

Last Updated: February 3, 2026


Ashish Pratap Singh


Common rate limiting algorithms

| Algorithm | How it works | Pros | Cons |
|---|---|---|---|
| Fixed Window | Count requests in fixed time windows (e.g., 100 req/minute) | Simple to implement | Bursts at window boundaries |
| Sliding Window | Rolling time window, smooths fixed window edges | Smoother limits | More complex, needs timestamped logs |
| Leaky Bucket | Requests "leak" out at a constant rate | Smooths traffic to a steady output rate | Can delay requests |
| Token Bucket | Tokens refill over time, consumed per request | Allows bursts, smooth sustained rate | Slightly more complex |

For this problem, we'll implement the Token Bucket algorithm because it's widely used in production systems. It provides a good balance: it allows short bursts of traffic (up to bucket capacity) while enforcing a sustained average rate.
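To make the mechanics concrete, here is a minimal single-threaded sketch (class and field names are illustrative, not part of the original article, and this version is deliberately not yet thread-safe). A token bucket only needs a capacity, a refill rate, and the time of the last refill:

```java
// Minimal, single-threaded token bucket sketch (illustrative names).
// Tokens accrue continuously at refillRatePerSecond, capped at capacity;
// each request consumes one token if available.
public class SimpleTokenBucket {
    private final long capacity;              // maximum burst size
    private final double refillRatePerSecond; // sustained average rate
    private double tokens;                    // current token count
    private long lastRefillNanos;             // time of last refill

    public SimpleTokenBucket(long capacity, double refillRatePerSecond) {
        this.capacity = capacity;
        this.refillRatePerSecond = refillRatePerSecond;
        this.tokens = capacity;               // start full: allow an initial burst
        this.lastRefillNanos = System.nanoTime();
    }

    public boolean tryAcquire() {
        refill();
        if (tokens >= 1.0) {
            tokens -= 1.0;                    // consume one token for this request
            return true;                      // request allowed
        }
        return false;                         // rate limit exceeded
    }

    private void refill() {
        long now = System.nanoTime();
        double elapsedSeconds = (now - lastRefillNanos) / 1_000_000_000.0;
        tokens = Math.min(capacity, tokens + elapsedSeconds * refillRatePerSecond);
        lastRefillNanos = now;
    }
}
```

Each call computes how many tokens have accrued since the last check, caps the count at capacity, and consumes one token if available. The concurrency problems this version has are exactly what the next section addresses.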


1. Problem Definition

At first glance, the requirement sounds simple: track request counts and reject when limits are exceeded. But once your API server handles requests on dozens of threads simultaneously, the problem becomes a real concurrency challenge.

Two threads might check the same counter at the exact same moment, both see "1 token remaining," both proceed, and now you have allowed 2 requests when only 1 was permitted.

In short, the system must guarantee that no client exceeds their allowed rate, even under extreme concurrency, while maintaining low latency for legitimate requests.
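One straightforward way to close that race, sketched below, is to make the refill and the check-and-consume a single atomic step under a lock (field names carry over from the earlier sketch; production systems often prefer lock-free compare-and-swap loops or per-client striped locks instead):

```java
import java.util.concurrent.locks.ReentrantLock;

// Sketch of a thread-safe token bucket: refill and check-and-consume run under
// one lock, so two threads can never both observe "1 token remaining" and both
// proceed. Names are illustrative assumptions, not the article's final design.
public class ThreadSafeTokenBucket {
    private final long capacity;
    private final double refillRatePerSecond;
    private final ReentrantLock lock = new ReentrantLock();
    private double tokens;
    private long lastRefillNanos;

    public ThreadSafeTokenBucket(long capacity, double refillRatePerSecond) {
        this.capacity = capacity;
        this.refillRatePerSecond = refillRatePerSecond;
        this.tokens = capacity;
        this.lastRefillNanos = System.nanoTime();
    }

    public boolean tryAcquire() {
        lock.lock();
        try {
            // Refill and consume atomically: no other thread can interleave here.
            long now = System.nanoTime();
            double elapsedSeconds = (now - lastRefillNanos) / 1_000_000_000.0;
            tokens = Math.min(capacity, tokens + elapsedSeconds * refillRatePerSecond);
            lastRefillNanos = now;

            if (tokens >= 1.0) {
                tokens -= 1.0;
                return true;   // request allowed
            }
            return false;      // rejected: over the rate limit
        } finally {
            lock.unlock();
        }
    }
}
```

Keeping the critical section this small means the lock is held only for a few arithmetic operations, so contention stays low even when many threads hit the same client's limit at once.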

2. Token Bucket Algorithm Explained

This content is for premium members only.