Last Updated: May 29, 2026
A recommendation system with hundreds of millions of items can use retrieval to narrow the corpus to a few thousand candidates in milliseconds, but that only produces a rough set. The model that decides the final order is usually much heavier: hundreds of features, sequence models, cross-features, policy signals. Running it on every retrieved item is too slow and too expensive.
Multi-stage ranking breaks the problem into steps. Start with high-recall retrieval, filter with a cheap pre-ranker, then spend the expensive model only on the candidates that survive. Each stage narrows the set enough to make the next, more costly stage affordable.
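The funnel described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not a production design: the names `top_k`, `multi_stage_rank`, and the three scorer functions are hypothetical, the stage sizes (2,000, 200, 10) are arbitrary, and the scorers are noisy stand-ins for real models. In particular, the retrieval stage here scores the whole corpus by brute force, whereas a real system would typically query an index instead.

```python
import heapq
import random
from typing import Callable, Iterable, List

Scorer = Callable[[int], float]


def top_k(candidates: Iterable[int], scorer: Scorer, k: int) -> List[int]:
    """Score every candidate exactly once and keep the k highest."""
    scored = [(scorer(item), item) for item in candidates]
    best = heapq.nlargest(k, scored, key=lambda pair: pair[0])
    return [item for _, item in best]


def multi_stage_rank(
    corpus: Iterable[int],
    retrieval_score: Scorer,
    pre_rank_score: Scorer,
    full_rank_score: Scorer,
    retrieve_k: int,
    pre_rank_k: int,
    final_k: int,
) -> List[int]:
    """Narrow the candidate set stage by stage, cheapest scorer first."""
    retrieved = top_k(corpus, retrieval_score, retrieve_k)      # high recall, rough
    shortlisted = top_k(retrieved, pre_rank_score, pre_rank_k)  # cheap filter
    return top_k(shortlisted, full_rank_score, final_k)         # expensive, precise


# Toy data (all hypothetical): each item has a hidden relevance, and the
# cheaper stages only see it through progressively larger noise.
rng = random.Random(0)
corpus = range(100_000)
relevance = [rng.random() for _ in corpus]
retrieval_noise = [rng.gauss(0, 0.3) for _ in corpus]
pre_rank_noise = [rng.gauss(0, 0.1) for _ in corpus]

expensive_calls = 0  # counts invocations of the stand-in heavy model


def retrieval_score(item: int) -> float:
    return relevance[item] + retrieval_noise[item]


def pre_rank_score(item: int) -> float:
    return relevance[item] + pre_rank_noise[item]


def full_rank_score(item: int) -> float:
    global expensive_calls
    expensive_calls += 1
    return relevance[item]


result = multi_stage_rank(
    corpus,
    retrieval_score,
    pre_rank_score,
    full_rank_score,
    retrieve_k=2_000,
    pre_rank_k=200,
    final_k=10,
)
```

The point of the sketch is the call count: `full_rank_score` runs only on the items that survive the pre-ranker, so the cost of the heavy model is bounded by `pre_rank_k` rather than by the size of the retrieved set or the corpus.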