Two-Tower Architecture

11 min read · Updated June 1, 2026

The previous chapter covered how embedding-based retrieval works at serving time: encode queries and items into vectors, search an ANN index, return candidates.
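As a refresher, that serving path can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production setup: `encode` is a hypothetical stand-in that maps a string to a deterministic pseudo-random unit vector (a real system would use trained query and item encoders), the item IDs are made up, and the exact dot-product scan stands in for an ANN index.

```python
import heapq
import math
import random


def encode(text, dim=8):
    """Hypothetical stand-in encoder: a deterministic unit vector per string.

    A real system would call a trained model here.
    """
    rng = random.Random(text)  # string seed keeps the vector reproducible
    vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def retrieve(query_vec, index, k=2):
    """Return the k item IDs with the highest dot-product score.

    This is an exact scan over every item; an ANN index would
    approximate the same top-k without scoring the full catalog.
    """
    return heapq.nlargest(k, index, key=lambda item_id: dot(query_vec, index[item_id]))


# Offline: encode every item once and store the vectors.
items = ["video_a", "video_b", "video_c", "video_d"]  # made-up item IDs
index = {item_id: encode(item_id) for item_id in items}

# Online: encode the query, search the index, return candidates.
candidates = retrieve(encode("video_c"), index, k=2)
```

Because the stand-in encoder assigns unrelated vectors to unrelated strings, the scores here carry no notion of relevance beyond exact matches, which is precisely the gap that encoder training has to close.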

But the quality of those candidates depends entirely on how well the encoders are trained.

So the real question is: how do you train encoders that actually capture relevance?

This is where the two-tower architecture comes in. It’s a standard approach used at companies like YouTube, Google, and LinkedIn, and it underpins many large-scale retrieval systems in production today.
