Standard RAG is a fixed pipeline: retrieve evidence once, pass it to a model, and generate an answer. That works well for many direct factual questions. It starts to struggle when the question needs planning, several searches, tool choice, or a check that the retrieved evidence is actually enough.
In more complex cases, the system may need to search several sources, refine a query, inspect intermediate results, call a database, compare evidence, or decide that retrieval is not needed. This is where agentic RAG can help.
Agentic RAG puts a bounded decision loop around retrieval. The model can plan, call tools, check intermediate results, retry with a different strategy, and stop when it has enough evidence. The retrieval path becomes conditional instead of fixed.
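As a rough illustration of that loop, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the toy `DOCS` corpus, the `keyword_search` tool, and the rule-based `toy_plan` and `toy_is_sufficient` functions all stand in for what would normally be real retrievers and model calls. The point is only the control flow: plan, call a tool, check the result, retry elsewhere, and stop on either enough evidence or an exhausted step budget.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical toy corpus standing in for real sources.
DOCS = {
    "tickets": ["Ticket 1: customer asked about the shipping delay."],
    "docs": ["The refund window is 30 days from delivery."],
}
STOPWORDS = {"what", "is", "the"}


def keyword_search(source: str, query: str) -> list[str]:
    """Toy tool: return passages in one source sharing a word with the query."""
    terms = set(query.lower().split())
    return [
        p for p in DOCS[source]
        if terms & set(p.lower().replace(".", "").split())
    ]


@dataclass
class LoopResult:
    evidence: list[str]
    steps: list[str]
    stopped_because: str


Action = Optional[tuple[str, str]]  # (source, query), or None to stop


def agentic_retrieve(
    question: str,
    plan: Callable[[str, list[str], int], Action],
    is_sufficient: Callable[[str, list[str]], bool],
    max_steps: int = 3,
) -> LoopResult:
    """Bounded decision loop: the retrieval path depends on what comes back."""
    evidence: list[str] = []
    steps: list[str] = []
    for step in range(max_steps):  # hard cap keeps cost and latency bounded
        action = plan(question, evidence, step)
        if action is None:  # planner decided no (further) retrieval is needed
            return LoopResult(evidence, steps, "planner_stopped")
        source, query = action
        steps.append(f"{source}:{query}")
        for passage in keyword_search(source, query):
            if passage not in evidence:
                evidence.append(passage)
        if is_sufficient(question, evidence):  # check before searching again
            return LoopResult(evidence, steps, "enough_evidence")
    return LoopResult(evidence, steps, "step_budget_exhausted")


def toy_plan(question: str, evidence: list[str], step: int) -> Action:
    """Stand-in for a model's planning step: try one source per step."""
    sources = ["tickets", "docs"]
    if step >= len(sources):
        return None
    words = question.lower().rstrip("?").split()
    return sources[step], " ".join(w for w in words if w not in STOPWORDS)


def toy_is_sufficient(question: str, evidence: list[str]) -> bool:
    """Stand-in for a model's sufficiency check."""
    return any("refund" in p.lower() for p in evidence)


if __name__ == "__main__":
    print(agentic_retrieve("What is the refund window?", toy_plan, toy_is_sufficient))
```

In this sketch the first search (the ticket source) returns nothing useful, so the loop retries against a second source and stops once the sufficiency check passes. With `max_steps=1` the same question ends with `step_budget_exhausted` and no evidence, which is the case a caller would need to handle explicitly.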
This chapter covers router RAG, multi-step retrieval, adaptive retrieval, tool design, and how to decide when the extra cost and latency are worth it.