
What are AI Agents?

Last Updated: March 15, 2026

Ashish Pratap Singh

Most people interact with AI through chat interfaces. You ask a question, the model generates a response, and the interaction ends there. But modern AI systems are increasingly expected to do more than just generate text. They need to take actions, make decisions, and complete tasks autonomously.

This is where AI agents come in.

An AI agent is a system that uses a language model to reason about a problem, plan steps, and interact with external tools such as APIs, databases, search engines, or code execution environments. Instead of producing a single response, the agent can observe the environment, decide what to do next, execute actions, and iterate until the task is completed.

In this chapter, we will explore what AI agents are, how they work internally, and the core components that make agentic systems possible.

The Agent Loop: Observe, Think, Act

A chatbot's flow is straightforward. The user sends a message, the model processes it, the model sends a response. One round trip. The model has no memory of what it did, no ability to check whether its answer was correct, and no way to try again if something goes wrong.

An agent operates on a fundamentally different cycle. It observes the current state of the world (what tools returned, what errors occurred, what the user last said), it thinks about what to do next (which tool to call, what to search for, whether the goal has been achieved), and it acts (calls a tool, generates output, decides to stop).

Then it loops back and observes the new state. This cycle continues until the agent decides the goal is complete, or until it hits a limit.
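The cycle can be sketched as a plain loop. In this minimal sketch, `decide_next_action` is a hypothetical stub standing in for the language model, and `run_tool` stands in for real tool execution; a real agent would make an LLM call and invoke actual APIs at those two points:

```python
# Minimal sketch of the observe-think-act cycle with stubbed components.
# `decide_next_action` stands in for the language model; real agents call an LLM here.

def decide_next_action(observations):
    """Think: choose the next action based on everything observed so far."""
    if not observations:
        return {"type": "tool", "name": "search", "query": "flight prices"}
    return {"type": "finish", "answer": f"Done after {len(observations)} step(s)"}

def run_tool(action):
    """Act: execute the chosen tool and return its result."""
    return f"results for '{action['query']}'"

def agent_loop(max_steps=5):
    observations = []                          # Observe: accumulated state of the world
    for _ in range(max_steps):
        action = decide_next_action(observations)
        if action["type"] == "finish":         # the agent decides the goal is complete
            return action["answer"]
        observations.append(run_tool(action))  # observe the new state and loop
    return "stopped: hit the step limit"
```

The `max_steps` guard is the "limit" mentioned above: without it, nothing forces the loop to terminate.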

The diagram captures the key difference: the agent loop has a feedback cycle. After every action, the agent observes what happened and uses that to inform its next decision. Compare this to a chatbot, which does not observe anything; it just responds and waits.

This loop is what makes agents capable of multi-step reasoning. An agent asked to "find the cheapest flight from London to Tokyo next month" can search for flights, notice the results are expensive, try different date ranges, check baggage fees, and synthesize everything into a recommendation. A chatbot can do any one of those steps, but only if you explicitly ask for each one.

The loop is also what makes agents risky. A chatbot cannot get "stuck" in an infinite loop or take twenty actions when one would have sufficed. An agent can, and without careful design, it will. Every iteration costs tokens (and therefore money), and every tool call is an opportunity for something to go wrong.

The Autonomy Spectrum

Not all AI systems are equally autonomous. It helps to think of them on a spectrum, from pure chatbots that do exactly what you say, to fully autonomous agents that pursue goals with minimal human involvement.

Chatbot (no autonomy)

The model responds to one message at a time. It does not call tools, does not maintain task state, and does not take any action without being explicitly asked. GPT-4 in a basic chat interface is a chatbot. It is predictable, cheap, and easy to reason about. Most customer support assistants and Q&A systems live here.

Copilot (human in the loop)

The model can suggest actions and even draft tool calls, but a human approves before anything is executed. GitHub Copilot suggesting code edits is a copilot. An AI assistant that drafts a customer refund but waits for a support agent to click "approve" is a copilot. Copilots are a good default when the cost of a wrong action is high, like modifying production data or sending communications to customers.

Supervised agent (human oversight)

The agent takes actions autonomously, but a human monitors it and can intervene. The agent might handle a sequence of ten steps without asking, but it pauses at certain decision points (like "I found three conflicting records, which one should I use?") or logs every action so a human can audit the process. This is where most production agents should live: fully autonomous behavior with audit trails and escalation paths.

Fully autonomous agent

The agent pursues a goal with no human involvement from start to finish. It decides what to do, handles errors on its own, and reports results when done. This is the most powerful option and also the most dangerous. It requires extremely reliable tools, well-defined task scopes, and robust failure handling. Most real-world agents are not fully autonomous in practice, even if they appear to be.

Where you sit on this spectrum is not a technical decision; it is a risk decision. Ask: what is the cost of a wrong action? A wrong response in a chatbot is embarrassing. A wrong action in a fully autonomous agent might delete records, send incorrect emails to thousands of users, or make purchases. Match your autonomy level to the reversibility and blast radius of the actions your agent can take.

When to Use Agents, Pipelines, or Chatbots

One of the most common mistakes in AI engineering is reaching for agents when a simpler approach would work better. Agents introduce complexity, cost, and unpredictability. Before building one, ask whether you actually need one.

Here is a decision framework:

| Situation | Best Approach | Why |
| --- | --- | --- |
| User asks one-off questions | Chatbot | No multi-step reasoning needed |
| Fixed sequence of steps, no branching | Pipeline | Predictable, fast, cheap, easy to test |
| Task requires branching based on results | Agent | Need to adapt based on intermediate outputs |
| Steps are always the same but order varies | Agent or pipeline | Depends on whether the order matters |
| High-stakes actions (e.g., billing, comms) | Copilot (human-in-loop) | Agent mistakes are too costly |
| Exploratory tasks (research, debugging) | Agent | Hard to predict what steps will be needed |
| Real-time, low latency required | Chatbot or pipeline | Agent loops add latency |

Chatbots are the right default for conversational interfaces. If the task can be answered in one model call, do not add complexity.

Pipelines are sequences of steps where you (the developer) decide the order and logic in advance. You call the LLM at step 1, take the output, pass it to a function at step 2, feed the result back to the LLM at step 3, and so on. The control flow lives in your code, not in the model's reasoning. Pipelines are predictable, testable, and cheap. If you can define the steps upfront, use a pipeline.
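A pipeline like the one just described can be sketched in a few lines. Here `call_llm` is a hypothetical stand-in for a real model call, so the fixed control flow is visible; the point is that the order of steps lives entirely in the code:

```python
# A pipeline: the developer fixes the order of steps; no model decides the control flow.
# `call_llm` is a hypothetical stand-in for a real model call.

def call_llm(prompt: str) -> str:
    return f"LLM({prompt})"

def clean_text(text: str) -> str:
    return " ".join(text.split())          # deterministic step: normalize whitespace

def summarize_pipeline(document: str) -> str:
    cleaned = clean_text(document)                       # step 1: plain function
    summary = call_llm(f"Summarize: {cleaned}")          # step 2: model call
    return call_llm(f"Polish this summary: {summary}")   # step 3: model call on step 2's output
```

Because every run executes the same three steps in the same order, this is trivially testable, which is exactly what an agent loop gives up.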

Agents are the right choice when the task requires adaptive decision-making, where the next step depends on the result of the previous one in ways you cannot fully predict at design time. A research agent that searches, reads, evaluates relevance, and searches again based on what it finds needs to be an agent. A fixed three-step summarization workflow does not.

A common mistake is building an agent for a task that is really a pipeline. The model does not need to decide the order; you already know it. Forcing the model to make decisions it does not need to make wastes tokens, adds latency, and introduces failure points. Build the simplest thing that works, and only add agent-level autonomy when the task genuinely requires it.

Agent Architectures

Once you have decided to build an agent, you need to choose an architecture. The three main patterns are single-loop agents, hierarchical agents, and collaborative agents. Each one trades complexity for capability.

Single-Loop Agent

This is the simplest architecture. One model, one loop, one set of tools. The model receives a goal, calls tools as needed, observes results, and keeps looping until it decides it is done. This is the ReAct pattern (Reason and Act), and it is what most people mean when they say "agent."

Single-loop agents work well for tasks that fit in a single context window, require a modest number of tool calls (under 20 or so), and do not need specialized sub-skills. A research agent, a coding assistant, or a customer support agent with access to order management APIs can all be single-loop agents.

The limitation is context. Every tool result gets appended to the conversation. After enough iterations, the context window fills up. Long-running tasks, or tasks that require maintaining state across many steps, will hit this ceiling.

Hierarchical Agent

A hierarchical agent has an orchestrator model that delegates to sub-agents. The orchestrator breaks the goal into subtasks, assigns each one to a specialized sub-agent, collects the results, and synthesizes them into a final answer.

For example, a competitive analysis agent might have:

  • An orchestrator that receives the goal: "Analyze the top three competitors to our product."
  • A research sub-agent for each competitor that searches the web and reads documentation.
  • A synthesis sub-agent that receives all three research reports and writes the analysis.

Each sub-agent operates in its own context window. This sidesteps the context limit problem and allows specialization. The research sub-agent can have a prompt and toolset optimized for web research. The synthesis sub-agent can have a prompt optimized for comparative analysis.
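The competitive analysis example can be sketched with stubbed sub-agents. Here `research_agent` and `synthesis_agent` are hypothetical stand-ins for full agent loops, each of which would run with its own prompt, toolset, and context window:

```python
# Sketch of the hierarchical pattern with stubbed sub-agents.
# `research_agent` and `synthesis_agent` are hypothetical stand-ins for full
# agent loops; each would run in its own context window in a real system.

def research_agent(competitor: str) -> str:
    # real version: an agent loop with web search and document-reading tools
    return f"report on {competitor}"

def synthesis_agent(reports: list[str]) -> str:
    # real version: an agent loop with a prompt tuned for comparative analysis
    return "analysis of " + ", ".join(reports)

def orchestrator(goal: str, competitors: list[str]) -> str:
    # break the goal into subtasks, one research sub-agent per competitor
    reports = [research_agent(c) for c in competitors]
    # hand all reports to a synthesis sub-agent for the final answer
    return synthesis_agent(reports)
```

Note that if `orchestrator` decomposes the goal badly (say, picking the wrong competitors), every downstream call inherits that mistake, which is the trade-off discussed below.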

The trade-off is complexity. You now have multiple models running, each incurring cost and latency. The orchestrator's output quality directly limits the system's overall quality. If the orchestrator breaks down the task poorly, every downstream agent works from bad instructions.

Collaborative Agent

Collaborative agents (also called multi-agent systems) are a network of agents that communicate with each other, each owning a part of the overall task. Unlike hierarchical agents where control flows top-down, collaborative agents can pass work back and forth, check each other's outputs, or run in parallel.

This architecture is well-suited for tasks like software development workflows, where a planning agent defines the task, a coding agent writes the implementation, a review agent checks it for bugs, and a testing agent runs the tests. If the tests fail, the testing agent can pass the failure back to the coding agent.

Collaborative architectures are the most capable and the most complex to build and debug. Use them when the task is genuinely multi-disciplinary, where different parts of the work require different capabilities or specialized context, and where the coordination overhead is worth the improvement in output quality.

For most projects, start with a single-loop agent. Add hierarchy or collaboration only when you hit clear limitations that those patterns solve.

Cost and Reliability Trade-Offs

Agents are not free, in any sense of the word. Before you commit to building one, you should have a clear-eyed view of the costs and the ways they fail.

Token cost

Every iteration of the agent loop sends the entire conversation history to the model. A loop that runs 10 iterations might use 10x the tokens of a single chatbot call, because each call includes all previous messages and tool results. With GPT-4o at $2.50 per million input tokens, a simple agent task that runs 15 iterations with 2,000 tokens per call costs roughly $0.075. That sounds small, but at scale (say, 100,000 agent runs per day) it adds up to $7,500 per day for a single agent. Always benchmark your expected token usage before deploying an agent at scale.
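The arithmetic above is worth making explicit. A short calculation, using the same illustrative numbers (15 iterations, about 2,000 input tokens per call, $2.50 per million input tokens; prices change, so treat these as assumptions):

```python
# Back-of-envelope token cost for the scenario in the text: 15 iterations,
# ~2,000 input tokens per call, GPT-4o input pricing of $2.50 per million tokens.
# Pricing is an assumption for illustration; check current rates before relying on it.

PRICE_PER_MILLION_INPUT = 2.50

def run_cost(iterations: int, avg_tokens_per_call: int) -> float:
    total_tokens = iterations * avg_tokens_per_call
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_INPUT

per_run = run_cost(15, 2_000)   # 30,000 tokens -> $0.075 per agent run
per_day = per_run * 100_000     # $7,500 at 100,000 runs per day
```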

Latency

Each tool call adds a round trip to the LLM API, plus the time to execute the tool itself. A 10-step agent loop might take 20-30 seconds to complete, compared to 1-2 seconds for a single chatbot response. For real-time user-facing interactions, this is often unacceptable. Agents work better for background tasks and asynchronous workflows than for instant responses.

Reliability and error propagation

In a pipeline, an error at step 3 fails cleanly and you can debug it. In an agent loop, an error at step 3 gets fed back to the model, which might try to recover, sometimes successfully, sometimes by making things worse. Agents can get stuck in loops, make incorrect assumptions about tool failures, or follow a flawed reasoning path for several iterations before a human notices. Error handling in agents requires more defensive design than in pipelines.

| Dimension | Chatbot | Pipeline | Agent |
| --- | --- | --- | --- |
| Token cost | Low (1 call) | Medium (fixed calls) | High (variable, often 5-20x) |
| Latency | Low (1-2 s) | Medium (predictable) | High (variable, 10-60 s) |
| Reliability | High | High | Lower (error propagation) |
| Capability | Limited | Medium | High |
| Debuggability | Easy | Easy | Hard |
| Best for | Q&A, conversation | Fixed workflows | Adaptive, multi-step tasks |

The key insight is that agents trade reliability and cost for capability. They can do more, but they cost more to run, fail in more interesting ways, and are harder to debug when they do. Match the tool to the task.

Building a Minimal Agent Loop

Now let's build one. The core of any agent is a while loop that keeps calling the model until the model decides it is done. An agent loop extends the function calling idea: the model can decide to call tools repeatedly, and it only stops when it produces a final answer.

The key difference from function calling is the concept of a stopping condition. Instead of always running the tool call loop once, the agent loop continues as long as the model keeps requesting tools, across multiple turns if needed.

main.py

```python
import json
from openai import OpenAI

client = OpenAI()

def search_web(query: str) -> str:
    """Stub: a real implementation would call a search API."""
    return f"No results found for: {query}"

def calculate(expression: str) -> str:
    """Evaluate a simple arithmetic expression."""
    return str(eval(expression))  # fine for a demo; sandbox this in production

TOOL_FUNCTIONS = {"search_web": search_web, "calculate": calculate}

TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "search_web",
            "description": "Search the web for current information.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "description": "Evaluate an arithmetic expression.",
            "parameters": {
                "type": "object",
                "properties": {"expression": {"type": "string"}},
                "required": ["expression"],
            },
        },
    },
]

def run_agent(goal: str, max_iterations: int = 10) -> str:
    messages = [
        {"role": "system", "content": "You are a helpful agent. Use tools when needed."},
        {"role": "user", "content": goal},
    ]

    for _ in range(max_iterations):
        response = client.chat.completions.create(
            model="gpt-4o", messages=messages, tools=TOOLS
        )
        choice = response.choices[0]

        # Stopping condition: the model produced a final answer instead of a tool call.
        if choice.finish_reason == "stop" or not choice.message.tool_calls:
            return choice.message.content

        # Execute each requested tool and append the results to the history.
        messages.append(choice.message)
        for tool_call in choice.message.tool_calls:
            args = json.loads(tool_call.function.arguments)
            result = TOOL_FUNCTIONS[tool_call.function.name](**args)
            messages.append(
                {"role": "tool", "tool_call_id": tool_call.id, "content": result}
            )

    # Iteration limit hit: ask for the best available answer instead of failing silently.
    messages.append(
        {"role": "user", "content": "Answer now with the information gathered so far."}
    )
    final = client.chat.completions.create(model="gpt-4o", messages=messages)
    return final.choices[0].message.content

if __name__ == "__main__":
    print(run_agent("What is GPT-4o's input pricing, and what would 500,000 tokens cost?"))
```

Let's walk through what this code does.

The run_agent function takes a goal and a maximum number of iterations. It initializes the message history with a system prompt and the user's goal, then enters the agent loop.

On each iteration, it calls the model and checks finish_reason. If the reason is "stop" (or there are no tool calls), the model has decided it has enough information and produced a final answer. The loop exits and returns that answer.

If the model produces tool calls, the code executes each one, collects the results, and appends them to the messages array. Then it loops back to call the model again with the updated context.

The max_iterations guard is critical. Without it, a confused model could call tools indefinitely. When the limit is hit, the code makes one final model call asking for the best available answer, rather than silently failing.

Run this with the sample goal and you will see the agent call search_web to get the GPT-4o pricing, then call calculate to work out the cost for 500,000 tokens, and finally produce a synthesized answer that combines both tool results.
