{"title":"What are AI Agents?","description":"","content":"A language model is often first experienced through chat: ask a question, get an answer, move on. That interface is useful, but many real systems need more than one response. They need to inspect files, query APIs, compare options, recover from errors, or keep working until a task is done.\n\nAn AI agent is a software system that uses a language model to help choose actions toward a goal. The model may plan, inspect state, call tools, read tool results, revise its approach, and decide when to stop. The tools are ordinary software interfaces: APIs, databases, search systems, file systems, browsers, code execution sandboxes, or internal services.\n\nThe important distinction is control flow. In a chatbot, the application decides when the model is called and what happens next. In an agent, the model participates in choosing the next step. That extra autonomy can be useful, but it also creates new failure modes. This chapter explains the agent loop, the autonomy spectrum, and the engineering trade-offs that determine when an agent is the right design.\n\n---\n\n# The Agent Loop: Observe, Decide, Act\n\nA chatbot's flow is simple. The user sends a message, the application calls the model, and the model returns a response. That is one round trip. Unless the application adds tools or memory, the model does not inspect external state, verify its answer, or recover from a failed action.\n\nAn agent runs a feedback loop. It observes the current state (what tools returned, what errors occurred, what the user asked), decides what to do next (which tool to call, what to search for, whether the goal is done), and acts (calls a tool, asks a question, returns an answer, or stops). Then it observes the new state. This continues until the agent reaches the goal or hits a limit.\n\n\n\n\n\n\n```mermaid\nflowchart TD\n GOAL[Goal / User Request]:::primary --> OBS[Observe
Current State]:::teal\n OBS --> THINK[Decide
What to do next?]:::orange\n THINK --> ACT{Action Type}:::yellow\n ACT -->|Call a tool| TOOL[Execute Tool]:::purple\n ACT -->|Generate output| OUT[Return Result]:::green\n ACT -->|Need more info| OBS\n TOOL --> RESULT[Observe Tool Result]:::teal\n RESULT --> THINK\n OUT --> DONE[Task Complete]:::green\n\n classDef primary fill:#00ceff,stroke:#000,color:#000\n classDef teal fill:#38d9a9,stroke:#000,color:#000\n classDef orange fill:#ffa94d,stroke:#000,color:#000\n classDef yellow fill:#ffd43b,stroke:#000,color:#000\n classDef purple fill:#f783ac,stroke:#000,color:#000\n classDef green fill:#69db7c,stroke:#000,color:#000\n```\n\n\nThe diagram captures the key difference: the agent loop has feedback. After each action, the system records what happened and uses that observation to choose the next step. The agent is not only producing text; it is operating inside a stateful control loop.\n\nThis loop is what makes agents useful for tasks where the path is not known ahead of time. An agent asked to compare flight options could search flights, notice that the first results are expensive, try nearby dates, check baggage fees, and summarize the trade-offs. A plain chatbot can discuss that strategy, but it cannot execute and adapt unless the surrounding application gives it those capabilities.\n\nThe same loop creates risk. An agent can call the wrong tool, repeat a search that is not helping, burn through a budget, or take an irreversible action from a mistaken premise. Every iteration costs tokens, latency, and operational attention. Every tool call is another place where validation, permissions, and error handling matter.\n\n---\n\n# The Autonomy Spectrum\n\nNot all AI systems are equally autonomous. It helps to think of them on a spectrum, from chatbots with application-controlled flow to agents that can pursue narrow goals with little human involvement.\n\n\n```mermaid\nflowchart LR\n CB[Chatbot]:::primary --> CP[Copilot]:::orange --> SA[Supervised
Agent]:::teal --> FA[Autonomous
Agent]:::green\n\n classDef primary fill:#00ceff,stroke:#000,color:#000\n classDef orange fill:#ffa94d,stroke:#000,color:#000\n classDef teal fill:#38d9a9,stroke:#000,color:#000\n classDef green fill:#69db7c,stroke:#000,color:#000\n```\n\n\n#### **Chatbot (application-controlled)**\n\nThe model responds to one message at a time. It does not choose tools, maintain task state, or take external actions unless the application adds those capabilities. This is predictable, less expensive, and easier to reason about. Many customer support assistants and Q&A systems should stay here.\n\n#### **Copilot (human in the loop)**\n\nThe model can suggest actions and even draft tool calls, but a human approves before anything important is executed. GitHub Copilot suggesting code edits is a copilot. An AI assistant that drafts a customer refund but waits for a support agent to click \"approve\" is a copilot. Copilots are a good default when the cost of a wrong action is high, such as modifying production data or sending messages to customers.\n\n#### **Supervised agent (human oversight)**\n\nThe agent takes low-risk actions on its own, but a human can intervene or review important decisions. It might handle a sequence of read-only steps without asking, then pause at a decision point: \"I found three conflicting records. Which one should I use?\" Many production agents belong here: bounded autonomy, clear audit trails, and explicit escalation paths.\n\n#### **Autonomous agent**\n\nThe agent pursues a narrow goal with little or no human involvement during the run. It decides what to do, handles expected errors, and reports results when done. This offers the most autonomy and the largest potential impact. It requires reliable tools, narrow task scopes, strong permissions, and robust failure handling. In practice, this level is rare outside narrow, well-bounded tasks.\n\nWhere you sit on this spectrum is primarily a risk decision. Ask: what is the cost of a wrong action, how quickly can it be detected, and how easily can it be reversed? A wrong answer in a chatbot may be embarrassing. A wrong action in an autonomous agent may delete records, send incorrect emails to customers, or place orders. Match autonomy to impact, reversibility, and your ability to audit the run after the fact.\n\n---\n\n# When to Use Agents, Pipelines, or Chatbots\n\nA common mistake in AI engineering is using an agent when a workflow would be simpler and more reliable. Agents introduce variable latency, harder testing, and less predictable behavior. Before building one, ask whether the model truly needs to decide the next step.\n\nHere is a decision framework:\n\n\n| Situation | Best Approach | Why |\n|-----------|--------------|-----|\n| User asks one-off questions | Chatbot | No tool loop needed |\n| Fixed sequence of steps, no branching | Pipeline | Predictable, fast, inexpensive, easy to test |\n| Open-ended task requires branching based on results | Agent | Need to adapt based on intermediate outputs |\n| Same steps, but which ones run depends on input | Pipeline with branching, or agent | Use a pipeline if the branches are enumerable, an agent if they are not |\n| High-stakes actions, such as billing or customer messages | Copilot (human-in-loop) | Human approval reduces the impact of mistakes |\n| Exploratory tasks (research, debugging) | Agent | Hard to predict what steps will be needed |\n| Real-time, low latency required | Chatbot or pipeline | Agent loops add latency |\n\n\n**Chatbots** are the right default for conversational interfaces. If the task can be answered in one model call, do not add complexity.\n\n**Pipelines** are sequences of steps where you, the developer, decide the order and branching logic in advance. You call the model at step 1, pass the result to a parser or service at step 2, feed the structured result to another model call at step 3, and so on. The control flow lives in your code, not in the model's judgment. Pipelines are easier to test, monitor, and control costs for. If you can define the steps upfront, use a pipeline.\n\n**Agents** are the right choice when the task requires adaptive decision-making, where the next useful step depends on intermediate results in ways you cannot list cleanly at design time. A research agent that searches, reads, evaluates relevance, and changes strategy based on what it finds may need an agent loop. A fixed three-step summarization workflow does not.\n\nA common mistake is building an agent for a task that is really a pipeline. The model does not need to decide the order, you already know it. Forcing the model to make decisions it does not need to make wastes tokens, adds latency, and introduces failure points. Build the simplest thing that works, and only add agent-level autonomy when the task genuinely requires it.\n\n---\n\n# Agent Architectures\n\nOnce you have decided to build an agent, you need to choose an architecture. Three common patterns are single-loop agents, hierarchical agents, and collaborative agents. Each one trades complexity for capability.\n\n\n\n\n\n### Single-Loop Agent\n\nThis is the simplest architecture: one model, one loop, one set of tools. The model receives a goal, calls tools as needed, observes results, and keeps looping until it decides it is done. This is close to the ReAct pattern (reasoning and acting), and it is what many engineers mean by a basic agent.\n\nSingle-loop agents work well for tasks that fit in one context window, use a modest number of tool calls, and do not need separate specialist roles. A focused research assistant, a coding assistant working on one bug, or a support agent with read-only access to order APIs can often be built this way.\n\nThe limitation is state management. Every tool result competes for context. After enough iterations, the model may carry too much irrelevant history or miss important early constraints. Long-running tasks need summaries, external state, checkpoints, or a different architecture.\n\n### Hierarchical Agent\n\nA hierarchical agent has an orchestrator that delegates to sub-agents. The orchestrator breaks the goal into subtasks, assigns each one to a specialized sub-agent, collects the results, and synthesizes them into a final answer.\n\nFor example, a competitive analysis agent might have:\n\n- An orchestrator that receives the goal: \"Analyze the top three competitors to our product.\"\n- A research sub-agent for each competitor that searches the web and reads documentation.\n- A synthesis sub-agent that receives all three research reports and writes the analysis.\n\nEach sub-agent operates in its own context window. This helps with context limits and allows specialization. The research sub-agent can have a prompt and toolset built for web research. The synthesis sub-agent can have a prompt built for comparative analysis.\n\nThe trade-off is complexity. You now have multiple model calls, each adding cost and latency. The orchestrator's output quality directly limits the system's overall quality. If the orchestrator breaks down the task poorly, every downstream agent works from bad instructions.\n\n### Collaborative Agent\n\nCollaborative agents (also called multi-agent systems) are a network of agents that communicate with each other, each owning a part of the overall task. Unlike hierarchical agents where control flows top-down, collaborative agents can pass work back and forth, check each other's outputs, or run in parallel.\n\nThis architecture can fit software development workflows, where a planning agent defines the task, a coding agent writes the implementation, a review agent checks it, and a testing agent runs tests. If the tests fail, the testing agent can pass the failure back to the coding agent.\n\nCollaborative architectures are the hardest to build and debug. They are justified when the task has genuinely different modes of work, such as planning, implementation, review, and test execution, and when independent verification is worth the coordination overhead.\n\nFor most projects, start with a workflow or a single-loop agent. Add hierarchy or collaboration only after you can point to the specific limitation it solves.\n\n---\n\n# Cost and Reliability Trade-Offs\n\nAgents are not free. Before you commit to building one, have a clear view of the costs and the ways they fail.\n\n#### **Token cost**\n\nEvery iteration of the agent loop sends context back to the model. A 10-step loop can use far more tokens than a single response because later calls include earlier messages and tool results. Avoid hard-coding pricing assumptions in your design; model prices, context windows, and cached-token discounts change. Benchmark real runs, compute cost from provider pricing at deployment time, and set per-run budgets before exposing an agent to high traffic.\n\n#### **Latency**\n\nEach model call adds network latency, and each tool call adds the time needed to execute the tool itself. Multi-step agent runs can be noticeably slower than a single chatbot response. For real-time user-facing interactions, that may be unacceptable. Agents often fit background tasks and asynchronous workflows better than instant responses.\n\n#### **Reliability and error propagation**\n\nIn a pipeline, an error at step 3 usually fails at a named step with a clear stack trace. In an agent loop, the error becomes part of the model's next observation. The agent may recover, but it may also misread the error, try a dangerous workaround, or spend several iterations on an unproductive path. Agent reliability depends on defensive orchestration: validation, timeouts, budgets, retries, explicit stop conditions, and human escalation for high-impact actions.\n\n\n| Dimension | Chatbot | Pipeline | Agent |\n|-----------|---------|----------|-------|\n| Token cost | Low (usually 1 call) | Medium (fixed calls) | Variable and often higher |\n| Latency | Low | Medium and predictable | Higher and variable |\n| Reliability | Easier to control | Easier to control | Harder because errors feed back into the loop |\n| Capability | Limited | Medium | High for adaptive tasks |\n| Debuggability | Easier | Easier | Harder |\n| Best for | Q&A, conversation | Fixed workflows | Adaptive, multi-step tasks |\n\n\nAgents trade reliability and cost for capability. They can do more, but they cost more to run, fail in more ways, and are harder to debug when they do. Match the tool to the task.\n\n---\n\n# Building a Minimal Agent Loop\n\nLet's build one. The core of a simple agent is a loop that keeps calling the model until the model returns a final answer or the system stops it. An agent loop extends function calling: the model can request tools repeatedly, and the application feeds each tool result back into the next model call.\n\nThe key difference from a single tool call is the stopping condition. The loop continues as long as the model keeps requesting tools, up to a limit you control.\n\n\n**main.py**\n\n```python\nimport json\nfrom openai import OpenAI\nfrom dotenv import load_dotenv\n\nload_dotenv()\nclient = OpenAI()\nMODEL_NAME = \"your-agent-model\"\n\n# --- Tool definitions ---\ntools = [\n {\n \"type\": \"function\",\n \"function\": {\n \"name\": \"search_web\",\n \"description\": \"Search an information source for facts about a topic. \"\n \"Use this when the answer depends on external information.\",\n \"parameters\": {\n \"type\": \"object\",\n \"properties\": {\n \"query\": {\n \"type\": \"string\",\n \"description\": \"The search query to look up\"\n }\n },\n \"required\": [\"query\"]\n }\n }\n },\n {\n \"type\": \"function\",\n \"function\": {\n \"name\": \"calculate\",\n \"description\": \"Evaluate a mathematical expression and return the result. \"\n \"Use this for any arithmetic, unit conversions, or numerical reasoning.\",\n \"parameters\": {\n \"type\": \"object\",\n \"properties\": {\n \"expression\": {\n \"type\": \"string\",\n \"description\": \"A valid Python math expression, e.g. '22 * 9/5 + 32'\"\n }\n },\n \"required\": [\"expression\"]\n }\n }\n },\n {\n \"type\": \"function\",\n \"function\": {\n \"name\": \"read_file\",\n \"description\": \"Read the contents of a local text file by path.\",\n \"parameters\": {\n \"type\": \"object\",\n \"properties\": {\n \"path\": {\n \"type\": \"string\",\n \"description\": \"The file path to read, e.g. './notes.txt'\"\n }\n },\n \"required\": [\"path\"]\n }\n }\n }\n]\n\n# --- Simulated tool implementations ---\ndef search_web(query: str) -> dict:\n \"\"\"Simulates a search tool. Replace with a real search API in production.\"\"\"\n mock_results = {\n \"python asyncio\": {\n \"summary\": \"asyncio is Python's built-in library for writing concurrent code using async/await syntax.\",\n \"source\": \"docs.python.org\"\n },\n \"example model input pricing\": {\n \"summary\": \"For this example, assume the model costs $2.00 per million input tokens.\",\n \"source\": \"mock pricing table\"\n },\n }\n # Find a partial match\n for key, value in mock_results.items():\n if key.lower() in query.lower() or query.lower() in key.lower():\n return value\n return {\"summary\": f\"No specific results found for '{query}'\", \"source\": \"mock\"}\n\ndef calculate(expression: str) -> dict:\n \"\"\"Evaluates a restricted mathematical expression.\"\"\"\n try:\n # Restrict to simple numeric expressions.\n allowed = set(\"0123456789+-*/()., **eE\")\n if not all(c in allowed or c.isspace() for c in expression):\n return {\"error\": \"Expression contains unsupported characters\"}\n result = eval(expression, {\"__builtins__\": {}}, {})\n return {\"expression\": expression, \"result\": result}\n except Exception as e:\n return {\"error\": str(e)}\n\ndef read_file(path: str) -> dict:\n \"\"\"Reads a local file.\"\"\"\n try:\n with open(path, \"r\") as f:\n return {\"path\": path, \"content\": f.read()}\n except FileNotFoundError:\n return {\"error\": f\"File not found: {path}\"}\n except Exception as e:\n return {\"error\": str(e)}\n\nAVAILABLE_FUNCTIONS = {\n \"search_web\": search_web,\n \"calculate\": calculate,\n \"read_file\": read_file,\n}\n\n# --- The agent loop ---\ndef run_agent(goal: str, max_iterations: int = 10) -> str:\n \"\"\"\n Runs the observe-decide-act loop until the model produces a final answer\n or the iteration limit is reached.\n \"\"\"\n messages = [\n {\n \"role\": \"system\",\n \"content\": (\n \"You are a helpful research agent. You have access to tools for \"\n \"searching an information source, performing calculations, and reading files. \"\n \"Use them as needed to complete the user's goal. \"\n \"When you have enough information to answer fully, respond with \"\n \"your final answer directly without calling any more tools.\"\n )\n },\n {\n \"role\": \"user\",\n \"content\": goal\n }\n ]\n\n iteration = 0\n\n while iteration < max_iterations:\n iteration += 1\n print(f\"\\n--- Iteration {iteration} ---\")\n\n # Decide: ask the model what to do next\n response = client.chat.completions.create(\n model=MODEL_NAME,\n messages=messages,\n tools=tools,\n tool_choice=\"auto\",\n )\n\n message = response.choices[0].message\n finish_reason = response.choices[0].finish_reason\n\n # Act: if no tool calls, the model has reached a final answer\n if finish_reason == \"stop\" or not message.tool_calls:\n print(f\"Agent finished after {iteration} iteration(s).\")\n return message.content\n\n # The model wants to call tools: execute them and observe the results\n messages.append(message)\n\n for tool_call in message.tool_calls:\n func_name = tool_call.function.name\n func_args = json.loads(tool_call.function.arguments)\n\n print(f\" [Act] Calling {func_name}({func_args})\")\n\n func = AVAILABLE_FUNCTIONS.get(func_name)\n if func:\n result = func(**func_args)\n else:\n result = {\"error\": f\"Unknown function: {func_name}\"}\n\n print(f\" [Observe] Result: {result}\")\n\n # Add the observation back to the conversation\n messages.append({\n \"role\": \"tool\",\n \"tool_call_id\": tool_call.id,\n \"content\": json.dumps(result)\n })\n\n # If we hit the iteration limit, ask the model for its best answer so far\n print(f\"Warning: reached max iterations ({max_iterations}). Requesting final answer.\")\n messages.append({\n \"role\": \"user\",\n \"content\": \"You have reached the maximum number of iterations. \"\n \"Please provide your best answer based on what you have found so far.\"\n })\n final_response = client.chat.completions.create(\n model=MODEL_NAME,\n messages=messages,\n )\n return final_response.choices[0].message.content\n\n# --- Run the agent ---\nif __name__ == \"__main__\":\n result = run_agent(\n \"Look up the example model input pricing. If the input cost is per million tokens, \"\n \"how would I estimate the cost to process 500,000 input tokens?\"\n )\n print(f\"\\nFinal Answer:\\n{result}\")\n```\n\n\nLet's walk through what this code does.\n\nThe `run_agent` function takes a goal and a maximum number of iterations. It initializes the message history with a system prompt and the user's goal, then enters the agent loop.\n\nOn each iteration, it calls the model and checks `finish_reason`. If the reason is `\"stop\"` (or there are no tool calls), the model has decided it has enough information and produced a final answer. The loop exits and returns that answer.\n\nIf the model produces tool calls, the code executes each one, collects the results, and appends them to the messages array. Then it loops back to call the model again with the updated context.\n\nThe `max_iterations` guard is important. Without it, a confused model could keep calling tools. When the limit is hit, the code makes one final model call asking for the best available answer instead of failing silently.\n\nRun this with the sample goal and you will see the agent call `search_web` to retrieve example pricing, then call `calculate` to work out the cost for 500,000 tokens, and finally produce an answer that combines both tool results. In a real product, you would use live pricing from configuration or a provider billing API, not a hard-coded lesson value. You would also replace the small calculator example with a proper parser, a trusted math library, or a sandboxed execution environment.\n\n---\n\n# Quiz\n\n---\n\n### References\n\n- [ReAct: Synergizing Reasoning and Acting in Language Models](https://arxiv.org/abs/2210.03629)\n- [OpenAI Function Calling Documentation](https://developers.openai.com/api/docs/guides/function-calling)\n- [Anthropic: Building Effective Agents](https://www.anthropic.com/research/building-effective-agents)\n- [LangGraph Documentation](https://docs.langchain.com/oss/python/langgraph/overview)\n- [Lilian Weng: Language Model-Powered Autonomous Agents](https://lilianweng.github.io/posts/2023-06-23-agent/)","pageType":"ai-engineering"}

Get Premium