Last Updated: March 15, 2026
As AI systems become more capable, they increasingly need to interact with external tools, data sources, and services. A language model alone can generate text, but real-world applications often require actions such as querying databases, calling APIs, reading files, or controlling software.
To enable this safely and consistently, we need a standardized way for models to communicate with external systems. The Model Context Protocol (MCP) is designed to solve this problem.
It provides a structured interface that allows AI models to connect to tools, retrieve information, and perform actions in a controlled and reliable way. Instead of building custom integrations for every tool or service, MCP establishes a common protocol that makes these interactions predictable and scalable.
In this chapter, we will explore what MCP is, why it was created, how it works, and how it enables AI agents to interact with tools and environments in real-world AI applications.
In the early 1990s, connecting peripherals to a computer was painful. Your printer used a parallel port. Your mouse used PS/2. Your modem used a serial port. Your joystick used a game port. Every device manufacturer designed their own connector, their own electrical protocol, and their own driver software. If you wanted to connect N devices to M computer models, you needed N x M different implementations.
Then USB arrived in 1996. It defined a single standard that any device and any computer could implement. A printer manufacturer built one USB interface, and it worked on every computer with a USB port. A computer manufacturer built one USB port, and it worked with every USB device. N + M implementations instead of N x M.
MCP is USB for AI tool integrations. It defines a standard protocol that any AI application and any tool can implement. A tool developer builds one MCP server, and it works with Claude Desktop, VS Code Copilot, Cursor, or any custom chatbot that speaks MCP. An application developer implements one MCP client, and it can connect to thousands of MCP servers without writing custom code for each one.
The numbers tell the story clearly. Three apps connecting to three tools without MCP: 9 custom integrations. With MCP: 3 client implementations plus 3 server implementations, 6 total. At ten apps and twenty tools, the gap is staggering: 200 custom integrations versus 30 MCP implementations. The protocol pays for itself almost immediately, and the value compounds as the ecosystem grows.
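The arithmetic behind these numbers can be made concrete with a couple of one-line functions:

```python
def integrations_without_mcp(apps: int, tools: int) -> int:
    # Every app needs a custom integration for every tool: N x M.
    return apps * tools

def integrations_with_mcp(apps: int, tools: int) -> int:
    # Each app implements one MCP client, each tool one MCP server: N + M.
    return apps + tools

print(integrations_without_mcp(3, 3))    # 9
print(integrations_with_mcp(3, 3))       # 6
print(integrations_without_mcp(10, 20))  # 200
print(integrations_with_mcp(10, 20))     # 30
```

The gap widens multiplicatively on one side and only additively on the other, which is why the savings compound as the ecosystem grows.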
MCP is an open protocol created by Anthropic in late 2024 and adopted by a growing number of AI vendors and tool providers. It is not proprietary to Claude or any specific model. Any AI application that speaks MCP can use any MCP server, regardless of who built either side.
MCP defines three roles that work together to connect AI applications with external capabilities. Understanding these roles is essential because the rest of this module builds on them.
The host is the AI application that end users interact with: Claude Desktop, a VS Code extension with AI features, your custom chatbot, anything that wraps an LLM and wants to give it access to external capabilities. The host has several responsibilities: it manages the lifecycle of its client connections, aggregates the capabilities those clients discover, enforces user consent before actions run, and passes tool descriptions and results to and from the LLM.
A single host can manage multiple MCP clients, each connected to a different server. This is how a coding assistant can simultaneously access your file system, your GitHub repos, and your database through one unified interface.
The client is the protocol handler that lives inside the host. Each client manages the connection to exactly one MCP server. Its job is entirely mechanical: perform the initial handshake, ask the server what capabilities it offers, forward requests, and return responses. The client contains no intelligence of its own.
The 1:1 relationship between clients and servers is a deliberate design choice. It keeps connections isolated. If one server crashes or becomes unresponsive, the other connections remain unaffected. If your host connects to five servers, it runs five independent clients.
The server is where the actual capabilities live. It is a program (often lightweight) that exposes tools, data, and prompt templates through the MCP protocol. Some examples: a file system server that reads and searches local files, a GitHub server that manages repositories and pull requests, a database server that runs queries against your data.
Servers can run locally as subprocesses of the host, or remotely as web services. They are designed to be single-purpose and composable. You connect the servers you need, and each one handles its own domain.
When a user asks a question that requires external data, here is the full sequence: the host gathers tool descriptions from its connected servers and includes them in the LLM's context; the LLM decides a tool call is needed and emits one; the host routes the call through the appropriate client to the server; the server executes it and returns the result; and the LLM uses that result to compose its answer.
This flow has a familiar shape, but with a standardized protocol sitting between the application and the tools. The LLM still decides when to call a tool. The host still routes the call. The difference is that the connection between the host and the tool follows a universal standard instead of custom glue code.
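The flow can be sketched as a simple loop. Everything here is illustrative, not part of any real MCP SDK: the stubs stand in for the LLM and the server so the routing logic is visible on its own.

```python
# Sketch of the host's request loop. All names are hypothetical stand-ins.

def fake_llm(messages, tools):
    """Stand-in for an LLM: requests a tool call first, then answers."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_call": {"name": "get_weather", "arguments": {"city": "Paris"}}}
    return {"text": "It is sunny in Paris."}

def fake_mcp_server(name, arguments):
    """Stand-in for an MCP server executing a tool call."""
    return {"temperature_c": 21, "conditions": "sunny"}

def handle_user_message(user_text, tools):
    messages = [{"role": "user", "content": user_text}]
    while True:
        reply = fake_llm(messages, tools)
        if "tool_call" in reply:
            call = reply["tool_call"]
            # The host routes the call through the MCP client to the server.
            result = fake_mcp_server(call["name"], call["arguments"])
            messages.append({"role": "tool", "content": str(result)})
        else:
            return reply["text"]

print(handle_user_message("What's the weather in Paris?",
                          tools=[{"name": "get_weather"}]))
```

Note that the loop itself is ordinary function-calling orchestration; MCP's contribution is standardizing what happens inside the call to the server.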
MCP servers expose capabilities through three distinct primitives. Each one serves a different purpose, and understanding the distinctions helps you design better servers.
Resources represent data that the AI can read. Think of them as files or documents with stable URIs. A file system server exposes files as resources. A database server might expose table schemas. An API documentation server might expose endpoint specs.
Resources are identified by URIs, like file:///home/user/notes.txt or db://products/schema. Clients can list available resources, read their contents, and optionally subscribe to changes (for resources that update over time).
The defining characteristic: resources are read-only and side-effect-free. Reading a resource should never modify state. This makes them safe to read at any time, which is why the application (not the model) typically controls when resources are loaded.
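A resource read is an ordinary JSON-RPC exchange. The messages below are shaped after the MCP specification's resources/read method; field names follow the spec as of this writing, so verify them against the current schema before relying on them.

```python
import json

# A resources/read request and a plausible response, as JSON-RPC messages.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "resources/read",
    "params": {"uri": "file:///home/user/notes.txt"},
}

response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "contents": [
            {
                "uri": "file:///home/user/notes.txt",
                "mimeType": "text/plain",
                "text": "Meeting at 3pm with the design team.",
            }
        ]
    },
}

print(json.dumps(request))
```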
Tools are functions the AI can call to do things. Search for files, execute a database query, send a message, create a pull request. Each tool has a name, a description, and a JSON schema defining its input parameters.
Tools are action-oriented and may have side effects. Writing a file, sending an email, deleting a record: these are all tool operations. Because they can change state, MCP clients typically ask for user confirmation before executing tool calls. The model decides when a tool should be called based on the user's request.
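Here is what a tool definition looks like as data: a name, a description, and a JSON Schema for its input, following the tool shape in the MCP spec. The particular tool and its fields are illustrative, and the validation helper is a deliberately minimal stand-in for full schema validation.

```python
# A tool as an MCP server would advertise it. The "send_message" tool
# itself is hypothetical.
send_message_tool = {
    "name": "send_message",
    "description": "Send a message to a channel.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "channel": {"type": "string", "description": "Target channel name"},
            "text": {"type": "string", "description": "Message body"},
        },
        "required": ["channel", "text"],
    },
}

def validate_arguments(tool, arguments):
    """Minimal required-field check; a real client would validate the full schema."""
    return [f for f in tool["inputSchema"]["required"] if f not in arguments]

print(validate_arguments(send_message_tool, {"channel": "general"}))  # ['text']
```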
Prompts are pre-built prompt templates that the server offers to clients. A code review server might provide a "review this pull request" prompt that structures how the AI should analyze changes. A database server might offer an "optimize this query" prompt that guides the AI through a systematic analysis.
Prompts are user-initiated. Unlike tools (where the model decides to call them) or resources (where the application loads them), prompts are selected by the user from a menu. They shape the AI's approach to a task rather than providing data or performing actions.
In practice, tools dominate. Most MCP servers expose a handful of tools and maybe a few resources. Prompts are less common but valuable for servers that support specialized workflows.
MCP supports two transport mechanisms for communication between clients and servers. The choice of transport does not change the protocol messages themselves; it only changes how those messages travel between client and server.
With stdio transport, the MCP server runs as a child process of the host application. The client spawns the server process and communicates with it by writing JSON-RPC messages to the server's standard input and reading responses from its standard output.
This is the simplest transport. No network configuration, no ports to open, no authentication tokens to manage. The server starts when the host starts and stops when the host stops. It is ideal for local tools: file system access, local database queries, CLI utilities, anything that runs on the same machine as the host.
The downside is that stdio servers are tied to a single host process. You cannot share a stdio server across multiple users or applications. If the host restarts, the server restarts too, losing any in-memory state.
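The mechanics of stdio transport are easy to demonstrate: the client spawns a child process and exchanges newline-delimited JSON-RPC messages over its stdin and stdout. The child below is a toy echo server standing in for a real MCP server, so only the framing is real here.

```python
import json
import subprocess
import sys

# A toy child process that echoes back the method name of each JSON-RPC
# message it receives, one message per line. Not a real MCP server.
child_code = """
import json, sys
for line in sys.stdin:
    msg = json.loads(line)
    reply = {"jsonrpc": "2.0", "id": msg["id"], "result": {"echo": msg["method"]}}
    print(json.dumps(reply), flush=True)
"""

proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
)

# Client side: write one JSON-RPC message per line, read the reply line.
request = {"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {}}
proc.stdin.write(json.dumps(request) + "\n")
proc.stdin.flush()
response = json.loads(proc.stdout.readline())
proc.terminate()

print(response["result"]["echo"])  # initialize
```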
With Streamable HTTP transport, the server runs as a standalone web service. The client sends requests via HTTP POST and receives responses (including streaming updates) through Server-Sent Events (SSE).
Streamable HTTP enables several patterns that stdio cannot support: a single server shared by multiple users and applications, deployment on remote infrastructure away from the host machine, and server state that survives host restarts.
The trade-off is more setup. You need a URL, potentially TLS certificates, authentication, and network access. For development and local tools, this overhead is unnecessary.
A common pattern is to develop MCP servers locally using stdio for fast iteration, then deploy them as Streamable HTTP services for production use. The server code stays the same. Only the transport configuration changes.
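As a sketch, here is what switching transports looks like from the client's side. The stdio entry follows the shape of Claude Desktop's mcpServers configuration; the URL form is how several clients reference remote servers, though exact keys vary by client, and the server name and URL are hypothetical.

```python
# Two configurations for the same hypothetical server, one per transport.
# The server code itself would be unchanged.
stdio_config = {
    "mcpServers": {
        "notes": {"command": "python", "args": ["notes_server.py"]}
    }
}

http_config = {
    "mcpServers": {
        "notes": {"url": "https://mcp.example.com/notes"}
    }
}
```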
MCP is not just a protocol specification. It is a growing ecosystem of clients, servers, and registries that make the standard practical.
Several major AI applications already support MCP as clients, including Claude Desktop, VS Code Copilot, and Cursor.
This means that when you build an MCP server, it automatically works with all of these clients. You write the server once, and developers using Claude Desktop, Cursor, or any other MCP-compatible tool can connect to it without any additional effort from you.
The ecosystem already has hundreds of community-built and official MCP servers covering common use cases: file system access, GitHub operations, and database queries, among others.
As the number of servers grows, discovering the right one becomes its own challenge. MCP registries solve this. They act as directories where server developers publish their servers and client applications discover them. Think of them like package managers (npm, PyPI) but for MCP servers.
Registries enable a powerful workflow: instead of manually configuring server connections, a client application can query a registry to find servers that match what the user needs. "Find me an MCP server for PostgreSQL" returns a list of options with descriptions, ratings, and configuration instructions.
This is still an emerging area. Several registries are in development, and the protocol is evolving to support standardized server discovery.
If you already know about function calling, you might be wondering: how is MCP different? This is a common source of confusion, so let's clarify.
Function calling and MCP are complementary, not competing. They operate at different layers of the stack.
Function calling is the mechanism by which an LLM decides to invoke a tool. You provide the model with tool descriptions (name, parameters, schema), the model generates a structured tool call in its response, and your application executes it. This happens inside a single application. You define the tools, you execute them, you handle the results.
MCP is the protocol by which tools are discovered, described, and accessed across application boundaries. It standardizes how a client learns what tools a server offers, how it sends requests, and how it receives responses. MCP sits between the application and the tools.
Here is the key insight: MCP uses function calling under the hood. When an MCP client discovers tools from a server, it converts those tool descriptions into the function-calling format that the LLM understands. The LLM does not know or care that MCP exists. It sees tool descriptions and generates tool calls, just like before. The MCP client handles the translation.
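That translation step can be shown directly. The MCP side below follows the spec's tool shape; the target is the common {"type": "function", ...} convention used by several LLM APIs, though the exact format differs per provider, and the example tool is invented.

```python
# How an MCP client might translate a server's tool description into a
# function-calling definition for an LLM API. Target shape varies by
# provider; this is one common convention.
def mcp_tool_to_function(tool: dict) -> dict:
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool["description"],
            "parameters": tool["inputSchema"],
        },
    }

mcp_tool = {
    "name": "query_database",
    "description": "Run a read-only SQL query.",
    "inputSchema": {
        "type": "object",
        "properties": {"sql": {"type": "string"}},
        "required": ["sql"],
    },
}

converted = mcp_tool_to_function(mcp_tool)
print(converted["function"]["name"])  # query_database
```

Because the schemas pass through largely unchanged, the LLM sees the same kind of tool description whether the tool came from an MCP server or was defined inline.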
Use plain function calling when you have a small number of tools tightly coupled to your application. A chatbot with three custom tools that will never be reused elsewhere does not need MCP overhead.
Use MCP when you want tools that are reusable across applications, when you want to connect to existing servers from the ecosystem, when your tool infrastructure is managed by a different team, or when you need dynamic tool discovery. MCP really shines as the number of tools and applications grows.
In many production systems, you will use both. Some tools are MCP servers (shared, standardized), and some are inline function definitions (app-specific, lightweight). They coexist naturally.