In April 2026, a software founder described a failure that many platform teams now fear. An AI coding agent was working through an infrastructure task. It found an API token. The token had more power than the task needed. The agent used it to delete production data and backups in seconds. The lesson is not only "protect your tokens." The deeper problem is that autonomous agents do not behave like regular apps, which follow fixed workflows. By contrast, AI agents reason, choose tools, change direction, and may search for credentials or context during the task. They can act in ways the original token issuer did not predict.
That is where Model Context Protocol (MCP) authorization becomes a production issue. For example, a platform engineer might connect an agent to GitHub, PagerDuty, and Confluence. The agent authenticates. The OAuth flow works. The MCP servers respond. Everything looks correct. Then the agent gets an incident task. It checks GitHub. That makes sense. It reads PagerDuty. That also makes sense. Then it calls Confluence and pulls every runbook in the workspace, including runbooks outside the current incident scope. The token was valid, the session was authenticated, and the tool existed. But nobody asked the most important question: should this agent, for this task, right now, call this tool with these arguments? That is the missing layer.
What MCP Actually Provides
MCP provides agents with a common protocol for accessing external tools. The official MCP tools specification states that tools allow models to interact with external systems, such as APIs, databases, and computing services. Each tool has a name, a description, and an input schema. Here, two operations are most relevant to this discussion. The first is tools/list. This lets the client discover the tool catalog exposed by the MCP server. The server can return tool names, descriptions, and schemas. This is useful because the agent can decide which tool might help with the task. The second is tools/call. It sends the tool name and arguments to the server. The server then runs the tool handler.
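As a rough illustration of the two operations, the sketch below builds the JSON-RPC 2.0 requests a client would send for tools/list and tools/call. The method names and the name/arguments parameter shape follow the MCP tools specification; the make_request helper, the get_runbook tool, and its arguments are hypothetical and exist only for this example.

```python
import json
from typing import Optional


def make_request(request_id: int, method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request, the message framing MCP uses."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return message


# Discovery: ask the server for its tool catalog.
list_request = make_request(1, "tools/list")

# Execution: invoke one tool by name with arguments.
# "get_runbook" and "payments-api" are made-up values for illustration.
call_request = make_request(
    2,
    "tools/call",
    {"name": "get_runbook", "arguments": {"service": "payments-api"}},
)

print(json.dumps(list_request))
print(json.dumps(call_request, indent=2))
```

Note that nothing in either message carries task context. The call request names a tool and its arguments, and that is all the server receives.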
That design is clean. It is also where the security gap appears. There is no required protocol step between discovery and execution that asks whether the tool call is allowed in the current task context. MCP does include authorization, but its scope is access to the MCP server. It does not extend to per-tool, per-task decisions once that access is granted. This matters because agents do not always know in advance which tool they will call. They decide during reasoning. The risk appears after the prompt, after context retrieval, and after tool discovery. That is too late for static permission assumptions.
MCP defines two operations: tools/list and tools/call. Server-level authorization does not extend to per-call decisions between them. The gap appears at the moment of execution.
What MCP Authorization Actually Covers
MCP includes an authorization specification, but its scope is narrower than most teams assume. OAuth improves session security, but it does not close this gap. OAuth 2.1 defines how an application can obtain limited access to a protected resource, either on behalf of a resource owner or on its own behalf. Resource indicators, defined in RFC 8707, allow a client to signal the protected resource it wants to access. Those are useful controls. They help bind tokens to resources and reduce broad token misuse. But they still do not inspect the actual tool call. OAuth can say: "This caller has a valid token for this resource." It does not say whether this agent should delete this record, export this file, rotate this secret, read this runbook, or open this production incident.
However, there is one specific pattern in which the MCP specification is explicit, and it is widely violated in practice. The spec forbids the MCP server from passing the client's token directly to downstream APIs. The MCP server should use separately scoped upstream credentials rather than blindly passing through the client token. In practice, many current MCP implementations skip this. They use static tokens or pass the client token through unchanged, giving the agent the same broad access the original token carried, well beyond what any individual tool call requires. This is a version of the classic confused deputy problem: a trusted component uses its own authority, or a delegated token, to perform an action that is inappropriate for the requester’s actual intent. In MCP environments, the MCP server may be properly authenticated and authorized, yet still execute a tool action that is outside the current task scope. A legitimate identity is not the same as a legitimate action. MCP security guidance calls out this risk in the context of token passthrough and downstream API access.
Scopes are usually defined before the agent reasons through the task. But agents choose tools after reasoning. They may see new data and may follow misleading instructions. They may combine tools in a way no developer expected. That is why static scopes are not enough for agentic systems. A scope that permits deleting records permits every delete-record call. It carries no information about which record is involved, why the deletion is occurring, whether the task requested the deletion, whether the agent has already tried safer alternatives, or whether this is a production resource. For human-driven apps, this may be acceptable. For autonomous agents, it is too broad.
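The limitation of static scopes can be shown in a few lines. This is a minimal sketch, not a real OAuth library: the scope string records:delete, the delete_record tool, and the record identifiers are all invented for illustration. The point is only that a scope check receives the scope and nothing else.

```python
def scope_allows(token_scopes: set, required_scope: str) -> bool:
    """A static scope check: it sees the scope string and nothing about the call."""
    return required_scope in token_scopes


# Hypothetical scopes granted before the agent started reasoning.
token_scopes = {"records:read", "records:delete"}

# Two very different invocations of the same hypothetical tool.
calls = [
    {"tool": "delete_record",
     "arguments": {"record_id": "tmp-123", "environment": "staging"}},
    {"tool": "delete_record",
     "arguments": {"record_id": "cust-001", "environment": "production"}},
]

# The check never looks at the arguments, so both calls get the same answer.
results = [scope_allows(token_scopes, "records:delete") for _call in calls]
print(results)
```

Both calls pass, because the record, the environment, and the task never enter the decision.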
What Breaks in Production When This Layer Is Missing
The first failure is simple: agents that can see tools may try to use them. If tools/list returns the full advertised catalog, the agent learns what operations are possible. That includes tools outside the current task scope. Discovery becomes a form of information exposure. The agent may learn that it can delete, export, impersonate, escalate, archive, deploy, or change configuration. This is not only a user experience issue. It is an access control issue. A safe production design should not show an agent every possible tool by default. The catalog should be filtered before the agent sees it. If the agent is handling a low-risk support task, it should not discover production deployment tools. If it is resolving an incident for one service, it should not discover runbooks and credentials for unrelated systems.
The second failure is credential overreach. MCP servers often need credentials to call downstream APIs. Those credentials may be service tokens, cloud keys, API keys, database users, or platform secrets. Even when MCP transport authentication is correct, the downstream credential may still be too powerful. This is where many teams confuse authentication with least privilege. The agent may authenticate to the MCP server with a valid session. But once the MCP server receives the request, it may use a broad service credential to execute the action. If that credential can act across environments, projects, or resources, the agent indirectly inherits that power. The 2026 production deletion incident shows how dangerous this pattern can be. The report says the agent used a token to authorize a destructive command, and the token had root-scoped power. The deletion also affected backups due to the way the storage was configured. The lesson for MCP environments is clear: do not grant an agent standing access broader than the current task.
The third failure is weak auditability. MCP does not define a standard execution audit record for every tool call. Some MCP servers may log tool calls. Others may log only errors. Some may log the tool name but not arguments. Some may log the downstream API call but not the agent identity or task context. For security teams, that is not enough. After an incident, they need to reconstruct the chain:
- Who was the agent acting for?
- Which agent invoked the tool?
- What task was it trying to complete?
- Which tool did it call?
- What arguments were passed?
- Which credential reached the downstream system?
- Which policy allowed or denied the action?
- Was the decision made before execution?
Without that record, the execution layer becomes a blind spot. This is serious because AI adoption is already outpacing governance. IBM's 2025 Cost of a Data Breach research reports a global average breach cost of USD 4.4 million and highlights an AI oversight gap, including breached organizations that lacked AI governance policies or approval processes. Agentic AI increases that pressure. Gartner predicted that by 2026, 40% of enterprise applications would feature task-specific AI agents, up from less than 5% in 2025. More agents mean more tool calls. More tool calls mean more execution decisions. Those decisions need policy, credentials, and audit trails.
Three failure modes arise when runtime enforcement is absent: agents discover tools they should not see, credentials extend beyond what the task requires, and audit trails cannot reconstruct what happened.
What a Runtime Enforcement Layer Actually Does
A runtime enforcement layer sits between the agent and the MCP server. It does not replace MCP. It governs MCP use. The primary job is to intercept tools/call. Before the tool handler runs, the enforcement layer evaluates the invocation. Unlike a model deciding whether an action "seems reasonable," this check evaluates structured, deterministic inputs (tool name, arguments, resource identifiers, agent identity, task context, and policy version) and produces a binary decision before the handler runs. This is where execution-time authorization actually happens.
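One way to picture this check is as a pure function over structured inputs. The sketch below is an assumption-laden illustration, not any product's policy engine: the TaskGrant shape, the evaluate_call function, the service argument, and all identifiers are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TaskGrant:
    """What one agent may do for one task (hypothetical policy shape)."""
    agent_id: str
    task_id: str
    allowed_tools: frozenset
    allowed_services: frozenset
    policy_version: str


def evaluate_call(agent_id: str, task_id: str, tool: str,
                  arguments: dict, grant: TaskGrant) -> Tuple[bool, str]:
    """Return (allowed, reason) using only structured inputs, never model judgment."""
    if agent_id != grant.agent_id or task_id != grant.task_id:
        return False, "call does not match the granted agent and task"
    if tool not in grant.allowed_tools:
        return False, "tool is outside the task scope"
    if arguments.get("service") not in grant.allowed_services:
        return False, "resource is outside the task scope"
    return True, "allowed by policy " + grant.policy_version


grant = TaskGrant(
    agent_id="incident-agent",
    task_id="INC-42",
    allowed_tools=frozenset({"get_runbook"}),
    allowed_services=frozenset({"payments-api"}),
    policy_version="v1",
)

in_scope = evaluate_call("incident-agent", "INC-42", "get_runbook",
                         {"service": "payments-api"}, grant)
out_of_scope = evaluate_call("incident-agent", "INC-42", "get_runbook",
                             {"service": "billing-db"}, grant)
wrong_tool = evaluate_call("incident-agent", "INC-42", "delete_record",
                           {"service": "payments-api"}, grant)
print(in_scope, out_of_scope, wrong_tool)
```

The same inputs always produce the same decision, which is what makes the result reviewable after the fact.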
A complementary job is to intercept tools/list. Instead of showing every tool to every agent, the enforcement layer can filter the catalog so that the agent discovers only tools that match its identity, task, environment, and policy. This reduces information leakage and lowers the chance that the agent chooses a tool outside its scope. The principle is defense in depth, with tools/call enforcement as the load-bearing layer.
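Catalog filtering can be sketched as a simple transformation of the tools/list result. The name, description, and inputSchema fields follow the MCP tool definition; the three tools and the allow-list below are made up, and a real deployment would derive the allow-list from policy rather than hard-coding it.

```python
def filter_catalog(tools: list, allowed_names: set) -> list:
    """Keep only the tool definitions the current agent and task may discover."""
    return [tool for tool in tools if tool["name"] in allowed_names]


# A hypothetical full catalog as an MCP server might return it.
catalog = [
    {"name": "get_runbook", "description": "Read a runbook for a service",
     "inputSchema": {"type": "object"}},
    {"name": "deploy_service", "description": "Deploy a service to production",
     "inputSchema": {"type": "object"}},
    {"name": "rotate_secret", "description": "Rotate a stored secret",
     "inputSchema": {"type": "object"}},
]

# Assumed policy output: an incident-triage task only needs to read runbooks.
visible = filter_catalog(catalog, {"get_runbook"})
print([tool["name"] for tool in visible])
```

Filtering alone is not enforcement. An agent can still attempt a call to a tool name it learned elsewhere, so the tools/call check remains necessary.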
Another job is credential narrowing. The agent should not carry broad standing credentials for the whole session. The MCP server should not rely on one powerful service token for every downstream action. Instead, the enforcement layer should derive or broker a credential for the specific call. That credential should be short-lived, scoped, and revoked or expired after the action completes.
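The shape of a narrowed credential might look like the sketch below. It is deliberately local and simplified: a real broker would obtain the credential from a secrets manager or token service, which is not shown here. ScopedCredential, mint_credential, is_usable, and the 60-second lifetime are all assumptions for illustration.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScopedCredential:
    """A credential bound to one tool, one resource, and a short lifetime."""
    token: str
    tool: str
    resource: str
    expires_at: float


def mint_credential(tool: str, resource: str, ttl_seconds: float = 60.0,
                    now: Optional[float] = None) -> ScopedCredential:
    """Issue a fresh credential for a single approved call (hypothetical broker)."""
    issued_at = time.time() if now is None else now
    return ScopedCredential(secrets.token_urlsafe(32), tool, resource,
                            issued_at + ttl_seconds)


def is_usable(cred: ScopedCredential, tool: str, resource: str,
              now: Optional[float] = None) -> bool:
    """Valid only for the exact tool and resource, and only until expiry."""
    current = time.time() if now is None else now
    return (cred.tool == tool and cred.resource == resource
            and current < cred.expires_at)


cred = mint_credential("get_runbook", "payments-api", ttl_seconds=60.0, now=1000.0)
print(is_usable(cred, "get_runbook", "payments-api", now=1030.0))  # within lifetime
print(is_usable(cred, "get_runbook", "billing-db", now=1030.0))    # wrong resource
print(is_usable(cred, "get_runbook", "payments-api", now=1061.0))  # expired
```

Because the credential names a single tool and resource, a leaked or reused token is of little use outside the call it was minted for.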
The final job is auditability. Every allow-or-deny decision should produce a record. That record should include the agent, tool, arguments, policy result, credential used, task context, and timestamp. This makes security review possible. It also helps platform teams debug agent behavior without guessing. This is the difference between access being assumed and access being enforced.
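A decision record along these lines could be emitted as one JSON line per tool call. MCP does not define this format; the field names, the audit_record function, and the sample values below are assumptions. The sketch logs a credential identifier rather than the secret itself.

```python
import json
from datetime import datetime, timezone


def audit_record(*, principal: str, agent_id: str, task_id: str, tool: str,
                 arguments: dict, decision: str, policy_version: str,
                 credential_id: str) -> str:
    """Serialize one allow-or-deny decision as a JSON line (hypothetical schema)."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "principal": principal,          # who the agent was acting for
        "agent_id": agent_id,            # which agent invoked the tool
        "task_id": task_id,              # what task it was working on
        "tool": tool,
        "arguments": arguments,
        "decision": decision,            # "allow" or "deny"
        "policy_version": policy_version,
        "credential_id": credential_id,  # an identifier, never the secret
    }
    return json.dumps(record, sort_keys=True)


line = audit_record(
    principal="oncall@example.com",
    agent_id="incident-agent",
    task_id="INC-42",
    tool="get_runbook",
    arguments={"service": "payments-api"},
    decision="allow",
    policy_version="v1",
    credential_id="cred-7f3a",
)
print(line)
```

Writing the record at decision time, before the handler runs, is what lets reviewers confirm the check happened ahead of execution rather than being reconstructed afterward.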
A runtime enforcement layer evaluates every tool call before execution. As a complementary defense, it can also filter the tool catalog before the agent sees it.
How AgntID Closes the MCP Enforcement Gap
AgntID is built for this missing layer. AgntID enforces scoped, ephemeral access for AI agents at execution time without requiring teams to replace their tools, MCP servers, or identity platform. It is designed for infrastructure teams that need runtime access control without having to redesign every agent. In an MCP deployment, AgntID sits between the agent and the MCP server. The agent does not call MCP directly. Tool invocations flow through AgntID first. AgntID intercepts the request, evaluates policy, derives scoped credentials, and then forwards the approved call to MCP.
The AgntID approach is vital because MCP is becoming the connective tissue for agentic systems. The more agents use MCP to reach real systems, the more important it becomes to control what happens after discovery. AgntID's role is not to slow agents down. It is to make agent execution safe enough for production.
AgntID sits between the agent and the MCP server. Every tool invocation is intercepted, evaluated against policy, scoped to the specific action, and logged before the MCP handler runs.
FAQ
Doesn't OAuth 2.1 in MCP already handle authorization?
OAuth helps with token issuance, delegation, and session access. It does not evaluate every MCP tool call at execution time. MCP authorization primarily governs transport-level access, while runtime authorization determines whether a specific agent action is allowed.
If my MCP server requires authentication, isn't that enough?
No. Authentication proves the caller is known. It does not prove that the agent should call a specific tool with specific arguments. Production agents also need task-aware policy, credential narrowing, and audit logs.
Does this apply to all MCP deployments, or only HTTP-based ones?
It applies to all MCP deployments, but the risk changes by transport and environment. HTTP-based MCP servers can use MCP authorization flows. STDIO deployments often rely on environment credentials, which makes runtime enforcement and credential scoping especially important.
What is the difference between filtering tools/list and enforcing tools/call?
Filtering tools/list controls what the agent can discover. Enforcing tools/call controls what the agent can execute. Ideally, production systems should govern both. A tool should not be visible if it is outside scope, and every attempted invocation should still be checked before execution.
