Deep Dive: How MCPProxy Uses MCP Tool Annotations for Smarter Routing

Algis Dumbris • 2026/03/13

TL;DR

MCP tool annotations tell agents whether tools are read-only, destructive, or open-world — but they are optional hints that most servers skip. MCPProxy captures all five annotation fields from upstream servers, passes them through in retrieve_tools responses, and uses its DeriveCallWith system to map annotations into concrete permission levels: call_tool_read, call_tool_write, and call_tool_destructive. Combined with agent tokens and intent validation, this turns hints into enforceable policy.

The Annotation Problem

The MCP spec defines five tool annotation fields: title, readOnlyHint, destructiveHint, idempotentHint, and openWorldHint. The defaults are deliberately pessimistic — a tool with no annotations is assumed non-read-only, potentially destructive, non-idempotent, and open-world. Every dimension assumes the worst case.
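To make the defaults concrete, here is a minimal Python sketch of how omitted hints resolve. The class name `ToolAnnotations` and the helper `effective_hints` are illustrative names chosen for this example, not part of MCPProxy; the default values are the ones described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToolAnnotations:
    """The five MCP annotation fields; None means the server omitted the field."""
    title: Optional[str] = None
    readOnlyHint: Optional[bool] = None
    destructiveHint: Optional[bool] = None
    idempotentHint: Optional[bool] = None
    openWorldHint: Optional[bool] = None


def effective_hints(a: ToolAnnotations) -> dict:
    """Fill every omitted hint with the spec's pessimistic default."""
    return {
        "readOnlyHint": False if a.readOnlyHint is None else a.readOnlyHint,
        "destructiveHint": True if a.destructiveHint is None else a.destructiveHint,
        "idempotentHint": False if a.idempotentHint is None else a.idempotentHint,
        "openWorldHint": True if a.openWorldHint is None else a.openWorldHint,
    }


# A tool that ships with no annotations at all.
unannotated = effective_hints(ToolAnnotations())
```

Keeping "omitted" (`None`) distinct from an explicit `False` matters later: a policy layer may want to treat a hint the server actually declared differently from one that was filled in by default.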

This design was intended to incentivize server authors to annotate their tools. In practice, it has not worked. Most MCP servers ship without annotations, and most clients either ignore annotations entirely or apply them rigidly, with nothing in between.

The result: agents either treat every unannotated tool as potentially dangerous and prompt for confirmation on every call or, worse, ignore annotations completely and trust every tool equally.

How MCPProxy Handles Annotations

MCPProxy takes a different approach. Rather than treating annotations as advisory information that agents can choose to ignore, MCPProxy makes them first-class inputs to its routing and access control system.

Capturing Annotations

When MCPProxy connects to an upstream MCP server and indexes its tools, it captures all five annotation fields. These are stored alongside the tool’s name, description, and input schema in the BM25 search index. When an agent calls retrieve_tools, the annotations are included in the response alongside the tool definition.

This means agents using MCPProxy always see the annotation metadata, even if they do not specifically request it.

The DeriveCallWith System

The core innovation is DeriveCallWith — a mapping from annotation hints to concrete call variants:

Annotation State                   Derived Call Variant
destructiveHint: true              call_tool_destructive
readOnlyHint: false (or absent)    call_tool_write
readOnlyHint: true                 call_tool_read

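The mapping follows the spec’s pessimistic default for readOnlyHint: a tool with no annotations is treated as not read-only and gets call_tool_write, since it might modify things. It does not apply the spec’s default for destructiveHint, which would treat an unannotated tool as potentially destructive. Only a tool that explicitly declares destructiveHint: true gets the most restrictive variant, call_tool_destructive.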

Each variant has its own permission model. An agent token can grant access to call_tool_read without granting call_tool_write or call_tool_destructive. This turns annotation hints into enforceable access boundaries.
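The mapping can be sketched in a few lines of Python. This is an illustration of the rules described above, not MCPProxy's actual code: the function name `derive_call_with` is a Python-style stand-in, and checking destructiveHint before readOnlyHint is an assumption about precedence when a tool sets both.

```python
from typing import Optional


def derive_call_with(annotations: Optional[dict]) -> str:
    """Map a tool's annotation hints to a call variant.

    Assumption: an explicit destructiveHint wins over readOnlyHint.
    Only explicit `True` values count; absent hints fall through to write.
    """
    ann = annotations or {}
    if ann.get("destructiveHint") is True:
        return "call_tool_destructive"
    if ann.get("readOnlyHint") is True:
        return "call_tool_read"
    # readOnlyHint is false or absent: the tool might modify state.
    return "call_tool_write"
```

Under this sketch, `derive_call_with(None)` and `derive_call_with({"readOnlyHint": False})` both yield `call_tool_write`, which is the behavior described for unannotated tools.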

Intent Validation

MCPProxy’s intent validation layer adds a second check. When an agent calls a tool, the proxy verifies that the call variant matches the tool’s annotations. In strict mode, a mismatch — calling a destructive tool via call_tool_read, for example — is blocked.

This catches both accidental misuse (an agent mistakenly treating a delete operation as read-only) and potential attacks (a malicious prompt injection trying to invoke destructive tools through a read-only code path).
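A rough Python sketch of such a check follows. The function names are hypothetical, and two behaviors are assumptions rather than documented facts: that "matches" means the requested variant must equal the derived one exactly, and that outside strict mode a mismatch is reported but not blocked.

```python
from typing import Optional, Tuple


def derive_variant(annotations: Optional[dict]) -> str:
    """Assumed mapping, mirroring the table in the DeriveCallWith section."""
    ann = annotations or {}
    if ann.get("destructiveHint") is True:
        return "call_tool_destructive"
    if ann.get("readOnlyHint") is True:
        return "call_tool_read"
    return "call_tool_write"


def validate_intent(
    requested: str, annotations: Optional[dict], strict: bool = True
) -> Tuple[bool, str]:
    """Return (allowed, message) for a call made through `requested`."""
    expected = derive_variant(annotations)
    if requested == expected:
        return True, "ok"
    message = f"intent mismatch: tool requires {expected}, got {requested}"
    # Assumption: outside strict mode the mismatch is reported, not blocked.
    return (not strict), message
```

For example, `validate_intent("call_tool_read", {"destructiveHint": True})` is rejected in strict mode, which is the prompt-injection case described above.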

Agent Token Integration

Agent tokens tie the system together. A token can specify which call variants are permitted:

{
  "agent": "reporting-bot",
  "permissions": ["call_tool_read"],
  "servers": ["database", "analytics"]
}

This agent can read data from the database and analytics servers but cannot write or delete anything. The enforcement happens at the proxy layer, not in the agent’s code — even if the agent is compromised through prompt injection, it cannot escalate beyond its token’s permissions.
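The enforcement step can be pictured as a simple membership check at the proxy. The Python sketch below is illustrative only: `AgentToken` and `is_allowed` are hypothetical names, and treating "*" as "all servers" is inferred from the configuration example later in this post.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class AgentToken:
    agent: str
    permissions: FrozenSet[str]  # permitted call variants
    servers: FrozenSet[str]      # permitted upstream servers; "*" means any


def is_allowed(token: AgentToken, server: str, variant: str) -> bool:
    """Allow the call only if both the server and the variant are granted."""
    server_ok = "*" in token.servers or server in token.servers
    return server_ok and variant in token.permissions


# The reporting-bot token from the example above.
reporting_bot = AgentToken(
    agent="reporting-bot",
    permissions=frozenset({"call_tool_read"}),
    servers=frozenset({"database", "analytics"}),
)
```

Because the check runs on the token rather than on anything the agent says about itself, a compromised agent that requests `call_tool_destructive` simply fails the membership test.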

What Annotations Enable

With annotations flowing through the system and mapped to permission levels, MCPProxy can make routing decisions that individual agents cannot:

Parallel execution for read-only tools. When retrieve_tools returns tools annotated as read-only, agents can call them concurrently without risk of conflicting mutations. Claude Code already does this — MCPProxy extends the same logic to the proxy layer.

Session-level risk assessment. By tracking which annotation types are active in a session, the proxy can detect when an agent accumulates the “lethal trifecta” — access to private data (non-read-only tools), exposure to untrusted content (open-world tools), and ability to externally communicate (open-world write tools). This is Simon Willison’s framework applied at the infrastructure level.

Graduated confirmation. Rather than prompting for every tool call or prompting for none, the proxy can require confirmation only for destructive operations while letting read-only and additive operations proceed automatically.
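Two of these decisions are simple enough to sketch in Python. The helper names below are hypothetical and the tool dictionaries are simplified to just `name` and `annotations`; this shows the shape of the logic, not MCPProxy's implementation.

```python
from typing import Dict, List


def needs_confirmation(variant: str) -> bool:
    """Graduated confirmation: prompt only for destructive calls."""
    return variant == "call_tool_destructive"


def split_for_concurrency(tools: List[dict]) -> Dict[str, List[str]]:
    """Group tools that explicitly declare readOnlyHint for concurrent use.

    Anything else, including unannotated tools, stays in the serial group.
    """
    plan: Dict[str, List[str]] = {"concurrent": [], "serial": []}
    for tool in tools:
        read_only = (tool.get("annotations") or {}).get("readOnlyHint") is True
        plan["concurrent" if read_only else "serial"].append(tool["name"])
    return plan


tools = [
    {"name": "list_rows", "annotations": {"readOnlyHint": True}},
    {"name": "insert_row", "annotations": {"readOnlyHint": False}},
    {"name": "legacy_tool"},  # no annotations at all
]
plan = split_for_concurrency(tools)
```

Note that the unannotated `legacy_tool` lands in the serial group: absent a declaration, the pessimistic default applies.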

Current Gaps and Planned Work

The current implementation has known limitations:

idempotentHint and openWorldHint are captured but unused in routing logic. DeriveCallWith maps the read/write/destructive spectrum but does not yet use idempotency or open-world status for routing decisions. Idempotent tools could enable automatic retries on failure. Open-world status could trigger additional monitoring.

Annotations are excluded from quarantine hashing. When MCPProxy quarantines a new server’s tools for review, the hash that detects schema changes does not include annotation values. A server could change its tools from read-only to destructive without triggering re-quarantine. This is a known gap that will be addressed.

No annotation-based filtering in retrieve_tools. The BM25 search currently matches on tool names and descriptions but does not filter by annotation values. A planned enhancement would allow queries like “find read-only database tools” or “exclude destructive tools from results.”

Annotation change detection. If a previously read-only tool suddenly declares itself destructive (or vice versa), this should be flagged as suspicious — similar to schema drift. Planned work would track annotation history and alert on changes.
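As a sketch of how the quarantine-hashing and change-detection gaps might be closed, the Python below folds annotations into a tool fingerprint and reports which hints changed between two versions of a tool. Everything here is an assumption made for illustration: the function names, the use of SHA-256 over canonical JSON, and the choice of fields to hash are not taken from MCPProxy.

```python
import hashlib
import json

HINT_FIELDS = ("readOnlyHint", "destructiveHint", "idempotentHint", "openWorldHint")


def tool_fingerprint(tool: dict) -> str:
    """Hash the tool definition including its annotations."""
    material = {
        "name": tool["name"],
        "description": tool.get("description", ""),
        "inputSchema": tool.get("inputSchema", {}),
        "annotations": tool.get("annotations") or {},
    }
    # sort_keys gives a stable serialization regardless of key order.
    canonical = json.dumps(material, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def annotation_changes(old: dict, new: dict) -> dict:
    """Return {hint: (old_value, new_value)} for every hint that differs."""
    old_ann = old.get("annotations") or {}
    new_ann = new.get("annotations") or {}
    return {
        field: (old_ann.get(field), new_ann.get(field))
        for field in HINT_FIELDS
        if old_ann.get(field) != new_ann.get(field)
    }
```

With annotations inside the hashed material, a tool that flips from read-only to destructive produces a different fingerprint, and `annotation_changes` identifies which hints moved so the change can be flagged for review.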

The Bigger Picture

The MCP community is actively expanding the annotation vocabulary. Discussion #2382 on the spec repository proposes three new fields — sensitiveHint (accesses private data), egressHint (can send data externally), and reversibleHint (effects can be undone) — directly mapping to the lethal trifecta risk model. Five independent SEPs propose additional metadata fields.

An official MCP blog post (PR #2230) titled “Tool Annotations as Risk Vocabulary” is in review, co-authored by maintainers from Anthropic, GitHub, and AWS. It frames annotations as inputs for policy engines rather than security guarantees — exactly how MCPProxy uses them.

As annotations become richer and more widely adopted, MCPProxy’s DeriveCallWith system provides the foundation for turning them into actionable policy. The proxy does not need to trust annotations blindly — it can combine them with runtime behavioral monitoring, quarantine analysis, and agent token permissions to build a layered security model where annotations are one signal among many.

Getting Started

If you are using MCPProxy, tool annotation routing is active by default. Tools from annotated upstream servers automatically get the appropriate call variant. For servers without annotations, the pessimistic defaults apply — all tools route through call_tool_write.

To use agent tokens with annotation-based permissions, configure tokens in your ~/.mcpproxy/mcp_config.json:

{
  "agent_tokens": {
    "my-read-only-agent": {
      "permissions": ["call_tool_read", "retrieve_tools"],
      "servers": ["*"]
    }
  }
}

Full documentation is available at mcpproxy.app.