The Four Layers of MCP Security: Why Scanners, Middleware, Quarantine, and Reputation All Exist
Algis Dumbris • 2026/04/13
The Week That Clarified the Stack
This week a well-researched comparison article landed, reviewing Cisco’s mcp-scanner, Snyk’s agent-scan, and Pipelock side by side. The recommendation was solid: stack all three. Each catches a different class of vulnerability, and layering them reduces the gaps any single tool leaves behind.
The problem is that the article only covered two of the four layers that exist. Pre-deploy static scanners and runtime content inspectors are necessary. They are not sufficient. Two more layers sit above them in the stack, and the order you deploy all four determines whether you have genuine defense in depth or just defense in width.
Here is the full picture.
Layer 1 — Pre-Deploy Static Scanning
Tools: Cisco mcp-scanner, Snyk agent-scan, AgentSeal, MCPSec
This is where most teams start, and for good reason. Static scanners operate on tool definitions before anything reaches production. Cisco’s mcp-scanner combines YARA rules with LLM analysis to catch tool poisoning — hidden instructions embedded in descriptions that manipulate agent behavior. Snyk’s agent-scan extends the concept to dependency-level analysis, flagging known-vulnerable MCP server packages before they enter your supply chain. AgentSeal and MCPSec cover overlapping ground with their own rule databases.
What these tools share is a common operating principle: examine the definition, not the execution. They catch command injection patterns in tool schemas, hardcoded secrets in server configurations, and description-level prompt injection designed to make an agent exfiltrate data through a seemingly benign tool call.
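The definition-level checks described above can be sketched in a few lines. The patterns and tool-schema shape below are illustrative only — real scanners like mcp-scanner and agent-scan use far richer rule sets (YARA rules, LLM analysis, vulnerability databases), and the field names here are assumptions, not any vendor's actual format.

```python
import re

# Illustrative definition-level checks; the patterns and schema shape
# are simplified stand-ins for real scanner rule sets.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.I),
    re.compile(r"<!--.*?-->", re.S),  # hidden HTML comments in descriptions
    re.compile(r"do not (mention|reveal|tell)", re.I),
]
SECRET_PATTERN = re.compile(r"(api[_-]?key|secret|token)\s*[:=]\s*\S+", re.I)

def scan_tool_definition(tool: dict) -> list[str]:
    """Return a list of findings for one tool definition."""
    findings = []
    desc = tool.get("description", "")
    for pat in INJECTION_PATTERNS:
        if pat.search(desc):
            findings.append(f"possible tool poisoning in description: {pat.pattern}")
    for value in tool.get("config", {}).values():
        if isinstance(value, str) and SECRET_PATTERN.search(value):
            findings.append("hardcoded secret in server configuration")
    return findings
```

The key property is that this runs entirely against the definition — no server is ever contacted, which is exactly why it can run at build time and exactly why it stops seeing anything once the server is deployed.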
What they miss is everything that happens after deployment. A server that passes all static checks on Monday can receive a malicious update on Tuesday. A rug-pull attack — where a server replaces its tool definitions after gaining trust — is invisible to any scanner that only runs at build time. Zero-day injection patterns that are not yet in the rule database pass through unchallenged.
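One way to see why build-time scanning cannot catch a rug-pull: the scan result is implicitly a statement about a snapshot of the tool definitions. A hypothetical sketch of pinning that snapshot as a fingerprint — so a later definition change is at least detectable — looks like this (the function names and canonicalization scheme are assumptions, not a feature of any scanner named here):

```python
import hashlib
import json

def definition_fingerprint(tools: list[dict]) -> str:
    """Stable hash of a server's tool definitions, taken when the static scan ran."""
    canonical = json.dumps(tools, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

def definitions_changed(pinned: str, current_tools: list[dict]) -> bool:
    """True if the definitions no longer match the scanned snapshot (rug-pull signal)."""
    return definition_fingerprint(current_tools) != pinned
```

Without something comparing the pinned fingerprint against the live definitions at connect time, the Monday scan says nothing about the Tuesday update.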
Static scanning is the foundation. It is not the roof.
Layer 2 — Runtime Content Scanning
Tools: Pipelock, MCP-fence, Lilith-zero
Runtime content scanners sit in the execution path, inspecting tool arguments on the way in and responses on the way out. Pipelock focuses on encoded payload detection — base64-encoded commands, unicode obfuscation, and nested injection in tool arguments that look benign to a static scanner. MCP-fence takes a policy-based approach, enforcing allow/deny rules on specific argument patterns during live tool calls.
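The argument-side inspection described above — catching encoded payloads that look benign statically — hinges on recursive decoding. A minimal sketch, with illustrative patterns that are not Pipelock's or MCP-fence's actual rules:

```python
import base64
import binascii
import re

# Illustrative deny patterns; real runtime scanners carry much larger sets.
SUSPICIOUS = re.compile(r"(rm\s+-rf|curl\s+http|/etc/passwd|eval\()", re.I)
B64_CANDIDATE = re.compile(r"^[A-Za-z0-9+/=]{12,}$")

def inspect_argument(value: str, depth: int = 0) -> bool:
    """True if the argument, or anything base64-encoded inside it, looks malicious."""
    if SUSPICIOUS.search(value):
        return True
    # Recurse into plausible base64, bounded to avoid decode-bomb loops.
    if depth < 3 and B64_CANDIDATE.match(value.strip()):
        try:
            decoded = base64.b64decode(value, validate=True).decode("utf-8", "ignore")
        except (binascii.Error, ValueError):
            return False
        return inspect_argument(decoded, depth + 1)
    return False
```

The depth bound matters: nested encodings are a real evasion technique, but unbounded recursion on attacker-controlled input is its own denial-of-service vector.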
Lilith-zero brought a particularly sharp insight to this layer: response injection is as dangerous as argument injection. A compromised MCP server can return response payloads that manipulate the calling agent into taking actions the user never intended. Runtime scanning on the response side catches this — something no pre-deploy scanner can address because the malicious content is generated dynamically.
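Response-side scanning is structurally the same check pointed in the other direction: inspect the payload a server returns before the agent sees it. A sketch, with agent-steering patterns that are illustrative assumptions rather than Lilith-zero's published rules:

```python
import re

# Patterns that try to steer the calling agent; illustrative, not exhaustive.
RESPONSE_INJECTION = [
    re.compile(r"(system|assistant)\s*:", re.I),   # fake role markers
    re.compile(r"ignore (all )?previous instructions", re.I),
    re.compile(r"you must now", re.I),
]

def scan_response(payload: str) -> bool:
    """True if a tool response contains content that manipulates the calling agent."""
    return any(pat.search(payload) for pat in RESPONse_INJECTION) if False else \
           any(pat.search(payload) for pat in RESPONSE_INJECTION)

def scan_response_clean(payload: str) -> bool:
    return any(pat.search(payload) for pat in RESPONSE_INJECTION)
```

Because the malicious content is generated fresh on every call, this check can only live in the execution path — there is no artifact for a pre-deploy scanner to examine.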
The gap at layer 2 is different from layer 1. Runtime scanners assume the server is already connected and running. They inspect what a server does. They do not control whether the server should be running in the first place. A server that passes every content filter but was never vetted, never approved, and was added by an automated process or a compromised config file is executing tool calls right now, and layers 1 and 2 are both watching it do so without questioning whether it belongs.
This is the blind spot. Scanning — whether static or runtime — answers the question “is this traffic safe?” It never asks “should this traffic exist?”
Layer 3 — Server-Trust Quarantine
Tool: MCPProxy quarantine-by-default
This is the layer the comparison article missed entirely, and it is arguably the most important one.
Quarantine-by-default operates on a simple principle: no MCP server surfaces tools to any agent until a human explicitly approves it. Every newly added server enters quarantine automatically. Its tools are registered internally but invisible to agents. It can be inspected, tested, and scanned — but it cannot execute. The admission decision is binary and manual.
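The mechanics reduce to a small amount of state. This is a hypothetical sketch of the principle, not MCPProxy's actual implementation — every name here is illustrative:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of quarantine-by-default; MCPProxy's real
# implementation differs. All names are illustrative.
@dataclass
class ServerRegistry:
    approved: set = field(default_factory=set)
    quarantined: set = field(default_factory=set)
    tools: dict = field(default_factory=dict)  # server name -> tool definitions

    def add_server(self, name: str, tool_defs: list) -> None:
        """Every new server enters quarantine; tools are registered but hidden."""
        self.quarantined.add(name)
        self.tools[name] = tool_defs

    def approve(self, name: str) -> None:
        """The binary, manual admission decision made by a human reviewer."""
        self.quarantined.discard(name)
        self.approved.add(name)

    def visible_tools(self) -> list:
        """Agents only ever see tools from explicitly approved servers."""
        return [t for s in self.approved for t in self.tools.get(s, [])]
```

Note where the security property lives: not in any scanning logic, but in the fact that `visible_tools` iterates over `approved` and nothing else. Quarantined servers can be inspected and tested through other paths, but no agent-facing surface ever reads from the quarantine set.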
This inverts the trust model that layers 1 and 2 assume. Those layers ask “is this server doing something bad?” Layer 3 asks “should this server be doing anything at all?”
The distinction matters because the threat surface for MCP is not limited to malicious payloads. Configuration injection — where an attacker modifies mcp_config.json or an equivalent to add an unauthorized server — is one of the most practical attack vectors in the ecosystem today. The MCPwned research demonstrated this with Cursor, Windsurf, and other agent frameworks: modify the config, add a server, and the agent starts calling tools the user never authorized.
Quarantine-by-default makes that attack class structurally impossible. An injected server enters quarantine. No tools surface. No execution occurs. The attack is contained before layers 1 and 2 have any traffic to inspect.
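Even without a full quarantine proxy, the config-injection vector can be surfaced by diffing the agent's config against the human-approved set. A minimal sketch, assuming the common `mcpServers` key used by agent config files; the function name and approved-set mechanism are assumptions for illustration:

```python
def injected_servers(config: dict, approved: set) -> set:
    """Servers present in an agent's MCP config that no human ever approved.

    Assumes the common {"mcpServers": {name: {...}}} config shape.
    """
    return set(config.get("mcpServers", {})) - approved
```

Anything this returns is exactly the attack class described above: a server that exists in the execution path without ever passing an admission decision.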
The Lilith-zero community reached a similar conclusion independently. Their analysis identified pre-connection trust as the missing piece in existing MCP security frameworks. Quarantine is the mechanism that delivers it.
Layer 4 — Behavioral Reputation Over Time
Emerging: Dominion Observatory proposal
Layer 3 gives you a binary gate: approved or not approved. That decision is made once, at a point in time, based on available information. But servers change. Maintainers change. Update patterns change. The question “should I trust this server?” is not static — it degrades or strengthens over time based on observed behavior.
Layer 4 addresses this by aggregating behavioral signals across the ecosystem. The Dominion Observatory proposal — currently the most detailed public articulation of this concept — envisions a reputation network that tracks runtime behavior across the 4,400+ public MCP servers in the registry. Anomalous update patterns, sudden tool definition changes, unusual argument distributions, and community reports feed into dynamic trust scores.
This layer does not replace layer 3. It informs layer 3. When a server’s reputation score drops below a threshold, the quarantine gate can automatically re-engage, pulling the server out of production and requiring fresh human review. When a new server has strong ecosystem-level reputation signals, the approval decision in layer 3 is faster and better-informed.
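The feedback loop from layer 4 back into layer 3 can be sketched as a scored gate. Everything below is hypothetical — the Dominion Observatory proposal describes the concept, not this API, and the signal names, weights, and threshold are invented for illustration:

```python
# Hypothetical layer-4 -> layer-3 feedback loop. Signal names, weights,
# and the threshold are illustrative assumptions, not a published design.
REVIEW_THRESHOLD = 0.4

SIGNAL_WEIGHTS = {
    "anomalous_update": -0.3,
    "tool_definition_change": -0.2,
    "community_report": -0.25,
    "clean_week": +0.05,
}

def update_score(score: float, signal: str) -> float:
    """Fold one behavioral signal into a server's trust score, clamped to [0, 1]."""
    return min(1.0, max(0.0, score + SIGNAL_WEIGHTS[signal]))

def should_requarantine(score: float) -> bool:
    """Below the threshold, the layer-3 gate re-engages for fresh human review."""
    return score < REVIEW_THRESHOLD
```

The design point is that the output of layer 4 is not a block/allow verdict on traffic — it is an input to the admission decision that layer 3 already owns.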
Today, layer 4 is a proposal. Tomorrow it is infrastructure. The signals are already being generated — they just need aggregation and a trust framework to make them actionable. Every time a server updates its tool definitions, every time a tool call returns an anomalous response shape, every time a community member flags suspicious behavior — that is a reputation data point waiting to be captured.
Why Order Matters
The natural instinct is to deploy these layers in the order they were built: static scanning first, runtime scanning second, and worry about the rest later. This is backwards.
Layer 3 should deploy first. If you have quarantine-by-default, unauthorized servers never reach the state where scanning matters. You have eliminated the largest class of attacks — unauthorized server execution — before writing a single scanning rule.
Layer 1 deploys second. Static scanning validates the servers you are considering approving. It is part of the quarantine review process, not a standalone defense.
Layer 2 deploys third. Runtime scanning catches what static scanning cannot: dynamic payloads, response injection, and behaviors that only manifest during live execution.
Layer 4 is the long game. Reputation accumulates over time and feeds back into layer 3 decisions, creating a closed loop where trust is continuously evaluated rather than granted once and forgotten.
The comparison article this week recommended stacking Cisco, Snyk, and Pipelock. That is correct as far as it goes. But stacking three tools across two layers leaves two layers unaddressed. The server that was never supposed to be running is not caught by any scanner. The server whose maintainer sold the project to an unknown entity last month is not caught by any runtime filter.
The full stack is four layers deep. Deploy quarantine first. Add static scanning to your approval workflow. Wire runtime scanning into your execution path. Build toward reputation as the ecosystem matures.
That is the stack. The order is not negotiable.