Three Sandboxes, Three Problems: Cloudflare, Anthropic, and MCPProxy
Algis Dumbris • 2026/03/31
Three Announcements, One Week, Zero Consensus
The last week of March 2026 delivered three major sandbox announcements from three very different organizations. Cloudflare introduced Dynamic Workers, extending its V8 isolate architecture to support AI tool execution at the edge. Anthropic open-sourced sandbox-runtime, a library that wraps OS-native sandboxing primitives — sandbox-exec on macOS, Landlock on Linux — into a clean API for isolating MCP server processes. And Cisco’s DefenseClaw project, alongside the open-source MCPProxy, demonstrated that Docker containers remain the production-grade answer for multi-tenant MCP gateway deployments.
Three sandboxes. Three entirely different isolation mechanisms. And all three are correct.
This is not a product comparison. It is a map of the design space. Every sandbox answers the same three questions differently: what are you isolating, from whom, and for how long? The answers to those questions determine whether you reach for V8, OS primitives, or Docker — and choosing the wrong one does not just cost performance. It leaves attack surface uncovered.

Cloudflare Dynamic Workers: V8 Isolates for Stateless Tool Calls
Cloudflare has been running V8 isolates at scale since Workers launched in 2017. The architecture is well understood: each request gets its own V8 isolate, a lightweight execution context that shares the V8 engine’s compiled code but maintains strict memory separation. There is no filesystem. There is no persistent state between requests. There is no network access except through Cloudflare’s own APIs. The isolate starts in sub-millisecond time, executes, and vanishes.
Dynamic Workers extends this model to MCP tool execution. When an AI agent calls a tool hosted on Cloudflare, the tool’s code runs inside a V8 isolate that is spun up on demand, executed, and torn down before the next request arrives. The isolation guarantees are strong for what they cover: one tool invocation cannot read or write memory belonging to another, cannot access the host filesystem, and cannot make arbitrary network connections.
Where V8 Isolates Excel
The performance characteristics are remarkable. Sub-millisecond cold starts mean there is no meaningful latency penalty for isolation. The memory overhead per isolate is measured in kilobytes, not megabytes. Cloudflare can run thousands of isolated tool invocations per second on a single edge node because V8 isolates share the engine’s compiled code and JIT infrastructure. For stateless tool calls — a currency conversion, a weather lookup, a database query that returns JSON — this is the ideal sandbox. The isolation is total within its scope, the overhead is negligible, and the developer experience is seamless.
The edge distribution matters too. When an AI agent in Tokyo calls a tool, that tool executes on a Cloudflare node in Tokyo. The latency between agent and sandbox is measured in single-digit milliseconds. For real-time agent interactions where tool calls happen in the critical path of a conversation, this latency advantage compounds across multi-step reasoning chains that might invoke dozens of tools.
Where V8 Isolates Fall Short
The absence of a filesystem is both the strength and the limitation. Tools that need to read configuration files, write temporary data, or maintain state between invocations cannot run in a V8 isolate without significant re-architecture. An MCP server that wraps a command-line tool, reads from a local database, or maintains a session cache is fundamentally incompatible with the isolate model.
Network restrictions add another constraint. V8 isolates on Cloudflare can make outbound HTTP requests, but they cannot open arbitrary TCP connections, bind ports, or communicate with other isolates except through Cloudflare’s service bindings. An MCP server that needs to connect to a local Redis instance, communicate over gRPC, or tunnel through a corporate VPN cannot operate within these boundaries.
The execution time limits are real. Cloudflare Workers enforce a CPU time limit (10 ms on the free tier, configurable up to 30 seconds on paid plans) that works well for fast tool calls but makes long-running operations impossible. An MCP tool that needs to process a large file, run a complex computation, or wait for a slow upstream API will hit these limits.
Perhaps most critically, V8 isolates provide no protection against a malicious MCP server changing its tool declarations mid-session. The isolation prevents data leakage between invocations, but the trust model assumes that the code running inside the isolate is the code that was deployed. The rug pull attack — where an MCP server changes its tool descriptions after initial approval — operates at a layer above the sandbox. V8 isolates contain the blast radius of a single invocation, but they do not address the question of whether that invocation should have been allowed in the first place.
Anthropic’s sandbox-runtime: OS-Native Primitives for Local Development
Anthropic’s approach starts from a different premise. Instead of building a new execution environment, sandbox-runtime wraps the operating system’s existing sandboxing mechanisms into a consistent API. On macOS, it uses sandbox-exec with custom profiles that restrict filesystem access, network operations, and process creation. On Linux, it uses Landlock LSM (Linux Security Modules) to enforce fine-grained access control policies on files, directories, and network operations.
The design philosophy is minimal interposition. The sandboxed process runs as a native process on the host operating system, with full access to the CPU, memory, and installed libraries. The sandbox only restricts what that process can reach — which files it can read, which network endpoints it can contact, which syscalls it can make. This is not virtualization. It is permission restriction applied at the kernel level.
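On macOS, the shape of such a policy can be sketched with sandbox-exec directly. The profile below is an illustrative sketch, not a profile that sandbox-runtime actually ships, and real profiles need more allowances (dynamic loader paths, mach services) before useful programs will run:

```shell
# Illustrative SBPL profile: deny everything by default, then grant
# read-only access to system paths. Not an actual sandbox-runtime profile.
cat > readonly.sb <<'EOF'
(version 1)
(deny default)
(allow file-read* (subpath "/usr"))
(allow process-exec (subpath "/usr/bin"))
EOF

# Run a command under the profile; the kernel refuses anything not allowed.
sandbox-exec -f readonly.sb /usr/bin/true
```

The deny-by-default first line is the essential design choice: anything the profile does not explicitly name is refused at the kernel boundary.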
Where OS-Native Sandboxes Excel
The overhead is essentially zero. There is no VM startup, no container image pull, no isolate initialization. The sandboxed process starts as fast as any other process because it is a native process. The kernel enforces restrictions through security policy checks on syscalls, which adds nanoseconds of overhead per restricted operation. For a developer running MCP servers locally — the primary use case for Claude Code, Cursor, and similar tools — this means sandbox protection without any perceptible performance impact.
The granularity of control is superior to both V8 isolates and Docker containers. A Landlock policy can grant read access to /usr/lib while denying writes, allow outbound TCP connections to port 5432 while blocking all others (on kernels with Landlock ABI v4 or later), and permit reading a file while denying permission to execute it. This per-access-right, per-path granularity means the sandbox can be precisely shaped to match the tool’s legitimate needs without over-permissioning.
For local development, the integration story is compelling. Developers do not need Docker installed. They do not need a Cloudflare account. They run sandbox-runtime wrap -- my-mcp-server and the server starts with restrictions enforced by the kernel they are already running. The barrier to adoption is effectively zero for anyone on a supported operating system.
Where OS-Native Sandboxes Fall Short
The platform fragmentation is the primary weakness. sandbox-exec on macOS and Landlock on Linux are different mechanisms with different capabilities and different configuration formats. Landlock is still evolving — its ABI version determines which restrictions are available, and older kernels support fewer controls. Windows has no equivalent mechanism that sandbox-runtime supports, which means a significant portion of the developer population is excluded.
Multi-tenant isolation is the harder problem. OS-native sandboxes restrict what a process can access on the host, but they operate within the host’s process space. A sandboxed process and an unsandboxed process share the same kernel, the same /proc filesystem (on Linux), and the same user namespace unless additional isolation is configured. For a single developer running their own MCP servers, this is fine. For a production gateway handling requests from multiple users, each potentially connecting to untrusted MCP servers, the isolation boundary is insufficient.
Resource limits are another gap. Landlock controls access permissions but does not limit CPU time, memory consumption, or disk I/O bandwidth. A malicious MCP server running inside a Landlock sandbox can still consume all available memory, spin up CPU-intensive computations, or fill the disk with temporary files. Preventing resource exhaustion requires additional mechanisms — cgroups, ulimits, or external monitoring — that sandbox-runtime does not currently manage.
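Those gaps can be covered by layering standard POSIX limits outside the sandbox. A minimal sketch, using the shell's ulimit builtin to bound what Landlock does not:

```shell
# Bound resources for the sandboxed process with POSIX ulimits;
# these caps apply to this shell and every process it launches.
ulimit -v 524288    # cap virtual memory at 512 MiB
ulimit -t 30        # cap CPU time at 30 seconds
ulimit -v           # confirm the active limit
```

cgroups (for example via systemd-run) offer finer control, including I/O bandwidth, but the point stands either way: access control and resource control are separate mechanisms, and a complete sandbox needs both.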
The trust bootstrapping problem remains unsolved. Like V8 isolates, OS-native sandboxes contain what a process can do but do not address whether the process should be trusted in the first place. A developer still needs to decide which MCP servers to run, and a compromised npm package that ships a malicious MCP server will execute inside the sandbox with whatever permissions the policy allows.
MCPProxy and Docker Containers: Production-Grade Blast Radius Containment
MCPProxy takes the third path. Each upstream MCP server runs in its own Docker container with resource limits, network policies, and a managed lifecycle. The container is not a deployment convenience — it is the security boundary. A compromised MCP server cannot access the host filesystem, cannot communicate with other MCP server containers, cannot consume unbounded resources, and cannot persist beyond its configured lifetime.

The architecture is straightforward. MCPProxy runs as a single Go binary that acts as a gateway between MCP clients and upstream MCP servers. When you add an upstream server with mcpproxy upstream add, MCPProxy launches it inside a Docker container with a predefined security profile. The container gets its own network namespace, its own filesystem (an ephemeral overlay), its own PID namespace, and strict resource limits on CPU, memory, and disk.
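The per-container profile can be approximated with plain Docker flags. This is a hand-rolled sketch of the idea, not MCPProxy's actual invocation; the image name and the limit values are placeholders:

```shell
# Hypothetical per-server launch: ephemeral filesystem, no network,
# cgroup-enforced resource caps. Image name and values are placeholders.
docker run --rm \
  --memory=256m \
  --cpus=0.5 \
  --pids-limit=64 \
  --read-only \
  --tmpfs /tmp \
  --network=none \
  --security-opt no-new-privileges \
  my-mcp-server-image
```

The --read-only root plus a tmpfs mount gives the ephemeral-overlay behavior: the server can scribble in /tmp, and everything vanishes with the container.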
Where Docker Containers Excel
The blast radius containment is the strongest of the three approaches. A V8 isolate prevents memory access between invocations. An OS-native sandbox restricts syscalls and file access. A Docker container provides all of that plus network isolation, PID namespace separation, filesystem isolation, and resource limits — enforced by the Linux kernel’s cgroup and namespace mechanisms that have been battle-tested in production for over a decade.
Network policies are particularly important for MCP security. The most dangerous MCP attacks involve data exfiltration — a compromised tool reads sensitive data from one context and sends it to an attacker-controlled endpoint. Docker’s network isolation means MCPProxy can configure per-server network policies: this MCP server can reach api.github.com on port 443 and nothing else. No other server, no other port, no DNS resolution for domains outside the allowlist. This level of network control is impossible with V8 isolates (which have no outbound network restrictions beyond Cloudflare’s own) and requires significant additional configuration with OS-native sandboxes.
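Docker's building block for this is the internal network: attached containers can reach each other but have no route to the outside world. Allowlisted egress to a single endpoint then goes through an explicit proxy container, which is where a rule like "api.github.com on 443 only" would live. A sketch, with placeholder names:

```shell
# Create an internal network: no external routing for attached containers.
docker network create --internal mcp-internal

# The MCP server container can only reach peers on mcp-internal.
docker run --rm --network=mcp-internal my-mcp-server-image

# An egress proxy joined to the same network enforces the allowlist
# (e.g. permit api.github.com:443, deny everything else).
```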
The lifecycle management addresses the temporal attacks that the other sandboxes leave open. MCPProxy uses SIGINT-based graceful shutdown to manage container lifecycles. Containers that exceed their resource limits are terminated. Containers that have been running beyond their configured maximum lifetime are recycled. Containers that fail health checks are restarted in a clean state. This lifecycle management means that even if an attacker compromises an MCP server inside its container, the compromise is automatically time-bounded.
MCPProxy also solves the trust bootstrapping problem that the other approaches punt on. New upstream servers are placed in quarantine by default. Their tools are registered internally but invisible to AI agents until an administrator explicitly approves them with mcpproxy upstream approve. This deny-by-default model means the sandbox is not just containing a potentially malicious process — it is preventing the process from being invoked until trust is established. Combined with BM25-based tool discovery that only surfaces approved tools, the quarantine system creates a layered defense that starts before the sandbox and continues after it.
Where Docker Containers Fall Short
The overhead is real. A Docker container takes hundreds of milliseconds to start, compared to sub-millisecond for V8 isolates and near-zero for OS-native sandboxes. The memory overhead per container is measured in tens of megabytes, compared to kilobytes for V8 isolates. For a developer running three MCP servers locally, this overhead is irrelevant. For a platform serving thousands of concurrent agent sessions, each with multiple MCP servers, the resource cost of per-server containers adds up.
Docker itself is a dependency. Unlike Anthropic’s sandbox-runtime, which relies on kernel features already present on supported systems, and unlike Cloudflare’s isolates, which are managed by the platform, Docker containers require Docker to be installed and running. This is a reasonable assumption for production server deployments but adds friction for developer workstations, CI/CD environments, and edge deployments where Docker may not be available or desirable.
The cold start latency makes Docker containers unsuitable for the stateless, request-per-isolate model that Cloudflare uses. MCPProxy keeps containers running for the lifetime of an MCP server session rather than spinning up a new container per tool invocation. This amortizes the startup cost but means the container persists state between invocations — which is both the feature (tools that need state can have it) and the risk (a compromised container maintains its compromised state across invocations until recycled).
The Design Question: What, From Whom, For How Long?
Every sandbox decision reduces to three questions. Getting them wrong does not just degrade performance — it creates the illusion of security while leaving the actual threat model unaddressed.
What are you sandboxing? If you are sandboxing a stateless function call — a tool that takes JSON in and returns JSON out — V8 isolates are the right answer. The isolation is total, the overhead is minimal, and the execution model matches the trust model perfectly. If you are sandboxing a process that needs filesystem access, network connectivity, and persistent state, you need containers. If you are sandboxing a local development environment where the developer is the only user and trusts most of their own MCP servers, OS-native sandboxes provide the best cost-to-benefit ratio.
From whom? Single-tenant isolation (protecting a developer from their own MCP servers) has different requirements than multi-tenant isolation (protecting users from each other’s MCP servers). OS-native sandboxes are sufficient for single-tenant. Multi-tenant requires the stronger isolation boundaries of containers or V8 isolates, where a kernel namespace or V8 heap boundary separates one tenant’s execution from another’s.
For how long? If each tool invocation is independent and ephemeral, V8 isolates provide the strongest temporal isolation — there is literally no state to carry over between invocations. If the sandbox must persist for the lifetime of an agent session (minutes to hours), containers with lifecycle management provide bounded persistence. If the sandbox is a long-running development environment, OS-native sandboxes impose the least overhead on the ongoing process.
| | V8 Isolates | OS-Native Sandbox | Docker Containers |
|---|---|---|---|
| Startup latency | Sub-millisecond | Near-zero | Hundreds of milliseconds |
| Memory overhead | Kilobytes | Zero | Tens of megabytes |
| Filesystem access | None | Restricted | Isolated overlay |
| Network isolation | Platform-managed | Per-policy | Per-container namespace |
| Resource limits | Platform-enforced | Not included | cgroup-enforced |
| Multi-tenant safety | Strong | Weak | Strong |
| Trust bootstrapping | None | None | Quarantine + approval |
| Best for | Stateless edge tools | Local development | Production gateways |
The Convergence Thesis
These three approaches are not competing. They are converging on a layered architecture that most production MCP deployments will eventually adopt.
Consider the trajectory. A developer starts with Anthropic’s sandbox-runtime on their laptop, using OS-native sandboxing to safely experiment with new MCP servers. When they deploy to production, their MCP gateway runs MCPProxy with Docker container isolation, quarantine, and network policies. Some of their tools — the stateless, performance-critical ones — run on Cloudflare Workers with V8 isolation at the edge, while the stateful, complex tools run in containers behind the gateway.
This layered model is not hypothetical. It is what the MCP ecosystem is building toward. The question is not which sandbox wins. The question is which sandbox belongs at which layer of your stack, and whether your current deployment has the right sandbox at each layer — or no sandbox at all.
The last option is the one that should worry you. In March 2026, with MCPwned demonstrations fresh in every CISO’s memory and real exfiltration attacks documented in the wild, running MCP servers without any sandbox is not a trade-off. It is a gap. Pick the sandbox that matches your threat model. Pick any of these three. But pick one.
MCPProxy is available at github.com/smart-mcp-proxy/mcpproxy-go. One binary, zero configuration, quarantine by default. To start securing your MCP servers today:
mcpproxy serve
mcpproxy upstream add my-server -- npx @modelcontextprotocol/my-server
mcpproxy upstream approve my-server
mcpproxy upstream list