Managed Agents Self-Hosted Sandbox | Comparing 4 Providers and How to Choose

Anthropic released the Self-Hosted Sandbox for Managed Agents in public beta on May 19, 2026, offering a choice of four providers: Cloudflare, Daytona, Modal, and Vercel. For enterprise outsourcing and client proposals with strict audit requirements, officially supported configurations now exist that let you immediately answer the question "where does our data actually go?" This article compares pricing, latency, SDK languages, and support for data residency within Japan across all four providers, and consolidates the selection criteria and a 7-step migration process into a single reference.

Conclusion

Anthropic released the Self-Hosted Sandbox for Managed Agents on 2026-05-19, making it possible to shift tool execution, file system, and network egress to your own infrastructure with a choice of four providers: Cloudflare, Daytona, Modal, and Vercel. Orchestration and model inference continue to run on Anthropic's side — 3 out of 5 responsibility blocks move to the self-hosted side.

The golden rule for selection is to work backwards from the customer's audit requirements and primary language. Cloudflare is the default for immediate edge startup, Daytona for dev-container-based pipelines that mix Python and Go, Modal for Python ML pipelines, and Vercel for Node-centric web projects. Combining the Sandbox with MCP Tunnels (released the same day) lets you confine even egress paths to your own network, which makes Cloudflare + Tunnels the practical choice for customers that hold SOC2 / ISO27001 certification.

The standard migration approach during beta consists of 7 steps: inventory dependent tools → select provider → minimize Sandbox image permissions → validate tunnel connections → design parallel execution and retry logic → integrate audit logs → plan GA fallback. Since image specs and permission boundaries may change when switching from beta to GA, it is recommended to set up dual-system operations with immediate fallback to the Anthropic-hosted version before going live.


What Is the Managed Agents Self-Hosted Sandbox — Overview of the 2026-05-19 Beta Release

Anthropic released the Self-Hosted Sandbox feature for Managed Agents as a public beta on May 19, 2026. The primary sources are the Anthropic Platform Docs release notes and the Cloudflare official Blog announcement from the same day, supplemented by English-language coverage from the-decoder and 9to5Mac. Until now, Managed Agents were only available in the Anthropic-hosted version, with tool execution and temporary file writes happening inside an Anthropic-managed sandbox. Since the beta release, an officially supported option exists to route the tool execution layer to a chosen provider — Cloudflare, Daytona, Modal, or Vercel — while maintaining the same API pathway.

Relationship with MCP Tunnels Released the Same Day

At the same time as the Self-Hosted Sandbox, MCP Tunnels were also released — a feature that closes off the routing path to MCP servers through a dedicated tunnel. While the Sandbox controls the "location" of tool execution, Tunnels controls the "path" from that execution to MCP servers, making them complementary. Combining both allows you to construct a topology where customer data never passes through Anthropic's managed network. The 9to5Mac writeup also frames these as "two privacy and security features," and enterprise proposals should treat this as a milestone release where both features must always be mentioned together.

"Running on Your Infrastructure" vs. "Reachable from Your Infrastructure" Are Different Things

A common misconception here is that "self-hosted = running orchestration on your own infrastructure." In reality, the Self-Hosted Sandbox covers only the tool execution side: the orchestration logic that drives the agent and model inference both continue to run on Anthropic's side. The responsibility breakdown table appears in the next section, but in proposals it is safest to frame "what's different from the Anthropic-hosted version" using 5 responsibility blocks. During the beta period, access is granted by application, and a feature flag must be enabled at the Console Workspace level before you can select a sandbox provider.

Responsibility Breakdown Between Anthropic-Hosted and Self-Hosted Versions — Orchestration Stays on Anthropic's Side

When explaining the behavior of the Self-Hosted Sandbox to customers, the most effective approach is to break Managed Agents into 5 responsibility blocks and show on one page "what comes to your side." Cross-referencing the Anthropic Platform Docs and Cloudflare Blog descriptions, the responsibility breakdown can be organized as follows:

| Responsibility Block | Anthropic-Hosted | Self-Hosted Sandbox | Primary Data Involved |
|---|---|---|---|
| Orchestration (agent control) | Anthropic | Anthropic | Agent planning, step management |
| Model inference | Anthropic | Anthropic | Prompts, model output |
| Tool execution | Anthropic | Customer-selected provider | Command execution, code execution |
| File system (temporary files) | Anthropic | Customer-selected provider | Intermediate artifacts, working directories |
| Network egress (external communication) | Anthropic | Customer-selected provider | API calls, MCP connections |
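For proposal tooling, the same breakdown can be kept as data so that the "3 out of 5" claim is derived rather than typed by hand. The following is a minimal sketch; the class and constant names are illustrative assumptions and not part of any Anthropic SDK.

```python
from dataclasses import dataclass

ANTHROPIC = "Anthropic"
PROVIDER = "Customer-selected provider"


@dataclass(frozen=True)
class ResponsibilityBlock:
    name: str
    hosted_owner: str        # who runs the block in the Anthropic-hosted version
    self_hosted_owner: str   # who runs it with the Self-Hosted Sandbox


# Mirrors the responsibility table in this article (names are illustrative).
BLOCKS = [
    ResponsibilityBlock("Orchestration", ANTHROPIC, ANTHROPIC),
    ResponsibilityBlock("Model inference", ANTHROPIC, ANTHROPIC),
    ResponsibilityBlock("Tool execution", ANTHROPIC, PROVIDER),
    ResponsibilityBlock("File system", ANTHROPIC, PROVIDER),
    ResponsibilityBlock("Network egress", ANTHROPIC, PROVIDER),
]


def moved_blocks(blocks):
    """Return the names of blocks whose owner changes under the Self-Hosted Sandbox."""
    return [b.name for b in blocks if b.hosted_owner != b.self_hosted_owner]


print(moved_blocks(BLOCKS))
```

Running it lists tool execution, file system, and network egress, which are the three blocks a customer audit sheet has to cover at the provider region level.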

The 3 Blocks That Move to the Self-Hosted Side

What comes to the self-hosted side is 3 blocks: tool execution, file system, and network egress. Temporary files generated when the agent runs bash or python, external API calls, and communication with MCP servers all complete within your own infrastructure. For enterprise customers with SOC2 or ISO27001 certification, you can explain the "data reach" of these 3 blocks at the provider region level. In practice, the most important benefit is that this makes customer audit Q&A sheets easier to fill out.

The 2 Blocks That Remain on Anthropic's Side

On the other hand, orchestration and model inference continue to run on Anthropic's side. Prompts and model output are exchanged via the Anthropic API, and Anthropic's contract terms still state that inference logs are not used for training by default. For customers who require "model inference on our own infrastructure as well," a design that combines Managed Agents with Claude via Amazon Bedrock or Vertex AI is more realistic. The Self-Hosted Sandbox is optimized for cases where you "only want to bring the tool execution layer to your own side."

Cloudflare / Daytona / Modal / Vercel — Comparison Table of 4 Providers: Pricing, Latency, SDK Languages

The differences between the 4 providers directly affect proposal unit pricing and audit compliance. The following comparison table was compiled by cross-referencing each company's official blog with the supported provider list in Anthropic Platform Docs. The pricing structure is based on beta announcements at the time of writing (2026-05-22) — be sure to note in customer proposals that pricing may change when switching to GA.

| Item | Cloudflare | Daytona | Modal | Vercel |
|---|---|---|---|---|
| Pricing model | Requests + CPU time | Monthly seat + execution time | Per-second billing (cores + memory) | Execution time + bandwidth |
| Startup latency | Edge-level ~50ms | 200–500ms range | Hundreds of ms to seconds | 100–300ms range |
| Supported SDK languages | TypeScript / Python | Python / Go / Node | Python-centric | TypeScript / Node-centric |
| Concurrent execution limit | High (distributed) | Per project | Flexible with core reservation | Plan-dependent |
| Primary regions | 330+ edge locations | US / EU | US / EU | Global CDN |

Strengths of Each Provider

Cloudflare's strengths are instant edge startup and 330+ locations, making it well-suited for embedding in customer-facing SaaS with strict latency requirements. Daytona's approach of turning development environments into sandboxes based on dev containers is its distinguishing feature — it's easy to work with for outsourced projects that mix Python and Go pipelines or deliver the IDE experience to customers. Modal is optimized for Python ML / data processing pipelines with the flexibility of per-second billing and core reservation, while Vercel's strength lies in easy integration with existing Vercel projects for Node-centric web-focused agent execution.

Monthly Cost Estimates: 3 Scenarios

The basic approach to estimating actual monthly cost is to compute monthly usage as "average daily tool execution time × concurrent sessions × operating days per month," then multiply that usage by the provider's unit price.

Scenario 1: Individual developer (30 minutes of tool execution per day, 1 concurrent session). All 4 providers will come in around $1–10/month. Vercel and Cloudflare have lower minimums, making them suitable for trials and PoC stages.

Scenario 2: Small/medium SI (4 hours per day, 5 concurrent sessions). Modal's per-second billing tends to be more cost-effective, with a rough estimate of $20–60/month. It works well with Python ML pipelines and can handle sudden spikes with flexible core reservation.

Scenario 3: Enterprise (24/7 operation, 20+ concurrent sessions). Cloudflare's edge-resident model tends to maintain consistently low latency. Expect costs of $200–500/month or more, but high availability through edge distribution and an SLA add value to proposals.
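The usage side of these three scenarios can be worked out with a few lines of Python. This is a rough sketch under two assumptions that are not from the providers' price lists: a month has 30 operating days, and cost is usage multiplied by a flat per-session-hour price. The price passed in the usage note is a placeholder, not a real provider rate.

```python
def monthly_session_hours(hours_per_day: float, concurrent: int,
                          days_per_month: int = 30) -> float:
    """Usage = daily tool execution time x concurrent sessions x operating days."""
    return hours_per_day * concurrent * days_per_month


def monthly_cost(session_hours: float, unit_price_per_hour: float) -> float:
    """Flat-rate approximation; real pricing models differ per provider."""
    return session_hours * unit_price_per_hour


# (hours of tool execution per day, concurrent sessions) for each scenario.
SCENARIOS = {
    "individual": (0.5, 1),
    "small_medium_si": (4.0, 5),
    "enterprise": (24.0, 20),
}

for name, (hours, concurrent) in SCENARIOS.items():
    print(name, monthly_session_hours(hours, concurrent))
# individual 15.0
# small_medium_si 600.0
# enterprise 14400.0
```

To turn usage into a budget line, call for example `monthly_cost(600.0, unit_price_per_hour=0.05)` with the rate taken from the chosen provider's current beta pricing. Because the four providers bill on different units (requests, seats, core-seconds, bandwidth), a flat hourly rate is only a first approximation.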

For proposals, show latency and monthly cost on a 2-axis comparison and narrow down to one provider that fits the customer's SLA.

Security Audit Requirements and Domestic Data Residency Selection Criteria

For enterprise outsourcing, audit compliance and data residency often matter more than pricing. Cloudflare, Modal, and Vercel have obtained SOC2 Type II reports (a third-party auditor's attestation that security controls operated effectively over a review period) and ISO27001 certification (the international standard for information security management systems). Daytona announced its SOC2 Type II attestation and continued renewal in 2026, meaning all 4 providers can meet basic enterprise audit requirements. However, the providers differ in which regions they support for data residency within Japan, so if a customer requires "tool execution to be completed within Japan," advance confirmation is essential.

Alignment with Government Generative AI Procurement Guidelines DS-920

The guidelines to reference for domestic projects are the Basic Policy on Procurement and Utilization of Generative AI DS-920, which the Digital Agency fully implemented on April 1, 2026, and the AI Business Operator Guidelines Version 1.2 published by the Ministry of Economy, Trade and Industry and the Ministry of Internal Affairs and Communications at the end of March 2026. DS-920 organizes requirements related to data sovereignty, inference geolocation, and audit log requirements, and for government and local government procurement projects, it is becoming necessary to specify the "regions for tool execution, temporary files, and egress" in contracts. The Self-Hosted Sandbox is designed to accommodate these requirements, but since Anthropic's orchestration / inference still depends on Anthropic-managed regions, a decision to switch to Claude via Bedrock may be necessary if a customer requires "inference fixed within Japan as well."

Scenarios Where Self-Hosted Sandbox Has an Advantage for Domestic Projects

When data residency requirements within Japan target only "tool execution and temporary files," the Self-Hosted Sandbox becomes a powerful option. Running edge-resident through Cloudflare's Tokyo region allows handling domestic projects with ~50ms latency, and you can also run Modal in US / EU regions while keeping only customer data within a domestic RDS. On the other hand, for financial and public sector projects that require "model inference fixed within Japan" or a "single vendor point of contact," using Claude via Amazon Bedrock / Microsoft Foundry / Vertex AI is more realistic. In those cases it is safer to narrow the Self-Hosted Sandbox proposal to "only shifting audit log integration and egress control to your own side."

7-Step Practical Checklist for Migrating from Anthropic-Hosted to Self-Hosted

Migration, given that we are still in beta, should follow the golden rule of proceeding gradually with dual-system operations. The flow of maintaining the Anthropic-hosted version while trialing the Self-Hosted Sandbox, then switching traffic after confirming stability, is organized into 7 steps.

Step 1: Inventory Dependencies of Existing Managed Agents

First, create a list of all tools, MCP servers, external APIs, and network egress destinations used by the current Managed Agents. Extract the tool names called and egress connection destinations over the past 30 days from the agent settings and logs in the Anthropic Console, and create a dependency map for customer audits. At this point, classify "dependencies that can leave Anthropic management" and "egress to integrate into the in-house SIEM (Security Information and Event Management system)."
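As a starting point for the dependency map, the 30-day extraction can be sketched as a small aggregation over exported log records. The record shape used here (`timestamp`, `tool`, `egress_host`) is a hypothetical assumption for illustration; the actual export format from the Anthropic Console may differ.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def build_dependency_map(records, now, window_days=30):
    """Count tool calls and egress hosts seen within the last `window_days`."""
    cutoff = now - timedelta(days=window_days)
    tools, hosts = Counter(), Counter()
    for rec in records:
        if rec["timestamp"] < cutoff:
            continue  # outside the inventory window
        tools[rec["tool"]] += 1
        if rec.get("egress_host"):
            hosts[rec["egress_host"]] += 1
    return {"tools": dict(tools), "egress_hosts": dict(hosts)}


# Hypothetical sample records; hosts and tool names are placeholders.
now = datetime(2026, 5, 22, tzinfo=timezone.utc)
records = [
    {"timestamp": now - timedelta(days=1), "tool": "bash", "egress_host": None},
    {"timestamp": now - timedelta(days=2), "tool": "python", "egress_host": "api.example.com"},
    {"timestamp": now - timedelta(days=3), "tool": "bash", "egress_host": "api.example.com"},
    {"timestamp": now - timedelta(days=45), "tool": "legacy_tool", "egress_host": "old.example.com"},
]
dependency_map = build_dependency_map(records, now)
print(dependency_map)
```

The resulting host list doubles as the first draft of the egress allowlist, and anything that appears only outside the window (such as `legacy_tool` here) is a candidate for removal before migration.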

Step 2: Select Provider

Based on the dependency inventory results, narrow down to one provider using the comparison table in this article. The defaults are Cloudflare if immediate edge startup is needed, Modal for primarily Python ML workloads, Vercel for Node web projects, and Daytona for dev-container-based pipelines that mix multiple languages. If there are residency requirements within Japan, make Cloudflare, with its Tokyo region support, the primary candidate. Then apply for the beta sandbox provider in the Anthropic Console Workspace settings.
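The selection defaults in this step can be written down as a small decision function so that the reasoning is reviewable alongside the proposal. This is a sketch only: the `Requirements` fields and the order in which the rules are checked (residency and edge startup first) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    needs_japan_residency: bool = False
    needs_edge_startup: bool = False
    primary_workload: str = "mixed"  # "python_ml", "node_web", or "mixed"


def select_provider(req: Requirements) -> str:
    """Apply this article's default rules; residency and latency are checked first."""
    if req.needs_japan_residency or req.needs_edge_startup:
        return "Cloudflare"
    if req.primary_workload == "python_ml":
        return "Modal"
    if req.primary_workload == "node_web":
        return "Vercel"
    return "Daytona"  # dev container, multi-language pipelines


print(select_provider(Requirements(primary_workload="python_ml")))
print(select_provider(Requirements(needs_japan_residency=True, primary_workload="python_ml")))
```

The second call shows the effect of the assumed priority: a Python ML project with a Japan residency requirement is routed to Cloudflare rather than Modal.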

Step 3: Build Sandbox Image and Configure Minimum Privileges

Build a Sandbox image for the selected provider containing only the minimum runtime needed for tool execution (Python / Node / bash, etc.). Choose a base image with a small attack surface such as Distroless (container images with minimal OS libraries) or Alpine, and configure it to start as a non-root user so that nothing runs as root. The standard approach is to grant write permission only for the working directory and to restrict egress with an allowlist that passes only the necessary external hosts.
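The allowlist idea can be illustrated at the application level with an exact-match host check. This is a minimal sketch, not how any of the four providers enforces egress (that normally happens at the network layer); the host names are placeholders.

```python
from urllib.parse import urlsplit

# Placeholder hosts; in practice this list comes from the Step 1 inventory.
ALLOWED_HOSTS = frozenset({"api.example.com", "mcp.internal.example.com"})


def is_egress_allowed(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Allow only HTTPS requests whose host exactly matches an allowlisted name."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    host = parts.hostname  # already lowercased, port stripped
    return host is not None and host in allowed_hosts


print(is_egress_allowed("https://api.example.com/v1/items"))    # True
print(is_egress_allowed("https://api.example.com.evil.net/"))   # False
print(is_egress_allowed("http://api.example.com/"))             # False
```

Exact matching is deliberate: suffix or substring matching would let a look-alike domain such as `api.example.com.evil.net` through.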

Step 4: Validate Tunnel Connection with Orchestration Endpoint

Validate the path from Anthropic's orchestration endpoint to the Sandbox of the selected provider. If using MCP Tunnels in combination, confirm that the MCP server can be reached via Tunnels. Start by running one agent in the dev environment and verify in the logs that all 3 blocks (tool execution, temporary file writes, and egress) complete on the self-hosted side.
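The log check for the three blocks can be automated once the dev run has produced events. The sketch below assumes a hypothetical event shape with a `block` name and a `location` label; adapt both to whatever the chosen provider's logs actually emit.

```python
REQUIRED_BLOCKS = ("tool_execution", "file_system", "network_egress")


def blocks_not_self_hosted(events, required=REQUIRED_BLOCKS):
    """Return required blocks with no events, or with any event outside self-hosted."""
    seen = {block: [] for block in required}
    for event in events:
        if event["block"] in seen:
            seen[event["block"]].append(event["location"])
    return [
        block for block in required
        if not seen[block] or any(loc != "self_hosted" for loc in seen[block])
    ]


# Hypothetical events from a single dev run.
events = [
    {"block": "tool_execution", "location": "self_hosted"},
    {"block": "file_system", "location": "anthropic_hosted"},
]
print(blocks_not_self_hosted(events))
```

An empty list means the run passed. In this sample the function reports `file_system` (an event ran on the hosted side) and `network_egress` (no event was observed at all), both of which should block promotion to production.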

Step 5: Design Parallel Execution, Timeouts, and Retry Policy

During the beta phase, concurrent execution limits and startup latency may behave unexpectedly. Measure the concurrent execution limit, startup latency, and cold start rate by provider in the beta environment, and adjust the agent's timeout and retry count. Appropriate timeouts differ between edge-resident (Cloudflare) and per-second billing (Modal) models, so finalize SLO numbers before going to production.
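One common way to implement the retry policy is capped exponential backoff around the call that starts a sandbox session. The sketch below is a generic pattern, not a provider API: `call` stands for any hypothetical function that raises `TimeoutError` on a slow cold start, and the default delays are placeholders to be replaced with measured values.

```python
import time


def backoff_delays(retries, base=0.5, cap=8.0):
    """Delays in seconds before each retry: base, 2*base, 4*base, ... capped at `cap`."""
    return [min(cap, base * 2 ** i) for i in range(retries)]


def run_with_retry(call, retries=3, base=0.5, cap=8.0, sleep=time.sleep):
    """Try `call` up to retries + 1 times, backing off after each TimeoutError."""
    delays = backoff_delays(retries, base, cap)
    for attempt in range(retries + 1):
        try:
            return call()
        except TimeoutError:
            if attempt == retries:
                raise  # retries exhausted, surface the timeout
            sleep(delays[attempt])


print(backoff_delays(5))  # [0.5, 1.0, 2.0, 4.0, 8.0]
```

For an edge-resident provider a small `base` with few retries is usually enough, while a provider with cold starts measured in seconds may need a larger `base` and `cap`. The `sleep` parameter is injectable so the policy can be unit-tested without waiting.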

Step 6: Integrate Audit Logs and Traces into In-House SIEM

Aggregate all logs for tool execution, temporary file writes, and egress connections into your in-house SIEM (Splunk / Datadog / Sumo Logic / OpenSearch, etc.). Since OTLP (OpenTelemetry Protocol) / Webhook / S3 export support differs by provider, a safe design is to output OpenTelemetry traces on the agent side during the image build in Step 3, and propagate trace_ids so they can be correlated with Anthropic-side orchestration logs in the SIEM.
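The trace_id propagation can be illustrated with a standard-library stand-in that stamps every audit record with the same identifier. This is not the OpenTelemetry SDK; it only generates an ID in the 32-hex-character form used by W3C Trace Context, and the record fields are assumptions for illustration.

```python
import json
import secrets
from datetime import datetime, timezone


def new_trace_id() -> str:
    """Random 16-byte ID rendered as 32 lowercase hex characters."""
    return secrets.token_hex(16)


def audit_record(trace_id, block, detail, now=None) -> str:
    """Serialize one audit event as a JSON line carrying the shared trace_id."""
    now = now or datetime.now(timezone.utc)
    return json.dumps(
        {
            "timestamp": now.isoformat(),
            "trace_id": trace_id,
            "block": block,    # e.g. tool_execution, file_system, network_egress
            "detail": detail,
        },
        sort_keys=True,
    )


trace_id = new_trace_id()
print(audit_record(trace_id, "tool_execution", {"command": "python main.py"}))
print(audit_record(trace_id, "network_egress", {"host": "api.example.com"}))
```

Because both lines carry the same `trace_id`, a SIEM query on that field returns the tool run and its egress together, and the same key can be used to join against orchestration-side logs when they expose a matching identifier.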

Step 7: Fallback Plan for Beta → GA Migration

Image specs, permission boundaries, and billing structures may change when switching from beta to GA. When going to production, maintain dual-system operations with immediate fallback to the Anthropic-hosted version, and prepare procedures to revert the sandbox provider setting in Console in a single operation. Checking Anthropic's changelog and News page once a week, and re-evaluating dual-system operations within 30 days of the GA migration announcement, is a realistic approach.
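On the client side, dual-system operation can be approximated with a wrapper that falls back to the hosted path when the self-hosted run fails, plus a helper for the 30-day re-evaluation date. Both are illustrative sketches: the two runner callables are hypothetical, and this does not replace reverting the sandbox provider setting in Console.

```python
from datetime import date, timedelta


def run_with_fallback(run_self_hosted, run_anthropic_hosted, on_fallback=None):
    """Try the self-hosted path first; on any failure, use the Anthropic-hosted path."""
    try:
        return "self_hosted", run_self_hosted()
    except Exception as exc:  # broad on purpose: any sandbox failure triggers fallback
        if on_fallback is not None:
            on_fallback(exc)  # e.g. send the error to the SIEM
        return "anthropic_hosted", run_anthropic_hosted()


def ga_review_deadline(ga_announced_on: date, window_days: int = 30) -> date:
    """Date by which dual-system operation should be re-evaluated after a GA announcement."""
    return ga_announced_on + timedelta(days=window_days)
```

Returning the path label with the result makes fallback events countable, which gives a concrete signal for deciding when the self-hosted side is stable enough to drop the dual setup.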

Clauder Navi Editorial Team
@clauder_navi

Covering Anthropic's Claude / Claude Code, with daily updates on the latest developments and practical know-how for engineers in Japan.