Claude Managed Agents — Keys to Reducing TTFT and Decoupled Design

Clauder Navi Editorial Team / Last updated 2026-04-26

For developers exhausted by rewriting their harness every time a model is updated, this article breaks down the decoupled design of Managed Agents as published by Anthropic. In an order that directly informs implementation decisions, we cover: how to carve out the brain (reasoning), hands (execution environment), and session (state) into independent abstraction layers; the mechanism that cuts TTFT by roughly 60% at p50 and more than 90% at p95; and the thinking behind Many Brains / Many Hands scaling.

Conclusion

Managed Agents adopts a 3-layer separation of brain / hands / session. By applying the virtualization principles of OS design to AI agents and moving the harness outside the container, it minimizes the cost of rewriting the harness with every model update.

The core of the implementation is transparent connectivity via the execute(name, input) interface. Containers, custom tools, and MCP servers can all be treated as "hands" through the same function call. Moving the harness out of the container cuts TTFT by roughly 60% at p50 and more than 90% at p95, and lets brains and hands scale out independently of each other.

The key caveat is that sessions must be externalized as durable event logs. This ensures state is not lost even when a container fails, but log persistence and rollback then become part of the design, so systems built on a stateless assumption will need to be redesigned.

Table of Contents

Key to Decoupled Design 1: Why Separate the Execution Environment from Reasoning?
The Problem: Harnesses Were Designed to "Compensate for Model Shortcomings"
The Solution — Apply OS Virtualization Principles Across 3 Components
Key to Decoupled Design 2: The Architecture Is 3 Layers — Brain / Hands / Session
Separating the 3 Components — Brain (Reasoning) / Hands (Execution) / Session (State)
The Harness Position Has Changed — Separated Outside the Container, Transparently Connected via execute(name, input)
Context Externalization via Session — Durable Log Without Compression, Enabling Rollback
Key to Decoupled Design 3: Shifting the Failure Model from "Pets" to "Cattle"
The Vulnerability of Single-Container Design — Total Session Loss on Failure, Requiring Manual Recovery
Fault Tolerance Through Separation — Restart via wake(sessionId), State Restored from Log
Key to TTFT Reduction — ~60% at p50, Over 90% at p95
Key to Decoupled Design 4: Two Patterns for Prompt Injection Defense — Resource Bundling and Vault
Pattern 1: Resource-Bundled Authentication — Token Used Only at Initialization, Then Accessed via Git Remote
Pattern 2: Vault Authentication — Securely Obtain OAuth Tokens via a Dedicated Proxy
Scaling Patterns — Many Brains (Shared Resources) and Many Hands (Multiple Execution Targets)
Many Brains — Multiple Harnesses Share a Common Sandbox, Tools, and Session
Many Hands — A Single Brain Works Across Heterogeneous Execution Environments (Containers / External Services / MCP)
Differences from Traditional Agent Implementations — Managed Agents Wins on 6 Dimensions
Practical Application — When to Choose Managed Agents
4 Use Cases Where Managed Agents Excel — Long-Running / Parallel / High-Security / Model Update Tracking
Cases That Warrant Consideration — Short Tasks / Strong Dependency on Existing Harness
Sources (Primary Information)

Key to Decoupled Design 1: Why Separate the Execution Environment from Reasoning?

The Problem: Harnesses Were Designed to "Compensate for Model Shortcomings"

When building agents, developers must design not only Claude itself but also the surrounding harness (the entire execution control framework). The harness includes prompt control, context management, tool invocation, retry logic, and more.

Anthropic has identified a fundamental problem with this approach:

"Harnesses embed assumptions about what the model cannot do. But as models improve, those assumptions become outdated." Source

As a concrete example, Claude Sonnet 4.5 exhibited "context anxiety" near token limits, requiring workarounds implemented in the harness. However, Claude Opus 4.5 resolved this issue, making those harness workarounds unnecessary Source.

The more model assumptions are baked into the harness, the more the entire harness must be revisited with each model improvement. This is unsustainable from the perspectives of scalability, safety, and cost.

The Solution — Apply OS Virtualization Principles Across 3 Components

The solution Anthropic adopted is the virtualization principle proven in OS design. Just as an OS separates processes from hardware, Managed Agents virtualizes 3 components (brain, hands, and session) as independent, stable abstraction layers Source.

Key to Decoupled Design 2: The Architecture Is 3 Layers — Brain / Hands / Session

Separating the 3 Components — Brain (Reasoning) / Hands (Execution) / Session (State)

The diagram and table below summarize the three components and their roles.

┌─────────────────────────────────────────────────┐
│                 Managed Agents                  │
│                                                 │
│  ┌──────────────┐   ┌──────────────────────┐    │
│  │    Brain     │   │        Hands         │    │
│  │              │   │                      │    │
│  │  Claude      │   │  Sandbox             │    │
│  │  + Harness   │──▶│  (Container)         │    │
│  │              │   │  Custom Tools        │    │
│  │  Reasoning   │   │  MCP Servers         │    │
│  └──────────────┘   └──────────────────────┘    │
│          │                    │                 │
│          └────────┬───────────┘                 │
│                   │                             │
│          ┌────────▼───────┐                     │
│          │    Session     │                     │
│          │                │                     │
│          │ Durable Event  │                     │
│          │      Log       │                     │
│          └────────────────┘                     │
└─────────────────────────────────────────────────┘

Source: Architecture diagram based on Anthropic Engineering: Scaling Managed Agents

Component	Role	Technical Entity
Brain	Reasoning, planning, decision-making	Claude + harness logic
Hands	Execution, side effects, I/O	Containers, tools, MCP servers
Session	State persistence	Durable event log

The Harness Position Has Changed — Separated Outside the Container, Transparently Connected via `execute(name, input)`

In the traditional design, Claude, the harness, and the sandbox all coexisted within a single container. In Managed Agents, the harness is moved outside the container and communicates with the sandbox via the execute() interface Source.

The hands interface is kept simple and unified:

execute(name, input) → string

Any implementation of this interface — containers, custom tools, MCP servers, or Anthropic-provided tools — can be treated transparently as "hands" Source.
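
As a rough illustration of why one narrow interface is enough, the following Python sketch routes calls to different "hands" through the same method. Only the execute(name, input) → string shape comes from the article. The `Hand` protocol, `FunctionHand`, `HandRegistry`, and the two toy tools are hypothetical names invented for this example and are not Anthropic's API.

```python
from typing import Callable, Dict, Protocol


class Hand(Protocol):
    """Anything that can carry out a named action and return text."""

    def execute(self, name: str, input: dict) -> str: ...


class FunctionHand:
    """Hypothetical hand that wraps plain Python callables as tools."""

    def __init__(self, tools: Dict[str, Callable[[dict], str]]) -> None:
        self._tools = tools

    def execute(self, name: str, input: dict) -> str:
        if name not in self._tools:
            # Errors come back as text so the brain can reason about them.
            return f"error: unknown tool {name!r}"
        return self._tools[name](input)


class HandRegistry:
    """Routes execute() calls to whichever hand owns the tool name."""

    def __init__(self) -> None:
        self._owners: Dict[str, Hand] = {}

    def register(self, tool_name: str, hand: Hand) -> None:
        self._owners[tool_name] = hand

    def execute(self, name: str, input: dict) -> str:
        hand = self._owners.get(name)
        if hand is None:
            return f"error: no hand registered for {name!r}"
        return hand.execute(name, input)


# Two stand-ins: one plays a sandbox, the other a remote tool server.
sandbox = FunctionHand({"echo": lambda args: str(args["text"])})
remote = FunctionHand({"add": lambda args: str(args["a"] + args["b"])})

registry = HandRegistry()
registry.register("echo", sandbox)
registry.register("add", remote)

print(registry.execute("echo", {"text": "hello"}))
print(registry.execute("add", {"a": 2, "b": 3}))
```

The caller never learns which backend served a request. In this sketch, swapping a container for an MCP server would only change what is registered, not the calling code.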

Context Externalization via Session — Durable Log Without Compression, Enabling Rollback

The session functions as "a context object that exists outside the context window" Source. Via the getEvents() interface, Claude can:

Retrieve any slice of the event stream at an arbitrary position
Rewind to a specific point in time and reload from there
Re-examine relevant events before executing an action

Unlike traditional context compression (an irreversible operation that discards information), the session log retains all information durably. The harness controls how data is fitted into the context window, making future improvements straightforward.
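
The difference between a durable log and lossy compression can be sketched in a few lines of Python. This is a minimal in-memory model, assuming an append-only list stands in for durable storage. `SessionLog`, `Event`, `emit_event`, and `build_context` are hypothetical names for illustration, and `get_events` only loosely mirrors the getEvents() interface named in the article.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class Event:
    seq: int                  # position in the log, starting at 0
    kind: str                 # e.g. "user", "tool_call", "tool_result"
    payload: Dict[str, Any]


@dataclass
class SessionLog:
    """Append-only in-memory log; a real one would sit on durable storage."""

    events: List[Event] = field(default_factory=list)

    def emit_event(self, kind: str, payload: Dict[str, Any]) -> Event:
        event = Event(seq=len(self.events), kind=kind, payload=payload)
        self.events.append(event)
        return event

    def get_events(self, start: int = 0, end: Optional[int] = None) -> List[Event]:
        # Return any slice of the stream; nothing is ever deleted.
        return self.events[start:end]


def build_context(log: SessionLog, upto: Optional[int] = None, window: int = 3) -> List[str]:
    """Harness-side policy: pick which events go into the context window.

    `upto` rewinds to an earlier point; `window` keeps only the last few
    events before that point. The log itself is left untouched.
    """
    visible = log.get_events(0, upto)
    return [f"{e.seq}:{e.kind}" for e in visible[-window:]]


log = SessionLog()
for kind in ["user", "tool_call", "tool_result", "assistant", "tool_call"]:
    log.emit_event(kind, {})

print(build_context(log))            # most recent view
print(build_context(log, upto=3))    # rewound to the first three events
print(len(log.get_events()))         # full history is still available
```

Because `build_context` is a pure function over the log, the selection policy can be replaced later without touching stored history, which is the property the article attributes to keeping context management in the harness.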

Key to Decoupled Design 3: Shifting the Failure Model from "Pets" to "Cattle"

The Vulnerability of Single-Container Design — Total Session Loss on Failure, Requiring Manual Recovery

In early implementations, all components were colocated in a single container. If the container went down, the entire session was lost and manual recovery of the unresponsive container was required Source.

Fault Tolerance Through Separation — Restart via `wake(sessionId)`, State Restored from Log

By separating components, each element can fail and be replaced independently.

The harness becomes stateless → crashes can be recovered with wake(sessionId)
After restart, state is restored from the event log via getSession(id)
Container re-initialization is exposed as a standard tool, so a failed sandbox can be replaced in the same way as any other tool invocation

A system that once required manual "pet" management has transformed into one that can be automatically managed like "cattle" Source.
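
The recovery path can be sketched as follows, assuming a plain dictionary stands in for durable storage. The function names echo wake(sessionId) and getSession(id) from the article, but the signatures, the `Harness` class, and `emit_event` are assumptions made for this example rather than the actual API.

```python
from typing import Dict, List

# Stand-in for durable storage: session id -> ordered list of events.
# A real system would use a database or log service, not a dict.
EVENT_STORE: Dict[str, List[dict]] = {}


def emit_event(session_id: str, event: dict) -> None:
    EVENT_STORE.setdefault(session_id, []).append(event)


def get_session(session_id: str) -> List[dict]:
    # Return a copy so callers cannot mutate the stored history.
    return list(EVENT_STORE.get(session_id, []))


class Harness:
    """Holds no state of its own beyond what it rebuilds from the log."""

    def __init__(self, session_id: str, history: List[dict]) -> None:
        self.session_id = session_id
        self.history = history

    def step(self, text: str) -> None:
        event = {"type": "message", "text": text}
        emit_event(self.session_id, event)   # persist first
        self.history.append(event)           # then update the local view


def wake(session_id: str) -> Harness:
    """Start a fresh harness for an existing session."""
    return Harness(session_id, get_session(session_id))


first = wake("session-1")
first.step("clone the repo")
first.step("run the tests")
del first  # simulate the harness crashing

second = wake("session-1")  # a replacement harness reads the same log
print([e["text"] for e in second.history])
```

The replacement harness prints both earlier steps even though the first instance is gone. Writing to the log before updating local state is the design choice that makes this safe: anything the harness knew is, by construction, already in the log.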

Key to TTFT Reduction — ~60% at p50, Over 90% at p95

The most concrete quantitative result of separating brain and hands is the reduction in Time to First Token (TTFT) Source.

Percentile	Improvement
p50 TTFT	~60% reduction
p95 TTFT	Over 90% reduction

Source: Anthropic Engineering: Scaling Managed Agents

The reason for the improvement is straightforward. In the traditional design, reasoning could not begin until container initialization was complete. By separating the harness outside the container, reasoning can now begin immediately without waiting for container provisioning.
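
The ordering change behind this improvement can be shown with a small asyncio simulation. The sleep durations are arbitrary placeholders chosen only to make the ordering visible; they are not measurements and have no relation to the percentages above. All function names here are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, List

timeline: List[str] = []


async def provision_container() -> str:
    timeline.append("provision:start")
    await asyncio.sleep(0.05)   # stands in for slow container startup
    timeline.append("provision:done")
    return "container-1"


async def first_token() -> str:
    await asyncio.sleep(0.01)   # stands in for model latency
    timeline.append("first_token")
    return "Let me look at the repository..."


async def coupled() -> None:
    """Old layout: the harness lives in the container, so it waits for it."""
    await provision_container()
    await first_token()


async def decoupled(needs_tool: bool = True) -> None:
    """New layout: inference starts at once; the container is provisioned
    only when a tool call actually needs it."""
    await first_token()
    if needs_tool:
        await provision_container()


def run(scenario: Callable[[], Awaitable[None]]) -> List[str]:
    timeline.clear()
    asyncio.run(scenario())
    return list(timeline)


print(run(coupled))
print(run(decoupled))
```

In the coupled run, `first_token` appears last in the timeline; in the decoupled run it appears first. A session that never calls a tool would skip provisioning entirely in this sketch.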

Key to Decoupled Design 4: Two Patterns for Prompt Injection Defense — Resource Bundling and Vault

The logical separation of components also plays an important role in security. Anthropic uses two patterns to prevent credential leakage Source.

Pattern 1: Resource-Bundled Authentication — Token Used Only at Initialization, Then Accessed via Git Remote

Taking Git repository access as an example: the repository access token is used at sandbox initialization to clone the repository and is wired as a local git remote. Subsequent git operations from inside the sandbox can be executed without handling the token directly.

Pattern 2: Vault Authentication — Securely Obtain OAuth Tokens via a Dedicated Proxy

OAuth tokens for custom tools are stored in an external secure vault. A dedicated proxy receives a token tied to the session, retrieves the matching credentials from the vault, and makes the call to the external service, so the credentials themselves never enter the sandbox.

Through structural separation, even if a prompt injection attack succeeds inside the container, it cannot reach the credentials.
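
A minimal sketch of the vault pattern, assuming an in-memory dictionary as the vault and a placeholder function in place of a real authenticated request. `Vault`, `CredentialProxy`, `call_external_service`, and the sample token value are all hypothetical and exist only to show where the credential does and does not travel.

```python
from typing import Dict


class Vault:
    """Hypothetical secret store that lives outside the sandbox."""

    def __init__(self) -> None:
        self._secrets: Dict[str, str] = {}

    def put(self, session_id: str, service: str, token: str) -> None:
        self._secrets[f"{session_id}/{service}"] = token

    def get(self, session_id: str, service: str) -> str:
        return self._secrets[f"{session_id}/{service}"]


def call_external_service(service: str, token: str, request: str) -> str:
    # Placeholder for a real authenticated HTTP call.
    if not token:
        return "401 unauthorized"
    return f"{service} handled {request!r}"


class CredentialProxy:
    """Sits between the sandbox and external services.

    The sandbox passes only a session identifier; the proxy looks up the
    real credential and attaches it on the way out.
    """

    def __init__(self, vault: Vault) -> None:
        self._vault = vault

    def forward(self, session_id: str, service: str, request: str) -> str:
        token = self._vault.get(session_id, service)
        return call_external_service(service, token, request)


vault = Vault()
vault.put("session-1", "issue-tracker", "oauth-token-abc")
proxy = CredentialProxy(vault)

# Code running inside the sandbox sees only this call and its result.
result = proxy.forward("session-1", "issue-tracker", "list open issues")
print(result)
print("oauth-token-abc" in result)
```

The second print shows `False`: the token is used on the proxy side and never appears in anything returned to the caller, so injected instructions inside the sandbox have nothing to exfiltrate.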

Scaling Patterns — Many Brains (Shared Resources) and Many Hands (Multiple Execution Targets)

The design philosophy Anthropic makes explicit is: "Have strong opinions around the interface, but make no assumptions about the number or location of brains and hands" Source.

Many Brains — Multiple Harnesses Share a Common Sandbox, Tools, and Session

Multiple stateless harnesses share common resources. Containers are provisioned only when actually needed, optimizing the cost of parallel execution.

Brain 1 ──┐
Brain 2 ──┼── Shared Sandbox / Tools / Session
Brain N ──┘
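
One way to picture this pattern is a pool of stateless workers that share a lazily provisioned sandbox and a common log. The sketch below uses threads purely for illustration; `SharedSandbox`, `SharedSession`, and `brain` are hypothetical names, and a real deployment would use separate processes or services rather than threads in one interpreter.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import List


class SharedSandbox:
    """Provisioned at most once, and only when some brain first needs it."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._ready = False
        self.provision_count = 0

    def execute(self, name: str, input: dict) -> str:
        with self._lock:
            if not self._ready:
                self.provision_count += 1   # stands in for container startup
                self._ready = True
        return f"{name} ok for job {input['job']}"


class SharedSession:
    """Thread-safe append-only log shared by every brain."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.events: List[str] = []

    def emit_event(self, text: str) -> None:
        with self._lock:
            self.events.append(text)


def brain(brain_id: int, sandbox: SharedSandbox, session: SharedSession) -> None:
    # A stateless worker: everything it needs arrives as arguments.
    result = sandbox.execute("run_tests", {"job": brain_id})
    session.emit_event(f"brain {brain_id}: {result}")


sandbox = SharedSandbox()
session = SharedSession()
with ThreadPoolExecutor(max_workers=4) as pool:
    for i in range(4):
        pool.submit(brain, i, sandbox, session)

print(sandbox.provision_count)
print(sorted(session.events))
```

Four brains run, yet the sandbox is provisioned once, and had no brain issued a tool call it would not have been provisioned at all.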

Many Hands — A Single Brain Works Across Heterogeneous Execution Environments (Containers / External Services / MCP)

A single brain assigns work across heterogeneous execution environments. Claude reasons about and selects the appropriate execution target.

                      ┌── Container A
Brain (Claude) ──────┼── Container B
                      ├── External Service
                      └── MCP Server

Differences from Traditional Agent Implementations — Managed Agents Wins on 6 Dimensions

The table below compares the two approaches across six dimensions.

Dimension	Traditional Implementation	Managed Agents
Harness location	Colocated inside container	Separated outside container
Behavior on failure	Entire session is lost	Recoverable from session log
TTFT	Delayed by container initialization	Reasoning starts immediately
Scaling	Manual management, tightly coupled	Independent scale-out
Credential management	Tends to be mixed into harness	Isolated via vault / resource bundling
Model update compatibility	Harness assumptions must be revisited	Minimal impact thanks to stable interface

Practical Application — When to Choose Managed Agents

4 Use Cases Where Managed Agents Excel — Long-Running / Parallel / High-Security / Model Update Tracking

Managed Agents is a strong fit in the following four cases.

Long-running, multi-step agent tasks: Session externalization allows tasks spanning hours or days to continue without losing state
Large-scale parallel agent execution: The Many Brains pattern enables efficient parallel processing of many jobs
Environments with strict security requirements: Structural separation of credentials serves as a defense against prompt injection
Cases where model updates need to be tracked: Loose coupling between the harness and the model minimizes harness changes when the model is updated

Cases That Warrant Consideration — Short Tasks / Strong Dependency on Existing Harness

The following cases call for closer evaluation before adopting it.

Short-duration, simple tasks: The architectural complexity may introduce overhead
Cases with strong dependency on an existing custom harness: Adaptation work to fit the execute() interface will be required

Sources (Primary Information)

The primary sources directly referenced in writing this article are listed below. Always verify the latest accurate information at each link.

Anthropic Engineering: Scaling Managed Agents — Decoupling the Brain from the Hands — Authors: Lance Martin, Gabe Cemaj, Michael Cohen
Source — Official documentation (Managed Agents-related pages published on a rolling basis)
