Sonnet 4.5's 1M Token Beta Ends | Complete Migration Guide to Sonnet 4.6
What was deprecated is the Sonnet 4.5 beta header (context-1m-2025-08-07), not Sonnet 4.6 itself. Sonnet 4.6 is the successor model with 1M token support built in as a standard feature. This guide is for developers who can no longer use 1M tokens with Sonnet 4.5, and walks through the minimal steps needed to migrate to Sonnet 4.6. It covers how to identify affected code, along with Python and TypeScript rewrite examples — presented in an order that lets you complete the migration in minutes while avoiding production downtime.
On April 30, 2026, the context-1m-2025-08-07 beta header was deprecated, making 1M token usage unavailable for Sonnet 4.5 and Sonnet 4. Sonnet 4.6 natively supports 1M tokens, so migrating to it lets you continue large-scale context processing without any header.
The core of the migration is just two things: changing the model ID to claude-sonnet-4-6 and removing the anthropic-beta header. No changes to prompts or logic are needed — both Python and TypeScript require only a few line replacements.
To identify affected areas, run grep -r "context-1m" and grep -r "sonnet-4-5" across your source code, .env files, and YAML/JSON configuration. Check both hardcoded values and environment variables. It is safest to validate with prompts exceeding 200k tokens in staging before applying to production.
Table of Contents
- What Was Deprecated — How Beta Headers Work and Which Models Are Affected
- How to Find Code That Will Break Today — 4 Steps to Identify Impact
- String Search Across the Codebase
- Checking Environment Variables and Configuration Files
- What to Look for in Error Logs
- Verification Steps in Staging
- Migration Steps to Sonnet 4.6 / Opus 4.8 — Minimal Changes to Get Running
- Changing the Model Name
- Handling the Beta Header
- Checking the SDK Version
- Changes in Pricing and Performance — What Changes and What Improves After Migrating to Sonnet 4.6
- Pricing Comparison
- What It Means to Go from "Beta" to "Standard"
- Performance Changes
- Model Selection Guidelines for Cost Optimization
- Expanded Use Cases After Migration — A World Where 1M Tokens Are Standard
- Bulk Review of Large Codebases
- Summarization and Q&A Over Lengthy Documents
- Refactoring Assistance Across Multiple Files
- Tips for System Design That Leverages 1M Tokens
- Sources
What Was Deprecated — How Beta Headers Work and Which Models Are Affected
The Claude API has a mechanism called "beta headers" that enables experimental features. By specifying a particular string in the anthropic-beta header, developers can opt into features before their official release.
The value that was deprecated is context-1m-2025-08-07. Including this header extended the context window — normally capped at 200k tokens — up to a maximum of 1 million tokens. During the beta period, many implementations relied on this approach for large-scale context usage, and it was widely used for use cases such as processing lengthy documents and analyzing entire codebases in one pass.
The two models affected by the deprecation are:
- claude-sonnet-4-5 (and its aliases)
- claude-sonnet-4 (and its aliases)
Anthropic announced the deprecation on March 30, 2026 (see the release notes), giving about one month's notice. However, many teams were not adequately informed, and some systems continued sending requests with the header even after the April 30 deprecation date.
After deprecation, the header is simply invalidated: even if a request includes it, the API ignores it and the context window reverts to 200k tokens. Prompts exceeding 200k tokens are rejected with an error, so the entire request fails.
How to Find Code That Will Break Today — 4 Steps to Identify Impact
First, determine whether your codebase is affected. You can detect all occurrences of the context-1m-2025-08-07 header with a string search across the codebase, but if model names are managed via environment variables, you will also need to check configuration files separately. Since errors may already be occurring following the April 30 deprecation, it is also recommended to review logs and run staging validation. Follow the four steps below in order to understand the scope of impact before applying changes to production.
String Search Across the Codebase
Run the following commands in your terminal to check for beta header references:
grep -r "context-1m" . --include="*.py" --include="*.ts" \
--include="*.js" --include="*.rb" --include="*.go"
grep -r "anthropic-beta" . --include="*.py" --include="*.ts" \
--include="*.js" -l
If the files that match also use model names like claude-sonnet-4-5 or claude-sonnet-4, those implementations need to be migrated.
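The two checks above (beta header present, old model name present) can also be combined into one pass. The following is a minimal sketch rather than official tooling: the function name `find_affected_files`, the extension list, and the regular expression for old model IDs are all illustrative assumptions to adapt to your repository.

```python
import re
from pathlib import Path

BETA_HEADER = "context-1m-2025-08-07"

# Assumed pattern: matches claude-sonnet-4 and claude-sonnet-4-5, optionally
# followed by an 8-digit date suffix, but not claude-sonnet-4-6.
OLD_MODEL = re.compile(r"claude-sonnet-4(?:-5)?(?:-\d{8})?(?![\w-])")


def find_affected_files(root, extensions=(".py", ".ts", ".js", ".rb", ".go")):
    """Return relative paths of files that contain both the beta header and an old model ID."""
    affected = []
    for path in Path(root).rglob("*"):
        if not path.is_file() or path.suffix not in extensions:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        if BETA_HEADER in text and OLD_MODEL.search(text):
            affected.append(path.relative_to(root).as_posix())
    return sorted(affected)
```

Calling `find_affected_files(".")` from the repository root returns the files that need both the model ID and the header changed. Files where the header and the model name live in different places (for example, a model ID read from configuration) will not show up, which is why the configuration check in the next step still matters.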
Checking Environment Variables and Configuration Files
If model names are managed via environment variables rather than hardcoded, check your .env files and configuration management services as well:
grep -r "CLAUDE_MODEL\|ANTHROPIC_MODEL\|MODEL_ID" . \
  --include="*.env" --include=".env*" --include="*.yaml" \
  --include="*.yml" --include="*.json"
If any values contain sonnet-4-5 or sonnet-4, they need to be updated. Kubernetes and Docker Compose environment variable configurations are easy to overlook, so check those as well.
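Values injected at runtime (for example by Kubernetes or Docker Compose) never appear in a file that grep can see, so a check inside the running process can help. This is a minimal sketch under assumptions: the helper name `find_legacy_model_settings` and the matching pattern are hypothetical, and the function simply inspects whatever mapping it is given (the process environment by default).

```python
import os
import re

# Assumed pattern: sonnet-4 or sonnet-4-5, optionally with an 8-digit date
# suffix. sonnet-4-6 is deliberately not matched.
LEGACY_MODEL = re.compile(r"sonnet-4(?:-5)?(?:-\d{8})?(?![\w-])")


def find_legacy_model_settings(env=None):
    """Return {name: value} for every variable whose value names an old model."""
    env = os.environ if env is None else env
    return {name: value for name, value in env.items() if LEGACY_MODEL.search(value)}
```

Logging the result of `find_legacy_model_settings()` once at service startup in staging is usually enough to reveal a stale value that no configuration file in the repository contains.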
What to Look for in Error Logs
If errors are already occurring, logs should contain messages like the following:
- Status code: 400 Bad Request
- Error type: invalid_request_error
- Example message: prompt is too long or context window exceeded
Even for requests within 200k tokens, logic that depends on the header may behave unexpectedly. Also check for any warnings related to anthropic-beta in the logs.
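A small filter can pull these signatures out of a log file. The sketch below assumes plain-text logs with one entry per line in which the error type and message appear verbatim; your logging format may well differ, and the function name `find_context_errors` is illustrative.

```python
import re

# Message fragments listed above, matched case-insensitively.
SIGNATURES = [
    re.compile(r"prompt is too long", re.IGNORECASE),
    re.compile(r"context window exceeded", re.IGNORECASE),
]


def find_context_errors(lines):
    """Yield (line_number, line) for entries that look like a context-limit error."""
    for number, line in enumerate(lines, start=1):
        if "invalid_request_error" not in line:
            continue
        if any(signature.search(line) for signature in SIGNATURES):
            yield number, line.rstrip("\n")
```

Because the function accepts any iterable of lines, it can be fed an open file handle directly, for example `list(find_context_errors(open("app.log", encoding="utf-8")))`, where `app.log` stands in for your own log path.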
Verification Steps in Staging
Before applying to production, validate behavior in staging by intentionally sending a prompt exceeding 200k tokens and checking the response. If an error is returned, the old model is still in use. After migration, confirm that the same prompt is processed successfully, and you are done.
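The staging check described above might be sketched as follows. Several things here are assumptions: the helper names are hypothetical, the four-characters-per-token ratio is only a rough heuristic (repetitive filler often tokenizes more efficiently, hence the generous default), and `client` is expected to be an `anthropic.Anthropic()` instance that you create yourself.

```python
def build_oversized_prompt(min_tokens=400_000, chars_per_token=4):
    """Build filler text intended to exceed 200k tokens under a rough chars-per-token estimate."""
    sentence = "Staging check filler sentence for long-context validation. "
    repeats = (min_tokens * chars_per_token) // len(sentence) + 1
    return sentence * repeats + "\nReply with the single word OK."


def long_context_works(client, model):
    """Return True if the model accepts the oversized prompt, False if the call raises."""
    try:
        client.messages.create(
            model=model,
            max_tokens=16,
            messages=[{"role": "user", "content": build_oversized_prompt()}],
        )
        return True
    except Exception as exc:  # the SDK raises anthropic.BadRequestError on HTTP 400
        print(f"{model}: request failed: {exc}")
        return False
```

Running `long_context_works(client, "claude-sonnet-4-6")` in staging should return True once the migration is in place. If the result is unexpected, it is worth confirming the real size of the filler prompt with the API's token counting before drawing conclusions, since the character-based estimate is approximate.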
Migration Steps to Sonnet 4.6 / Opus 4.8 — Minimal Changes to Get Running
The migration itself is straightforward — it is complete once you update the model specification and remove the beta header. Backward compatibility with the old models (Sonnet 4.5 / Sonnet 4) is maintained at the API level, so no changes to existing logic or prompt design are necessary. Addressing three things — verifying the SDK version, replacing the model name, and removing the header — typically completes migration within minutes. The steps are explained below with code examples for both Python and TypeScript.
Changing the Model Name
The code changes are minimal. Simply replace the value of the model parameter with the new model ID and remove the beta header specified in extra_headers or headers. No changes to prompts or logic are needed.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Before (Sonnet 4.5 + beta header)
response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=8192,
    extra_headers={"anthropic-beta": "context-1m-2025-08-07"},
    messages=[{"role": "user", "content": very_long_prompt}],
)

# After (Sonnet 4.6, no header needed)
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=8192,
    messages=[{"role": "user", "content": very_long_prompt}],
)
The same applies for TypeScript / Node.js:
// Before: the beta header is a per-request option, passed as the second argument
const response = await client.messages.create(
  {
    model: "claude-sonnet-4-5",
    max_tokens: 8192,
    messages: [{ role: "user", content: veryLongPrompt }],
  },
  { headers: { "anthropic-beta": "context-1m-2025-08-07" } },
);

// After: new model ID, no header
const response = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 8192,
  messages: [{ role: "user", content: veryLongPrompt }],
});
Handling the Beta Header
For Sonnet 4.6 and later, the context-1m-2025-08-07 header is not required. Sending it at this point will still work, but since it may be treated as an invalid header subject to warnings or errors in the future, removing it is recommended.
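One caveat when removing the header: `anthropic-beta` can carry several comma-separated values, so code that also opts into other beta features should drop only the deprecated flag rather than the whole header. A minimal sketch, with a hypothetical helper name:

```python
DEPRECATED_BETA = "context-1m-2025-08-07"


def strip_deprecated_beta(headers):
    """Return a copy of headers without the deprecated flag; omit the header if nothing is left."""
    cleaned = {}
    for name, value in headers.items():
        if name.lower() != "anthropic-beta":
            cleaned[name] = value
            continue
        flags = [flag.strip() for flag in value.split(",")]
        kept = [flag for flag in flags if flag and flag != DEPRECATED_BETA]
        if kept:
            cleaned[name] = ",".join(kept)
    return cleaned
```

The result can be passed as `extra_headers` in the Python SDK. When the deprecated flag was the only value, the function returns a dictionary without the `anthropic-beta` key at all.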
Checking the SDK Version
Python SDK (anthropic) version 0.20.0 or later, and Node.js SDK (@anthropic-ai/sdk) version 0.18.0 or later, officially support models that handle large-scale context without additional headers. If you are on an older version, update first:
# Python
pip install --upgrade anthropic
# Node.js
npm install @anthropic-ai/sdk@latest
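To check the installed Python SDK against the minimum stated above before upgrading, a short script using only the standard library is enough. This is a sketch: the helper names are hypothetical, the threshold is simply the 0.20.0 figure from this guide, and the version parsing is deliberately simple (it ignores suffixes such as `rc1`).

```python
from importlib.metadata import PackageNotFoundError, version

MIN_PYTHON_SDK = (0, 20, 0)  # minimum version stated in this guide


def parse_version(text):
    """Turn '0.39.0' into (0, 39, 0); non-numeric suffixes are ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def sdk_is_new_enough(installed=None):
    """Return True if the anthropic package (or the given version string) meets the minimum."""
    if installed is None:
        try:
            installed = version("anthropic")
        except PackageNotFoundError:
            return False
    return parse_version(installed) >= MIN_PYTHON_SDK
```

Called with no arguments, `sdk_is_new_enough()` inspects the environment it runs in, so it should be executed with the same interpreter and virtual environment as the service itself.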
Model selection criteria: Choose claude-sonnet-4-6 if you prioritize cost and speed over accuracy; choose claude-opus-4-8 if you want to maximize reasoning quality over large contexts.
Changes in Pricing and Performance — What Changes and What Improves After Migrating to Sonnet 4.6
The first concern after migrating is changes in cost and quality. Sonnet 4.6 maintains the same pricing structure as Sonnet 4.5 while offering 1M tokens as a standard feature. In other words, you get more stable large-scale context processing at the same price as during the beta period. Instruction-following accuracy over long contexts has also improved, meaning migration offers benefits not just in cost but in quality as well. The following covers model-by-model pricing comparisons, performance changes, and guidelines for cost optimization.
Pricing Comparison
| Model | Input token price | Output token price | Context |
|---|---|---|---|
| Claude Sonnet 4.5 (old) | $3 / 1M tokens | $15 / 1M tokens | 200k (1M during beta) |
| Claude Sonnet 4.6 (new) | $3 / 1M tokens | $15 / 1M tokens | 1M (standard) |
| Claude Opus 4.8 (new) | $15 / 1M tokens | $75 / 1M tokens | 1M (standard) |
As the table shows, Sonnet 4.6 is priced identically to Sonnet 4.5 for both input and output tokens. The only difference is that the 1M-token context is now a standard feature rather than a beta opt-in.
What It Means to Go from "Beta" to "Standard"
Beta features carry the risk of being changed or deprecated without notice — this deprecation is exactly that kind of case. Starting with Sonnet 4.6, 1M tokens are provided as a standard specification, eliminating the risk of production environments going down due to sudden header deprecations. Because it falls within the scope of SLA and support coverage, reliability for production use has improved significantly.
Performance Changes
Sonnet 4.6 shows improved instruction-following accuracy over long contexts compared to Sonnet 4.5. In particular, document retrieval accuracy for inputs exceeding 500k tokens and the quality of answers combining multiple sources have improved. Latency remains at a comparable level, so performance degradation after migration is not typically expected.
Model Selection Guidelines for Cost Optimization
- Sonnet 4.6 recommended: High-volume batch processing, cost-sensitive use cases, direct migrations from Sonnet 4.5
- Opus 4.8 recommended: Tasks requiring complex reasoning and analysis, scenarios where accuracy directly impacts revenue, cases where deep understanding utilizing the full 1M token context is needed
Since the price difference is approximately 5x, a practical approach is to first evaluate quality with Sonnet 4.6, then switch to Opus 4.8 if the results are insufficient.
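To make the price gap concrete for a single large request, the arithmetic can be sketched as below. The prices are copied from the table above; the function name is hypothetical, and the estimate ignores prompt caching and any pricing adjustments not listed in that table.

```python
# USD per 1M tokens, copied from the pricing table above.
PRICES = {
    "claude-sonnet-4-6": {"input": 3.0, "output": 15.0},
    "claude-opus-4-8": {"input": 15.0, "output": 75.0},
}


def estimate_cost(model, input_tokens, output_tokens):
    """Return the estimated cost in USD of one request at list price."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000
```

For a request with 800,000 input tokens and 4,000 output tokens, this gives $2.46 on Sonnet 4.6 and $12.30 on Opus 4.8, so the input side dominates the bill for long-context workloads.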
Expanded Use Cases After Migration — A World Where 1M Tokens Are Standard
With 1M tokens becoming "standard" rather than "beta," large-scale processing that previously required accepting risk can now be used at production quality. Implementations that intentionally limited context size out of fear of sudden header deprecations can now be expanded freely with Sonnet 4.6. Use cases like bulk review of large codebases, Q&A over lengthy documents, and refactoring assistance spanning multiple files — things that were "desired but risky" — can now be utilized as standard features.
Bulk Review of Large Codebases
You can pass a codebase of tens of thousands of lines all at once and detect dependency issues, security concerns, and design inconsistencies in bulk. No preprocessing like chunk splitting or vector search is needed — just pass the files directly as text. Since the model retains the full context of the codebase, it can accurately identify issues that span multiple files.
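import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Load the whole codebase, pre-concatenated into a single text file
with open("entire_codebase.txt", "r", encoding="utf-8") as f:
    code_content = f.read()

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    messages=[{
        "role": "user",
        "content": "Review the following codebase and "
                   "list any security concerns:\n\n"
                   f"{code_content}",
    }],
)
print(response.content[0].text)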
Summarization and Q&A Over Lengthy Documents
You can feed hundreds of pages of specifications, legal documents, or research papers all at once and perform cross-document searches for content matching specific conditions. Previously this required splitting content into chunks for separate processing, but with 1M tokens you can get answers while preserving the full document context.
Refactoring Assistance Across Multiple Files
Even when functions, classes, and configuration files are distributed across multiple modules, you can pack all file contents into a single context and get consistent answers to questions like "How do I replace all calls to this API with the new interface?" — across all occurrences.
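Packing files into a single context works best when each file is clearly labeled, so the model can refer to locations by path. One possible way to build such a prompt is sketched below; the function name and the `=== FILE: ... ===` marker format are assumptions, not a convention the API requires.

```python
from pathlib import Path


def pack_files(root, patterns=("*.py",)):
    """Concatenate matching files into one string, each preceded by a path marker."""
    root = Path(root)
    paths = sorted(
        {path for pattern in patterns for path in root.rglob(pattern) if path.is_file()}
    )
    sections = []
    for path in paths:
        relative = path.relative_to(root).as_posix()
        body = path.read_text(encoding="utf-8", errors="ignore")
        sections.append(f"=== FILE: {relative} ===\n{body}")
    return "\n\n".join(sections)
```

The returned string can be placed in the user message ahead of the refactoring question. Sorting the paths keeps the prompt identical between runs as long as the files have not changed, which also helps if the prefix is later cached.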
Tips for System Design That Leverages 1M Tokens
To make the most of large-scale context, combining it with prompt caching is important. Caching system prompts and reference documents that change infrequently can dramatically reduce costs. Sonnet 4.6 fully supports prompt caching, and maximizing the cache hit rate for repeatedly used context is the key to long-term cost management.
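A minimal sketch of what this could look like with the Messages API: the large, rarely changing reference text goes into a system block marked with `cache_control`, and only the question varies between calls. The helper name `build_cached_request` and the `max_tokens` value are illustrative assumptions.

```python
def build_cached_request(reference_text, question, model="claude-sonnet-4-6"):
    """Build keyword arguments for client.messages.create with the reference text marked cacheable."""
    return {
        "model": model,
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": reference_text,
                # Marks the prompt prefix up to and including this block as cacheable.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": question}],
    }
```

With a client created via `anthropic.Anthropic()`, the request is sent as `client.messages.create(**build_cached_request(document, "Which sections mention retention periods?"))`. The `usage` object on the response reports cache reads and writes, which is a practical way to confirm that repeated calls are actually hitting the cache.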
Anthropic's release notes remain the best place to follow further changes related to the beta header deprecation. Model access policy changes across vendors will also continue to be worth watching.