What Is Claude Sonnet 4.6 | Specs, Capabilities, and Differences from Opus

Clauder Navi 編集部 / 最終更新 2026-06-18

Article Summary by AI Chatpowered by Claude

結論powered by Claude

Claude Sonnet 4.6 is Anthropic's core model, defined as "the best combination of speed and intelligence." The model ID is claude-sonnet-4-6, the context window is 1M tokens (approximately 3.4 million Unicode characters), and the maximum output is 64K tokens (up to 300K with the Message Batches API and a beta header). It supports both Extended Thinking and Adaptive Thinking.

API pricing is $3/MTok for input and $15/MTok for output — 60% of the cost of the higher-tier Opus 4.8 ($5/$25). It is available on four platforms: the Claude API, AWS Bedrock, Google Vertex AI, and Microsoft Foundry, making it the go-to model for production workloads.

Among the current standard three models (Opus 4.8 / Sonnet 4.6 / Haiku 4.5), Sonnet 4.6 is the only one that supports both Extended Thinking and Adaptive Thinking simultaneously.

目次 (8)

What Is Claude Sonnet 4.6 — Official Position and Model ID
Key Specs at a Glance
The Difference Between Extended Thinking and Adaptive Thinking
What Changes with a 1M Token Context Window
Comparison with Opus 4.8 and Haiku 4.5
Batch API and the 300K Output Beta
Available Platforms
When to Choose Sonnet 4.6 and Estimated Pricing

What Is Claude Sonnet 4.6 — Official Position and Model ID

Anthropic's official documentation defines Claude Sonnet 4.6 as "the best combination of speed and intelligence." While the Opus series handles complex reasoning and advanced autonomous coding, Sonnet is positioned to process production workloads quickly and with high precision.

The model ID to use in the API is claude-sonnet-4-6. Starting with the Claude 4.6 generation, alias-style IDs without date suffixes have been adopted, representing a fixed version snapshot. This is not a pointer that automatically switches to the next-generation version, so you can maintain consistent behavior without changing the model ID.

Source: Anthropic Models overview

Key Specs at a Glance

Below is a summary of the Sonnet 4.6 specifications listed in the official documentation.

Item	Claude Sonnet 4.6
Model ID (Claude API)	`claude-sonnet-4-6`
Context Window	1M tokens (approx. 750,000 words / approx. 3.4 million Unicode characters)
Max Output (standard)	64,000 tokens
Max Output (Batch API + beta header)	300,000 tokens
Reliable Knowledge Cutoff	August 2025
Training Data Cutoff	January 2026
Response Speed	Fast
Extended Thinking	Supported
Adaptive Thinking	Supported
Priority Tier	Supported

The 1M token context window equates to approximately 3.4 million Unicode characters. For Japanese text, where one character is roughly 1–1.5 tokens, this means you can practically process around 670,000 to 1,000,000 Japanese characters in a single prompt. Given that a typical paperback book runs about 100,000–150,000 characters, you can fit the equivalent of 7–10 novels into a single context.

The Difference Between Extended Thinking and Adaptive Thinking

Sonnet 4.6 supports two "thinking modes." Both relate to reasoning accuracy, but they work differently.

Extended Thinking is a mode in which the model explicitly outputs its reasoning process before generating an answer. It is activated via the API by enabling the thinking parameter. It is effective in situations that require transparency in the solution process — such as mathematical proofs, code debugging, and multi-step logical reasoning.

Adaptive Thinking is a feature that automatically adjusts the amount of internal deliberation based on the difficulty of the problem. It answers simple questions quickly, and automatically takes more time to think deeply about complex problems. No explicit activation is required from the user; it operates in an always-on state.

The feature support status for the current standard three models is as follows:

Model	Extended Thinking	Adaptive Thinking
Opus 4.8	No	Yes
Sonnet 4.6	Yes	Yes
Haiku 4.5	Yes	No

As the table shows, Sonnet 4.6 is the only standard model that supports both features simultaneously. If you need Extended Thinking but want to keep costs lower than Opus 4.8, Sonnet 4.6 is the clear choice.

What Changes with a 1M Token Context Window

Previous generations up to Sonnet 4.5 had a context window of 200K tokens. Sonnet 4.6's 1M tokens is 5 times the size of the previous generation.

Practical scenarios where this expansion makes a direct difference include:

Passing a large codebase (the combined total of multiple files) in a single prompt for bug investigation or specification review
Including hundreds of pages of PDFs, legal documents, or technical specifications all at once in the context
Continuing follow-up questions while retaining a long conversation history
Testing implementations that include large volumes of documents directly in the context without using RAG
Passing diffs from multiple files simultaneously for comprehensive code review

Note that Sonnet 4.6's 1M token support is separate from the beta header context-1m-2025-08-07 for Sonnet 4.5, which was deprecated on April 30, 2026. With Sonnet 4.6, 1M tokens is available as a standard feature without needing a beta header.

Comparison with Opus 4.8 and Haiku 4.5

Here is a side-by-side comparison of the three models' specs.

Comparison Item	Opus 4.8	Sonnet 4.6	Haiku 4.5
Recommended Use	Complex reasoning, advanced autonomous tasks	General production workloads	High-speed processing, cost-focused
Context Window	1M tokens	1M tokens	200K tokens
Max Output (standard)	128K tokens	64K tokens	64K tokens
Extended Thinking	No	Yes	Yes
Adaptive Thinking	Yes	Yes	No
Response Speed	Moderate	Fast	Fastest
API Pricing (input)	$5/MTok	$3/MTok	$1/MTok
API Pricing (output)	$25/MTok	$15/MTok	$5/MTok

Sonnet 4.6's API pricing is 60% of Opus 4.8's. However, Opus 4.8's maximum output is 128K tokens (twice that of Sonnet 4.6). Opus 4.8 has an advantage for generating extremely long text in a single response or for complex tasks requiring frontier-level accuracy.

Haiku 4.5 is even cheaper and faster, but its context window is limited to 200K tokens and it does not support Adaptive Thinking. It is suited for simple classification, summarization, and short conversations where speed and cost are the top priorities.

Source: Anthropic Models overview

Batch API and the 300K Output Beta

With the standard Messages API, Sonnet 4.6's maximum output is 64K tokens. However, by specifying the beta header output-300k-2026-03-24 in the Message Batches API, you can achieve a maximum output of 300K tokens. This is useful for batch-processing large volumes of long reports or detailed code generation.

The Batch API also offers a pricing discount — 50% off the standard rate (input $1.5/MTok, output $7.5/MTok). For large-scale processing tasks where an immediate response is not required, the Batch API is the most cost-effective option.

Additionally, combining with prompt caching can reduce costs further. When a cache read hits, the price drops to $0.30/MTok (10% of the standard rate). Batch API and prompt caching discounts stack, so batch processing that repeatedly sends the same system prompt can result in a substantial reduction in effective costs.

Available Platforms

Sonnet 4.6 is available on the following four platforms.

Platform	Model ID
Claude API	`claude-sonnet-4-6`
AWS Bedrock	`anthropic.claude-sonnet-4-6`
Google Vertex AI	`claude-sonnet-4-6`
Microsoft Foundry	`claude-sonnet-4-6`

On AWS Bedrock, you can choose between a global endpoint (dynamic routing for maximum availability) and a regional endpoint (guaranteed data routing to a specific region). If compliance requirements restrict where your data can travel, choose the regional endpoint.

On Google Vertex AI, three options are available: global, multi-region, and regional. For details on using it with Vertex AI, see "Using Claude 4.6 on Vertex AI."

On Microsoft Foundry, the same model ID as the Claude API is used. Note, however, that on Foundry, Opus 4.8's context window is limited to 200K tokens (no specific restriction is documented for Sonnet 4.6).

When to Choose Sonnet 4.6 and Estimated Pricing

Based on Anthropic's recommendations, here is a guide for choosing the right model.

When Sonnet 4.6 is the best fit: Integrating into a production API, large-scale processing, tasks where balancing speed and accuracy matters, situations where you need a 1M token context and also want Extended Thinking
When to choose Opus 4.8: Complex reasoning requiring frontier-level accuracy, generating text exceeding 128K tokens, advanced autonomous tasks
When to choose Haiku 4.5: Simple classification, summarization, or short conversations; top priority on speed or cost; tasks that fit within a 200K context

The base API pricing is $3/MTok for input and $15/MTok for output (USD, excluding tax). It is also accessible through Claude.ai subscriptions (Pro at $20/month, Max from $100/month, Team at $25/seat/month). For pricing details and a comparison of each plan, see "How Much Does Claude Sonnet Cost."

参考になったら ♡

この記事は役立ちましたか?

ご注意: Clauder Navi は Anthropic 公式情報を直接参照し正確な内容に努めておりますが、本記事の内容に基づく投資判断・契約・利用結果による損害について責任を負いかねます。重要な意思決定の際は、必ず Anthropic 公式・ claude.com の一次情報をご自身でご確認ください。