How to Use Claude's 1 Million Token Context | Pricing, Practical Limits, and Use Cases

Claude's 1 million token (1M context window) support became a standard offering without any additional charge starting in March 2026 — a major update that dramatically expands what the model can handle in a single conversation. Where previous limits topped out at tens of thousands of tokens, you can now work with more than an entire book's worth of content at once. This article walks through how the 1M context works, practical use cases, supported plans, and pricing.

Article Summary by AI Chatpowered by Claude
結論powered by Claude

On March 13, 2026, Anthropic launched the 1 million token (1M context) window as a standard feature across Claude Opus 4.6 and Sonnet 4.6. The previous surcharge for long-context usage was eliminated, and a flat rate now applies regardless of context length. Each request can process up to 600 images or PDF pages.

As a practical reference, 1 million tokens corresponds to roughly 400,000–500,000 Japanese characters or about 750,000 words in English. This makes it realistic to load an entire book (300,000–400,000 characters) and ask questions about it, or to analyze hundreds of PDF pages in a single pass.

Users on Max, Team, and Enterprise plans can take advantage of large contexts on Claude.ai without any special configuration, and Claude Code also ships with the 1M context as a default. Key considerations when using large contexts are the increased response latency proportional to input size, and managing costs through prompt caching.

目次 (12)

What Is Claude's 1 Million Token (1M Context) Window?

A "token" is the smallest unit an AI uses to process language. In Japanese, one character is roughly 1–2 tokens, so 1 million tokens equals approximately 400,000–500,000 Japanese characters or about 750,000 English words. A typical book runs 300,000–400,000 characters, which means 1M tokens gives you the equivalent of an entire thick technical manual loaded into memory at once.

The "context window" refers to the span of text an AI can reference within a single conversation. The wider this window, the better the model can recall earlier exchanges and attached documents without losing track. Before the 1M context window arrived, Claude's upper limit was around 200,000 tokens — the new limit represents a fivefold increase.

Compared to GPT-4o's 128,000-token ceiling, Claude's 1M context window holds roughly 7.8 times as much. The practical impact goes beyond raw character count: the density of information that can be processed in a single instruction grows substantially, which has a meaningful effect on real-world workflows.

Supported Models and Pricing — How the No-Surcharge Model Works

On March 13, 2026, Anthropic made the 1 million token context window generally available for Claude Opus 4.6 and Claude Sonnet 4.6 (CodeZine report). The most significant change was the elimination of the previous long-context surcharge, replaced by a flat rate that applies regardless of context length.

Model Input Price (/1M tokens) Output Price (/1M tokens)
Claude Opus 4.6 $5.00 $25.00
Claude Sonnet 4.6 $3.00 $15.00

Supported platforms include the Claude Platform (API), Microsoft Azure Foundry, and Google Cloud Vertex AI. Claude Opus 4.8, released in May 2026, also defaults to the 1M context window — a trend toward larger contexts becoming standard with each new model release.

Note: Previously, Sonnet 4.5 supported 1M tokens in beta via the anthropic-beta: context-1m-2025-08-07 header. That header was retired on April 30, 2026. Current models (Sonnet 4.6 and later) support the 1M context natively without any header.

What You Can Do with 1 Million Tokens — Practical Use Cases

Here are the most common real-world scenarios where a large context window pays off.

Bulk Analysis of Large Document Sets

You can load multiple PDFs or contracts at once and ask cross-document questions. For example, "Compare these three contracts and tell me which has the most favorable termination terms" can be answered in a single prompt. The ability to process up to 600 images or PDF pages per request also dramatically speeds up document-heavy workflows.

Reviewing Large Codebases

Passing thousands of lines of source code in a single prompt to identify bugs or design issues becomes straightforward. You no longer need to split files and re-explain context repeatedly — cross-file consistency checks can be requested all at once.

Generating and Editing Long-Form Content

You can hand over an entire book manuscript, academic paper, or lengthy report for style unification or rewriting. Checking for contradictions between Chapter 1 and Chapter 5, or ensuring consistent terminology throughout, can be done in a single instruction.

Maintaining Continuity in Long-Running Projects

You no longer need to re-explain project context at the start of each session. With hundreds of previous turns retained in memory, complex requirements discussions and design conversations can continue without the gradual "context degradation" that shorter windows cause.

Eligible Plans and How to Use It on Claude.ai

On the Claude.ai web interface, the 1 million token context is available to Max, Team, and Enterprise plan subscribers (Free plan users face usage limits on token count).

If you are on one of these plans, no special configuration is required. Simply open a new chat and paste in your text or attachments — the large context is active automatically. PDF and image attachments are also processed within the 1M window by default.

As conversations grow longer, a context usage indicator may appear near the input field. When it exceeds 80%, consider switching to a new session or summarizing older exchanges and pasting them in fresh to maintain response quality.

Using 1 Million Tokens in Claude Code

Claude Code ships with the 1M context window built in, and the ability to load large codebases directly is its biggest advantage for developers.

Key ways to get the most out of Claude Code's large context:

  1. Load the entire project directory structure upfront so the model understands the overall design intent
  2. Paste long stack traces or dependency files (e.g., package-lock.json) directly and ask for root-cause analysis
  3. Request refactoring suggestions spanning multiple source files in a single prompt
  4. Pass test code and implementation code together to identify coverage gaps

For code review, pasting the entire PR diff and requesting a review in one shot is especially effective. Unlike reviewing individual files, this approach lets you check the coherence of the entire change set at once.

Comparing Context Window Sizes Across Major AI Models

Here is how leading generative AI models stack up on context window size as of 2026.

AI Maximum Context Length
Claude Opus/Sonnet 1,000,000 tokens
Gemini 1.5 Pro 1,000,000 tokens
GPT-4o 128,000 tokens
Llama 3.1 405B 128,000 tokens

Google's Gemini also offers a 1M context window, but Claude stands out for its availability across multiple platforms — the Claude API, Amazon Bedrock, and Google Cloud Vertex AI — and its flat-rate pricing with no long-context surcharge, a meaningful differentiator.

The practical gap with GPT-4o goes beyond the raw 7.8x capacity difference. Loading a 600,000–700,000-character internal knowledge base into Claude in a single request — for search and summarization — would require multiple separate calls with GPT-4o, while Claude handles it in one prompt.

Things to Keep in Mind When Using Large Contexts

A few important points to be aware of when working with large context windows.

Increased Response Latency The larger the input, the longer it takes to receive the first response. Inputs exceeding 500,000 characters may take anywhere from several seconds to over ten seconds. Plan for this extra time, or use the streaming API to improve the user experience while waiting.

Cost Management While there is no longer a long-context surcharge, sending 1 million tokens of text with every request still increases token consumption per call. You can keep costs down by selecting only the documents you need, and by using prompt caching for repetitive or boilerplate portions of your prompts.

Accuracy Impact with Very Long Contexts As context size grows, retrieval accuracy for specific pieces of information can vary. There are reports that information embedded near the middle of a large input tends to be overlooked. Placing critical instructions and conditions at the beginning or end of your prompt is an effective mitigation.

Summary

Claude's 1 million token context window has been available without any additional charge since it was standardized in March 2026. The ability to process an entire book's worth of text in a single pass opens up a wide range of applications — document analysis, code review, long-form content editing, and more.

If you are on a Max, Team, or Enterprise plan, no special setup is needed, and Claude Code also includes it as a default. The best way to appreciate the difference is to load a large document or codebase directly into Claude and experience firsthand how it eliminates the old "split and paste" workaround.

Source: CodeZine — "Claude Launches 1 Million Token Context Window at Standard Pricing"

参考になったら ♡
Clauder Navi 編集部
@clauder_navi

Anthropic の Claude / Claude Code を中心に、日本のエンジニア向けに最新動向と実務 を毎日発信。 運営方針 は メディアについて をご覧ください。