How to Use the Context Window

Summary — Key Takeaways from This Lesson

  • The context window is the maximum amount of text Claude can read at one time.
  • Claude's API models support a context window of up to 1M tokens (roughly 750,000–1,000,000 words).
  • A key strength is the ability to paste entire long PDFs, codebases, or meeting transcripts and ask questions about them.
  • Extended Thinking is a mode where Claude thinks deeply internally before responding — especially effective for complex reasoning tasks.
  • Claude's attention tends to weaken toward the middle of the window, so place critical information at the beginning or end.
目次 (6)

What Is the Context Window?

Large language models (LLMs) generate responses by reading all the text visible to them within a conversation. The maximum amount they can process at once is called the context window. The unit of measurement is the token — roughly 1–2 tokens per Japanese kana character, and approximately 1–2 tokens per English word.

As a conversation grows longer, the combined total of "prompt + conversation history + responses" fills up the window. Once the window limit is exceeded, older parts of the conversation fall out of view.

Claude's Window Size

Claude's models are known for having larger context windows compared to other leading LLMs.

  • claude.ai (Web UI): For standard conversations, roughly 200K tokens (approximately 150,000 characters) is the practical guide.
  • Via API: Models with a context window of up to 1M tokens are available (source: Anthropic official documentation).
  • 1M tokens is equivalent to roughly 750 pages of English text on A4 paper, or about 4–5 paperback novels.

Using the Context Window for Long-Document Comprehension

A large context window is particularly powerful for tasks such as the following:

  • Summarizing entire PDFs: Paste reports or academic papers exceeding 100 pages directly for summarization and Q&A.
  • Reviewing an entire codebase: Paste multiple files together and have Claude analyze the overall architecture.
  • Analyzing meeting transcripts: Provide hours of transcribed text and extract "who decided what."
  • Checking consistency in long works: Have Claude read an entire novel or manual and look for contradictions.

Tips for Effective Placement

Research suggests that LLMs tend to forget information located in the middle of the window. This is known as the "Lost in the Middle" problem.

Practical countermeasures:

  • Place the most critical instructions and constraints at the beginning or end of the prompt.
  • After pasting a long document, restate the task explicitly at the end with something like "Based on this document, please…"
  • When providing multiple documents, place the highest-priority one last.

Extended Thinking

Extended Thinking is a mode where Claude runs a longer internal reasoning process before generating a response. It takes more time than a standard response, but improves accuracy for tasks such as:

  • Problems requiring step-by-step reasoning, such as math or logic puzzles
  • Planning tasks that must satisfy multiple constraints simultaneously
  • Identifying code bugs and proposing refactoring solutions

It can be enabled via the thinking parameter in the API. In the claude.ai web UI, it may activate automatically depending on the model and settings used.

The Relationship Between the Window and Token Costs

When using the API, the number of tokens in the context window directly affects your costs. Passing large amounts of text every time can cause costs to spike rapidly. A feature called "Prompt Caching" can reduce the processing cost of repeatedly used long text. This is covered in detail in Level 5: "API / SDK."

参考になったら ♡
Clauder Navi 編集部
@clauder_navi

Anthropic の Claude / Claude Code を中心に、日本のエンジニア向けに最新動向と実務 を毎日発信。 運営方針 は メディアについて をご覧ください。