How to Use Claude Thinking Mode | Boost Accuracy with ultrathink


Claude thinking is a feature that lets Claude run an internal reasoning process before generating a response. Instead of answering immediately, it works through complex problems step by step before arriving at an answer, which improves accuracy on tasks such as mathematical reasoning, coding, and complex analysis.

This article provides a systematic breakdown of every way to make Claude "think deeply" — from how to choose between think / think harder / ultrathink in Claude Code, to Extended Thinking parameters at the API level, to supported models and token limits.


What Is Claude Thinking?

Claude thinking (Extended Thinking) is an "internal reasoning before answering" feature that Anthropic has implemented in Claude. Normally, Claude generates a response immediately upon receiving a question. When thinking mode is enabled, however, Claude first runs through an internal reasoning process and then constructs its final answer based on those results.

This internal reasoning process is surfaced as a thinking block in the API. Users can see "what Claude was thinking" and trace the steps that led to the answer.

According to Anthropic's official documentation (see: platform.claude.com/docs/en/docs/build-with-claude/extended-thinking), thinking mode delivers particularly notable accuracy improvements for mathematical reasoning, coding, and complex analytical tasks.

How to Use Thinking Keywords in Claude Code

In Claude Code, you can trigger thinking mode by including specific keywords in your prompt. Simply write them in the command line or editor as-is.

think: <question or task>
think hard: <complex question>
think harder: <very difficult problem>
ultrathink: <task requiring maximum reasoning>

For example, when consulting on an architecture design, write it like this:

ultrathink: Please review the architecture of this authentication module
and identify any security issues.

Just include the keyword at the beginning of your prompt or anywhere within it, and Claude Code will automatically enable Extended Thinking. No special configuration changes are required.

Differences Between think / think harder / ultrathink

The thinking keywords in Claude Code have graduated levels of depth.

Keyword | Reasoning Depth | Processing Time | Suitable Tasks
think | Light reasoning | Short | General questions, simple bug fixes
think hard / think harder | Moderate to deep reasoning | Medium | Design decisions, complex logic
ultrathink | Maximum reasoning | Long | Architecture design, tricky bugs, large-scale refactoring

One important point: "it's not always better to use ultrathink." Using ultrathink on simple tasks only increases processing time and cost without any meaningful accuracy gain. As a rule, use think as your baseline and escalate gradually based on the complexity of the problem.
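The escalation rule above can be sketched as a small helper. This is only an illustration of the "start at think, step up when needed" habit; the function names `next_keyword` and `with_keyword` are hypothetical and are not part of Claude Code.

```python
from typing import Optional

# Keywords from the table above, ordered from lightest to deepest reasoning.
KEYWORDS = ["think", "think hard", "think harder", "ultrathink"]


def next_keyword(current: Optional[str]) -> str:
    """Return the next stronger keyword, starting at 'think' and capping at 'ultrathink'."""
    if current is None:
        return KEYWORDS[0]
    idx = KEYWORDS.index(current)  # raises ValueError for an unknown keyword
    return KEYWORDS[min(idx + 1, len(KEYWORDS) - 1)]


def with_keyword(keyword: str, task: str) -> str:
    """Prefix a task with a thinking keyword in the 'keyword: task' form shown earlier."""
    if keyword not in KEYWORDS:
        raise ValueError(f"unknown thinking keyword: {keyword!r}")
    return f"{keyword}: {task}"


if __name__ == "__main__":
    keyword = next_keyword(None)  # baseline for a first attempt
    print(with_keyword(keyword, "Find the cause of the failing login test"))
    keyword = next_keyword(keyword)  # escalate only if the first answer falls short
    print(with_keyword(keyword, "Find the cause of the failing login test"))
```

The resulting string is pasted into Claude Code as the prompt. Stepping up one level at a time keeps processing time and cost proportional to how hard the problem turns out to be.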

Supported Models and Token Limits for Extended Thinking

According to Anthropic's official documentation (platform.claude.com/docs/en/docs/build-with-claude/extended-thinking), thinking support varies by model.

Models Supporting Manual Extended Thinking

  • Claude Sonnet 4.6
  • Claude Haiku 4.5
  • Older Claude 4-series models

Models Supporting Adaptive Thinking

  • Claude Opus 4.8, 4.7, 4.6
  • Claude Sonnet 4.6
  • Claude Fable 5, Mythos 5

Note that for Claude Fable 5 and Mythos 5, the manual type: "enabled" setting cannot be used (it returns a 400 error). These newer models require Adaptive Thinking.

Output token limits:

  • Claude Opus 4.8 / 4.7 / 4.6: up to 128,000 tokens
  • Claude Sonnet 4.6 / Haiku 4.5: up to 64,000 tokens
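The limits and the manual-mode restriction described above can be checked before a request is sent. The sketch below copies the values from this section into plain Python data; the dictionary layout, the display-name keys, and the `check_request` function are assumptions made for illustration and do not come from the Anthropic SDK.

```python
from typing import List

# Output token limits as listed in this article (keyed by display name, not API model ID).
MAX_OUTPUT_TOKENS = {
    "Claude Opus 4.8": 128_000,
    "Claude Opus 4.7": 128_000,
    "Claude Opus 4.6": 128_000,
    "Claude Sonnet 4.6": 64_000,
    "Claude Haiku 4.5": 64_000,
}

# Models for which the article says manual type "enabled" returns a 400 error.
ADAPTIVE_ONLY = {"Claude Fable 5", "Claude Mythos 5"}


def check_request(model: str, max_tokens: int, thinking_type: str) -> List[str]:
    """Return a list of problems found; an empty list means no known conflict."""
    problems = []
    limit = MAX_OUTPUT_TOKENS.get(model)
    if limit is not None and max_tokens > limit:
        problems.append(f"max_tokens {max_tokens} exceeds the {limit} limit for {model}")
    if model in ADAPTIVE_ONLY and thinking_type == "enabled":
        problems.append(f'{model} rejects type "enabled"; use type "adaptive"')
    return problems


if __name__ == "__main__":
    print(check_request("Claude Sonnet 4.6", 100_000, "enabled"))
    print(check_request("Claude Fable 5", 16_000, "enabled"))
```

Catching these cases locally avoids a round trip that would end in an API error. Models missing from the table are simply not checked against a limit.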

How to Implement Extended Thinking via the API

Below is a Python implementation example. It works simply by adding the thinking parameter.

import anthropic

# Reads the API key from the ANTHROPIC_API_KEY environment variable.
client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=16000,
    thinking={
        "type": "enabled",
        "budget_tokens": 10000  # maximum number of tokens to spend on thinking
    },
    messages=[{
        "role": "user",
        "content": "Please optimize the time complexity of this algorithm..."
    }]
)

# The response content holds thinking blocks and text blocks; print each kind.
for block in response.content:
    if block.type == "thinking":
        print(f"[Thinking] {block.thinking}")
    elif block.type == "text":
        print(f"[Answer] {block.text}")

budget_tokens is the maximum number of tokens allocated for thinking. It must be set to a value smaller than max_tokens. Even if you set a large value, Claude will only use as many tokens as the complexity of the task requires. Budgets exceeding 32,000 tokens are often not fully consumed.
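The constraint that budget_tokens must stay below max_tokens is easy to enforce before calling the API. The following is a minimal sketch. The helper name `manual_thinking_params` is hypothetical, and the 1,024-token minimum is an assumption based on Anthropic's extended thinking documentation that should be verified against the current docs.

```python
# Assumed minimum thinking budget; verify against the current documentation.
MIN_BUDGET_TOKENS = 1024


def manual_thinking_params(max_tokens: int, budget_tokens: int) -> dict:
    """Build the max_tokens and thinking arguments for manual Extended Thinking."""
    if budget_tokens < MIN_BUDGET_TOKENS:
        raise ValueError(f"budget_tokens must be at least {MIN_BUDGET_TOKENS}")
    if budget_tokens >= max_tokens:
        raise ValueError("budget_tokens must be smaller than max_tokens")
    return {
        "max_tokens": max_tokens,
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
    }


if __name__ == "__main__":
    params = manual_thinking_params(max_tokens=16000, budget_tokens=10000)
    print(params)
```

The returned dictionary can be unpacked into the call, for example `client.messages.create(model=..., messages=..., **params)`. The gap between the two values is what remains for the visible answer, so leave enough room for it.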

Differences from Adaptive Thinking

For the latest models, using type: "adaptive" is the recommended approach.

thinking={
    # Claude judges the complexity of the task and decides how much to think
    "type": "adaptive"
}

There is no need to manually set budget_tokens — Claude itself determines the optimal amount of thinking based on the task. In Claude Opus 4.8, Adaptive Thinking operates as the default, and you can also control the depth with the effort parameter.

The ultrathink keyword in Claude Code behaves similarly to Adaptive Thinking, with Claude Code controlling the depth of thinking internally.
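As a rough sketch of how an adaptive request might be assembled, the helper below builds the keyword arguments for `client.messages.create`. The `thinking={"type": "adaptive"}` part comes from this section. The placement of effort under an `output_config` field and the level names `low`, `medium`, and `high` are assumptions to confirm against the API reference, and `adaptive_request_kwargs` is a hypothetical name.

```python
from typing import Optional

# Assumed effort level names; confirm against the API reference.
EFFORT_LEVELS = ("low", "medium", "high")


def adaptive_request_kwargs(
    model: str,
    prompt: str,
    max_tokens: int = 16000,
    effort: Optional[str] = None,
) -> dict:
    """Build request arguments that use Adaptive Thinking instead of budget_tokens."""
    kwargs = {
        "model": model,
        "max_tokens": max_tokens,
        "thinking": {"type": "adaptive"},  # no budget_tokens in adaptive mode
        "messages": [{"role": "user", "content": prompt}],
    }
    if effort is not None:
        if effort not in EFFORT_LEVELS:
            raise ValueError(f"effort must be one of {EFFORT_LEVELS}")
        # Assumption: effort is passed in an output_config object.
        kwargs["output_config"] = {"effort": effort}
    return kwargs


if __name__ == "__main__":
    print(adaptive_request_kwargs("claude-sonnet-4-6", "Review this design.", effort="high"))
```

Building the arguments separately from the call makes the request easy to inspect or log before sending it with `client.messages.create(**kwargs)`.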

When to Use Thinking Mode

Here is a summary of typical scenarios where thinking mode is most effective.

High-impact scenarios:

  • Implementing mathematical proofs or complex calculation logic
  • Architecture design decisions (such as determining microservice boundaries)
  • Identifying root causes of elusive bugs
  • Decision-making to find the optimal solution among multiple options
  • Planning refactoring of large codebases

Scenarios where thinking is unnecessary:

  • Simple questions (reading/writing files, renaming variables, etc.)
  • Repetitive routine tasks
  • Tasks that already have a clear procedure

Using ultrathink in unnecessary situations will add tens of seconds to processing time and inflate token costs. Knowing when to use it is crucial.

Cost and Processing Time Considerations

Extended Thinking is billed based on the total number of thinking tokens actually generated. Even if the display is set to summarized (display: "summarized"), you are billed for the full amount before summarization.

As a billing example: if thinking generates 10,000 tokens but only a 500-token summary is shown to the user, the charge is for 10,000 tokens (before summarization).
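The billing example above can be written out as arithmetic. In this sketch the price per million tokens is a placeholder rather than a real rate, and the assumption that thinking tokens are charged at the same rate as output tokens should be checked against the current pricing page. The function name is hypothetical.

```python
def estimate_output_cost_usd(
    thinking_tokens_generated: int,
    summary_tokens_shown: int,
    answer_tokens: int,
    usd_per_million_output_tokens: float,
) -> float:
    """Estimate output cost; the summary length is accepted but does not affect billing."""
    # Billing uses the full thinking token count, not the shorter summary shown to the user.
    billed_tokens = thinking_tokens_generated + answer_tokens
    return billed_tokens * usd_per_million_output_tokens / 1_000_000


if __name__ == "__main__":
    # 10,000 thinking tokens, 500-token summary shown, 800-token answer,
    # at a placeholder price of 15.0 USD per million output tokens.
    cost = estimate_output_cost_usd(10_000, 500, 800, 15.0)
    print(f"estimated cost: ${cost:.3f}")
```

Changing `summary_tokens_shown` leaves the result unchanged, which is the point of the example: the summarized display reduces what you see, not what you pay for.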

In Claude Code as well, heavy use of ultrathink will consume your Claude Max plan's monthly quota early. It is recommended to limit its use to complex tasks only.

Summary

Claude thinking mode is a powerful feature that improves the accuracy of Claude's responses on complex tasks. In Claude Code, you can control the depth in three stages (think → think harder → ultrathink), and via the API you can fine-tune it with the budget_tokens or effort parameter.

Key takeaways:

  1. Everyday tasks → think (or no thinking)
  2. Design decisions and complex bugs → think harder
  3. When maximum accuracy is needed → ultrathink
  4. API implementation → type: "adaptive" is recommended for the latest models

By mastering the thinking feature, Claude Code becomes an even more capable development partner. When you're faced with a difficult problem, start by giving ultrathink a try.

Clauder Navi Editorial Team
@clauder_navi

Daily coverage of the latest developments and practical work around Anthropic's Claude and Claude Code, written for engineers in Japan.