Claude Fast Mode | How to Use 2.5x Speed, Pricing & Subscription Details

Clauder Navi 編集部 / 最終更新 2026-06-16

AI-generated article summarypowered by Claude

"Claude's output is too slow." "I wish it would respond faster." — The official answer to these frustrations is Fast Mode. It runs the same model in a speed-optimized configuration, boosting output by up to 2.5x. The feature launched as a research preview in February 2026. This article summarizes supported models, how to use it, pricing, and its relationship to subscriptions — all based solely on primary Anthropic sources.

結論powered by Claude

![Claude Fast Mode | How to Use 2.5x Speed, Pricing & Subscription Details](/assets/generated/placeholder-claude-speed-hero.jpg)

目次 (10)

Quick Summary — "Making Claude Faster" = Fast Mode for 2.5x Output
What Is Fast Mode — Running the Same Model at 2.5x Speed
Supported Models: Opus 4.8, 4.7, and 4.6 Only
Using Fast Mode in Claude Code — Toggle with /fast
Using Fast Mode via the API — The speed: "fast" Parameter
Pricing — Opus 4.8 at $10 Input / $50 Output per 1M Tokens
Usage Credits Required Even on Subscriptions
Fast Mode vs. Effort Level — Understanding the Difference
Rate Limits and Fallback Behavior
Summary — Know When Speed Actually Matters

Quick Summary — "Making Claude Faster" = Fast Mode for 2.5x Output

Here are the key points at a glance.

Item	Details
What speeds up	Output tokens per second (OTPS), up to 2.5x
Supported models	Claude Opus 4.8 / 4.7 / 4.6 only
Quality	Same model weights and behavior. No degradation in intelligence or capability
Enable (Claude Code)	`/fast` command
Enable (API)	`speed: "fast"` parameter
Pricing (Opus 4.8)	$10 input / $50 output per 1M tokens
Subscriptions	Billed directly to usage credits, even on Pro/Max plans
Status	Research preview (specs and pricing subject to change)

Two key points to remember: what speeds up is the output volume per second, not the time until the first response, and Fast Mode is not included in subscription plans — it is billed separately.

What Is Fast Mode — Running the Same Model at 2.5x Speed

Fast Mode is not a switch to a different, lighter model. It runs Claude Opus in an inference configuration that prioritizes speed over cost efficiency. The model weights and behavior are identical, so there is no change in intelligence or functionality — only the response speed increases. This is explicitly stated in the official documentation (API Docs).

The metric that improves is OTPS (output tokens per second), up to 2.5x the standard speed. However, TTFT (time to first token) is not affected by Fast Mode. The perceived time to finish a long response or code generation shrinks, but the initial response latency stays the same — keeping this in mind will help set the right expectations.

Supported Models: Opus 4.8, 4.7, and 4.6 Only

Fast Mode is available only for these three Opus models.

Claude Opus 4.8 (claude-opus-4-8)
Claude Opus 4.7 (claude-opus-4-7)
Claude Opus 4.6 (claude-opus-4-6)

It is not available for Sonnet, Haiku, or other models — sending speed: "fast" to an unsupported model via the API will return an error. Note that Fast Mode for Opus 4.6 is deprecated as of the Opus 4.8 release and is scheduled for removal approximately 30 days later. After removal, sending speed: "fast" with claude-opus-4-6 will not produce an error but will fall back to standard speed and standard pricing. To maintain fast performance, migrate to Opus 4.8 or 4.7.

Using Fast Mode in Claude Code — Toggle with `/fast`

In Claude Code, switching is a single command. The steps are as follows (Claude Code Docs).

Type /fast in the terminal and press Tab to toggle on/off (or set "fastMode": true in the config file).
If you were using a different model, it automatically switches to Opus, and a "Fast mode ON" confirmation appears.
While active, a small ↯ icon is displayed next to the prompt.
Run /fast again to disable it (the model stays on Opus; use /model to switch back to your previous model).

Claude Code v2.1.36 or later is required; the VS Code extension is not supported. The default model is Opus 4.8 (versions 2.1.142–2.1.153 default to Opus 4.7; Opus 4.8 becomes the default from v2.1.154 onward). To maximize cost efficiency, the best practice is to enable Fast Mode at the start of a session rather than mid-conversation.

Using Fast Mode via the API — The `speed: "fast"` Parameter

To use it via the API, include the beta header anthropic-beta: fast-mode-2026-02-01 and add speed: "fast" to your request.

curl https://api.anthropic.com/v1/messages \
  --header "x-api-key: $ANTHROPIC_API_KEY" \
  --header "anthropic-version: 2023-06-01" \
  --header "anthropic-beta: fast-mode-2026-02-01" \
  --header "content-type: application/json" \
  --data '{
    "model": "claude-opus-4-8",
    "max_tokens": 4096,
    "speed": "fast",
    "messages": [{"role": "user", "content": "Refactor this module"}]
  }'

You can verify which speed was actually used by checking the usage.speed field in the response ("fast" or "standard"). Note that Fast Mode is not available with the Batch API, Priority Tier, or Claude Platform on AWS, and is also not offered on third-party platforms such as Vertex AI, Amazon Bedrock, or Microsoft Foundry (it is limited to the Claude API research preview only).

Pricing — Opus 4.8 at $10 Input / $50 Output per 1M Tokens

Fast Mode uses a pricing multiplier applied to standard rates, and the rate is flat across the entire 1M context.

Model	Input (per 1M tokens)	Output (per 1M tokens)
Opus 4.8	$10	$50
Opus 4.7 / 4.6	$30	$150

The Fast Mode price for Opus 4.8 is roughly one-third of the previous generation (4.7/4.6), making speed more accessible. Prompt caching and data residency multipliers are applied on top of these Fast Mode prices.

Usage Credits Required Even on Subscriptions

This is the most important caveat. Fast Mode in Claude Code is available to Pro, Max, Team, and Enterprise plan users as well as Console users, but it is not covered by subscription rate limits — it is billed directly to usage credits. Even if you have remaining subscription quota, Fast Mode tokens are billed at Fast Mode rates from the very first token, as explicitly stated in the documentation.

To use it, you need the following:

Enable usage credits on your account (individual users can do this from billing settings in the Console).
For Team / Enterprise accounts, Fast Mode is disabled by default — an administrator must explicitly enable it.

Organizations looking to control costs can add fastModePerSessionOptIn: true to their management settings so that each session starts with Fast Mode off, requiring users to explicitly enable it with /fast when needed. To disable it entirely, set the environment variable CLAUDE_CODE_DISABLE_FAST_MODE=1.

Fast Mode vs. Effort Level — Understanding the Difference

There is another way to speed up Claude: adjusting the effort level. The two approaches work differently.

Setting	Effect
Fast Mode	Lower latency at the same quality, but higher cost
Lower effort level	Speeds up by reducing thinking time. May reduce quality on complex tasks

If you want speed without sacrificing quality, use Fast Mode. If you're dealing with simple tasks where reduced thinking is acceptable, use a lower effort level. You can also combine both to target "simple tasks at maximum speed."

Rate Limits and Fallback Behavior

Fast Mode has its own dedicated rate limits separate from standard Opus, and Opus 4.8, 4.7, and 4.6 share the same pool. When the limit is exceeded or usage credits run out, the following occurs:

Fast Mode automatically falls back to standard speed.
The ↯ icon turns gray to indicate a cooldown.
Work continues at standard speed and standard pricing.
Once the cooldown ends, Fast Mode is automatically re-enabled.

Via the API, exceeding the limit returns a 429 response with a retry-after header, and the SDK will automatically retry up to 2 times by default. To switch to standard speed immediately, catch the 429 and resend the request without the speed parameter. Be aware, however, that switching speeds causes a prompt cache miss — different speeds do not share cached prefixes.

Summary — Know When Speed Actually Matters

Fast Mode is a clear, official way to speed up Claude, but it is not a universal solution. It works best for rapid iteration on code changes, live debugging, and time-sensitive work where latency matters more than cost — interactive use cases where speed is worth the price. Conversely, for long-running automated tasks, batch/CI jobs, and cost-sensitive workloads where speed is less critical, standard mode is the better fit. Keep in mind that it is billed separately from your subscription, and enable it with /fast at the start of a session — that is the most efficient way to get the speed boost when you need it.

Sources: Fast Mode (Research Preview) - Claude API Docs / Speed up responses with fast mode - Claude Code Docs

参考になったら ♡

この記事は役立ちましたか?

ご注意: Clauder Navi は Anthropic 公式情報を直接参照し正確な内容に努めておりますが、本記事の内容に基づく投資判断・契約・利用結果による損害について責任を負いかねます。重要な意思決定の際は、必ず Anthropic 公式・ claude.com の一次情報をご自身でご確認ください。

Clauder Navi 編集部

@clauder_navi

Anthropic の Claude / Claude Code を中心に、日本のエンジニア向けに最新動向と実務を毎日発信。運営方針はメディアについてをご覧ください。

プロフィール → 副社長コラム → レッスン一覧 →

Claude Fast Mode | How to Use 2.5x Speed, Pricing & Subscription Details

Quick Summary — "Making Claude Faster" = Fast Mode for 2.5x Output

What Is Fast Mode — Running the Same Model at 2.5x Speed

Supported Models: Opus 4.8, 4.7, and 4.6 Only

Using Fast Mode in Claude Code — Toggle with /fast

Using Fast Mode via the API — The speed: "fast" Parameter

Pricing — Opus 4.8 at $10 Input / $50 Output per 1M Tokens

Usage Credits Required Even on Subscriptions

Fast Mode vs. Effort Level — Understanding the Difference

Rate Limits and Fallback Behavior

Summary — Know When Speed Actually Matters

関連記事

Claude Haiku 4.5 Performance | Speed and Cost-Efficiency Verified with Benchmarks

How Much Does Claude Sonnet Cost | API and Pro/Max Monthly Plans Explained

Claude Opus 4.1 / 4.5 / 4.6 / 4.7 | Performance & Pricing Comparison

What Is Claude Sonnet 4.6 | Specs, Capabilities, and Differences from Opus

Using Fast Mode in Claude Code — Toggle with `/fast`

Using Fast Mode via the API — The `speed: "fast"` Parameter