Claude Haiku 4.5 Performance | Speed and Cost-Efficiency Verified with Benchmarks
Claude Haiku 4.5 is positioned as the "fastest and most affordable compact model," yet it delivers SWE-bench Verified coding performance of 73.3% — nearly on par with the previous Sonnet 4 — at one-third the cost and roughly twice the speed. This article draws on Anthropic's official announcements and benchmarks to break down Haiku 4.5's performance across four dimensions — speed, coding, computer use, and pricing — so you can make an informed decision about when to choose it.
Claude Haiku 4.5's coding performance reaches 73.3% on SWE-bench Verified, putting it nearly on par with the previous mid-tier Sonnet 4 (72.7%). Despite being the smallest model in the family, it handles real-world bug fixing at a practical level — overturning the conventional wisdom that "small = low performance."
On speed, the standout advantage is a response time that feels instant: more than 2x faster than Sonnet 4, and 4–5x faster than Sonnet 4.5. And with pricing at $1 input / $5 output per million tokens (USD, excl. tax) — roughly one-third of Sonnet 4 — delivering equivalent performance at one-third the cost is the core of its value proposition.
The trade-off is a context window capped at 200K tokens, which falls short of the 1M available in Opus and Sonnet. Its training data cutoff of July 2025 is also older than other models, so tasks involving very long documents processed in bulk, or tasks requiring the latest knowledge, are better handled by higher-tier models. The basic rule of thumb is: use Haiku for high-speed, high-volume workloads, and hand off to Opus or Sonnet when you need more.
目次 (8)
- Quick Summary — Sonnet 4-Level Performance, 2x Speed, 1/3 the Cost
- SWE-bench Verified 73.3% — Coding Performance Nearly on Par with Sonnet 4
- Speed 2x vs. Sonnet 4, 4–5x vs. Sonnet 4.5 — Feels Instantaneous
- OSWorld 50.7% — Computer Use and Agent Performance
- Pricing: $1 Input / $5 Output — What the Cost-Efficiency Actually Means
- Full Specs — Context Window, Output, and Thinking Features
- Where Haiku 4.5 Excels — and Where It Doesn't
- Summary — Performance Is Production-Ready; Choose Based on Speed and Cost
Quick Summary — Sonnet 4-Level Performance, 2x Speed, 1/3 the Cost
Claude Haiku 4.5 is the fastest and most affordable model in the Claude family, released on October 15, 2025. Its defining characteristic is delivering coding performance equivalent to the previous mid-tier Sonnet 4 at approximately twice the speed and one-third the cost. Here are the key figures at a glance.
| Dimension | Claude Haiku 4.5 |
|---|---|
| Coding (SWE-bench Verified) | 73.3% (nearly on par with Sonnet 4's 72.7%) |
| Computer use (OSWorld) | 50.7% (surpassing Sonnet 4's 42.2%) |
| Speed | 2x+ vs. Sonnet 4 / 4–5x vs. Sonnet 4.5 |
| Pricing | $1 input / $5 output (per million tokens, USD excl. tax) |
| Cost ratio | Approximately 1/3 of Sonnet 4 |
| Context window | 200K tokens |
Source: Anthropic "Introducing Claude Haiku 4.5" (accessed: 2026-05-30)
SWE-bench Verified 73.3% — Coding Performance Nearly on Par with Sonnet 4
Claude Haiku 4.5 recorded 73.3% on the industry-standard SWE-bench Verified benchmark (average over 50 trials, 128K thinking budget). SWE-bench Verified automatically evaluates real bug-fixing tasks from GitHub repositories, measuring how accurately an AI can fix real-world codebases.
This score surpasses Claude Sonnet 4's 72.7%, which was itself a top-tier model just a few months ago. In other words, Haiku 4.5 — as the "smallest model" in the lineup — inherits flagship-grade coding performance from the previous generation. Anthropic itself describes it as offering "Sonnet 4-equivalent coding performance at one-third the cost," signaling a meaningful step up in the practical capability of compact models.
Source: Anthropic "Introducing Claude Haiku 4.5"
Speed 2x vs. Sonnet 4, 4–5x vs. Sonnet 4.5 — Feels Instantaneous
What makes Haiku 4.5 stand out even more than its benchmark numbers is its speed. Anthropic has announced it is more than 2x faster than Sonnet 4.5, and reviews from Japan report response times roughly 2x faster than the previous Sonnet 4 and 4–5x faster than Sonnet 4.5.
This speed matters most in use cases where response time directly affects user experience. Specific examples include:
- Real-time responses in chatbots and customer support
- Code completion and inline suggestions in IDEs
- Batch processing of large volumes of documents for classification or summarization
- Automation where agents execute many sequential steps
In practice, GitHub Copilot integration has been praised as "higher quality than Sonnet 4 and faster," and Haiku's advantage grows the more your workload demands low latency.
Source: Anthropic "Introducing Claude Haiku 4.5"
OSWorld 50.7% — Computer Use and Agent Performance
Beyond coding, Haiku 4.5 also performs well in other areas. On the OSWorld benchmark, which measures desktop automation capability, it recorded 50.7%, clearly surpassing Sonnet 4's 42.2%. OSWorld evaluates "computer use" — the ability to observe a screen, control a mouse and keyboard, and complete real tasks.
This result shows that Haiku is not just a lightweight text generation model, but also capable enough for agentic use cases involving autonomous multi-step workflows using tools. Combined with its speed advantage, it becomes a realistic option as the computational backbone for running complex, multi-step workflows in a short time.
Source: Anthropic "Introducing Claude Haiku 4.5"
Pricing: $1 Input / $5 Output — What the Cost-Efficiency Actually Means
Claude Haiku 4.5's API pricing is $1 input / $5 output per million tokens (USD, excl. tax). This is approximately one-third of Sonnet 4.6 ($3 input / $15 output) and one-fifth of Opus 4.7 ($5 input / $25 output). Note that the higher-tier Opus 4.8 was released after this article was written, so for the latest price comparisons with top-tier models, refer to the note in the table below and Anthropic Pricing.
| Model | Input (/MTok) | Output (/MTok) |
|---|---|---|
| Claude Opus 4.8 (※) | Check official site | Check official site |
| Claude Opus 4.7 | $5 | $25 |
| Claude Sonnet 4.6 | $3 | $15 |
| Claude Haiku 4.5 | $1 | $5 |
(※) Claude Opus 4.8 was released after this article was written (2026-05-30). For the latest pricing, please check Anthropic Pricing.
The picture of "Sonnet 4-level performance at one-third the cost" is directly reflected in the pricing. Additionally, combining Batch API and prompt caching discounts can lower the effective per-unit cost even further, making it possible to keep monthly costs at a realistic level for services handling large volumes of ongoing requests. For a full overview of pricing, see also our article summarizing Claude's pricing structure.
Source: Anthropic Pricing / Anthropic Models overview (accessed: 2026-05-30)
Full Specs — Context Window, Output, and Thinking Features
To properly estimate performance, you also need to understand the spec-level constraints beyond the benchmarks. In optimizing for speed and cost, Haiku 4.5 does concede some ground to higher-tier models on context window size and knowledge freshness.
| Spec | Claude Haiku 4.5 |
|---|---|
| API ID | claude-haiku-4-5-20251001 |
| Context window | 200K tokens |
| Max output | 64K tokens |
| Latency | Fastest |
| Extended Thinking | Yes |
| Adaptive Thinking | No |
| Training data cutoff | July 2025 |
Extended Thinking is an advanced reasoning feature that allocates a thinking budget for step-by-step inference. Adaptive Thinking automatically adjusts the amount of thinking based on task difficulty — a feature Haiku 4.5 lacks, meaning that for tasks requiring deep reasoning, you will need to manually specify a thinking budget.
Key limitations to be aware of: the context window is capped at 200K tokens, and the training data cutoff of July 2025 is older than Opus and Sonnet (January 2026). If you need to process very long documents of over one million tokens in a single pass, or if your task depends on very recent information, you will need to opt for a higher-tier model.
Source: Anthropic Models overview (accessed: 2026-05-30)
Where Haiku 4.5 Excels — and Where It Doesn't
Based on the performance reviewed above, Haiku 4.5 has a clear profile: it excels at tasks where speed and cost efficiency matter, and yields to higher-tier models for maximum-difficulty reasoning or very long documents.
Ideal use cases include:
- Chat and support applications requiring real-time responses
- IDE integrations for code completion and lightweight refactoring
- Batch processing of large volumes of documents for classification, summarization, or extraction
- Agent-based automation systems running many steps at high speed
Use cases to avoid:
- Top-tier design decisions and complex long-horizon reasoning (Opus 4.7 is better suited)
- Processing very long documents exceeding 200K tokens in bulk (use Sonnet / Opus with 1M context)
- Tasks that depend on knowledge from August 2025 or later
For a full overview of how to choose between Opus, Sonnet, and Haiku — including a decision chart — see our article comparing the Claude models.
Summary — Performance Is Production-Ready; Choose Based on Speed and Cost
Claude Haiku 4.5 is a practical model that, as its SWE-bench Verified score of 73.3% and OSWorld score of 50.7% demonstrate, matches the capability of the previous mid-tier Sonnet 4 while running at approximately twice the speed and one-third the cost. Top-tier reasoning and very long document processing remain the domain of Opus and Sonnet, but for workloads where speed and cost efficiency matter — real-time responses, large-scale batch processing, and agentic automation — Haiku 4.5 is the most cost-effective option in the current lineup. A good approach is to start with Haiku, then switch to a higher-tier model only where performance falls short. That is the most practical path to keeping costs down while maintaining quality.
Source: Anthropic "Introducing Claude Haiku 4.5" / Anthropic Models overview (accessed: 2026-05-30)