Prompt Caching


Clauder Navi Editorial Team / Last updated 2026-04-24

Summary — Key Points of This Lesson

Prompt Caching caches the leading portion (prefix) of a prompt on Anthropic's servers, so repeated requests that reuse that prefix cost less to process.
Cache reads are billed at 10% of the normal input token price (a 90% reduction). This pays off most when the same long system prompt or reference documents are sent repeatedly (source: anthropic.com/pricing; USD, tax excluded).
Setup is as simple as adding cache_control: {"type": "ephemeral"} to your request. What to cache is then determined automatically.
Cache writes cost 1.25x (5-minute retention) or 2x (1-hour retention) the normal input price, so caching pays off only when the cached prefix is reused.
Code examples follow the official documentation, which remains the primary reference.

Contents (6)

What Is Prompt Caching?
Pricing Structure
Python — Minimal Code Example
When Caching Is Effective
Relationship to Claude Opus 4.7 Migration
Level 5 Complete

What Is Prompt Caching?

When using Claude via the API, sending long system prompts or large volumes of reference documents with every request causes input token costs to accumulate. Prompt Caching is a feature that caches the leading portion of a request on Anthropic's servers, allowing subsequent requests to reuse that cache (source).

Reading from the cache costs 10% of the normal input token price (USD, tax excluded). Pricing can change, so check the official site for current details (anthropic.com/pricing).

Pricing Structure

Token Type	Cost Multiplier (vs. Normal Input)
Normal input tokens	1× (baseline)
Cache write tokens (5-minute retention)	1.25×
Cache write tokens (1-hour retention)	2×
Cache read tokens	0.1× (90% reduction)

Note: USD, tax excluded. For specific per-model pricing, see anthropic.com/pricing.
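
The multipliers above imply a simple break-even calculation. The following is a minimal sketch, assuming a single cache write on the first request, a cache hit on every later request (each arriving before the cache expires), and ignoring output tokens and any uncached input. The function names are hypothetical, and costs are expressed in units of one normal input token.

```python
# Multipliers from the pricing table, relative to the normal input price.
WRITE_5M = 1.25  # cache write, 5-minute retention
WRITE_1H = 2.0   # cache write, 1-hour retention
READ = 0.1       # cache read


def cached_cost(prefix_tokens, requests, write_multiplier=WRITE_5M):
    """Relative cost of sending the same prefix `requests` times with caching.

    Assumption: one cache write on the first request, then a hit every time.
    """
    if requests < 1:
        return 0.0
    return prefix_tokens * (write_multiplier + READ * (requests - 1))


def uncached_cost(prefix_tokens, requests):
    """Relative cost of sending the same prefix `requests` times without caching."""
    return float(prefix_tokens * requests)


def break_even_requests(write_multiplier):
    """Smallest request count at which caching is cheaper than not caching."""
    n = 1
    while cached_cost(1, n, write_multiplier) >= uncached_cost(1, n):
        n += 1
    return n
```

Under these assumptions, a 5-minute cache pays for itself from the second request and a 1-hour cache from the third. For a 10,000-token prefix sent 10 times, cached_cost gives roughly 21,500 token-equivalents against 100,000 uncached.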

Python — Minimal Code Example

The configuration below follows the code example in the official documentation (source). With cache_control set at the top level of the request, a cache breakpoint is placed automatically at the last cacheable block.

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    cache_control={"type": "ephemeral"},  # enable caching
    system="You are an AI assistant that analyzes literary works. "
           "Provide in-depth observations on themes, characters, and style.",
    messages=[
        {
            "role": "user",
            "content": "Analyze the main themes of The Tale of Genji.",
        }
    ],
)

# Check token usage, including the cache counters
print(response.usage.model_dump_json())

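response.usage includes cache_creation_input_tokens (tokens written to the cache) and cache_read_input_tokens (tokens served from the cache), so you can verify whether a request created or hit the cache. Prompts shorter than the model's minimum cacheable length are not cached, in which case both fields are 0; a system prompt as short as the one above is likely to fall below that minimum.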
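
One way to make these counters easier to read is to fold them into a single summary. This is a sketch rather than an official helper: summarize_cache_usage is a hypothetical name, and it assumes input_tokens counts only the tokens that were neither written to nor read from the cache, so the three fields sum to the total input.

```python
def summarize_cache_usage(usage):
    """Summarize cache behavior from a usage dict.

    `usage` is expected to carry the same keys as response.usage;
    missing or None values are treated as 0.
    """
    written = usage.get("cache_creation_input_tokens") or 0
    read = usage.get("cache_read_input_tokens") or 0
    uncached = usage.get("input_tokens") or 0

    total = written + read + uncached
    # Share of all input tokens that were served from the cache.
    hit_ratio = read / total if total else 0.0

    return {
        "total_input_tokens": total,
        "cache_written": written,
        "cache_read": read,
        "hit_ratio": hit_ratio,
    }
```

With the response from the example above, it could be called as summarize_cache_usage(response.usage.model_dump()). A first request with a cacheable prefix should show a nonzero cache_written, and later requests with the same prefix a nonzero cache_read.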

When Caching Is Effective

Long system prompts — when sending hundreds to thousands of tokens of operational instructions with every request
RAG with reference documents — when referencing the same documents across multiple conversations
Few-shot example reuse — when repeatedly sending prompts that contain many examples

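A cache hit requires the request prefix to match what was cached; any change inside that prefix results in a miss. To get the most out of caching, place the unchanging content (instructions, reference documents, examples) at the beginning of the request and the variable content (such as the user's question) at the end.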
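
The ordering advice above can be sketched as a request builder that keeps the stable content first and the variable question last. The example marks the end of the stable prefix with a block-level cache_control on the last system block, as an alternative to the top-level setting shown earlier. build_request and its parameter names are hypothetical, and the function only builds keyword arguments without calling the API.

```python
def build_request(stable_instructions, reference_document, user_question,
                  model="claude-opus-4-7"):
    """Build Messages API keyword arguments with the stable prefix first."""
    return {
        "model": model,
        "max_tokens": 1024,
        # Stable content: identical across requests, so it can be cached.
        "system": [
            {"type": "text", "text": stable_instructions},
            {
                "type": "text",
                "text": reference_document,
                # Breakpoint: everything up to and including this block
                # is the prefix to cache.
                "cache_control": {"type": "ephemeral"},
            },
        ],
        # Variable content: changes per request, so it goes last.
        "messages": [{"role": "user", "content": user_question}],
    }
```

The result could be passed as client.messages.create(**build_request(...)). Because only the final user message differs between calls, the system blocks form an identical prefix that later requests can reuse.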

Relationship to Claude Opus 4.7 Migration

Prompt Caching continues to be available with Claude Opus 4.7. For notes on migration (breaking changes and deprecated parameters), see Claude Opus 4.7 Migration Practical Guide.

Level 5 Complete

Congratulations. You have completed all 5 lessons of Level 5 "API / SDK." You now have a solid understanding of the fundamentals for using Claude programmatically, covering API key setup, SDK basics, Tool Use, streaming, and Prompt Caching. In Level 6 "Business Use," you will learn practical approaches for integrating Claude into business workflows.

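Note: Clauder Navi refers directly to official Anthropic information and strives for accuracy, but we cannot accept responsibility for investment decisions, contracts, or any damages resulting from use based on the content of this article. When making important decisions, be sure to verify the primary sources from Anthropic and claude.com yourself.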
