Streaming Responses

Summary — Key Takeaways for This Lesson

  • Streaming is a method that receives Claude's response incrementally as it is being generated. Users can start reading without waiting for the full response.
  • Use client.messages.stream() in the SDK. Internally, it uses Server-Sent Events (SSE).
  • In Python, you can iterate over text chunks via stream.text_stream. In TypeScript, you can use the .on("text", ...) event handler.
  • Streaming itself incurs no additional cost. The number of tokens consumed is identical to non-streaming.
  • The code examples follow the official documentation, which remains the primary reference.

Why Use Streaming

The standard messages.create() call returns a response only after the entire output has been generated. For long outputs, users may wait anywhere from several seconds to tens of seconds. With streaming, text arrives as it is generated, which noticeably improves perceived responsiveness in chat UIs and similar interfaces.

Streaming is implemented using SSE (Server-Sent Events).
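
To make the SSE layer a little more concrete, here is a minimal sketch of pulling text out of raw event lines. It assumes the stream carries `event:` / `data:` line pairs and that text arrives in `content_block_delta` events whose `delta` has type `text_delta`, which is my understanding of the streaming format. The function name `iter_text_deltas` and the hand-written `SAMPLE` lines are hypothetical, and multi-line `data:` fields are not handled. The SDK does this parsing for you, so this is for illustration only.

```python
import json
from typing import Iterable, Iterator


def iter_text_deltas(sse_lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from raw SSE lines (illustrative, not the SDK's parser)."""
    event_name = None
    for raw in sse_lines:
        line = raw.rstrip("\n")
        if line.startswith("event:"):
            # Remember which event the following data line belongs to.
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            if event_name != "content_block_delta":
                continue  # ignore events that carry no text
            payload = json.loads(line[len("data:"):].strip())
            delta = payload.get("delta", {})
            if delta.get("type") == "text_delta":
                yield delta["text"]
        elif line == "":
            event_name = None  # a blank line ends the current event


# Hand-written, abbreviated sample (an assumption, not captured API output).
SAMPLE = [
    "event: message_start",
    '{"placeholder": "not a data line, ignored"}',
    "",
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hello"}}',
    "",
    "event: content_block_delta",
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": ", world"}}',
    "",
    "event: message_stop",
    'data: {"type": "message_stop"}',
    "",
]

print("".join(iter_text_deltas(SAMPLE)))
```

In practice stream.text_stream already yields these fragments, so hand parsing is rarely needed.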

Python — Streaming Code Example

The following is based on the code example in the official documentation.

import anthropic

client = anthropic.Anthropic()

with client.messages.stream(
    max_tokens=1024,
    messages=[{"role": "user", "content": "Tell me about the history of Japan"}],
    model="claude-opus-4-7",
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

text_stream is a generator that yields text chunks (strings) in sequence. Specifying flush=True ensures immediate output without buffering.
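
Because the chunks are plain strings, they can be displayed and accumulated in the same loop. The helper below is a small sketch of that pattern. The name `print_and_collect` is hypothetical, and a plain list stands in for the stream so the snippet runs without an API key.

```python
from typing import Iterable


def print_and_collect(chunks: Iterable[str]) -> str:
    """Print each chunk as it arrives and return the concatenated text."""
    parts = []
    for chunk in chunks:
        print(chunk, end="", flush=True)  # show the chunk immediately
        parts.append(chunk)               # keep it for later use
    print()  # finish the line once the stream ends
    return "".join(parts)


# Stand-in for stream.text_stream (an assumption for this sketch).
fake_chunks = ["Stream", "ing ", "works"]
full_text = print_and_collect(fake_chunks)
```

With the real SDK, stream.text_stream from the example above could be passed as `chunks` inside the with block.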

TypeScript — Streaming Code Example

In TypeScript, chunks are received via the .on("text", ...) event handler.

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

await client.messages
  .stream({
    messages: [{ role: "user", content: "Tell me about the history of Japan" }],
    model: "claude-opus-4-7",
    max_tokens: 1024
  })
  .on("text", (text) => {
    process.stdout.write(text);
  });

Retrieving the Final Message

If you need the complete Message object (including usage) after the stream finishes, you can retrieve it with stream.get_final_message() (Python) or stream.finalMessage() (TypeScript).

# Python — Retrieving the complete message after stream completion
# (`...` stands for the same arguments as in the earlier example)
with client.messages.stream(...) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

final = stream.get_final_message()
print(final.usage)  # Check token usage
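
As a rough illustration of what can be done with the usage data, the sketch below formats the token counts. It assumes the usage object exposes `input_tokens` and `output_tokens` attributes. The helper name `summarize_usage` is hypothetical, and a `SimpleNamespace` stub replaces the real object so the snippet runs offline.

```python
from types import SimpleNamespace


def summarize_usage(usage) -> str:
    """Format input/output token counts from a usage-like object."""
    total = usage.input_tokens + usage.output_tokens
    return f"input={usage.input_tokens} output={usage.output_tokens} total={total}"


# Stub standing in for final.usage (the values are made up for illustration).
stub_usage = SimpleNamespace(input_tokens=12, output_tokens=345)
print(summarize_usage(stub_usage))
```

Comparing this summary between a streamed and a non-streamed call with the same prompt is one way to check the point made earlier that streaming does not change how usage is counted.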

When to Use Each Approach

Method         | Best Use Cases
Non-streaming  | Batch processing, heavy post-processing, when the full response is needed
Streaming      | Chat UIs, terminal output, real-time display of long generated text
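
One way to apply this split in code is a single entry point that picks the method from a flag. The sketch below is an assumption about how an application might be organized, not a pattern from the official documentation. The function name `generate` and the `interactive` flag are hypothetical, and `client` is expected to be an anthropic.Anthropic() instance like the one created earlier.

```python
def generate(client, prompt: str, *, model: str, interactive: bool,
             max_tokens: int = 1024) -> str:
    """Return the reply text, streaming to the terminal when interactive."""
    request = dict(
        model=model,
        max_tokens=max_tokens,
        messages=[{"role": "user", "content": prompt}],
    )
    if interactive:
        # Streaming path: show text as it arrives and collect it as well.
        parts = []
        with client.messages.stream(**request) as stream:
            for text in stream.text_stream:
                print(text, end="", flush=True)
                parts.append(text)
        print()
        return "".join(parts)
    # Non-streaming path: wait for the full message, then join its text blocks.
    message = client.messages.create(**request)
    return "".join(block.text for block in message.content if block.type == "text")
```

A batch job would call this with interactive=False, while a chat front end or CLI would pass interactive=True. The returned string is the same kind of value in both cases.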
Clauder Navi Editorial Team
@clauder_navi
