Module 2 - Lesson 2e: Streaming Responses with Anthropic

Real-time output with streaming responses using Anthropic Claude.

Published: 1/10/2026

Learn how to stream Claude's responses in real time for a better user experience. Anthropic's streaming uses event handlers, which differs from OpenAI's async-iterator approach.

Key Difference from OpenAI

OpenAI: Uses async iterator with for await

const stream = await openai.responses.stream({
  model: "gpt-5-nano",
  input: "Hello"
});

for await (const chunk of stream) {
  process.stdout.write(chunk.delta || '');
}

Anthropic: Uses .stream() method with event handlers

const response = anthropic.messages
  .stream({ ... })
  .on("text", (text) => {
    process.stdout.write(text);
  });

Code Example

Create src/anthropic/stream-prompt.ts:

import Anthropic from "@anthropic-ai/sdk";
import dotenv from "dotenv";

// Load environment variables
dotenv.config();

// Create Anthropic client
const anthropic = new Anthropic();

// Async function with proper return type
async function streamPrompt(): Promise<void> {
  try {
    console.log("Testing Anthropic connection...");

    // .stream() returns a MessageStream immediately; the "text" handler
    // fires for each delta as it arrives
    // Using a system prompt along with the user prompt
    console.log("---------Streaming event data start-------");
    const stream = anthropic.messages
      .stream({
        model: "claude-haiku-4-5",
        max_tokens: 1000,
        system:
          "You are a helpful travel assistant. Provide detailed travel suggestions based on user preferences and give a guide to the destination and include distance from the airport.",
        messages: [
          {
            role: "user",
            content:
              "Suggest a travel destination within Europe where there is a Christmas market that is famous but is not in a big city. I would like to go somewhere that is less than 2 hours from a major airport and has good public transport links.",
          },
        ],
      })
      .on("text", (text) => {
        process.stdout.write(text);
      });

    // finalMessage() resolves only when the stream completes, so the end
    // marker prints after all streamed text
    const finalResponse = await stream.finalMessage();
    console.log("\n\n---------Streaming event data end-------");

    const usageInfo = finalResponse.usage;

    console.log("✅ Stream Prompt Success!");

    // Extract text from response
    const textBlocks = finalResponse.content.filter(
      (block) => block.type === "text"
    );

    if (textBlocks.length === 0) {
      throw new Error("No text content in response");
    }

    console.log(
      "AI Final Response:",
      textBlocks.map((block) => block.text).join("\n")
    );
    console.log("Tokens used:");
    console.dir(usageInfo, { depth: null });
    console.log("✅ Stream Prompt Completed!");
  } catch (error) {
    // Proper error handling with type guards
    if (error instanceof Anthropic.APIError) {
      console.log("❌ API Error:", error.status, error.message);
    } else if (error instanceof Error) {
      console.log("❌ Error:", error.message);
    } else {
      console.log("❌ Unknown error occurred");
    }
  }
}

// Run the test
streamPrompt().catch((error) => {
  console.error("Error:", error);
});

Run It

pnpm tsx src/anthropic/stream-prompt.ts

You'll see the response appear word by word in real time!


Understanding Anthropic Streaming

Event Handlers

The MessageStream helper emits several event types. text, error, contentBlock, and finalMessage are the most useful; the raw API events (message_start, content_block_start, content_block_delta, and so on) pass through the streamEvent handler:

const stream = anthropic.messages
  .stream({ ... })
  .on("text", (text) => {
    // Text delta - most common event
    process.stdout.write(text);
  })
  .on("streamEvent", (event) => {
    // Raw API events pass through here
    if (event.type === "message_start") {
      console.log("Message ID:", event.message.id);
    } else if (event.type === "content_block_start") {
      console.log("Block started:", event.content_block.type);
    }
  })
  .on("contentBlock", (block) => {
    // A content block finished streaming
    console.log("\nBlock completed:", block.type);
  })
  .on("finalMessage", (message) => {
    // The complete message is available
    console.log("\nMessage finished:", message.stop_reason);
  })
  .on("error", (error) => {
    // Error occurred
    console.error("Stream error:", error);
  });

Most Common Pattern

For most use cases, you only need the text event:

const stream = anthropic.messages
  .stream({
    model: "claude-haiku-4-5",
    max_tokens: 1000,
    messages: [{ role: "user", content: "Tell me a story" }]
  })
  .on("text", (text) => {
    process.stdout.write(text);
  });

// Wait for completion and get final message
const finalMessage = await stream.finalMessage();
console.log("\n\nTokens used:", finalMessage.usage);

Getting the Final Message

After streaming, get the complete response:

const stream = anthropic.messages
  .stream({ ... })
  .on("text", (text) => {
    process.stdout.write(text);
  });

// Get the complete final message
const finalMessage = await stream.finalMessage();

// content is a union of block types, so narrow before reading .text
const firstBlock = finalMessage.content[0];
if (firstBlock.type === "text") {
  console.log("\nFull response:", firstBlock.text);
}
console.log("Tokens:", finalMessage.usage);
console.log("Stop reason:", finalMessage.stop_reason);

Practical Examples

1. Chat Interface

async function chatWithStreaming(userMessage: string) {
  let fullResponse = "";

  const stream = anthropic.messages
    .stream({
      model: "claude-haiku-4-5",
      max_tokens: 1000,
      messages: [{ role: "user", content: userMessage }]
    })
    .on("text", (text) => {
      fullResponse += text;
      process.stdout.write(text);  // Display in real-time
    });

  await stream.finalMessage();
  return fullResponse;
}

// Usage
const response = await chatWithStreaming("What's the weather like?");
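A chat interface usually also needs to carry conversation history between turns. A minimal sketch, assuming the SDK's { role, content } message shape (the ChatMessage type and appendTurn helper here are hypothetical, not part of the SDK):

```typescript
// Minimal message shape, matching the { role, content } objects
// passed to anthropic.messages.stream()
type ChatMessage = { role: "user" | "assistant"; content: string };

// Append one completed turn (user prompt + streamed reply) to the history.
// Returns a new array so earlier snapshots stay untouched.
function appendTurn(
  history: ChatMessage[],
  userMessage: string,
  assistantReply: string
): ChatMessage[] {
  return [
    ...history,
    { role: "user", content: userMessage },
    { role: "assistant", content: assistantReply },
  ];
}

// Usage with the streaming chat function above:
//   let history: ChatMessage[] = [];
//   const reply = await chatWithStreaming(userMessage);
//   history = appendTurn(history, userMessage, reply);
//   // pass `history` as `messages` on the next stream() call
```

Keeping history immutable makes it easy to retry a failed turn: just reuse the previous snapshot instead of unwinding mutations.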

2. Progress Indicator

async function streamWithProgress(prompt: string) {
  let wordCount = 0;

  const stream = anthropic.messages
    .stream({
      model: "claude-haiku-4-5",
      max_tokens: 1000,
      messages: [{ role: "user", content: prompt }]
    })
    .on("text", (text) => {
      // Rough count: a delta can split a word, so this overestimates
      wordCount += text.split(/\s+/).length;
      process.stdout.write(text);

      // Show progress roughly every 10 words
      if (wordCount % 10 === 0) {
        console.log(`\n[${wordCount} words generated]`);
      }
    });

  await stream.finalMessage();
}
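One caveat: a delta can end mid-word, so counting fragments per chunk overstates the total. Recounting the accumulated text gives an accurate figure; a small pure helper (hypothetical, not part of the SDK):

```typescript
// Count whole words in the text accumulated so far.
// Recounting the full buffer avoids double-counting words that
// were split across two streamed deltas.
function countWords(accumulated: string): number {
  const words = accumulated.trim().split(/\s+/);
  return words[0] === "" ? 0 : words.length;
}

// In the "text" handler, accumulate first, then recount:
//   buffer += text;
//   wordCount = countWords(buffer);
```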

3. Accumulate Response

async function streamAndAccumulate(prompt: string) {
  const chunks: string[] = [];

  const stream = anthropic.messages
    .stream({
      model: "claude-haiku-4-5",
      max_tokens: 1000,
      messages: [{ role: "user", content: prompt }]
    })
    .on("text", (text) => {
      chunks.push(text);
      process.stdout.write(text);
    });

  const finalMessage = await stream.finalMessage();

  return {
    chunks,  // Individual pieces
    full: chunks.join(""),  // Complete text
    usage: finalMessage.usage
  };
}

Error Handling with Streams

Always handle streaming errors:

async function streamWithErrorHandling() {
  try {
    const stream = anthropic.messages
      .stream({
        model: "claude-haiku-4-5",
        max_tokens: 1000,
        messages: [{ role: "user", content: "Hello" }]
      })
      .on("text", (text) => {
        process.stdout.write(text);
      })
      .on("error", (error) => {
        // Log as soon as the stream reports a failure. Don't throw here:
        // the handler runs outside this try/catch
        console.error("❌ Stream error:", error);
      });

    // A failed stream rejects finalMessage(), so errors land in the catch
    await stream.finalMessage();
  } catch (error) {
    if (error instanceof Anthropic.APIError) {
      console.log("API Error:", error.status, error.message);
    } else {
      console.log("Error:", error);
    }
  }
}

OpenAI vs Anthropic Streaming

OpenAI Approach

const stream = await openai.responses.stream({
  model: "gpt-5-nano",
  input: "Hello"
});

for await (const chunk of stream) {
  const content = chunk.delta || "";
  process.stdout.write(content);
}

Anthropic Approach

const stream = anthropic.messages
  .stream({
    model: "claude-haiku-4-5",
    max_tokens: 1000,
    messages: [{ role: "user", content: "Hello" }]
  })
  .on("text", (text) => {
    process.stdout.write(text);
  });

await stream.finalMessage();

Key Differences

Feature          OpenAI                    Anthropic
-------          ------                    ---------
Method           responses.stream()        messages.stream()
Iteration        for await loop            Event handlers
Events           Single delta stream       Multiple event types
Final message    Last chunk                .finalMessage()
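The control-flow difference can be seen without either SDK. The toy stream below is illustrative only (it mimics neither provider's wire format): the same chunks are consumed once by pulling with for await (OpenAI-style) and once by pushing into a registered handler (Anthropic-style).

```typescript
// A toy token stream standing in for a model response
async function* toyStream(): AsyncGenerator<string> {
  for (const token of ["Hel", "lo ", "wor", "ld"]) {
    yield token;
  }
}

// OpenAI-style: the caller pulls chunks with a for await loop
async function consumeWithIterator(): Promise<string> {
  let out = "";
  for await (const chunk of toyStream()) {
    out += chunk;
  }
  return out;
}

// Anthropic-style: the caller registers a handler; the stream object
// pushes chunks into it and resolves when finished
async function consumeWithHandler(
  onText: (text: string) => void
): Promise<string> {
  let out = "";
  for await (const chunk of toyStream()) {
    onText(chunk); // push to the registered handler
    out += chunk;
  }
  return out; // analogous to awaiting finalMessage()
}
```

Both produce the same text; the difference is who drives the loop: your code pulls with the iterator, while the stream object pushes into the handler and signals completion (the role finalMessage() plays in the SDK).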

When to Use Streaming

✅ Use Streaming For:

  • Chat interfaces - Show responses as they generate
  • Long responses - Improve perceived performance
  • Real-time UIs - Interactive applications
  • User engagement - Keep users engaged during generation

❌ Don't Use Streaming For:

  • Batch processing - No user watching
  • API responses - Wait for complete response
  • Complex parsing - Easier with full text
  • Caching - Cache complete responses

Best Practices

  1. Always await finalMessage()

    const stream = anthropic.messages.stream({ ... });
    const final = await stream.finalMessage();  // Don't forget!
    
  2. Handle errors in stream

    .on("error", (error) => {
      console.error("Stream error:", error);
    })
    
  3. Show loading indicators

    console.log("🤔 Thinking...");
    const stream = anthropic.messages.stream({ ... });
    
  4. Consider user experience

    • Don't stream too slowly (chunking)
    • Show completion indicators
    • Allow cancellation if needed
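The cancellation point can be sketched without the SDK: check an AbortSignal between chunks and stop pulling. (The real MessageStream has its own cancellation support, e.g. an abort() method; verify the exact API against your SDK version.)

```typescript
// Toy token stream standing in for a model response
async function* toyTokens(): AsyncGenerator<string> {
  for (const token of ["one ", "two ", "three "]) {
    yield token;
  }
}

// Consume tokens until the caller's AbortSignal fires
async function consumeUntilAborted(signal: AbortSignal): Promise<string> {
  let out = "";
  for await (const token of toyTokens()) {
    if (signal.aborted) break; // stop pulling further tokens
    out += token;
  }
  return out;
}

// Usage: wire the controller to a "Stop generating" button or a timeout
//   const controller = new AbortController();
//   const partial = await consumeUntilAborted(controller.signal);
```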

Key Takeaways

  • ✅ Use .stream() method with event handlers
  • ✅ Most common: .on("text", callback)
  • ✅ Always await .finalMessage() for completion
  • ✅ Different from OpenAI's async iterator approach
  • ✅ Great for chat interfaces and long responses

Next Steps

Learn how to get structured, validated JSON output from Claude!

Next: Lesson 2f - Structured Output →


Quick Reference

// Basic streaming
const stream = anthropic.messages
  .stream({
    model: "claude-haiku-4-5",
    max_tokens: 1000,
    messages: [{ role: "user", content: "Hello" }]
  })
  .on("text", (text) => {
    process.stdout.write(text);
  });

const final = await stream.finalMessage();