Guide · Beginner · Agents · 2026-05-22

Mastering Claude API Stop Reasons: Build Robust Applications with end_turn, tool_use & max_tokens

Learn how to handle Claude API stop_reason values (end_turn, tool_use, max_tokens) to build reliable applications. Includes code examples for empty responses, tool loops, and streaming.

Quick Answer

This guide explains Claude API stop_reason values (end_turn, tool_use, max_tokens, stop_sequence) and how to handle each one in your application. You'll learn to prevent empty responses, manage tool loops, detect truncation, and build robust streaming logic.

Tags: stop_reason, Messages API, tool_use, error handling, streaming

Introduction

Every time you call the Claude Messages API, the response includes a stop_reason field. This small piece of data tells you why the model stopped generating—whether it finished naturally, requested a tool call, hit a token limit, or encountered a stop sequence. Ignoring stop_reason is like driving without paying attention to traffic lights: you might get where you're going, but you'll eventually crash.

In this guide, you'll learn:

The four stop_reason values covered in this guide and what each means
How to handle each stop reason in Python and TypeScript
How to prevent and recover from empty responses
How to build a robust tool-use loop
How to handle truncation in streaming and non-streaming modes

By the end, you'll be able to write Claude-powered applications that gracefully handle every stopping scenario.

Understanding the stop_reason Field

The stop_reason field appears in every successful Messages API response. It's not an error—it's a signal. Here's the anatomy of a response:

{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Here's the answer to your question..."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50
  }
}

The Four Stop Reasons

| Value | Meaning | When It Occurs |
| --- | --- | --- |
| `end_turn` | Claude finished naturally | After a complete response, or after tool results when Claude decides its turn is done |
| `tool_use` | Claude wants to call a tool | When the model determines it needs external data or computation |
| `max_tokens` | Claude hit the token limit | When the response was truncated due to `max_tokens` |
| `stop_sequence` | Claude encountered a custom stop sequence | When the model generates one of your specified `stop_sequences` |
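
As a rough illustration of the table above, the sketch below maps each stop_reason value to the kind of action an application would typically take. The function name next_action and the action labels are made up for this example; they are not part of the Anthropic SDK or the API.

```python
# Hypothetical helper: maps a stop_reason string to an application-level action.
# The action labels are illustrative only; they are not API values.
ACTIONS = {
    "end_turn": "show_response",
    "tool_use": "run_tools_and_continue",
    "max_tokens": "continue_or_raise_limit",
    "stop_sequence": "show_response",
}


def next_action(stop_reason: str) -> str:
    """Return the action for a stop_reason, or a safe fallback for unknown values."""
    return ACTIONS.get(stop_reason, "log_and_stop")


if __name__ == "__main__":
    for reason in ("end_turn", "tool_use", "max_tokens", "stop_sequence", "something_else"):
        print(f"{reason} -> {next_action(reason)}")
```

The fallback branch is a design choice: an unrecognized value is logged and the loop stops, rather than being treated as success.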

Handling `end_turn` — The Natural Stop

end_turn is the most common stop reason. It means Claude completed its response without needing a tool or being cut off. In most cases, you can simply display the response content.

Basic Handling (Python)

from anthropic import Anthropic
client = Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)
if response.stop_reason == "end_turn":
    # Process the complete response
    print(response.content[0].text)

Basic Handling (TypeScript)

import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Hello!' }]
});
if (response.stop_reason === 'end_turn') {
  console.log(response.content[0].text);
}
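
The examples above read response.content[0].text, which assumes the first content block is a text block. A slightly more defensive sketch joins every text block instead. The helper name extract_text is an assumption of this example; it relies only on each block exposing type and, for text blocks, text, as in the response shown earlier.

```python
from types import SimpleNamespace


def extract_text(content) -> str:
    """Concatenate the text of every text block, skipping other block types."""
    return "".join(
        block.text for block in content if getattr(block, "type", None) == "text"
    )


if __name__ == "__main__":
    # Stand-in objects shaped like SDK content blocks (no API call is made).
    fake_content = [
        SimpleNamespace(type="text", text="Hello, "),
        SimpleNamespace(type="tool_use", id="toolu_123", name="calculator", input={}),
        SimpleNamespace(type="text", text="world."),
    ]
    print(extract_text(fake_content))
```

With a real response, the call would be extract_text(response.content). An empty content list yields an empty string rather than an IndexError.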

The Empty Response Problem

Sometimes Claude returns an empty response (2-3 tokens with no content) with stop_reason: "end_turn". This typically happens in tool-use scenarios when:

You add text blocks immediately after tool results — Claude learns to expect the user to always insert text after tool results, so it ends its turn to follow the pattern.
You send Claude's completed response back without adding anything — Claude already decided it's done, so it remains done.

#### How to Prevent Empty Responses

Incorrect: Adding text after tool_result

messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {"role": "assistant", "content": [
        {
            "type": "tool_use",
            "id": "toolu_123",
            "name": "calculator",
            "input": {"operation": "add", "a": 1234, "b": 5678}
        }
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"}  # ❌ Don't do this
    ]}
]

Correct: Send tool results directly without additional text

messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {"role": "assistant", "content": [
        {
            "type": "tool_use",
            "id": "toolu_123",
            "name": "calculator",
            "input": {"operation": "add", "a": 1234, "b": 5678}
        }
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"}  # ✅ Just the result
    ]}
]
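
One way to catch the first pattern before a request goes out is a small check over the message list. The function name has_text_after_tool_result is hypothetical, and the check assumes user content uses the dict block format shown in the two examples above.

```python
def has_text_after_tool_result(messages) -> bool:
    """Return True if any user message places a text block after a tool_result block."""
    for message in messages:
        content = message.get("content")
        if message.get("role") != "user" or not isinstance(content, list):
            continue  # plain-string content cannot contain tool_result blocks
        seen_tool_result = False
        for block in content:
            block_type = block.get("type") if isinstance(block, dict) else None
            if block_type == "tool_result":
                seen_tool_result = True
            elif block_type == "text" and seen_tool_result:
                return True
    return False


if __name__ == "__main__":
    result_block = {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"}
    bad = [{"role": "user", "content": [result_block, {"type": "text", "text": "Here's the result"}]}]
    good = [{"role": "user", "content": [result_block]}]
    print(has_text_after_tool_result(bad), has_text_after_tool_result(good))
```

A check like this could run in tests or as an assertion in development builds, so the anti-pattern is caught before it reaches production traffic.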

#### Recovering from Empty Responses

If you still get empty responses after fixing the above, use a continuation prompt:

def handle_empty_response(client, messages):
    response = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=1024,
        messages=messages
    )
    
    if response.stop_reason == "end_turn" and not response.content:
        # ❌ Don't just retry with the same messages
        # Claude already decided it's done
        
        # ✅ Add a continuation prompt in a NEW user message
        messages.append({
            "role": "user",
            "content": "Please continue with your response."
        })
        
        response = client.messages.create(
            model="claude-opus-4-7",
            max_tokens=1024,
            messages=messages
        )
    
    return response

Handling `tool_use` — Building the Tool Loop

When stop_reason is "tool_use", Claude has decided it needs to call a tool. Your application must:

Execute the requested tool
Return the result as a tool_result block
Continue the conversation

Complete Tool Loop (Python)

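from anthropic import Anthropic
client = Anthropic()
def execute_tool(name, tool_input):
    # Placeholder so the example runs; replace with your real tool dispatch
    if name == "get_weather":
        return f"Sunny, 22°C in {tool_input['location']}"
    raise ValueError(f"Unknown tool: {name}")
messages = [{"role": "user", "content": "What's the weather in Tokyo?"}]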
while True:
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=messages,
        tools=[{
            "name": "get_weather",
            "description": "Get current weather for a city",
            "input_schema": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                },
                "required": ["location"]
            }
        }]
    )
    
    if response.stop_reason == "tool_use":
        # Add Claude's full turn (text and tool_use blocks) to the history
        messages.append({"role": "assistant", "content": response.content})
        
        # Execute every requested tool; Claude may request more than one per turn
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                tool_result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(tool_result)
                })
        
        # Return all results together in a single user message
        messages.append({"role": "user", "content": tool_results})
        # Loop continues...
    
    elif response.stop_reason == "end_turn":
        # Claude has finished
        print(response.content[0].text)
        break
    
    else:
        # max_tokens, stop_sequence, or an unexpected value: stop instead of looping forever
        print(f"Stopped with stop_reason={response.stop_reason}")
        break
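
A tool call can fail, and a single assistant turn can contain several tool_use blocks. The sketch below builds one tool_result block per request and marks failures with is_error so Claude can react to them. The function build_tool_results and the registry dict (tool name to Python callable) are assumptions made for this example, not SDK features.

```python
from types import SimpleNamespace


def build_tool_results(content, registry):
    """Run every tool_use block in `content` and return a list of tool_result blocks.

    `registry` is a hypothetical dict mapping tool names to Python callables.
    """
    results = []
    for block in content:
        if getattr(block, "type", None) != "tool_use":
            continue  # skip text blocks that precede or follow the tool requests
        try:
            handler = registry[block.name]
            output = handler(**block.input)
            results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": str(output),
            })
        except Exception as exc:  # report the failure to the model instead of crashing
            results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": f"Tool error: {exc}",
                "is_error": True,
            })
    return results


if __name__ == "__main__":
    # Stand-in blocks shaped like SDK content blocks (no API call is made).
    fake_content = [
        SimpleNamespace(type="tool_use", id="toolu_1", name="get_weather",
                        input={"location": "Tokyo"}),
    ]
    print(build_tool_results(fake_content, {"get_weather": lambda location: f"Sunny in {location}"}))
```

In a loop like the one above, the returned list would become the content of a single user message: messages.append({"role": "user", "content": build_tool_results(response.content, registry)}).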

Handling `max_tokens` — Detecting Truncation

When stop_reason is "max_tokens", Claude's response was cut off because it hit the max_tokens limit. This is common for long responses or complex reasoning.

What to Do

Increase max_tokens if the response is consistently truncated
Use a continuation prompt to ask Claude to finish
If extended thinking is enabled, raise max_tokens further, since thinking tokens count toward the limit

Continuation Pattern (Python)

def handle_truncation(client, messages, max_tokens=4096, max_continuations=3):
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=max_tokens,
        messages=messages
    )
    
    continuations = 0
    # Cap the number of follow-up calls so a very long answer cannot loop forever
    while response.stop_reason == "max_tokens" and continuations < max_continuations:
        # Add the partial response to messages
        messages.append({"role": "assistant", "content": response.content})
        # Ask Claude to continue
        messages.append({"role": "user", "content": "Please continue."})
        
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=max_tokens,
            messages=messages
        )
        continuations += 1
    
    # Earlier partial responses are stored in messages; this is only the last part
    return response
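
The function above returns only the final response, so the caller still has to stitch the earlier parts together. One possible variant, sketched below, collects the text as it goes and takes the API call as an injected function, which makes the logic testable without network access. The names collect_until_complete and call_model are hypothetical.

```python
from types import SimpleNamespace


def collect_until_complete(call_model, messages, max_rounds=3):
    """Call the model until it stops for a reason other than max_tokens.

    `call_model` is a hypothetical callable that takes the message list and returns
    an object with `stop_reason` and `content` (a list of blocks), like an SDK response.
    Returns (joined_text, final_stop_reason). The input list is copied, not mutated.
    """
    history = list(messages)
    parts = []
    stop_reason = None
    for _ in range(max_rounds):
        response = call_model(history)
        stop_reason = response.stop_reason
        parts.append("".join(b.text for b in response.content if b.type == "text"))
        if stop_reason != "max_tokens":
            break
        # Feed the partial answer back and ask for the rest in a new user turn.
        history.append({"role": "assistant", "content": response.content})
        history.append({"role": "user", "content": "Please continue."})
    return "".join(parts), stop_reason


if __name__ == "__main__":
    # Two canned responses stand in for real API calls.
    canned = iter([
        SimpleNamespace(stop_reason="max_tokens",
                        content=[SimpleNamespace(type="text", text="Once upon ")]),
        SimpleNamespace(stop_reason="end_turn",
                        content=[SimpleNamespace(type="text", text="a time.")]),
    ])
    print(collect_until_complete(lambda history: next(canned),
                                 [{"role": "user", "content": "Write a story"}]))
```

With the real client, call_model could be lambda history: client.messages.create(model="claude-sonnet-4-20250514", max_tokens=4096, messages=history). If the returned stop reason is still max_tokens, the round limit was reached and the text is incomplete.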

Streaming with max_tokens

When streaming, stop_reason arrives in the message_delta event near the end of the stream, just before message_stop:

stream = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=256,
    messages=[{"role": "user", "content": "Write a long story"}],
    stream=True
)
for event in stream:
    # message_delta carries the final stop_reason; message_stop only marks the end
    if event.type == "message_delta":
        if event.delta.stop_reason == "max_tokens":
            print("\n[Response was truncated: consider increasing max_tokens]")
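
A streaming consumer usually needs the text as well as the stop reason. The sketch below gathers both from an iterable of raw stream events. It assumes the raw event shapes produced with stream=True: content_block_delta events whose delta has type text_delta and a text field, and a message_delta event whose delta carries stop_reason. The function name consume_stream is hypothetical.

```python
from types import SimpleNamespace


def consume_stream(events):
    """Return (text, stop_reason) from an iterable of raw streaming events."""
    parts = []
    stop_reason = None
    for event in events:
        if event.type == "content_block_delta" and event.delta.type == "text_delta":
            parts.append(event.delta.text)
        elif event.type == "message_delta":
            # The final stop_reason is delivered here, shortly before message_stop.
            stop_reason = event.delta.stop_reason
    return "".join(parts), stop_reason


if __name__ == "__main__":
    # Stand-in events shaped like raw stream events (no API call is made).
    fake_events = [
        SimpleNamespace(type="content_block_delta",
                        delta=SimpleNamespace(type="text_delta", text="Once upon")),
        SimpleNamespace(type="message_delta",
                        delta=SimpleNamespace(stop_reason="max_tokens")),
        SimpleNamespace(type="message_stop"),
    ]
    print(consume_stream(fake_events))
```

The same function can be called on the stream object from the example above, for instance text, reason = consume_stream(stream), and the caller can then branch on reason exactly as in the non-streaming case.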

Handling `stop_sequence` — Custom Stop Conditions

When you define custom stop_sequences in your API call, Claude will stop generating as soon as it produces one of those sequences. This is useful for:

Extracting structured data (stop at </output>)
Limiting response length in a controlled way
Building chat interfaces that stop at user-like messages

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    stop_sequences=["</answer>", "\n\nHuman:"],
    messages=[
        {"role": "user", "content": "Explain quantum computing in one sentence. Wrap your answer in <answer></answer> tags."}
    ]
)
if response.stop_reason == "stop_sequence":
    print(f"Stopped at sequence: {response.stop_sequence}")
    print(response.content[0].text)
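
The matched stop sequence itself is not part of the returned text, so with a closing tag as the stop sequence the output typically still contains the opening tag and anything Claude wrote before it. A minimal sketch for pulling out the answer, assuming the model was asked to use an opening <answer> tag (the helper name extract_answer is made up for this example):

```python
def extract_answer(text: str, open_tag: str = "<answer>") -> str:
    """Return the text after the last opening tag, or the whole text if the tag is absent."""
    # rpartition returns the full string as its last element when the tag is not found.
    return text.rpartition(open_tag)[2].strip()


if __name__ == "__main__":
    sample = "Sure.\n<answer>Qubits exploit superposition to explore many states at once."
    print(extract_answer(sample))
```

Falling back to the whole text when the tag is missing keeps the helper usable even if the model ignores the formatting instruction.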

Building a Complete Stop Reason Handler

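Here's a function that handles all four stop reasons in a single loop. Treat it as a starting point: it caps the number of API calls, collects text across truncated responses, and raises on stop reasons it does not recognize.

from anthropic import Anthropic
from typing import Dict, List, Optional
client = Anthropic()
def handle_claude_response(
    messages: List[Dict],
    tools: Optional[List[Dict]] = None,
    max_tokens: int = 1024,
    max_rounds: int = 10
) -> str:
    """
    Handle all four stop reasons and return the final text response.
    Appends to `messages` in place. `execute_tool` is your own tool dispatcher.
    """
    # Only pass `tools` to the API when some are defined
    extra = {"tools": tools} if tools else {}
    parts = []  # text collected from truncated responses
    for _ in range(max_rounds):
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=max_tokens,
            messages=messages,
            **extra
        )
        # Join every text block; content[0] is not guaranteed to be text
        text = "".join(b.text for b in response.content if b.type == "text")
        
        if response.stop_reason == "end_turn":
            # Check for empty response
            if not text:
                messages.append({
                    "role": "user",
                    "content": "Please continue."
                })
                continue
            return "".join(parts) + text
        
        elif response.stop_reason == "tool_use":
            # Execute every requested tool and return all results in one user message
            messages.append({"role": "assistant", "content": response.content})
            results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = execute_tool(block.name, block.input)
                    results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": str(result)
                    })
            messages.append({"role": "user", "content": results})
        
        elif response.stop_reason == "max_tokens":
            # Keep the partial text, then ask Claude to continue
            parts.append(text)
            messages.append({"role": "assistant", "content": response.content})
            messages.append({"role": "user", "content": "Please continue."})
        
        elif response.stop_reason == "stop_sequence":
            return "".join(parts) + text
        
        else:
            # Fail loudly instead of resending the same request forever
            raise RuntimeError(f"Unhandled stop_reason: {response.stop_reason}")
    
    raise RuntimeError(f"No final response after {max_rounds} API calls")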

Best Practices Summary

Always check stop_reason — Don't assume end_turn means success. It could be an empty response.
Never add text after tool_result — This causes empty responses. Send only the result.
Use continuation prompts for truncation — When max_tokens stops the response, ask Claude to continue.
Build a loop for tool_use — Keep calling the API until you get end_turn.
Log stop_reason in production — It's invaluable for debugging unexpected behavior.
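
As a small sketch of the logging practice, the helper below writes one line per response using Python's standard logging module. The logger name, the line format, and the function name log_stop_reason are all assumptions of this example; it reads only the id, stop_reason, and usage.output_tokens fields shown in the response anatomy earlier.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger("claude.stop_reason")


def log_stop_reason(response) -> str:
    """Log one line per response and return the formatted line."""
    line = (
        f"id={response.id} stop_reason={response.stop_reason} "
        f"output_tokens={response.usage.output_tokens}"
    )
    # Truncation is worth a warning; everything else is informational.
    level = logging.WARNING if response.stop_reason == "max_tokens" else logging.INFO
    logger.log(level, line)
    return line


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # Stand-in object shaped like an SDK response (no API call is made).
    fake = SimpleNamespace(id="msg_01234", stop_reason="max_tokens",
                           usage=SimpleNamespace(input_tokens=100, output_tokens=50))
    log_stop_reason(fake)
```

Logging truncation at warning level makes it easy to spot requests whose max_tokens setting is too low.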

Key Takeaways

This guide covers four stop reasons: end_turn (natural), tool_use (needs tool), max_tokens (truncated), stop_sequence (custom stop). The API can return other values as well, so treat unrecognized ones defensively.
Empty responses with end_turn are caused by adding text after tool_result blocks or sending back Claude's own response unchanged. Fix by sending only tool results and using continuation prompts.
Tool loops require explicit handling: When stop_reason is tool_use, execute the tool, return the result, and continue the conversation until end_turn.
Truncation from max_tokens can be handled by appending the partial response and asking Claude to continue in a new user message.
Streaming applications should read stop_reason from the message_delta event, which arrives just before message_stop, to detect truncation or tool requests.