BeClaude
Guide · Beginner · Agents · 2026-05-22

Mastering Claude API Stop Reasons: Build Robust Applications with end_turn, tool_use & max_tokens

Learn how to handle Claude API stop_reason values (end_turn, tool_use, max_tokens) to build reliable applications. Includes code examples for empty responses, tool loops, and streaming.

Quick Answer

This guide explains Claude API stop_reason values (end_turn, tool_use, max_tokens, stop_sequence) and how to handle each one in your application. You'll learn to prevent empty responses, manage tool loops, detect truncation, and build robust streaming logic.

Tags: stop_reason, Messages API, tool_use, error handling, streaming

Introduction

Every time you call the Claude Messages API, the response includes a stop_reason field. This small piece of data tells you why the model stopped generating—whether it finished naturally, requested a tool call, hit a token limit, or encountered a stop sequence. Ignoring stop_reason is like driving without paying attention to traffic lights: you might get where you're going, but you'll eventually crash.

In this guide, you'll learn:

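  • The four most common stop_reason values and what each means (the API also defines less common values, such as pause_turn and refusal, which this guide does not cover)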
  • How to handle each stop reason in Python and TypeScript
  • How to prevent and recover from empty responses
  • How to build a robust tool-use loop
  • How to handle truncation in streaming and non-streaming modes

By the end, you'll be able to write Claude-powered applications that gracefully handle every stopping scenario.

Understanding the stop_reason Field

The stop_reason field appears in every successful Messages API response. It's not an error—it's a signal. Here's the anatomy of a response:

{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Here's the answer to your question..."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50
  }
}

The Four Stop Reasons

| Value | Meaning | When It Occurs |
| --- | --- | --- |
| end_turn | Claude finished naturally | After a complete response, or after tool results when Claude decides its turn is done |
| tool_use | Claude wants to call a tool | When the model determines it needs external data or computation |
| max_tokens | Claude hit the token limit | When the response was truncated due to max_tokens |
| stop_sequence | Claude encountered a custom stop sequence | When the model generates one of your specified stop_sequences |
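
One way to act on these values is a small dispatcher that maps each stop_reason to the next step in your application. The sketch below is illustrative only: the function name next_action and the action labels are made up for this example, and it assumes the response has already been parsed into a plain dict like the JSON shown earlier.

```python
import json

# Hypothetical action labels for illustration; the API only defines stop_reason.
ACTIONS = {
    "end_turn": "show_response",
    "tool_use": "run_tools",
    "max_tokens": "handle_truncation",
    "stop_sequence": "post_process",
}


def next_action(response: dict) -> str:
    """Map a parsed Messages API response to an application-level action."""
    reason = response.get("stop_reason")
    # Fall back to a safe default for any value not listed above.
    return ACTIONS.get(reason, "log_and_inspect")


raw = '{"content": [{"type": "text", "text": "Hi"}], "stop_reason": "end_turn", "stop_sequence": null}'
print(next_action(json.loads(raw)))
```

Keeping a default branch means an unfamiliar stop_reason is surfaced for inspection instead of being treated as a normal completion.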

Handling end_turn — The Natural Stop

end_turn is the most common stop reason. It means Claude completed its response without needing a tool or being cut off. In most cases, you can simply display the response content.

Basic Handling (Python)

from anthropic import Anthropic

client = Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)

if response.stop_reason == "end_turn":
    # Process the complete response
    print(response.content[0].text)

Basic Handling (TypeScript)

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();
const response = await client.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Hello!' }]
});

if (response.stop_reason === 'end_turn') {
  // content is a union of block types, so narrow to a text block first
  const first = response.content[0];
  if (first?.type === 'text') {
    console.log(first.text);
  }
}

The Empty Response Problem

Sometimes Claude returns an empty response (2-3 tokens with no content) with stop_reason: "end_turn". This typically happens in tool-use scenarios when:

  • You add text blocks immediately after tool results — Claude learns to expect the user to always insert text after tool results, so it ends its turn to follow the pattern.
  • You send Claude's completed response back without adding anything — Claude already decided it's done, so it remains done.

#### How to Prevent Empty Responses

Incorrect: Adding text after tool_result

messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {"role": "assistant", "content": [
        {
            "type": "tool_use",
            "id": "toolu_123",
            "name": "calculator",
            "input": {"operation": "add", "a": 1234, "b": 5678}
        }
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"}  # ❌ Don't do this
    ]}
]

Correct: Send tool results directly without additional text

messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {"role": "assistant", "content": [
        {
            "type": "tool_use",
            "id": "toolu_123",
            "name": "calculator",
            "input": {"operation": "add", "a": 1234, "b": 5678}
        }
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"}  # ✅ Just the result
    ]}
]
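
To catch the incorrect pattern before a request goes out, a small check over the message list can help. This is a sketch under stated assumptions: find_text_after_tool_result is a hypothetical helper name, and it assumes user-message blocks are plain dicts as in the examples above.

```python
def find_text_after_tool_result(messages: list) -> list:
    """Return indexes of user messages where a text block follows a tool_result."""
    flagged = []
    for index, message in enumerate(messages):
        content = message.get("content")
        # Plain-string content cannot contain tool_result blocks.
        if message.get("role") != "user" or not isinstance(content, list):
            continue
        seen_tool_result = False
        for block in content:
            if block.get("type") == "tool_result":
                seen_tool_result = True
            elif block.get("type") == "text" and seen_tool_result:
                flagged.append(index)
                break
    return flagged
```

Calling it on the message list before each request and logging any flagged indexes makes the pattern visible during development.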

#### Recovering from Empty Responses

If you still get empty responses after fixing the above, use a continuation prompt:

def handle_empty_response(client, messages):
    response = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=1024,
        messages=messages
    )
    
    if response.stop_reason == "end_turn" and not response.content:
        # ❌ Don't just retry with the same messages
        # Claude already decided it's done
        
        # ✅ Add a continuation prompt in a NEW user message
        messages.append({
            "role": "user",
            "content": "Please continue with your response."
        })
        
        response = client.messages.create(
            model="claude-opus-4-7",
            max_tokens=1024,
            messages=messages
        )
    
    return response
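
The check above only catches a completely empty content list. If your application also wants to treat whitespace-only text as empty, a slightly broader test could look like the sketch below. That definition of "empty" is an assumption for illustration, not an API rule, and is_effectively_empty is a hypothetical helper. Blocks are modeled as objects with .type and .text attributes, matching how the Python examples in this guide read them.

```python
from types import SimpleNamespace


def is_effectively_empty(content) -> bool:
    """True when no block carries visible text or a tool request."""
    for block in content:
        if block.type == "tool_use":
            return False
        if block.type == "text" and block.text.strip():
            return False
    return True


# Stand-in blocks for demonstration; real responses come from the SDK.
whitespace_only = [SimpleNamespace(type="text", text="  \n")]
print(is_effectively_empty(whitespace_only))
```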

Handling tool_use — Building the Tool Loop

When stop_reason is "tool_use", Claude has decided it needs to call a tool. Your application must:

  • Execute the requested tool
  • Return the result as a tool_result block
  • Continue the conversation

Complete Tool Loop (Python)

from anthropic import Anthropic

client = Anthropic()
messages = [{"role": "user", "content": "What's the weather in Tokyo?"}]

tools = [{
    "name": "get_weather",
    "description": "Get current weather for a city",
    "input_schema": {
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"]
    }
}]

def execute_tool(name, tool_input):
    # Placeholder: replace with a call to your own tool implementation
    raise NotImplementedError(f"No implementation for tool: {name}")

while True:
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=messages,
        tools=tools
    )

    if response.stop_reason == "tool_use":
        # Add the assistant turn, then answer every tool_use block in one user message
        messages.append({"role": "assistant", "content": response.content})
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                # Execute the tool (in real code, call your API)
                tool_result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(tool_result)
                })
        messages.append({"role": "user", "content": tool_results})
        # Loop continues...
    elif response.stop_reason == "end_turn":
        # Claude has finished
        print(response.content[0].text)
        break
    else:
        # max_tokens or stop_sequence: stop here instead of looping forever
        print(f"Stopped early: {response.stop_reason}")
        break
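
Tools can fail, and the loop still needs to return a tool_result for the request so the conversation can continue. The sketch below wraps a tool call and marks failures with the is_error field of the tool_result block. The registry dict and the run_tool_safely name are hypothetical, introduced only for this example.

```python
def run_tool_safely(tool_use_id: str, name: str, tool_input: dict, registry: dict) -> dict:
    """Run one tool and always return a tool_result block, even on failure."""
    try:
        handler = registry[name]  # raises KeyError if the tool is unknown
        output = handler(**tool_input)
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": str(output),
        }
    except Exception as exc:
        # Report the failure back to Claude instead of crashing the loop.
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": f"{type(exc).__name__}: {exc}",
            "is_error": True,
        }
```

Inside the tool loop, this would replace the direct execute_tool call, with block.id, block.name, and block.input passed in from each tool_use block.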

Handling max_tokens — Detecting Truncation

When stop_reason is "max_tokens", Claude's response was cut off because it hit the max_tokens limit. This is common for long responses or complex reasoning.

What to Do

  • Increase max_tokens if the response is consistently truncated
  • Use a continuation prompt to ask Claude to finish
  • If you use extended thinking, budget for it: thinking tokens count toward max_tokens, so complex tasks need a higher limit

Continuation Pattern (Python)

def handle_truncation(client, messages, max_tokens=4096):
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=max_tokens,
        messages=messages
    )
    
    while response.stop_reason == "max_tokens":
        # Add the partial response to messages
        messages.append({"role": "assistant", "content": response.content})
        # Ask Claude to continue
        messages.append({"role": "user", "content": "Please continue."})
        
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=max_tokens,
            messages=messages
        )
    
    return response

Streaming with max_tokens

When streaming, stop_reason is null in the initial message_start event and arrives in the message_delta event near the end of the stream, so check it there to detect truncation:

stream = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=256,
    messages=[{"role": "user", "content": "Write a long story"}],
    stream=True
)

for event in stream:
    if event.type == "message_delta":
        if event.delta.stop_reason == "max_tokens":
            print("\n[Response was truncated; consider increasing max_tokens]")
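
In the raw event stream, text arrives in content_block_delta events and the stop_reason is delivered on the message_delta event. A minimal sketch of collecting both in one pass follows. The helper name consume_stream is hypothetical, and the demo events are hand-built stand-ins for the SDK's event objects, so the example runs without a network call.

```python
from types import SimpleNamespace


def consume_stream(events):
    """Collect streamed text and the final stop_reason from raw stream events."""
    chunks, stop_reason = [], None
    for event in events:
        if event.type == "content_block_delta" and event.delta.type == "text_delta":
            chunks.append(event.delta.text)
        elif event.type == "message_delta":
            stop_reason = event.delta.stop_reason
    return "".join(chunks), stop_reason


# Hand-built events for demonstration only.
demo_events = [
    SimpleNamespace(type="message_start"),
    SimpleNamespace(type="content_block_delta",
                    delta=SimpleNamespace(type="text_delta", text="Once upon")),
    SimpleNamespace(type="content_block_delta",
                    delta=SimpleNamespace(type="text_delta", text=" a time")),
    SimpleNamespace(type="message_delta",
                    delta=SimpleNamespace(stop_reason="max_tokens")),
    SimpleNamespace(type="message_stop"),
]
text, stop_reason = consume_stream(demo_events)
print(repr(text), stop_reason)
```

With a real stream, the same function can be passed the iterator returned by the streaming call, and the returned stop_reason then drives the same handling as in non-streaming mode.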

Handling stop_sequence — Custom Stop Conditions

When you define custom stop_sequences in your API call, Claude will stop generating as soon as it produces one of those sequences. This is useful for:

  • Extracting structured data (stop at </output>)
  • Limiting response length in a controlled way
  • Building chat interfaces that stop at user-like messages

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    stop_sequences=["</answer>", "\n\nHuman:"],
    messages=[
        {"role": "user", "content": "Explain quantum computing in one sentence. Wrap your answer in <answer></answer> tags."}
    ]
)

if response.stop_reason == "stop_sequence":
    print(f"Stopped at sequence: {response.stop_sequence}")
    print(response.content[0].text)
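
Because generation halts at the stop sequence, the matched sequence itself is not part of the returned text. For the structured-extraction use case, that means the closing tag is missing and only the opening tag needs to be located. The sketch below assumes the model was asked to wrap its answer in <answer> tags; extract_answer is a hypothetical helper name.

```python
from typing import Optional


def extract_answer(text: str, stop_reason: str, stop_sequence: Optional[str],
                   open_tag: str = "<answer>") -> Optional[str]:
    """Return the text after the opening tag, or None if the tag was never closed."""
    # Only trust the extraction when generation stopped at the closing tag.
    if stop_reason != "stop_sequence" or stop_sequence != "</answer>":
        return None
    start = text.find(open_tag)
    if start == -1:
        return None
    return text[start + len(open_tag):].strip()


print(extract_answer("Sure. <answer>Qubits can hold superpositions.",
                     "stop_sequence", "</answer>"))
```

With an SDK response, the arguments would come from response.content[0].text, response.stop_reason, and response.stop_sequence.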

Building a Complete Stop Reason Handler

Here's a function that handles all four stop reasons in one loop. Treat it as a starting point: it assumes an execute_tool dispatcher that you define yourself.

from anthropic import Anthropic
from typing import Dict, List, Optional

client = Anthropic()

def handle_claude_response(
    messages: List[Dict],
    tools: Optional[List[Dict]] = None,
    max_tokens: int = 1024
) -> str:
    """
    Handle all stop reasons and return the final text response.
    """
    # Only pass tools when provided, rather than sending tools=None
    extra = {"tools": tools} if tools else {}

    while True:
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=max_tokens,
            messages=messages,
            **extra
        )

        if response.stop_reason == "end_turn":
            # Check for empty response
            if not response.content:
                messages.append({
                    "role": "user",
                    "content": "Please continue."
                })
                continue
            return response.content[0].text

        elif response.stop_reason == "tool_use":
            # Execute tools and continue
            messages.append({"role": "assistant", "content": response.content})
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    # execute_tool is your own dispatcher, defined elsewhere
                    result = execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": str(result)
                    })
            # All results for one assistant turn go in a single user message
            messages.append({"role": "user", "content": tool_results})

        elif response.stop_reason == "max_tokens":
            messages.append({"role": "assistant", "content": response.content})
            messages.append({"role": "user", "content": "Please continue."})

        elif response.stop_reason == "stop_sequence":
            return response.content[0].text

        else:
            # Unknown stop reason: fail loudly rather than loop forever
            raise RuntimeError(f"Unhandled stop_reason: {response.stop_reason}")

Best Practices Summary

  • Always check stop_reason — Don't assume end_turn means success. It could be an empty response.
  • Avoid adding text after tool_result: this can cause empty responses. Send only the result.
  • Use continuation prompts for truncation — When max_tokens stops the response, ask Claude to continue.
  • Build a loop for tool_use — Keep calling the API until you get end_turn.
  • Log stop_reason in production — It's invaluable for debugging unexpected behavior.
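
The logging practice can be as small as one helper called after every API response. The sketch below uses the standard logging module and a running tally. The logger name, the record_stop_reason function, and the choice to log max_tokens at warning level are all assumptions made for illustration.

```python
import logging
from collections import Counter

logger = logging.getLogger("claude.stop_reason")
stop_reason_counts = Counter()


def record_stop_reason(response_id: str, stop_reason: str, output_tokens: int) -> None:
    """Log one response's stop_reason and keep a running tally."""
    stop_reason_counts[stop_reason] += 1
    # Truncation usually deserves attention, so raise its log level.
    level = logging.WARNING if stop_reason == "max_tokens" else logging.INFO
    logger.log(level, "response=%s stop_reason=%s output_tokens=%d",
               response_id, stop_reason, output_tokens)
```

With an SDK response, the arguments would be response.id, response.stop_reason, and response.usage.output_tokens. A rising share of max_tokens in the tally is an early hint that the limit is set too low.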

Key Takeaways

  • Four stop reasons cover most responses: end_turn (natural), tool_use (needs tool), max_tokens (truncated), stop_sequence (custom stop).
  • Empty responses with end_turn are typically caused by adding text after tool_result blocks or sending back Claude's own response unchanged. Fix by sending only tool results and using continuation prompts.
  • Tool loops require explicit handling: When stop_reason is tool_use, execute the tool, return the result, and continue the conversation until end_turn.
  • Truncation from max_tokens can be handled by appending the partial response and asking Claude to continue in a new user message.
  • Streaming applications should read stop_reason from the message_delta event near the end of the stream to detect truncation or tool requests.