BeClaude
Guide · Beginner · Agents · 2026-05-18

Mastering Claude API Stop Reasons: Build Robust Applications with end_turn, max_tokens & tool_use

Learn how to handle Claude API stop_reason values like end_turn, max_tokens, and tool_use. Includes code examples, empty response fixes, and best practices for production apps.

Quick Answer

This guide explains Claude API stop_reason values (end_turn, max_tokens, tool_use, stop_sequence) and how to handle each in your code. You'll learn to detect empty responses, recover from max_tokens truncation, and properly chain tool calls.

Tags: stop_reason, Claude API, error handling, tool use, Messages API

Mastering Claude API Stop Reasons: Build Robust Applications

When you call the Claude Messages API, every successful response includes a stop_reason field. This tiny piece of data tells you why Claude stopped generating—and understanding it is the difference between a brittle prototype and a production-ready application.

In this guide, you'll learn:

  • What each stop_reason value means
  • How to handle them in Python and TypeScript
  • How to prevent and recover from empty responses
  • Best practices for tool-using agents

What Is stop_reason?

The stop_reason field is part of every successful Messages API response. Unlike error codes (which indicate failures), stop_reason tells you why Claude successfully completed its response generation.

Here's a typical response structure:

{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Here's the answer to your question..."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50
  }
}
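
As a minimal sketch of reading these fields, the snippet below parses the JSON above as a plain dictionary (rather than an SDK response object) and pulls out the stop reason and text. The helper name `summarize_response` is an illustrative assumption, not part of any SDK.

```python
import json

# The example payload from above, kept as raw JSON for illustration.
RAW = """
{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [{"type": "text", "text": "Here's the answer to your question..."}],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {"input_tokens": 100, "output_tokens": 50}
}
"""


def summarize_response(payload: dict) -> dict:
    """Collect the fields most handlers care about (hypothetical helper)."""
    # Concatenate only the text blocks; other block types are skipped.
    text = "".join(
        block.get("text", "")
        for block in payload.get("content", [])
        if block.get("type") == "text"
    )
    return {
        "stop_reason": payload.get("stop_reason"),
        "stop_sequence": payload.get("stop_sequence"),
        "text": text,
        "output_tokens": payload.get("usage", {}).get("output_tokens"),
    }


summary = summarize_response(json.loads(RAW))
print(summary["stop_reason"], summary["output_tokens"])
```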

The Four Stop Reason Values

1. end_turn – Natural Completion

This is the most common stop reason. Claude finished its response naturally—it said everything it wanted to say and handed control back to you.

How to handle it:
from anthropic import Anthropic

client = Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)

if response.stop_reason == "end_turn":
    # Process the complete response
    print(response.content[0].text)

⚠️ Empty responses with end_turn

Sometimes Claude returns an empty response (2–3 tokens with no content) with stop_reason: "end_turn". This typically happens when Claude judges that the assistant turn is already complete, most often right after tool results.

Common causes:
  • Adding text blocks immediately after tool_result blocks
  • Sending Claude's completed response back without adding anything new
How to prevent empty responses:
# INCORRECT: Adding text after tool_result
messages = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "assistant", "content": [{
        "type": "tool_use",
        "id": "toolu_123",
        "name": "calculator",
        "input": {"operation": "add", "a": 1234, "b": 5678}
    }]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"}  # ❌ Don't do this
    ]}
]

# CORRECT: Send tool results directly
messages = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "assistant", "content": [{
        "type": "tool_use",
        "id": "toolu_123",
        "name": "calculator",
        "input": {"operation": "add", "a": 1234, "b": 5678}
    }]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"}  # ✅ No extra text
    ]}
]
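
One way to catch this pattern before a request goes out is a small pre-flight check. The sketch below assumes messages are plain dictionaries shaped like the examples above; `find_text_after_tool_result` is a hypothetical helper name, not an SDK function.

```python
def find_text_after_tool_result(messages: list) -> list:
    """Return indexes of user messages where a text block follows a tool_result."""
    offenders = []
    for index, message in enumerate(messages):
        content = message.get("content")
        # Only user messages with block-style (list) content can hold tool results.
        if message.get("role") != "user" or not isinstance(content, list):
            continue
        seen_tool_result = False
        for block in content:
            block_type = block.get("type") if isinstance(block, dict) else None
            if block_type == "tool_result":
                seen_tool_result = True
            elif block_type == "text" and seen_tool_result:
                offenders.append(index)
                break
    return offenders


tool_call = {"role": "assistant", "content": [{
    "type": "tool_use", "id": "toolu_123", "name": "calculator",
    "input": {"operation": "add", "a": 1234, "b": 5678},
}]}
tool_result = {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"}

bad = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    tool_call,
    {"role": "user", "content": [tool_result, {"type": "text", "text": "Here's the result"}]},
]
good = bad[:2] + [{"role": "user", "content": [tool_result]}]

print(find_text_after_tool_result(bad))
print(find_text_after_tool_result(good))
```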
Recovering from empty responses:
def handle_empty_response(client, messages):
    response = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=1024,
        messages=messages
    )
    
    if response.stop_reason == "end_turn" and not response.content:
        # ❌ Don't just retry with the same messages
        # ✅ Add a continuation prompt in a NEW user message
        messages.append({"role": "user", "content": "Please continue"})
        response = client.messages.create(
            model="claude-opus-4-7",
            max_tokens=1024,
            messages=messages
        )
    
    return response
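
The `not response.content` check above only matches an empty content list. If you also want to treat whitespace-only text as empty (an assumption about your application, not documented API behavior), a slightly broader check could look like the sketch below. `is_effectively_empty` is a hypothetical helper, and `SimpleNamespace` objects stand in for SDK response objects.

```python
from types import SimpleNamespace


def is_effectively_empty(response) -> bool:
    """True when an end_turn response carries no usable text or other content."""
    if response.stop_reason != "end_turn":
        return False
    for block in response.content or []:
        if block.type != "text":
            return False  # any non-text block (e.g. tool_use) counts as content
        if block.text.strip():
            return False  # found real text
    return True


# Stand-ins for SDK responses, for illustration only.
empty = SimpleNamespace(stop_reason="end_turn", content=[])
blank = SimpleNamespace(
    stop_reason="end_turn",
    content=[SimpleNamespace(type="text", text="  \n")],
)
normal = SimpleNamespace(
    stop_reason="end_turn",
    content=[SimpleNamespace(type="text", text="6912")],
)

print(is_effectively_empty(empty), is_effectively_empty(blank), is_effectively_empty(normal))
```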

2. max_tokens – Token Limit Reached

Claude stopped because it hit the max_tokens limit you set. The response is truncated—Claude had more to say but ran out of space.

How to handle it:
# Keep the conversation in a list so it can be extended below
messages = [{"role": "user", "content": "Write a long story"}]

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=200,  # Low limit for demonstration
    messages=messages
)

if response.stop_reason == "max_tokens":
    # The response is incomplete. Append it and ask Claude to continue.
    messages.append({"role": "assistant", "content": response.content})
    messages.append({"role": "user", "content": "Please continue from where you left off."})
    # Make a new request to get the rest
    continuation = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=2000,
        messages=messages
    )

TypeScript version:
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

async function getCompleteResponse() {
  // Keep the conversation in one array so each continuation sees the full history
  const messages: Anthropic.MessageParam[] = [
    { role: 'user', content: 'Write a long story' }
  ];

  let response = await client.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 200,
    messages
  });

  const allContent = [...response.content];

  // Keep requesting continuations until the response is no longer truncated
  while (response.stop_reason === 'max_tokens') {
    messages.push(
      { role: 'assistant', content: response.content },
      { role: 'user', content: 'Please continue from where you left off.' }
    );
    response = await client.messages.create({
      model: 'claude-sonnet-4-20250514',
      max_tokens: 2000,
      messages
    });
    allContent.push(...response.content);
  }

  return allContent;
}

3. tool_use – Tool Call Requested

Claude stopped because it wants to call a tool. The response content will contain one or more tool_use blocks.

How to handle it:
tools = [{
    "name": "get_weather",
    "description": "Get current weather for a city",
    "input_schema": {
        "type": "object",
        "properties": {
            "location": {"type": "string"}
        },
        "required": ["location"]
    }
}]
messages = [{"role": "user", "content": "What's the weather in Tokyo?"}]

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=tools,
    messages=messages
)

if response.stop_reason == "tool_use":
    # Collect one tool_result per tool_use block
    tool_results = []
    for block in response.content:
        if block.type == "tool_use":
            # Execute the tool (execute_tool is your own dispatch function)
            result = execute_tool(block.name, block.input)
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": str(result)
            })

    # Add the assistant turn once, then all tool results in a single user message
    messages.append({"role": "assistant", "content": response.content})
    messages.append({"role": "user", "content": tool_results})

    # Let Claude continue with the result
    final_response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        tools=tools,
        messages=messages
    )
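
The example above calls `execute_tool` without defining it. A minimal sketch of one possible shape is a registry that maps tool names to Python functions. Everything here (`TOOL_REGISTRY`, the placeholder `get_weather` body) is an assumption for illustration; a real implementation would call an actual weather service.

```python
def get_weather(location: str) -> str:
    """Placeholder tool body; a real tool would query a weather service."""
    return f"Weather lookup for {location} is not implemented in this sketch"


# Map the tool names declared in `tools` to the functions that implement them.
TOOL_REGISTRY = {"get_weather": get_weather}


def execute_tool(name: str, tool_input: dict) -> str:
    """Look up the handler for a tool_use block and call it with its input."""
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        raise KeyError(f"No handler registered for tool: {name}")
    # The block's input is a dict matching the tool's input_schema.
    return handler(**tool_input)


print(execute_tool("get_weather", {"location": "Tokyo"}))
```

Keeping the registry keys identical to the `name` fields in the tool definitions makes it harder for the two to drift apart.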

4. stop_sequence – Custom Stop Sequence Hit

Claude stopped because it encountered one of your custom stop_sequences. This is useful for structured outputs where you want Claude to stop at a specific delimiter.

How to handle it:
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    stop_sequences=["\n\nEND"],
    messages=[{"role": "user", "content": "List 3 colors and then write END"}]
)

if response.stop_reason == "stop_sequence":
    print(f"Stopped at sequence: {response.stop_sequence}")
    # The content ends right before the stop sequence
    print(response.content[0].text)
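
When several stop sequences are configured, the `stop_sequence` field tells you which one fired, so you can branch on it. The sketch below assumes two hypothetical delimiters (`</answer>` and `</error>`) and uses `SimpleNamespace` objects in place of SDK responses; `classify_stop` is an illustrative helper name.

```python
from types import SimpleNamespace

# Hypothetical delimiters; use whatever you pass as stop_sequences.
STOP_LABELS = {"</answer>": "answer", "</error>": "error"}


def classify_stop(response):
    """Return (label, text) based on which stop sequence ended generation."""
    text = "".join(b.text for b in response.content if b.type == "text")
    if response.stop_reason != "stop_sequence":
        # Generation ended for another reason, so no delimiter was reached.
        return ("unterminated", text)
    return (STOP_LABELS.get(response.stop_sequence, "unknown"), text)


# Stand-in for an SDK response, for illustration only.
demo = SimpleNamespace(
    stop_reason="stop_sequence",
    stop_sequence="</answer>",
    content=[SimpleNamespace(type="text", text="<answer>red, green, blue")],
)
print(classify_stop(demo))
```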

Best Practices for Production Applications

1. Always Check stop_reason

Never assume a response is complete. Always check stop_reason before processing:

def process_response(response):
    if response.stop_reason == "end_turn":
        return handle_complete(response)
    elif response.stop_reason == "max_tokens":
        return handle_truncated(response)
    elif response.stop_reason == "tool_use":
        return handle_tool_calls(response)
    elif response.stop_reason == "stop_sequence":
        return handle_stop_sequence(response)
    else:
        raise ValueError(f"Unknown stop_reason: {response.stop_reason}")
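
The same dispatch can be written as a lookup table, which some teams find easier to extend. This sketch assumes you would rather route unrecognized values to a fallback than raise, since the API documentation lists additional stop reasons (such as `pause_turn` and `refusal`) beyond the four covered here. The handler bodies are stubs and all names are illustrative.

```python
from types import SimpleNamespace


def handle_complete(response):
    return "complete"


def handle_truncated(response):
    return "truncated"


def handle_tool_calls(response):
    return "tool_calls"


def handle_stop_sequence(response):
    return "stop_sequence"


def handle_unexpected(response):
    # Surface the value instead of silently treating the response as complete.
    return f"unexpected:{response.stop_reason}"


HANDLERS = {
    "end_turn": handle_complete,
    "max_tokens": handle_truncated,
    "tool_use": handle_tool_calls,
    "stop_sequence": handle_stop_sequence,
}


def process_response(response, handlers=HANDLERS):
    """Route a response to the handler registered for its stop_reason."""
    handler = handlers.get(response.stop_reason, handle_unexpected)
    return handler(response)


print(process_response(SimpleNamespace(stop_reason="tool_use")))
```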

2. Build a Retry Loop for max_tokens

For long-form generation, implement a loop that continues until you get end_turn:

def generate_complete_response(client, messages, max_tokens=4096):
    all_content = []
    
    while True:
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=max_tokens,
            messages=messages
        )
        
        all_content.extend(response.content)
        
        if response.stop_reason != "max_tokens":
            break
        
        # Continue from where we left off
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": "Please continue."})
    
    return all_content
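
The loop above runs until a response is no longer truncated, and it appends to the caller's message list. A variant that caps the number of requests and works on a copy might look like the sketch below. `generate_with_limit`, `max_rounds`, and the injected `create` callable (for example, a lambda wrapping `client.messages.create`) are assumptions made for illustration, and the scripted responses stand in for the API.

```python
from types import SimpleNamespace


def generate_with_limit(create, messages, max_rounds=5):
    """Continue after max_tokens truncation, for at most max_rounds requests."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    history = list(messages)  # copy so the caller's list is not mutated
    all_content = []
    for _ in range(max_rounds):
        response = create(history)
        all_content.extend(response.content)
        if response.stop_reason != "max_tokens":
            break
        # Feed the partial output back and ask for the rest.
        history.append({"role": "assistant", "content": response.content})
        history.append({"role": "user", "content": "Please continue."})
    # The final stop_reason tells the caller whether the output is still truncated.
    return all_content, response.stop_reason


# Stand-in for the API: truncated twice, then finishes.
_script = iter([
    SimpleNamespace(content=["part 1"], stop_reason="max_tokens"),
    SimpleNamespace(content=["part 2"], stop_reason="max_tokens"),
    SimpleNamespace(content=["part 3"], stop_reason="end_turn"),
])
prompt = [{"role": "user", "content": "Write a long story"}]
content, final_reason = generate_with_limit(lambda history: next(_script), prompt)
print(content, final_reason)
```

Returning the final stop reason alongside the content lets the caller decide what to do if the cap was reached while the output was still truncated.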

3. Handle Tool Chains Properly

When using tools, you may get multiple tool_use blocks in one response (parallel tool use). Process all of them before continuing:

def handle_tool_chain(client, messages, tools):
    while True:
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            tools=tools,
            messages=messages
        )
        
        if response.stop_reason != "tool_use":
            return response
        
        # Process all tool calls in this turn
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(result)
                })
        
        # Add assistant response and tool results
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})
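
If a tool raises inside the loop above, the whole chain fails. One option is to report the failure back to Claude through the tool_result block's `is_error` field so it can react. The sketch below assumes that approach; `build_tool_results` and the `flaky_execute` stand-in are hypothetical names, and `SimpleNamespace` objects replace SDK content blocks.

```python
from types import SimpleNamespace


def build_tool_results(content, execute):
    """Run every tool_use block and return matching tool_result blocks."""
    results = []
    for block in content:
        if block.type != "tool_use":
            continue  # skip text and other block types
        try:
            output, failed = str(execute(block.name, block.input)), False
        except Exception as exc:  # report the failure instead of aborting the chain
            output, failed = f"{type(exc).__name__}: {exc}", True
        result = {"type": "tool_result", "tool_use_id": block.id, "content": output}
        if failed:
            result["is_error"] = True
        results.append(result)
    return results


def flaky_execute(name, tool_input):
    """Stand-in executor: one tool works, any other raises."""
    if name == "get_weather":
        return "sunny"
    raise RuntimeError("service unavailable")


blocks = [
    SimpleNamespace(type="text", text="Checking both..."),
    SimpleNamespace(type="tool_use", id="toolu_1", name="get_weather", input={"location": "Tokyo"}),
    SimpleNamespace(type="tool_use", id="toolu_2", name="get_time", input={"timezone": "Asia/Tokyo"}),
]
results = build_tool_results(blocks, flaky_execute)
print(results)
```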

Common Pitfalls to Avoid

| Pitfall | Solution |
| --- | --- |
| Ignoring stop_reason | Always check it before processing content |
| Adding text after tool_result | Send only the tool_result block |
| Retrying empty responses without changes | Add a continuation prompt |
| Forgetting to append assistant content | Include Claude's response in the next request |
| Not handling parallel tool calls | Iterate over all content blocks |

Key Takeaways

  • stop_reason tells you why Claude stopped – always check it before processing a response. This guide covers four values: end_turn, max_tokens, tool_use, and stop_sequence.
  • end_turn can sometimes produce empty responses – prevent this by never adding text after tool_result blocks, and recover by sending a continuation prompt.
  • max_tokens means the response is truncated – implement a retry loop that appends the partial response and asks Claude to continue.
  • tool_use requires you to execute tools and feed results back – handle all tool calls in a single turn before continuing.
  • Build a state machine around stop_reason for robust, production-ready applications that handle all scenarios gracefully.