Guide · Beginner · Agents · 2026-05-18

Mastering Claude API Stop Reasons: Build Robust Applications with end_turn, max_tokens & tool_use

Learn how to handle Claude API stop_reason values like end_turn, max_tokens, and tool_use. Includes code examples, empty response fixes, and best practices for production apps.

Quick Answer

This guide explains Claude API stop_reason values (end_turn, max_tokens, tool_use, stop_sequence) and how to handle each in your code. You'll learn to detect empty responses, recover from max_tokens truncation, and properly chain tool calls.

stop_reason · Claude API · error handling · tool use · Messages API

When you call the Claude Messages API, every successful response includes a stop_reason field. This tiny piece of data tells you why Claude stopped generating—and understanding it is the difference between a brittle prototype and a production-ready application.

In this guide, you'll learn:

What each stop_reason value means
How to handle them in Python and TypeScript
How to prevent and recover from empty responses
Best practices for tool-using agents

What Is `stop_reason`?

The stop_reason field is part of every successful Messages API response. Unlike error codes (which indicate failures), stop_reason tells you why Claude successfully completed its response generation.

Here's a typical response structure:

{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Here's the answer to your question..."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50
  }
}
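
As a minimal sketch of reading these fields, the snippet below parses the sample payload above with Python's standard json module. It works on raw JSON only (the SDK returns typed objects instead), and summarize_response is a hypothetical helper name, not part of any library.

```python
import json

# The sample payload from above, as raw JSON text.
RAW_RESPONSE = """
{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [
    {"type": "text", "text": "Here's the answer to your question..."}
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {"input_tokens": 100, "output_tokens": 50}
}
"""


def summarize_response(raw: str) -> dict:
    """Pull out the fields most handlers look at first (hypothetical helper)."""
    data = json.loads(raw)
    # Join the text of every text block; other block types are ignored here.
    text = "".join(b["text"] for b in data["content"] if b["type"] == "text")
    return {
        "stop_reason": data["stop_reason"],
        "stop_sequence": data["stop_sequence"],
        "output_tokens": data["usage"]["output_tokens"],
        "text": text,
    }


summary = summarize_response(RAW_RESPONSE)
print(summary["stop_reason"], summary["output_tokens"])
```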

The Four Stop Reason Values

1. `end_turn` – Natural Completion

This is the most common stop reason. Claude finished its response naturally—it said everything it wanted to say and handed control back to you.

How to handle it:

from anthropic import Anthropic
client = Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)
if response.stop_reason == "end_turn":
    # Process the complete response
    print(response.content[0].text)

⚠️ Empty responses with end_turn

Sometimes Claude returns an empty response (2–3 tokens with no content) with stop_reason: "end_turn". This typically happens when Claude judges that the assistant turn is already complete, especially after tool results.

Common causes:

Adding text blocks immediately after tool_result blocks
Sending Claude's completed response back without adding anything new

How to prevent empty responses:

# INCORRECT: Adding text after tool_result
messages = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "assistant", "content": [{
        "type": "tool_use",
        "id": "toolu_123",
        "name": "calculator",
        "input": {"operation": "add", "a": 1234, "b": 5678}
    }]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"}  # ❌ Don't do this
    ]}
]

# CORRECT: Send tool results directly
messages = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "assistant", "content": [{
        "type": "tool_use",
        "id": "toolu_123",
        "name": "calculator",
        "input": {"operation": "add", "a": 1234, "b": 5678}
    }]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"}
        # ✅ No extra text
    ]}
]
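
One way to catch the incorrect pattern before a request is sent is a small pre-flight check over the message list. The sketch below assumes messages are plain dicts shaped like the examples above; find_text_after_tool_result is a hypothetical helper, not an SDK function.

```python
def find_text_after_tool_result(messages: list) -> list:
    """Return indexes of user messages where a text block follows a tool_result block."""
    flagged = []
    for index, message in enumerate(messages):
        content = message.get("content")
        # Plain-string content cannot contain tool_result blocks, so skip it.
        if message.get("role") != "user" or not isinstance(content, list):
            continue
        seen_tool_result = False
        for block in content:
            block_type = block.get("type")
            if block_type == "tool_result":
                seen_tool_result = True
            elif block_type == "text" and seen_tool_result:
                flagged.append(index)
                break
    return flagged


bad_messages = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_123", "name": "calculator",
         "input": {"operation": "add", "a": 1234, "b": 5678}},
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"},
    ]},
]
# Same conversation, with the trailing text block dropped from the last message.
good_messages = bad_messages[:2] + [
    {"role": "user", "content": bad_messages[2]["content"][:1]},
]

print(find_text_after_tool_result(bad_messages))
print(find_text_after_tool_result(good_messages))
```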

Recovering from empty responses:

def handle_empty_response(client, messages):
    response = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=1024,
        messages=messages
    )
    
    if response.stop_reason == "end_turn" and not response.content:
        # ❌ Don't just retry with the same messages
        # ✅ Add a continuation prompt in a NEW user message
        messages.append({"role": "user", "content": "Please continue"})
        response = client.messages.create(
            model="claude-opus-4-7",
            max_tokens=1024,
            messages=messages
        )
    
    return response
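
The check `not response.content` only catches a completely empty block list. A slightly broader test, sketched below, also treats whitespace-only text blocks as empty. This is an assumption about what "empty" should mean for your application, and is_effectively_empty is a hypothetical helper; the SimpleNamespace objects are stand-ins shaped like SDK responses.

```python
from types import SimpleNamespace


def is_effectively_empty(response) -> bool:
    """True when the response has no blocks, or only whitespace-only text blocks."""
    for block in response.content:
        if block.type != "text":
            return False  # e.g. a tool_use block counts as real content
        if block.text.strip():
            return False
    return True


# Stand-ins shaped like SDK response objects, used only for illustration.
empty = SimpleNamespace(stop_reason="end_turn", content=[])
blank = SimpleNamespace(
    stop_reason="end_turn",
    content=[SimpleNamespace(type="text", text="\n")],
)
full = SimpleNamespace(
    stop_reason="end_turn",
    content=[SimpleNamespace(type="text", text="Done.")],
)

print(is_effectively_empty(empty), is_effectively_empty(blank), is_effectively_empty(full))
```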

2. `max_tokens` – Token Limit Reached

Claude stopped because it hit the max_tokens limit you set. The response is truncated—Claude had more to say but ran out of space.

How to handle it:

# Keep the conversation in a list so the partial reply can be appended to it
messages = [{"role": "user", "content": "Write a long story"}]

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=200,  # Low limit for demonstration
    messages=messages
)

if response.stop_reason == "max_tokens":
    # The response is incomplete. Append it and ask Claude to continue.
    messages.append({"role": "assistant", "content": response.content})
    messages.append({"role": "user", "content": "Please continue from where you left off."})

    # Make a new request to get the rest
    continuation = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=2000,
        messages=messages
    )

TypeScript version:

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

async function getCompleteResponse() {
  // Keep the running conversation so each continuation sees all earlier partial output
  const messages: Anthropic.MessageParam[] = [
    { role: 'user', content: 'Write a long story' }
  ];

  let response = await client.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 200,
    messages
  });
  const allContent = [...response.content];

  // Keep asking for more while the reply is still being truncated
  while (response.stop_reason === 'max_tokens') {
    messages.push(
      { role: 'assistant', content: response.content },
      { role: 'user', content: 'Please continue from where you left off.' }
    );
    response = await client.messages.create({
      model: 'claude-sonnet-4-20250514',
      max_tokens: 2000,
      messages
    });
    allContent.push(...response.content);
  }

  return allContent;
}

3. `tool_use` – Tool Call Requested

Claude stopped because it wants to call a tool. The response content will contain one or more tool_use blocks.

How to handle it:

tools = [{
    "name": "get_weather",
    "description": "Get current weather for a city",
    "input_schema": {
        "type": "object",
        "properties": {
            "location": {"type": "string"}
        },
        "required": ["location"]
    }
}]
messages = [{"role": "user", "content": "What's the weather in Tokyo?"}]

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=tools,
    messages=messages
)

if response.stop_reason == "tool_use":
    # Append the assistant turn once, even if it contains several tool_use blocks
    messages.append({"role": "assistant", "content": response.content})

    # Collect one tool_result per tool_use block
    tool_results = []
    for block in response.content:
        if block.type == "tool_use":
            # Execute the tool (execute_tool is your own dispatcher)
            result = execute_tool(block.name, block.input)
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": str(result)
            })

    # Send all results back in a single user message
    messages.append({"role": "user", "content": tool_results})

    # Let Claude continue with the result
    final_response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        tools=tools,
        messages=messages
    )
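
The snippet above calls execute_tool without defining it. One possible shape, sketched below, is a registry that maps tool names to local functions and returns an error string rather than raising, so the failure can be reported back to Claude as the tool result. TOOL_REGISTRY and the placeholder get_weather body are assumptions for illustration only.

```python
def get_weather(location: str) -> str:
    """Placeholder implementation; a real one would call a weather service."""
    return f"Weather for {location}: (not implemented)"


# Maps the tool names declared in `tools` to local Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def execute_tool(name: str, tool_input: dict) -> str:
    """Run the named tool, returning an error string instead of raising."""
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        return f"Error: unknown tool '{name}'"
    try:
        return handler(**tool_input)
    except Exception as exc:  # surface failures to the model as text
        return f"Error: {type(exc).__name__}: {exc}"


print(execute_tool("get_weather", {"location": "Tokyo"}))
print(execute_tool("unknown_tool", {}))
```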

4. `stop_sequence` – Custom Stop Sequence Hit

Claude stopped because it encountered one of your custom stop_sequences. This is useful for structured outputs where you want Claude to stop at a specific delimiter.

How to handle it:

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    stop_sequences=["\n\nEND"],
    messages=[{"role": "user", "content": "List 3 colors and then write END"}]
)
if response.stop_reason == "stop_sequence":
    print(f"Stopped at sequence: {response.stop_sequence}")
    # The content ends right before the stop sequence
    print(response.content[0].text)
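
Because the text ends right before the delimiter, a caller can parse it directly once it has confirmed which sequence matched. The sketch below assumes a line-per-item output format; parse_stopped_list is a hypothetical helper, and the SimpleNamespace objects stand in for SDK responses.

```python
from types import SimpleNamespace


def parse_stopped_list(response, expected_sequence="\n\nEND"):
    """Return the list items if generation stopped at expected_sequence, else None."""
    if response.stop_reason != "stop_sequence":
        return None
    if response.stop_sequence != expected_sequence:
        return None
    text = "".join(block.text for block in response.content if block.type == "text")
    # One item per non-blank line.
    return [line.strip() for line in text.splitlines() if line.strip()]


# Stand-ins shaped like SDK response objects, used only for illustration.
stopped = SimpleNamespace(
    stop_reason="stop_sequence",
    stop_sequence="\n\nEND",
    content=[SimpleNamespace(type="text", text="1. Red\n2. Green\n3. Blue")],
)
truncated = SimpleNamespace(
    stop_reason="max_tokens",
    stop_sequence=None,
    content=[SimpleNamespace(type="text", text="1. Red\n2. Gr")],
)

print(parse_stopped_list(stopped))
print(parse_stopped_list(truncated))
```

Returning None for any other stop reason keeps a truncated list from being mistaken for a complete one.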

Best Practices for Production Applications

1. Always Check `stop_reason`

Never assume a response is complete. Always check stop_reason before processing:

def process_response(response):
    if response.stop_reason == "end_turn":
        return handle_complete(response)
    elif response.stop_reason == "max_tokens":
        return handle_truncated(response)
    elif response.stop_reason == "tool_use":
        return handle_tool_calls(response)
    elif response.stop_reason == "stop_sequence":
        return handle_stop_sequence(response)
    else:
        raise ValueError(f"Unknown stop_reason: {response.stop_reason}")
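
Raising on an unrecognized value is strict: if the API ever returns a stop_reason this handler was not written for, the request fails outright. A more tolerant variant, sketched below under that assumption, uses a lookup table with a fallback. The step labels and the next_step name are hypothetical.

```python
from types import SimpleNamespace

# Maps each stop_reason covered in this guide to a next-step label.
NEXT_STEP = {
    "end_turn": "return_to_user",
    "max_tokens": "request_continuation",
    "tool_use": "run_tools",
    "stop_sequence": "parse_delimited_output",
}


def next_step(response) -> str:
    """Look up the next step, falling back to a safe default for unknown values."""
    return NEXT_STEP.get(response.stop_reason, "log_and_return_as_is")


print(next_step(SimpleNamespace(stop_reason="tool_use")))           # run_tools
print(next_step(SimpleNamespace(stop_reason="some_future_value")))  # log_and_return_as_is
```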

2. Build a Retry Loop for `max_tokens`

For long-form generation, implement a loop that keeps requesting continuations until the stop reason is no longer max_tokens:

def generate_complete_response(client, messages, max_tokens=4096):
    all_content = []
    
    while True:
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=max_tokens,
            messages=messages
        )
        
        all_content.extend(response.content)
        
        if response.stop_reason != "max_tokens":
            break
        
        # Continue from where we left off
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": "Please continue."})
    
    return all_content
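
The loop above has no upper bound and appends to the caller's messages list. A bounded variant is sketched below. It assumes a hypothetical max_rounds cap and takes a create callable (for example a thin wrapper around client.messages.create) so the logic can be exercised without network access.

```python
from types import SimpleNamespace


def generate_bounded(create, messages, max_rounds=3):
    """Continue after max_tokens, but give up after max_rounds requests.

    `create` is any callable that takes the message list and returns an object
    with `.stop_reason` and `.content`. Returns (all_content, finished).
    """
    history = list(messages)  # copy so the caller's list is not mutated
    all_content = []
    for _ in range(max_rounds):
        response = create(history)
        all_content.extend(response.content)
        if response.stop_reason != "max_tokens":
            return all_content, True
        history.append({"role": "assistant", "content": response.content})
        history.append({"role": "user", "content": "Please continue."})
    return all_content, False  # still truncated after max_rounds


# Fake model for illustration: truncated twice, then finishes.
_script = iter([
    SimpleNamespace(stop_reason="max_tokens", content=["part 1"]),
    SimpleNamespace(stop_reason="max_tokens", content=["part 2"]),
    SimpleNamespace(stop_reason="end_turn", content=["part 3"]),
])
prompt = [{"role": "user", "content": "Write a long story"}]
parts, finished = generate_bounded(lambda history: next(_script), prompt)
print(parts, finished)
```

Returning a finished flag lets the caller decide what to do with output that is still truncated when the cap is reached.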

3. Handle Tool Chains Properly

When using tools, you may get multiple tool_use blocks in one response (parallel tool use). Process all of them before continuing:

def handle_tool_chain(client, messages, tools):
    while True:
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            tools=tools,
            messages=messages
        )
        
        if response.stop_reason != "tool_use":
            return response
        
        # Process all tool calls in this turn
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(result)
                })
        
        # Add assistant response and tool results
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})
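
The result-building step of the loop above can be pulled into a small function and checked on its own. The sketch below assumes content blocks expose type, id, name and input attributes as in the examples in this guide; build_tool_results and the fake_execute stub are hypothetical names.

```python
from types import SimpleNamespace


def build_tool_results(content, execute):
    """Return one tool_result dict per tool_use block, in order; skip other blocks."""
    results = []
    for block in content:
        if block.type != "tool_use":
            continue
        results.append({
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": str(execute(block.name, block.input)),
        })
    return results


def fake_execute(name, tool_input):
    """Stub tool runner used only for this illustration."""
    return f"{name}:{tool_input['location']}"


# A turn with a text block followed by two parallel tool calls.
turn = [
    SimpleNamespace(type="text", text="Checking both cities."),
    SimpleNamespace(type="tool_use", id="toolu_1", name="get_weather", input={"location": "Tokyo"}),
    SimpleNamespace(type="tool_use", id="toolu_2", name="get_weather", input={"location": "Paris"}),
]
tool_results = build_tool_results(turn, fake_execute)
print([r["tool_use_id"] for r in tool_results])
```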

Common Pitfalls to Avoid

| Pitfall | Solution |
| --- | --- |
| Ignoring `stop_reason` | Always check it before processing content |
| Adding text after `tool_result` | Send only the `tool_result` block |
| Retrying empty responses without changes | Add a continuation prompt |
| Forgetting to append assistant content | Include Claude's response in the next request |
| Not handling parallel tool calls | Iterate over all content blocks |

Key Takeaways

stop_reason tells you why Claude stopped – always check it before processing a response. The four values covered in this guide are end_turn, max_tokens, tool_use, and stop_sequence.
end_turn can sometimes produce empty responses – prevent this by never adding text after tool_result blocks, and recover by sending a continuation prompt.
max_tokens means the response is truncated – implement a retry loop that appends the partial response and asks Claude to continue.
tool_use requires you to execute tools and feed results back – handle all tool calls in a single turn before continuing.
Build a state machine around stop_reason for robust, production-ready applications that handle all scenarios gracefully.
