BeClaude · Beginner Guide · 2026-05-06

Mastering Claude API Stop Reasons: Build Smarter, More Reliable Applications

Learn how to interpret and handle Claude API stop_reason values like end_turn, tool_use, and max_tokens. Includes code examples, troubleshooting tips, and best practices for robust app development.

Quick Answer

This guide explains the stop_reason field in Claude API responses, covering values like end_turn, tool_use, and max_tokens. You'll learn how to handle each case with practical Python code, prevent empty responses, and build more robust conversational applications.

Tags: Claude API, stop_reason, error handling, tool use, API best practices

Introduction

When you send a request to the Claude API, the response includes a stop_reason field that tells you why the model stopped generating. This isn't an error—it's a signal. Understanding these signals is essential for building applications that respond intelligently, whether you're handling a simple Q&A bot, a multi-step tool-using agent, or a long-running conversation.

In this guide, we'll break down each stop_reason value, show you how to handle them in code, and share best practices to avoid common pitfalls like empty responses.

The stop_reason Field

The stop_reason field appears in every successful response from the Messages API. Here's a typical example:

{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Here's the answer to your question..."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50
  }
}

Stop Reason Values

end_turn

What it means: Claude finished its response naturally. This is the most common stop reason and usually indicates a complete, final answer.

How to handle it: In most cases, you can simply display the response to the user. However, be aware of the empty-response edge case (covered below).
from anthropic import Anthropic

client = Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)

if response.stop_reason == "end_turn":
    # Process the complete response
    print(response.content[0].text)

tool_use

What it means: Claude wants to call a tool. The response content will contain one or more tool_use blocks. Your application must execute the tool and return the result.

How to handle it: Loop through response.content, identify the tool_use blocks, execute the corresponding functions, and return the results as tool_result blocks in a follow-up user message.
if response.stop_reason == "tool_use":
    # Keep the assistant's tool_use blocks in the conversation history
    messages.append({"role": "assistant", "content": response.content})
    tool_results = []
    for block in response.content:
        if block.type == "tool_use":
            tool_name = block.name
            tool_input = block.input
            # Execute your tool logic here
            result = execute_tool(tool_name, tool_input)
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": str(result)
            })
    # Return all results in a single user message
    messages.append({"role": "user", "content": tool_results})
    # Continue the conversation; the tools list must be re-sent on every call
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        tools=tools,
        messages=messages
    )

max_tokens

What it means: Claude hit the token limit you set. The response may be cut off mid-sentence or mid-thought.

How to handle it: You have two options:
  • Increase max_tokens in your request.
  • Send the partial response back as a new user message with a continuation prompt like "Please continue."
if response.stop_reason == "max_tokens":
    # Option 1: Retry with a higher limit
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=4096,  # Increased limit
        messages=messages
    )

    # Option 2: Keep the partial output in the history, then ask Claude to continue
    messages.append({"role": "assistant", "content": response.content})
    messages.append({
        "role": "user",
        "content": "Please continue from where you left off."
    })
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=messages
    )
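Option 2 can also be wrapped in a loop that keeps requesting continuations until Claude stops for some other reason. The sketch below illustrates that pattern (the function name `generate_full_text` is ours, not the SDK's); note that the partial assistant response must be appended to the history before the continuation prompt, or "continue" has nothing to continue from:

```python
def generate_full_text(client, messages, model="claude-sonnet-4-20250514"):
    """Accumulate output across max_tokens continuations into one string."""
    parts = []
    while True:
        response = client.messages.create(
            model=model, max_tokens=1024, messages=messages
        )
        if response.content:
            parts.append(response.content[0].text)
            # Keep the partial answer in history so "continue" has context
            messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason != "max_tokens":
            return "".join(parts)
        messages.append(
            {"role": "user", "content": "Please continue from where you left off."}
        )
```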

stop_sequence

What it means: Claude encountered a custom stop sequence you defined in your request. This is useful for structured outputs where you want to stop generation at a specific marker.

How to handle it: Check the stop_sequence field to see which sequence was triggered, then process the response accordingly.
if response.stop_reason == "stop_sequence":
    print(f"Stopped at sequence: {response.stop_sequence}")
    # Process the response up to that point
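For completeness, here is the request side: custom stop sequences are set with the `stop_sequences` parameter of `messages.create`. The `ask_with_stop` helper below is a small illustrative wrapper of our own, not part of the SDK; note that the matched stop sequence itself is not included in the returned text:

```python
def ask_with_stop(client, prompt, stops, model="claude-sonnet-4-20250514"):
    """Send a prompt with custom stop sequences; return (text, triggered sequence)."""
    response = client.messages.create(
        model=model,
        max_tokens=1024,
        stop_sequences=stops,  # generation halts at the first match
        messages=[{"role": "user", "content": prompt}],
    )
    text = response.content[0].text if response.content else ""
    # stop_sequence is None unless stop_reason == "stop_sequence"
    return text, response.stop_sequence
```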

content_filtered

What it means: Claude's response was filtered by content moderation. The response content will be empty or truncated.

How to handle it: Log the event for review, and consider rephrasing the user's input to avoid triggering filters.
if response.stop_reason == "content_filtered":
    print("Response was filtered. Consider rephrasing the input.")
    # Optionally, ask the user to rephrase

Handling Empty Responses with end_turn

A common gotcha: Claude returns an essentially empty response (2-3 tokens, no content blocks) with stop_reason: "end_turn". This typically happens after tool results, when Claude decides the assistant's turn is already complete.

Why It Happens

  • Adding text blocks immediately after tool_result blocks in the same user message (Claude comes to expect user commentary after every tool result, so it ends its own turn early).
  • Sending Claude's completed response back without adding anything new (Claude already decided it's done).
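The first rule above can be enforced mechanically before you send a request. The `has_text_after_tool_result` helper below is a hypothetical lint check of our own (not part of the SDK) that scans a message list for the problematic pattern:

```python
def has_text_after_tool_result(messages):
    """Return True if any user turn puts a text block after a tool_result block."""
    for msg in messages:
        if msg.get("role") != "user" or not isinstance(msg.get("content"), list):
            continue  # plain-string contents can't contain tool_result blocks
        seen_tool_result = False
        for block in msg["content"]:
            if block.get("type") == "tool_result":
                seen_tool_result = True
            elif block.get("type") == "text" and seen_tool_result:
                return True
    return False
```

Run it over your message list before each API call and log (or strip) offending text blocks.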

How to Prevent It

Incorrect: Adding text after tool_result
messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_123", "name": "calculator", "input": {"operation": "add", "a": 1234, "b": 5678}}
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"}  # Don't do this
    ]}
]
Correct: Send tool results directly without additional text
messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_123", "name": "calculator", "input": {"operation": "add", "a": 1234, "b": 5678}}
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"}  # Just the tool_result
    ]}
]

Recovery Strategy

If you still get empty responses, don't just retry with the same messages—Claude has already decided it's done. Instead, add a continuation prompt in a new user message:

def handle_empty_response(client, messages):
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=messages
    )

    if response.stop_reason == "end_turn" and not response.content:
        # Add a continuation prompt
        messages.append({"role": "user", "content": "Please continue"})
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=messages
        )
    return response

Building a Robust Response Handler

Here's a complete example that handles all stop reasons gracefully:

from anthropic import Anthropic

client = Anthropic()
messages = [{"role": "user", "content": "What's the weather in Tokyo?"}]

def handle_response(response, messages):
    if response.stop_reason == "end_turn":
        if response.content:
            return response.content[0].text
        # Empty response recovery: add a continuation prompt
        messages.append({"role": "user", "content": "Please continue"})
        new_response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=messages
        )
        return handle_response(new_response, messages)
    elif response.stop_reason == "tool_use":
        # Keep the assistant's tool_use blocks in the history
        messages.append({"role": "assistant", "content": response.content})
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(result)
                })
        messages.append({"role": "user", "content": tool_results})
        # Remember to re-send your tools list (tools=...) on this call
        new_response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=messages
        )
        return handle_response(new_response, messages)
    elif response.stop_reason == "max_tokens":
        # Keep the partial output, then ask for a continuation
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": "Please continue"})
        new_response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=2048,
            messages=messages
        )
        return handle_response(new_response, messages)
    elif response.stop_reason == "stop_sequence":
        return response.content[0].text if response.content else ""
    elif response.stop_reason == "content_filtered":
        return "Response was filtered. Please rephrase your question."
    return "Unknown stop reason"

Usage

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=messages
)
final_answer = handle_response(response, messages)
print(final_answer)

Key Takeaways

  • Understand each stop reason: end_turn means natural completion, tool_use means Claude wants to call a tool, max_tokens means you hit the limit, stop_sequence means a custom stop was triggered, and content_filtered means moderation intervened.
  • Handle empty responses gracefully: Never add text blocks after tool_result messages, and use a continuation prompt (not a retry) to recover from empty end_turn responses.
  • Build a recursive handler: A single function that processes all stop reasons and loops back for tool_use and max_tokens creates a robust, production-ready pipeline.
  • Log and monitor: Always log stop_reason and stop_sequence values during development to catch unexpected behavior early.
  • Test edge cases: Simulate empty responses, token limits, and tool calls to ensure your handler behaves correctly under all conditions.
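The last two takeaways can be combined into a small logging hook. The sketch below assumes Python's standard logging module; the set of "known" values mirrors the stop reasons covered in this guide, so anything outside it gets flagged for review:

```python
import logging

logger = logging.getLogger("claude.stop_reasons")
KNOWN_STOP_REASONS = {
    "end_turn", "tool_use", "max_tokens", "stop_sequence", "content_filtered",
}

def log_stop_reason(response):
    """Record why generation stopped; warn on anything unexpected."""
    logger.info(
        "stop_reason=%s stop_sequence=%s",
        response.stop_reason, response.stop_sequence,
    )
    if response.stop_reason not in KNOWN_STOP_REASONS:
        logger.warning("Unexpected stop_reason: %r", response.stop_reason)
        return False
    return True
```

Call `log_stop_reason(response)` after every `messages.create` call during development, and route the warnings to your monitoring system in production.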