Guide · 2026-04-26

Mastering Claude API Stop Reasons: A Practical Guide to Handling Response Endings

Learn how to interpret and handle Claude API stop_reason values like end_turn, tool_use, and max_tokens. Includes code examples, empty response fixes, and best practices for robust applications.

Quick Answer

This guide explains the five stop_reason values in Claude's API responses (end_turn, tool_use, max_tokens, stop_sequence, content_filtered) and shows how to handle each one in your code, including preventing empty responses and building robust multi-turn conversations.

Tags: Claude API, stop_reason, error handling, tool use, API best practices

Introduction

When you make a request to the Claude API, the response includes a stop_reason field that tells you why the model stopped generating. Understanding these values is essential for building applications that handle different response types correctly—whether you're building a chatbot, a tool-using agent, or a content generation pipeline.

Unlike API errors (which indicate a failure in processing your request), stop_reason is part of every successful response. It gives you insight into Claude's internal decision-making and helps you decide what to do next.
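The distinction matters in code: failures raise exceptions, while stop_reason is only present on success. Here is a minimal sketch of keeping the two paths separate (the client is passed in; with the real SDK you would catch anthropic.APIStatusError and friends rather than bare Exception):

```python
def send_prompt(client, model, prompt):
    """Separate transport failures (exceptions) from stop_reason handling."""
    try:
        response = client.messages.create(
            model=model,
            max_tokens=1024,
            messages=[{"role": "user", "content": prompt}],
        )
    except Exception as err:  # real code: anthropic.APIStatusError, RateLimitError, ...
        # Transport/HTTP failure: there is no stop_reason to inspect
        return ("error", str(err))
    # Every successful response carries a stop_reason, even when content is empty
    return ("ok", response.stop_reason)
```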

This guide covers all five stop_reason values, how to handle them in code, and how to avoid common pitfalls like empty responses.

---

The stop_reason Field

The stop_reason field appears in every successful Messages API response. Here's a typical example:

{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Here's the answer to your question..."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50
  }
}

There are five possible values for stop_reason:

Value              Meaning
end_turn           Claude finished its response naturally
tool_use           Claude wants to call a tool
max_tokens         The response hit the max_tokens limit
stop_sequence      A custom stop sequence was encountered
content_filtered   The response was filtered by content moderation
Let's explore each one in detail.

---

end_turn: The Natural Completion

end_turn is the most common stop reason. It means Claude has finished its response and has nothing more to say. This is the ideal outcome for simple Q&A or single-turn conversations.

Handling end_turn in Python

from anthropic import Anthropic

client = Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

if response.stop_reason == "end_turn":
    # Process the complete response
    print(response.content[0].text)

The Empty Response Problem

Sometimes Claude returns an empty response (only 2–3 tokens, with no usable content) with stop_reason: "end_turn". This typically happens when Claude concludes that the assistant turn is already complete, particularly after tool results.

Common causes:
  • Adding text blocks immediately after tool results (Claude learns to expect the user to always insert text after tool results, so it ends its turn to follow the pattern)
  • Sending Claude's completed response back without adding anything (Claude already decided it's done, so it will remain done)
How to prevent empty responses:
# INCORRECT: Adding text immediately after tool_result
messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {"role": "assistant", "content": [
        {
            "type": "tool_use",
            "id": "toolu_123",
            "name": "calculator",
            "input": {"operation": "add", "a": 1234, "b": 5678}
        }
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"}  # Don't add text after tool_result
    ]}
]

# CORRECT: Send tool results directly without additional text
messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {"role": "assistant", "content": [
        {
            "type": "tool_use",
            "id": "toolu_123",
            "name": "calculator",
            "input": {"operation": "add", "a": 1234, "b": 5678}
        }
    ]},
    {"role": "user", "content": [
        # Just the tool_result, no additional text
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"}
    ]}
]

If you still get empty responses after fixing the above, implement a retry loop:

def handle_empty_response(client, messages):
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=messages
    )
    
    # Check if response is empty
    if response.stop_reason == "end_turn" and not response.content:
        # Add a gentle nudge and retry
        messages.append({"role": "user", "content": "Please continue."})
        return client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=messages
        )
    return response

---

tool_use: Claude Wants to Call a Tool

When Claude decides it needs to use a tool (like a calculator, database query, or web search), it returns stop_reason: "tool_use" along with one or more tool_use content blocks.

Handling tool_use in Python

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=[
        {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "input_schema": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                },
                "required": ["location"]
            }
        }
    ],
    messages=[
        {"role": "user", "content": "What's the weather in Tokyo?"}
    ]
)

messages = [{"role": "user", "content": "What's the weather in Tokyo?"}]

if response.stop_reason == "tool_use":
    # Echo Claude's turn back first: the API requires the assistant message
    # containing the tool_use blocks to precede the matching tool_result
    messages.append({"role": "assistant", "content": response.content})

    # Extract and execute the tool calls, collecting one tool_result each
    tool_results = []
    for block in response.content:
        if block.type == "tool_use":
            # Execute the tool (your implementation)
            result = execute_tool(block.name, block.input)
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": str(result)
            })

    # Send all results back in a single user message
    messages.append({"role": "user", "content": tool_results})

    # Continue the conversation
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        tools=tools,
        messages=messages
    )

Important: After handling a tool_use, you must continue the conversation by making another API call with the tool results appended. Claude will then either produce a final answer or request additional tool calls.
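Because Claude may chain several tool calls before it finishes, production code usually wraps this exchange in a loop with an iteration cap. A sketch of that loop, where execute_tool stands in for your own dispatcher:

```python
def run_tool_loop(client, model, tools, messages, execute_tool, max_iterations=10):
    """Call the API repeatedly, executing tools, until stop_reason is not tool_use."""
    for _ in range(max_iterations):
        response = client.messages.create(
            model=model, max_tokens=1024, tools=tools, messages=messages
        )
        if response.stop_reason != "tool_use":
            return response
        # Echo Claude's turn back, then answer every tool call it made
        messages.append({"role": "assistant", "content": response.content})
        results = [
            {
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": str(execute_tool(block.name, block.input)),
            }
            for block in response.content
            if block.type == "tool_use"
        ]
        messages.append({"role": "user", "content": results})
    raise RuntimeError("Tool loop did not finish within max_iterations")
```

The iteration cap guards against a model that keeps requesting tools indefinitely; pick a limit that fits your longest expected tool chain.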

---

max_tokens: Hit the Token Limit

When Claude's response reaches the max_tokens limit you set, it stops with stop_reason: "max_tokens". This is common for long-form content generation or when Claude is in the middle of a thought.

Handling max_tokens

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=100,  # Deliberately low for demonstration
    messages=[
        {"role": "user", "content": "Write a detailed essay about AI safety."}
    ]
)

if response.stop_reason == "max_tokens":
    # The response is truncated
    partial_text = response.content[0].text

    # Option 1: Continue from where it left off
    messages.append({"role": "assistant", "content": partial_text})
    messages.append({"role": "user", "content": "Please continue."})
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1000,  # Increase the limit
        messages=messages
    )

    # Option 2: Increase max_tokens and retry the original request
    # This is simpler but may produce slightly different output

Best practice: Set max_tokens generously (e.g., 4096 or higher) for tasks that might require long responses. For streaming applications, handle max_tokens by continuing the conversation.
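In streaming code the same check applies after the stream finishes: the final message still carries a stop_reason. A sketch using the streaming helper from the official Python SDK (the continuation strategy mirrors Option 1 above; the round cap is an assumption for illustration):

```python
def stream_with_continuation(client, model, messages, max_tokens=1024, max_rounds=3):
    """Stream a reply; if it stops at max_tokens, append it and ask Claude to continue."""
    parts = []
    for _ in range(max_rounds):
        with client.messages.stream(
            model=model, max_tokens=max_tokens, messages=messages
        ) as stream:
            for text in stream.text_stream:
                print(text, end="", flush=True)  # surface tokens as they arrive
            final = stream.get_final_message()
        parts.append(final.content[0].text)
        if final.stop_reason != "max_tokens":
            break
        # Truncated: feed the partial answer back and ask for the rest
        messages.append({"role": "assistant", "content": parts[-1]})
        messages.append({"role": "user", "content": "Please continue."})
    return "".join(parts)
```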

---

stop_sequence: Custom Stop Sequences

If you define custom stop sequences in your API request, Claude will stop generating when it encounters one. This is useful for structured outputs like JSON or XML.

Example with Stop Sequences

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    stop_sequences=["\n\n---END---"],
    messages=[
        {"role": "user", "content": "List three colors and end with ---END---"}
    ]
)

if response.stop_reason == "stop_sequence":
    print(f"Stopped at sequence: {response.stop_sequence}")
    print(response.content[0].text)

Use cases:
  • Extracting structured data (JSON, XML)
  • Controlling output length in specific formats
  • Building multi-step generation pipelines
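For the structured-data use case, a common pattern combines a stop sequence with an assistant prefill so the response contains nothing but the payload. A sketch (the prompt wording and schema here are illustrative, not prescribed by the API):

```python
import json

def extract_json(client, model, prompt):
    """Ask for JSON, prefill the opening fence, and stop before it closes."""
    response = client.messages.create(
        model=model,
        max_tokens=1024,
        stop_sequences=["```"],  # stop before Claude closes the code fence
        messages=[
            {"role": "user",
             "content": prompt + " Respond only with JSON in a ```json code block."},
            # Prefilling the assistant turn steers the output format;
            # the API returns only the text generated after the prefill
            {"role": "assistant", "content": "```json\n"},
        ],
    )
    return json.loads(response.content[0].text)
```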
---

content_filtered: Content Moderation

This stop reason indicates that Claude's response was filtered by content moderation systems. This is rare but can happen if the model generates content that violates safety policies.

Handling content_filtered

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Generate harmful content"}
    ]
)

if response.stop_reason == "content_filtered":
    # Log the incident for review
    logger.warning(f"Content filtered for request: {response.id}")
    # Return a safe fallback to the user
    return "I'm sorry, I cannot generate that type of content."

Note: If you encounter frequent content_filtered responses, review your prompts and system instructions to ensure they align with Claude's usage policies.

---

Building a Robust Response Handler

Here's a complete example that handles all stop reasons in a single function:

def handle_claude_response(response, client, messages, tools=None):
    """Handle all possible stop_reason values."""
    
    if response.stop_reason == "end_turn":
        # Natural completion
        if response.content:
            return response.content[0].text
        else:
            # Empty response - retry with nudge
            messages.append({"role": "user", "content": "Please continue."})
            new_response = client.messages.create(
                model="claude-sonnet-4-20250514",
                max_tokens=1024,
                tools=tools,
                messages=messages
            )
            return handle_claude_response(new_response, client, messages, tools)
    
    elif response.stop_reason == "tool_use":
        # Echo the assistant turn back first (the API requires the tool_use
        # blocks to precede their tool_result), then execute every tool call
        messages.append({"role": "assistant", "content": response.content})
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(result)
                })
        # All results go back in a single user message
        messages.append({"role": "user", "content": tool_results})
        
        new_response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            tools=tools,
            messages=messages
        )
        return handle_claude_response(new_response, client, messages, tools)
    
    elif response.stop_reason == "max_tokens":
        # Continue from where we left off
        messages.append({"role": "assistant", "content": response.content[0].text})
        messages.append({"role": "user", "content": "Please continue."})
        
        new_response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,  # Increase limit
            tools=tools,
            messages=messages
        )
        return handle_claude_response(new_response, client, messages, tools)
    
    elif response.stop_reason == "stop_sequence":
        # Custom stop sequence encountered
        return response.content[0].text
    
    elif response.stop_reason == "content_filtered":
        # Content moderation triggered
        return "I cannot generate that content. Please rephrase your request."
    
    else:
        # Unknown stop reason (shouldn't happen)
        raise ValueError(f"Unknown stop_reason: {response.stop_reason}")

---

Best Practices Summary

  • Always check stop_reason – Don't assume a response is complete just because you got a 200 status code.
  • Handle tool_use as a loop – Continue making API calls until you get end_turn or max_tokens.
  • Set appropriate max_tokens – For long-form content, use at least 4096 tokens.
  • Avoid empty responses – Don't add text blocks after tool_result content blocks.
  • Log content_filtered – Monitor for policy violations and adjust prompts accordingly.
---

Key Takeaways

  • Five stop reasons exist: end_turn, tool_use, max_tokens, stop_sequence, and content_filtered – each requires different handling logic.
  • Empty responses with end_turn are usually caused by adding text after tool results; fix by sending only the tool_result block.
  • tool_use requires a loop: Execute the tool, append the result, and make another API call until Claude finishes.
  • max_tokens means truncation: Either increase the limit and retry, or continue the conversation by appending a "please continue" message.
  • Build a unified handler that recursively processes responses until a natural completion (end_turn) or a final output is achieved.