Claude Guide · 2026-04-27

Mastering Claude's Stop Reasons: Build Robust API Applications

Learn how to interpret and handle Claude's stop_reason field in the Messages API. Includes code examples, troubleshooting empty responses, and best practices for tool use.

Quick Answer

This guide explains Claude's stop_reason field (end_turn, tool_use, max_tokens, stop_sequence) and how to handle each in your application. You'll learn to prevent empty responses, manage tool calls, and build robust conversational flows.

Tags: Claude API · stop_reason · error handling · tool use · Messages API

Introduction

When you send a request to Claude via the Messages API, the response includes a stop_reason field that tells you why the model stopped generating. Understanding these values is essential for building applications that handle different response types correctly—whether it's a natural conversation end, a tool call, or a token limit hit.

This guide covers every stop reason, how to handle them in code, common pitfalls like empty responses, and best practices for production applications.

The stop_reason Field

The stop_reason field appears in every successful Messages API response. Unlike errors (which indicate request failures), stop_reason tells you why Claude completed its response generation successfully.

Example Response

{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Here's the answer to your question..."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50
  }
}
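If you call the API over raw HTTP rather than through the SDK, the same fields can be read straight from the decoded JSON. A minimal sketch using the example payload above:

```python
import json

# The example response payload from above
raw = """
{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [
    {"type": "text", "text": "Here's the answer to your question..."}
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {"input_tokens": 100, "output_tokens": 50}
}
"""

msg = json.loads(raw)
print(msg["stop_reason"])             # end_turn
print(msg["usage"]["output_tokens"])  # 50
```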

Stop Reason Values

end_turn

The most common stop reason. Indicates Claude finished its response naturally—it decided the assistant's turn was complete.

How to handle:
from anthropic import Anthropic

client = Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)

if response.stop_reason == "end_turn":
    # Process the complete response
    print(response.content[0].text)

Empty Responses with end_turn

Sometimes Claude returns an empty response (2–3 tokens with no content blocks) with stop_reason: "end_turn". This typically happens when Claude concludes that the assistant turn is already complete, particularly after tool results.

Common causes:
  • Adding text blocks immediately after tool results (Claude learns to expect the user to always insert text after tool results, so it ends its turn to follow the pattern)
  • Sending Claude's completed response back without adding anything (Claude already decided it's done, so it will remain done)
How to prevent empty responses:
# INCORRECT: Adding text immediately after tool_result
messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {"role": "assistant", "content": [
        {
            "type": "tool_use",
            "id": "toolu_123",
            "name": "calculator",
            "input": {"operation": "add", "a": 1234, "b": 5678}
        }
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"}  # Don't add text after tool_result
    ]}
]

# CORRECT: Send tool results directly without additional text
messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {"role": "assistant", "content": [
        {
            "type": "tool_use",
            "id": "toolu_123",
            "name": "calculator",
            "input": {"operation": "add", "a": 1234, "b": 5678}
        }
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"}
    ]}  # Just the tool_result, no additional text
]

If you still get empty responses after fixing the above, implement a retry loop:

def handle_empty_response(client, messages):
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=messages
    )
    
    if response.stop_reason == "end_turn" and not response.content:
        # Retry with a prompt that encourages a response
        messages.append({"role": "user", "content": "Please continue."})
        return client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=messages
        )
    return response

tool_use

Indicates Claude wants to call one or more tools. The response content will contain tool_use blocks with tool names and inputs.

How to handle:
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=[
        {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "input_schema": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                },
                "required": ["location"]
            }
        }
    ],
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}]
)

if response.stop_reason == "tool_use":
    # `messages` and `tools` are the same lists used to build the request above
    messages.append({"role": "assistant", "content": response.content})
    tool_results = []
    for block in response.content:
        if block.type == "tool_use":
            # Execute the tool and collect the result
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": execute_tool(block.name, block.input)
            })
    messages.append({"role": "user", "content": tool_results})

    # Continue the conversation with the tool results
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        tools=tools,
        messages=messages
    )
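Because a single response can contain several tool_use blocks, it can help to factor the result-collection step into a small helper. This sketch operates on plain dicts shaped like the API JSON (the SDK returns typed objects with the same fields), and `run_tool` stands in for whatever dispatcher your application provides:

```python
def build_tool_result_message(content_blocks, run_tool):
    """Execute every tool_use block and return the single user message
    that carries all results back to Claude."""
    results = []
    for block in content_blocks:
        if block.get("type") == "tool_use":
            results.append({
                "type": "tool_result",
                "tool_use_id": block["id"],
                "content": run_tool(block["name"], block["input"]),
            })
    return {"role": "user", "content": results}


# Example with a stubbed calculator tool
blocks = [{"type": "tool_use", "id": "toolu_123", "name": "calculator",
           "input": {"operation": "add", "a": 1234, "b": 5678}}]
msg = build_tool_result_message(
    blocks, lambda name, inp: str(inp["a"] + inp["b"]))
print(msg["content"][0]["content"])  # 6912
```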

max_tokens

Indicates Claude's response was cut off because it reached the max_tokens limit you set. The response may be incomplete.

How to handle:
if response.stop_reason == "max_tokens":
    # The response is truncated. Continue the conversation to get more.
    messages.append({"role": "assistant", "content": response.content})
    messages.append({"role": "user", "content": "Please continue."})
    
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,  # Consider increasing this
        messages=messages
    )
Best practice: If you frequently hit max_tokens, increase the limit or implement automatic continuation logic.
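The automatic continuation logic mentioned above can be sketched independently of the SDK by injecting the API call as a function. Here `create_fn` is a stand-in for `client.messages.create` with the model and other parameters already bound, and responses are modeled as plain dicts for brevity (the SDK returns typed objects with the same fields):

```python
def generate_until_done(create_fn, messages, max_rounds=5):
    """Keep requesting continuations while stop_reason == "max_tokens",
    stitching the text pieces together."""
    parts = []
    for _ in range(max_rounds):
        response = create_fn(messages=messages)
        parts.append(response["content"][0]["text"])
        if response["stop_reason"] != "max_tokens":
            break
        # Feed the partial answer back and ask for the rest
        messages = messages + [
            {"role": "assistant", "content": response["content"]},
            {"role": "user", "content": "Please continue."},
        ]
    return "".join(parts)


# Simulated API: the first call is truncated, the second completes
scripted = iter([
    {"stop_reason": "max_tokens", "content": [{"type": "text", "text": "Hello, "}]},
    {"stop_reason": "end_turn", "content": [{"type": "text", "text": "world!"}]},
])
print(generate_until_done(lambda messages: next(scripted),
                          [{"role": "user", "content": "Say hello."}]))  # Hello, world!
```

The `max_rounds` cap prevents an unbounded loop if the model keeps hitting the limit.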

stop_sequence

Indicates Claude stopped because it encountered one of the stop_sequences you specified in your API request. The stop_sequence field in the response will contain the actual sequence that triggered the stop.

How to handle:
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    stop_sequences=["\n\nHuman:", "\n\nAssistant:"],
    messages=[{"role": "user", "content": "Tell me a story."}]
)

if response.stop_reason == "stop_sequence":
    print(f"Stopped at sequence: {response.stop_sequence}")
    # The content runs up to (but does not include) the stop sequence
    print(response.content[0].text)

This is useful for role-playing or structured generation where you want to control when Claude stops.
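One illustrative structured-generation pattern (the tag name here is arbitrary, not an API convention): ask Claude to wrap its answer in a tag and pass the closing tag as a stop sequence. Since the API excludes the stop sequence itself from the returned content, only the opening tag needs stripping:

```python
def extract_tagged_answer(text, tag="answer"):
    """For text generated with stop_sequences=[f"</{tag}>"], return the
    content inside the tag; the closing tag never appears in `text`."""
    opening = f"<{tag}>"
    start = text.find(opening)
    if start == -1:
        # No opening tag found; return the text as-is
        return text.strip()
    return text[start + len(opening):].strip()


print(extract_tagged_answer("Sure: <answer>42"))  # 42
```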

Building a Robust Response Handler

Combine all stop reasons into a single handler for production applications:

def handle_claude_response(client, response, messages, tools=None):
    """
    Handle all possible stop reasons from Claude.
    Returns the final response after processing.
    """
    if response.stop_reason == "end_turn":
        # Natural end - return the response
        if not response.content:
            # Empty response - retry
            messages.append({"role": "user", "content": "Please continue."})
            return client.messages.create(
                model=response.model,
                max_tokens=1024,
                messages=messages
            )
        return response

    elif response.stop_reason == "tool_use":
        # Execute tools and continue
        messages.append({"role": "assistant", "content": response.content})
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": execute_tool(block.name, block.input)
                })
        messages.append({"role": "user", "content": tool_results})
        return client.messages.create(
            model=response.model,
            max_tokens=1024,
            tools=tools,
            messages=messages
        )

    elif response.stop_reason == "max_tokens":
        # Continue the conversation
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": "Please continue."})
        return client.messages.create(
            model=response.model,
            max_tokens=1024,
            messages=messages
        )

    elif response.stop_reason == "stop_sequence":
        # Handle as needed - content is complete up to the stop sequence
        return response

    else:
        raise ValueError(f"Unknown stop_reason: {response.stop_reason}")

Best Practices

  • Always check stop_reason before processing response content. Don't assume end_turn means the response is complete—check for empty content.
  • Handle tool_use explicitly in a loop. Claude may call multiple tools in one response, and each tool result must be sent back.
  • Increase max_tokens if you frequently see max_tokens stop reasons. For long-form content, consider setting it to 4096 or higher.
  • Use stop_sequences carefully—they can truncate responses mid-sentence if the sequence appears in generated text.
  • Log stop reasons in production to monitor response patterns and detect issues early.
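For the logging suggestion above, even a simple in-process tally can surface problems early. A minimal sketch (class and method names are illustrative, not part of any SDK):

```python
from collections import Counter


class StopReasonMonitor:
    """Tally stop reasons so truncation or tool-loop anomalies show up."""

    def __init__(self):
        self.counts = Counter()

    def record(self, stop_reason):
        self.counts[stop_reason] += 1

    def truncation_rate(self):
        # A rising rate suggests max_tokens is set too low
        total = sum(self.counts.values())
        return self.counts["max_tokens"] / total if total else 0.0


monitor = StopReasonMonitor()
for reason in ["end_turn", "end_turn", "max_tokens", "tool_use"]:
    monitor.record(reason)
print(monitor.truncation_rate())  # 0.25
```

In production you would likely emit these counts to your metrics system instead of keeping them in memory.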

Key Takeaways

  • Claude returns four stop reasons: end_turn (natural end), tool_use (wants to call a tool), max_tokens (response truncated), and stop_sequence (custom stop triggered).
  • Empty responses with end_turn are common after tool results—prevent them by sending tool results without additional text, or implement a retry mechanism.
  • tool_use requires a loop: Execute the tool, send results back, and continue the conversation until Claude returns end_turn.
  • max_tokens means incomplete output: Always continue the conversation or increase the token limit to get the full response.
  • Build a unified handler that processes all stop reasons to create robust, production-ready applications.