BeClaude Guide · 2026-04-29

Mastering Claude’s Stop Reasons: Build Reliable API Applications

Learn how to interpret and handle Claude's stop_reason field—end_turn, tool_use, max_tokens, and stop_sequence—with practical Python examples to build robust, production-ready apps.

Quick Answer

This guide explains Claude’s stop_reason field—end_turn, tool_use, max_tokens, and stop_sequence—and shows how to handle each case in Python. You’ll learn to detect empty responses, continue tool loops, and manage token limits for reliable API integrations.

Tags: Claude API · stop_reason · error handling · tool use · API best practices

Introduction

When you call the Claude API, every successful response includes a stop_reason field. This small piece of data tells you why the model stopped generating—whether it finished naturally, requested a tool call, hit a token limit, or encountered a stop sequence. Ignoring it can lead to incomplete answers, broken tool loops, or silent failures.

In this guide, you’ll learn exactly what each stop reason means, how to handle it in your code, and how to avoid common pitfalls like empty responses. By the end, you’ll be able to build robust applications that gracefully handle every scenario.

Understanding the stop_reason Field

The stop_reason field appears in every successful Messages API response. It’s not an error—it’s a signal about how Claude completed its turn. Here’s a typical response snippet:

{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Here's the answer to your question..."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50
  }
}

There are four possible values for stop_reason:

Value           Meaning
end_turn        Claude finished its response naturally.
tool_use        Claude wants to call a tool (function).
max_tokens      Claude stopped because it hit the max_tokens limit.
stop_sequence   Claude encountered a custom stop sequence you defined.

Let's explore each one in detail.

end_turn: The Natural Finish

end_turn is the most common stop reason. It means Claude decided it had completed its response and didn’t need to say more. In most cases, you can simply return the response to the user.

Basic Handling in Python

from anthropic import Anthropic

client = Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)

if response.stop_reason == "end_turn":
    print(response.content[0].text)

The Empty Response Gotcha

Sometimes Claude returns stop_reason: "end_turn" with an empty or near-empty response (2–3 tokens, no meaningful content). This typically happens in tool-use workflows when:

  • You add text blocks immediately after tool_result blocks.
  • You send Claude’s completed response back without adding anything new.
Why it happens: Claude learns patterns from your message history. If you always insert text after tool results, Claude may end its turn prematurely, expecting you to continue the pattern.

How to prevent it:
# INCORRECT: Adding text after tool_result
messages = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "assistant", "content": [
        {
            "type": "tool_use",
            "id": "toolu_123",
            "name": "calculator",
            "input": {"operation": "add", "a": 1234, "b": 5678}
        }
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"}  # ❌ Don't do this
    ]}
]

# CORRECT: Send only the tool_result
messages = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "assistant", "content": [
        {
            "type": "tool_use",
            "id": "toolu_123",
            "name": "calculator",
            "input": {"operation": "add", "a": 1234, "b": 5678}
        }
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"}
    ]}  # ✅ Just the result
]

If you still get empty responses, add a retry loop:

def handle_empty_response(client, messages):
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=messages
    )

    # Check for empty *and* near-empty responses (e.g., 2-3 whitespace tokens)
    text = "".join(getattr(block, "text", "") for block in response.content)
    if response.stop_reason == "end_turn" and not text.strip():
        # Retry with a slight prompt adjustment
        messages.append({"role": "user", "content": "Please continue."})
        return client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=messages
        )
    return response

tool_use: Claude Wants to Call a Tool

When stop_reason is tool_use, Claude has decided it needs to use a tool to complete the task. Your application must:

  • Extract the tool call details from content.
  • Execute the tool (e.g., call an API, query a database).
  • Append the result as a tool_result block.
  • Send the updated message history back to Claude.

Complete Tool Loop Example

from anthropic import Anthropic

client = Anthropic()
messages = [
    {"role": "user", "content": "What's the weather in Tokyo?"}
]

while True:
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        tools=[{
            "name": "get_weather",
            "description": "Get current weather for a city",
            "input_schema": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"}
                },
                "required": ["city"]
            }
        }],
        messages=messages
    )

    if response.stop_reason == "end_turn":
        print(response.content[0].text)
        break
    elif response.stop_reason == "tool_use":
        # Extract the tool call (it may not be the first content block
        # if Claude emits text before it)
        tool_call = next(b for b in response.content if b.type == "tool_use")
        tool_name = tool_call.name
        tool_input = tool_call.input

        # Execute tool (simulated)
        if tool_name == "get_weather":
            result = f"25°C in {tool_input['city']}"

        # Append assistant's tool call and the result
        messages.append({"role": "assistant", "content": response.content})
        messages.append({
            "role": "user",
            "content": [{
                "type": "tool_result",
                "tool_use_id": tool_call.id,
                "content": result
            }]
        })
        # Loop continues

max_tokens: Hit the Token Limit

When stop_reason is max_tokens, Claude’s response was cut off because it reached the max_tokens limit you set. The response may be incomplete.

How to Handle It

messages = [
    {"role": "user", "content": "Write a detailed essay on AI ethics."}
]

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=100,  # Intentionally low for demonstration
    messages=messages
)

if response.stop_reason == "max_tokens":
    print("Response was truncated. Consider increasing max_tokens.")
    # Option 1: Increase max_tokens and retry
    # Option 2: Append the partial response and ask Claude to continue
    partial_text = response.content[0].text
    messages.append({"role": "assistant", "content": partial_text})
    messages.append({"role": "user", "content": "Continue from where you left off."})
    continued_response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=500,
        messages=messages
    )

Best practice: Set max_tokens generously (e.g., 4096 or higher) for open-ended tasks, or implement a continuation loop as shown above.

stop_sequence: Custom Stop Sequence Encountered

If you define custom stop sequences in your API request, Claude will stop generating when it encounters one. The stop_sequence field will contain the matched sequence.

Example

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    stop_sequences=["\n\nEND"],
    messages=[
        {"role": "user", "content": "List three fruits and then write END."}
    ]
)

if response.stop_reason == "stop_sequence":
    print(f"Stopped at sequence: {response.stop_sequence}")
    print(response.content[0].text)

This is useful for structured outputs where you want Claude to stop at a delimiter.

Building a Robust Response Handler

Combine all the logic into a single function:

def handle_claude_response(response, client, messages):
    """Route Claude's response based on stop_reason."""
    
    if response.stop_reason == "end_turn":
        text = "".join(getattr(block, "text", "") for block in response.content)
        if not text.strip():
            # Handle empty or near-empty response
            messages.append({"role": "user", "content": "Please continue."})
            return client.messages.create(
                model=response.model,
                max_tokens=response.usage.output_tokens + 100,
                messages=messages
            )
        return response.content[0].text
    
    elif response.stop_reason == "tool_use":
        # Extract and execute tool
        # Extract and execute the tool call (may not be the first block)
        tool_call = next(b for b in response.content if b.type == "tool_use")
        result = execute_tool(tool_call.name, tool_call.input)
        messages.append({"role": "assistant", "content": response.content})
        messages.append({
            "role": "user",
            "content": [{
                "type": "tool_result",
                "tool_use_id": tool_call.id,
                "content": result
            }]
        })
        # Recursively call
        new_response = client.messages.create(
            model=response.model,
            max_tokens=response.usage.output_tokens + 500,
            messages=messages
        )
        return handle_claude_response(new_response, client, messages)
    
    elif response.stop_reason == "max_tokens":
        # Continue generation
        messages.append({"role": "assistant", "content": response.content[0].text})
        messages.append({"role": "user", "content": "Continue."})
        new_response = client.messages.create(
            model=response.model,
            max_tokens=response.usage.output_tokens + 500,
            messages=messages
        )
        return handle_claude_response(new_response, client, messages)
    
    elif response.stop_reason == "stop_sequence":
        # Custom handling
        return response.content[0].text
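
The handler above calls an execute_tool helper that isn't defined in this guide. A minimal sketch, assuming a calculator tool like the one in the earlier examples (the tool name and operations are illustrative, not a fixed API):

```python
def execute_tool(name, tool_input):
    """Dispatch a tool call to a local implementation; return a string result."""
    if name == "calculator":
        ops = {
            "add": lambda a, b: a + b,
            "subtract": lambda a, b: a - b,
        }
        op = ops.get(tool_input.get("operation"))
        if op is None:
            return f"Unsupported operation: {tool_input.get('operation')}"
        return str(op(tool_input["a"], tool_input["b"]))
    return f"Unknown tool: {name}"
```

Returning an error string (rather than raising) lets Claude see the failure in the tool_result and recover in its next turn.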

Key Takeaways

  • Always check stop_reason in every API response to determine the next action—don’t assume the response is final.
  • For tool_use, implement a loop that executes the tool and feeds the result back to Claude until you get end_turn.
  • For max_tokens, either increase the limit or implement a continuation pattern to avoid truncated responses.
  • Avoid empty end_turn responses by sending only tool_result blocks (no extra text) in tool workflows.
  • Use stop_sequences for structured outputs when you need Claude to stop at a specific delimiter.