BeClaude
Guide · Beginner · API · 2026-05-22

Mastering Claude's Stop Reasons: A Practical Guide to Handling API Responses

Learn how to interpret and handle Claude API stop_reason values like end_turn, tool_use, and max_tokens. Includes code examples and fixes for empty responses.

Quick Answer

This guide explains Claude's stop_reason field—end_turn, tool_use, max_tokens, and stop_sequence—and how to handle each in your application. You'll learn to detect empty responses, manage tool loops, and build robust conversational flows with practical Python examples.

Tags: stop_reason, Messages API, tool_use, error handling, Claude API

Introduction

When you call the Claude Messages API, every successful response includes a stop_reason field. This small but critical piece of data tells you why Claude stopped generating—whether it finished naturally, requested a tool, hit a token limit, or matched a custom stop sequence. Misinterpreting these values can lead to broken conversations, infinite loops, or missed tool calls.

In this guide, you'll learn exactly what each stop_reason means, how to handle them in your code, and how to avoid common pitfalls like empty responses or stuck tool chains.

Understanding the stop_reason Field

The stop_reason field appears in every successful response from the Messages API. It is not an error—it indicates a normal completion. Here's a typical response structure:

{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Here's the answer to your question..."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50
  }
}

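This guide covers the four values you will meet most often (the API also defines others, such as pause_turn and refusal, which are outside its scope):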

  • end_turn – Claude finished naturally.
  • tool_use – Claude wants to call a tool.
  • max_tokens – Claude hit the max_tokens limit.
  • stop_sequence – Claude encountered a custom stop sequence.
Let's explore each one.
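
Before going through them one by one, here is a rough sketch of how these values might drive application logic. Only the stop_reason strings come from the API; the action labels and the next_action helper are made up for this illustration.

```python
# Hypothetical action labels; only the stop_reason keys are API values.
NEXT_ACTION = {
    "end_turn": "show_response",
    "tool_use": "run_tools",
    "max_tokens": "request_continuation",
    "stop_sequence": "show_response",
}


def next_action(stop_reason: str) -> str:
    """Return the illustrative next step for a given stop_reason."""
    try:
        return NEXT_ACTION[stop_reason]
    except KeyError:
        # Fail loudly on values this table does not cover.
        raise ValueError(f"Unhandled stop_reason: {stop_reason}") from None


print(next_action("tool_use"))
```

Raising on unknown values, rather than silently treating them as end_turn, makes it obvious when a response needs handling that the table does not yet provide.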

end_turn: Natural Completion

This is the most common stop reason. Claude has finished its response and expects no further action from you. In a simple Q&A flow, you can safely display the response and wait for the next user input.

from anthropic import Anthropic

client = Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)

if response.stop_reason == "end_turn":
    print(response.content[0].text)
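
Indexing content[0] assumes the first block is text, which may not hold once tools are involved. A minimal sketch of a more defensive extraction follows; collect_text is a hypothetical helper, and the SimpleNamespace objects are stand-ins shaped like SDK content blocks rather than a real API response.

```python
from types import SimpleNamespace


def collect_text(content) -> str:
    """Join the text of every text block, skipping other block types."""
    return "".join(
        block.text for block in content if getattr(block, "type", None) == "text"
    )


# Stand-in blocks: two text blocks with a tool_use block between them.
blocks = [
    SimpleNamespace(type="text", text="Let me check. "),
    SimpleNamespace(type="tool_use", id="toolu_1", name="get_weather",
                    input={"city": "Paris"}),
    SimpleNamespace(type="text", text="One moment."),
]
print(collect_text(blocks))
```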

The Empty Response Problem

Sometimes Claude returns an empty response (2–3 tokens, no content) with stop_reason: "end_turn". This usually happens in tool-use scenarios when:

  • You add text blocks immediately after tool_result blocks.
  • You send Claude's own completed response back without adding anything new.

Why it happens: Claude learns from the conversation pattern. If you always insert text after tool results, Claude may decide its turn is complete prematurely.

How to fix it:

# INCORRECT: Adding text after tool_result
messages = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_123", "name": "calculator", "input": {"operation": "add", "a": 1234, "b": 5678}}
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"}  # ❌ Don't do this
    ]}
]

# CORRECT: Send tool results directly
messages = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_123", "name": "calculator", "input": {"operation": "add", "a": 1234, "b": 5678}}
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"}  # ✅ Just the result
    ]}
]
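
To catch this pattern before a request is sent, a small check along these lines may help. It is a sketch that assumes messages are plain dicts shaped like the examples above; text_follows_tool_result is a hypothetical helper name.

```python
def text_follows_tool_result(message: dict) -> bool:
    """Return True if a text block appears after a tool_result block."""
    content = message.get("content")
    if not isinstance(content, list):
        return False  # plain-string content has no blocks to inspect
    seen_tool_result = False
    for block in content:
        if block.get("type") == "tool_result":
            seen_tool_result = True
        elif block.get("type") == "text" and seen_tool_result:
            return True
    return False


bad = {"role": "user", "content": [
    {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
    {"type": "text", "text": "Here's the result"},
]}
good = {"role": "user", "content": [
    {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
]}
print(text_follows_tool_result(bad), text_follows_tool_result(good))
```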

If you still get empty responses, add a continuation prompt in a new user message:

def handle_empty_response(client, messages):
    response = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=1024,
        messages=messages
    )
    if response.stop_reason == "end_turn" and not response.content:
        # Don't retry with the same messages—Claude already decided it's done
        messages.append({"role": "user", "content": "Please continue your response."})
        return client.messages.create(model="claude-opus-4-7", max_tokens=1024, messages=messages)
    return response
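
The check above treats only an empty content list as an empty response. If you also want to count replies made of whitespace-only text as empty, something like the following might work. This is an assumption about what "empty" should mean for your application, not API behavior, and is_empty_response is a hypothetical helper tested here against stand-in objects.

```python
from types import SimpleNamespace


def is_empty_response(response) -> bool:
    """True for an end_turn reply with no blocks or only blank text."""
    if response.stop_reason != "end_turn":
        return False
    for block in response.content:
        if block.type != "text":
            return False  # any non-text block counts as real content
        if block.text.strip():
            return False  # visible text means the reply is not empty
    return True


# Stand-ins for API responses.
no_blocks = SimpleNamespace(stop_reason="end_turn", content=[])
blank = SimpleNamespace(
    stop_reason="end_turn",
    content=[SimpleNamespace(type="text", text="\n")],
)
answer = SimpleNamespace(
    stop_reason="end_turn",
    content=[SimpleNamespace(type="text", text="6912")],
)
print(is_empty_response(no_blocks), is_empty_response(blank), is_empty_response(answer))
```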

tool_use: Claude Wants to Call a Tool

When Claude decides it needs external data or computation, it returns stop_reason: "tool_use" along with one or more tool_use content blocks. Your application must:

  • Extract the tool name and input.
  • Execute the tool (e.g., call an API, query a database).
  • Return the result as a tool_result block in a new user message.

def handle_tool_call(client, messages):
    # Define the tools once so both requests send the same list
    tools = [
        {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "input_schema": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"}
                },
                "required": ["city"]
            }
        }
    ]
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        tools=tools,
        messages=messages
    )

    if response.stop_reason == "tool_use":
        # Record Claude's turn (including its tool_use blocks) before replying
        messages.append({"role": "assistant", "content": response.content})
        # Collect one tool_result per tool_use block into a single user message
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                # Execute the tool (pseudo-code: execute_tool is your own dispatcher)
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(result)
                })
        messages.append({"role": "user", "content": tool_results})
        # Continue the conversation
        return client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            tools=tools,
            messages=messages
        )
    return response

Important: Always return tool results in a user message, not an assistant message. The tool_use_id must match the ID from Claude's request.
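
The execute_tool call in this guide is pseudo-code. One possible shape for it is a name-to-function registry, sketched below. TOOL_REGISTRY, build_tool_result, and the stubbed get_weather are hypothetical names for this example, and the weather string is placeholder data. The is_error flag is the tool_result field for reporting a failed tool call back to Claude.

```python
def get_weather(city: str) -> str:
    """Stub tool: a real implementation would call a weather service."""
    return f"Sunny in {city} (stub data)"


# Hypothetical registry mapping tool names to Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def execute_tool(name: str, tool_input: dict):
    """Look up a tool by name and call it with Claude's input."""
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        raise ValueError(f"Unknown tool: {name}")
    return handler(**tool_input)


def build_tool_result(tool_use_id: str, name: str, tool_input: dict) -> dict:
    """Run a tool and wrap the outcome in a tool_result block."""
    try:
        content, is_error = str(execute_tool(name, tool_input)), False
    except Exception as exc:
        # Report the failure to Claude instead of crashing the loop.
        content, is_error = f"Tool error: {exc}", True
    return {
        "type": "tool_result",
        "tool_use_id": tool_use_id,
        "content": content,
        "is_error": is_error,
    }


print(build_tool_result("toolu_1", "get_weather", {"city": "Paris"}))
```

Returning errors as tool results gives Claude a chance to retry with different input or explain the failure to the user.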

max_tokens: Hit the Output Limit

If Claude's response is cut off because it reached the max_tokens limit, you'll see stop_reason: "max_tokens". This is common for long-form content or complex reasoning.

How to handle it:
  • Increase max_tokens if you expect longer responses.
  • Use continuation – send Claude's partial response back and ask it to continue.
def handle_max_tokens(client, messages, response):
    if response.stop_reason == "max_tokens":
        # Append Claude's partial response to the conversation
        messages.append({"role": "assistant", "content": response.content})
        # Ask to continue
        messages.append({"role": "user", "content": "Please continue."})
        return client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=2048,  # Increase limit
            messages=messages
        )
    return response
Pro tip: For very long outputs, consider using streaming to show partial results to the user while you request continuations in the background.
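
A continuation request returns only the new text, so the pieces need to be stitched together. The sketch below shows one way to do that with a bounded number of rounds. generate_with_continuation is a hypothetical helper, and `create` stands for any callable that takes the message list and returns a response, for example a small wrapper around client.messages.create. It is exercised here with a fake so the example runs offline.

```python
from types import SimpleNamespace


def generate_with_continuation(create, messages, max_rounds=3):
    """Call create(history) until the reply is no longer cut off by
    max_tokens, and return the concatenated text of all rounds."""
    history = list(messages)  # copy so the caller's list is not mutated
    parts = []
    for _ in range(max_rounds):
        response = create(history)
        text = "".join(b.text for b in response.content if b.type == "text")
        parts.append(text)
        # Stop when the reply finished, or when there is nothing to continue.
        if response.stop_reason != "max_tokens" or not text:
            break
        history.append({"role": "assistant", "content": text})
        history.append({"role": "user", "content": "Please continue."})
    return "".join(parts)


# Fake create(): the first reply is truncated, the second completes it.
def fake_create(history):
    if len(history) == 1:
        return SimpleNamespace(
            stop_reason="max_tokens",
            content=[SimpleNamespace(type="text", text="Once upon ")],
        )
    return SimpleNamespace(
        stop_reason="end_turn",
        content=[SimpleNamespace(type="text", text="a time.")],
    )


print(generate_with_continuation(fake_create, [{"role": "user", "content": "Story"}]))
```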

stop_sequence: Custom Stop Triggered

If you defined custom stop_sequences in your API request, Claude will stop when it encounters one. The stop_sequence field will contain the matched sequence.

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    stop_sequences=["\n\nHuman:"],
    messages=[{"role": "user", "content": "Tell me a short story."}]
)

if response.stop_reason == "stop_sequence":
    print(f"Stopped at sequence: {response.stop_sequence}")
    # The content ends right before the stop sequence
    print(response.content[0].text)

This is useful for:

  • Building chatbots that stop before generating a user turn.
  • Extracting structured data by stopping at delimiters.
  • Preventing Claude from continuing beyond a certain point.
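
As an illustration of the structured-data case, the sketch below parses JSON from a response that stopped at a delimiter. The "</json>" delimiter, the parse_json_before_delimiter helper, and the sample payload are all assumptions for this example. The response is a stand-in object, not a real API result.

```python
import json
from types import SimpleNamespace


def parse_json_before_delimiter(response, delimiter="</json>"):
    """Parse the text generated before `delimiter` as JSON.

    Raises ValueError if the response did not stop at that delimiter,
    since the text may then be incomplete."""
    if response.stop_reason != "stop_sequence" or response.stop_sequence != delimiter:
        raise ValueError(
            f"Expected stop at {delimiter!r}, got stop_reason={response.stop_reason!r}"
        )
    text = "".join(b.text for b in response.content if b.type == "text")
    return json.loads(text)


# Stand-in response: the delimiter itself is not part of the text.
fake_response = SimpleNamespace(
    stop_reason="stop_sequence",
    stop_sequence="</json>",
    content=[SimpleNamespace(type="text", text='{"city": "Paris", "temp_c": 18}')],
)
print(parse_json_before_delimiter(fake_response))
```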

Building a Robust Response Handler

Here's a complete example that handles all four stop reasons:

from anthropic import Anthropic

client = Anthropic()

def process_response(client, messages, tools=None):
    # Pass tools only when some are defined; sending tools=None may be rejected
    extra = {"tools": tools} if tools else {}
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=messages,
        **extra
    )

    if response.stop_reason == "end_turn":
        if not response.content:
            # Handle empty response
            messages.append({"role": "user", "content": "Please continue."})
            return process_response(client, messages, tools)
        return response.content[0].text

    elif response.stop_reason == "tool_use":
        # Keep Claude's tool request in the history before answering it
        messages.append({"role": "assistant", "content": response.content})
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                # execute_tool is a placeholder for your own tool dispatcher
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(result)
                })
        # Return all results together in one user message
        messages.append({"role": "user", "content": tool_results})
        return process_response(client, messages, tools)

    elif response.stop_reason == "max_tokens":
        # Keep the partial text so the caller receives the full output
        partial = "".join(b.text for b in response.content if b.type == "text")
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": "Please continue."})
        return partial + process_response(client, messages, tools)

    elif response.stop_reason == "stop_sequence":
        # Content is complete up to the stop sequence
        return response.content[0].text

    else:
        raise ValueError(f"Unknown stop_reason: {response.stop_reason}")

Common Pitfalls and Best Practices

1. Don't Ignore tool_use

If you ignore a tool_use stop reason and just display the response, Claude's tool call will be lost. Always check for tool blocks.

2. Avoid Infinite Tool Loops

Set a maximum number of tool call iterations (e.g., 10) to prevent runaway loops.

MAX_TOOL_CALLS = 10
tool_call_count = 0

while response.stop_reason == "tool_use" and tool_call_count < MAX_TOOL_CALLS:
    # handle tool call
    tool_call_count += 1
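
A fuller sketch of the same idea is shown below, with the request function and tool executor passed in so it can be run without network access. run_tool_loop, `create`, and the fake functions are hypothetical names. In a real application `create` would wrap client.messages.create with your model, tools, and max_tokens.

```python
from types import SimpleNamespace

MAX_TOOL_CALLS = 10


def run_tool_loop(create, execute_tool, messages, max_tool_calls=MAX_TOOL_CALLS):
    """Answer tool_use requests until Claude stops asking or the cap is hit."""
    history = list(messages)
    response = create(history)
    rounds = 0
    while response.stop_reason == "tool_use":
        if rounds >= max_tool_calls:
            raise RuntimeError(f"Gave up after {max_tool_calls} tool rounds")
        # Claude's request goes into the history first, then all results
        # for that turn are returned together in one user message.
        history.append({"role": "assistant", "content": response.content})
        results = [
            {"type": "tool_result", "tool_use_id": b.id,
             "content": str(execute_tool(b.name, b.input))}
            for b in response.content if b.type == "tool_use"
        ]
        history.append({"role": "user", "content": results})
        response = create(history)
        rounds += 1
    return response


# Fake create(): asks for one tool call, then finishes.
def fake_create(history):
    if len(history) == 1:
        block = SimpleNamespace(type="tool_use", id="toolu_1",
                                name="echo", input={"text": "hi"})
        return SimpleNamespace(stop_reason="tool_use", content=[block])
    return SimpleNamespace(
        stop_reason="end_turn",
        content=[SimpleNamespace(type="text", text="Done.")],
    )


final = run_tool_loop(fake_create, lambda name, args: args["text"],
                      [{"role": "user", "content": "Say hi"}])
print(final.stop_reason)
```

Raising when the cap is reached, instead of quietly exiting the loop, keeps a response that still has stop_reason "tool_use" from being mistaken for a finished answer.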

3. Stream for Long Responses

For max_tokens scenarios, streaming gives users immediate feedback and lets you handle continuations gracefully.

4. Validate Tool Results

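Always ensure tool results are properly formatted and include the correct tool_use_id. Each tool_result must reference the id of a tool_use block from the immediately preceding assistant message; a mismatched or missing ID typically causes the API to reject the request with a validation error.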
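
A quick pre-flight check can surface ID mismatches before the request is sent. The sketch below assumes both messages are plain dicts whose content is a list of block dicts, as in the earlier examples. unmatched_tool_result_ids is a hypothetical helper.

```python
def unmatched_tool_result_ids(assistant_message: dict, user_message: dict) -> dict:
    """Compare tool_use ids requested by Claude with tool_result ids returned."""
    requested = {
        block["id"]
        for block in assistant_message["content"]
        if block.get("type") == "tool_use"
    }
    returned = {
        block["tool_use_id"]
        for block in user_message["content"]
        if block.get("type") == "tool_result"
    }
    return {
        "missing": requested - returned,     # tool calls with no result
        "unexpected": returned - requested,  # results for unknown ids
    }


assistant = {"role": "assistant", "content": [
    {"type": "tool_use", "id": "toolu_123", "name": "calculator",
     "input": {"operation": "add", "a": 1234, "b": 5678}},
]}
user = {"role": "user", "content": [
    {"type": "tool_result", "tool_use_id": "toolu_999", "content": "6912"},
]}
print(unmatched_tool_result_ids(assistant, user))
```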

Key Takeaways

  • stop_reason is your guide – It tells you exactly why Claude stopped, enabling you to build the correct next step in your application logic.
  • Empty responses with end_turn are usually caused by adding text after tool results. Fix by sending only tool_result blocks, or add a continuation prompt.
  • tool_use requires a loop – Your application must execute the tool and return results in a user message. Always limit the number of iterations.
  • max_tokens means partial output – Increase the limit or use continuation prompts. Streaming helps manage user expectations.
  • stop_sequence gives you control – Use custom stop sequences to prevent unwanted generation or extract structured data.
By mastering these four stop reasons, you'll build Claude applications that handle every scenario gracefully—from simple Q&A to complex multi-tool workflows.