Guide · Beginner · API · 2026-05-22

Mastering Claude's Stop Reasons: A Practical Guide to Handling API Responses

Learn how to interpret and handle Claude API stop_reason values like end_turn, tool_use, and max_tokens. Includes code examples and fixes for empty responses.

Quick Answer

This guide explains Claude's stop_reason field—end_turn, tool_use, max_tokens, and stop_sequence—and how to handle each in your application. You'll learn to detect empty responses, manage tool loops, and build robust conversational flows with practical Python examples.

Tags: stop_reason, Messages API, tool_use, error handling, Claude API

Introduction

When you call the Claude Messages API, every successful response includes a stop_reason field. This small but critical piece of data tells you why Claude stopped generating—whether it finished naturally, requested a tool, hit a token limit, or matched a custom stop sequence. Misinterpreting these values can lead to broken conversations, infinite loops, or missed tool calls.

In this guide, you'll learn exactly what each stop_reason means, how to handle them in your code, and how to avoid common pitfalls like empty responses or stuck tool chains.

Understanding the `stop_reason` Field

The stop_reason field appears in every successful response from the Messages API. It is not an error—it indicates a normal completion. Here's a typical response structure:

{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Here's the answer to your question..."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50
  }
}

The four values covered in this guide are:

end_turn – Claude finished naturally.
tool_use – Claude wants to call a tool.
max_tokens – Claude hit the max_tokens limit.
stop_sequence – Claude encountered a custom stop sequence.
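
As a rough illustration of how these values translate into application behavior, the sketch below routes each value to a next step. The name `next_action` and the action labels are hypothetical, chosen for this example rather than taken from the API.

```python
# Hypothetical mapping from stop_reason to the application's next step.
# The action labels are illustrative and not part of the API.
NEXT_ACTIONS = {
    "end_turn": "show_reply",
    "tool_use": "run_tools",
    "max_tokens": "continue_or_raise_limit",
    "stop_sequence": "show_reply",
}


def next_action(stop_reason: str) -> str:
    """Return the next step for a stop_reason, failing loudly on unknown values."""
    try:
        return NEXT_ACTIONS[stop_reason]
    except KeyError:
        raise ValueError(f"Unhandled stop_reason: {stop_reason!r}") from None


print(next_action("tool_use"))
```

Raising on an unrecognized value, instead of silently treating it as a finished reply, makes it obvious when a response needs handling the application does not yet have.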

Let's explore each one.

`end_turn`: Natural Completion

This is the most common stop reason. Claude has finished its response and expects no further action from you. In a simple Q&A flow, you can safely display the response and wait for the next user input.

from anthropic import Anthropic
client = Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)
if response.stop_reason == "end_turn":
    print(response.content[0].text)
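
The example above reads only the first content block. A response can contain several blocks, and not all of them are text, so a slightly more defensive approach is to join every text block. The helper below is a minimal sketch: `extract_text` is a hypothetical name, and `SimpleNamespace` objects stand in for SDK content blocks so the snippet runs without an API call.

```python
from types import SimpleNamespace


def extract_text(content) -> str:
    """Concatenate the text of all text blocks, skipping other block types."""
    return "".join(block.text for block in content if block.type == "text")


# Stand-in content blocks (assumed to expose .type and .text like SDK blocks)
content = [
    SimpleNamespace(type="text", text="Hello! "),
    SimpleNamespace(type="tool_use", id="toolu_123", name="calculator", input={}),
    SimpleNamespace(type="text", text="How can I help?"),
]
print(extract_text(content))
```

Because the helper returns an empty string for an empty content list, it also avoids the `IndexError` that `response.content[0]` would raise on an empty response.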

The Empty Response Problem

Sometimes Claude returns an empty response (2–3 tokens, no content) with stop_reason: "end_turn". This usually happens in tool-use scenarios when:

You add text blocks immediately after tool_result blocks.
You send Claude's own completed response back without adding anything new.

Why it happens: Claude learns from the conversation pattern. If you always insert text after tool results, Claude may decide its turn is complete prematurely.

How to fix it:

# INCORRECT: Adding text after tool_result
messages = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_123", "name": "calculator", "input": {"operation": "add", "a": 1234, "b": 5678}}
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"}  # ❌ Don't do this
    ]}
]

# CORRECT: Send tool results directly
messages = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_123", "name": "calculator", "input": {"operation": "add", "a": 1234, "b": 5678}}
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"}  # ✅ Just the result
    ]}
]
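
If message histories are assembled in several places, it can help to enforce this rule in one spot before each request. The function below is a sketch of that idea, assuming user content blocks are plain dicts as in the examples above; `drop_text_after_tool_results` is a hypothetical helper name.

```python
def drop_text_after_tool_results(messages):
    """Return a copy of messages in which user turns that contain tool_result
    blocks keep nothing after their last tool_result block."""
    cleaned = []
    for message in messages:
        content = message["content"]
        if message["role"] == "user" and isinstance(content, list):
            result_positions = [
                i for i, block in enumerate(content)
                if block.get("type") == "tool_result"
            ]
            if result_positions:
                # Cut everything that follows the final tool_result
                content = content[: result_positions[-1] + 1]
        cleaned.append({"role": message["role"], "content": content})
    return cleaned


history = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"},
    ]},
]
print(drop_text_after_tool_results(history)[-1]["content"])
```

The original list is left unmodified, so the caller can still log or inspect what was removed.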

If you still get empty responses, add a continuation prompt in a new user message:

def handle_empty_response(client, messages):
    response = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=1024,
        messages=messages
    )
    if response.stop_reason == "end_turn" and not response.content:
        # Don't retry with the same messages—Claude already decided it's done
        messages.append({"role": "user", "content": "Please continue your response."})
        return client.messages.create(model="claude-opus-4-7", max_tokens=1024, messages=messages)
    return response

`tool_use`: Claude Wants to Call a Tool

When Claude decides it needs external data or computation, it returns stop_reason: "tool_use" along with one or more tool_use content blocks. Your application must:

Extract the tool name and input.
Execute the tool (e.g., call an API, query a database).
Return the result as a tool_result block in a new user message.

# Tool definitions are sent with every request in the tool loop
TOOLS = [
    {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"}
            },
            "required": ["city"]
        }
    }
]

def handle_tool_call(client, messages):
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        tools=TOOLS,
        messages=messages
    )
    if response.stop_reason == "tool_use":
        # Record Claude's turn (including its tool_use blocks) before replying
        messages.append({"role": "assistant", "content": response.content})
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                # execute_tool is a placeholder for your own tool dispatcher
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(result)
                })
        # Return all results together in a single user message
        messages.append({"role": "user", "content": tool_results})
        # Continue the conversation
        return client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            tools=TOOLS,
            messages=messages
        )

    return response

Important: Always return tool results in a user message, not an assistant message. The tool_use_id must match the ID from Claude's request.
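
The `execute_tool` call in the example above is left as pseudo-code. One possible shape for it is a small registry that maps tool names to Python functions, sketched below. Everything here is an assumption for illustration: the registry, the placeholder `get_weather` body, and the choice to report failures as text.

```python
def get_weather(city: str) -> str:
    """Placeholder implementation; a real tool would call a weather service."""
    return f"Weather lookup for {city} is not implemented in this sketch."


# Hypothetical registry mapping tool names to Python callables
TOOL_REGISTRY = {"get_weather": get_weather}


def execute_tool(name: str, tool_input: dict) -> str:
    """Run a registered tool and return its result as a string.

    Failures are returned as text so the caller can still send a tool_result.
    """
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        return f"Error: unknown tool {name!r}"
    try:
        return str(handler(**tool_input))
    except Exception as exc:  # report tool failures to the model as text
        return f"Error: {exc}"


print(execute_tool("get_weather", {"city": "Paris"}))
```

Returning an error string instead of raising keeps the conversation valid, since every tool_use block still receives a matching tool_result.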

`max_tokens`: Hit the Output Limit

If Claude's response is cut off because it reached the max_tokens limit, you'll see stop_reason: "max_tokens". This is common for long-form content or complex reasoning.

How to handle it:

Increase max_tokens if you expect longer responses.
Use continuation – send Claude's partial response back and ask it to continue.

def handle_max_tokens(client, messages, response):
    if response.stop_reason == "max_tokens":
        # Append Claude's partial response to the conversation
        messages.append({"role": "assistant", "content": response.content})
        # Ask to continue
        messages.append({"role": "user", "content": "Please continue."})
        return client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=2048,  # Increase limit
            messages=messages
        )
    return response
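
A single continuation may not be enough for very long outputs. The sketch below repeats the continuation step a bounded number of times and stitches the pieces together. It assumes a `create(messages)` callable standing in for `client.messages.create`, and uses a hypothetical fake so it runs without an API call.

```python
from types import SimpleNamespace


def collect_full_text(create, messages, max_rounds=3):
    """Call create(messages) until stop_reason is no longer "max_tokens"
    or max_rounds requests have been made; return the joined text."""
    parts = []
    for _ in range(max_rounds):
        response = create(messages)
        text = "".join(b.text for b in response.content if b.type == "text")
        parts.append(text)
        if response.stop_reason != "max_tokens":
            break
        # Feed the partial answer back and ask for the rest
        messages.append({"role": "assistant", "content": text})
        messages.append({"role": "user", "content": "Please continue."})
    return "".join(parts)


def make_fake_create(replies):
    """Build a stand-in for client.messages.create that replays canned replies."""
    queue = iter(replies)

    def create(messages):
        text, reason = next(queue)
        return SimpleNamespace(
            content=[SimpleNamespace(type="text", text=text)],
            stop_reason=reason,
        )

    return create


fake = make_fake_create([("Once upon ", "max_tokens"), ("a time.", "end_turn")])
print(collect_full_text(fake, [{"role": "user", "content": "Tell me a story."}]))
```

The `max_rounds` cap is a design choice of this sketch: without it, a response that keeps hitting the limit would trigger requests indefinitely.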

Pro tip: For very long outputs, consider using streaming to show partial results to the user while you request continuations in the background.

`stop_sequence`: Custom Stop Triggered

If you defined custom stop_sequences in your API request, Claude will stop when it encounters one. The stop_sequence field will contain the matched sequence.

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    stop_sequences=["\n\nHuman:"],
    messages=[{"role": "user", "content": "Tell me a short story."}]
)
if response.stop_reason == "stop_sequence":
    print(f"Stopped at sequence: {response.stop_sequence}")
    # The content ends right before the stop sequence
    print(response.content[0].text)

This is useful for:

Building chatbots that stop before generating a user turn.
Extracting structured data by stopping at delimiters.
Preventing Claude from continuing beyond a certain point.
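
To make the structured-data case concrete, the sketch below assumes the prompt asked Claude to wrap its answer in `<json>` and `</json>` tags and that `</json>` was passed in `stop_sequences`. The tag names and the `parse_json_before_stop` helper are hypothetical, and a stand-in object replaces a real API response.

```python
import json
from types import SimpleNamespace


def parse_json_before_stop(response, expected_stop="</json>"):
    """Parse the response text as JSON if generation stopped at expected_stop."""
    if response.stop_reason != "stop_sequence" or response.stop_sequence != expected_stop:
        raise ValueError("Response did not stop at the expected delimiter")
    text = response.content[0].text
    # Keep only what follows the last opening tag, if the model emitted one
    _, _, payload = text.rpartition("<json>")
    return json.loads(payload)


# Stand-in for a response that stopped at the custom delimiter
fake_response = SimpleNamespace(
    stop_reason="stop_sequence",
    stop_sequence="</json>",
    content=[SimpleNamespace(type="text", text='<json>{"city": "Paris", "temp_c": 18}')],
)
print(parse_json_before_stop(fake_response))
```

Checking `stop_reason` first matters here: if the response ended with max_tokens instead, the JSON is probably truncated and should not be parsed.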

Building a Robust Response Handler

Here's a complete example that handles all four stop reasons:

from anthropic import Anthropic

client = Anthropic()

def process_response(client, messages, tools=None):
    # Only pass the tools parameter when tools are defined
    kwargs = {"tools": tools} if tools else {}
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=messages,
        **kwargs
    )

    if response.stop_reason == "end_turn":
        if not response.content:
            # Handle empty response
            messages.append({"role": "user", "content": "Please continue."})
            return process_response(client, messages, tools)
        return response.content[0].text

    elif response.stop_reason == "tool_use":
        # Record Claude's turn, then answer every tool_use block in one user message
        messages.append({"role": "assistant", "content": response.content})
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                # execute_tool is a placeholder for your own tool dispatcher
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(result)
                })
        messages.append({"role": "user", "content": tool_results})
        return process_response(client, messages, tools)

    elif response.stop_reason == "max_tokens":
        # Send the partial output back and ask Claude to continue
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": "Please continue."})
        return process_response(client, messages, tools)

    elif response.stop_reason == "stop_sequence":
        # Content is complete up to the stop sequence
        return response.content[0].text

    else:
        raise ValueError(f"Unknown stop_reason: {response.stop_reason}")

Common Pitfalls and Best Practices

1. Don't Ignore `tool_use`

If you ignore a tool_use stop reason and just display the response, Claude's tool call will be lost. Always check for tool blocks.

2. Avoid Infinite Tool Loops

Set a maximum number of tool call iterations (e.g., 10) to prevent runaway loops.

MAX_TOOL_CALLS = 10
tool_call_count = 0
while response.stop_reason == "tool_use" and tool_call_count < MAX_TOOL_CALLS:
    # Handle the tool call here, then reassign `response` to the follow-up request
    tool_call_count += 1
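
The counter pattern above can be expanded into a complete loop. The version below is a sketch under stated assumptions: `create(messages)` stands in for `client.messages.create`, `run_tool(name, input)` stands in for your own dispatcher, and a fake `create` is used so the example runs offline.

```python
from types import SimpleNamespace

MAX_TOOL_CALLS = 10


def run_tool_loop(create, run_tool, messages, max_tool_calls=MAX_TOOL_CALLS):
    """Repeat request -> tool execution until Claude stops asking for tools."""
    response = create(messages)
    tool_call_count = 0
    while response.stop_reason == "tool_use":
        if tool_call_count >= max_tool_calls:
            raise RuntimeError(f"Exceeded {max_tool_calls} tool iterations")
        # Record Claude's turn, then answer all of its tool_use blocks at once
        messages.append({"role": "assistant", "content": response.content})
        results = [
            {"type": "tool_result", "tool_use_id": b.id,
             "content": str(run_tool(b.name, b.input))}
            for b in response.content if b.type == "tool_use"
        ]
        messages.append({"role": "user", "content": results})
        response = create(messages)
        tool_call_count += 1
    return response


# Fake create: asks for one tool call, then finishes
replies = iter([
    SimpleNamespace(stop_reason="tool_use", content=[
        SimpleNamespace(type="tool_use", id="toolu_1", name="add", input={"a": 1, "b": 2})
    ]),
    SimpleNamespace(stop_reason="end_turn", content=[
        SimpleNamespace(type="text", text="The sum is 3.")
    ]),
])
conversation = [{"role": "user", "content": "Add 1 and 2"}]
final = run_tool_loop(
    lambda msgs: next(replies),
    lambda name, args: args["a"] + args["b"],
    conversation,
)
print(final.stop_reason, len(conversation))
```

Raising once the cap is reached, instead of quietly returning the last response, keeps a runaway loop from being mistaken for a finished answer.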

3. Stream for Long Responses

For max_tokens scenarios, streaming gives users immediate feedback and lets you handle continuations gracefully.

4. Validate Tool Results

Always ensure tool results are properly formatted and include the correct tool_use_id. A tool_result whose ID does not match a tool_use block from Claude's preceding turn is typically rejected by the API or ignored by Claude.
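
One way to catch ID mismatches before sending a request is to compare the two sets of IDs directly. The check below is a minimal sketch that assumes content blocks are plain dicts; `check_tool_results` is a hypothetical helper name.

```python
def check_tool_results(assistant_content, user_content):
    """Compare tool_use IDs with tool_result IDs.

    Returns (missing, unexpected): IDs Claude requested that have no result,
    and result IDs that match no request.
    """
    requested = {b["id"] for b in assistant_content if b.get("type") == "tool_use"}
    answered = {b["tool_use_id"] for b in user_content if b.get("type") == "tool_result"}
    return sorted(requested - answered), sorted(answered - requested)


assistant_blocks = [
    {"type": "tool_use", "id": "toolu_123", "name": "calculator", "input": {}},
]
user_blocks = [
    {"type": "tool_result", "tool_use_id": "toolu_124", "content": "6912"},
]
missing, unexpected = check_tool_results(assistant_blocks, user_blocks)
print(missing, unexpected)
```

Both lists being empty means every tool request has exactly the results it expects.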

Key Takeaways

stop_reason is your guide – It tells you exactly why Claude stopped, enabling you to build the correct next step in your application logic.
Empty responses with end_turn are usually caused by adding text after tool results. Fix by sending only tool_result blocks, or add a continuation prompt.
tool_use requires a loop – Your application must execute the tool and return results in a user message. Always limit the number of iterations.
max_tokens means partial output – Increase the limit or use continuation prompts. Streaming helps manage user expectations.
stop_sequence gives you control – Use custom stop sequences to prevent unwanted generation or extract structured data.

By mastering these four stop reasons, you'll build Claude applications that handle every scenario gracefully—from simple Q&A to complex multi-tool workflows.
