Guide · Beginner · Agents · 2026-05-21

Mastering Claude API Stop Reasons: A Practical Guide to Handling Response Termination

Learn how to interpret and handle Claude API stop_reason values like end_turn, tool_use, and max_tokens to build robust, production-ready applications.

Quick Answer

This guide explains Claude's stop_reason field—why the model stops generating (end_turn, tool_use, max_tokens, stop_sequence)—and provides actionable code examples for handling each case, including empty responses and tool loops.

Tags: Claude API, stop_reason, error handling, tool use, best practices

Introduction

When you send a request to the Claude API, the response includes a stop_reason field that tells you why the model stopped generating. This isn't an error—it's a signal. Understanding these signals is essential for building applications that respond intelligently to different scenarios, from natural conversation endings to tool call requests.

In this guide, you'll learn:

What each stop_reason value means
How to handle end_turn (including empty responses)
How to process tool_use and continue the conversation
How to manage max_tokens limits gracefully
Best practices for production systems

The `stop_reason` Field

The stop_reason field appears in every successful Messages API response. Unlike HTTP errors (which indicate a failed request), stop_reason tells you why Claude successfully finished its response.

Here's a typical response structure:

{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Here's the answer to your question..."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50
  }
}
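As a rough illustration of reading these fields, the sketch below parses a response body like the one above with the standard library and pulls out the values a handler usually branches on. The `summarize` helper is a hypothetical name introduced here; with the official SDK you would read the same fields as attributes (for example `response.stop_reason`).

```python
import json

# Raw response body, condensed from the example above
raw = """{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [{"type": "text", "text": "Here's the answer to your question..."}],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {"input_tokens": 100, "output_tokens": 50}
}"""


def summarize(body: str) -> dict:
    """Hypothetical helper: extract the fields a stop-reason handler needs."""
    data = json.loads(body)
    # Join every text block; non-text blocks (such as tool_use) are skipped
    text = "".join(b["text"] for b in data["content"] if b["type"] == "text")
    return {
        "stop_reason": data["stop_reason"],
        "stop_sequence": data["stop_sequence"],
        "text": text,
        "output_tokens": data["usage"]["output_tokens"],
    }


summary = summarize(raw)
print(summary["stop_reason"])
```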

Stop Reason Values

`end_turn`

What it means: Claude finished its response naturally. It has nothing more to say and is waiting for the user to respond.

When it occurs: This is the most common stop reason. You'll see it after Claude answers a question, completes a task, or decides its turn is over.

How to handle it: In most cases, you can display the response to the user and wait for their next input.

from anthropic import Anthropic
client = Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)
if response.stop_reason == "end_turn":
    # Process the complete response
    print(response.content[0].text)

#### Empty Responses with end_turn

Sometimes Claude returns an empty response (only 2–3 output tokens and no content blocks) with stop_reason: "end_turn". This typically happens when Claude judges that the assistant turn is already complete, particularly after tool results.

Common causes:

Adding text blocks immediately after tool results (Claude learns to expect the user to always insert text after tool results, so it ends its turn to follow the pattern)
Sending Claude's completed response back without adding anything (Claude already decided it's done, so it will remain done)

How to prevent empty responses:

# INCORRECT: Adding text immediately after tool_result
messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {"role": "assistant", "content": [
        {
            "type": "tool_use",
            "id": "toolu_123",
            "name": "calculator",
            "input": {"operation": "add", "a": 1234, "b": 5678}
        }
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"}  # Don't add text after tool_result
    ]}
]
# CORRECT: Send tool results directly without additional text
messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {"role": "assistant", "content": [
        {
            "type": "tool_use",
            "id": "toolu_123",
            "name": "calculator",
            "input": {"operation": "add", "a": 1234, "b": 5678}
        }
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"}
    ]}  # Just the tool_result, no additional text
]
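One way to catch the incorrect pattern before a request goes out is a small check over the outgoing message. The sketch below is only an illustration: `has_text_after_tool_result` is a hypothetical helper, and it assumes messages are plain dictionaries shaped like the examples above.

```python
def has_text_after_tool_result(message: dict) -> bool:
    """Return True if a user message places a text block after a tool_result block."""
    content = message.get("content")
    # Plain-string content or non-user messages cannot contain the pattern
    if message.get("role") != "user" or not isinstance(content, list):
        return False
    seen_tool_result = False
    for block in content:
        if block.get("type") == "tool_result":
            seen_tool_result = True
        elif block.get("type") == "text" and seen_tool_result:
            return True
    return False


bad = {"role": "user", "content": [
    {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
    {"type": "text", "text": "Here's the result"},
]}
good = {"role": "user", "content": [
    {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
]}

print(has_text_after_tool_result(bad), has_text_after_tool_result(good))
```

A check like this could run in tests or as an assertion in development builds, rather than on every production request.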

If you still get empty responses after fixing the above, use a continuation prompt:

def handle_empty_response(client, messages):
    response = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=1024,
        messages=messages
    )
    
    if response.stop_reason == "end_turn" and not response.content:
        # Add a continuation prompt in a NEW user message
        messages.append({
            "role": "user",
            "content": "Please continue with your response."
        })
        return client.messages.create(
            model="claude-opus-4-7",
            max_tokens=1024,
            messages=messages
        )
    return response

`tool_use`

What it means: Claude wants to use a tool. The response will contain one or more tool_use content blocks.

When it occurs: When you've defined tools in your API request and Claude decides it needs to call one (or more) to complete the task.

How to handle it: You must execute the tool, return the result as a tool_result block, and continue the conversation.

def handle_tool_use(client, response, messages, tools):
    # `tools` is the same list of tool definitions sent in the original request.
    # `execute_tool` is your application's own dispatcher; it is not part of the SDK.

    # Extract tool use blocks
    tool_use_blocks = [
        block for block in response.content
        if block.type == "tool_use"
    ]

    # Execute each tool and collect results
    tool_results = []
    for tool_use in tool_use_blocks:
        result = execute_tool(tool_use.name, tool_use.input)
        tool_results.append({
            "type": "tool_result",
            "tool_use_id": tool_use.id,
            "content": str(result)
        })

    # Append assistant response and tool results to messages
    messages.append({"role": "assistant", "content": response.content})
    messages.append({"role": "user", "content": tool_results})

    # Continue the conversation, passing the tool definitions again
    # so Claude can interpret the results and request further calls
    return client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        tools=tools,
        messages=messages
    )
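The handler above relies on an `execute_tool` function that the guide leaves to the application. A minimal sketch of one possible shape follows, assuming a registry that maps tool names to local Python callables. `TOOL_REGISTRY` and `calculator` are hypothetical names chosen to match the calculator example earlier in this guide.

```python
def calculator(operation: str, a: float, b: float) -> float:
    """Hypothetical local implementation of the 'calculator' tool."""
    operations = {
        "add": lambda x, y: x + y,
        "subtract": lambda x, y: x - y,
        "multiply": lambda x, y: x * y,
    }
    if operation not in operations:
        raise ValueError(f"unsupported operation: {operation}")
    return operations[operation](a, b)


# Hypothetical registry: tool name -> callable taking the tool input as keyword args
TOOL_REGISTRY = {"calculator": calculator}


def execute_tool(name: str, tool_input: dict):
    """Look up a tool by name and run it, returning errors as text."""
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        return f"Error: unknown tool '{name}'"
    try:
        return handler(**tool_input)
    except Exception as exc:  # report the failure back to Claude instead of crashing
        return f"Error: {exc}"


print(execute_tool("calculator", {"operation": "add", "a": 1234, "b": 5678}))
```

Returning error text rather than raising keeps the loop alive and lets Claude see what went wrong. Whether that is the right trade-off depends on the application.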

`max_tokens`

What it means: Claude hit the max_tokens limit you set. The response is truncated.

When it occurs: When Claude's full response would exceed the max_tokens parameter in your request.

How to handle it: You can continue the conversation by sending a follow-up message asking Claude to finish its thought.

def handle_max_tokens(response, messages):
    if response.stop_reason == "max_tokens":
        # Append the partial response
        messages.append({"role": "assistant", "content": response.content})
        # Ask Claude to continue
        messages.append({
            "role": "user", 
            "content": "Please continue from where you left off."
        })
        return client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=messages
        )
    return response
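After a continuation, the user-facing text is spread across two responses. The sketch below shows one way to stitch the pieces together. `collect_text` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins shaped like SDK responses so the example runs without a network call.

```python
from types import SimpleNamespace


def collect_text(responses) -> str:
    """Concatenate the text blocks of several responses, in order."""
    parts = []
    for response in responses:
        for block in response.content:
            if block.type == "text":
                parts.append(block.text)
    return "".join(parts)


# Stand-in responses for illustration only
first = SimpleNamespace(
    stop_reason="max_tokens",
    content=[SimpleNamespace(type="text", text="The three primary colors are red, ")],
)
second = SimpleNamespace(
    stop_reason="end_turn",
    content=[SimpleNamespace(type="text", text="yellow, and blue.")],
)

full_text = collect_text([first, second])
print(full_text)
```

In practice the continuation may not resume exactly mid-sentence (Claude might restate a few words), so the joined text can need light cleanup.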

`stop_sequence`

What it means: Claude encountered a custom stop sequence you defined in your API request.

When it occurs: When you've set the stop_sequences parameter (e.g., ["\n\nHuman:"]) and Claude generates that sequence.

How to handle it: The response is complete up to the stop sequence. You can process it as-is or continue with a new user message.

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    stop_sequences=["END"],
    messages=[
        {"role": "user", "content": "List three colors and then write END."}
    ]
)
if response.stop_reason == "stop_sequence":
    print(f"Stopped at sequence: {response.stop_sequence}")
    print(response.content[0].text)  # Will not include "END"
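To make the "will not include END" behavior concrete, here is a purely local simulation of how a stop sequence cuts off output. It is a sketch of the semantics only, not how the API is implemented, and `apply_stop_sequences` is a hypothetical helper that is handy for unit-testing downstream parsing code.

```python
def apply_stop_sequences(text: str, stop_sequences: list):
    """Cut text before the earliest stop sequence; return (text, matched_sequence)."""
    earliest = None
    matched = None
    for seq in stop_sequences:
        idx = text.find(seq)
        if idx != -1 and (earliest is None or idx < earliest):
            earliest, matched = idx, seq
    if matched is None:
        # No stop sequence present: the text is returned unchanged
        return text, None
    return text[:earliest], matched


visible, matched = apply_stop_sequences("Red, green, blue. END and more", ["END"])
print(repr(visible), matched)
```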

Building a Robust Handler

In production, you'll want a single function that handles all stop reasons gracefully:

def process_claude_response(client, messages, max_iterations=10, tools=None):
    """
    Process Claude's response, handling all stop reasons.
    Automatically continues for tool_use and max_tokens.
    `tools` is the list of tool definitions, if the conversation uses any.
    `execute_tool` is your application's own dispatcher.
    """
    # Only send `tools` when the caller supplied definitions
    extra = {"tools": tools} if tools else {}

    for _ in range(max_iterations):
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=messages,
            **extra
        )

        if response.stop_reason == "end_turn":
            # Empty response: nudge Claude with a new user message and retry
            if not response.content:
                messages.append({
                    "role": "user",
                    "content": "Please continue."
                })
                continue
            return response

        elif response.stop_reason == "tool_use":
            messages.append({"role": "assistant", "content": response.content})
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": str(result)
                    })
            messages.append({"role": "user", "content": tool_results})

        elif response.stop_reason == "max_tokens":
            messages.append({"role": "assistant", "content": response.content})
            messages.append({
                "role": "user",
                "content": "Please continue."
            })

        else:
            # stop_sequence, or any value this handler does not recognize:
            # hand it back to the caller instead of resending the same request
            return response

    raise RuntimeError("Max iterations reached without completion")
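A handler like this can be exercised without network access by swapping in a scripted stand-in for the client. The sketch below is one possible approach: `FakeClient`, `ScriptedMessages`, and `fake_response` are hypothetical test helpers, not part of the Anthropic SDK, and they only mimic the attributes the handler reads (`stop_reason`, `content`, and `messages.create`).

```python
from types import SimpleNamespace


def fake_response(stop_reason, text=None):
    """Build a stand-in response; text=None yields an empty content list."""
    content = [SimpleNamespace(type="text", text=text)] if text is not None else []
    return SimpleNamespace(stop_reason=stop_reason, stop_sequence=None, content=content)


class ScriptedMessages:
    """Mimics client.messages: returns queued responses in order."""

    def __init__(self, responses):
        self._responses = list(responses)
        self.calls = []

    def create(self, **kwargs):
        # Snapshot the message list, since handlers mutate it between calls
        self.calls.append(list(kwargs.get("messages", [])))
        return self._responses.pop(0)


class FakeClient:
    def __init__(self, responses):
        self.messages = ScriptedMessages(responses)


# Script: a truncated reply followed by a natural ending
client = FakeClient([
    fake_response("max_tokens", "Part one, "),
    fake_response("end_turn", "part two."),
])
```

Passing this `client` to `process_claude_response` with a one-message conversation should result in two `create` calls and a final response whose stop reason is `end_turn`.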

Best Practices

Always check stop_reason – Never assume the response is final. Always inspect the stop_reason field to determine next steps.

Handle empty responses gracefully – Implement the continuation prompt pattern for empty end_turn responses, especially in tool-use workflows.

Set a maximum iteration limit – When handling tool_use or max_tokens in a loop, always set a limit to prevent infinite loops.

Log stop reasons – In production, log the stop_reason and usage fields for monitoring and debugging.

Test with different scenarios – Test your handler with a deliberately low max_tokens value (to trigger truncation), tool-using prompts, and natural conversation endings.
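As a sketch of the logging practice above, the helper below records stop_reason and token usage with the standard `logging` module. The logger name and the choice to log truncation at warning level are assumptions made for illustration, and the stand-in response exists only so the example runs offline.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger("claude.stop_reasons")  # hypothetical logger name


def log_stop_reason(response) -> dict:
    """Log stop_reason and token usage; return the logged record."""
    record = {
        "stop_reason": response.stop_reason,
        "input_tokens": response.usage.input_tokens,
        "output_tokens": response.usage.output_tokens,
    }
    # Truncated responses are logged louder so they stand out in monitoring
    level = logging.WARNING if response.stop_reason == "max_tokens" else logging.INFO
    logger.log(level, "claude response: %s", record)
    return record


# Stand-in response for illustration only
example = SimpleNamespace(
    stop_reason="max_tokens",
    usage=SimpleNamespace(input_tokens=100, output_tokens=1024),
)
logged = log_stop_reason(example)
```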

Key Takeaways

end_turn means Claude finished naturally; watch for empty responses in tool-use workflows and use a continuation prompt if needed.
tool_use means Claude wants to call a tool; you must execute it and return the result to continue.
max_tokens means Claude's response was truncated; send a follow-up message to let it finish.
stop_sequence means a custom stop sequence was triggered; the response is complete up to that point.
Build a unified handler that loops through tool calls and truncated responses, with a maximum iteration limit to prevent infinite loops.
