BeClaude
Guide · Beginner · Agents · 2026-05-21

Mastering Claude API Stop Reasons: A Practical Guide to Handling Response Termination

Learn how to interpret and handle Claude API stop_reason values like end_turn, tool_use, and max_tokens to build robust, production-ready applications.

Quick Answer

This guide explains Claude's stop_reason field—why the model stops generating (end_turn, tool_use, max_tokens, stop_sequence)—and provides actionable code examples for handling each case, including empty responses and tool loops.

Tags: Claude API, stop_reason, error handling, tool use, best practices

Introduction

When you send a request to the Claude API, the response includes a stop_reason field that tells you why the model stopped generating. This isn't an error—it's a signal. Understanding these signals is essential for building applications that respond intelligently to different scenarios, from natural conversation endings to tool call requests.

In this guide, you'll learn:

  • What each stop_reason value means
  • How to handle end_turn (including empty responses)
  • How to process tool_use and continue the conversation
  • How to manage max_tokens limits gracefully
  • Best practices for production systems

The stop_reason Field

The stop_reason field appears in every successful Messages API response. Unlike HTTP errors (which indicate a failed request), stop_reason tells you why Claude successfully finished its response.

Here's a typical response structure:

{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Here's the answer to your question..."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50
  }
}
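As a rough illustration, the fields above can be read straight from the raw JSON body without the SDK. The helper name `summarize_response` is hypothetical, and the sketch assumes the payload has the shape shown above.

```python
import json


def summarize_response(raw_json: str) -> dict:
    """Pull out the fields most handlers care about from a raw response body."""
    payload = json.loads(raw_json)
    # Join the text of every text block; other block types are skipped.
    text = "".join(
        block.get("text", "")
        for block in payload.get("content", [])
        if block.get("type") == "text"
    )
    usage = payload.get("usage", {})
    return {
        "stop_reason": payload.get("stop_reason"),
        "stop_sequence": payload.get("stop_sequence"),
        "text": text,
        "output_tokens": usage.get("output_tokens"),
    }
```

Keeping this extraction in one place means the rest of the application can branch on `stop_reason` without touching the raw payload.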

Stop Reason Values

end_turn

What it means: Claude finished its response naturally. It has nothing more to say and is waiting for the user to respond.

When it occurs: This is the most common stop reason. You'll see it after Claude answers a question, completes a task, or decides its turn is over.

How to handle it: In most cases, you can display the response to the user and wait for their next input.

from anthropic import Anthropic

client = Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

if response.stop_reason == "end_turn":
    # Process the complete response
    print(response.content[0].text)
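Reading `response.content[0].text` assumes the first content block is a text block. A slightly more defensive sketch joins every text block instead. The helper name `collect_text` is hypothetical, and the `SimpleNamespace` objects are stand-ins that mimic the `.type` and `.text` attribute access used in this guide.

```python
from types import SimpleNamespace


def collect_text(content) -> str:
    """Concatenate the text of all text blocks, ignoring other block types."""
    return "".join(
        block.text for block in content
        if getattr(block, "type", None) == "text"
    )


# Stand-in blocks; a real response would supply these via response.content.
blocks = [
    SimpleNamespace(type="text", text="Hello"),
    SimpleNamespace(type="tool_use", id="toolu_123", name="calculator", input={}),
    SimpleNamespace(type="text", text=" there!"),
]
print(collect_text(blocks))  # Hello there!
```

An empty content list yields an empty string rather than an `IndexError`, which also makes the empty-response case easier to detect.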

Empty Responses with end_turn

Sometimes Claude returns an empty response (only 2–3 output tokens and no content) with stop_reason: "end_turn". This typically happens when Claude judges that the assistant turn is already complete, particularly after tool results.

Common causes:

  • Adding text blocks immediately after tool results (Claude learns to expect the user to always insert text after tool results, so it ends its turn to follow the pattern)
  • Sending Claude's completed response back without adding anything (Claude already decided it's done, so it will remain done)

How to prevent empty responses:
# INCORRECT: Adding text immediately after tool_result
messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {"role": "assistant", "content": [
        {
            "type": "tool_use",
            "id": "toolu_123",
            "name": "calculator",
            "input": {"operation": "add", "a": 1234, "b": 5678}
        }
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"}  # Don't add text after tool_result
    ]}
]

# CORRECT: Send tool results directly without additional text
messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {"role": "assistant", "content": [
        {
            "type": "tool_use",
            "id": "toolu_123",
            "name": "calculator",
            "input": {"operation": "add", "a": 1234, "b": 5678}
        }
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"}
    ]}  # Just the tool_result, no additional text
]
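The rule illustrated above (no text block after a tool_result in the same user message) can be checked mechanically before a request is sent. The following is a minimal sketch; the function name `find_text_after_tool_result` is hypothetical, and it assumes messages are plain dictionaries shaped like the examples above.

```python
def find_text_after_tool_result(messages):
    """Return indexes of user messages that place a text block after a tool_result."""
    offenders = []
    for index, message in enumerate(messages):
        content = message.get("content")
        # Plain-string content cannot contain tool_result blocks.
        if message.get("role") != "user" or not isinstance(content, list):
            continue
        seen_tool_result = False
        for block in content:
            if block.get("type") == "tool_result":
                seen_tool_result = True
            elif block.get("type") == "text" and seen_tool_result:
                offenders.append(index)
                break
    return offenders
```

Running this on the message history during development can surface the pattern early, before it shows up as empty responses.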

If you still get empty responses after fixing the above, use a continuation prompt:

def handle_empty_response(client, messages):
    response = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=1024,
        messages=messages
    )
    
    if response.stop_reason == "end_turn" and not response.content:
        # Add a continuation prompt in a NEW user message
        messages.append({
            "role": "user",
            "content": "Please continue with your response."
        })
        return client.messages.create(
            model="claude-opus-4-7",
            max_tokens=1024,
            messages=messages
        )
    return response

tool_use

What it means: Claude wants to use a tool. The response will contain one or more tool_use content blocks.

When it occurs: When you've defined tools in your API request and Claude decides it needs to call one (or more) to complete the task.

How to handle it: You must execute the tool, return the result as a tool_result block, and continue the conversation.
def handle_tool_use(response, messages, tools):
    # `tools` is the same list of tool definitions sent with the original
    # request; follow-up requests that contain tool_use/tool_result blocks
    # must define the tools as well.
    # `client` is the Anthropic client created earlier, and `execute_tool`
    # is your own function that runs a tool by name.

    # Extract tool use blocks
    tool_use_blocks = [
        block for block in response.content 
        if block.type == "tool_use"
    ]
    
    # Execute each tool and collect results
    tool_results = []
    for tool_use in tool_use_blocks:
        result = execute_tool(tool_use.name, tool_use.input)
        tool_results.append({
            "type": "tool_result",
            "tool_use_id": tool_use.id,
            "content": str(result)
        })
    
    # Append assistant response and tool results to messages
    messages.append({"role": "assistant", "content": response.content})
    messages.append({"role": "user", "content": tool_results})
    
    # Continue the conversation
    return client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        tools=tools,
        messages=messages
    )
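The handler above calls `execute_tool`, which this guide leaves undefined. One possible shape is a name-to-function registry, sketched below. Everything here (`TOOL_REGISTRY`, `calculator`, `build_tool_result`) is a hypothetical illustration, not part of the SDK. The sketch also reports failures back to Claude through the `is_error` flag on the tool_result block instead of raising.

```python
def calculator(operation, a, b):
    """Toy tool implementation matching the calculator example in this guide."""
    if operation == "add":
        return a + b
    raise ValueError(f"Unsupported operation: {operation}")


# Hypothetical registry mapping tool names to local Python callables.
TOOL_REGISTRY = {"calculator": calculator}


def execute_tool(name, tool_input):
    """Look up a tool by name and call it with the model-provided input."""
    if name not in TOOL_REGISTRY:
        raise KeyError(f"Unknown tool: {name}")
    return TOOL_REGISTRY[name](**tool_input)


def build_tool_result(tool_use_id, name, tool_input):
    """Run a tool and wrap the outcome in a tool_result block."""
    try:
        content, is_error = str(execute_tool(name, tool_input)), False
    except Exception as exc:
        # Report the failure to the model instead of crashing the loop.
        content, is_error = f"{type(exc).__name__}: {exc}", True
    result = {"type": "tool_result", "tool_use_id": tool_use_id, "content": content}
    if is_error:
        result["is_error"] = True
    return result
```

Returning the error as a tool_result keeps the conversation going and gives Claude a chance to correct its input or explain the failure to the user.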

max_tokens

What it means: Claude hit the max_tokens limit you set. The response is truncated.

When it occurs: When Claude's full response would exceed the max_tokens parameter in your request.

How to handle it: You can continue the conversation by sending a follow-up message asking Claude to finish its thought.
def handle_max_tokens(response, messages):
    if response.stop_reason == "max_tokens":
        # Append the partial response
        messages.append({"role": "assistant", "content": response.content})
        # Ask Claude to continue
        messages.append({
            "role": "user", 
            "content": "Please continue from where you left off."
        })
        return client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=messages
        )
    return response
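The function above returns only the follow-up response, so the caller still has to stitch the partial text together. A minimal sketch of that stitching follows. The name `complete_with_continuations` is hypothetical, and `create` is assumed to be any callable that takes the message list and returns a response object, for example a small wrapper around `client.messages.create`.

```python
def complete_with_continuations(create, messages, max_rounds=5):
    """Call `create(messages)` until the reply is no longer truncated.

    Returns the concatenated text of all rounds and the final stop_reason.
    """
    parts = []
    stop_reason = None
    for _ in range(max_rounds):
        response = create(messages)
        stop_reason = response.stop_reason
        parts.append("".join(b.text for b in response.content if b.type == "text"))
        if stop_reason != "max_tokens":
            break
        # Keep the partial answer in the history, then ask for the rest.
        messages.append({"role": "assistant", "content": response.content})
        messages.append({
            "role": "user",
            "content": "Please continue from where you left off."
        })
    return "".join(parts), stop_reason
```

If the final stop_reason is still "max_tokens", the round limit was reached and the text remains incomplete. The joined text may also contain a visible seam where Claude resumes, so it is worth inspecting in tests.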

stop_sequence

What it means: Claude encountered a custom stop sequence you defined in your API request.

When it occurs: When you've set the stop_sequences parameter (e.g., ["\n\nHuman:"]) and Claude generates that sequence.

How to handle it: The response is complete up to the stop sequence. You can process it as-is or continue with a new user message.
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    stop_sequences=["END"],
    messages=[
        {"role": "user", "content": "List three colors and then write END."}
    ]
)

if response.stop_reason == "stop_sequence":
    print(f"Stopped at sequence: {response.stop_sequence}")
    print(response.content[0].text)  # Will not include "END"

Building a Robust Handler

In production, you'll want a single function that handles all stop reasons gracefully:

def process_claude_response(client, messages, tools, max_iterations=10):
    """
    Process Claude's response, handling all stop reasons.
    Automatically continues for tool_use and max_tokens.

    `tools` is the list of tool definitions to send with every request;
    requests that carry tool_use/tool_result blocks must define them.
    """
    iteration = 0
    
    while iteration < max_iterations:
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            tools=tools,
            messages=messages
        )
        
        if response.stop_reason == "end_turn":
            # Check for empty response
            if not response.content:
                messages.append({
                    "role": "user",
                    "content": "Please continue."
                })
                iteration += 1
                continue
            return response
            
        elif response.stop_reason == "tool_use":
            messages.append({"role": "assistant", "content": response.content})
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": str(result)
                    })
            messages.append({"role": "user", "content": tool_results})
            
        elif response.stop_reason == "max_tokens":
            messages.append({"role": "assistant", "content": response.content})
            messages.append({
                "role": "user",
                "content": "Please continue."
            })
            
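        elif response.stop_reason == "stop_sequence":
            return response

        else:
            # Any stop reason not covered above: return the response rather
            # than re-sending the same request on the next iteration.
            return response
            
        iteration += 1
    
    raise RuntimeError("Max iterations reached without completion")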
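One way to exercise a handler like the one above without network calls is a scripted stand-in for the client. The sketch below is an assumption-laden test double, not part of the SDK: `ScriptedClient` and `text_response` are hypothetical names, and the objects only mimic the attributes this guide reads (`stop_reason`, `stop_sequence`, `content`).

```python
from types import SimpleNamespace


class ScriptedClient:
    """Stand-in for the API client that replays canned responses in order."""

    def __init__(self, responses):
        self._responses = list(responses)
        self.requests = []  # keyword arguments of every create() call
        # Mirror the `client.messages.create(...)` call path used above.
        self.messages = SimpleNamespace(create=self._create)

    def _create(self, **kwargs):
        # Copy the message list so later mutation does not rewrite history.
        recorded = dict(kwargs)
        recorded["messages"] = list(kwargs.get("messages", []))
        self.requests.append(recorded)
        if not self._responses:
            raise AssertionError("ScriptedClient ran out of canned responses")
        return self._responses.pop(0)


def text_response(text, stop_reason="end_turn"):
    """Build a minimal response object with a single text block."""
    return SimpleNamespace(
        stop_reason=stop_reason,
        stop_sequence=None,
        content=[SimpleNamespace(type="text", text=text)],
    )
```

Passing a `ScriptedClient` in place of the real client lets a test script a truncated reply followed by a complete one, then inspect `requests` to confirm what the handler sent on each iteration.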

Best Practices

  • Always check stop_reason – Never assume the response is final. Always inspect the stop_reason field to determine next steps.
  • Handle empty responses gracefully – Implement the continuation prompt pattern for empty end_turn responses, especially in tool-use workflows.
  • Set a maximum iteration limit – When handling tool_use or max_tokens in a loop, always set a limit to prevent infinite loops.
  • Log stop reasons – In production, log the stop_reason and usage fields for monitoring and debugging.
  • Test with different scenarios – Test your handler with a low max_tokens value (to force a max_tokens stop), tool-using prompts, and natural conversation endings.
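The logging recommendation above could look roughly like the following. This is a sketch using only the standard library; the logger name, the `log_stop_reason` helper, and the choice to log truncated replies at WARNING level are all assumptions, and the `SimpleNamespace` object stands in for a real response.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger("claude.stop_reasons")


def log_stop_reason(response, request_label=""):
    """Log stop_reason and token usage; warn when the reply was truncated."""
    usage = getattr(response, "usage", None)
    level = logging.WARNING if response.stop_reason == "max_tokens" else logging.INFO
    logger.log(
        level,
        "stop_reason=%s stop_sequence=%s input_tokens=%s output_tokens=%s label=%s",
        response.stop_reason,
        getattr(response, "stop_sequence", None),
        getattr(usage, "input_tokens", None),
        getattr(usage, "output_tokens", None),
        request_label,
    )
    return level


# Stand-in response object for demonstration purposes.
example = SimpleNamespace(
    stop_reason="max_tokens",
    stop_sequence=None,
    usage=SimpleNamespace(input_tokens=100, output_tokens=1024),
)
log_stop_reason(example, request_label="demo")
```

Logging truncation at a higher level makes it easier to spot requests whose max_tokens setting is too low for the task.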

Key Takeaways

  • end_turn means Claude finished naturally; watch for empty responses in tool-use workflows and use a continuation prompt if needed.
  • tool_use means Claude wants to call a tool; you must execute it and return the result to continue.
  • max_tokens means Claude's response was truncated; send a follow-up message to let it finish.
  • stop_sequence means a custom stop sequence was triggered; the response is complete up to that point.
  • Build a unified handler that loops through tool calls and truncated responses, with a maximum iteration limit to prevent infinite loops.