BeClaude
Guide · Beginner · Agents · 2026-05-20

Mastering Claude API Stop Reasons: Build Robust Applications with end_turn, tool_use & max_tokens

Learn how to handle Claude API stop_reason values (end_turn, tool_use, max_tokens) to build reliable applications. Includes code examples, empty response fixes, and best practices.

Quick Answer

This guide explains Claude's stop_reason field (end_turn, tool_use, max_tokens, stop_sequence) and how to handle each case in your application. You'll learn to detect empty responses, continue tool loops, handle token limits, and build robust multi-turn conversations.

Tags: Claude API, stop_reason, error handling, tool use, Messages API

Introduction

Every time you call the Claude Messages API, the response includes a stop_reason field. This tiny piece of data tells you why Claude stopped generating—whether it finished naturally, wants to use a tool, hit a token limit, or encountered a stop sequence. Understanding these values is the difference between a brittle prototype and a production-ready application.

In this guide, you'll learn:

  • What each stop_reason value means
  • How to handle end_turn (including empty responses)
  • How to build tool-use loops with tool_use
  • How to manage max_tokens and stop_sequence gracefully
  • Best practices for robust multi-turn conversations

The stop_reason Field

The stop_reason field appears in every successful Messages API response. Unlike errors (which indicate failures), stop_reason tells you why Claude successfully completed its response generation.

Here's a typical response structure:

{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Here's the answer to your question..."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50
  }
}

Stop Reason Values

This guide covers four stop_reason values (the API documents additional ones, such as pause_turn and refusal, that are outside its scope):

| Value | Meaning | When It Occurs |
|---|---|---|
| end_turn | Claude finished naturally | Most common; Claude believes the conversation turn is complete |
| tool_use | Claude wants to call a tool | The response contains one or more tool_use content blocks |
| max_tokens | Claude hit the token limit | The response was truncated because it reached max_tokens |
| stop_sequence | Claude encountered a custom stop sequence | One of your provided stop_sequences was generated |
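
The values listed above map naturally onto a small dispatch step. The sketch below is only an illustration, not part of the SDK: `next_action` and the action labels are hypothetical names, and the function works on any object that exposes a `stop_reason` attribute.

```python
from types import SimpleNamespace

# Hypothetical labels describing what the application should do next.
ACTIONS = {
    "end_turn": "return_to_user",
    "tool_use": "run_tools",
    "max_tokens": "continue_or_raise_limit",
    "stop_sequence": "post_process",
}


def next_action(response):
    """Map a response's stop_reason to an application-level action label."""
    # Unknown values get an explicit label instead of being silently ignored.
    return ACTIONS.get(response.stop_reason, "unhandled")


# Stand-in for an SDK response object; only stop_reason is needed here.
fake_response = SimpleNamespace(stop_reason="tool_use")
print(next_action(fake_response))  # run_tools
```

Routing unknown values to an explicit "unhandled" label keeps the application from treating a stop reason it has never seen as a normal completion.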

Handling end_turn

end_turn is the simplest case: Claude has finished its response and expects you to either end the conversation or provide a new user message.

Basic Handling

from anthropic import Anthropic

client = Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)

if response.stop_reason == "end_turn":
    # Process the complete response
    print(response.content[0].text)

Empty Responses with end_turn

Sometimes Claude returns an empty response (2–3 tokens with no content) with stop_reason: "end_turn". This typically happens when Claude interprets that the assistant turn is complete, particularly after tool results.

Common causes:

  • Adding text blocks immediately after tool_result blocks
  • Sending Claude's completed response back without adding anything new

How to prevent empty responses:

# INCORRECT: Adding text immediately after tool_result
messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_123", "name": "calculator",
         "input": {"operation": "add", "a": 1234, "b": 5678}}
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"}  # Don't add text after tool_result!
    ]}
]

# CORRECT: Send tool results directly without additional text
messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_123", "name": "calculator",
         "input": {"operation": "add", "a": 1234, "b": 5678}}
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"}
    ]}  # Just the tool_result, no additional text
]
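
One way to catch the first cause before a request goes out is to scan the outgoing messages. This is a minimal sketch under the assumption that user messages are built from plain dicts, as in the examples above; `find_text_after_tool_result` is a hypothetical helper name.

```python
def find_text_after_tool_result(messages):
    """Return indexes of user messages where a text block follows a tool_result."""
    flagged = []
    for index, message in enumerate(messages):
        content = message.get("content")
        # Plain-string content cannot contain tool_result blocks.
        if message.get("role") != "user" or not isinstance(content, list):
            continue
        seen_tool_result = False
        for block in content:
            if block.get("type") == "tool_result":
                seen_tool_result = True
            elif block.get("type") == "text" and seen_tool_result:
                flagged.append(index)
                break
    return flagged


messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_123", "name": "calculator",
         "input": {"operation": "add", "a": 1234, "b": 5678}}
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"}
    ]},
]
print(find_text_after_tool_result(messages))  # [2]
```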

If you still get empty responses after fixing the above, use a continuation prompt:

def handle_empty_response(client, messages):
    response = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=1024,
        messages=messages
    )
    
    if response.stop_reason == "end_turn" and not response.content:
        # Add a continuation prompt in a NEW user message
        messages.append({
            "role": "user",
            "content": "Please continue with your response."
        })
        return client.messages.create(
            model="claude-opus-4-7",
            max_tokens=1024,
            messages=messages
        )
    return response
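
The check above treats a response as empty only when `content` is an empty list. A slightly broader test, sketched below, also counts a response whose only blocks are whitespace-only text. Treating whitespace as empty is an assumption of this example, and `is_empty_response` is a hypothetical helper name.

```python
from types import SimpleNamespace


def is_empty_response(response):
    """Return True when an end_turn response carries no usable content."""
    if response.stop_reason != "end_turn":
        return False
    for block in response.content:
        # Any non-text block (for example tool_use) counts as real content.
        if block.type != "text":
            return False
        if block.text.strip():
            return False
    return True


# Stand-ins for SDK response objects.
blank = SimpleNamespace(stop_reason="end_turn", content=[])
spaces = SimpleNamespace(
    stop_reason="end_turn",
    content=[SimpleNamespace(type="text", text="  \n")],
)
print(is_empty_response(blank), is_empty_response(spaces))  # True True
```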

Handling tool_use

When Claude decides it needs to call a tool, it returns stop_reason: "tool_use" along with one or more tool_use content blocks. Your application must:

  • Execute the tool(s)
  • Return the results as tool_result blocks
  • Continue the conversation

Tool Loop Pattern

def process_tool_calls(response, messages, tools):
    """Handle tool_use responses and continue the conversation.

    Uses the module-level `client`; `tools` is the tool definition list
    that was sent with the original request.
    """
    while response.stop_reason == "tool_use":
        # Collect a tool_result for every tool_use block in the response
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                # Execute the tool (your implementation)
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(result)
                })

        # Add assistant response and tool results to messages
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})

        # Get next response from Claude, passing the tool definitions again
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            tools=tools,
            messages=messages
        )

    return response
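
The loop above leaves `execute_tool` to your implementation. One possible shape is a registry that maps tool names to Python callables. Everything in this sketch is hypothetical: the `TOOL_REGISTRY` name, the toy `calculator` function, and the choice to return error strings rather than raise.

```python
def calculator(operation, a, b):
    """Toy calculator matching the input shape used earlier in this guide."""
    if operation == "add":
        return a + b
    if operation == "subtract":
        return a - b
    raise ValueError(f"unsupported operation: {operation}")


# Hypothetical registry: tool name -> Python callable.
TOOL_REGISTRY = {"calculator": calculator}


def execute_tool(name, tool_input):
    """Look up a tool by name and call it with the input Claude supplied."""
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        return f"Error: unknown tool '{name}'"
    try:
        return handler(**tool_input)
    except Exception as exc:
        # Report the failure as text so the tool loop can keep going.
        return f"Error: {exc}"


print(execute_tool("calculator", {"operation": "add", "a": 1234, "b": 5678}))  # 6912
```

Returning the error as text lets Claude see what went wrong and decide how to proceed, instead of the whole loop crashing on one bad call.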

Parallel Tool Calls

Claude can request multiple tools in a single response. Your code should handle all of them before returning results:

def execute_parallel_tools(response):
    """Execute every requested tool, one after another, and return all results."""
    tool_results = []
    for block in response.content:
        if block.type == "tool_use":
            result = execute_tool(block.name, block.input)
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": str(result)
            })
    return tool_results
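
The function above runs the requested tools one after another. If the tools are independent and I/O-bound, they could run concurrently instead. The following is a sketch using the standard library's thread pool: `execute_tools_concurrently` is a hypothetical name, the `execute_tool` shown here is a placeholder for your own implementation, and the tool names are invented.

```python
from concurrent.futures import ThreadPoolExecutor
from types import SimpleNamespace


def execute_tool(name, tool_input):
    """Placeholder tool runner; replace with your own implementation."""
    return f"{name} called with {sorted(tool_input.items())}"


def execute_tools_concurrently(response, max_workers=4):
    """Run every tool_use block in a thread pool and build tool_result blocks."""
    tool_blocks = [b for b in response.content if b.type == "tool_use"]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as tool_blocks.
        results = list(pool.map(lambda b: execute_tool(b.name, b.input), tool_blocks))
    return [
        {"type": "tool_result", "tool_use_id": block.id, "content": str(result)}
        for block, result in zip(tool_blocks, results)
    ]


# Stand-in for an SDK response containing two tool requests.
demo_response = SimpleNamespace(content=[
    SimpleNamespace(type="text", text="Let me check both."),
    SimpleNamespace(type="tool_use", id="toolu_a", name="weather", input={"city": "Oslo"}),
    SimpleNamespace(type="tool_use", id="toolu_b", name="clock", input={"tz": "UTC"}),
])
for item in execute_tools_concurrently(demo_response):
    print(item["tool_use_id"], item["content"])
```

Threads suit tools that wait on network or disk; CPU-heavy tools would likely need a different approach.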

Handling max_tokens

When Claude hits the max_tokens limit, the response is cut off mid-generation. This is common when generating large outputs or when max_tokens is set too low for the task.

Detection and Recovery

def handle_max_tokens(response, messages):
    """Handle truncated responses by continuing the conversation."""
    if response.stop_reason == "max_tokens":
        # Add the partial response to the conversation
        messages.append({"role": "assistant", "content": response.content})
        
        # Ask Claude to continue
        messages.append({
            "role": "user",
            "content": "Please continue from where you left off."
        })
        
        return client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=messages
        )
    return response
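
Note that handle_max_tokens returns only the continuation, so the caller still has to combine it with the truncated part. A minimal sketch of that step follows; `collect_text` is a hypothetical helper, and joining the pieces with no separator is an assumption, since the continuation may not resume at the exact character where the first response stopped.

```python
from types import SimpleNamespace


def collect_text(responses):
    """Concatenate the text blocks of several responses, in order."""
    parts = []
    for response in responses:
        for block in response.content:
            # Skip non-text blocks such as tool_use.
            if block.type == "text":
                parts.append(block.text)
    return "".join(parts)


# Stand-ins for a truncated response and its continuation.
first = SimpleNamespace(
    stop_reason="max_tokens",
    content=[SimpleNamespace(type="text", text="Once upon a ")],
)
second = SimpleNamespace(
    stop_reason="end_turn",
    content=[SimpleNamespace(type="text", text="time, there was a fox.")],
)
print(collect_text([first, second]))  # Once upon a time, there was a fox.
```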

Increasing Token Budget

For long outputs, consider increasing max_tokens:

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=4096,  # Increase for longer responses
    messages=messages
)

Handling stop_sequence

If you provide custom stop_sequences, Claude stops as soon as it generates one of them, and the matched sequence is reported in the response's stop_sequence field. This is useful for structured outputs:

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    stop_sequences=["\n\nHuman:", "\n\nAssistant:"],
    messages=[{"role": "user", "content": "Tell me a story."}]
)

if response.stop_reason == "stop_sequence":
    print(f"Stopped at sequence: {response.stop_sequence}")
    # Process the truncated response
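
When a stop sequence doubles as a closing delimiter, downstream parsing may need that delimiter back. The sketch below assumes the matched sequence is reported in `response.stop_sequence` and is not part of the returned text. `text_with_delimiter` is a hypothetical helper, and the `</answer>` tag is only an illustration.

```python
from types import SimpleNamespace


def text_with_delimiter(response):
    """Return the response text, re-appending the stop sequence if one matched."""
    text = "".join(b.text for b in response.content if b.type == "text")
    if response.stop_reason == "stop_sequence" and response.stop_sequence:
        # Assumption: the API omitted the matched sequence from the text.
        return text + response.stop_sequence
    return text


# Stand-in for a response that stopped on a closing tag.
stopped = SimpleNamespace(
    stop_reason="stop_sequence",
    stop_sequence="</answer>",
    content=[SimpleNamespace(type="text", text="<answer>42")],
)
print(text_with_delimiter(stopped))  # <answer>42</answer>
```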

Building a Complete Handler

Here's a robust handler that manages all stop reasons:

def handle_claude_response(client, messages, tools, max_iterations=10):
    """Complete handler for all stop reasons.

    `tools` is the tool definition list sent with every request.
    """
    # A bounded loop guarantees termination for every stop reason
    for _ in range(max_iterations):
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            tools=tools,
            messages=messages
        )

        if response.stop_reason == "end_turn":
            if not response.content:
                # Empty response: add a continuation prompt and retry
                messages.append({
                    "role": "user",
                    "content": "Please continue."
                })
                continue
            return response

        elif response.stop_reason == "tool_use":
            # Execute tools and continue
            messages.append({"role": "assistant", "content": response.content})
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": str(result)
                    })
            messages.append({"role": "user", "content": tool_results})

        elif response.stop_reason == "max_tokens":
            # Continue from where we left off
            messages.append({"role": "assistant", "content": response.content})
            messages.append({
                "role": "user",
                "content": "Please continue."
            })

        else:
            # stop_sequence, or any value this handler does not recognize:
            # hand the response back to the caller instead of looping on it
            return response

    raise RuntimeError("Max iterations reached without completion")

Best Practices

  • Always check stop_reason – Never assume Claude finished naturally. Always inspect the field and handle each case.
  • Limit tool loops – Set a maximum number of tool call iterations to prevent infinite loops.
  • Handle empty responses – Implement the continuation prompt pattern for empty end_turn responses.
  • Log stop_reason – For debugging, log the stop reason and response metadata.
  • Test edge cases – Test with max_tokens=1, empty tool results, and rapid tool sequences.
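
The logging and edge-case practices above can be exercised without network access by scripting the responses a client returns. The following is a sketch only: `ScriptedClient`, `ScriptedMessages`, and `create_and_log` are hypothetical names that mimic just the `client.messages.create(...)` call shape used in this guide.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger("stop_reason_demo")


class ScriptedMessages:
    """Returns pre-built responses in order and records each request."""

    def __init__(self, responses):
        self._responses = list(responses)
        self.requests = []

    def create(self, **kwargs):
        self.requests.append(kwargs)
        return self._responses.pop(0)


class ScriptedClient:
    """Test double exposing the same `messages.create` shape as the SDK client."""

    def __init__(self, responses):
        self.messages = ScriptedMessages(responses)


def create_and_log(client, **kwargs):
    """Send a request and log why generation stopped, plus token usage."""
    response = client.messages.create(**kwargs)
    logger.info(
        "stop_reason=%s output_tokens=%s",
        response.stop_reason,
        response.usage.output_tokens,
    )
    return response


# Edge case: a response truncated after a single token.
truncated = SimpleNamespace(
    stop_reason="max_tokens",
    content=[SimpleNamespace(type="text", text="Once")],
    usage=SimpleNamespace(input_tokens=8, output_tokens=1),
)
fake_client = ScriptedClient([truncated])
reply = create_and_log(fake_client, model="test-model", max_tokens=1, messages=[])
print(reply.stop_reason)  # max_tokens
```

Because the test double records every request, a test can also assert on what the handler sent, not only on what it returned.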

Key Takeaways

  • This guide covers four stop reasons: end_turn (natural finish), tool_use (wants to call a tool), max_tokens (truncated), and stop_sequence (custom stop).
  • Empty responses with end_turn happen when Claude thinks the turn is complete; fix by not adding text after tool_result blocks and using continuation prompts.
  • Tool loops require careful handling: execute all tools, return results, and continue the conversation until end_turn.
  • max_tokens truncation can be handled by appending the partial response and asking Claude to continue.
  • Build a unified handler that manages all stop reasons to create robust, production-ready Claude applications.