Guide · 2026-05-06

Mastering Claude API Stop Reasons: A Practical Guide to Handling end_turn, tool_use, and max_tokens

Learn how to interpret and handle Claude API stop_reason values like end_turn, tool_use, and max_tokens. Includes code examples, troubleshooting for empty responses, and best practices.

Quick Answer

This guide explains the four Claude API stop_reason values—end_turn, tool_use, max_tokens, and stop_sequence—and shows how to handle each in your application. You'll learn to detect empty responses, chain tool calls, and avoid common pitfalls.

Tags: Claude API, stop_reason, error handling, tool use, best practices

Introduction

When you call the Claude API, every successful response includes a stop_reason field. This field tells you why the model stopped generating—whether it finished naturally, wants to use a tool, hit a token limit, or encountered a stop sequence. Understanding these values is essential for building robust, production-ready applications.

Unlike errors (which indicate something went wrong), stop_reason is part of normal operation. Your code should handle each reason appropriately to create smooth user experiences and avoid infinite loops or incomplete responses.
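Before diving into each value, it helps to see stop_reason as a dispatch key. The sketch below (next_action is a hypothetical helper name, not part of the SDK) shows the branching that every robust client ends up doing in some form:

```python
def next_action(stop_reason: str) -> str:
    """Map a Claude API stop_reason to the action an application should take."""
    actions = {
        "end_turn": "deliver",       # show the finished response to the user
        "tool_use": "run_tool",      # execute the tool, send back a tool_result
        "max_tokens": "continue",    # truncated; consider continuing the turn
        "stop_sequence": "deliver",  # stopped at a custom delimiter
    }
    return actions.get(stop_reason, "error")  # surface anything unexpected
```

Treating unknown values as errors (rather than silently ignoring them) keeps your handler safe if the API adds new stop reasons later.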

The Four Stop Reasons

Claude can return one of four stop_reason values:

Stop Reason      Meaning
end_turn         Claude finished its response naturally
tool_use         Claude wants to call a tool
max_tokens       Claude hit the max_tokens limit
stop_sequence    Claude encountered a custom stop sequence

Let's explore each in detail.

end_turn: Natural Completion

end_turn is the most common stop reason. It means Claude decided its response is complete and it's handing control back to the user.

Handling end_turn in Python

from anthropic import Anthropic

client = Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)

if response.stop_reason == "end_turn":
    # Process the complete response
    print(response.content[0].text)

The Empty Response Problem

Sometimes Claude returns an empty response (2-3 tokens with no content) with stop_reason: "end_turn". This typically happens when:

  • You add text blocks immediately after tool_result blocks
  • You send Claude's completed response back without adding anything new

Why it happens: Claude learns patterns from your message history. If you always insert text after tool results, Claude learns to end its turn to follow that pattern. If you send back a response Claude already considered complete, it stays done.

How to Prevent Empty Responses

First, the anti-pattern to avoid:
# INCORRECT: Adding text immediately after tool_result
messages = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "assistant", "content": [{
        "type": "tool_use",
        "id": "toolu_123",
        "name": "calculator",
        "input": {"operation": "add", "a": 1234, "b": 5678}
    }]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"}  # Don't do this!
    ]}
]

CORRECT: Send tool results directly without additional text

messages = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "assistant", "content": [{
        "type": "tool_use",
        "id": "toolu_123",
        "name": "calculator",
        "input": {"operation": "add", "a": 1234, "b": 5678}
    }]},
    # Just the tool_result, no extra text
    {"role": "user", "content": [{
        "type": "tool_result",
        "tool_use_id": "toolu_123",
        "content": "6912"
    }]}
]

If you still get empty responses:
def handle_empty_response(client, messages):
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=messages
    )
    
    if response.stop_reason == "end_turn" and not response.content:
        # INCORRECT: Don't just retry with the same messages
        # Claude already decided it's done
        
        # CORRECT: Add a continuation prompt in a NEW user message
        messages.append({"role": "user", "content": "Please continue"})
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=messages
        )
    
    return response

tool_use: Claude Wants to Call a Tool

When Claude decides it needs to use a tool (like a calculator, database query, or web search), it returns stop_reason: "tool_use". Your application must:

  • Detect the tool_use stop reason
  • Execute the tool call
  • Return the result in a new user message with role: "user" and type: "tool_result"

Handling tool_use in Python

from anthropic import Anthropic

client = Anthropic()

def process_tool_call(tool_name, tool_input):
    """Execute the tool and return results."""
    if tool_name == "calculator":
        # eval is fine for a demo, but never eval untrusted input in production
        return str(eval(tool_input["operation"]))
    elif tool_name == "get_weather":
        # Call your weather API here
        return '{"temp": 72, "conditions": "sunny"}'
    return "Tool not implemented"

messages = [{"role": "user", "content": "What's 1234 + 5678?"}]

while True:
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        tools=[{
            "name": "calculator",
            "description": "Perform arithmetic",
            "input_schema": {
                "type": "object",
                "properties": {
                    "operation": {"type": "string"}
                },
                "required": ["operation"]
            }
        }],
        messages=messages
    )

    if response.stop_reason == "end_turn":
        print(response.content[0].text)
        break
    elif response.stop_reason == "tool_use":
        # Echo Claude's turn (including its tool_use block) back first;
        # the API requires it before the matching tool_result
        messages.append({"role": "assistant", "content": response.content})
        for block in response.content:
            if block.type == "tool_use":
                result = process_tool_call(block.name, block.input)
                messages.append({
                    "role": "user",
                    "content": [{
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result
                    }]
                })

max_tokens: Hit the Token Limit

When Claude reaches the max_tokens limit you set, it returns stop_reason: "max_tokens". This means the response is truncated—Claude had more to say but ran out of space.

Handling max_tokens

messages = [{"role": "user", "content": "Write a long essay about AI"}]
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=100,  # Low limit for demonstration
    messages=messages
)

if response.stop_reason == "max_tokens":
    print("Response was truncated. Consider increasing max_tokens.")
    # You can continue the conversation
    messages.append({"role": "assistant", "content": response.content})
    messages.append({"role": "user", "content": "Please continue from where you left off"})
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1000,
        messages=messages
    )

Best practice: Set max_tokens generously (e.g., 4096 or higher) to avoid truncation for most use cases. For streaming, you can detect max_tokens in the final message event.
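Here is a sketch of that streaming detection using the Python SDK's messages.stream helper (is_truncated and stream_with_truncation_check are illustrative names, not SDK functions):

```python
def is_truncated(stop_reason):
    """True when a response stopped because it hit the max_tokens limit."""
    return stop_reason == "max_tokens"

def stream_with_truncation_check(client, **request_kwargs):
    """Stream a response, printing text as it arrives, then inspect
    stop_reason on the final message once the stream completes."""
    with client.messages.stream(**request_kwargs) as stream:
        for text in stream.text_stream:
            print(text, end="", flush=True)
        final = stream.get_final_message()
    if is_truncated(final.stop_reason):
        print("\n[response truncated at max_tokens]")
    return final
```

Because stop_reason is only known once generation finishes, the check has to happen after the stream is consumed, not during it.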

stop_sequence: Custom Stop Sequence

If you define custom stop sequences in your API request, Claude will stop when it encounters one and return stop_reason: "stop_sequence".

Using stop_sequences

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    stop_sequences=["\n\nHuman:", "\n\nAssistant:"],
    messages=[{"role": "user", "content": "Tell me a story"}]
)

if response.stop_reason == "stop_sequence":
    print(f"Stopped at sequence: {response.stop_sequence}")
    # The response content ends right before the stop sequence

This is useful for:

  • Preventing role injection in multi-turn conversations
  • Extracting structured data (stop at a delimiter)
  • Building chat interfaces with custom turn-taking
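As a sketch of the structured-data case (extract_until is a hypothetical helper; the model id follows the examples above), stop at a delimiter and take everything before it:

```python
def extract_until(client, prompt, delimiter):
    """Request output terminated by a delimiter, stopping generation there."""
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        stop_sequences=[delimiter],
        messages=[{"role": "user", "content": prompt}],
    )
    text = response.content[0].text if response.content else ""
    if response.stop_reason == "stop_sequence":
        # The content already ends right before the delimiter
        return text.strip()
    return text  # delimiter never appeared; decide how to handle this case
```

For example, prompt Claude to "answer in JSON, then write END" and call extract_until(client, prompt, "END") to get just the JSON.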

Building a Robust Handler

Here's a complete example that handles all stop reasons:

from anthropic import Anthropic

client = Anthropic()

def handle_response(response, messages, max_iterations=10):
    """Handle all stop reasons with proper error handling."""
    iteration = 0
    while iteration < max_iterations:
        iteration += 1
        if response.stop_reason == "end_turn":
            if not response.content:
                # Empty response - prompt to continue
                messages.append({"role": "user", "content": "Please continue"})
            else:
                return response.content[0].text
        elif response.stop_reason == "tool_use":
            # Echo Claude's turn back before sending the tool results
            messages.append({"role": "assistant", "content": response.content})
            for block in response.content:
                if block.type == "tool_use":
                    result = execute_tool(block.name, block.input)
                    messages.append({
                        "role": "user",
                        "content": [{
                            "type": "tool_result",
                            "tool_use_id": block.id,
                            "content": result
                        }]
                    })
        elif response.stop_reason == "max_tokens":
            messages.append({"role": "assistant", "content": response.content})
            messages.append({"role": "user", "content": "Please continue"})
        elif response.stop_reason == "stop_sequence":
            return response.content[0].text

        # Make the next API call
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            messages=messages
        )
    raise Exception("Max iterations reached without completion")

import json

def execute_tool(name, input_data):
    """Execute a tool and return a string result."""
    # Your tool execution logic here
    return json.dumps({"result": "success"})

Common Pitfalls and Solutions

Pitfall                     Solution
Empty end_turn responses    Don't add text after tool_result blocks; use continuation prompts
Infinite tool call loops    Set a maximum iteration limit (e.g., 10-20)
Truncated responses         Increase max_tokens or implement continuation logic
Ignoring stop_sequence      Always check response.stop_sequence to know what triggered the stop

Conclusion

Mastering stop_reason handling is a fundamental skill for Claude API developers. By understanding when and why Claude stops, you can build applications that gracefully handle tool calls, avoid empty responses, and recover from truncation. The key is to treat stop_reason not as an error, but as a signal that guides your application's next action.

Key Takeaways

  • Four stop reasons exist: end_turn (natural completion), tool_use (wants to call a tool), max_tokens (hit token limit), and stop_sequence (custom stop triggered).
  • Prevent empty responses by never adding text after tool_result blocks and using continuation prompts ("Please continue") instead of retrying with the same messages.
  • Always handle tool_use by executing the tool and returning results in a new user message with type: "tool_result".
  • Detect max_tokens to implement continuation logic—Claude's response is truncated and needs a prompt to continue.
  • Set iteration limits (10-20) when handling tool_use to prevent infinite loops in tool-calling scenarios.