Mastering Claude API Stop Reasons: A Practical Guide to Handling end_turn, tool_use, and max_tokens
Learn how to interpret and handle Claude API stop_reason values like end_turn, tool_use, and max_tokens. Includes code examples, troubleshooting for empty responses, and best practices.
This guide explains the four Claude API stop_reason values—end_turn, tool_use, max_tokens, and stop_sequence—and shows how to handle each in your application. You'll learn to detect empty responses, chain tool calls, and avoid common pitfalls.
Introduction
When you call the Claude API, every successful response includes a stop_reason field. This field tells you why the model stopped generating—whether it finished naturally, wants to use a tool, hit a token limit, or encountered a stop sequence. Understanding these values is essential for building robust, production-ready applications.
Unlike errors (which indicate something went wrong), stop_reason is part of normal operation. Your code should handle each reason appropriately to create smooth user experiences and avoid infinite loops or incomplete responses.
The Four Stop Reasons
Claude can return one of four stop_reason values:
| Stop Reason | Meaning |
|---|---|
| `end_turn` | Claude finished its response naturally |
| `tool_use` | Claude wants to call a tool |
| `max_tokens` | Claude hit the `max_tokens` limit |
| `stop_sequence` | Claude encountered a custom stop sequence |
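As a quick orientation before covering each reason in depth, the branching logic can be sketched as a small dispatcher. The action labels here are illustrative names for this guide, not part of the API; only the four `stop_reason` strings come from the Messages API:

```python
def next_action(stop_reason: str) -> str:
    """Map a stop_reason value to the application's next step.

    The action labels are illustrative; the four stop_reason
    strings are the ones returned by the Messages API.
    """
    actions = {
        "end_turn": "show_reply",        # response is complete
        "tool_use": "run_tools",         # execute tools, send results back
        "max_tokens": "continue_reply",  # truncated; ask Claude to continue
        "stop_sequence": "post_process", # custom delimiter reached
    }
    if stop_reason not in actions:
        raise ValueError(f"unexpected stop_reason: {stop_reason}")
    return actions[stop_reason]
```

The rest of this guide fills in what each of these branches should actually do.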
end_turn: Natural Completion
end_turn is the most common stop reason. It means Claude decided its response is complete and it's handing control back to the user.
Handling end_turn in Python
```python
from anthropic import Anthropic

client = Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)

if response.stop_reason == "end_turn":
    # Process the complete response
    print(response.content[0].text)
```
The Empty Response Problem
Sometimes Claude returns an empty response (2-3 tokens with no content) with stop_reason: "end_turn". This typically happens when:
- You add text blocks immediately after `tool_result` blocks
- You send Claude's completed response back without adding anything new
How to Prevent Empty Responses
The correct pattern is to send the tool result by itself. First, the pattern to avoid:

```python
# INCORRECT: Adding text immediately after tool_result
messages = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "assistant", "content": [{
        "type": "tool_use",
        "id": "toolu_123",
        "name": "calculator",
        "input": {"operation": "add", "a": 1234, "b": 5678}
    }]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"}  # Don't do this!
    ]}
]
```
```python
# CORRECT: Send tool results directly without additional text
messages = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "assistant", "content": [{
        "type": "tool_use",
        "id": "toolu_123",
        "name": "calculator",
        "input": {"operation": "add", "a": 1234, "b": 5678}
    }]},
    {"role": "user", "content": [{
        "type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"
    }]}  # Just the tool_result, no extra text
]
```
If you still get empty responses:
```python
def handle_empty_response(client, messages):
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=messages
    )
    if response.stop_reason == "end_turn" and not response.content:
        # INCORRECT: Don't just retry with the same messages.
        # Claude already decided it's done.
        # CORRECT: Add a continuation prompt in a NEW user message.
        messages.append({"role": "user", "content": "Please continue"})
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=messages
        )
    return response
```
tool_use: Claude Wants to Call a Tool
When Claude decides it needs to use a tool (like a calculator, database query, or web search), it returns stop_reason: "tool_use". Your application must:
- Detect the `tool_use` stop reason
- Execute the tool call
- Return the result in a new message with `role: "user"` and a `type: "tool_result"` content block
Handling tool_use in Python
```python
from anthropic import Anthropic

client = Anthropic()

def process_tool_call(tool_name, tool_input):
    """Execute the tool and return results."""
    if tool_name == "calculator":
        # Demo only: never eval untrusted input in production
        return str(eval(tool_input["operation"]))
    elif tool_name == "get_weather":
        # Call your weather API
        return '{"temp": 72, "conditions": "sunny"}'
    return "Tool not implemented"

messages = [{"role": "user", "content": "What's 1234 + 5678?"}]

while True:
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        tools=[{
            "name": "calculator",
            "description": "Perform arithmetic",
            "input_schema": {
                "type": "object",
                "properties": {
                    "operation": {"type": "string"}
                },
                "required": ["operation"]
            }
        }],
        messages=messages
    )
    if response.stop_reason == "end_turn":
        print(response.content[0].text)
        break
    elif response.stop_reason == "tool_use":
        # Echo the assistant's tool_use turn back into history first;
        # the API requires it before the matching tool_result
        messages.append({"role": "assistant", "content": response.content})
        for block in response.content:
            if block.type == "tool_use":
                result = process_tool_call(block.name, block.input)
                messages.append({
                    "role": "user",
                    "content": [{
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result
                    }]
                })
```
max_tokens: Hit the Token Limit
When Claude reaches the max_tokens limit you set, it returns stop_reason: "max_tokens". This means the response is truncated—Claude had more to say but ran out of space.
Handling max_tokens
```python
messages = [{"role": "user", "content": "Write a long essay about AI"}]

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=100,  # Low limit for demonstration
    messages=messages
)

if response.stop_reason == "max_tokens":
    print("Response was truncated. Consider increasing max_tokens.")
    # You can continue the conversation
    messages.append({"role": "assistant", "content": response.content})
    messages.append({"role": "user", "content": "Please continue from where you left off"})
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1000,
        messages=messages
    )
```
Best practice: Set max_tokens generously (e.g., 4096 or higher) to avoid truncation for most use cases. For streaming, you can detect max_tokens in the final message event.
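The continuation pattern above can be factored into a reusable loop. The sketch below takes any callable that returns a Messages-style response object (so it can be exercised without an API key); the helper name `collect_full_text` and the `create_fn` shape are illustrative, not part of the SDK:

```python
def collect_full_text(create_fn, messages, max_rounds=5):
    """Accumulate text across max_tokens truncations.

    create_fn(messages) must return an object with .stop_reason and
    .content (a list of blocks with a .text attribute), mirroring the
    Messages API response shape.
    """
    parts = []
    for _ in range(max_rounds):
        response = create_fn(messages)
        parts.append(response.content[0].text)
        if response.stop_reason != "max_tokens":
            return "".join(parts)
        # Truncated: echo the partial answer, then ask Claude to continue
        messages = messages + [
            {"role": "assistant", "content": response.content},
            {"role": "user", "content": "Please continue from where you left off"},
        ]
    raise RuntimeError("Still truncated after max_rounds continuations")
```

In production, `create_fn` would simply wrap `client.messages.create` with your chosen model and `max_tokens` settings.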
stop_sequence: Custom Stop Sequence
If you define custom stop sequences in your API request, Claude will stop when it encounters one and return stop_reason: "stop_sequence".
Using stop_sequences
```python
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    stop_sequences=["\n\nHuman:", "\n\nAssistant:"],
    messages=[{"role": "user", "content": "Tell me a story"}]
)

if response.stop_reason == "stop_sequence":
    print(f"Stopped at sequence: {response.stop_sequence}")
    # The response content ends right before the stop sequence
```
This is useful for:
- Preventing role injection in multi-turn conversations
- Extracting structured data (stop at a delimiter)
- Building chat interfaces with custom turn-taking
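For the structured-data case, one approach is to instruct Claude to end its answer with a known delimiter and pass that delimiter as a stop sequence, so generation halts right before it. A minimal sketch, where the `###END###` delimiter and the helper name are illustrative choices:

```python
def build_extraction_request(prompt, delimiter="###END###"):
    """Build Messages API kwargs that halt generation at a delimiter."""
    return {
        "model": "claude-sonnet-4-20250514",
        "max_tokens": 1024,
        "stop_sequences": [delimiter],
        "messages": [{
            "role": "user",
            "content": f"{prompt}\nEnd your answer with {delimiter}",
        }],
    }
```

Pass the result straight to `client.messages.create(**kwargs)`; when `stop_reason` is `"stop_sequence"`, the content holds only the text before the delimiter.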
Building a Robust Handler
Here's a complete example that handles all stop reasons:
```python
import json

from anthropic import Anthropic

client = Anthropic()

def handle_response(response, messages, max_iterations=10):
    """Handle all stop reasons with proper error handling."""
    iteration = 0
    while iteration < max_iterations:
        iteration += 1
        if response.stop_reason == "end_turn":
            if not response.content:
                # Empty response - prompt to continue
                messages.append({"role": "user", "content": "Please continue"})
            else:
                return response.content[0].text
        elif response.stop_reason == "tool_use":
            # Echo the assistant turn, then supply the tool results
            messages.append({"role": "assistant", "content": response.content})
            for block in response.content:
                if block.type == "tool_use":
                    result = execute_tool(block.name, block.input)
                    messages.append({
                        "role": "user",
                        "content": [{
                            "type": "tool_result",
                            "tool_use_id": block.id,
                            "content": result
                        }]
                    })
        elif response.stop_reason == "max_tokens":
            messages.append({"role": "assistant", "content": response.content})
            messages.append({"role": "user", "content": "Please continue"})
        elif response.stop_reason == "stop_sequence":
            return response.content[0].text
        # Make next API call
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            messages=messages
        )
    raise Exception("Max iterations reached without completion")

def execute_tool(name, input_data):
    """Execute a tool and return string result."""
    # Your tool execution logic here
    return json.dumps({"result": "success"})
```
Common Pitfalls and Solutions
| Pitfall | Solution |
|---|---|
| Empty `end_turn` responses | Don't add text after `tool_result` blocks; use continuation prompts |
| Infinite tool call loops | Set a maximum iteration limit (e.g., 10-20) |
| Truncated responses | Increase `max_tokens` or implement continuation logic |
| Ignoring `stop_sequence` | Always check `response.stop_sequence` to know what triggered the stop |
Conclusion
Mastering stop_reason handling is a fundamental skill for Claude API developers. By understanding when and why Claude stops, you can build applications that gracefully handle tool calls, avoid empty responses, and recover from truncation. The key is to treat stop_reason not as an error, but as a signal that guides your application's next action.
Key Takeaways
- Four stop reasons exist: `end_turn` (natural completion), `tool_use` (wants to call a tool), `max_tokens` (hit token limit), and `stop_sequence` (custom stop triggered).
- Prevent empty responses by never adding text after `tool_result` blocks and by using continuation prompts ("Please continue") instead of retrying with the same messages.
- Always handle `tool_use` by executing the tool and returning results in a new `user` message with `type: "tool_result"`.
- Detect `max_tokens` to implement continuation logic: Claude's response is truncated and needs a prompt to continue.
- Set iteration limits (10-20) when handling `tool_use` to prevent infinite loops in tool-calling scenarios.