Mastering Claude API Stop Reasons: Build Robust Applications with end_turn, max_tokens & tool_use
Learn how to handle Claude API stop_reason values like end_turn, max_tokens, and tool_use. Includes code examples, empty response fixes, and best practices for production apps.
This guide explains Claude API stop_reason values (end_turn, max_tokens, tool_use, stop_sequence) and how to handle each in your code. You'll learn to detect empty responses, recover from max_tokens truncation, and properly chain tool calls.
When you call the Claude Messages API, every successful response includes a stop_reason field. This tiny piece of data tells you why Claude stopped generating—and understanding it is the difference between a brittle prototype and a production-ready application.
In this guide, you'll learn:
- What each stop_reason value means
- How to handle them in Python and TypeScript
- How to prevent and recover from empty responses
- Best practices for tool-using agents
What Is stop_reason?
The stop_reason field is part of every successful Messages API response. Unlike error codes (which indicate failures), stop_reason tells you why Claude successfully completed its response generation.
Here's a typical response structure:
{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Here's the answer to your question..."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50
  }
}
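As a minimal sketch of reading this field without the SDK, the snippet below parses a raw JSON body shaped like the example above using Python's standard json module. The payload string and variable names are illustrative assumptions, not output captured from a real request.

```python
import json

# Illustrative payload shaped like the response above (not a real API capture).
raw_body = """
{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [{"type": "text", "text": "Here's the answer to your question..."}],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {"input_tokens": 100, "output_tokens": 50}
}
"""

message = json.loads(raw_body)

# stop_reason sits at the top level of the message, next to content and usage.
stop_reason = message["stop_reason"]
# JSON null is parsed as None.
stop_sequence = message["stop_sequence"]
# Join the text of every text block in the content list.
text = "".join(b["text"] for b in message["content"] if b["type"] == "text")

print(stop_reason, stop_sequence, message["usage"]["output_tokens"])
```

With the official SDK the same values are available as attributes (response.stop_reason, response.stop_sequence), as the examples in the rest of this guide show.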
The Four Stop Reason Values
1. end_turn – Natural Completion
This is the most common stop reason. Claude finished its response naturally—it said everything it wanted to say and handed control back to you.
How to handle it:
from anthropic import Anthropic

client = Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)
if response.stop_reason == "end_turn":
    # Process the complete response
    print(response.content[0].text)
⚠️ Empty responses with end_turn
Sometimes Claude returns an empty response (2–3 tokens with no content) with stop_reason: "end_turn". This typically happens when Claude judges that the assistant turn is already complete, especially after tool results. Common causes:
- Adding text blocks immediately after tool_result blocks
- Sending Claude's completed response back without adding anything new
# INCORRECT: Adding text after tool_result
messages = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "assistant", "content": [{
        "type": "tool_use",
        "id": "toolu_123",
        "name": "calculator",
        "input": {"operation": "add", "a": 1234, "b": 5678}
    }]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"}  # ❌ Don't do this
    ]}
]

# CORRECT: Send tool results directly
messages = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "assistant", "content": [{
        "type": "tool_use",
        "id": "toolu_123",
        "name": "calculator",
        "input": {"operation": "add", "a": 1234, "b": 5678}
    }]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"}
        # ✅ No extra text
    ]}
]
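One way to catch this pattern before a request goes out is a small check over the message list. The sketch below is an assumption-laden illustration: find_text_after_tool_result is a hypothetical helper, not part of the SDK, and it expects messages written as plain dictionaries like the ones above.

```python
def find_text_after_tool_result(messages):
    """Return indexes of user messages where a text block follows a tool_result block."""
    flagged = []
    for index, message in enumerate(messages):
        content = message.get("content")
        # Plain-string content has no blocks to inspect.
        if message.get("role") != "user" or not isinstance(content, list):
            continue
        seen_tool_result = False
        for block in content:
            if block.get("type") == "tool_result":
                seen_tool_result = True
            elif block.get("type") == "text" and seen_tool_result:
                flagged.append(index)
                break
    return flagged


incorrect = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "assistant", "content": [{
        "type": "tool_use", "id": "toolu_123", "name": "calculator",
        "input": {"operation": "add", "a": 1234, "b": 5678},
    }]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"},
    ]},
]
# Same conversation, but the last user message carries only the tool_result block.
correct = incorrect[:2] + [
    {"role": "user", "content": [incorrect[2]["content"][0]]}
]

print(find_text_after_tool_result(incorrect))  # [2]
print(find_text_after_tool_result(correct))    # []
```

Running a check like this in tests or before each request makes the mistake visible early, instead of surfacing later as an unexplained empty response.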
Recovering from empty responses:
def handle_empty_response(client, messages):
    response = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=1024,
        messages=messages
    )
    if response.stop_reason == "end_turn" and not response.content:
        # ❌ Don't just retry with the same messages
        # ✅ Add a continuation prompt in a NEW user message
        messages.append({"role": "user", "content": "Please continue"})
        response = client.messages.create(
            model="claude-opus-4-7",
            max_tokens=1024,
            messages=messages
        )
    return response
2. max_tokens – Token Limit Reached
Claude stopped because it hit the max_tokens limit you set. The response is truncated—Claude had more to say but ran out of space.
# Keep the conversation in a list so the partial response can be appended to it
messages = [{"role": "user", "content": "Write a long story"}]
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=200,  # Low limit for demonstration
    messages=messages
)
if response.stop_reason == "max_tokens":
    # The response is incomplete. Append it and ask Claude to continue.
    messages.append({"role": "assistant", "content": response.content})
    messages.append({"role": "user", "content": "Please continue from where you left off."})
    # Make a new request to get the rest
    continuation = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=2000,
        messages=messages
    )
TypeScript version:
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

async function getCompleteResponse() {
  // Keep the running conversation so each continuation sees everything so far
  const messages: Anthropic.MessageParam[] = [
    { role: 'user', content: 'Write a long story' }
  ];
  let response = await client.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 200,
    messages
  });
  const allContent = [...response.content];
  while (response.stop_reason === 'max_tokens') {
    // Append the partial output and a continuation prompt to the history
    messages.push(
      { role: 'assistant', content: response.content },
      { role: 'user', content: 'Please continue from where you left off.' }
    );
    response = await client.messages.create({
      model: 'claude-sonnet-4-20250514',
      max_tokens: 2000,
      messages
    });
    allContent.push(...response.content);
  }
  return allContent;
}
3. tool_use – Tool Call Requested
Claude stopped because it wants to call a tool. The response content will contain one or more tool_use blocks.
tools = [{
    "name": "get_weather",
    "description": "Get current weather for a city",
    "input_schema": {
        "type": "object",
        "properties": {
            "location": {"type": "string"}
        },
        "required": ["location"]
    }
}]
messages = [{"role": "user", "content": "What's the weather in Tokyo?"}]

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=tools,
    messages=messages
)
if response.stop_reason == "tool_use":
    # Record Claude's turn once, then answer every tool_use block in one user message
    messages.append({"role": "assistant", "content": response.content})
    tool_results = []
    for block in response.content:
        if block.type == "tool_use":
            # Execute the tool (execute_tool is your own dispatch function)
            result = execute_tool(block.name, block.input)
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": str(result)
            })
    messages.append({"role": "user", "content": tool_results})
    # Let Claude continue with the result
    final_response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        tools=tools,
        messages=messages
    )
4. stop_sequence – Custom Stop Sequence Hit
Claude stopped because it encountered one of your custom stop_sequences. This is useful for structured outputs where you want Claude to stop at a specific delimiter.
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    stop_sequences=["\n\nEND"],
    messages=[{"role": "user", "content": "List 3 colors and then write END"}]
)
if response.stop_reason == "stop_sequence":
    print(f"Stopped at sequence: {response.stop_sequence}")
    # The content ends right before the stop sequence
    print(response.content[0].text)
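To make the "content ends right before the stop sequence" behavior concrete, here is a purely local simulation. It is not how the API is implemented and it makes no request; cut_at_stop_sequence and the draft string are hypothetical, chosen only to mirror the behavior described above.

```python
def cut_at_stop_sequence(text, stop_sequences):
    """Cut text before the earliest stop sequence; return (kept_text, matched_sequence)."""
    earliest = None  # (position, sequence) of the first match found so far
    for sequence in stop_sequences:
        position = text.find(sequence)
        if position != -1 and (earliest is None or position < earliest[0]):
            earliest = (position, sequence)
    if earliest is None:
        # No stop sequence present: nothing is cut and nothing is reported.
        return text, None
    return text[:earliest[0]], earliest[1]


# Hypothetical text a model might have produced had no stop sequence been set.
draft = "1. Red\n2. Green\n3. Blue\n\nEND"
kept, matched = cut_at_stop_sequence(draft, ["\n\nEND"])

print(repr(kept))     # '1. Red\n2. Green\n3. Blue'
print(repr(matched))  # '\n\nEND'
```

In this simulation, kept plays the role of the text in response.content and matched plays the role of response.stop_sequence.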
Best Practices for Production Applications
1. Always Check stop_reason
Never assume a response is complete. Always check stop_reason before processing:
def process_response(response):
    if response.stop_reason == "end_turn":
        return handle_complete(response)
    elif response.stop_reason == "max_tokens":
        return handle_truncated(response)
    elif response.stop_reason == "tool_use":
        return handle_tool_calls(response)
    elif response.stop_reason == "stop_sequence":
        return handle_stop_sequence(response)
    else:
        raise ValueError(f"Unknown stop_reason: {response.stop_reason}")
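The handler functions in the snippet above are left undefined. As one possible shape, the sketch below replaces the if/elif chain with a dispatch table and exercises it with a hand-built stand-in object. The HANDLERS table, the (action, payload) return convention, and the SimpleNamespace stand-in are all assumptions made for illustration, not SDK features.

```python
from types import SimpleNamespace


def text_of(response):
    """Concatenate the text blocks of a response-like object."""
    return "".join(b.text for b in response.content if b.type == "text")


# Hypothetical handlers: each returns an (action, payload) pair for the caller.
HANDLERS = {
    "end_turn": lambda r: ("done", text_of(r)),
    "max_tokens": lambda r: ("continue", text_of(r)),
    "tool_use": lambda r: ("run_tools", [b.name for b in r.content if b.type == "tool_use"]),
    "stop_sequence": lambda r: ("done", text_of(r)),
}


def process_response(response):
    handler = HANDLERS.get(response.stop_reason)
    if handler is None:
        raise ValueError(f"Unknown stop_reason: {response.stop_reason}")
    return handler(response)


# Stand-in for an SDK response, built by hand for the demonstration.
fake = SimpleNamespace(
    stop_reason="max_tokens",
    content=[SimpleNamespace(type="text", text="Once upon a")],
)
print(process_response(fake))  # ('continue', 'Once upon a')
```

A table keeps the set of handled values in one place, so supporting another stop_reason means adding one entry instead of another branch.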
2. Build a Retry Loop for max_tokens
For long-form generation, implement a loop that keeps requesting continuations for as long as the response stops with max_tokens:
def generate_complete_response(client, messages, max_tokens=4096):
    all_content = []
    while True:
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=max_tokens,
            messages=messages
        )
        all_content.extend(response.content)
        if response.stop_reason != "max_tokens":
            break
        # Continue from where we left off
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": "Please continue."})
    return all_content
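The loop above has no upper bound and appends to the caller's message list. A hedged variant that caps the number of requests and works on a copy might look like the following. The names generate_with_cap, max_rounds, and StubMessages are hypothetical, and the stub replays canned responses so the sketch runs without calling the API.

```python
from types import SimpleNamespace


def generate_with_cap(client, messages, model, max_tokens=4096, max_rounds=5):
    """Continue after max_tokens, but give up after max_rounds requests."""
    history = list(messages)  # copy so the caller's list is not mutated
    parts = []
    for _ in range(max_rounds):
        response = client.messages.create(
            model=model, max_tokens=max_tokens, messages=history
        )
        parts.extend(b.text for b in response.content if b.type == "text")
        if response.stop_reason != "max_tokens":
            return "".join(parts), True
        history.append({"role": "assistant", "content": response.content})
        history.append({"role": "user", "content": "Please continue."})
    return "".join(parts), False  # still truncated after max_rounds


class StubMessages:
    """Test double: replays canned (text, stop_reason) pairs instead of calling the API."""

    def __init__(self, script):
        self._script = iter(script)

    def create(self, **kwargs):
        text, stop_reason = next(self._script)
        block = SimpleNamespace(type="text", text=text)
        return SimpleNamespace(content=[block], stop_reason=stop_reason)


stub = SimpleNamespace(messages=StubMessages([
    ("Once upon ", "max_tokens"),
    ("a time.", "end_turn"),
]))
story, complete = generate_with_cap(
    stub, [{"role": "user", "content": "Write a long story"}], model="stub-model"
)
print(story, complete)  # Once upon a time. True
```

Returning a completion flag lets the caller decide what to do with output that is still truncated. With a real model the joined text may also need cleanup, since a continuation can begin with a short preamble.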
3. Handle Tool Chains Properly
When using tools, you may get multiple tool_use blocks in one response (parallel tool use). Process all of them before continuing:
def handle_tool_chain(client, messages, tools):
    while True:
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            tools=tools,
            messages=messages
        )
        if response.stop_reason != "tool_use":
            return response
        # Process all tool calls in this turn
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(result)
                })
        # Add assistant response and tool results
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})
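The execute_tool function used above is not defined in this guide. One possible shape, sketched under assumptions, is a registry that maps tool names to Python functions and reports failures back through the tool_result block's is_error field. TOOL_REGISTRY, calculator, and build_tool_result are hypothetical names.

```python
def calculator(operation, a, b):
    """Toy tool used only for this illustration."""
    if operation == "add":
        return a + b
    raise ValueError(f"unsupported operation: {operation}")


# Hypothetical registry: tool name -> function taking the tool input as keyword arguments.
TOOL_REGISTRY = {"calculator": calculator}


def build_tool_result(tool_use_id, name, tool_input):
    """Run one tool and wrap the outcome in a tool_result block."""
    try:
        output = TOOL_REGISTRY[name](**tool_input)
        return {"type": "tool_result", "tool_use_id": tool_use_id, "content": str(output)}
    except Exception as exc:  # unknown tool, bad arguments, or a failure inside the tool
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": f"{type(exc).__name__}: {exc}",
            "is_error": True,
        }


ok = build_tool_result("toolu_123", "calculator", {"operation": "add", "a": 1234, "b": 5678})
bad = build_tool_result("toolu_456", "get_weather", {"location": "Tokyo"})

print(ok["content"])        # 6912
print(bad.get("is_error"))  # True
```

Because a failing tool still yields a tool_result block, every tool_use id in the assistant turn receives a matching result and the conversation stays well-formed.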
Common Pitfalls to Avoid
| Pitfall | Solution |
|---|---|
| Ignoring stop_reason | Always check it before processing content |
| Adding text after tool_result | Send only the tool_result block |
| Retrying empty responses without changes | Add a continuation prompt |
| Forgetting to append assistant content | Include Claude's response in the next request |
| Not handling parallel tool calls | Iterate over all content blocks |
Key Takeaways
- stop_reason tells you why Claude stopped – always check it before processing a response. The four values covered in this guide are end_turn, max_tokens, tool_use, and stop_sequence.
- end_turn can sometimes produce empty responses – prevent this by never adding text after tool_result blocks, and recover by sending a continuation prompt.
- max_tokens means the response is truncated – implement a retry loop that appends the partial response and asks Claude to continue.
- tool_use requires you to execute tools and feed results back – handle all tool calls in a single turn before continuing.
- Build a state machine around stop_reason for robust, production-ready applications that handle all scenarios gracefully.