Mastering Claude API Stop Reasons: Build Robust Applications with end_turn, tool_use & max_tokens
This guide explains Claude API stop_reason values (end_turn, tool_use, max_tokens, stop_sequence) and how to handle each one in your application. You'll learn to prevent empty responses, manage tool loops, detect truncation, and build robust streaming logic.
Introduction
Every time you call the Claude Messages API, the response includes a stop_reason field. This small piece of data tells you why the model stopped generating—whether it finished naturally, requested a tool call, hit a token limit, or encountered a stop sequence. Ignoring stop_reason is like driving without paying attention to traffic lights: you might get where you're going, but you'll eventually crash.
In this guide, you'll learn:
- The four possible stop_reason values and what each means
- How to handle each stop reason in Python and TypeScript
- How to prevent and recover from empty responses
- How to build a robust tool-use loop
- How to handle truncation in streaming and non-streaming modes
Understanding the stop_reason Field
The stop_reason field appears in every successful Messages API response. It's not an error—it's a signal. Here's the anatomy of a response:
    {
      "id": "msg_01234",
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "text",
          "text": "Here's the answer to your question..."
        }
      ],
      "stop_reason": "end_turn",
      "stop_sequence": null,
      "usage": {
        "input_tokens": 100,
        "output_tokens": 50
      }
    }
The Four Stop Reasons
| Value | Meaning | When It Occurs |
|---|---|---|
| end_turn | Claude finished naturally | After a complete response, or after tool results when Claude decides its turn is done |
| tool_use | Claude wants to call a tool | When the model determines it needs external data or computation |
| max_tokens | Claude hit the token limit | When the response was truncated due to max_tokens |
| stop_sequence | Claude encountered a custom stop sequence | When the model generates one of your specified stop_sequences |
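The table maps naturally onto a small dispatch function. The sketch below is illustrative only: next_action and its return labels are hypothetical names, not part of the Anthropic SDK, and it works on the plain stop_reason string so it runs without an API call.

```python
def next_action(stop_reason: str) -> str:
    """Map a stop_reason string to what the application should do next."""
    actions = {
        "end_turn": "show_response",
        "tool_use": "run_tools",
        "max_tokens": "continue_or_raise_limit",
        "stop_sequence": "show_response",
    }
    # Fall back to a safe default for any value this guide does not cover.
    return actions.get(stop_reason, "log_and_inspect")
```

Keeping a default branch means an unexpected value is surfaced for inspection rather than silently treated as success.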
Handling end_turn — The Natural Stop
end_turn is the most common stop reason. It means Claude completed its response without needing a tool or being cut off. In most cases, you can simply display the response content.
Basic Handling (Python)
    from anthropic import Anthropic

    client = Anthropic()

    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello!"}],
    )

    if response.stop_reason == "end_turn":
        # Process the complete response
        print(response.content[0].text)
Basic Handling (TypeScript)
    import Anthropic from '@anthropic-ai/sdk';

    const client = new Anthropic();

    const response = await client.messages.create({
      model: 'claude-sonnet-4-20250514',
      max_tokens: 1024,
      messages: [{ role: 'user', content: 'Hello!' }],
    });

    if (response.stop_reason === 'end_turn') {
      // content is a union of block types, so narrow to a text block before reading .text
      const first = response.content[0];
      if (first?.type === 'text') {
        console.log(first.text);
      }
    }
The Empty Response Problem
Sometimes Claude returns an empty response (2-3 tokens with no content) with stop_reason: "end_turn". This typically happens in tool-use scenarios when:
- You add text blocks immediately after tool results — Claude learns to expect the user to always insert text after tool results, so it ends its turn to follow the pattern.
- You send Claude's completed response back without adding anything — Claude already decided it's done, so it remains done.
Incorrect: Adding text immediately after tool_result
    messages = [
        {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
        {"role": "assistant", "content": [
            {
                "type": "tool_use",
                "id": "toolu_123",
                "name": "calculator",
                "input": {"operation": "add", "a": 1234, "b": 5678}
            }
        ]},
        {"role": "user", "content": [
            {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
            {"type": "text", "text": "Here's the result"}  # ❌ Don't do this
        ]}
    ]
Correct: Send tool results directly without additional text
    messages = [
        {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
        {"role": "assistant", "content": [
            {
                "type": "tool_use",
                "id": "toolu_123",
                "name": "calculator",
                "input": {"operation": "add", "a": 1234, "b": 5678}
            }
        ]},
        {"role": "user", "content": [
            {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"}  # ✅ Just the result
        ]}
    ]
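If messages are assembled in several places, a small guard can enforce the "tool results only" rule before each request. This is a minimal sketch, assuming messages are plain dicts shaped like the examples above; strip_text_after_tool_results is a hypothetical helper name, not an SDK function.

```python
from typing import Any, Dict, List


def strip_text_after_tool_results(message: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a user message without text blocks that follow a tool_result."""
    content = message.get("content")
    if message.get("role") != "user" or not isinstance(content, list):
        return message  # plain-string content or an assistant turn: nothing to do

    cleaned: List[Dict[str, Any]] = []
    seen_tool_result = False
    for block in content:
        if block.get("type") == "tool_result":
            seen_tool_result = True
        elif block.get("type") == "text" and seen_tool_result:
            continue  # drop text placed after a tool result
        cleaned.append(block)
    # Build a new dict so the caller's message is left unmodified.
    return {**message, "content": cleaned}
```

Applied to the incorrect example above, the helper would return the same user message with only the tool_result block left in it.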
Recovering from Empty Responses
If you still get empty responses after fixing the above, use a continuation prompt:
    def handle_empty_response(client, messages):
        response = client.messages.create(
            model="claude-opus-4-7",
            max_tokens=1024,
            messages=messages,
        )

        if response.stop_reason == "end_turn" and not response.content:
            # ❌ Don't just retry with the same messages:
            # Claude already decided it's done
            # ✅ Add a continuation prompt in a NEW user message
            messages.append({
                "role": "user",
                "content": "Please continue with your response.",
            })
            response = client.messages.create(
                model="claude-opus-4-7",
                max_tokens=1024,
                messages=messages,
            )

        return response
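The check above only catches a completely empty content list. A slightly broader test, sketched below, also treats a response made up solely of blank text blocks as empty. That broader definition is an assumption of this sketch, and is_effectively_empty is a hypothetical helper; the SimpleNamespace objects merely stand in for SDK content blocks.

```python
from types import SimpleNamespace
from typing import Any, Sequence


def is_effectively_empty(content: Sequence[Any]) -> bool:
    """True when there are no blocks, or only text blocks that are blank."""
    for block in content:
        if getattr(block, "type", None) != "text":
            return False  # a tool_use (or other) block counts as real content
        if getattr(block, "text", "").strip():
            return False  # non-blank text
    return True


# Stand-ins for SDK content blocks, used here only to exercise the helper.
blank = SimpleNamespace(type="text", text="  \n")
hello = SimpleNamespace(type="text", text="Hello")
```

In handle_empty_response, the condition could then be written as is_effectively_empty(response.content) instead of not response.content.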
Handling tool_use — Building the Tool Loop
When stop_reason is "tool_use", Claude has decided it needs to call a tool. Your application must:
- Execute the requested tool
- Return the result as a tool_result block
- Continue the conversation
Complete Tool Loop (Python)
    from anthropic import Anthropic

    client = Anthropic()

    tools = [{
        "name": "get_weather",
        "description": "Get current weather for a city",
        "input_schema": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    }]

    def execute_tool(name, tool_input):
        # Placeholder: in real code, call your weather API here
        return f"(stub) weather for {tool_input['location']}"

    messages = [{"role": "user", "content": "What's the weather in Tokyo?"}]

    while True:
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=messages,
            tools=tools,
        )

        if response.stop_reason == "tool_use":
            # Claude may request several tools in one turn; answer every tool_use block
            tool_results = [
                {
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(execute_tool(block.name, block.input)),
                }
                for block in response.content
                if block.type == "tool_use"
            ]
            # Add the assistant turn, then all tool results in a single user message
            messages.append({"role": "assistant", "content": response.content})
            messages.append({"role": "user", "content": tool_results})
            # Loop continues...
        elif response.stop_reason == "end_turn":
            # Claude has finished
            print(response.content[0].text)
            break
        else:
            # max_tokens or stop_sequence: stop here instead of looping forever
            print(f"Stopped early: {response.stop_reason}")
            break
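The loop relies on an execute_tool function that the guide leaves to the reader. One possible shape, sketched here under stated assumptions, is a registry that maps tool names to local functions and reports failures back to Claude through the tool_result block's is_error field. TOOL_REGISTRY, build_tool_result, and the stubbed get_weather are hypothetical names for illustration.

```python
from typing import Any, Callable, Dict


def get_weather(location: str) -> str:
    # Stub for illustration; a real implementation would call a weather service.
    return f"(stub) weather report for {location}"


# Hypothetical registry mapping tool names to local Python callables.
TOOL_REGISTRY: Dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def build_tool_result(tool_use_id: str, name: str, tool_input: Dict[str, Any]) -> Dict[str, Any]:
    """Run one tool and wrap the outcome in a tool_result content block."""
    try:
        output = TOOL_REGISTRY[name](**tool_input)
        return {"type": "tool_result", "tool_use_id": tool_use_id, "content": str(output)}
    except Exception as exc:  # unknown tool, bad arguments, or a failure inside the tool
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": f"{type(exc).__name__}: {exc}",
            "is_error": True,
        }
```

Returning an error block instead of raising keeps the conversation valid, since every tool_use block still receives a matching tool_result and Claude can decide how to proceed.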
Handling max_tokens — Detecting Truncation
When stop_reason is "max_tokens", Claude's response was cut off because it hit the max_tokens limit. This is common for long responses or complex reasoning.
What to Do
- Increase max_tokens if the response is consistently truncated
- Use a continuation prompt to ask Claude to finish
- Enable extended thinking for complex tasks that need more tokens
Continuation Pattern (Python)
    def handle_truncation(client, messages, max_tokens=4096, max_rounds=5):
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=max_tokens,
            messages=messages,
        )

        # Cap the number of continuations so a very long response cannot loop forever
        rounds = 0
        while response.stop_reason == "max_tokens" and rounds < max_rounds:
            # Add the partial response to messages
            messages.append({"role": "assistant", "content": response.content})
            # Ask Claude to continue
            messages.append({"role": "user", "content": "Please continue."})
            response = client.messages.create(
                model="claude-sonnet-4-20250514",
                max_tokens=max_tokens,
                messages=messages,
            )
            rounds += 1

        # Earlier chunks stay in `messages`; `response` holds only the final chunk
        return response
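Because each continuation produces a separate response, the full answer has to be reassembled from the individual chunks. The sketch below shows one way to do that; stitch_text is a hypothetical helper, and the SimpleNamespace objects stand in for the content lists of real responses. Whether the pieces join seamlessly depends on how Claude resumes, so treat the result as a starting point.

```python
from types import SimpleNamespace
from typing import Any, Iterable, Sequence


def stitch_text(chunks: Iterable[Sequence[Any]]) -> str:
    """Concatenate the text blocks of several content lists, in order."""
    return "".join(
        block.text
        for content in chunks
        for block in content
        if getattr(block, "type", None) == "text"
    )


# Stand-ins for the `content` lists of two consecutive responses.
first = [SimpleNamespace(type="text", text="Once upon a ")]
second = [SimpleNamespace(type="text", text="time.")]
story = stitch_text([first, second])
```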
Streaming with max_tokens
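When streaming with stream=True, the stop reason arrives in the message_delta event that precedes the final message_stop event, so check it there:

    stream = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=256,
        messages=[{"role": "user", "content": "Write a long story"}],
        stream=True,
    )

    for event in stream:
        if event.type == "content_block_delta" and event.delta.type == "text_delta":
            # Print text as it arrives
            print(event.delta.text, end="", flush=True)
        elif event.type == "message_delta":
            # stop_reason is delivered on message_delta, not message_stop
            if event.delta.stop_reason == "max_tokens":
                print("\n[Response was truncated; consider increasing max_tokens]")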
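In the raw event stream, text arrives in content_block_delta events and the stop reason in a message_delta event. The sketch below separates that bookkeeping into a function so it can be exercised without a network call. consume_stream is a hypothetical helper, and the simulated events only mimic the attributes the function reads.

```python
from types import SimpleNamespace as NS
from typing import Any, Iterable, Optional, Tuple


def consume_stream(events: Iterable[Any]) -> Tuple[str, Optional[str]]:
    """Collect streamed text and the stop_reason from raw Messages API events."""
    parts = []
    stop_reason = None
    for event in events:
        if event.type == "content_block_delta" and event.delta.type == "text_delta":
            parts.append(event.delta.text)
        elif event.type == "message_delta":
            stop_reason = event.delta.stop_reason
    return "".join(parts), stop_reason


# Simulated events standing in for a real stream.
fake_events = [
    NS(type="content_block_delta", delta=NS(type="text_delta", text="Once upon")),
    NS(type="content_block_delta", delta=NS(type="text_delta", text=" a time")),
    NS(type="message_delta", delta=NS(stop_reason="max_tokens")),
    NS(type="message_stop"),
]
text, reason = consume_stream(fake_events)
```

With a real stream, the same function could be passed the iterator returned by client.messages.create(..., stream=True).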
Handling stop_sequence — Custom Stop Conditions
When you define custom stop_sequences in your API call, Claude will stop generating as soon as it produces one of those sequences. This is useful for:
- Extracting structured data (stop at </output>)
- Limiting response length in a controlled way
- Building chat interfaces that stop at user-like messages
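    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        stop_sequences=["</answer>", "\n\nHuman:"],
        messages=[
            {
                "role": "user",
                # Stop sequences match Claude's output, so ask Claude to emit the tag
                "content": "Explain quantum computing in one sentence, then write </answer>.",
            }
        ],
    )

    if response.stop_reason == "stop_sequence":
        # The matched sequence is reported here and is not included in the text
        print(f"Stopped at sequence: {response.stop_sequence}")
        # Guard against an empty content list before indexing
        if response.content:
            print(response.content[0].text)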
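Since the matched stop sequence is reported in the stop_sequence field rather than in the generated text, code that expects a closing tag has to add it back. A minimal sketch, with restore_stop_sequence as a hypothetical helper that takes plain values rather than an SDK object:

```python
from typing import Optional


def restore_stop_sequence(text: str, stop_reason: str, stop_sequence: Optional[str]) -> str:
    """Re-append the matched stop sequence, which is left out of the response text."""
    if stop_reason == "stop_sequence" and stop_sequence:
        return text + stop_sequence
    return text
```

This is mainly useful when the output is fed to a parser that needs balanced tags, for example when stopping at </answer>.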
Building a Complete Stop Reason Handler
Here's a function that handles all four stop reasons in one loop. It assumes an execute_tool(name, input) helper that you supply:

    from typing import Dict, List, Optional

    from anthropic import Anthropic

    client = Anthropic()

    def handle_claude_response(
        messages: List[Dict],
        tools: Optional[List[Dict]] = None,
        max_tokens: int = 1024,
        max_iterations: int = 10,
    ) -> str:
        """
        Handle all stop reasons and return the final text response.
        """
        # Only send `tools` when the caller supplied some
        extra = {"tools": tools} if tools else {}
        parts = []  # text collected across max_tokens continuations

        for _ in range(max_iterations):
            response = client.messages.create(
                model="claude-sonnet-4-20250514",
                max_tokens=max_tokens,
                messages=messages,
                **extra,
            )
            text = "".join(b.text for b in response.content if b.type == "text")

            if response.stop_reason == "tool_use":
                # Execute every requested tool; return all results in one user message
                messages.append({"role": "assistant", "content": response.content})
                messages.append({
                    "role": "user",
                    "content": [
                        {
                            "type": "tool_result",
                            "tool_use_id": block.id,
                            "content": str(execute_tool(block.name, block.input)),
                        }
                        for block in response.content
                        if block.type == "tool_use"
                    ],
                })
            elif response.stop_reason == "max_tokens":
                # Keep the partial text, then ask Claude to continue
                parts.append(text)
                messages.append({"role": "assistant", "content": response.content})
                messages.append({"role": "user", "content": "Please continue."})
            elif response.stop_reason == "end_turn" and not response.content:
                # Empty response: nudge Claude with a new user message
                messages.append({"role": "user", "content": "Please continue."})
            else:
                # end_turn with content, stop_sequence, or any other stop reason
                parts.append(text)
                return "".join(parts)

        raise RuntimeError(f"No final response after {max_iterations} iterations")
Best Practices Summary
- Always check stop_reason: don't assume end_turn means success. It could be an empty response.
- Never add text after tool_result: this causes empty responses. Send only the result.
- Use continuation prompts for truncation: when max_tokens stops the response, ask Claude to continue.
- Build a loop for tool_use: keep calling the API until you get end_turn.
- Log stop_reason in production: it's invaluable for debugging unexpected behavior.
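For the logging recommendation, a minimal sketch using the standard logging module might look like this. The logger name, the log_stop_reason helper, and the choice to log truncation at WARNING level are all assumptions of this example; the SimpleNamespace object stands in for an SDK response with the fields shown in the response anatomy earlier.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger("claude.stop_reason")


def log_stop_reason(response) -> str:
    """Log the stop_reason with output token usage and return the formatted line."""
    line = (
        f"id={response.id} stop_reason={response.stop_reason} "
        f"output_tokens={response.usage.output_tokens}"
    )
    # Truncation is worth surfacing more loudly than a normal stop.
    level = logging.WARNING if response.stop_reason == "max_tokens" else logging.INFO
    logger.log(level, line)
    return line


# Stand-in for an SDK response object.
fake = SimpleNamespace(
    id="msg_01234",
    stop_reason="end_turn",
    usage=SimpleNamespace(input_tokens=100, output_tokens=50),
)
logged = log_stop_reason(fake)
```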
Key Takeaways
- Four stop reasons exist: end_turn (natural), tool_use (needs tool), max_tokens (truncated), stop_sequence (custom stop).
- Empty responses with end_turn are caused by adding text after tool_result blocks or sending back Claude's own response unchanged. Fix by sending only tool results and using continuation prompts.
- Tool loops require explicit handling: when stop_reason is tool_use, execute the tool, return the result, and continue the conversation until end_turn.
- Truncation from max_tokens can be handled by appending the partial response and asking Claude to continue in a new user message.
- Streaming applications should check stop_reason in the message_delta event near the end of the stream to detect truncation or tool requests.