Mastering Claude’s Stop Reasons: Build Reliable API Applications
This guide explains Claude’s stop_reason field—end_turn, tool_use, max_tokens, and stop_sequence—and shows how to handle each case in Python. You’ll learn to detect empty responses, continue tool loops, and manage token limits for reliable API integrations.
Introduction
When you call the Claude API, every successful response includes a stop_reason field. This small piece of data tells you why the model stopped generating—whether it finished naturally, requested a tool call, hit a token limit, or encountered a stop sequence. Ignoring it can lead to incomplete answers, broken tool loops, or silent failures.
In this guide, you’ll learn exactly what each stop reason means, how to handle it in your code, and how to avoid common pitfalls like empty responses. By the end, you’ll be able to build robust applications that gracefully handle every scenario.
Understanding the stop_reason Field
The stop_reason field appears in every successful Messages API response. It’s not an error—it’s a signal about how Claude completed its turn. Here’s a typical response snippet:
{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Here's the answer to your question..."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50
  }
}
There are four possible values for stop_reason:
| Value | Meaning |
|---|---|
| `end_turn` | Claude finished its response naturally. |
| `tool_use` | Claude wants to call a tool (function). |
| `max_tokens` | Claude stopped because it hit the `max_tokens` limit. |
| `stop_sequence` | Claude encountered a custom stop sequence you defined. |
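As a quick orientation, the four values map naturally to four next actions. Here is a minimal dispatcher sketch; the action names are illustrative labels for this article, not part of the API:

```python
def route_stop_reason(stop_reason: str) -> str:
    """Map a stop_reason value to the action an application should take next."""
    actions = {
        "end_turn": "return_to_user",          # response is complete
        "tool_use": "run_tool_loop",           # execute the tool, send result back
        "max_tokens": "continue_generation",   # output was truncated
        "stop_sequence": "parse_delimited_output",  # a custom delimiter fired
    }
    try:
        return actions[stop_reason]
    except KeyError:
        # Fail loudly on values this handler doesn't know about
        raise ValueError(f"Unexpected stop_reason: {stop_reason}")
```

The sections below walk through each branch in detail.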
end_turn: The Natural Finish
end_turn is the most common stop reason. It means Claude decided it had completed its response and didn’t need to say more. In most cases, you can simply return the response to the user.
Basic Handling in Python
from anthropic import Anthropic

client = Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)

if response.stop_reason == "end_turn":
    print(response.content[0].text)
The Empty Response Gotcha
Sometimes Claude returns stop_reason: "end_turn" with an empty or near-empty response (2–3 tokens, no meaningful content). This typically happens in tool-use workflows when:
- You add text blocks immediately after `tool_result` blocks.
- You send Claude's completed response back without adding anything new.
# INCORRECT: Adding text after tool_result
messages = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "assistant", "content": [
        {
            "type": "tool_use",
            "id": "toolu_123",
            "name": "calculator",
            "input": {"operation": "add", "a": 1234, "b": 5678}
        }
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"}  # ❌ Don't do this
    ]}
]
# CORRECT: Send only the tool_result
messages = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "assistant", "content": [
        {
            "type": "tool_use",
            "id": "toolu_123",
            "name": "calculator",
            "input": {"operation": "add", "a": 1234, "b": 5678}
        }
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"}
    ]}  # ✅ Just the result
]
If you still get empty responses, add a retry loop:
def handle_empty_response(client, messages):
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=messages
    )
    if response.stop_reason == "end_turn" and not response.content:
        # Retry with a slight prompt adjustment
        messages.append({"role": "user", "content": "Please continue."})
        return client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=messages
        )
    return response
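The `not response.content` check can be made more robust: a response may also contain only whitespace text blocks. Here is a hypothetical helper that treats both cases as empty; it operates on simplified dict blocks (e.g. `{"type": "text", "text": "..."}`) as a stand-in for the SDK's content objects:

```python
def is_effectively_empty(content_blocks) -> bool:
    """Treat a response as empty if it has no blocks, or only blank text blocks.

    content_blocks: list of dicts like {"type": "text", "text": "..."} —
    a simplified stand-in for the SDK content objects.
    """
    if not content_blocks:
        return True
    return all(
        b.get("type") == "text" and not b.get("text", "").strip()
        for b in content_blocks
    )
```

Swap this in for the bare `not response.content` test if you see responses that contain a single near-empty text block.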
tool_use: Claude Wants to Call a Tool
When stop_reason is tool_use, Claude has decided it needs to use a tool to complete the task. Your application must:
1. Extract the tool call details from `content`.
2. Execute the tool (e.g., call an API, query a database).
3. Append the result as a `tool_result` block.
4. Send the updated message history back to Claude.
Complete Tool Loop Example
from anthropic import Anthropic

client = Anthropic()

messages = [
    {"role": "user", "content": "What's the weather in Tokyo?"}
]

while True:
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        tools=[{
            "name": "get_weather",
            "description": "Get current weather for a city",
            "input_schema": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"}
                },
                "required": ["city"]
            }
        }],
        messages=messages
    )

    if response.stop_reason == "end_turn":
        print(response.content[0].text)
        break
    elif response.stop_reason == "tool_use":
        # Find the tool_use block (a text block may precede it,
        # so don't assume it's at content[0])
        tool_call = next(b for b in response.content if b.type == "tool_use")
        tool_name = tool_call.name
        tool_input = tool_call.input

        # Execute tool (simulated)
        if tool_name == "get_weather":
            result = f"25°C in {tool_input['city']}"

        # Append assistant's tool call and result
        messages.append({"role": "assistant", "content": response.content})
        messages.append({
            "role": "user",
            "content": [{
                "type": "tool_result",
                "tool_use_id": tool_call.id,
                "content": result
            }]
        })
        # Loop continues
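Claude can also emit several `tool_use` blocks in a single response (parallel tool calls), and all of their results must go back together in one user message. The sketch below shows that pattern using simplified dict blocks and a name-to-callable tool registry; both are assumptions of this example, not SDK types:

```python
def collect_tool_results(content_blocks, tools):
    """Execute every tool_use block and package all results into the single
    user message the API expects next.

    content_blocks: simplified dicts standing in for SDK content objects,
        e.g. {"type": "tool_use", "id": "toolu_1", "name": "add", "input": {...}}.
    tools: mapping of tool name -> callable.
    """
    results = []
    for block in content_blocks:
        if block.get("type") != "tool_use":
            continue  # skip any text blocks interleaved with the tool calls
        output = tools[block["name"]](**block["input"])
        results.append({
            "type": "tool_result",
            "tool_use_id": block["id"],
            "content": str(output),
        })
    return {"role": "user", "content": results}
```

In the loop above you would append `collect_tool_results(...)` instead of building a single `tool_result` by hand, which also keeps the code correct when only one tool is called.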
max_tokens: Hit the Token Limit
When stop_reason is max_tokens, Claude’s response was cut off because it reached the max_tokens limit you set. The response may be incomplete.
How to Handle It
messages = [
    {"role": "user", "content": "Write a detailed essay on AI ethics."}
]

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=100,  # Intentionally low for demonstration
    messages=messages
)

if response.stop_reason == "max_tokens":
    print("Response was truncated. Consider increasing max_tokens.")
    # Option 1: Increase max_tokens and retry
    # Option 2: Append the partial response and ask Claude to continue
    partial_text = response.content[0].text
    messages.append({"role": "assistant", "content": partial_text})
    messages.append({"role": "user", "content": "Continue from where you left off."})
    continued_response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=500,
        messages=messages
    )
Best practice: Set max_tokens generously (e.g., 4096 or higher) for open-ended tasks, or implement a continuation loop as shown above.
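The continuation pattern can be wrapped in a loop that keeps requesting output until the stop reason is no longer `max_tokens`. A minimal sketch: `create_fn` is an injected stand-in for `client.messages.create` (so the loop can be exercised without network access), and responses are modeled as plain dicts rather than SDK objects:

```python
def generate_until_done(create_fn, messages, max_tokens=1024, max_rounds=5):
    """Accumulate text across continuations until stop_reason != 'max_tokens'.

    create_fn: callable mimicking client.messages.create(messages=..., max_tokens=...),
        returning a dict with "stop_reason" and "content" keys (a simplified
        stand-in for the real response object).
    max_rounds: safety cap so a misbehaving loop can't run forever.
    """
    parts = []
    for _ in range(max_rounds):
        response = create_fn(messages=messages, max_tokens=max_tokens)
        text = "".join(b["text"] for b in response["content"] if b["type"] == "text")
        parts.append(text)
        if response["stop_reason"] != "max_tokens":
            break
        # Feed the partial answer back and ask Claude to pick up where it stopped
        messages = messages + [
            {"role": "assistant", "content": text},
            {"role": "user", "content": "Continue exactly where you left off."},
        ]
    return "".join(parts)
```

The `max_rounds` cap is worth keeping in production: it bounds cost if the model keeps hitting the limit.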
stop_sequence: Custom Stop Sequence Encountered
If you define custom stop sequences in your API request, Claude will stop generating when it encounters one. The stop_sequence field will contain the matched sequence.
Example
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    stop_sequences=["\n\nEND"],
    messages=[
        {"role": "user", "content": "List three fruits and then write END."}
    ]
)

if response.stop_reason == "stop_sequence":
    print(f"Stopped at sequence: {response.stop_sequence}")
    print(response.content[0].text)
This is useful for structured outputs where you want Claude to stop at a delimiter.
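When you register several stop sequences, the stop_sequence field tells you which delimiter fired, and you can branch on it. A minimal sketch with hypothetical delimiters (the labels and sequences here are illustrative, not part of the API):

```python
def classify_section(stop_reason, stop_sequence):
    """Route output when multiple delimiters are registered, e.g.
    stop_sequences=["\n\nEND", "\n\nERROR"] (hypothetical delimiters)."""
    if stop_reason != "stop_sequence":
        return "complete"  # generation ended for some other reason
    return {
        "\n\nEND": "finished_list",
        "\n\nERROR": "model_reported_error",
    }.get(stop_sequence, "unknown_delimiter")
```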
Building a Robust Response Handler
Combine all the logic into a single function:
def handle_claude_response(response, client, messages, tools=None):
    """Route Claude's response based on stop_reason."""
    if response.stop_reason == "end_turn":
        if not response.content:
            # Handle empty response
            messages.append({"role": "user", "content": "Please continue."})
            return client.messages.create(
                model=response.model,
                max_tokens=1024,
                messages=messages
            )
        return response.content[0].text
    elif response.stop_reason == "tool_use":
        # Find the tool_use block (a text block may precede it)
        tool_call = next(b for b in response.content if b.type == "tool_use")
        # execute_tool is your own dispatcher, defined elsewhere
        result = execute_tool(tool_call.name, tool_call.input)
        messages.append({"role": "assistant", "content": response.content})
        messages.append({
            "role": "user",
            "content": [{
                "type": "tool_result",
                "tool_use_id": tool_call.id,
                "content": result
            }]
        })
        # Recursively call; re-send the tool definitions so Claude
        # can keep using them on the next turn
        kwargs = {"tools": tools} if tools else {}
        new_response = client.messages.create(
            model=response.model,
            max_tokens=1024,
            messages=messages,
            **kwargs
        )
        return handle_claude_response(new_response, client, messages, tools)
    elif response.stop_reason == "max_tokens":
        # Continue generation from the partial text
        messages.append({"role": "assistant", "content": response.content[0].text})
        messages.append({"role": "user", "content": "Continue."})
        new_response = client.messages.create(
            model=response.model,
            max_tokens=1024,
            messages=messages
        )
        return handle_claude_response(new_response, client, messages, tools)
    elif response.stop_reason == "stop_sequence":
        # Custom handling
        return response.content[0].text
Key Takeaways
- Always check `stop_reason` in every API response to determine the next action; don't assume the response is final.
- For `tool_use`, implement a loop that executes the tool and feeds the result back to Claude until you get `end_turn`.
- For `max_tokens`, either increase the limit or implement a continuation pattern to avoid truncated responses.
- Avoid empty `end_turn` responses by sending only `tool_result` blocks (no extra text) in tool workflows.
- Use `stop_sequences` for structured outputs when you need Claude to stop at a specific delimiter.