Mastering Claude's Stop Reasons: A Practical Guide to Handling API Responses
This guide explains Claude's stop_reason field—end_turn, tool_use, max_tokens, and stop_sequence—and how to handle each in your application. You'll learn to detect empty responses, manage tool loops, and build robust conversational flows with practical Python examples.
Introduction
When you call the Claude Messages API, every successful response includes a stop_reason field. This small but critical piece of data tells you why Claude stopped generating—whether it finished naturally, requested a tool, hit a token limit, or matched a custom stop sequence. Misinterpreting these values can lead to broken conversations, infinite loops, or missed tool calls.
In this guide, you'll learn exactly what each stop_reason means, how to handle them in your code, and how to avoid common pitfalls like empty responses or stuck tool chains.
Understanding the stop_reason Field
The stop_reason field appears in every successful response from the Messages API. It is not an error—it indicates a normal completion. Here's a typical response structure:
{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Here's the answer to your question..."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50
  }
}
This guide covers four values:
- end_turn: Claude finished naturally.
- tool_use: Claude wants to call a tool.
- max_tokens: Claude hit the max_tokens limit.
- stop_sequence: Claude encountered a custom stop sequence.
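One way to keep these values straight in application code is a small lookup from stop_reason to the follow-up action. The sketch below is only an illustration: the NextStep enum, the NEXT_STEP_BY_STOP_REASON table, and the next_step function are hypothetical names, not part of the Anthropic SDK.

```python
from enum import Enum


class NextStep(Enum):
    """Hypothetical follow-up actions an application might take."""
    SHOW_TO_USER = "show_to_user"
    RUN_TOOLS = "run_tools"
    CONTINUE_OUTPUT = "continue_output"


# Assumed mapping from each stop_reason covered in this guide to an action.
NEXT_STEP_BY_STOP_REASON = {
    "end_turn": NextStep.SHOW_TO_USER,
    "tool_use": NextStep.RUN_TOOLS,
    "max_tokens": NextStep.CONTINUE_OUTPUT,
    "stop_sequence": NextStep.SHOW_TO_USER,
}


def next_step(stop_reason: str) -> NextStep:
    """Return the follow-up action, failing loudly on values this table does not know."""
    try:
        return NEXT_STEP_BY_STOP_REASON[stop_reason]
    except KeyError:
        raise ValueError(f"Unhandled stop_reason: {stop_reason!r}") from None


print(next_step("tool_use"))  # NextStep.RUN_TOOLS
```

Raising on unknown values, rather than defaulting to "show to user", makes it obvious when a response carries a stop reason the application was not written for.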
end_turn: Natural Completion
This is the most common stop reason. Claude has finished its response and expects no further action from you. In a simple Q&A flow, you can safely display the response and wait for the next user input.
from anthropic import Anthropic

client = Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)

if response.stop_reason == "end_turn":
    print(response.content[0].text)
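The example above reads only the first content block, which raises an IndexError when the content list is empty. As a minimal sketch, a helper could join every text block instead. The name extract_text is an assumption for illustration, and the SimpleNamespace objects merely stand in for SDK content blocks.

```python
from types import SimpleNamespace


def extract_text(content) -> str:
    """Concatenate the text of every block whose type is "text"; skip other blocks."""
    return "".join(
        block.text for block in content if getattr(block, "type", None) == "text"
    )


# Stand-in objects shaped like SDK content blocks, used only for illustration.
blocks = [
    SimpleNamespace(type="text", text="Hello, "),
    SimpleNamespace(type="tool_use", id="toolu_123", name="calculator", input={}),
    SimpleNamespace(type="text", text="world."),
]
print(extract_text(blocks))      # Hello, world.
print(repr(extract_text([])))    # ''
```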
The Empty Response Problem
Sometimes Claude returns an empty response (2–3 tokens, no content) with stop_reason: "end_turn". This usually happens in tool-use scenarios when:
- You add text blocks immediately after tool_result blocks.
- You send Claude's own completed response back without adding anything new.
# INCORRECT: Adding text after tool_result
messages = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_123", "name": "calculator", "input": {"operation": "add", "a": 1234, "b": 5678}}
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"}  # ❌ Don't do this
    ]}
]

# CORRECT: Send tool results directly
messages = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_123", "name": "calculator", "input": {"operation": "add", "a": 1234, "b": 5678}}
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"}  # ✅ Just the result
    ]}
]
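To catch the incorrect pattern before a request is sent, a small check over outgoing messages may help. This is a sketch under the assumption that messages are plain dicts shaped like the examples above; has_text_after_tool_result is a hypothetical helper, not an SDK function.

```python
def has_text_after_tool_result(message: dict) -> bool:
    """Return True if a text block appears after a tool_result block in one message."""
    content = message.get("content")
    if not isinstance(content, list):
        return False  # plain string content cannot contain tool_result blocks
    seen_tool_result = False
    for block in content:
        if block.get("type") == "tool_result":
            seen_tool_result = True
        elif block.get("type") == "text" and seen_tool_result:
            return True
    return False


incorrect = {"role": "user", "content": [
    {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
    {"type": "text", "text": "Here's the result"},
]}
correct = {"role": "user", "content": [
    {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
]}
print(has_text_after_tool_result(incorrect))  # True
print(has_text_after_tool_result(correct))    # False
```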
If you still get empty responses, add a continuation prompt in a new user message:
def handle_empty_response(client, messages):
    response = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=1024,
        messages=messages
    )
    if response.stop_reason == "end_turn" and not response.content:
        # Don't retry with the same messages: Claude already decided it's done
        messages.append({"role": "user", "content": "Please continue your response."})
        return client.messages.create(model="claude-opus-4-7", max_tokens=1024, messages=messages)
    return response
tool_use: Claude Wants to Call a Tool
When Claude decides it needs external data or computation, it returns stop_reason: "tool_use" along with one or more tool_use content blocks. Your application must:
- Extract the tool name and input.
- Execute the tool (e.g., call an API, query a database).
- Return the result as a tool_result block in a new user message.
# Tool definitions are declared once and reused for every request
TOOLS = [
    {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"}
            },
            "required": ["city"]
        }
    }
]

def handle_tool_call(client, messages):
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        tools=TOOLS,
        messages=messages
    )
    if response.stop_reason == "tool_use":
        # Keep Claude's tool request in the history so each tool_result has a matching tool_use
        messages.append({"role": "assistant", "content": response.content})
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                # Execute the tool (execute_tool is a placeholder you must implement)
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(result)
                })
        # Return all results together in a single user message
        messages.append({"role": "user", "content": tool_results})
        # Continue the conversation
        return client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            tools=TOOLS,
            messages=messages
        )
    return response
Important: Always return tool results in a user message, not an assistant message. The tool_use_id must match the ID from Claude's request.
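The execute step can be organized as a lookup from tool name to a Python callable. The following is a sketch rather than a prescribed pattern: TOOL_REGISTRY, get_weather, and build_tool_result_message are hypothetical names, the weather data is canned, and the SimpleNamespace objects stand in for SDK content blocks. The is_error flag on a tool_result block tells Claude that the tool failed.

```python
from types import SimpleNamespace


def get_weather(city: str) -> str:
    """Hypothetical tool implementation returning canned data."""
    return f"Sunny in {city}"


# Hypothetical registry mapping tool names to Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def build_tool_result_message(content_blocks) -> dict:
    """Run every tool_use block and return one user message with all tool_result blocks."""
    results = []
    for block in content_blocks:
        if block.type != "tool_use":
            continue
        try:
            output = TOOL_REGISTRY[block.name](**block.input)
            results.append({"type": "tool_result", "tool_use_id": block.id,
                            "content": str(output)})
        except Exception as exc:  # report the failure to Claude instead of crashing
            results.append({"type": "tool_result", "tool_use_id": block.id,
                            "content": f"Tool error: {exc}", "is_error": True})
    return {"role": "user", "content": results}


blocks = [
    SimpleNamespace(type="text", text="Let me check."),
    SimpleNamespace(type="tool_use", id="toolu_123", name="get_weather",
                    input={"city": "Paris"}),
]
print(build_tool_result_message(blocks))
```

Because the tool_use_id is copied straight from each block, the IDs in the returned message always match Claude's request.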
max_tokens: Hit the Output Limit
If Claude's response is cut off because it reached the max_tokens limit, you'll see stop_reason: "max_tokens". This is common for long-form content or complex reasoning.
- Increase max_tokens if you expect longer responses.
- Use continuation: send Claude's partial response back and ask it to continue.
def handle_max_tokens(client, messages, response):
    if response.stop_reason == "max_tokens":
        # Append Claude's partial response to the conversation
        messages.append({"role": "assistant", "content": response.content})
        # Ask to continue
        messages.append({"role": "user", "content": "Please continue."})
        return client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=2048,  # Increase limit
            messages=messages
        )
    return response
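A continuation returns only the new portion, so the caller has to stitch the pieces together. The sketch below assumes a create callable that wraps client.messages.create and takes the message list; collect_full_text and max_rounds are hypothetical names, and the fake responses exist only so the example runs offline. Simple concatenation is itself an assumption, since Claude may not resume at the exact character where it stopped.

```python
from types import SimpleNamespace
from typing import Callable


def collect_full_text(create: Callable, messages: list, max_rounds: int = 3) -> str:
    """Call create(messages) until stop_reason is not "max_tokens", joining the text parts."""
    parts = []
    for _ in range(max_rounds):
        response = create(messages)
        text = "".join(b.text for b in response.content if b.type == "text")
        parts.append(text)
        if response.stop_reason != "max_tokens" or not text:
            break
        # Feed the partial answer back and ask for the rest
        messages.append({"role": "assistant", "content": text})
        messages.append({"role": "user", "content": "Please continue."})
    return "".join(parts)


# Fake responses standing in for client.messages.create.
_fake = iter([
    SimpleNamespace(stop_reason="max_tokens",
                    content=[SimpleNamespace(type="text", text="Once upon ")]),
    SimpleNamespace(stop_reason="end_turn",
                    content=[SimpleNamespace(type="text", text="a time.")]),
])
history = [{"role": "user", "content": "Tell me a story."}]
print(collect_full_text(lambda msgs: next(_fake), history))  # Once upon a time.
```

With a real client, create could be a lambda that forwards msgs to client.messages.create along with your model and max_tokens settings.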
Pro tip: For very long outputs, consider using streaming to show partial results to the user while you request continuations in the background.
stop_sequence: Custom Stop Triggered
If you defined custom stop_sequences in your API request, Claude will stop when it encounters one. The stop_sequence field will contain the matched sequence.
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    stop_sequences=["\n\nHuman:"],
    messages=[{"role": "user", "content": "Tell me a short story."}]
)

if response.stop_reason == "stop_sequence":
    print(f"Stopped at sequence: {response.stop_sequence}")
    # The content ends right before the stop sequence
    print(response.content[0].text)
This is useful for:
- Building chatbots that stop before generating a user turn.
- Extracting structured data by stopping at delimiters.
- Preventing Claude from continuing beyond a certain point.
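As a sketch of the delimiter use case, suppose the prompt asks Claude to emit a bare JSON object followed by a closing marker, and that marker is passed in stop_sequences. The marker "</json>", the function parse_delimited_json, and the sample payload are all assumptions for illustration. The fake response object stands in for a real API response.

```python
import json
from types import SimpleNamespace

END_MARKER = "</json>"  # hypothetical delimiter; pass it as stop_sequences=[END_MARKER]


def parse_delimited_json(response, marker: str = END_MARKER) -> dict:
    """Parse the text as JSON, but only if generation stopped at the expected marker."""
    if response.stop_reason != "stop_sequence" or response.stop_sequence != marker:
        raise ValueError(f"Expected stop at {marker!r}, got {response.stop_reason!r}")
    text = "".join(b.text for b in response.content if b.type == "text")
    return json.loads(text)


fake = SimpleNamespace(
    stop_reason="stop_sequence",
    stop_sequence="</json>",
    content=[SimpleNamespace(type="text", text='{"city": "Paris", "temp_c": 18}')],
)
print(parse_delimited_json(fake))  # {'city': 'Paris', 'temp_c': 18}
```

Checking stop_reason first guards against parsing output that was truncated by max_tokens before the marker was reached.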
Building a Robust Response Handler
Here's a complete example that handles all four stop reasons:
from anthropic import Anthropic

client = Anthropic()

def process_response(client, messages, tools=None):
    # Only pass tools when some are defined
    kwargs = {"tools": tools} if tools else {}
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=messages,
        **kwargs
    )
    if response.stop_reason == "end_turn":
        if not response.content:
            # Handle empty response
            messages.append({"role": "user", "content": "Please continue."})
            return process_response(client, messages, tools)
        return response.content[0].text
    elif response.stop_reason == "tool_use":
        # Keep Claude's tool request in the history before answering it
        messages.append({"role": "assistant", "content": response.content})
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                # execute_tool is a placeholder you must implement
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(result)
                })
        # Send every result back in a single user message
        messages.append({"role": "user", "content": tool_results})
        return process_response(client, messages, tools)
    elif response.stop_reason == "max_tokens":
        # Note: only the text of the final continuation is returned
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": "Please continue."})
        return process_response(client, messages, tools)
    elif response.stop_reason == "stop_sequence":
        # Content is complete up to the stop sequence
        return response.content[0].text
    else:
        raise ValueError(f"Unknown stop_reason: {response.stop_reason}")
Common Pitfalls and Best Practices
1. Don't Ignore tool_use
If you ignore a tool_use stop reason and just display the response, Claude's tool call will be lost. Always check for tool blocks.
2. Avoid Infinite Tool Loops
Set a maximum number of tool call iterations (e.g., 10) to prevent runaway loops.
MAX_TOOL_CALLS = 10
tool_call_count = 0
while response.stop_reason == "tool_use" and tool_call_count < MAX_TOOL_CALLS:
    # handle the tool call and assign the follow-up result to response
    tool_call_count += 1
3. Stream for Long Responses
For max_tokens scenarios, streaming gives users immediate feedback and lets you handle continuations gracefully.
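As a minimal sketch, the Python SDK's client.messages.stream helper can print text as it arrives and still expose the final stop_reason. The function name stream_and_report is an assumption, and the client is passed in as a parameter so the caller decides how it is constructed.

```python
def stream_and_report(client, messages, model="claude-sonnet-4-20250514", max_tokens=1024):
    """Print text chunks as they arrive, then return (full_text, stop_reason)."""
    chunks = []
    with client.messages.stream(model=model, max_tokens=max_tokens,
                                messages=messages) as stream:
        for text in stream.text_stream:
            print(text, end="", flush=True)  # show partial output immediately
            chunks.append(text)
        final = stream.get_final_message()  # full message, including stop_reason
    return "".join(chunks), final.stop_reason
```

If the returned stop_reason is "max_tokens", the caller can append the collected text as an assistant message and request a continuation while the user is already reading the first part.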
4. Validate Tool Results
Always ensure tool results are properly formatted and include the correct tool_use_id. A mismatched ID typically causes the request to be rejected or the result to be ignored.
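A pre-flight check along these lines can catch ID mismatches before the request is sent. It is a sketch that assumes both messages are plain dicts with list content; unmatched_tool_use_ids is a hypothetical helper, not an SDK function.

```python
def unmatched_tool_use_ids(assistant_message: dict, user_message: dict) -> set:
    """Return tool_use ids from the assistant turn that lack a tool_result in the user turn."""
    requested = {
        block["id"] for block in assistant_message["content"]
        if block.get("type") == "tool_use"
    }
    answered = {
        block["tool_use_id"] for block in user_message["content"]
        if block.get("type") == "tool_result"
    }
    return requested - answered


assistant = {"role": "assistant", "content": [
    {"type": "tool_use", "id": "toolu_123", "name": "calculator",
     "input": {"a": 1, "b": 2}},
]}
good = {"role": "user", "content": [
    {"type": "tool_result", "tool_use_id": "toolu_123", "content": "3"}]}
bad = {"role": "user", "content": [
    {"type": "tool_result", "tool_use_id": "toolu_999", "content": "3"}]}

print(unmatched_tool_use_ids(assistant, good))  # set()
print(unmatched_tool_use_ids(assistant, bad))   # {'toolu_123'}
```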
Key Takeaways
- stop_reason is your guide: It tells you exactly why Claude stopped, enabling you to build the correct next step in your application logic.
- Empty responses with end_turn are usually caused by adding text after tool results. Fix by sending only tool_result blocks, or add a continuation prompt.
- tool_use requires a loop: Your application must execute the tool and return results in a user message. Always limit the number of iterations.
- max_tokens means partial output: Increase the limit or use continuation prompts. Streaming helps manage user expectations.
- stop_sequence gives you control: Use custom stop sequences to prevent unwanted generation or extract structured data.