Mastering Claude API Stop Reasons: A Practical Guide to Handling Response Endings
Learn how to interpret and handle the five stop_reason values in Claude's Messages API responses (end_turn, tool_use, max_tokens, stop_sequence, content_filtered). This guide shows how to handle each one in your code, how to prevent empty responses, and how to build robust multi-turn conversations.
Introduction
When you make a request to the Claude API, the response includes a stop_reason field that tells you why the model stopped generating. Understanding these values is essential for building applications that handle different response types correctly—whether you're building a chatbot, a tool-using agent, or a content generation pipeline.
Unlike API errors (which indicate a failure in processing your request), stop_reason is part of every successful response. It gives you insight into Claude's internal decision-making and helps you decide what to do next.
This guide covers all five stop_reason values, how to handle them in code, and how to avoid common pitfalls like empty responses.
---
The stop_reason Field
The stop_reason field appears in every successful Messages API response. Here's a typical example:
```json
{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Here's the answer to your question..."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50
  }
}
```
There are five possible values for stop_reason:
| Value | Meaning |
|---|---|
| `end_turn` | Claude finished its response naturally |
| `tool_use` | Claude wants to call a tool |
| `max_tokens` | The response hit the `max_tokens` limit |
| `stop_sequence` | A custom stop sequence was encountered |
| `content_filtered` | The response was filtered by content moderation |
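As a quick orientation before the detailed sections below, the five values map naturally to follow-up actions. The action labels in this sketch (`done`, `run_tools`, and so on) are illustrative names for this guide, not part of the API:

```python
def next_action(stop_reason: str) -> str:
    """Map a stop_reason to the follow-up an application should take."""
    actions = {
        "end_turn": "done",             # use the response as-is
        "tool_use": "run_tools",        # execute tools, then call the API again
        "max_tokens": "continue",       # response is truncated; continue or retry
        "stop_sequence": "done",        # custom stop sequence reached
        "content_filtered": "fallback", # return a safe fallback message
    }
    return actions.get(stop_reason, "error")  # surface unknown values as errors
```

Each of these actions is covered in detail in the sections that follow.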
---
end_turn: The Natural Completion
end_turn is the most common stop reason. It means Claude has finished its response and has nothing more to say. This is the ideal outcome for simple Q&A or single-turn conversations.
Handling end_turn in Python
```python
from anthropic import Anthropic

client = Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

if response.stop_reason == "end_turn":
    # Process the complete response
    print(response.content[0].text)
```
The Empty Response Problem
Sometimes Claude returns an empty response (often just 2–3 output tokens with no content blocks) with `stop_reason: "end_turn"`. This typically happens when Claude concludes that the assistant turn is already complete, particularly after tool results. Two common causes:

- Adding text blocks immediately after tool results (Claude learns to expect the user to always insert text after tool results, so it ends its turn to follow the pattern)
- Sending Claude's completed response back without adding anything (Claude already decided it's done, so it will remain done)
```python
# INCORRECT: Adding text immediately after tool_result
messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {"role": "assistant", "content": [
        {
            "type": "tool_use",
            "id": "toolu_123",
            "name": "calculator",
            "input": {"operation": "add", "a": 1234, "b": 5678}
        }
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"}  # Don't add text after tool_result
    ]}
]
```

```python
# CORRECT: Send tool results directly, without additional text
messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {"role": "assistant", "content": [
        {
            "type": "tool_use",
            "id": "toolu_123",
            "name": "calculator",
            "input": {"operation": "add", "a": 1234, "b": 5678}
        }
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"}
    ]}  # Just the tool_result, no additional text
]
```
If you still get empty responses after fixing the above, implement a retry loop:
```python
def handle_empty_response(client, messages):
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=messages
    )
    # Check if the response came back empty
    if response.stop_reason == "end_turn" and not response.content:
        # Add a gentle nudge and retry
        messages.append({"role": "user", "content": "Please continue."})
        return client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=messages
        )
    return response
```
---
tool_use: Claude Wants to Call a Tool
When Claude decides it needs to use a tool (like a calculator, database query, or web search), it returns stop_reason: "tool_use" along with one or more tool_use content blocks.
Handling tool_use in Python
```python
tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {"type": "string"}
            },
            "required": ["location"]
        }
    }
]

messages = [
    {"role": "user", "content": "What's the weather in Tokyo?"}
]

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=tools,
    messages=messages
)

if response.stop_reason == "tool_use":
    # Echo Claude's turn (including its tool_use blocks) back first;
    # every tool_result must follow a matching tool_use in the history
    messages.append({"role": "assistant", "content": response.content})

    # Extract and execute each tool call
    tool_results = []
    for block in response.content:
        if block.type == "tool_use":
            # Execute the tool (your implementation)
            result = execute_tool(block.name, block.input)
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": str(result)
            })

    # Append all tool results in a single user message
    messages.append({"role": "user", "content": tool_results})

    # Continue the conversation
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        tools=tools,
        messages=messages
    )
```
Important: After handling a tool_use stop, you must continue the conversation: append Claude's assistant message (with its tool_use blocks), then a user message containing the tool results, and make another API call. Claude will then either produce a final answer or request additional tool calls.
---
max_tokens: Hit the Token Limit
When Claude's response reaches the max_tokens limit you set, it stops with stop_reason: "max_tokens". This is common for long-form content generation or when Claude is in the middle of a thought.
Handling max_tokens
```python
messages = [
    {"role": "user", "content": "Write a detailed essay about AI safety."}
]

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=100,  # Deliberately low for demonstration
    messages=messages
)

if response.stop_reason == "max_tokens":
    # The response is truncated
    partial_text = response.content[0].text

    # Option 1: Continue from where it left off
    messages.append({"role": "assistant", "content": partial_text})
    messages.append({"role": "user", "content": "Please continue."})
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1000,  # Increase the limit
        messages=messages
    )

    # Option 2: Increase max_tokens and retry the original request.
    # This is simpler but may produce slightly different output.
```
Best practice: Set max_tokens generously (e.g., 4096 or higher) for tasks that might require long responses. For streaming applications, handle max_tokens by continuing the conversation.
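When streaming, you only learn the `stop_reason` once the stream is drained; the anthropic Python SDK's `client.messages.stream()` helper exposes it via `get_final_message()`. A minimal sketch with a hypothetical `collect_stream` helper (not part of the SDK):

```python
def collect_stream(stream):
    """Drain a message stream and report whether the result was truncated."""
    chunks = []
    for text in stream.text_stream:
        chunks.append(text)  # in a real app, forward each chunk to your UI
    final = stream.get_final_message()
    truncated = final.stop_reason == "max_tokens"
    return "".join(chunks), truncated

# Usage (assumes `client` from the earlier examples):
# with client.messages.stream(
#     model="claude-sonnet-4-20250514",
#     max_tokens=1024,
#     messages=[{"role": "user", "content": "Write a long essay."}],
# ) as stream:
#     text, truncated = collect_stream(stream)
#     if truncated:
#         ...  # continue the conversation as shown above
```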
---
stop_sequence: Custom Stop Sequences
If you define custom stop sequences in your API request, Claude will stop generating when it encounters one. This is useful for structured outputs like JSON or XML.
Example with Stop Sequences
```python
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    stop_sequences=["\n\n---END---"],
    messages=[
        {"role": "user", "content": "List three colors and end with ---END---"}
    ]
)

if response.stop_reason == "stop_sequence":
    print(f"Stopped at sequence: {response.stop_sequence}")
    print(response.content[0].text)
```
Use cases:
- Extracting structured data (JSON, XML)
- Controlling output length in specific formats
- Building multi-step generation pipelines
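For the structured-data use case, one common pattern is to prefill the assistant turn with an opening brace so Claude emits bare JSON; the prefilled prefix must be re-attached before parsing. A sketch, where the prompt and the `parse_prefilled_json` helper are illustrative:

```python
import json

def parse_prefilled_json(prefill: str, completion: str) -> dict:
    """Re-attach the prefilled prefix before parsing Claude's completion."""
    return json.loads(prefill + completion)

# Usage against the API (assumes `client` from the earlier examples):
# response = client.messages.create(
#     model="claude-sonnet-4-20250514",
#     max_tokens=256,
#     stop_sequences=["\n\n"],  # stop once the JSON object is complete
#     messages=[
#         {"role": "user", "content": 'Name one color as JSON: {"color": ...}'},
#         {"role": "assistant", "content": "{"},  # prefill steers output to JSON
#     ],
# )
# data = parse_prefilled_json("{", response.content[0].text)
```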
---
content_filtered: Content Moderation
This stop reason indicates that Claude's response was filtered by content moderation systems. This is rare but can happen if the model generates content that violates safety policies.
Handling content_filtered
```python
import logging

logger = logging.getLogger(__name__)

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Generate harmful content"}
    ]
)

if response.stop_reason == "content_filtered":
    # Log the incident for review
    logger.warning(f"Content filtered for request: {response.id}")
    # Show a safe fallback to the user
    reply = "I'm sorry, I cannot generate that type of content."
```
Note: If you encounter frequent content_filtered responses, review your prompts and system instructions to ensure they align with Claude's usage policies.
---
Building a Robust Response Handler
Here's a complete example that handles all stop reasons in a single function:
```python
def handle_claude_response(response, client, messages, tools=None):
    """Handle all possible stop_reason values."""
    if response.stop_reason == "end_turn":
        # Natural completion
        if response.content:
            return response.content[0].text
        # Empty response - retry with a nudge
        messages.append({"role": "user", "content": "Please continue."})
        new_response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            tools=tools,
            messages=messages
        )
        return handle_claude_response(new_response, client, messages, tools)

    elif response.stop_reason == "tool_use":
        # Echo Claude's turn back, then execute tools and continue
        messages.append({"role": "assistant", "content": response.content})
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(result)
                })
        messages.append({"role": "user", "content": tool_results})
        new_response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            tools=tools,
            messages=messages
        )
        return handle_claude_response(new_response, client, messages, tools)

    elif response.stop_reason == "max_tokens":
        # Continue from where we left off
        messages.append({"role": "assistant", "content": response.content[0].text})
        messages.append({"role": "user", "content": "Please continue."})
        new_response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,  # Increase the limit
            tools=tools,
            messages=messages
        )
        return handle_claude_response(new_response, client, messages, tools)

    elif response.stop_reason == "stop_sequence":
        # Custom stop sequence encountered
        return response.content[0].text

    elif response.stop_reason == "content_filtered":
        # Content moderation triggered
        return "I cannot generate that content. Please rephrase your request."

    else:
        # Unknown stop reason (shouldn't happen)
        raise ValueError(f"Unknown stop_reason: {response.stop_reason}")
```
---
Best Practices Summary
- Always check `stop_reason` – Don't assume a response is complete just because you got a 200 status code.
- Handle `tool_use` as a loop – Continue making API calls until you get `end_turn` or `max_tokens`.
- Set an appropriate `max_tokens` – For long-form content, use at least 4096 tokens.
- Avoid empty responses – Don't add text blocks after `tool_result` content blocks.
- Log `content_filtered` – Monitor for policy violations and adjust prompts accordingly.
Key Takeaways
- Five stop reasons exist: `end_turn`, `tool_use`, `max_tokens`, `stop_sequence`, and `content_filtered` – each requires different handling logic.
- Empty responses with `end_turn` are usually caused by adding text after tool results; fix them by sending only the `tool_result` block.
- `tool_use` requires a loop: execute the tool, append the result, and make another API call until Claude finishes.
- `max_tokens` means truncation: either increase the limit and retry, or continue the conversation by appending the partial response and a "please continue" message.
- Build a unified handler that recursively processes responses until a natural completion (`end_turn`) or a final output is reached.