Mastering Claude's Stop Reasons: A Practical Guide to Handling API Responses
Learn how to interpret and handle Claude's stop_reason field in the Messages API, including end_turn, tool_use, max_tokens, and stop_sequence scenarios with practical code examples.
This guide explains Claude's stop_reason field — end_turn, tool_use, max_tokens, and stop_sequence — and how to handle each in your application. You'll learn to detect empty responses, manage tool calls, handle token limits, and build robust conversational loops.
Introduction
When you send a request to Claude via the Messages API, the response includes a stop_reason field. This field tells you why Claude stopped generating. It does not signal an error; it describes how a successful response ended, whether naturally or because of a limit you set. Understanding these values is essential for building reliable, production-grade applications.
This guide covers four stop_reason values (end_turn, tool_use, max_tokens, and stop_sequence), common pitfalls such as empty responses, and how to handle each scenario in Python.
The stop_reason Field
The stop_reason field appears in every successful Messages API response. Unlike errors (which indicate a failed request), stop_reason explains why Claude finished its response generation.
Here’s a typical response structure:
{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Here's the answer to your question..."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50
  }
}
This guide covers four values of stop_reason:
- end_turn
- tool_use
- max_tokens
- stop_sequence
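One way to keep the handling explicit is to map each value to the step your application takes next. The sketch below is only an illustration: `NEXT_ACTION`, `next_action`, and the action labels are hypothetical names chosen here, not part of the API.

```python
# Hypothetical mapping from stop_reason to the application's next step.
# The action labels are illustrative and are not API values.
NEXT_ACTION = {
    "end_turn": "return_response",
    "tool_use": "run_tools_and_continue",
    "max_tokens": "continue_or_raise_limit",
    "stop_sequence": "return_response",
}


def next_action(stop_reason: str) -> str:
    """Return the planned action, with a safe default for unrecognized values."""
    return NEXT_ACTION.get(stop_reason, "inspect_manually")


print(next_action("tool_use"))  # run_tools_and_continue
```

Falling back to a default rather than raising keeps the application from crashing on a value it does not recognize.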
end_turn — Natural Completion
end_turn is the most common stop reason. It means Claude finished its response naturally — it decided it had answered the user’s query completely.
Basic Handling
from anthropic import Anthropic

client = Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)
if response.stop_reason == "end_turn":
    # Process the complete response
    print(response.content[0].text)
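Indexing `content[0]` assumes the first block is a text block. A slightly more defensive variant, sketched below, joins every text block instead. `extract_text` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for SDK content blocks so the snippet runs without an API call.

```python
from types import SimpleNamespace


def extract_text(content) -> str:
    """Concatenate the text of all text blocks, skipping other block types."""
    return "".join(
        block.text for block in content
        if getattr(block, "type", None) == "text"
    )


# Stand-in for response.content; real blocks come from the SDK.
content = [
    SimpleNamespace(type="text", text="Hello! "),
    SimpleNamespace(type="tool_use", id="toolu_123", name="calculator", input={}),
    SimpleNamespace(type="text", text="How can I help?"),
]
print(extract_text(content))  # Hello! How can I help?
```

With a real response, the call would be `extract_text(response.content)`.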
Empty Responses with end_turn
Sometimes Claude returns an empty response (2–3 tokens with no content) with stop_reason: "end_turn". This typically happens when Claude judges that the assistant turn is already complete, especially after tool results. Common causes:
- Adding text blocks immediately after tool results (Claude learns to expect the user to always insert text after tool results, so it ends its turn to follow the pattern)
- Sending Claude’s completed response back without adding anything (Claude already decided it’s done, so it will remain done)
# INCORRECT: Adding text immediately after tool_result
messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {
        "role": "assistant",
        "content": [
            {
                "type": "tool_use",
                "id": "toolu_123",
                "name": "calculator",
                "input": {"operation": "add", "a": 1234, "b": 5678}
            }
        ]
    },
    {
        "role": "user",
        "content": [
            {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
            {"type": "text", "text": "Here's the result"}  # Don't add text after tool_result
        ]
    }
]
# CORRECT: Send tool results directly without additional text
messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {
        "role": "assistant",
        "content": [
            {
                "type": "tool_use",
                "id": "toolu_123",
                "name": "calculator",
                "input": {"operation": "add", "a": 1234, "b": 5678}
            }
        ]
    },
    {
        "role": "user",
        "content": [
            {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"}
        ]  # Just the tool_result, no additional text
    }
]
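A small check can catch the text-after-tool_result pattern before a request is sent. The function below is a sketch under the assumption that messages are plain dictionaries shaped like the examples above; `text_after_tool_result` is a hypothetical helper name, not an SDK function.

```python
def text_after_tool_result(messages) -> list[int]:
    """Return indexes of user messages that place a text block after a tool_result."""
    flagged = []
    for index, message in enumerate(messages):
        content = message.get("content")
        # Plain-string content cannot contain tool_result blocks.
        if message.get("role") != "user" or not isinstance(content, list):
            continue
        seen_tool_result = False
        for block in content:
            if block.get("type") == "tool_result":
                seen_tool_result = True
            elif block.get("type") == "text" and seen_tool_result:
                flagged.append(index)
                break
    return flagged


risky = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"},
    ]},
]
print(text_after_tool_result(risky))  # [1]
```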
If you still get empty responses after fixing the above, use a continuation prompt:
def handle_empty_response(client, messages):
    response = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=1024,
        messages=messages
    )
    # Check if response is empty
    if response.stop_reason == "end_turn" and not response.content:
        # Add a continuation prompt in a NEW user message
        messages.append({
            "role": "user",
            "content": "Please continue with your response."
        })
        return client.messages.create(
            model="claude-opus-4-7",
            max_tokens=1024,
            messages=messages
        )
    return response
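The `not response.content` check only catches a response with no content blocks at all. If you also want to treat whitespace-only text as empty, a stricter test could look like the sketch below. That broader definition of "empty" is an assumption made here, `is_effectively_empty` is a hypothetical helper, and `SimpleNamespace` objects stand in for SDK responses.

```python
from types import SimpleNamespace


def is_effectively_empty(response) -> bool:
    """True when an end_turn response has no blocks, or only blank text blocks."""
    if response.stop_reason != "end_turn":
        return False
    for block in response.content:
        if block.type != "text":
            return False  # any non-text block (e.g. tool_use) counts as content
        if block.text.strip():
            return False  # found visible text
    return True


blank = SimpleNamespace(
    stop_reason="end_turn",
    content=[SimpleNamespace(type="text", text="\n")],
)
print(is_effectively_empty(blank))  # True
```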
tool_use — Claude Wants to Call a Tool
When stop_reason is tool_use, Claude has decided to use a tool. The response will contain one or more tool_use content blocks. Your application must execute the tool and return the result.
Handling Tool Calls
def handle_tool_call(client, messages):
    # Define the tools once so both requests send the same list
    tools = [
        {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "input_schema": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                },
                "required": ["location"]
            }
        }
    ]
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        tools=tools,
        messages=messages
    )
    if response.stop_reason == "tool_use":
        # Extract the first tool_use block (see Parallel Tool Calls for several)
        tool_use_block = next(
            block for block in response.content
            if block.type == "tool_use"
        )
        # Execute the tool (your implementation)
        tool_result = execute_tool(tool_use_block.name, tool_use_block.input)
        # Append assistant response and tool result to messages
        messages.append({"role": "assistant", "content": response.content})
        messages.append({
            "role": "user",
            "content": [
                {
                    "type": "tool_result",
                    "tool_use_id": tool_use_block.id,
                    "content": tool_result
                }
            ]
        })
        # Continue the conversation with the same tool definitions
        return client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            tools=tools,
            messages=messages
        )
    return response
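The examples in this guide call `execute_tool` without defining it, since the implementation is up to your application. One possible shape is a registry that maps tool names to Python functions. Everything in the sketch below is hypothetical: the `TOOL_REGISTRY` name, the placeholder `get_weather` body, and the choice to raise `KeyError` for unknown tools.

```python
def get_weather(location: str) -> str:
    """Placeholder tool; a real version would call a weather service."""
    return f"Weather lookup for {location} is not implemented in this sketch."


# Hypothetical registry mapping tool names to Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def execute_tool(name: str, tool_input: dict) -> str:
    """Look up the tool by name and call it with the model-supplied input."""
    if name not in TOOL_REGISTRY:
        raise KeyError(f"Unknown tool: {name}")
    # The input dict's keys follow the tool's input_schema properties.
    return TOOL_REGISTRY[name](**tool_input)


print(execute_tool("get_weather", {"location": "Paris"}))
```

Keeping the registry keys identical to the `name` values in your tool definitions keeps the lookup and the schema in step.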
Parallel Tool Calls
Claude can request multiple tools in a single response. Handle this by iterating over all tool_use blocks:
def handle_parallel_tools(client, messages, tools):
    # tools: the same tool definitions for both requests, passed in by the caller
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        tools=tools,
        messages=messages
    )
    if response.stop_reason == "tool_use":
        # Collect one tool_result per tool_use block
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": result
                })
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})
        return client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            tools=tools,
            messages=messages
        )
    return response
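When several tools run in one turn, an exception in one of them would otherwise discard the results of the others. A more tolerant variant, sketched below, reports the failure through the `is_error` field of the `tool_result` block. `build_tool_results` and `run_tool` are hypothetical names, and `SimpleNamespace` objects stand in for SDK content blocks.

```python
from types import SimpleNamespace


def build_tool_results(content, run_tool):
    """Run every tool_use block and return one tool_result dict per block."""
    results = []
    for block in content:
        if block.type != "tool_use":
            continue
        result = {"type": "tool_result", "tool_use_id": block.id}
        try:
            result["content"] = str(run_tool(block.name, block.input))
        except Exception as exc:
            # Report the failure to Claude instead of aborting the whole turn.
            result["content"] = f"Error: {exc}"
            result["is_error"] = True
        results.append(result)
    return results


def demo_tool(name, tool_input):
    """Stand-in tool runner: one tool works, the other fails."""
    if name == "broken":
        raise ValueError("service unavailable")
    return "sunny"


blocks = [
    SimpleNamespace(type="tool_use", id="toolu_1", name="get_weather", input={}),
    SimpleNamespace(type="tool_use", id="toolu_2", name="broken", input={}),
]
for item in build_tool_results(blocks, demo_tool):
    print(item)
```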
max_tokens — Token Limit Reached
When stop_reason is max_tokens, Claude hit the max_tokens limit you set. The response is truncated. This often happens with long outputs or when Claude is in the middle of a thought.
Handling Truncated Responses
def handle_max_tokens(client, messages, response):
    if response.stop_reason == "max_tokens":
        # Append the partial response to messages
        messages.append({"role": "assistant", "content": response.content})
        # Add a continuation prompt
        messages.append({
            "role": "user",
            "content": "Please continue from where you left off."
        })
        # Retry with higher max_tokens
        return client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,  # Increase limit
            messages=messages
        )
    return response
Best practice: If you frequently hit max_tokens, increase the limit or implement a loop that continues until end_turn.
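A continuation loop of this kind might look like the sketch below. It is written against a `create` callable rather than the client itself so it can be exercised without network access. `collect_until_complete`, `create`, and `max_rounds` are hypothetical names, and the cap on rounds is an added safeguard rather than something the API requires.

```python
from types import SimpleNamespace


def collect_until_complete(create, messages, max_rounds=5):
    """Call create(messages) until stop_reason is no longer max_tokens."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    messages = list(messages)  # work on a copy of the caller's list
    parts = []
    for _ in range(max_rounds):
        response = create(messages)
        parts.append("".join(b.text for b in response.content if b.type == "text"))
        if response.stop_reason != "max_tokens":
            break
        # Feed the partial answer back and ask for the rest.
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": "Please continue from where you left off."})
    return "".join(parts), response.stop_reason


# Fake responses standing in for two API calls.
fake = iter([
    SimpleNamespace(stop_reason="max_tokens",
                    content=[SimpleNamespace(type="text", text="Once upon ")]),
    SimpleNamespace(stop_reason="end_turn",
                    content=[SimpleNamespace(type="text", text="a time.")]),
])
text, reason = collect_until_complete(
    lambda msgs: next(fake), [{"role": "user", "content": "Tell me a story"}]
)
print(text, reason)  # Once upon a time. end_turn
```

In real use, `create` could be a small lambda wrapping `client.messages.create` with your model and max_tokens. The joined text may need cleanup at the seams, since a continuation does not always resume mid-sentence.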
stop_sequence — Custom Stop Sequence Triggered
If you set a stop_sequences parameter in your API request, Claude will stop when it encounters one of those sequences. The stop_reason will be stop_sequence, and the stop_sequence field will contain the matched sequence.
Handling Stop Sequences
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    stop_sequences=["\n\nHuman:", "\n\nAssistant:"],
    messages=[{"role": "user", "content": "Tell me a story"}]
)
if response.stop_reason == "stop_sequence":
    print(f"Stopped at sequence: {response.stop_sequence}")
    # The content ends right before the stop sequence
    print(response.content[0].text)
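To make the behavior concrete, the sketch below simulates the truncation locally. It does not call the API and is not a description of how the service is implemented; `truncate_at_stop_sequence` is a hypothetical helper that only mirrors the observable result: text ending just before the match, plus the sequence that matched.

```python
def truncate_at_stop_sequence(text, stop_sequences):
    """Return (text_before_match, matched_sequence), or (text, None) if none match."""
    earliest = None
    for sequence in stop_sequences:
        position = text.find(sequence)
        # Keep whichever sequence appears first in the text.
        if position != -1 and (earliest is None or position < earliest[0]):
            earliest = (position, sequence)
    if earliest is None:
        return text, None
    return text[:earliest[0]], earliest[1]


story, matched = truncate_at_stop_sequence(
    "Once upon a time.\n\nHuman: and then?",
    ["\n\nHuman:", "\n\nAssistant:"],
)
print(repr(story), repr(matched))  # 'Once upon a time.' '\n\nHuman:'
```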
Building a Complete Conversation Loop
Here’s a robust loop that handles all stop reasons:
def complete_conversation(client, messages, tools=None, max_tokens=4096):
    # Only pass tools when provided, so the request never carries tools=None
    extra = {"tools": tools} if tools else {}
    while True:
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=max_tokens,
            messages=messages,
            **extra
        )
        if response.stop_reason == "end_turn":
            # Natural completion: return the final response
            return response
        elif response.stop_reason == "tool_use":
            # Execute tools and continue
            messages.append({"role": "assistant", "content": response.content})
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result
                    })
            messages.append({"role": "user", "content": tool_results})
        elif response.stop_reason == "max_tokens":
            # Append partial response and continue
            messages.append({"role": "assistant", "content": response.content})
            messages.append({
                "role": "user",
                "content": "Please continue."
            })
        elif response.stop_reason == "stop_sequence":
            # Stop sequence reached: return what we have
            return response
        else:
            # Any other stop reason: return rather than loop on the same request
            return response
Key Takeaways
- end_turn means Claude finished naturally. Watch for empty responses after tool calls; fix them by not adding text after tool_result blocks.
- tool_use means Claude wants to call a tool. Execute the tool and return the result in a new user message with tool_result blocks.
- max_tokens means the response was truncated. Increase max_tokens or implement a continuation loop.
- stop_sequence means a custom stop sequence was matched. The stop_sequence field tells you which one.
- Always check stop_reason before processing content, because it determines your next action in the conversation loop.