Mastering Claude's Stop Reasons: Build Robust API Applications
This guide explains Claude's stop_reason field (end_turn, tool_use, max_tokens, stop_sequence) and how to handle each in your application. You'll learn to prevent empty responses, manage tool calls, and build robust conversational flows.
Introduction
When you send a request to Claude via the Messages API, the response includes a stop_reason field that tells you why the model stopped generating. Understanding these values is essential for building applications that handle different response types correctly—whether it's a natural conversation end, a tool call, or a token limit hit.
This guide covers every stop reason, how to handle them in code, common pitfalls like empty responses, and best practices for production applications.
The stop_reason Field
The stop_reason field appears in every successful Messages API response. Unlike errors (which indicate request failures), stop_reason tells you why Claude completed its response generation successfully.
Example Response
```json
{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Here's the answer to your question..."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50
  }
}
```
Stop Reason Values
end_turn
The most common stop reason. Indicates Claude finished its response naturally—it decided the assistant's turn was complete.
How to handle:

```python
from anthropic import Anthropic

client = Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)

if response.stop_reason == "end_turn":
    # Process the complete response
    print(response.content[0].text)
```
Empty Responses with end_turn
Sometimes Claude returns an empty response (2–3 tokens with no content) with stop_reason: "end_turn". This typically happens when Claude decides the assistant turn is already complete, particularly after tool results. Two common causes:

- Adding text blocks immediately after tool results (Claude learns to expect the user to always insert text after tool results, so it ends its turn to match the pattern)
- Sending Claude's completed response back without adding anything (Claude already decided it's done, so it stays done)
```python
# INCORRECT: Adding text immediately after tool_result
messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {"role": "assistant", "content": [
        {
            "type": "tool_use",
            "id": "toolu_123",
            "name": "calculator",
            "input": {"operation": "add", "a": 1234, "b": 5678}
        }
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"}  # Don't add text after tool_result
    ]}
]
```
```python
# CORRECT: Send tool results directly without additional text
messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {"role": "assistant", "content": [
        {
            "type": "tool_use",
            "id": "toolu_123",
            "name": "calculator",
            "input": {"operation": "add", "a": 1234, "b": 5678}
        }
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"}
    ]}  # Just the tool_result, no additional text
]
```
If you still get empty responses after fixing the above, implement a retry loop:
```python
def handle_empty_response(client, messages):
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=messages
    )
    if response.stop_reason == "end_turn" and not response.content:
        # Retry with a prompt that encourages a response
        messages.append({"role": "user", "content": "Please continue."})
        return client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=messages
        )
    return response
```
tool_use
Indicates Claude wants to call one or more tools. The response content will contain tool_use blocks with tool names and inputs.
```python
tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {"type": "string"}
            },
            "required": ["location"]
        }
    }
]
messages = [{"role": "user", "content": "What's the weather in Tokyo?"}]

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=tools,
    messages=messages
)

if response.stop_reason == "tool_use":
    # Echo the assistant turn, then answer every tool_use block
    # in a single user message
    messages.append({"role": "assistant", "content": response.content})
    tool_results = []
    for block in response.content:
        if block.type == "tool_use":
            # Execute the tool and collect its result
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": execute_tool(block.name, block.input)
            })
    messages.append({"role": "user", "content": tool_results})
    # Continue the conversation
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        tools=tools,
        messages=messages
    )
```
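Claude may chain several tool calls before it finishes, so production code typically wraps this exchange in a loop that runs until the stop reason changes. A minimal sketch, assuming `client` and an `execute_tool` dispatcher as above (the helper name `run_tool_loop` and the `max_rounds` safety cap are illustrative, not part of the SDK):

```python
def run_tool_loop(client, model, messages, tools, execute_tool, max_rounds=10):
    """Keep executing tools until Claude stops asking for them."""
    response = client.messages.create(
        model=model, max_tokens=1024, tools=tools, messages=messages
    )
    for _ in range(max_rounds):
        if response.stop_reason != "tool_use":
            break
        # Echo the assistant turn, then answer every tool_use block
        messages.append({"role": "assistant", "content": response.content})
        results = [
            {"type": "tool_result", "tool_use_id": block.id,
             "content": execute_tool(block.name, block.input)}
            for block in response.content if block.type == "tool_use"
        ]
        messages.append({"role": "user", "content": results})
        response = client.messages.create(
            model=model, max_tokens=1024, tools=tools, messages=messages
        )
    return response
```

The round cap guards against a runaway loop if the model keeps requesting tools.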
max_tokens
Indicates Claude's response was cut off because it reached the max_tokens limit you set. The response may be incomplete.
```python
if response.stop_reason == "max_tokens":
    # The response is truncated. Continue the conversation to get more.
    messages.append({"role": "assistant", "content": response.content})
    messages.append({"role": "user", "content": "Please continue."})
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,  # Consider increasing this
        messages=messages
    )
```
Best practice: If you frequently hit max_tokens, increase the limit or implement automatic continuation logic.
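One way to sketch that automatic continuation logic, assuming a plain-text exchange with no tool blocks in the truncated output (the helper name `generate_complete` and the `max_continues` cap are illustrative):

```python
def generate_complete(client, model, messages, max_tokens=1024, max_continues=5):
    """Accumulate text across responses until generation ends naturally."""
    parts = []
    for _ in range(max_continues + 1):
        response = client.messages.create(
            model=model, max_tokens=max_tokens, messages=messages
        )
        parts.append(response.content[0].text)
        if response.stop_reason != "max_tokens":
            break
        # Feed the partial answer back and ask the model to carry on
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": "Please continue."})
    return "".join(parts)
```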
stop_sequence
Indicates Claude stopped because it encountered one of the stop_sequences you specified in your API request. The stop_sequence field in the response will contain the actual sequence that triggered the stop.
```python
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    stop_sequences=["\n\nHuman:", "\n\nAssistant:"],
    messages=[{"role": "user", "content": "Tell me a story."}]
)

if response.stop_reason == "stop_sequence":
    print(f"Stopped at sequence: {response.stop_sequence}")
    # The content up to (but not including) the stop sequence
    print(response.content[0].text)
```
This is useful for role-playing or structured generation where you want to control when Claude stops.
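For structured generation, one pattern is to ask for a payload followed by an end marker and pass that marker as a stop sequence, so nothing trails the payload. A sketch; the `###` marker and the helper name `extract_json` are arbitrary choices for this example:

```python
import json

def extract_json(client, model, prompt):
    """Ask for a JSON object terminated by an end marker, then parse it."""
    response = client.messages.create(
        model=model,
        max_tokens=1024,
        stop_sequences=["###"],
        messages=[{
            "role": "user",
            "content": prompt + "\nReply with only a JSON object, then ### on its own line.",
        }],
    )
    # The stop sequence itself is excluded from the returned text
    return json.loads(response.content[0].text)
```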
Building a Robust Response Handler
Combine all stop reasons into a single handler for production applications:
```python
def handle_claude_response(client, response, messages, tools=None):
    """
    Handle all possible stop reasons from Claude.
    Returns the final response after processing.
    """
    # Give follow-up calls at least as much room as the last response used
    max_tokens = max(1024, response.usage.output_tokens * 2)

    if response.stop_reason == "end_turn":
        if not response.content:
            # Empty response - retry with a nudge
            messages.append({"role": "user", "content": "Please continue."})
            return client.messages.create(
                model=response.model,
                max_tokens=max_tokens,
                messages=messages
            )
        # Natural end - return the response
        return response
    elif response.stop_reason == "tool_use":
        # Execute tools and continue
        messages.append({"role": "assistant", "content": response.content})
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": execute_tool(block.name, block.input)
                })
        messages.append({"role": "user", "content": tool_results})
        return client.messages.create(
            model=response.model,
            max_tokens=max_tokens,
            tools=tools,
            messages=messages
        )
    elif response.stop_reason == "max_tokens":
        # Continue the conversation to recover the truncated tail
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": "Please continue."})
        return client.messages.create(
            model=response.model,
            max_tokens=max_tokens,
            messages=messages
        )
    elif response.stop_reason == "stop_sequence":
        # Handle as needed - content is complete up to the stop sequence
        return response
    else:
        raise ValueError(f"Unknown stop_reason: {response.stop_reason}")
```
Best Practices
- Always check stop_reason before processing response content. Don't assume end_turn means the response is complete; check for empty content.
- Handle tool_use explicitly in a loop. Claude may call multiple tools in one response, and each tool result must be sent back.
- Increase max_tokens if you frequently see max_tokens stop reasons. For long-form content, consider setting it to 4096 or higher.
- Use stop_sequences carefully; they can truncate responses mid-sentence if the sequence appears in generated text.
- Log stop reasons in production to monitor response patterns and detect issues early.
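A thin wrapper makes that logging habit easy to adopt. A sketch; the helper name `create_with_logging` and the counter are illustrative, not part of the SDK:

```python
import logging
from collections import Counter

logger = logging.getLogger("claude")
stop_reason_counts = Counter()

def create_with_logging(client, **kwargs):
    """Call the Messages API and record the stop_reason for monitoring."""
    response = client.messages.create(**kwargs)
    stop_reason_counts[response.stop_reason] += 1
    logger.info("stop_reason=%s output_tokens=%s",
                response.stop_reason, response.usage.output_tokens)
    return response
```

Periodically inspecting `stop_reason_counts` (or shipping the log lines to your metrics system) surfaces issues like a sudden spike in max_tokens truncations.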
Key Takeaways
- Claude returns four stop reasons: end_turn (natural end), tool_use (wants to call a tool), max_tokens (response truncated), and stop_sequence (custom stop triggered).
- Empty responses with end_turn are common after tool results; prevent them by sending tool results without additional text, or implement a retry mechanism.
- tool_use requires a loop: execute the tool, send results back, and continue the conversation until Claude returns end_turn.
- max_tokens means incomplete output: always continue the conversation or increase the token limit to get the full response.
- Build a unified handler that processes all stop reasons to create robust, production-ready applications.