Mastering Claude API Stop Reasons: A Practical Guide to Handling Response Termination
Learn how to interpret and handle Claude API stop_reason values like end_turn, tool_use, and max_tokens to build robust, production-ready applications.
This guide explains Claude's stop_reason field—why the model stops generating (end_turn, tool_use, max_tokens, stop_sequence)—and provides actionable code examples for handling each case, including empty responses and tool loops.
Introduction
When you send a request to the Claude API, the response includes a stop_reason field that tells you why the model stopped generating. This isn't an error—it's a signal. Understanding these signals is essential for building applications that respond intelligently to different scenarios, from natural conversation endings to tool call requests.
In this guide, you'll learn:
- What each stop_reason value means
- How to handle end_turn (including empty responses)
- How to process tool_use and continue the conversation
- How to manage max_tokens limits gracefully
- Best practices for production systems
The stop_reason Field
The stop_reason field appears in every successful Messages API response. Unlike HTTP errors (which indicate a failed request), stop_reason tells you why Claude successfully finished its response.
Here's a typical response structure:
{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Here's the answer to your question..."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50
  }
}
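As a rough illustration of reading this structure, the sketch below parses a body like the one above from raw JSON and maps its stop_reason to a next step. The `next_action` helper and its action labels are hypothetical names chosen for this example; they are not part of the API.

```python
import json

# Sample body mirroring the response structure shown above.
RAW_RESPONSE = """
{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [{"type": "text", "text": "Here's the answer to your question..."}],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {"input_tokens": 100, "output_tokens": 50}
}
"""

# Hypothetical mapping from stop_reason to what the application does next.
ACTIONS = {
    "end_turn": "show_to_user",
    "tool_use": "run_tools",
    "max_tokens": "ask_to_continue",
    "stop_sequence": "show_to_user",
}


def next_action(response: dict) -> str:
    """Return an application-level action label for a parsed response."""
    return ACTIONS.get(response["stop_reason"], "unknown_stop_reason")


response = json.loads(RAW_RESPONSE)
print(response["stop_reason"], "->", next_action(response))
```

Falling back to a distinct label for unrecognized values keeps the dispatcher from silently treating a new stop reason as a normal completion.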
Stop Reason Values
end_turn
What it means: Claude finished its response naturally. It has nothing more to say and is waiting for the user to respond.
When it occurs: This is the most common stop reason. You'll see it after Claude answers a question, completes a task, or decides its turn is over.
How to handle it: In most cases, you can display the response to the user and wait for their next input.
from anthropic import Anthropic

client = Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

if response.stop_reason == "end_turn":
    # Process the complete response
    print(response.content[0].text)
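Indexing content[0] assumes the first block exists and is a text block. A slightly more defensive sketch joins every text block and tolerates an empty content list. It works on plain dictionaries shaped like the JSON response above rather than SDK objects, and the helper name `collect_text` is an assumption made for this example.

```python
def collect_text(content: list[dict]) -> str:
    """Join the text of every text block; non-text blocks are skipped."""
    return "".join(
        block.get("text", "") for block in content if block.get("type") == "text"
    )


sample_content = [
    {"type": "text", "text": "Let me check. "},
    {"type": "tool_use", "id": "toolu_123", "name": "calculator", "input": {}},
    {"type": "text", "text": "Done."},
]

print(collect_text(sample_content))
print(repr(collect_text([])))  # An empty content list yields an empty string
```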
Empty Responses with end_turn
Sometimes Claude returns an empty response (only 2–3 output tokens and no content) with stop_reason: "end_turn". This typically happens when Claude judges that the assistant turn is already complete, particularly after tool results. Common causes:
- Adding text blocks immediately after tool results (Claude learns to expect the user to always insert text after tool results, so it ends its turn to follow the pattern)
- Sending Claude's completed response back without adding anything (Claude already decided it's done, so it will remain done)
# INCORRECT: Adding text immediately after tool_result
messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {"role": "assistant", "content": [
        {
            "type": "tool_use",
            "id": "toolu_123",
            "name": "calculator",
            "input": {"operation": "add", "a": 1234, "b": 5678}
        }
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"}  # Don't add text after tool_result
    ]}
]

# CORRECT: Send tool results directly without additional text
messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {"role": "assistant", "content": [
        {
            "type": "tool_use",
            "id": "toolu_123",
            "name": "calculator",
            "input": {"operation": "add", "a": 1234, "b": 5678}
        }
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"}
    ]}  # Just the tool_result, no additional text
]
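One way to catch the first cause before a request is sent is a small pre-flight check over the message list. The sketch below is illustrative only: `find_text_after_tool_result` is a hypothetical helper, not an SDK function, and it assumes messages are plain dictionaries like the ones above.

```python
def find_text_after_tool_result(messages: list[dict]) -> list[int]:
    """Return indexes of user messages where a text block follows a tool_result."""
    flagged = []
    for index, message in enumerate(messages):
        content = message.get("content")
        # String content cannot contain tool_result blocks, so skip it.
        if message.get("role") != "user" or not isinstance(content, list):
            continue
        seen_tool_result = False
        for block in content:
            if block.get("type") == "tool_result":
                seen_tool_result = True
            elif block.get("type") == "text" and seen_tool_result:
                flagged.append(index)
                break
    return flagged


tool_call = {"role": "assistant", "content": [
    {"type": "tool_use", "id": "toolu_123", "name": "calculator",
     "input": {"operation": "add", "a": 1234, "b": 5678}},
]}
tool_result = {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"}
question = {"role": "user", "content": "Calculate the sum of 1234 and 5678"}

bad_messages = [question, tool_call, {"role": "user", "content": [
    tool_result, {"type": "text", "text": "Here's the result"},
]}]
good_messages = [question, tool_call, {"role": "user", "content": [tool_result]}]

print(find_text_after_tool_result(bad_messages))
print(find_text_after_tool_result(good_messages))
```

A check like this could run in tests or as a debug assertion, so the pattern is caught before it shows up as empty responses in production.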
If you still get empty responses after fixing the above, use a continuation prompt:
def handle_empty_response(client, messages):
    response = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=1024,
        messages=messages
    )

    if response.stop_reason == "end_turn" and not response.content:
        # Add a continuation prompt in a NEW user message
        messages.append({
            "role": "user",
            "content": "Please continue with your response."
        })
        return client.messages.create(
            model="claude-opus-4-7",
            max_tokens=1024,
            messages=messages
        )

    return response
tool_use
What it means: Claude wants to use a tool. The response will contain one or more tool_use content blocks.
When it occurs: When you've defined tools in your API request and Claude decides it needs to call one (or more) to complete the task.
How to handle it: You must execute the tool, return the result as a tool_result block, and continue the conversation.
def handle_tool_use(response, messages):
    # Relies on the module-level `client` created earlier. `execute_tool` is
    # your own function that runs the named tool and returns its result.

    # Extract tool use blocks
    tool_use_blocks = [
        block for block in response.content
        if block.type == "tool_use"
    ]

    # Execute each tool and collect results
    tool_results = []
    for tool_use in tool_use_blocks:
        result = execute_tool(tool_use.name, tool_use.input)
        tool_results.append({
            "type": "tool_result",
            "tool_use_id": tool_use.id,
            "content": str(result)
        })

    # Append assistant response and tool results to messages
    messages.append({"role": "assistant", "content": response.content})
    messages.append({"role": "user", "content": tool_results})

    # Continue the conversation. If the original request defined tools,
    # include the same tools=[...] argument here as well.
    return client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=messages
    )
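The handler above calls execute_tool, which the guide leaves to the application. A minimal sketch of one possible dispatcher follows. The registry, the `calculator` implementation, and the convention of returning an error string are all assumptions for illustration, modeled on the calculator input used in the earlier messages example.

```python
from typing import Any, Callable


def calculator(operation: str, a: float, b: float) -> float:
    """Hypothetical tool matching the calculator input shown earlier."""
    operations = {
        "add": lambda x, y: x + y,
        "subtract": lambda x, y: x - y,
        "multiply": lambda x, y: x * y,
    }
    if operation not in operations:
        raise ValueError(f"unsupported operation: {operation}")
    return operations[operation](a, b)


# Hypothetical registry mapping tool names to local Python callables.
TOOL_REGISTRY: dict[str, Callable[..., Any]] = {"calculator": calculator}


def execute_tool(name: str, tool_input: dict) -> Any:
    """Look up a tool by name and call it with the model-supplied input."""
    tool = TOOL_REGISTRY.get(name)
    if tool is None:
        return f"Error: unknown tool '{name}'"
    try:
        return tool(**tool_input)
    except Exception as exc:  # Report failures back instead of crashing the loop
        return f"Error: {exc}"


print(execute_tool("calculator", {"operation": "add", "a": 1234, "b": 5678}))
```

Returning a readable error string, rather than raising, lets the failure travel back in the tool_result so the model can react to it on the next turn.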
max_tokens
What it means: Claude hit the max_tokens limit you set. The response is truncated.
When it occurs: When Claude's full response would exceed the max_tokens parameter in your request.
How to handle it: You can continue the conversation by sending a follow-up message asking Claude to finish its thought.
def handle_max_tokens(response, messages):
    if response.stop_reason == "max_tokens":
        # Append the partial response
        messages.append({"role": "assistant", "content": response.content})

        # Ask Claude to continue
        messages.append({
            "role": "user",
            "content": "Please continue from where you left off."
        })

        return client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=messages
        )

    return response
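To see how truncated pieces fit together without calling the API, the following sketch drives a continuation loop against a stand-in `send` function. Everything here is hypothetical scaffolding: `fake_send` replays canned chunks in place of client.messages.create, and responses are simplified dictionaries rather than SDK objects.

```python
from typing import Callable

# Canned replies standing in for three successive API responses.
CHUNKS = [
    {"stop_reason": "max_tokens", "text": "Red, green, "},
    {"stop_reason": "max_tokens", "text": "blue, and "},
    {"stop_reason": "end_turn", "text": "yellow."},
]


def make_fake_send() -> Callable[[list[dict]], dict]:
    """Build a stand-in for the API call that replays CHUNKS in order."""
    remaining = iter(CHUNKS)

    def fake_send(messages: list[dict]) -> dict:
        return next(remaining)

    return fake_send


def collect_until_done(send, messages: list[dict], max_rounds: int = 5) -> str:
    """Keep asking for continuations while the reply is truncated."""
    parts = []
    for _ in range(max_rounds):
        response = send(messages)
        parts.append(response["text"])
        if response["stop_reason"] != "max_tokens":
            return "".join(parts)
        # Record the partial reply, then ask for the rest in a new user turn.
        messages.append({"role": "assistant", "content": response["text"]})
        messages.append(
            {"role": "user", "content": "Please continue from where you left off."}
        )
    raise RuntimeError("response still truncated after max_rounds")


history = [{"role": "user", "content": "Name four colors."}]
full_text = collect_until_done(make_fake_send(), history)
print(full_text)
```

The round limit matters here for the same reason as in any continuation loop: a reply that keeps hitting the limit would otherwise never terminate.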
stop_sequence
What it means: Claude encountered a custom stop sequence you defined in your API request.
When it occurs: When you've set the stop_sequences parameter (e.g., ["\n\nHuman:"]) and Claude generates that sequence.
How to handle it: The response is complete up to the stop sequence. You can process it as-is or continue with a new user message.
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    stop_sequences=["END"],
    messages=[
        {"role": "user", "content": "List three colors and then write END."}
    ]
)

if response.stop_reason == "stop_sequence":
    print(f"Stopped at sequence: {response.stop_sequence}")
    print(response.content[0].text)  # Will not include "END"
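The behavior noted in the last comment, where the returned text stops short of the matched sequence, can be mimicked locally. The function below is only an emulation for building intuition and test fixtures. It is not how the API is implemented, and `truncate_at_stop` is a made-up name.

```python
from typing import Optional


def truncate_at_stop(
    text: str, stop_sequences: list[str]
) -> tuple[str, Optional[str]]:
    """Cut text at the earliest stop sequence; return (kept_text, matched)."""
    earliest_index = len(text)
    matched = None
    for sequence in stop_sequences:
        index = text.find(sequence)
        if index != -1 and index < earliest_index:
            earliest_index = index
            matched = sequence
    return text[:earliest_index], matched


kept, matched = truncate_at_stop("Red\nGreen\nBlue\nEND of list", ["END"])
print(repr(kept), matched)
```

When no sequence is found, the function returns the full text and None, which parallels the stop_sequence: null field in a response that ended for another reason.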
Building a Robust Handler
In production, you'll want a single function that handles all stop reasons gracefully:
def process_claude_response(client, messages, max_iterations=10):
    """
    Process Claude's response, handling all stop reasons.
    Automatically continues for tool_use and max_tokens.
    """
    iteration = 0
    while iteration < max_iterations:
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=messages
        )

        if response.stop_reason == "end_turn":
            # Check for empty response
            if not response.content:
                messages.append({
                    "role": "user",
                    "content": "Please continue."
                })
                iteration += 1
                continue
            return response

        elif response.stop_reason == "tool_use":
            messages.append({"role": "assistant", "content": response.content})
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    # execute_tool is your own tool dispatcher
                    result = execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": str(result)
                    })
            messages.append({"role": "user", "content": tool_results})

        elif response.stop_reason == "max_tokens":
            messages.append({"role": "assistant", "content": response.content})
            messages.append({
                "role": "user",
                "content": "Please continue."
            })

        elif response.stop_reason == "stop_sequence":
            return response

        else:
            # Unrecognized stop reason: return it for the caller to inspect
            # instead of re-sending the same messages
            return response

        iteration += 1

    raise RuntimeError("Max iterations reached without completion")
Best Practices
- Always check stop_reason: never assume the response is final. Inspect the stop_reason field to determine next steps.
- Handle empty responses gracefully: implement the continuation prompt pattern for empty end_turn responses, especially in tool-use workflows.
- Set a maximum iteration limit: when handling tool_use or max_tokens in a loop, always set a limit to prevent infinite loops.
- Log stop reasons: in production, log the stop_reason and usage fields for monitoring and debugging.
- Test with different scenarios: test your handler with a small max_tokens value (to trigger max_tokens), tool-using prompts, and natural conversation endings.
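As one possible shape for the logging practice, the sketch below uses Python's standard logging module to record stop_reason and usage from a parsed response body. The logger name, the `summarize_response` and `log_response` helpers, and the choice to warn on max_tokens are assumptions for this example.

```python
import logging

logger = logging.getLogger("claude.stop_reasons")


def summarize_response(response: dict) -> dict:
    """Pick out the fields worth logging from a parsed response body."""
    usage = response.get("usage", {})
    return {
        "id": response.get("id"),
        "stop_reason": response.get("stop_reason"),
        "input_tokens": usage.get("input_tokens"),
        "output_tokens": usage.get("output_tokens"),
    }


def log_response(response: dict) -> dict:
    """Log a one-line summary; warn when the reply was truncated."""
    summary = summarize_response(response)
    truncated = summary["stop_reason"] == "max_tokens"
    level = logging.WARNING if truncated else logging.INFO
    logger.log(level, "claude response %s", summary)
    return summary


logging.basicConfig(level=logging.INFO)
sample = {
    "id": "msg_01234",
    "stop_reason": "max_tokens",
    "usage": {"input_tokens": 100, "output_tokens": 1024},
}
log_response(sample)
```

Logging truncations at a higher level makes it easier to spot requests whose max_tokens setting is too low for the task.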
Key Takeaways
- end_turn means Claude finished naturally; watch for empty responses in tool-use workflows and use a continuation prompt if needed.
- tool_use means Claude wants to call a tool; you must execute it and return the result to continue.
- max_tokens means Claude's response was truncated; send a follow-up message to let it finish.
- stop_sequence means a custom stop sequence was triggered; the response is complete up to that point.
- Build a unified handler that loops through tool calls and truncated responses, with a maximum iteration limit to prevent infinite loops.