Mastering Claude API Stop Reasons: Build Robust Applications with end_turn, tool_use & max_tokens
This guide explains Claude's stop_reason field (end_turn, tool_use, max_tokens, stop_sequence) and how to handle each case in your application. You'll learn to detect empty responses, continue tool loops, handle token limits, and build robust multi-turn conversations.
Introduction
Every time you call the Claude Messages API, the response includes a stop_reason field. This tiny piece of data tells you why Claude stopped generating—whether it finished naturally, wants to use a tool, hit a token limit, or encountered a stop sequence. Understanding these values is the difference between a brittle prototype and a production-ready application.
In this guide, you'll learn:
- What each stop_reason value means
- How to handle end_turn (including empty responses)
- How to build tool-use loops with tool_use
- How to manage max_tokens and stop_sequence gracefully
- Best practices for robust multi-turn conversations
The stop_reason Field
The stop_reason field appears in every successful Messages API response. Unlike errors (which indicate failures), stop_reason tells you why Claude successfully completed its response generation.
Here's a typical response structure:
{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Here's the answer to your question..."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50
  }
}
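As a small illustration, the fields above can be read from the raw JSON body with only the standard library. The payload is the example response from this section; `summarize_response` is a hypothetical helper name, not part of any SDK.

```python
import json

# The example response body from this section, as a raw JSON string.
RAW_RESPONSE = """
{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [{"type": "text", "text": "Here's the answer to your question..."}],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {"input_tokens": 100, "output_tokens": 50}
}
"""


def summarize_response(raw: str) -> dict:
    """Pull out the fields most handlers care about (hypothetical helper)."""
    body = json.loads(raw)
    usage = body["usage"]
    return {
        "stop_reason": body["stop_reason"],
        # JSON null becomes None; it is only set when a stop sequence matched.
        "stop_sequence": body["stop_sequence"],
        "total_tokens": usage["input_tokens"] + usage["output_tokens"],
    }


summary = summarize_response(RAW_RESPONSE)
print(summary)
```

When you use the Python SDK, the same values are available as attributes such as response.stop_reason, as the later examples in this guide show.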
Stop Reason Values
This guide covers four stop_reason values (the API defines additional ones, such as pause_turn and refusal, that are not covered here):
| Value | Meaning | When It Occurs |
|---|---|---|
| end_turn | Claude finished naturally | Most common; Claude believes the conversation turn is complete |
| tool_use | Claude wants to call a tool | The response contains one or more tool_use content blocks |
| max_tokens | Claude hit the token limit | The response was truncated because it reached max_tokens |
| stop_sequence | Claude encountered a custom stop sequence | One of your provided stop_sequences was generated |
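One way to keep the table above close to the code is a small lookup that names the follow-up step for each value. This is only a sketch: `NEXT_ACTION` and `next_action` are hypothetical names, and the action strings are paraphrases of the table, not API output.

```python
# Hypothetical mapping from stop_reason to the next step an application takes.
NEXT_ACTION = {
    "end_turn": "return the response to the user",
    "tool_use": "run the requested tools and send back tool_result blocks",
    "max_tokens": "treat the output as truncated, then continue or raise the limit",
    "stop_sequence": "post-process the text that precedes the matched sequence",
}


def next_action(stop_reason: str) -> str:
    """Look up the follow-up step, failing loudly on values not handled here."""
    try:
        return NEXT_ACTION[stop_reason]
    except KeyError:
        raise ValueError(f"Unhandled stop_reason: {stop_reason!r}") from None


print(next_action("tool_use"))
```

Raising on an unknown value is a deliberate choice in this sketch: it surfaces a stop reason you have not planned for instead of silently treating it as a normal finish.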
Handling end_turn
end_turn is the simplest case: Claude has finished its response and expects you to either end the conversation or provide a new user message.
Basic Handling
from anthropic import Anthropic

client = Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)
if response.stop_reason == "end_turn":
    # Process the complete response
    print(response.content[0].text)
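Indexing content[0] assumes the first block exists and is a text block. A slightly more defensive sketch joins every text block instead. `extract_text` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for SDK content blocks so the example runs offline.

```python
from types import SimpleNamespace


def extract_text(content) -> str:
    """Concatenate the text of every text block, skipping other block types."""
    return "".join(block.text for block in content if block.type == "text")


# Stand-ins for SDK content blocks (assumption: blocks expose .type and .text).
content = [
    SimpleNamespace(type="text", text="Let me check that. "),
    SimpleNamespace(type="tool_use", id="toolu_123", name="calculator", input={}),
]

print(extract_text(content))
# Empty content yields an empty string instead of raising IndexError.
print(repr(extract_text([])))
```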
Empty Responses with end_turn
Sometimes Claude returns an empty response (2–3 tokens with no content) with stop_reason: "end_turn". This typically happens when Claude interprets the assistant turn as already complete, particularly after tool results.
Common causes:
- Adding text blocks immediately after tool_result blocks
- Sending Claude's completed response back without adding anything new
# INCORRECT: Adding text immediately after tool_result
messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_123", "name": "calculator",
         "input": {"operation": "add", "a": 1234, "b": 5678}}
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"}  # Don't add text after tool_result!
    ]}
]

# CORRECT: Send tool results directly without additional text
messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_123", "name": "calculator",
         "input": {"operation": "add", "a": 1234, "b": 5678}}
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"}
    ]}  # Just the tool_result, no additional text
]
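To catch the incorrect pattern before a request is sent, a small check over the message list can help. The sketch below assumes plain-dict messages shaped like the examples above; `find_text_after_tool_result` is a hypothetical helper, not an SDK function.

```python
def find_text_after_tool_result(messages):
    """Return indexes of user messages where a text block follows a tool_result."""
    flagged = []
    for index, message in enumerate(messages):
        content = message.get("content")
        if message.get("role") != "user" or not isinstance(content, list):
            continue  # plain-string content and assistant turns cannot match
        seen_tool_result = False
        for block in content:
            if block.get("type") == "tool_result":
                seen_tool_result = True
            elif block.get("type") == "text" and seen_tool_result:
                flagged.append(index)
                break
    return flagged


bad_messages = [
    {"role": "user", "content": "Calculate the sum of 1234 and 5678"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_123", "name": "calculator",
         "input": {"operation": "add", "a": 1234, "b": 5678}}
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"},
    ]},
]

print(find_text_after_tool_result(bad_messages))  # [2]
```

Running a check like this in tests or behind a debug flag is cheap, since it only walks the message list once.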
If you still get empty responses after fixing the above, use a continuation prompt:
def handle_empty_response(client, messages):
    response = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=1024,
        messages=messages
    )
    if response.stop_reason == "end_turn" and not response.content:
        # Add a continuation prompt in a NEW user message
        messages.append({
            "role": "user",
            "content": "Please continue with your response."
        })
        return client.messages.create(
            model="claude-opus-4-7",
            max_tokens=1024,
            messages=messages
        )
    return response
Handling tool_use
When Claude decides it needs to call a tool, it returns stop_reason: "tool_use" along with one or more tool_use content blocks. Your application must:
- Execute the tool(s)
- Return the results as tool_result blocks
- Continue the conversation
Tool Loop Pattern
def process_tool_calls(client, response, messages, tools):
    """Handle tool_use responses and continue the conversation.

    `tools` is the same list of tool definitions sent with the first request;
    `execute_tool` is your own implementation.
    """
    while response.stop_reason == "tool_use":
        # Collect all tool results
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                # Execute the tool (your implementation)
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(result)
                })
        # Add assistant response and tool results to messages
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})
        # Get next response from Claude, passing the tool definitions again
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            tools=tools,
            messages=messages
        )
    return response
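The loop above calls execute_tool, which this guide leaves to the reader. One possible shape is a registry that maps tool names to Python callables. Everything in this sketch is an assumption for illustration: the `TOOL_REGISTRY` name, the toy `calculator`, and its supported operations (chosen to match the input shape of the earlier calculator example).

```python
def calculator(operation: str, a: float, b: float) -> float:
    """Toy calculator matching the input shape used earlier in this guide."""
    operations = {
        "add": lambda x, y: x + y,
        "subtract": lambda x, y: x - y,
        "multiply": lambda x, y: x * y,
    }
    if operation not in operations:
        raise ValueError(f"Unsupported operation: {operation}")
    return operations[operation](a, b)


# Hypothetical registry: tool name -> Python callable.
TOOL_REGISTRY = {"calculator": calculator}


def execute_tool(name: str, tool_input: dict):
    """Dispatch a tool_use block's name and input to the matching callable."""
    if name not in TOOL_REGISTRY:
        raise KeyError(f"Unknown tool: {name}")
    return TOOL_REGISTRY[name](**tool_input)


print(execute_tool("calculator", {"operation": "add", "a": 1234, "b": 5678}))  # 6912
```

Keeping the registry separate from the loop means new tools can be added without touching the conversation logic.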
Parallel Tool Calls
Claude can request multiple tools in a single response. Execute every requested tool and return all of the results together in one user message:
def execute_parallel_tools(response):
    """Execute every tool requested in one response and return all results."""
    tool_results = []
    # The calls run one after another here; only the request was "parallel".
    for block in response.content:
        if block.type == "tool_use":
            result = execute_tool(block.name, block.input)
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": str(result)
            })
    return tool_results
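The function above runs the requested tools one after another. If the tools are I/O-bound, they can also be executed concurrently. The following is a sketch using the standard library's thread pool, not a prescribed pattern: `run_tools_concurrently` and `demo_tool` are hypothetical names, the blocks are offline stand-ins, and reporting failures with `is_error` is one design choice among several.

```python
from concurrent.futures import ThreadPoolExecutor
from types import SimpleNamespace


def run_tools_concurrently(content, execute_tool, max_workers=4):
    """Run every tool_use block on a thread pool and build tool_result blocks.

    Results come back in the same order as the tool_use blocks. A tool that
    raises is reported with is_error=True instead of aborting the whole batch.
    """
    tool_blocks = [block for block in content if block.type == "tool_use"]

    def run_one(block):
        try:
            return {"type": "tool_result", "tool_use_id": block.id,
                    "content": str(execute_tool(block.name, block.input))}
        except Exception as exc:  # report the failure back to the model
            return {"type": "tool_result", "tool_use_id": block.id,
                    "content": f"Error: {exc}", "is_error": True}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, tool_blocks))  # map preserves input order


# Offline demonstration with stand-in blocks and a toy tool implementation.
def demo_tool(name, tool_input):
    if name == "divide":
        return tool_input["a"] / tool_input["b"]
    raise ValueError(f"Unknown tool: {name}")


demo_content = [
    SimpleNamespace(type="text", text="Running both calculations."),
    SimpleNamespace(type="tool_use", id="toolu_1", name="divide", input={"a": 6, "b": 3}),
    SimpleNamespace(type="tool_use", id="toolu_2", name="divide", input={"a": 1, "b": 0}),
]
results = run_tools_concurrently(demo_content, demo_tool)
print(results)
```

Threads suit tools that wait on the network or disk. CPU-heavy tools would need a different executor, which is outside the scope of this sketch.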
Handling max_tokens
When Claude hits the max_tokens limit, the response is truncated, possibly mid-sentence. Because max_tokens caps the output of a single response, this is most common when generating large outputs.
Detection and Recovery
def handle_max_tokens(client, response, messages):
    """Handle truncated responses by continuing the conversation."""
    if response.stop_reason == "max_tokens":
        # Add the partial response to the conversation
        messages.append({"role": "assistant", "content": response.content})
        # Ask Claude to continue
        messages.append({
            "role": "user",
            "content": "Please continue from where you left off."
        })
        return client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=messages
        )
    return response
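A single continuation may itself be truncated, so it can help to repeat the step a bounded number of times and stitch the pieces together. The sketch below assumes an injected `create` callable that returns an object with `.stop_reason` and `.content`. With the SDK it could wrap client.messages.create, but here a scripted stand-in is used so the example runs offline. `collect_full_text` and `fake_create` are hypothetical names.

```python
from types import SimpleNamespace


def collect_full_text(create, messages, max_rounds=3):
    """Keep requesting continuations while stop_reason is max_tokens.

    Returns (joined_text, final_stop_reason). The caller's list is not mutated.
    """
    messages = list(messages)
    parts = []
    for _ in range(max_rounds):
        response = create(messages)
        text = "".join(b.text for b in response.content if b.type == "text")
        parts.append(text)
        if response.stop_reason != "max_tokens":
            return "".join(parts), response.stop_reason
        # Record the partial answer, then ask for the rest in a new user turn.
        messages.append({"role": "assistant", "content": text})
        messages.append({"role": "user",
                         "content": "Please continue from where you left off."})
    return "".join(parts), "max_tokens"  # still truncated after max_rounds


# Offline stand-in: two truncated chunks, then a final one.
scripted = iter([
    ("Once upon ", "max_tokens"),
    ("a time, ", "max_tokens"),
    ("the end.", "end_turn"),
])


def fake_create(messages):
    text, reason = next(scripted)
    return SimpleNamespace(stop_reason=reason,
                           content=[SimpleNamespace(type="text", text=text)])


full_text, final_reason = collect_full_text(
    fake_create, [{"role": "user", "content": "Tell me a story."}])
print(full_text, final_reason)
```

Capping the number of rounds keeps cost predictable. If the limit is reached, the caller still receives the partial text along with a max_tokens reason.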
Increasing Token Budget
For long outputs, consider increasing max_tokens:
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=4096,  # Increase for longer responses
    messages=messages
)
Handling stop_sequence
If you provide custom stop_sequences, Claude stops as soon as it generates one of them, and the response's stop_sequence field reports which one matched. This is useful for structured outputs:
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    stop_sequences=["\n\nHuman:", "\n\nAssistant:"],
    messages=[{"role": "user", "content": "Tell me a story."}]
)
if response.stop_reason == "stop_sequence":
    print(f"Stopped at sequence: {response.stop_sequence}")
    # Process the truncated response
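As one illustration of the structured-output use case, suppose the prompt asks Claude to wrap its final answer in answer tags and the request sets the closing tag as a stop sequence. The tag names, the sample text, and `extract_tagged_answer` are all assumptions made for this sketch. It also assumes the matched stop sequence itself is not part of the returned text.

```python
def extract_tagged_answer(text: str, open_tag: str = "<answer>") -> str:
    """Return whatever follows the last open_tag, stripped of whitespace.

    Assumes generation stopped at the closing tag, so the closing tag
    is not present in `text`.
    """
    _, found, answer = text.rpartition(open_tag)
    if not found:
        raise ValueError(f"{open_tag} not found in response text")
    return answer.strip()


# Example text as it might look after stop_sequences=["</answer>"] matched.
sample_text = "Let me add those numbers.\n<answer>\n6912\n"
print(extract_tagged_answer(sample_text))  # 6912
```

Checking stop_reason == "stop_sequence" before calling a helper like this tells you whether the closing tag was actually reached or the output ended for another reason.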
Building a Complete Handler
Here's a robust handler that manages all stop reasons:
def handle_claude_response(client, messages, max_iterations=10, tools=None):
    """Complete handler for all stop reasons.

    `tools` is an optional list of tool definitions; `execute_tool` is your
    own implementation.
    """
    # Only send `tools` when definitions were supplied
    extra = {"tools": tools} if tools else {}
    for _ in range(max_iterations):
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=messages,
            **extra
        )
        if response.stop_reason == "end_turn":
            if not response.content:
                # Handle empty response
                messages.append({
                    "role": "user",
                    "content": "Please continue."
                })
                continue
            return response
        elif response.stop_reason == "tool_use":
            # Execute tools and continue
            messages.append({"role": "assistant", "content": response.content})
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": str(result)
                    })
            messages.append({"role": "user", "content": tool_results})
        elif response.stop_reason == "max_tokens":
            # Continue from where we left off
            messages.append({"role": "assistant", "content": response.content})
            messages.append({
                "role": "user",
                "content": "Please continue."
            })
        elif response.stop_reason == "stop_sequence":
            # Custom handling for stop sequences
            return response
        else:
            # Unrecognized stop reason: return it instead of resending the same request
            return response
    raise RuntimeError("Max iterations reached without completion")
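A handler like the one above can be exercised without network calls by substituting a scripted client. The sketch below is an assumption-heavy stand-in, not an SDK feature: `FakeClient`, `FakeMessages`, and `text_response` are hypothetical names, and the fake only mimics the `client.messages.create(...)` call shape used in this guide.

```python
from types import SimpleNamespace


class FakeMessages:
    """Returns pre-scripted responses and records each request's messages."""

    def __init__(self, scripted_responses):
        self._responses = list(scripted_responses)
        self.calls = []

    def create(self, **kwargs):
        # Snapshot the messages, since handlers usually mutate the list later.
        self.calls.append(list(kwargs["messages"]))
        return self._responses.pop(0)


class FakeClient:
    """Minimal stand-in exposing client.messages.create(...)."""

    def __init__(self, scripted_responses):
        self.messages = FakeMessages(scripted_responses)


def text_response(text, stop_reason):
    return SimpleNamespace(stop_reason=stop_reason,
                           content=[SimpleNamespace(type="text", text=text)])


# Script: one truncated reply, then a complete one.
client = FakeClient([
    text_response("Part one", "max_tokens"),
    text_response("Part two", "end_turn"),
])

messages = [{"role": "user", "content": "Hello!"}]
first = client.messages.create(model="test-model", max_tokens=1, messages=messages)
messages.append({"role": "assistant", "content": first.content})
messages.append({"role": "user", "content": "Please continue."})
second = client.messages.create(model="test-model", max_tokens=1, messages=messages)

print(first.stop_reason, second.stop_reason, len(client.messages.calls[1]))
```

In a test suite, the same FakeClient could be passed to handle_claude_response in place of a real client, with assertions on the recorded calls to confirm that each stop reason triggered the expected follow-up request.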
Best Practices
- Always check stop_reason – Never assume Claude finished naturally. Always inspect the field and handle each case.
- Limit tool loops – Set a maximum number of tool call iterations to prevent infinite loops.
- Handle empty responses – Implement the continuation prompt pattern for empty end_turn responses.
- Log stop_reason – For debugging, log the stop reason and response metadata.
- Test edge cases – Test with max_tokens=1, empty tool results, and rapid tool sequences.
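The logging practice above could look roughly like the following. This is a sketch: the logger name, the chosen fields, and the decision to log truncation at WARNING level are assumptions, and the `SimpleNamespace` object stands in for an SDK response so the example runs offline.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger("claude.stop_reason")


def log_stop_reason(response) -> dict:
    """Log stop_reason plus basic metadata and return the logged fields."""
    fields = {
        "id": response.id,
        "stop_reason": response.stop_reason,
        "stop_sequence": response.stop_sequence,
        "input_tokens": response.usage.input_tokens,
        "output_tokens": response.usage.output_tokens,
    }
    # Truncation is usually worth a louder log level than a normal finish.
    level = logging.WARNING if response.stop_reason == "max_tokens" else logging.INFO
    logger.log(level, "claude response: %s", fields)
    return fields


# Stand-in for an SDK response object.
demo_response = SimpleNamespace(
    id="msg_01234",
    stop_reason="max_tokens",
    stop_sequence=None,
    usage=SimpleNamespace(input_tokens=100, output_tokens=50),
)

logging.basicConfig(level=logging.INFO)
logged = log_stop_reason(demo_response)
```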
Key Takeaways
- This guide covers four stop reasons: end_turn (natural finish), tool_use (wants to call a tool), max_tokens (truncated), and stop_sequence (custom stop).
- Empty responses with end_turn happen when Claude thinks the turn is complete; fix by not adding text after tool_result blocks and using continuation prompts.
- Tool loops require careful handling: execute all tools, return results, and continue the conversation until end_turn.
- max_tokens truncation can be handled by appending the partial response and asking Claude to continue.
- Build a unified handler that manages all stop reasons to create robust, production-ready Claude applications.