Mastering Claude API Stop Reasons: Build Robust Applications with end_turn, tool_use & max_tokens
This guide explains Claude's stop_reason field (end_turn, tool_use, max_tokens, stop_sequence) and how to handle each case in your application. You'll learn to detect tool calls, manage token limits, prevent empty responses, and build robust multi-turn conversations.
Introduction
When you call the Claude API, every successful response includes a stop_reason field that tells you why the model stopped generating. Understanding these values is essential for building applications that handle tool calls, manage conversation flow, and recover gracefully from limits.
Unlike error codes (which indicate failures), stop_reason is part of a successful response. It’s your signal for what to do next: continue the conversation, execute a tool, or inform the user.
The stop_reason Field
The stop_reason field appears in every successful Messages API response. Here’s a typical example:
{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Here's the answer to your question..."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50
  }
}
This guide covers four values of stop_reason (the API also defines others, such as pause_turn and refusal, which are outside the scope of this guide):
| Value | Meaning |
|---|---|
| end_turn | Claude finished naturally and handed control back to the user |
| tool_use | Claude wants to call a tool (function) |
| max_tokens | Claude hit the max_tokens limit you set |
| stop_sequence | Claude encountered a custom stop sequence you defined |
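The table above maps naturally onto a small dispatch helper. The following is a minimal sketch rather than anything provided by the Anthropic SDK: the name `next_action` and the action labels are hypothetical, chosen only to illustrate the idea of routing on stop_reason.

```python
# Hypothetical helper: the function name and action labels are illustrative
# assumptions, not part of the Anthropic SDK.
NEXT_ACTIONS = {
    "end_turn": "show_response",
    "tool_use": "run_tools",
    "max_tokens": "continue_or_raise_limit",
    "stop_sequence": "post_process_output",
}


def next_action(stop_reason: str) -> str:
    """Map a stop_reason value to the step the application should take next."""
    # Fall back explicitly so an unrecognized value is never treated as end_turn.
    return NEXT_ACTIONS.get(stop_reason, "unhandled")


print(next_action("tool_use"))  # run_tools
```

Returning an explicit "unhandled" label keeps the application from silently treating a value it does not recognize as a normal completion.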
Handling end_turn
end_turn is the most common stop reason. It means Claude completed its response and expects you to provide the next user message.
Basic Handling
from anthropic import Anthropic
client = Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
if response.stop_reason == "end_turn":
    # Claude is done. Show the response and wait for user input.
    print(response.content[0].text)
    # Now you can prompt the user for the next question
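Reading `response.content[0].text` assumes the first content block is a text block. A slightly more defensive approach is to join every text block and skip anything else. The sketch below is one way to do that; `extract_text` is a hypothetical helper name, and the `SimpleNamespace` objects are stand-ins that only mimic the `type` and `text` attributes used in the example above.

```python
from types import SimpleNamespace


def extract_text(content) -> str:
    """Join the text of every text block, skipping other block types."""
    return "".join(block.text for block in content if block.type == "text")


# Stand-in blocks (assumption): real responses carry SDK objects with the
# same `type` and `text` attributes.
content = [
    SimpleNamespace(type="text", text="The capital of France "),
    SimpleNamespace(type="text", text="is Paris."),
]
print(extract_text(content))  # The capital of France is Paris.
```

Because the helper returns an empty string for an empty content list, it also avoids an IndexError when an end_turn response carries no content.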
Empty Responses with end_turn
Sometimes Claude returns an empty response (2–3 tokens with no content) with stop_reason: "end_turn". This typically happens when Claude decides the assistant turn is already complete, most often right after tool results. Common causes:
- Adding text blocks immediately after tool_result in the same user message
- Sending Claude’s completed response back without adding new user input
# INCORRECT: Adding text after tool_result
messages = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "assistant", "content": [
        {
            "type": "tool_use",
            "id": "toolu_123",
            "name": "calculator",
            "input": {"operation": "add", "a": 1234, "b": 5678}
        }
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"}  # ❌ Don't do this
    ]}
]
# CORRECT: Send tool results directly
messages = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "assistant", "content": [
        {
            "type": "tool_use",
            "id": "toolu_123",
            "name": "calculator",
            "input": {"operation": "add", "a": 1234, "b": 5678}
        }
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"}
        # ✅ No extra text
    ]}
]
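The incorrect pattern above can be caught before a request is sent. The following is a small sketch of such a check, operating on plain message dictionaries like the ones shown; `text_follows_tool_result` is a hypothetical helper, not an SDK function.

```python
def text_follows_tool_result(message: dict) -> bool:
    """Return True if a message has a text block after a tool_result block."""
    content = message.get("content")
    if not isinstance(content, list):
        return False  # plain-string content has no blocks to check
    seen_tool_result = False
    for block in content:
        if block.get("type") == "tool_result":
            seen_tool_result = True
        elif block.get("type") == "text" and seen_tool_result:
            return True
    return False


bad = {"role": "user", "content": [
    {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
    {"type": "text", "text": "Here's the result"},
]}
good = {"role": "user", "content": [
    {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
]}
print(text_follows_tool_result(bad), text_follows_tool_result(good))  # True False
```

A check like this could run in tests or as a debug assertion while building the message list.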
If you still get empty responses, add a continuation prompt in a new user message:
def handle_empty_response(client, messages):
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=messages
    )
    if response.stop_reason == "end_turn" and not response.content:
        # Add a continuation prompt
        messages.append({
            "role": "user",
            "content": "Please continue with your response."
        })
        # Retry
        return client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=messages
        )
    return response
Handling tool_use
When Claude decides it needs to call a tool (function), it returns stop_reason: "tool_use". The response content will contain one or more tool_use blocks.
Detecting and Executing Tool Calls
# Define tools and messages up front so the follow-up request can reuse them
tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"}
            },
            "required": ["city"]
        }
    }
]
messages = [
    {"role": "user", "content": "What's the weather in Tokyo?"}
]
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=tools,
    messages=messages
)
if response.stop_reason == "tool_use":
    # Record Claude's tool request before replying with the results
    messages.append({"role": "assistant", "content": response.content})
    tool_results = []
    for block in response.content:
        if block.type == "tool_use":
            tool_name = block.name
            tool_input = block.input
            tool_id = block.id
            # Execute the tool (execute_tool is your own function)
            result = execute_tool(tool_name, tool_input)
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": tool_id,
                "content": str(result)
            })
    # Add the tool results to the conversation in a single user message
    messages.append({"role": "user", "content": tool_results})
    # Continue the conversation
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        tools=tools,
        messages=messages
    )
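The example above calls `execute_tool` without defining it, since that function belongs to your application. One possible shape, offered as a sketch under stated assumptions, is a registry that maps tool names to Python callables. The `get_weather` body here is a placeholder and `TOOL_REGISTRY` is a hypothetical name; neither comes from the SDK.

```python
# Placeholder tool implementation (assumption): a real version would query
# a weather service instead of returning a fixed message.
def get_weather(city: str) -> str:
    return f"Weather lookup for {city} is not implemented in this sketch."


# Registry mapping tool names (as declared in `tools`) to Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def execute_tool(name: str, tool_input: dict) -> str:
    """Look up a tool by name and call it with the model-supplied input."""
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        raise ValueError(f"Unknown tool: {name}")
    # tool_input is a dict matching the tool's input_schema
    return handler(**tool_input)


print(execute_tool("get_weather", {"city": "Tokyo"}))
```

Keeping the registry separate from the dispatch function means adding a tool only requires a new entry, plus the matching definition in the `tools` list sent to the API.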
Handling Multiple Tool Calls (Parallel Tool Use)
Claude can request multiple tools in a single response. Always iterate through all content blocks:
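if response.stop_reason == "tool_use":
    # Record Claude's full turn (all tool_use blocks) before sending results
    messages.append({"role": "assistant", "content": response.content})
    tool_results = []
    for block in response.content:
        if block.type == "tool_use":
            result = execute_tool(block.name, block.input)
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": str(result)
            })
    # Return every result together in one user message
    messages.append({"role": "user", "content": tool_results})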
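When several tools run in one turn, any of them may fail. The tool_result block accepts an optional `is_error` flag, so a failure can be reported back to Claude instead of aborting the loop. The sketch below assumes tool calls are represented as plain dictionaries with `id`, `name`, and `input` keys; `build_tool_results` and the `run_tool` callable are hypothetical names.

```python
def build_tool_results(tool_calls, run_tool):
    """Return one tool_result block per call, flagging failures with is_error."""
    results = []
    for call in tool_calls:
        block = {"type": "tool_result", "tool_use_id": call["id"]}
        try:
            block["content"] = str(run_tool(call["name"], call["input"]))
        except Exception as exc:
            # Report the failure to the model rather than crashing the loop
            block["content"] = f"{type(exc).__name__}: {exc}"
            block["is_error"] = True
        results.append(block)
    return results


def demo_tool(name, tool_input):
    # Stand-in executor (assumption): "add" works, anything else fails
    if name != "add":
        raise RuntimeError("tool failed")
    return tool_input["a"] + tool_input["b"]


calls = [
    {"id": "toolu_1", "name": "add", "input": {"a": 1, "b": 2}},
    {"id": "toolu_2", "name": "boom", "input": {}},
]
for block in build_tool_results(calls, demo_tool):
    print(block)
```

Every tool_use id still receives a matching tool_result, which keeps the conversation valid even when one call fails.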
Handling max_tokens
max_tokens indicates Claude hit the token limit you set. The response is truncated—Claude may have been in the middle of a sentence.
Recovery Strategies
# The two options are alternatives; this flag (illustrative) selects one
retry_with_higher_limit = True
if response.stop_reason == "max_tokens":
    if retry_with_higher_limit:
        # Option 1: Increase max_tokens and retry the same request
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,  # Increase limit
            messages=messages
        )
    else:
        # Option 2: Ask Claude to continue
        messages.append({
            "role": "assistant",
            "content": response.content  # Include partial response
        })
        messages.append({
            "role": "user",
            "content": "Please continue from where you left off."
        })
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=2048,
            messages=messages
        )
Best practice: Always check stop_reason and handle max_tokens gracefully. For long outputs, consider using larger max_tokens values or streaming.
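The "ask Claude to continue" strategy can be wrapped in a bounded loop that stitches the partial texts together. The following is a sketch only: `collect_full_text` is a hypothetical helper, and `create_fn` is an assumed callable that wraps the API request and returns a dictionary with `stop_reason` and `content` keys. A stub stands in for the API so the example runs offline.

```python
def collect_full_text(create_fn, messages, max_rounds=3):
    """Call create_fn while stop_reason is max_tokens, joining the text parts."""
    messages = list(messages)  # copy so the caller's list is not mutated
    parts = []
    for _ in range(max_rounds):
        response = create_fn(messages)
        text = "".join(
            b["text"] for b in response["content"] if b["type"] == "text"
        )
        parts.append(text)
        if response["stop_reason"] != "max_tokens":
            break
        # Feed the partial answer back and ask for the remainder
        messages.append({"role": "assistant", "content": text})
        messages.append(
            {"role": "user", "content": "Please continue from where you left off."}
        )
    return "".join(parts)


# Stub replies (assumption): the first is truncated, the second completes it.
replies = iter([
    {"stop_reason": "max_tokens", "content": [{"type": "text", "text": "Mars is "}]},
    {"stop_reason": "end_turn", "content": [{"type": "text", "text": "the fourth planet."}]},
])
print(collect_full_text(lambda msgs: next(replies), [{"role": "user", "content": "Describe Mars."}]))
```

In practice the continuation may not join perfectly with the truncated text, so the stitched output is worth reviewing; the `max_rounds` cap prevents an unbounded series of requests.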
Handling stop_sequence
stop_sequence occurs when Claude encounters a custom stop sequence you defined in the request. This is useful for structured outputs or early termination.
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    stop_sequences=["\n\nEND"],
    messages=[
        {"role": "user", "content": "List 3 facts about Mars, then write END"}
    ]
)
if response.stop_reason == "stop_sequence":
    print(f"Stopped at sequence: {response.stop_sequence}")
    # The content ends before the stop sequence
    print(response.content[0].text)
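To make the "content ends before the stop sequence" behavior concrete, here is a purely local illustration of the semantics. It does not call the API, which applies stop sequences server-side during generation; `truncate_at_stop_sequence` is a hypothetical function written only to mirror the described outcome.

```python
def truncate_at_stop_sequence(text, stop_sequences):
    """Cut text at the earliest stop sequence; return (text, matched_sequence)."""
    earliest = None
    for seq in stop_sequences:
        idx = text.find(seq)
        if idx != -1 and (earliest is None or idx < earliest[0]):
            earliest = (idx, seq)
    if earliest is None:
        return text, None  # no stop sequence present
    # The matched sequence itself is excluded from the returned text
    return text[:earliest[0]], earliest[1]


raw = "1. Mars is red.\n2. Mars is cold.\n\nEND and anything after"
text, matched = truncate_at_stop_sequence(raw, ["\n\nEND"])
print(repr(text))     # '1. Mars is red.\n2. Mars is cold.'
print(repr(matched))  # '\n\nEND'
```

The second return value plays the role of the response's stop_sequence field, which tells you which of several configured sequences was hit.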
Building a Robust Conversation Loop
Here’s a complete pattern that handles all stop reasons:
def run_conversation(client, system_prompt, tools, user_messages, max_turns=10):
    # max_turns is a safety cap (added here) so the loop cannot run forever
    messages = [{"role": "user", "content": msg} for msg in user_messages]
    for _ in range(max_turns):
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            system=system_prompt,
            tools=tools,
            messages=messages
        )
        # Join all text blocks; the first block is not guaranteed to be text
        text = "".join(b.text for b in response.content if b.type == "text")
        if response.stop_reason == "end_turn":
            # Claude is done
            return text
        elif response.stop_reason == "tool_use":
            # Record Claude's full turn once, then execute every requested tool
            messages.append({"role": "assistant", "content": response.content})
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": str(result)
                    })
            # Send all results back in a single user message and continue
            messages.append({"role": "user", "content": tool_results})
        elif response.stop_reason == "max_tokens":
            # Append partial response and ask to continue
            messages.append({"role": "assistant", "content": response.content})
            messages.append({
                "role": "user",
                "content": "Please continue."
            })
        elif response.stop_reason == "stop_sequence":
            # Custom stop sequence reached
            return text
        else:
            # Fail loudly on any stop reason this loop does not handle
            raise ValueError(f"Unhandled stop_reason: {response.stop_reason}")
    raise RuntimeError("Conversation did not finish within max_turns")
Key Takeaways
- Always check stop_reason in every API response to determine the next action; don’t assume end_turn.
- For tool_use, iterate through all content blocks to handle parallel tool calls, then feed results back to Claude.
- For max_tokens, either increase the limit or ask Claude to continue from its partial response.
- Prevent empty end_turn responses by avoiding extra text after tool_result blocks, and use continuation prompts if needed.
- Use stop_sequences for structured outputs or to enforce early termination in specific workflows.