Mastering Claude API Stop Reasons: Build Smarter, More Reliable Applications
Learn how to handle Claude API stop reasons like end_turn, tool_use, and max_tokens. Practical code examples and strategies for building robust AI applications.
This guide explains Claude API stop reasons (end_turn, tool_use, max_tokens, stop_sequence) and how to handle each in your code. You'll learn to detect empty responses, manage tool calls, and build robust multi-turn conversations with practical Python examples.
Introduction
Every time you call the Claude API, the response includes a stop_reason field. This small piece of data tells you why Claude stopped generating—whether it finished naturally, wants to use a tool, or hit a token limit. Ignoring it is like driving without looking at your dashboard: you might get where you're going, but you'll miss critical signals along the way.
In this guide, you'll learn exactly what each stop_reason value means, how to handle them in code, and how to avoid common pitfalls like empty responses. By the end, you'll be able to build Claude-powered applications that gracefully handle every possible stopping scenario.
Understanding the stop_reason Field
The stop_reason field appears in every successful Messages API response. It's not an error—it's a signal. Here's a typical response structure:
```json
{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Here's the answer to your question..."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50
  }
}
```
There are four possible stop_reason values:
| Value | Meaning |
|---|---|
| `end_turn` | Claude finished its response naturally |
| `tool_use` | Claude wants to call a tool |
| `max_tokens` | Claude hit the `max_tokens` limit |
| `stop_sequence` | Claude encountered a custom stop sequence |
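These four cases map naturally onto a small dispatch table. Here's a minimal sketch (the action strings are illustrative, not SDK output):

```python
def classify_stop(stop_reason):
    """Map a stop_reason value to the action an application should take."""
    actions = {
        "end_turn": "return the response to the user",
        "tool_use": "execute the requested tools and send results back",
        "max_tokens": "continue generation or raise the limit",
        "stop_sequence": "post-process the text before the stop sequence",
    }
    # Fall back defensively in case the API introduces new values
    return actions.get(stop_reason, "log and investigate")
```

The defensive fallback matters: treat an unrecognized `stop_reason` as a signal to log and investigate, not a reason to crash.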
Handling end_turn
end_turn is the most common stop reason. It means Claude decided its response is complete. In most cases, you can simply return the response to the user.
```python
from anthropic import Anthropic

client = Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)

if response.stop_reason == "end_turn":
    print(response.content[0].text)
```
The Empty Response Gotcha
Sometimes Claude returns an empty response (2–3 tokens with no content) with stop_reason: "end_turn". This typically happens in tool-use workflows when:
- You add text blocks immediately after `tool_result` blocks
- You send Claude's own completed response back without adding anything new
```python
# INCORRECT: Adding text after tool_result
messages = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "assistant", "content": [{
        "type": "tool_use",
        "id": "toolu_123",
        "name": "calculator",
        "input": {"operation": "add", "a": 1234, "b": 5678}
    }]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"}  # ❌ Don't do this
    ]}
]
```
```python
# CORRECT: Send tool results directly
messages = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "assistant", "content": [{
        "type": "tool_use",
        "id": "toolu_123",
        "name": "calculator",
        "input": {"operation": "add", "a": 1234, "b": 5678}
    }]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"}  # ✅ Just the result
    ]}
]
```
If you still get empty responses, implement a retry loop:
```python
def handle_empty_response(client, messages, max_retries=3):
    for attempt in range(max_retries):
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=messages
        )
        if response.stop_reason == "end_turn" and not response.content:
            # Empty response—retry with a prompt adjustment
            messages.append({"role": "user", "content": "Please continue."})
            continue
        return response
    raise RuntimeError("Claude returned empty responses after retries")
```
Handling tool_use
When Claude decides it needs to call a tool (like a calculator, database query, or external API), it returns stop_reason: "tool_use". Your application must:
- Detect the tool call
- Execute the tool
- Return results to Claude
```python
def handle_tool_use(response, messages):
    """Process tool calls and continue the conversation."""
    # Extract tool use blocks
    tool_use_blocks = [
        block for block in response.content
        if block.type == "tool_use"
    ]

    # Execute each tool and collect results
    tool_results = []
    for tool_block in tool_use_blocks:
        result = execute_tool(tool_block.name, tool_block.input)
        tool_results.append({
            "type": "tool_result",
            "tool_use_id": tool_block.id,
            "content": str(result)
        })

    # Add Claude's response and tool results to the conversation
    messages.append({"role": "assistant", "content": response.content})
    messages.append({"role": "user", "content": tool_results})

    # Continue the conversation (uses the module-level `client`)
    return client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=messages
    )
```
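The `execute_tool` helper is not part of the Anthropic SDK—it's whatever dispatch logic your application provides. A minimal sketch for the calculator tool used earlier (the operation names are assumptions for illustration):

```python
def execute_tool(name, tool_input):
    """Dispatch a tool call by name and return its result."""
    if name == "calculator":
        ops = {
            "add": lambda a, b: a + b,
            "subtract": lambda a, b: a - b,
        }
        op = ops.get(tool_input["operation"])
        if op is None:
            return f"Unknown operation: {tool_input['operation']}"
        return op(tool_input["a"], tool_input["b"])
    # Returning an error string (rather than raising) lets Claude see
    # the failure in the tool_result and recover gracefully
    return f"Unknown tool: {name}"
```

Returning error messages as strings instead of raising exceptions is deliberate: Claude can read the error in the `tool_result` block and adjust its next attempt.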
Multi-Turn Tool Loop
For complex tasks, Claude may call multiple tools in sequence. Build a loop that continues until stop_reason is end_turn:
```python
def run_tool_conversation(client, messages, max_turns=10):
    """Loop until Claude finishes naturally or max_turns is reached."""
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=messages
    )
    for turn in range(max_turns):
        if response.stop_reason == "end_turn":
            return response
        if response.stop_reason == "tool_use":
            # handle_tool_use executes the tools, extends messages,
            # and returns Claude's follow-up response
            response = handle_tool_use(response, messages)
            continue
        if response.stop_reason in ("max_tokens", "stop_sequence"):
            # Ask Claude to finish the partial response
            messages.append({"role": "assistant", "content": response.content})
            messages.append({"role": "user", "content": "Please continue."})
            response = client.messages.create(
                model="claude-sonnet-4-20250514",
                max_tokens=1024,
                messages=messages
            )
            continue
        return response
    return response
```
Handling max_tokens
When Claude hits the max_tokens limit, the response is truncated. This is common for long-form content generation. Your strategy depends on the use case:
For Chat Applications
Simply ask Claude to continue:
```python
if response.stop_reason == "max_tokens":
    messages.append({"role": "assistant", "content": response.content})
    messages.append({"role": "user", "content": "Please continue from where you left off."})
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=2048,  # Increase if needed
        messages=messages
    )
```
For Summarization or Data Extraction
If you need complete output, increase max_tokens or split the task:
```python
def ensure_complete_response(client, prompt, max_tokens=4096):
    """Keep requesting until we get a complete response."""
    messages = [{"role": "user", "content": prompt}]
    full_content = []
    while True:
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=max_tokens,
            messages=messages
        )
        full_content.append(response.content[0].text)
        if response.stop_reason != "max_tokens":
            break
        # Continue from where we left off
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": "Continue."})
    return "".join(full_content)
```
Handling stop_sequence
If you've defined custom stop sequences (e.g., "```" to close a code block), Claude stops generating as soon as it emits one. This is useful for:
- Extracting structured data
- Preventing Claude from generating beyond a certain point
- Building controlled generation pipelines
````python
import json

# Example: extract a JSON response by stopping at the closing code fence.
# Prefilling the assistant turn with "```json" means the generated text
# is raw JSON, and the stop sequence cuts it off at the closing fence.
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    stop_sequences=["```"],
    messages=[
        {
            "role": "user",
            "content": "Generate a JSON config for a web server. Reply with a ```json code block."
        },
        {"role": "assistant", "content": "```json"}  # Prefill past the opening fence
    ]
)

if response.stop_reason == "stop_sequence":
    # The stop sequence itself is not included in the output
    config = json.loads(response.content[0].text)
````
Streaming and Stop Reasons
When streaming, you don't get stop_reason until the final message event. Here's how to handle it:
```python
from anthropic import Anthropic

client = Anthropic()

with client.messages.stream(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a short poem."}]
) as stream:
    for event in stream:
        if event.type == "content_block_delta":
            # Tool-use streams also emit input_json_delta, so check the type
            if event.delta.type == "text_delta":
                print(event.delta.text, end="", flush=True)
        elif event.type == "message_stop":
            # Now we can access stop_reason
            stop_reason = event.message.stop_reason
            print(f"\n\nStopped because: {stop_reason}")
            if stop_reason == "tool_use":
                # Handle tool calls from the streamed response
                handle_streamed_tool_calls(event.message)
```
Best Practices Summary
- Always check `stop_reason` before processing the response content
- Build a loop for tool use—Claude may need multiple tool calls
- Handle `max_tokens` gracefully—either continue or inform the user
- Watch for empty `end_turn` responses in tool-heavy workflows
- Use `stop_sequences` for structured output extraction
- Log stop reasons during development to understand Claude's behavior
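For the last point, one convenient pattern is a thin wrapper around `messages.create` that tallies stop reasons across a run. A sketch (the wrapper and counter names are illustrative, not SDK features):

```python
import logging
from collections import Counter

logger = logging.getLogger("claude.stop_reasons")
stop_reason_counts = Counter()

def create_with_logging(client, **kwargs):
    """Call the Messages API and record the stop_reason for later review."""
    response = client.messages.create(**kwargs)
    stop_reason_counts[response.stop_reason] += 1
    logger.info(
        "stop_reason=%s output_tokens=%s",
        response.stop_reason,
        response.usage.output_tokens,
    )
    return response
```

Dumping `stop_reason_counts` at the end of a development session quickly shows, for example, whether you're hitting `max_tokens` far more often than expected.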
Key Takeaways
- `stop_reason` is not an error—it's a signal that tells you why Claude stopped generating, and each value requires a different handling strategy.
- Tool use requires a loop—when `stop_reason` is `tool_use`, you must execute the tool and return results to Claude, often multiple times.
- Empty responses are preventable—avoid adding text after `tool_result` blocks, and implement retry logic for robustness.
- Streaming changes the game—with streaming, you only get `stop_reason` at the end, so design your event handlers accordingly.
- `max_tokens` is recoverable—you can always ask Claude to continue from where it left off, making long generations reliable.