Mastering Claude API Stop Reasons: Build Robust Applications with end_turn, max_tokens, and tool_use
Learn how to handle Claude API stop_reason values—end_turn, max_tokens, tool_use, and stop_sequence—to build reliable, production-ready applications. Includes code examples and troubleshooting tips.
This guide explains the four Claude API stop reasons (end_turn, max_tokens, tool_use, stop_sequence), how to handle each in code, and how to prevent common issues like empty responses. You'll learn to build robust applications that respond appropriately to every API response.
Introduction
When you call the Claude Messages API, every successful response includes a stop_reason field. This field tells you why the model stopped generating—whether it finished naturally, hit a token limit, requested a tool call, or encountered a stop sequence. Understanding these values is essential for building applications that handle responses correctly, especially when using tools or streaming.
In this guide, you'll learn:
- The four possible stop_reason values and what each means
- How to handle each stop reason in Python and TypeScript
- How to prevent and recover from empty responses
- Best practices for production applications
Understanding the stop_reason Field
The stop_reason field is part of every successful Messages API response. Unlike errors (which indicate a failed request), stop_reason tells you why Claude successfully completed its response generation.
Here's a typical response structure:
{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Here's the answer to your question..."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 100,
    "output_tokens": 50
  }
}
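As a rough illustration of reading this structure, the sketch below parses the example response with Python's standard library and pulls out the fields this guide relies on. The helper name summarize_response is hypothetical and not part of any SDK; with the official SDK the same fields are exposed as attributes on the response object.

```python
import json

# The example response shown above, as raw JSON text.
RAW_RESPONSE = """
{
  "id": "msg_01234",
  "type": "message",
  "role": "assistant",
  "content": [
    {"type": "text", "text": "Here's the answer to your question..."}
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {"input_tokens": 100, "output_tokens": 50}
}
"""


def summarize_response(raw: str) -> dict:
    """Hypothetical helper: extract stop metadata and joined text from a raw response."""
    data = json.loads(raw)
    # Join all text blocks; other block types (such as tool_use) are skipped.
    text = "".join(
        block["text"]
        for block in data.get("content", [])
        if block.get("type") == "text"
    )
    return {
        "stop_reason": data.get("stop_reason"),
        "stop_sequence": data.get("stop_sequence"),
        "output_tokens": data.get("usage", {}).get("output_tokens"),
        "text": text,
    }


summary = summarize_response(RAW_RESPONSE)
print(summary["stop_reason"], summary["output_tokens"])  # end_turn 50
```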
The Four Stop Reason Values
1. end_turn
Meaning: Claude finished its response naturally and decided the conversation turn is complete.
This is the most common stop reason. It indicates the model believes it has fully answered the user's request and doesn't need to continue.
How to handle it:
from anthropic import Anthropic

client = Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
if response.stop_reason == "end_turn":
    # Process the complete response
    print(response.content[0].text)
Special case: Empty responses with end_turn
Sometimes Claude returns an empty response (2-3 tokens with no content) with stop_reason: "end_turn". This typically happens after tool results, when Claude interprets the assistant turn as already complete. Common causes:
- Adding text blocks immediately after tool_result blocks
- Sending Claude's completed response back without adding anything new
# INCORRECT: Adding text after tool_result
messages = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "assistant", "content": [
        {
            "type": "tool_use",
            "id": "toolu_123",
            "name": "calculator",
            "input": {"operation": "add", "a": 1234, "b": 5678}
        }
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"}  # ❌ Don't do this
    ]}
]
# CORRECT: Send tool results directly without additional text
messages = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "assistant", "content": [
        {
            "type": "tool_use",
            "id": "toolu_123",
            "name": "calculator",
            "input": {"operation": "add", "a": 1234, "b": 5678}
        }
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"}
        # ✅ No additional text
    ]}
]
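One way to catch the incorrect pattern before a request is sent is a small pre-flight check over the message list. The sketch below is only an illustration: find_text_after_tool_result is a hypothetical helper, and it assumes messages are plain dicts shaped like the examples above.

```python
from typing import Any, Dict, List


def find_text_after_tool_result(messages: List[Dict[str, Any]]) -> List[int]:
    """Return indexes of user messages where a text block follows a tool_result block."""
    flagged = []
    for index, message in enumerate(messages):
        content = message.get("content")
        # Plain-string content cannot mix block types, so only lists are inspected.
        if message.get("role") != "user" or not isinstance(content, list):
            continue
        seen_tool_result = False
        for block in content:
            if block.get("type") == "tool_result":
                seen_tool_result = True
            elif block.get("type") == "text" and seen_tool_result:
                flagged.append(index)
                break
    return flagged


incorrect_messages = [
    {"role": "user", "content": "Calculate 1234 + 5678"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_123", "name": "calculator",
         "input": {"operation": "add", "a": 1234, "b": 5678}}
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_123", "content": "6912"},
        {"type": "text", "text": "Here's the result"}
    ]},
]

print(find_text_after_tool_result(incorrect_messages))  # [2]
```

Running the same check on the corrected message list returns an empty list, so it can serve as a cheap assertion in tests or request middleware.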
Recovering from empty responses:
If you still get empty responses after fixing the above, add a continuation prompt in a new user message:
def handle_empty_response(client, messages):
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=messages
    )
    if response.stop_reason == "end_turn" and not response.content:
        # Add a continuation prompt
        messages.append({
            "role": "user",
            "content": "Please continue with your response."
        })
        return client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=messages
        )
    return response
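The recovery step above retries once. A bounded variant might look like the following sketch, which also treats whitespace-only text as empty. It is written against plain dicts and an injected send callable (a hypothetical stand-in for the real API call), so the names send_with_recovery, is_empty_end_turn, and fake_send are assumptions made for illustration.

```python
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]
# Hypothetical signature: takes the message list, returns a response-like dict.
SendFn = Callable[[List[Message]], Dict[str, Any]]

CONTINUE_PROMPT = "Please continue with your response."


def is_empty_end_turn(response: Dict[str, Any]) -> bool:
    """True when the turn ended naturally but holds only blank text (or nothing)."""
    if response.get("stop_reason") != "end_turn":
        return False
    for block in response.get("content", []):
        # Any non-text block, or any non-blank text, counts as real content.
        if block.get("type") != "text" or block.get("text", "").strip():
            return False
    return True


def send_with_recovery(send: SendFn, messages: List[Message],
                       max_attempts: int = 2) -> Dict[str, Any]:
    """Send messages, adding a continuation prompt at most max_attempts times."""
    history = list(messages)  # copy so the caller's list is not mutated
    response = send(history)
    attempts = 0
    while is_empty_end_turn(response) and attempts < max_attempts:
        history.append({"role": "user", "content": CONTINUE_PROMPT})
        response = send(history)
        attempts += 1
    return response


def fake_send(history: List[Message]) -> Dict[str, Any]:
    # Stand-in for a real API call: stays empty until a continuation prompt arrives.
    if history[-1]["content"] == CONTINUE_PROMPT:
        return {"stop_reason": "end_turn",
                "content": [{"type": "text", "text": "6912"}]}
    return {"stop_reason": "end_turn", "content": []}


result = send_with_recovery(fake_send, [{"role": "user", "content": "Calculate 1234 + 5678"}])
print(result["content"][0]["text"])  # 6912
```

Capping the number of attempts keeps a persistently empty response from turning into an unbounded retry loop.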
2. max_tokens
Meaning: Claude stopped because it reached the max_tokens limit you set in your request.
This is common for long responses. The model's response is truncated at the token limit.
How to handle it:
# Keep the conversation in a list so the continuation can extend it
messages = [
    {"role": "user", "content": "Write a detailed essay about AI safety."}
]
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=100,  # Low limit to demonstrate
    messages=messages
)
if response.stop_reason == "max_tokens":
    # Response was truncated. Keep the partial answer and ask for the rest.
    messages.append({"role": "assistant", "content": response.content})
    messages.append({
        "role": "user",
        "content": "Please continue from where you left off."
    })
    continuation = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1000,
        messages=messages
    )
TypeScript example:
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

async function getCompleteResponse(messages: Anthropic.MessageParam[]) {
  const response = await client.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 100,
    messages: messages
  });
  if (response.stop_reason === 'max_tokens') {
    // Request continuation
    messages.push({ role: 'assistant', content: response.content });
    messages.push({ role: 'user', content: 'Please continue.' });
    return client.messages.create({
      model: 'claude-sonnet-4-20250514',
      max_tokens: 1000,
      messages: messages
    });
  }
  return response;
}
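Both examples above request a single continuation, but a long answer may be truncated more than once. The Python sketch below loops until the stop reason is no longer max_tokens and joins the pieces. It uses plain dicts and an injected send callable in place of the real client, and the names collect_until_complete and fake_send are hypothetical. Simple concatenation is an assumption too: the model may rephrase at the seam, so the joined text can need light cleanup.

```python
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]


def collect_until_complete(send: Callable[[List[Message]], Dict[str, Any]],
                           messages: List[Message],
                           max_rounds: int = 5) -> str:
    """Request continuations while the response is truncated, up to max_rounds calls."""
    history = list(messages)  # copy so the caller's list is not mutated
    parts: List[str] = []
    for _ in range(max_rounds):
        response = send(history)
        parts.append("".join(
            block["text"] for block in response["content"] if block["type"] == "text"
        ))
        if response["stop_reason"] != "max_tokens":
            break  # finished for a reason other than truncation
        # Feed the partial answer back and ask for the rest.
        history.append({"role": "assistant", "content": response["content"]})
        history.append({"role": "user",
                        "content": "Please continue from where you left off."})
    return "".join(parts)


CHUNKS = ["AI safety is ", "an active ", "research field."]


def fake_send(history: List[Message]) -> Dict[str, Any]:
    # Stand-in for the API: returns the next chunk based on how many
    # assistant turns are already in the history.
    turn = sum(1 for message in history if message["role"] == "assistant")
    last = turn == len(CHUNKS) - 1
    return {
        "stop_reason": "end_turn" if last else "max_tokens",
        "content": [{"type": "text", "text": CHUNKS[turn]}],
    }


essay = collect_until_complete(fake_send, [{"role": "user", "content": "Write about AI safety."}])
print(essay)  # AI safety is an active research field.
```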
3. tool_use
Meaning: Claude decided to use a tool and stopped to wait for the tool result.
This is the most important stop reason for building agentic applications. When you see tool_use, you must:
- Execute the requested tool
- Return the result as a tool_result block
- Continue the conversation
def handle_tool_call(client, messages):
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        tools=[
            {
                "name": "get_weather",
                "description": "Get the current weather",
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "location": {"type": "string"}
                    },
                    "required": ["location"]
                }
            }
        ],
        messages=messages
    )
    if response.stop_reason == "tool_use":
        # Claude may request several tools in one turn, and every
        # tool_use block needs a matching tool_result
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                # Execute the tool (execute_tool is a placeholder; replace with actual implementation)
                result = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(result)
                })
        # Add assistant response and tool results to messages
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})
        # Continue the conversation
        return handle_tool_call(client, messages)
    return response
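The execute_tool call above is left as a placeholder. One possible shape for it, sketched here under stated assumptions, is a registry of local functions plus error handling that reports failures back to Claude in the tool_result instead of raising. TOOL_REGISTRY and build_tool_results are hypothetical names, the weather function is a stub, and blocks are plain dicts rather than SDK objects. The is_error flag is the tool_result field for marking a failed tool call.

```python
from typing import Any, Callable, Dict, List

# Hypothetical registry mapping tool names to local Python functions.
TOOL_REGISTRY: Dict[str, Callable[..., Any]] = {
    "get_weather": lambda location: f"Sunny in {location}",  # stub implementation
}


def build_tool_results(content_blocks: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Run every tool_use block and return one tool_result block per request."""
    results = []
    for block in content_blocks:
        if block.get("type") != "tool_use":
            continue  # text blocks need no result
        try:
            output = TOOL_REGISTRY[block["name"]](**block["input"])
            result = {
                "type": "tool_result",
                "tool_use_id": block["id"],
                "content": str(output),
            }
        except Exception as exc:
            # Report the failure to the model rather than crashing the loop.
            result = {
                "type": "tool_result",
                "tool_use_id": block["id"],
                "content": f"Tool error: {exc!r}",
                "is_error": True,
            }
        results.append(result)
    return results


blocks = [
    {"type": "text", "text": "Let me check."},
    {"type": "tool_use", "id": "toolu_1", "name": "get_weather",
     "input": {"location": "Paris"}},
    {"type": "tool_use", "id": "toolu_2", "name": "unknown_tool", "input": {}},
]
for item in build_tool_results(blocks):
    print(item["tool_use_id"], item.get("is_error", False), item["content"])
```

The returned list can be used directly as the content of the user message that follows the assistant's tool_use turn.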
4. stop_sequence
Meaning: Claude stopped because it encountered a custom stop sequence you defined in your request.
This is useful for structured outputs or when you want to end generation at a specific marker.
How to handle it:
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    stop_sequences=["\n\nEND"],
    messages=[
        {"role": "user", "content": "List three facts about Mars. End with END."}
    ]
)
if response.stop_reason == "stop_sequence":
    # Generation ended because one of the stop sequences was produced;
    # response.stop_sequence holds the sequence that matched
    print(f"Stopped at sequence: {response.stop_sequence}")
    # The matched sequence is normally not part of the returned text,
    # so this removal is only a defensive guard
    text = response.content[0].text
    if text.endswith("\n\nEND"):
        text = text[:-len("\n\nEND")]  # Remove the stop sequence
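The same cleanup can be wrapped in a small helper that also reports whether generation ended on a stop sequence, which is handy when the marker signals that a structured output is complete. This is a sketch over plain dicts. finalize_text is a hypothetical name, and the trimming stays defensive because the matched sequence is usually reported in stop_sequence rather than left in the text.

```python
from typing import Any, Dict, Tuple


def finalize_text(response: Dict[str, Any]) -> Tuple[str, bool]:
    """Return (text, hit_stop_sequence) for a response-like dict."""
    text = "".join(
        block["text"]
        for block in response.get("content", [])
        if block.get("type") == "text"
    )
    matched = response.get("stop_sequence")
    hit = response.get("stop_reason") == "stop_sequence"
    # Trim only when the matched sequence really is at the end of the text.
    if hit and matched and text.endswith(matched):
        text = text[: -len(matched)]
    return text, hit


example = {
    "stop_reason": "stop_sequence",
    "stop_sequence": "\n\nEND",
    "content": [{"type": "text", "text": "1. Mars has two moons."}],
}
text, hit = finalize_text(example)
print(hit, repr(text))  # True '1. Mars has two moons.'
```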
Building a Robust Handler
Here's a complete handler that manages all stop reasons:
from anthropic import Anthropic
from typing import Any, Dict, List, Optional


class ClaudeResponseHandler:
    def __init__(self, client: Anthropic, model: str = "claude-sonnet-4-20250514"):
        self.client = client
        self.model = model

    def handle_response(self, messages: List[Dict], max_tokens: int = 1024,
                        tools: Optional[List[Dict]] = None) -> str:
        # Include tools only when provided instead of passing tools=None
        extra: Dict[str, Any] = {"tools": tools} if tools else {}
        response = self.client.messages.create(
            model=self.model,
            max_tokens=max_tokens,
            messages=messages,
            **extra
        )
        stop_reason = response.stop_reason
        # Join every text block rather than assuming content[0] is text
        text = "".join(block.text for block in response.content if block.type == "text")
        if stop_reason == "end_turn":
            if not response.content:
                # Handle empty response
                messages.append({
                    "role": "user",
                    "content": "Please continue."
                })
                return self.handle_response(messages, max_tokens, tools)
            return text
        elif stop_reason == "max_tokens":
            # Keep the truncated text and request a continuation
            messages.append({"role": "assistant", "content": response.content})
            messages.append({
                "role": "user",
                "content": "Please continue from where you left off."
            })
            return text + self.handle_response(messages, max_tokens, tools)
        elif stop_reason == "tool_use":
            # Execute every requested tool and continue
            tool_results = [
                {
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(self.execute_tool(block.name, block.input))
                }
                for block in response.content
                if block.type == "tool_use"
            ]
            messages.append({"role": "assistant", "content": response.content})
            messages.append({"role": "user", "content": tool_results})
            return self.handle_response(messages, max_tokens, tools)
        elif stop_reason == "stop_sequence":
            # Defensive cleanup in case the stop sequence ends the text
            if response.stop_sequence and text.endswith(response.stop_sequence):
                text = text[:-len(response.stop_sequence)]
            return text
        else:
            raise ValueError(f"Unknown stop_reason: {stop_reason}")

    def execute_tool(self, name: str, input_data: Dict[str, Any]) -> Any:
        # Implement your tool execution logic here
        raise NotImplementedError("Tool execution not implemented")
Best Practices
- Always check stop_reason before processing content. Different reasons require different handling.
- Handle empty end_turn responses gracefully by adding a continuation prompt rather than retrying the same messages.
- For max_tokens, always request continuation rather than increasing max_tokens blindly; the model may need multiple continuations.
- For tool_use, ensure your tool execution is reliable and handles errors gracefully. Return meaningful error messages as tool_result content.
- Log stop_reason and stop_sequence for debugging and monitoring in production.
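For the logging practice, a minimal sketch using only the standard library might look like this. The logger name, the record_stop_reason helper, and the in-memory counter are all assumptions made for illustration; a production setup would more likely forward these values to its existing metrics system.

```python
import logging
from collections import Counter
from typing import Any, Dict

logger = logging.getLogger("claude.stop_reasons")  # hypothetical logger name
stop_reason_counts: Counter = Counter()


def record_stop_reason(response: Dict[str, Any]) -> None:
    """Log and count the stop metadata of a response-like dict."""
    reason = response.get("stop_reason")
    stop_reason_counts[reason] += 1
    # Truncation is usually worth a louder signal than a normal completion.
    level = logging.WARNING if reason == "max_tokens" else logging.INFO
    logger.log(
        level,
        "stop_reason=%s stop_sequence=%r output_tokens=%s",
        reason,
        response.get("stop_sequence"),
        response.get("usage", {}).get("output_tokens"),
    )


logging.basicConfig(level=logging.INFO)
record_stop_reason({"stop_reason": "end_turn", "stop_sequence": None,
                    "usage": {"output_tokens": 50}})
record_stop_reason({"stop_reason": "max_tokens", "stop_sequence": None,
                    "usage": {"output_tokens": 100}})
print(dict(stop_reason_counts))  # {'end_turn': 1, 'max_tokens': 1}
```

A rising share of max_tokens in these counts is an early hint that the configured limit is too low for typical requests.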
Key Takeaways
- Four stop reasons: end_turn (natural completion), max_tokens (truncated), tool_use (tool requested), and stop_sequence (custom marker reached).
- Empty responses with end_turn are usually caused by adding text after tool_result blocks. Send results directly without extra text.
- For max_tokens, always request continuation by adding a new user message rather than only increasing the token limit.
- For tool_use, you must execute the tool and return the result to continue the conversation.
- Build a unified handler that manages all stop reasons to create robust, production-ready Claude applications.