Mastering Extended Thinking in Claude: A Complete Guide to Adaptive Reasoning
Learn how to use Claude's extended thinking feature for complex reasoning tasks. Covers adaptive thinking, effort parameters, budget tokens, and practical API examples.
This guide explains how to enable and optimize Claude's extended thinking capability, including adaptive thinking with the effort parameter, manual budget tokens, and best practices for complex reasoning tasks.
Claude's extended thinking feature unlocks enhanced reasoning capabilities for complex tasks, giving the model room to "think through" problems step-by-step before delivering a final answer. Whether you're building a research assistant, a code analysis tool, or a multi-step reasoning agent, understanding how to configure and use extended thinking is essential.
This guide covers everything from the basics of enabling extended thinking to advanced configuration with adaptive thinking and effort parameters. You'll learn practical API patterns, model-specific behavior, and best practices to get the most out of Claude's reasoning abilities.
## What Is Extended Thinking?
Extended thinking allows Claude to generate internal reasoning content blocks before producing its final response. These thinking blocks contain the model's step-by-step analysis, which it then uses to craft a more accurate and well-reasoned answer.
When extended thinking is enabled, the API response includes one or more thinking content blocks followed by text content blocks. The thinking blocks contain Claude's internal reasoning and a cryptographic signature for verification.
Here's what a typical response looks like:
```json
{
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me analyze this step by step...",
      "signature": "WaUjzkypQ2mUEVM36O2TxuC06KN8xyfbJwyem2dw3URve/op91XWHOEBLLqIOMfFG/UvLEczmEsUjavL...."
    },
    {
      "type": "text",
      "text": "Based on my analysis..."
    }
  ]
}
```
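To make the response shape concrete, here is a minimal sketch that separates reasoning from the final answer. It assumes the content list has already been converted to plain dictionaries shaped like the JSON above; `split_content_blocks` is a hypothetical helper name, not part of any SDK, and the `"..."` signature is a placeholder.

```python
from typing import Any


def split_content_blocks(content: list[dict[str, Any]]) -> tuple[str, str]:
    """Return (thinking_text, answer_text) from a response's content list."""
    thinking_parts: list[str] = []
    text_parts: list[str] = []
    for block in content:
        block_type = block.get("type")
        if block_type == "thinking":
            thinking_parts.append(block.get("thinking", ""))
        elif block_type == "text":
            text_parts.append(block.get("text", ""))
        # Any other block type is ignored in this sketch.
    return "\n".join(thinking_parts), "".join(text_parts)


# Example data mirroring the response above (signature is a placeholder).
example_content = [
    {"type": "thinking", "thinking": "Let me analyze this step by step...", "signature": "..."},
    {"type": "text", "text": "Based on my analysis..."},
]
thinking, answer = split_content_blocks(example_content)
print(answer)
```

Keeping the two streams separate makes it easier to log reasoning for debugging while showing end users only the answer text.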
## Adaptive Thinking vs. Manual Extended Thinking
Claude offers two modes for extended thinking: adaptive thinking (recommended) and manual extended thinking (deprecated on most models).
### Adaptive Thinking (Recommended)
Adaptive thinking lets Claude automatically decide how much reasoning to apply based on the complexity of the task. You control the reasoning depth using the effort parameter, which accepts values from 0.0 (minimum reasoning) to 1.0 (maximum reasoning).
- Automatically adjusts reasoning depth per request
- No need to guess a token budget
- Supported on Claude Opus 4.7, Opus 4.6, and Sonnet 4.6
- Future-proof: manual mode is being phased out
### Manual Extended Thinking (Deprecated)
Manual extended thinking requires you to specify a `budget_tokens` value, which sets a hard limit on the number of tokens Claude can use for reasoning. This mode is rejected on Claude Opus 4.7 (requests return a 400 error), is deprecated on Claude Opus 4.6 and Sonnet 4.6, and will be removed from other models in future releases.
## Model-Specific Behavior
Different Claude models handle extended thinking differently. Here's what you need to know:
| Model | Adaptive Thinking | Manual Thinking | Notes |
|---|---|---|---|
| Claude Opus 4.7 | ✅ Required | ❌ Returns 400 error | Use thinking: {type: "adaptive"} with effort |
| Claude Opus 4.6 | ✅ Recommended | ✅ Deprecated but functional | Migrate to adaptive thinking |
| Claude Sonnet 4.6 | ✅ Recommended | ✅ Deprecated, interleaved mode | Migrate to adaptive thinking |
| Claude Mythos Preview | ✅ Default | ✅ Accepted | thinking: {type: "disabled"} not supported; display defaults to "omitted" |
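One way to keep the table's rules out of calling code is a small helper that builds the `thinking` dictionary and refuses combinations the table marks as errors. This is a sketch only: `build_thinking_config` and `MANUAL_REJECTED` are hypothetical names, the model ID string follows the IDs used in this guide's examples, and the dictionary shape is the one this guide describes rather than a verified API contract.

```python
# Models that the table above lists as returning a 400 error for manual mode.
# The ID string is an assumption taken from this guide's examples.
MANUAL_REJECTED = {"claude-opus-4-7"}


def build_thinking_config(model, effort=None, budget_tokens=None):
    """Build a `thinking` dict in the shape this guide describes."""
    if budget_tokens is not None:
        if effort is not None:
            raise ValueError("pass either effort or budget_tokens, not both")
        if model in MANUAL_REJECTED:
            raise ValueError(f"{model} rejects manual thinking; use adaptive thinking")
        return {"type": "enabled", "budget_tokens": budget_tokens}

    config = {"type": "adaptive"}
    if effort is not None:
        if not 0.0 <= effort <= 1.0:
            raise ValueError("effort must be between 0.0 and 1.0")
        config["effort"] = effort
    return config


print(build_thinking_config("claude-opus-4-7", effort=0.8))
```

Failing locally with a clear message is usually cheaper than discovering the same problem as a 400 response in production.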
## How to Enable Extended Thinking
### Using Adaptive Thinking (Python)
```python
import anthropic

client = anthropic.Anthropic()

# Code sample to review, kept in its own variable for readability.
code_sample = (
    "def authenticate(user, password):\n"
    "    if user == 'admin' and password == 'secret123':\n"
    "        return True\n"
    "    return False\n"
)

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={
        "type": "adaptive",
        "effort": 0.8,  # 0.0 (min) to 1.0 (max)
    },
    messages=[
        {
            "role": "user",
            "content": (
                "Analyze the following code for potential security "
                "vulnerabilities and suggest fixes:\n\n" + code_sample
            ),
        }
    ],
)

# Access thinking and text content
for block in response.content:
    if block.type == "thinking":
        print("Thinking:", block.thinking[:200] + "...")
    elif block.type == "text":
        print("Response:", block.text)
```
### Using Manual Extended Thinking (Python - Legacy)
```python
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=20000,  # Leaves room for the 16,000-token budget plus the answer
    thinking={
        "type": "enabled",
        "budget_tokens": 16000,  # Max tokens for reasoning
    },
    messages=[
        {
            "role": "user",
            "content": "Explain the P vs NP problem in computer science, including its implications for cryptography.",
        }
    ],
)
```
### Using Adaptive Thinking (TypeScript)
```typescript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 4096,
  thinking: {
    type: 'adaptive',
    effort: 0.9
  },
  messages: [
    {
      role: 'user',
      content: 'Design a distributed caching system that handles cache invalidation across multiple regions. Consider consistency, latency, and failure modes.'
    }
  ]
});

for (const block of response.content) {
  if (block.type === 'thinking') {
    console.log('Thinking:', block.thinking);
  } else if (block.type === 'text') {
    console.log('Response:', block.text);
  }
}
```
## Understanding the Effort Parameter
The `effort` parameter in adaptive thinking gives you fine-grained control over reasoning depth:
- 0.0: Minimal reasoning – Claude skips most internal deliberation and produces a quick response. Best for simple, factual queries.
- 0.3–0.5: Moderate reasoning – Good balance for everyday complex tasks like code review or data analysis.
- 0.7–0.9: Deep reasoning – Ideal for research, mathematical proofs, or multi-step problem solving.
- 1.0: Maximum reasoning – Use for the most challenging problems where accuracy is critical and latency is acceptable.
Pro tip: Start with effort: 0.5 and increase if you need deeper analysis. Higher effort values increase response time and token usage.
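The bands above can be captured in a small lookup so that effort is chosen in one place rather than scattered across call sites. The following is a minimal sketch: the task labels, the specific values picked within each band, and the `escalate` step size are all illustrative assumptions.

```python
# Values picked from within the bands listed above; task labels are hypothetical.
EFFORT_BY_TASK = {
    "factual_lookup": 0.0,  # minimal reasoning
    "code_review": 0.4,     # moderate reasoning (0.3-0.5)
    "research": 0.8,        # deep reasoning (0.7-0.9)
    "critical": 1.0,        # maximum reasoning
}
DEFAULT_EFFORT = 0.5  # the starting point suggested in the pro tip


def choose_effort(task_type: str) -> float:
    """Look up an effort value, falling back to the default for unknown tasks."""
    return EFFORT_BY_TASK.get(task_type, DEFAULT_EFFORT)


def escalate(effort: float, step: float = 0.2) -> float:
    """Raise effort by one step, clamped to the 1.0 maximum."""
    return min(1.0, round(effort + step, 2))


print(choose_effort("code_review"), escalate(DEFAULT_EFFORT))
```

A retry loop could call `escalate` when an answer fails a validation check, which follows the "start at 0.5 and increase" advice without hard-coding high effort everywhere.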
## Best Practices for Extended Thinking
### 1. Set Appropriate Max Tokens
Extended thinking consumes tokens from your `max_tokens` budget. Ensure `max_tokens` is large enough to accommodate both thinking and the final response. A good rule of thumb:
`max_tokens = budget_tokens (or effort-equivalent) + expected_response_tokens`
For complex tasks, start with max_tokens: 8192 and adjust based on observed usage.
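As a worked instance of the rule of thumb, the sketch below adds a reasoning allowance to an expected answer length and checks whether a given `max_tokens` covers both. The function names are hypothetical, and for adaptive thinking the reasoning figure is an estimate you would take from observed usage.

```python
def required_max_tokens(thinking_tokens: int, expected_response_tokens: int) -> int:
    """Apply the rule of thumb: reasoning allowance plus expected answer length."""
    if thinking_tokens < 0 or expected_response_tokens <= 0:
        raise ValueError("token counts must be positive (thinking may be zero)")
    return thinking_tokens + expected_response_tokens


def fits_budget(max_tokens: int, thinking_tokens: int, expected_response_tokens: int) -> bool:
    """Check whether max_tokens leaves room for both reasoning and the answer."""
    return max_tokens >= required_max_tokens(thinking_tokens, expected_response_tokens)


# A 16,000-token reasoning budget with a 4,000-token answer needs 20,000 tokens.
print(required_max_tokens(16000, 4000))
```

Running a check like `fits_budget` before sending a request catches configurations where the reasoning allowance alone would exhaust `max_tokens`.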
### 2. Handle Thinking Blocks in Streaming
When streaming responses, thinking blocks appear before text blocks. Process them accordingly:
```python
import anthropic

client = anthropic.Anthropic()

with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={"type": "adaptive", "effort": 0.7},
    messages=[{"role": "user", "content": "Solve this complex math problem..."}],
) as stream:
    for event in stream:
        if event.type == "content_block_delta" and event.delta.type == "thinking_delta":
            # Accumulate thinking content
            pass
        elif event.type == "content_block_delta" and event.delta.type == "text_delta":
            # Output final response
            print(event.delta.text, end="")
```
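The accumulation step left as a placeholder above can be sketched as a pure function that runs without a network call. It assumes events expose `type` and `delta` attributes as in the loop above, and that thinking deltas carry their text in a `thinking` attribute. `collect_stream` and the stand-in events are hypothetical and exist only to make the example self-contained.

```python
from types import SimpleNamespace


def collect_stream(events):
    """Accumulate thinking and text deltas separately from an event iterable."""
    thinking_parts, text_parts = [], []
    for event in events:
        if getattr(event, "type", None) != "content_block_delta":
            continue  # skip start/stop and other bookkeeping events
        delta = event.delta
        if delta.type == "thinking_delta":
            thinking_parts.append(delta.thinking)
        elif delta.type == "text_delta":
            text_parts.append(delta.text)
    return "".join(thinking_parts), "".join(text_parts)


def _delta(kind, **fields):
    # Stand-in for an SDK event; a real stream yields richer objects.
    return SimpleNamespace(
        type="content_block_delta",
        delta=SimpleNamespace(type=kind, **fields),
    )


fake_events = [
    SimpleNamespace(type="message_start"),
    _delta("thinking_delta", thinking="First, "),
    _delta("thinking_delta", thinking="factor the expression."),
    _delta("text_delta", text="The answer "),
    _delta("text_delta", text="is 42."),
]
streamed_thinking, streamed_answer = collect_stream(fake_events)
print(streamed_answer)
```

Separating accumulation from I/O this way lets the same logic be unit tested with fake events and reused inside the real streaming loop.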
### 3. Use for Multi-Step Reasoning Tasks
Extended thinking shines in tasks that require:
- Complex mathematical calculations
- Multi-step code analysis and debugging
- Research synthesis from multiple sources
- Strategic planning and decision trees
- Legal or regulatory analysis
### 4. Monitor Token Usage
Extended thinking can significantly increase token consumption. Monitor your usage and adjust effort/budget accordingly. For production systems, consider:
- Setting lower effort values for simple queries
- Caching responses for repeated questions
- Using prompt caching to reduce costs on long context windows
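The "caching responses for repeated questions" idea can be sketched with an in-memory dictionary keyed on the request parameters. This is an illustration under stated assumptions: `ResponseCache` is a hypothetical class, `fake_call` stands in for a real API call, and a production cache would also need expiry and size limits.

```python
import hashlib
import json


class ResponseCache:
    """Tiny in-memory cache keyed on model, prompt, and effort."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def _key(model, prompt, effort):
        payload = json.dumps(
            {"model": model, "prompt": prompt, "effort": effort}, sort_keys=True
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_or_call(self, model, prompt, effort, call):
        """Return a cached answer, or invoke `call` once and remember the result."""
        key = self._key(model, prompt, effort)
        if key not in self._store:
            self._store[key] = call(model, prompt, effort)
        return self._store[key]


calls = []


def fake_call(model, prompt, effort):
    # Stand-in for a real request; records how often it is invoked.
    calls.append(prompt)
    return f"answer to: {prompt}"


cache = ResponseCache()
first = cache.get_or_call("claude-opus-4-7", "What is 2 + 2?", 0.0, fake_call)
second = cache.get_or_call("claude-opus-4-7", "What is 2 + 2?", 0.0, fake_call)
print(len(calls))
```

Including effort in the key keeps a quick low-effort answer from being served where a deeper analysis was requested.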
## Common Pitfalls to Avoid
- Setting budget_tokens too low: If the budget is too small, Claude may cut off reasoning prematurely, leading to incomplete or inaccurate responses.
- Forgetting to update max_tokens: Extended thinking tokens count toward your max_tokens limit. If max_tokens is too low, the response may be truncated.
- Using manual mode on Opus 4.7: This returns a 400 error. Always use adaptive thinking with the effort parameter.
- Ignoring the signature: The `signature` field in thinking blocks is essential for verifying the integrity of Claude's reasoning, especially in regulated environments. Keep it unmodified whenever you store thinking blocks or send them back in a later turn.
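For the signature pitfall, one defensive pattern is to append a response's blocks to the conversation history verbatim, and to fail fast if a thinking block has lost its signature along the way. The sketch below assumes blocks are plain dictionaries and that the history is a list of role/content messages; `append_assistant_turn` is a hypothetical helper, and when thinking blocks actually have to be resent is governed by the API's own rules, which this guide does not spell out.

```python
import copy


def append_assistant_turn(messages, content_blocks):
    """Return a new history with the assistant turn appended, blocks untouched."""
    for block in content_blocks:
        if block.get("type") == "thinking" and not block.get("signature"):
            raise ValueError("thinking block is missing its signature")
    # Deep-copy so later edits to the caller's blocks cannot alter the history.
    return messages + [{"role": "assistant", "content": copy.deepcopy(content_blocks)}]


history = [{"role": "user", "content": "Review this function."}]
blocks = [
    {"type": "thinking", "thinking": "Let me analyze this...", "signature": "placeholder-signature"},
    {"type": "text", "text": "Here are the issues I found."},
]
new_history = append_assistant_turn(history, blocks)
print(len(new_history))
```

The function returns a new list rather than mutating its input, so a failed validation never leaves the history half-updated.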
## Key Takeaways
- Adaptive thinking (`thinking: {type: "adaptive"}`) is the recommended approach for all current Claude models, with the `effort` parameter (0.0–1.0) controlling reasoning depth.
- Manual extended thinking (`thinking: {type: "enabled", budget_tokens: N}`) returns a 400 error on Claude Opus 4.7 and will be removed from other models in future releases.
- Model behavior varies: Claude Opus 4.7 requires adaptive thinking, while Claude Mythos Preview defaults to it. Always check the model's documentation.
- Set `max_tokens` generously to accommodate both thinking and response tokens. A good starting point is 8192 tokens for complex tasks.