Mastering Claude's Extended Thinking: A Complete Guide to Adaptive Reasoning
This guide teaches you how to enable and optimize Claude's extended thinking feature for complex reasoning tasks. You'll learn about adaptive thinking with the effort parameter, manual mode for legacy models, and how to handle thinking blocks in API responses.
Claude's extended thinking capability is one of its most powerful features—it allows the model to engage in deep, step-by-step reasoning before delivering a final answer. Whether you're solving complex mathematical proofs, analyzing intricate codebases, or conducting multi-step research, extended thinking gives Claude the cognitive runway it needs to produce more accurate and thoughtful responses.
In this guide, you'll learn how to configure and use extended thinking effectively, understand the differences between adaptive and manual modes, and see practical code examples for the Claude API.
Understanding Extended Thinking
Extended thinking works by creating thinking content blocks in Claude's response. These blocks contain the model's internal reasoning process, followed by the final text response. This transparency allows you to see how Claude arrived at its conclusions—not just what it concluded.
Here's what a typical extended thinking response looks like:
```json
{
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me analyze this step by step...",
      "signature": "WaUjzkypQ2mUEVM36O2TxuC06KN8xyfbJwyem2dw3URve/op91XWHOEBLLqIOMfFG/UvLEczmEsUjavL...."
    },
    {
      "type": "text",
      "text": "Based on my analysis..."
    }
  ]
}
```
The thinking block includes a cryptographic signature that verifies the integrity of the thinking content—useful for auditing and trust verification.
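The block structure shown above is straightforward to parse. A minimal sketch in Python, working on a response already decoded into a dict (e.g., via `json.loads`); the signature value here is a placeholder, not a real signature:

```python
import json

def split_blocks(response: dict) -> tuple[list[dict], str]:
    """Separate thinking blocks from the final text in a decoded response."""
    thinking = [b for b in response["content"] if b["type"] == "thinking"]
    text = "".join(b["text"] for b in response["content"] if b["type"] == "text")
    return thinking, text

raw = """{"content": [
    {"type": "thinking", "thinking": "Let me analyze this step by step...",
     "signature": "abc123"},
    {"type": "text", "text": "Based on my analysis..."}
]}"""

thinking, answer = split_blocks(json.loads(raw))
```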
Adaptive Thinking vs. Manual Extended Thinking
Claude offers two modes for extended thinking, and the right choice depends on your model version and use case.
Adaptive Thinking (Recommended for Claude Opus 4.7+)
Adaptive thinking is the modern approach, introduced with Claude Opus 4.7 and later models. Instead of setting a fixed token budget, you use the effort parameter to tell Claude how much reasoning effort to apply.
- Uses `thinking: {type: "adaptive"}`
- Requires the `effort` parameter (values: `"low"`, `"medium"`, `"high"`)
- Claude dynamically allocates thinking tokens based on task complexity
- Simpler tasks use fewer tokens; complex tasks get more
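In request terms, adaptive mode is just a small dict in the `thinking` parameter. A sketch of a helper that builds and validates it, based on the field names and values described in this guide:

```python
def make_adaptive_thinking(effort: str) -> dict:
    """Build the `thinking` request field for adaptive mode."""
    if effort not in ("low", "medium", "high"):
        raise ValueError(f"invalid effort level: {effort!r}")
    return {"type": "adaptive", "effort": effort}
```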
Manual Extended Thinking (Legacy)
Manual extended thinking uses a fixed token budget:
```json
{
  "thinking": {
    "type": "enabled",
    "budget_tokens": 10000
  }
}
```
Important: Manual mode is no longer supported on Claude Opus 4.7 and later models (returns a 400 error). It remains functional but deprecated on Claude Opus 4.6 and Claude Sonnet 4.6.
Model Compatibility Matrix
| Model | Adaptive Thinking | Manual Mode | Notes |
|---|---|---|---|
| Claude Opus 4.7+ | ✅ Required | ❌ Returns 400 error | Use `effort` parameter |
| Claude Mythos Preview | ✅ Default | ✅ Accepted | `disabled` not supported; use `display: "summarized"` for summaries |
| Claude Opus 4.6 | ✅ Recommended | ✅ Deprecated | Will be removed in future |
| Claude Sonnet 4.6 | ✅ Recommended | ✅ Deprecated | Uses interleaved mode |
| Claude Sonnet 3.7 | ❌ | ✅ Supported | Legacy behavior |
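One way to apply the matrix in code is a small helper that picks a `thinking` config from the model name. The model-name prefixes below are illustrative assumptions; match them to the identifiers you actually deploy:

```python
def thinking_config(model: str, budget_tokens: int = 10000) -> dict:
    """Choose a thinking configuration based on the model (heuristic sketch)."""
    if model.startswith("claude-opus-4-7"):
        # Manual mode returns a 400 error on these models; adaptive is required.
        return {"type": "adaptive", "effort": "medium"}
    if model.startswith("claude-sonnet-3-7"):
        # No adaptive support; use the legacy fixed budget.
        return {"type": "enabled", "budget_tokens": budget_tokens}
    # Opus 4.6 / Sonnet 4.6: both modes work, adaptive is recommended.
    return {"type": "adaptive", "effort": "medium"}
```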
How to Use Extended Thinking in the API
Basic Setup with Adaptive Thinking
Here's how to enable adaptive thinking with the effort parameter:
Python Example:

```python
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=32000,
    thinking={
        "type": "adaptive",
        "effort": "high"
    },
    messages=[
        {
            "role": "user",
            "content": "Prove that there are infinitely many prime numbers congruent to 3 mod 4."
        }
    ]
)

# Process the response
for block in response.content:
    if block.type == "thinking":
        print(f"Thinking: {block.thinking[:200]}...")
        print(f"Signature: {block.signature}")
    elif block.type == "text":
        print(f"Final answer: {block.text}")
```
TypeScript Example:

```typescript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 32000,
  thinking: {
    type: 'adaptive',
    effort: 'high'
  },
  messages: [
    {
      role: 'user',
      content: 'Prove that there are infinitely many prime numbers congruent to 3 mod 4.'
    }
  ]
});

for (const block of response.content) {
  if (block.type === 'thinking') {
    console.log(`Thinking: ${block.thinking.substring(0, 200)}...`);
    console.log(`Signature: ${block.signature}`);
  } else if (block.type === 'text') {
    console.log(`Final answer: ${block.text}`);
  }
}
```
Using Manual Mode (Legacy Models)
For models that still support manual mode:
```python
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=16000,
    thinking={
        "type": "enabled",
        "budget_tokens": 10000
    },
    messages=[
        {
            "role": "user",
            "content": "Explain the P vs NP problem in detail."
        }
    ]
)
```
Choosing the Right Effort Level
The `effort` parameter in adaptive thinking gives you fine-grained control:

- `"low"`: Minimal reasoning overhead. Best for straightforward tasks where you want quick responses with basic verification.
- `"medium"`: Balanced reasoning. Good for most complex tasks like code review, data analysis, or multi-step logic.
- `"high"`: Maximum reasoning depth. Use for mathematical proofs, complex debugging, or tasks requiring thorough analysis.

Start with `"medium"` and escalate to `"high"` only when you need deeper reasoning. Higher effort consumes more tokens and increases latency.
Handling Thinking Blocks in Responses
When processing responses, you'll need to handle the thinking blocks appropriately:
```python
def process_claude_response(response):
    thinking_content = []
    final_text = []

    for block in response.content:
        if block.type == "thinking":
            thinking_content.append({
                "thinking": block.thinking,
                "signature": block.signature
            })
        elif block.type == "text":
            final_text.append(block.text)

    return {
        "thinking_blocks": thinking_content,
        "final_answer": "".join(final_text)
    }
```
Best Practices
1. Set Appropriate max_tokens
Always set `max_tokens` higher than your thinking budget. A good rule of thumb:

```
max_tokens = thinking_budget + expected_output_tokens
```

For adaptive thinking, set `max_tokens` generously (e.g., 32000 for complex tasks).
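The rule of thumb above is simple arithmetic; a sketch for manual-mode budgeting:

```python
def required_max_tokens(thinking_budget: int, expected_output: int) -> int:
    """max_tokens must cover both the thinking budget and the visible answer."""
    return thinking_budget + expected_output

# e.g., a 10k-token thinking budget plus ~2k tokens of final prose
print(required_max_tokens(10000, 2000))  # → 12000
```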
2. Use Streaming for Long Responses
Extended thinking can produce lengthy reasoning. Enable streaming to get partial results faster:
```python
stream = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=32000,
    thinking={"type": "adaptive", "effort": "high"},
    messages=[{"role": "user", "content": "Complex question..."}],
    stream=True
)

for event in stream:
    # Handle streaming events
    pass
```
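Handling those events typically means accumulating deltas by type. The event shapes below (`content_block_delta` carrying `thinking_delta` or `text_delta` payloads) follow the Anthropic streaming API, but treat the exact fields as an assumption and check your SDK version; mock dict events stand in for real ones here:

```python
def accumulate(events) -> dict:
    """Collect streamed thinking and text deltas into full strings."""
    out = {"thinking": "", "text": ""}
    for event in events:
        if event.get("type") != "content_block_delta":
            continue
        delta = event["delta"]
        if delta["type"] == "thinking_delta":
            out["thinking"] += delta["thinking"]
        elif delta["type"] == "text_delta":
            out["text"] += delta["text"]
    return out

# Mock events standing in for a real stream
mock_events = [
    {"type": "content_block_start"},
    {"type": "content_block_delta", "delta": {"type": "thinking_delta", "thinking": "Step 1..."}},
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "Answer."}},
]
```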
3. Validate Signatures for Critical Applications
For applications requiring audit trails (e.g., financial analysis, legal reasoning), verify the thinking block signatures:
```python
# Store signatures for later verification
signatures = [
    block.signature
    for block in response.content
    if block.type == "thinking"
]
```
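For an audit trail you might pair each signature with a digest of the thinking text it covers, so records can be cross-checked later. A minimal sketch; the record layout is an assumption, not a prescribed format:

```python
import hashlib
import time

def audit_record(thinking: str, signature: str) -> dict:
    """Store the signature alongside a content hash for later cross-checking."""
    return {
        "sha256": hashlib.sha256(thinking.encode("utf-8")).hexdigest(),
        "signature": signature,
        "recorded_at": time.time(),
    }
```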
4. Combine with Structured Outputs
Extended thinking pairs well with structured outputs for complex data extraction:
```python
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=32000,
    thinking={"type": "adaptive", "effort": "high"},
    messages=[{"role": "user", "content": "Analyze this financial report..."}],
    # Structured output configuration
)
```
Common Pitfalls to Avoid
- Using manual mode on Opus 4.7+: This returns a 400 error. Always use adaptive thinking for new models.
- Setting `budget_tokens` too low: Claude may cut off reasoning prematurely. For manual mode, use at least 50% of `max_tokens`.
- Ignoring the signature: For production systems, always validate signatures to ensure thinking integrity.
- Forgetting `max_tokens`: Extended thinking requires sufficient token headroom. Always set `max_tokens` higher than your thinking budget.
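Most of these pitfalls can be caught before a request ever leaves your code. A sketch validator encoding the rules above (the model-name check is illustrative):

```python
def validate_thinking_request(model: str, thinking: dict, max_tokens: int) -> list:
    """Return a list of problems with a thinking configuration (empty = OK)."""
    problems = []
    manual = thinking.get("type") == "enabled"
    if manual and model.startswith("claude-opus-4-7"):
        problems.append("manual mode on Opus 4.7+ returns a 400 error")
    if manual:
        budget = thinking.get("budget_tokens", 0)
        if budget >= max_tokens:
            problems.append("max_tokens must exceed budget_tokens")
        elif budget < max_tokens // 2:
            problems.append("budget_tokens below ~50% of max_tokens may truncate reasoning")
    return problems
```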
Real-World Use Cases
Extended thinking excels in scenarios requiring deep reasoning:
- Mathematical proofs and theorem verification
- Complex code debugging and optimization
- Multi-step research synthesis
- Legal document analysis
- Scientific hypothesis generation
- Strategic planning and decision trees
Key Takeaways
- Adaptive thinking (`type: "adaptive"` with the `effort` parameter) is the recommended approach for Claude Opus 4.7+ and newer models; manual mode is deprecated on these versions.
- Choose effort levels wisely: Use `"low"` for simple tasks, `"medium"` for most complex work, and `"high"` only when maximum reasoning depth is required.
- Always set `max_tokens` generously to give Claude enough room for both thinking and final output; insufficient token allocation is a common source of errors.
- Handle thinking blocks explicitly in your code to extract reasoning content and signatures for auditing or transparency purposes.
- Model compatibility matters: Check the compatibility matrix before implementing. Older models may still use manual mode, while newer ones require adaptive thinking.