Mastering Extended Thinking in Claude: A Guide to Adaptive Thinking and Manual Modes
Learn how to use Claude's extended thinking feature for complex reasoning tasks. Covers adaptive thinking, manual mode, effort parameters, and code examples for the API.
This guide explains how to enable and configure Claude's extended thinking feature, including adaptive thinking with the effort parameter and manual mode. You'll learn the differences across model versions, how to handle thinking blocks in API responses, and best practices for complex reasoning tasks.
Mastering Extended Thinking in Claude: A Guide to Adaptive Thinking and Manual Modes
Claude's extended thinking feature unlocks enhanced reasoning capabilities for complex tasks. It allows the model to "think" step-by-step before delivering its final answer, providing varying levels of transparency into its internal thought process. This guide will walk you through everything you need to know to use extended thinking effectively, from basic setup to advanced configuration.
What Is Extended Thinking?
Extended thinking enables Claude to allocate additional computational resources to reasoning. When enabled, Claude produces thinking content blocks that contain its internal reasoning, followed by the final text response. This is particularly useful for:
- Mathematical proofs and logical deductions
- Complex code generation and debugging
- Multi-step analysis and planning
- Tasks requiring careful reasoning before answering
thinking blocks and text blocks. Here's what a typical response looks like:
{
"content": [
{
"type": "thinking",
"thinking": "Let me analyze this step by step...",
"signature": "WaUjzkypQ2mUEVM36O2TxuC06KN8xyfbJwyem2dw3URve/op91XWHOEBLLqIOMfFG/UvLEczmEsUjavL...."
},
{
"type": "text",
"text": "Based on my analysis..."
}
]
}
Adaptive Thinking vs. Manual Extended Thinking
Claude offers two modes for extended thinking: adaptive thinking (recommended) and manual extended thinking (deprecated on most models).
Adaptive Thinking (Recommended)
Adaptive thinking lets Claude automatically determine how much thinking is needed based on the complexity of the task. You control the effort level using the effort parameter, which accepts values like low, medium, and high.
- You want optimal performance without manual tuning
- You're using Claude Opus 4.7 (where manual mode is not supported)
- You want Claude to dynamically allocate thinking resources
Manual Extended Thinking
Manual extended thinking allows you to set a fixed token budget for thinking using the budget_tokens parameter. This mode is deprecated on Claude Opus 4.6 and Claude Sonnet 4.6, and not supported on Claude Opus 4.7.
- You need precise control over thinking token allocation
- You're working with older models that still support it
- You have specific token budget requirements
Model-Specific Behavior
Different Claude models handle extended thinking differently. Here's a quick reference:
| Model | Adaptive Thinking | Manual Mode | Notes |
|---|---|---|---|
| Claude Opus 4.7 | ✅ Recommended | ❌ Not supported | Returns 400 error if used |
| Claude Opus 4.6 | ✅ Recommended | ⚠️ Deprecated but functional | Manual mode still works |
| Claude Sonnet 4.6 | ✅ Recommended | ⚠️ Deprecated but functional | Interleaved mode deprecated |
| Claude Mythos Preview | ✅ Default | ✅ Accepted | disabled not supported |
How to Use Extended Thinking in the API
Basic Setup with Adaptive Thinking
Here's how to enable adaptive thinking with the effort parameter in Python:
import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=16000,
thinking={
"type": "adaptive",
"effort": "high" # Options: low, medium, high
},
messages=[
{
"role": "user",
"content": "Are there an infinite number of prime numbers such that n mod 4 == 3?"
}
]
)
Process the response
for block in response.content:
if block.type == "thinking":
print(f"\nThinking: {block.thinking}")
elif block.type == "text":
print(f"\nResponse: {block.text}")
Using Manual Extended Thinking (Older Models)
For models that still support manual mode, you can set a specific token budget:
import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=16000,
thinking={
"type": "enabled",
"budget_tokens": 10000 # Set your budget here
},
messages=[
{
"role": "user",
"content": "Explain the proof of Fermat's Last Theorem in simple terms."
}
]
)
for block in response.content:
if block.type == "thinking":
print(f"\nThinking summary: {block.thinking}")
elif block.type == "text":
print(f"\nResponse: {block.text}")
TypeScript Example
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
async function main() {
const response = await client.messages.create({
model: 'claude-opus-4-7',
max_tokens: 16000,
thinking: {
type: 'adaptive',
effort: 'high'
},
messages: [
{
role: 'user',
content: 'Design a distributed system for real-time chat.'
}
]
});
for (const block of response.content) {
if (block.type === 'thinking') {
console.log('Thinking:', block.thinking);
} else if (block.type === 'text') {
console.log('Response:', block.text);
}
}
}
main();
Best Practices for Extended Thinking
1. Choose the Right Effort Level
- Low effort: Use for simple tasks where quick reasoning is sufficient
- Medium effort: Good balance for most complex tasks
- High effort: Use for deep analytical problems, mathematical proofs, or multi-step reasoning
2. Set Appropriate max_tokens
Always set max_tokens higher than your thinking budget. A good rule of thumb:
max_tokens = budget_tokens + expected_output_tokens
For adaptive thinking, set max_tokens to accommodate both thinking and output.
3. Handle Thinking Blocks in Streaming
When streaming responses, thinking blocks appear before text blocks. Handle them appropriately:
import anthropic
client = anthropic.Anthropic()
with client.messages.stream(
model="claude-opus-4-7",
max_tokens=16000,
thinking={"type": "adaptive", "effort": "high"},
messages=[{"role": "user", "content": "Solve this complex equation..."}]
) as stream:
for event in stream:
if event.type == "content_block_delta" and event.delta.type == "thinking_delta":
print(event.delta.thinking, end="")
elif event.type == "content_block_delta" and event.delta.type == "text_delta":
print(event.delta.text, end="")
4. Monitor Token Usage
Extended thinking consumes tokens from your budget. Monitor usage to avoid unexpected costs:
response = client.messages.create(...)
print(f"Input tokens: {response.usage.input_tokens}")
print(f"Output tokens: {response.usage.output_tokens}")
5. Use with Structured Outputs
Extended thinking works well with structured outputs. You can request JSON or other formats in the thinking block:
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=16000,
thinking={"type": "adaptive", "effort": "high"},
messages=[
{
"role": "user",
"content": "Return a JSON object with the solution steps and final answer."
}
]
)
Common Pitfalls and How to Avoid Them
Pitfall 1: Using Manual Mode on Claude Opus 4.7
Problem: You get a 400 error. Solution: Switch to adaptive thinking with theeffort parameter.
Pitfall 2: Insufficient max_tokens
Problem: Claude stops thinking mid-reasoning. Solution: Increasemax_tokens to accommodate both thinking and output.
Pitfall 3: Ignoring Thinking Blocks
Problem: You miss valuable reasoning context. Solution: Always process thinking blocks in your response handling.Pitfall 4: Using disabled on Claude Mythos
Problem: The parameter is not supported.
Solution: Use summarized display mode instead.
Differences Across Model Versions
Extended thinking behavior varies across Claude model versions:
- Claude Opus 4.7: Only supports adaptive thinking. Manual mode returns a 400 error.
- Claude Opus 4.6: Adaptive thinking recommended; manual mode deprecated but functional.
- Claude Sonnet 4.6: Adaptive thinking recommended; manual mode with interleaved mode deprecated.
- Claude Mythos Preview: Adaptive thinking is the default;
disablednot supported; usedisplay: "summarized"for summaries.
Key Takeaways
- Use adaptive thinking (
thinking: {type: "adaptive", effort: "high|medium|low"}) for Claude Opus 4.7 and newer models; manual mode is deprecated or unsupported. - Set max_tokens appropriately to ensure Claude has enough room for both thinking and final output.
- Handle thinking blocks in your API response processing to capture the full reasoning context.
- Monitor token usage to manage costs, as extended thinking consumes additional tokens.
- Check model compatibility before implementing; behavior differs across Claude model versions.