Mastering Extended Thinking in Claude: A Complete Guide to Adaptive and Manual Modes
Learn how to enable and optimize Claude's extended thinking feature for complex reasoning tasks. Covers adaptive thinking, manual mode, effort parameters, and code examples for the Messages API.
Extended thinking gives Claude enhanced reasoning capabilities by exposing its step-by-step thought process. Use adaptive thinking (type: "adaptive") with an effort parameter for Claude Opus 4.7+, or manual mode (type: "enabled") with budget_tokens for older models. This guide covers setup, best practices, and code examples.
Introduction
Claude's extended thinking feature is a game-changer for developers and power users who need deep, transparent reasoning from the model. When enabled, Claude outputs its internal thought process as separate thinking content blocks before delivering its final answer. This not only improves accuracy on complex tasks but also gives you visibility into how Claude arrives at its conclusions.
Whether you're building a research assistant, a code analysis tool, or a multi-step reasoning agent, mastering extended thinking will help you get the most out of Claude.
How Extended Thinking Works
When extended thinking is turned on, the API response includes one or more thinking content blocks followed by a text content block. Each thinking block contains Claude's internal reasoning, a cryptographic signature for verification, and optionally a summary.
Here's a simplified response structure:
{
"content": [
{
"type": "thinking",
"thinking": "Let me analyze this step by step...",
"signature": "WaUjzkypQ2mUEVM36O2TxuC06KN8xyfbJwyem2dw3URve/op91XWHOEBLLqIOMfFG/UvLEczmEsUjavL...."
},
{
"type": "text",
"text": "Based on my analysis..."
}
]
}
Adaptive Thinking (Recommended for Claude Opus 4.7+)
For Claude Opus 4.7 and later models, Anthropic has introduced adaptive thinking. Instead of setting a fixed token budget, you specify an effort level, and Claude dynamically allocates thinking tokens as needed.
Effort Parameter
The effort parameter accepts three values:
"low"– Minimal thinking, fast responses"medium"– Balanced reasoning"high"– Maximum reasoning depth
API Example (Python)
import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=4096,
thinking={
"type": "adaptive",
"effort": "high"
},
messages=[
{
"role": "user",
"content": "Analyze the ethical implications of autonomous vehicles in urban environments. Consider safety, privacy, and economic factors."
}
]
)
Print thinking blocks
for block in response.content:
if block.type == "thinking":
print("Thinking:", block.thinking)
elif block.type == "text":
print("Final answer:", block.text)
API Example (TypeScript)
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
model: 'claude-opus-4-7',
max_tokens: 4096,
thinking: {
type: 'adaptive',
effort: 'high'
},
messages: [
{
role: 'user',
content: 'Analyze the ethical implications of autonomous vehicles in urban environments. Consider safety, privacy, and economic factors.'
}
]
});
for (const block of response.content) {
if (block.type === 'thinking') {
console.log('Thinking:', block.thinking);
} else if (block.type === 'text') {
console.log('Final answer:', block.text);
}
}
Manual Extended Thinking (Legacy Models)
For Claude Opus 4.6 and Claude Sonnet 4.6, you can still use manual extended thinking with a fixed token budget. However, this mode is deprecated and will be removed in a future release. Anthropic strongly recommends migrating to adaptive thinking.
API Example (Manual Mode)
response = client.messages.create(
model="claude-opus-4-6",
max_tokens=8192,
thinking={
"type": "enabled",
"budget_tokens": 4096 # Maximum tokens for thinking
},
messages=[
{
"role": "user",
"content": "Solve this complex math problem step by step: ∫(0 to π) sin²(x) dx"
}
]
)
Important: budget_tokens must be less than max_tokens. The thinking budget defines how many tokens Claude can use for reasoning before generating the final answer.
Model-Specific Behavior
| Model | Recommended Mode | Notes |
|---|---|---|
| Claude Opus 4.7+ | Adaptive (type: "adaptive") | Manual mode returns 400 error |
| Claude Opus 4.6 | Adaptive (preferred) | Manual mode deprecated but functional |
| Claude Sonnet 4.6 | Adaptive (preferred) | Manual mode deprecated; interleaved mode available |
| Claude Mythos Preview | Adaptive (default) | type: "enabled" also accepted; type: "disabled" not supported |
Task Budgets (Beta)
For advanced use cases, you can set a task budget that limits the total tokens across both thinking and final response. This is useful when you need predictable latency and cost.
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=4096,
thinking={
"type": "adaptive",
"effort": "high",
"budget_tokens": 2048 # Optional: limit total thinking tokens
},
messages=[...]
)
Fast Mode (Research Preview)
Fast mode is a beta feature that reduces thinking time for simpler tasks while maintaining quality. It's ideal for applications where latency matters more than maximum reasoning depth.
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=4096,
thinking={
"type": "adaptive",
"effort": "low",
"fast_mode": True # Enable fast mode
},
messages=[...]
)
Best Practices
1. Choose the Right Effort Level
- Low effort: Simple Q&A, quick lookups, classification tasks
- Medium effort: Balanced analysis, code review, summarization
- High effort: Complex reasoning, multi-step math, ethical analysis, strategic planning
2. Set Appropriate max_tokens
Always setmax_tokens generously when using extended thinking. The thinking process consumes tokens, so you need headroom for both reasoning and the final answer. A good rule of thumb is max_tokens = budget_tokens * 2.
3. Handle Thinking Blocks in Your Application
When streaming responses, thinking blocks appear before text blocks. Make sure your UI can handle this ordering gracefully—for example, by showing a "thinking..." indicator while processing thinking blocks.4. Use Signatures for Verification
Each thinking block includes a cryptographic signature. You can use this to verify that the thinking content hasn't been tampered with, which is especially important for audit trails and compliance.5. Migrate from Manual to Adaptive
If you're currently using manual extended thinking (type: "enabled" with budget_tokens), plan to migrate to adaptive thinking (type: "adaptive" with effort). Manual mode is deprecated on Claude Opus 4.6 and Sonnet 4.6, and completely unsupported on Opus 4.7+.
Common Pitfalls
- Setting budget_tokens too low: Claude may cut off its reasoning prematurely, leading to incomplete or incorrect answers.
- Forgetting max_tokens: Without a sufficient
max_tokensvalue, Claude may stop before finishing its final answer. - Using manual mode on Opus 4.7+: This returns a 400 error. Always use adaptive thinking for newer models.
- Ignoring thinking blocks: The thinking content contains valuable context about Claude's reasoning process. Use it for debugging, transparency, or user-facing explanations.
Conclusion
Extended thinking is one of Claude's most powerful features, enabling deep, transparent reasoning for complex tasks. By understanding the differences between adaptive and manual modes, choosing the right effort level, and following best practices, you can build applications that leverage Claude's full reasoning capabilities.
Start with adaptive thinking on Claude Opus 4.7+ for the best balance of performance and simplicity. For legacy models, plan your migration to adaptive thinking before manual mode is fully removed.
Key Takeaways
- Adaptive thinking (
type: "adaptive"witheffort) is the recommended mode for Claude Opus 4.7+ and preferred for Opus 4.6 and Sonnet 4.6. - Manual extended thinking (
type: "enabled"withbudget_tokens) is deprecated on older models and returns a 400 error on Opus 4.7+. - Effort levels (
low,medium,high) let you control reasoning depth dynamically without setting a fixed token budget. - Task budgets and fast mode provide additional control over latency and token consumption.
- Always handle thinking blocks in your application code to provide a smooth user experience and leverage Claude's reasoning for debugging or transparency.