Mastering Extended Thinking in Claude: A Complete Guide to Adaptive & Manual Thinking Modes
Learn how to use Claude's extended thinking feature for complex reasoning tasks. Covers adaptive thinking, manual mode, effort parameters, and API integration with code examples.
This guide explains how to enable and configure Claude's extended thinking for enhanced reasoning, including adaptive thinking (recommended for Opus 4.7) and manual mode (deprecated but functional on older models). You'll learn API patterns, effort parameters, and best practices.
Mastering Extended Thinking in Claude: A Complete Guide to Adaptive & Manual Thinking Modes
Claude's extended thinking feature unlocks deeper reasoning capabilities for complex tasks. Whether you're building a multi-step analysis tool, a code review assistant, or a research aid, extended thinking gives Claude the space to "think through" problems before delivering its final answer—and optionally shows you its step-by-step reasoning.
This guide covers everything you need to know: how extended thinking works, which modes to use for each model, how to configure it in the API, and practical tips for getting the most out of it.
What Is Extended Thinking?
When extended thinking is enabled, Claude creates internal thinking content blocks where it performs step-by-step reasoning. It then uses the insights from that reasoning to craft its final response. The API response includes both the thinking blocks (with varying levels of transparency) and the final text blocks.
This is especially useful for:
- Complex mathematical or logical problems
- Multi-step reasoning tasks (e.g., legal analysis, code debugging)
- Tasks requiring careful planning before execution
- Any scenario where you want to see how Claude arrived at its answer
Adaptive Thinking vs. Manual Extended Thinking
Claude now offers two approaches to extended thinking. The right choice depends on your model and use case.
Adaptive Thinking (Recommended)
Adaptive thinking (thinking: {type: "adaptive"}) lets Claude dynamically decide how much thinking to do based on the complexity of the task. You can optionally control the effort level. This is the modern, recommended approach.
Supported on: Claude Opus 4.7, Claude Opus 4.6, Claude Sonnet 4.6, and Claude Mythos Preview.
Manual Extended Thinking (Deprecated)
Manual extended thinking (thinking: {type: "enabled", budget_tokens: N}) lets you set a fixed token budget for thinking. This approach is deprecated and will be removed in future model releases. It is no longer accepted on Claude Opus 4.7 (returns a 400 error).
Still functional on: Claude Opus 4.6, Claude Sonnet 4.6, and Claude Mythos Preview (but deprecated).
Model-Specific Behavior
| Model | Adaptive Thinking | Manual Thinking | Notes |
|---|---|---|---|
| Claude Opus 4.7 | ✅ Recommended | ❌ Not supported (400 error) | Use effort parameter |
| Claude Opus 4.6 | ✅ Recommended | ⚠️ Deprecated but functional | Migrate to adaptive |
| Claude Sonnet 4.6 | ✅ Recommended | ⚠️ Deprecated but functional | Interleaved mode deprecated |
| Claude Mythos Preview | ✅ Default | ✅ Accepted | disabled not supported; use display: "summarized" |
How to Use Extended Thinking in the API
1. Adaptive Thinking with Effort (Claude Opus 4.7)
For Claude Opus 4.7, use adaptive thinking with the effort parameter. Effort controls how much reasoning Claude applies—higher effort means deeper thinking but longer response times.
import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=4096,
thinking={
"type": "adaptive",
"effort": "high" # Options: "low", "medium", "high"
},
messages=[
{"role": "user", "content": "Analyze the logical consistency of this argument: All men are mortal. Socrates is a man. Therefore, Socrates is immortal."}
]
)
Print the thinking content
for block in response.content:
if block.type == "thinking":
print("Thinking:", block.thinking)
elif block.type == "text":
print("Final answer:", block.text)
TypeScript Example:
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
async function main() {
const response = await client.messages.create({
model: 'claude-opus-4-7',
max_tokens: 4096,
thinking: {
type: 'adaptive',
effort: 'high'
},
messages: [
{ role: 'user', content: 'Analyze the logical consistency of this argument: All men are mortal. Socrates is a man. Therefore, Socrates is immortal.' }
]
});
for (const block of response.content) {
if (block.type === 'thinking') {
console.log('Thinking:', block.thinking);
} else if (block.type === 'text') {
console.log('Final answer:', block.text);
}
}
}
main();
2. Manual Extended Thinking (Deprecated, Older Models)
If you're using Claude Opus 4.6 or Sonnet 4.6 and haven't migrated yet, you can still set a manual token budget:
response = client.messages.create(
model="claude-opus-4-6",
max_tokens=8192,
thinking={
"type": "enabled",
"budget_tokens": 4096 # Max tokens for thinking
},
messages=[
{"role": "user", "content": "Solve this complex math problem step by step..."}
]
)
Important: Thebudget_tokensvalue must be less thanmax_tokens. It sets the upper limit on how many tokens Claude can use for internal reasoning.
3. Task Budgets (Beta)
For advanced use cases, you can set a task budget—a target for total tokens (thinking + output). This helps control costs and latency while still allowing adaptive thinking.
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=8192,
thinking={
"type": "adaptive",
"effort": "medium",
"task_budget_tokens": 4096 # Beta: target total tokens
},
messages=[
{"role": "user", "content": "Write a detailed analysis of quantum computing..."}
]
)
4. Fast Mode (Research Preview)
Fast mode reduces thinking time for simpler tasks while still using extended thinking. This is useful when you want reasoning but need faster responses.
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=4096,
thinking={
"type": "adaptive",
"effort": "low",
"fast_mode": True # Research preview
},
messages=[
{"role": "user", "content": "Summarize this article in 3 bullet points..."}
]
)
Understanding the Response Format
When extended thinking is enabled, the API response includes thinking content blocks before the final text block:
{
"content": [
{
"type": "thinking",
"thinking": "Let me analyze this step by step...",
"signature": "WaUjzkypQ2mUEVM36O2TxuC06KN8xyfbJwyem2dw3URve/op91XWHOEBLLqIOMfFG/UvLEczmEsUjavL...."
},
{
"type": "text",
"text": "Based on my analysis..."
}
]
}
The signature field is used internally for verification and can be ignored in most applications.
Best Practices
- Use adaptive thinking for new projects. Manual mode is deprecated and will break on future models.
- Match effort to task complexity. Use
"low"for simple Q&A,"medium"for standard analysis, and"high"for complex reasoning. - Set
max_tokensgenerously. Extended thinking consumes tokens internally; ensure yourmax_tokensis at least 2x your expected output length. - Handle thinking blocks in streaming. If streaming, thinking blocks appear as
content_block_deltaevents with typethinking_delta. - Test with task budgets (beta). If cost or latency is a concern, experiment with
task_budget_tokensto find the sweet spot.
Differences Across Model Versions
Thinking behavior varies by model version. Key differences:
- Claude Opus 4.7: Only supports adaptive thinking; manual mode returns 400 error.
- Claude Opus 4.6 & Sonnet 4.6: Adaptive recommended; manual still works but deprecated.
- Claude Mythos Preview: Adaptive is default;
disablednot supported; usedisplay: "summarized"for summarized thinking.
Key Takeaways
- Extended thinking gives Claude enhanced reasoning capabilities with transparent step-by-step thinking.
- Adaptive thinking (
type: "adaptive") is the recommended approach for all current models, especially Claude Opus 4.7. - Manual extended thinking (
type: "enabled"withbudget_tokens) is deprecated and will be removed in future releases. - Use the
effortparameter (low/medium/high) to control reasoning depth without manual token budgets. - Task budgets (beta) and fast mode (research preview) offer additional control over cost and latency.
- Always set
max_tokenshigher than your expected output to accommodate internal thinking tokens.