Mastering Extended Thinking in Claude: A Practical Guide to Adaptive & Manual Thinking Modes
This guide teaches you how to use Claude's extended thinking feature to enhance reasoning for complex tasks. You'll learn about adaptive thinking with effort parameters, manual token budgets, and how to implement them in the Messages API with practical code examples.
Introduction
Claude's extended thinking feature unlocks a new level of reasoning capability, allowing the model to work through complex problems step-by-step before delivering a final answer. Whether you're building a code analysis tool, a mathematical solver, or a multi-step decision engine, extended thinking gives Claude the "thinking time" it needs to produce more accurate and nuanced responses.
This guide covers everything you need to know: from the basics of enabling extended thinking to advanced configuration with adaptive thinking and effort parameters. You'll find practical code examples, model-specific behavior notes, and best practices to get the most out of this powerful feature.
How Extended Thinking Works
When extended thinking is enabled, Claude generates thinking content blocks in its response. These blocks contain the model's internal reasoning—its step-by-step thought process—before it produces the final text output.
Here's what a typical response looks like:
{
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me analyze this step by step...",
      "signature": "WaUjzkypQ2mUEVM36O2TxuC06KN8xyfbJwyem2dw3URve/op91XWHOEBLLqIOMfFG/UvLEczmEsUjavL...."
    },
    {
      "type": "text",
      "text": "Based on my analysis..."
    }
  ]
}
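Each thinking block includes a signature field, which the API uses to verify that the block was generated by Claude; if you send thinking blocks back in a later turn, pass them unchanged. The final answer appears in the text blocks that follow.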
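As a minimal sketch of working with this structure, the helper below separates the reasoning from the final answer in a response body that has already been parsed into a plain dictionary. The function name `split_blocks` and the sample payload are illustrative and not part of the SDK.

```python
from typing import Any


def split_blocks(response: dict[str, Any]) -> tuple[list[str], list[str]]:
    """Return (thinking_texts, answer_texts) from a parsed response body.

    Hypothetical helper: it only assumes the "content" layout shown above.
    """
    thinking: list[str] = []
    answers: list[str] = []
    for block in response.get("content", []):
        if block.get("type") == "thinking":
            thinking.append(block.get("thinking", ""))
        elif block.get("type") == "text":
            answers.append(block.get("text", ""))
        # Other block types (for example tool use) are ignored in this sketch.
    return thinking, answers


# Sample payload shaped like the response above; the signature is a placeholder.
sample = {
    "content": [
        {"type": "thinking", "thinking": "Let me analyze this step by step...", "signature": "..."},
        {"type": "text", "text": "Based on my analysis..."},
    ]
}

thinking_parts, answer_parts = split_blocks(sample)
print(thinking_parts)
print(answer_parts)
```

Keeping the two lists separate makes it easy to log the reasoning for debugging while showing only the answer to end users.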
Adaptive Thinking (Recommended for Claude 4 Models)
Adaptive thinking lets Claude decide how much to think on each request. Instead of setting a fixed token budget, you tell Claude how much effort to apply, and it adjusts its thinking depth to the task. It is the recommended mode on Claude Opus 4.6 and Claude Sonnet 4.6, and the only supported mode on Claude Opus 4.7 and later.
Using the Effort Parameter
The effort parameter takes a named level rather than a number, and is set in output_config alongside the thinking configuration:
- low: Quick reasoning for simple tasks
- medium: Balanced depth for most complex tasks
- high: Deep reasoning for very challenging problems
Some models accept additional levels above high; check the documentation for the model you are using.
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={"type": "adaptive"},
    output_config={"effort": "high"},  # High effort for complex reasoning
    messages=[
        {"role": "user", "content": "Solve this complex math problem: integrate ∫(x^2 * sin(x)) dx"}
    ],
)

# Access thinking blocks
for block in response.content:
    if block.type == "thinking":
        print("Thinking:", block.thinking)
    elif block.type == "text":
        print("Answer:", block.text)
TypeScript example:
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

async function main() {
  const response = await client.messages.create({
    model: 'claude-opus-4-7',
    max_tokens: 4096,
    thinking: { type: 'adaptive' },
    output_config: { effort: 'high' },
    messages: [
      { role: 'user', content: 'Analyze the pros and cons of quantum computing for cryptography.' }
    ]
  });

  // Thinking blocks come first, followed by the final text
  for (const block of response.content) {
    if (block.type === 'thinking') {
      console.log('Thinking:', block.thinking);
    } else if (block.type === 'text') {
      console.log('Answer:', block.text);
    }
  }
}

main();
Task Budgets (Beta)
For finer control, you can combine adaptive thinking with a task budget: a token allowance that tells Claude roughly how much it can spend on the task. This is useful when you want to keep costs in check while still benefiting from adaptive allocation.
response = client.beta.messages.create(
    model="claude-opus-4-7",
    max_tokens=8192,
    thinking={"type": "adaptive"},
    # Beta feature: the task_budget shape and the beta header value below are
    # assumptions; confirm both against the current API reference.
    output_config={
        "effort": "high",
        "task_budget": {"type": "tokens", "total": 20000},
    },
    betas=["task-budgets-2026-03-13"],
    messages=[
        {"role": "user", "content": "Write a comprehensive essay on the history of AI."}
    ],
)
Fast Mode (Beta Research Preview)
For scenarios where you need faster responses but still want some reasoning, fast mode speeds up output generation. This is ideal for real-time applications where latency matters.
response = client.beta.messages.create(
    model="claude-opus-4-7",
    max_tokens=2048,
    thinking={"type": "adaptive"},
    output_config={"effort": "medium"},
    # Research preview: the speed parameter and beta header value are
    # assumptions; confirm them and model availability in the API reference.
    speed="fast",
    betas=["fast-mode-2026-02-01"],
    messages=[
        {"role": "user", "content": "Summarize this article in 3 bullet points."}
    ],
)
Manual Extended Thinking (Legacy)
For Claude Opus 4.6 and Claude Sonnet 4.6, you can still use manual extended thinking with a fixed token budget. However, this approach is deprecated and will be removed in future model releases. Anthropic strongly recommends migrating to adaptive thinking.
Setting a Token Budget
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    thinking={
        "type": "enabled",
        "budget_tokens": 2000,  # Fixed thinking budget
    },
    messages=[
        {"role": "user", "content": "Debug this Python code: [code here]"}
    ],
)
Important: Manual extended thinking (type: "enabled") is not supported on Claude Opus 4.7 or later models. Attempting to use it will result in a 400 error.
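For models that still accept a manual budget, it can help to check the budget locally before sending the request. The sketch below assumes two constraints from the Messages API documentation: a minimum budget_tokens of 1024, and a budget smaller than max_tokens. The helper name `check_manual_budget` is hypothetical.

```python
MIN_BUDGET_TOKENS = 1024  # assumed API minimum for budget_tokens


def check_manual_budget(budget_tokens: int, max_tokens: int) -> dict:
    """Return a manual thinking config, or raise ValueError if it looks invalid."""
    if budget_tokens < MIN_BUDGET_TOKENS:
        raise ValueError(f"budget_tokens must be at least {MIN_BUDGET_TOKENS}")
    if budget_tokens >= max_tokens:
        raise ValueError("budget_tokens must be smaller than max_tokens")
    return {"type": "enabled", "budget_tokens": budget_tokens}


# Same values as the Sonnet 4.6 example above.
config = check_manual_budget(budget_tokens=2000, max_tokens=4096)
print(config)
```

The returned dictionary can be passed as the thinking argument of the request.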
Model-Specific Behavior
Different Claude models handle extended thinking differently. Here's a quick reference:
| Model | Adaptive Thinking | Manual Thinking | Notes |
|---|---|---|---|
| Claude Opus 4.7+ | ✅ Recommended | ❌ Not supported | Use effort parameter |
| Claude Mythos Preview | ✅ Default | ✅ Accepted | disabled not supported; use display: "summarized" for summaries |
| Claude Opus 4.6 | ✅ Recommended | ⚠️ Deprecated | Manual still functional |
| Claude Sonnet 4.6 | ✅ Recommended | ⚠️ Deprecated | Interleaved mode deprecated |
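The table can also be encoded as a small lookup so that request-building code picks a thinking type per model. This is only a sketch: the keys are the model labels from the table rather than API model IDs, and `THINKING_SUPPORT` and `pick_thinking_type` are hypothetical names.

```python
# Each entry mirrors one table row: (adaptive supported, manual status).
THINKING_SUPPORT = {
    "Claude Opus 4.7+": (True, "unsupported"),
    "Claude Mythos Preview": (True, "accepted"),
    "Claude Opus 4.6": (True, "deprecated"),
    "Claude Sonnet 4.6": (True, "deprecated"),
}


def pick_thinking_type(model_label: str, prefer_manual: bool = False) -> str:
    """Return "adaptive" or "enabled" for a model label from the table."""
    adaptive, manual = THINKING_SUPPORT[model_label]
    if prefer_manual and manual != "unsupported":
        return "enabled"
    if adaptive:
        return "adaptive"
    raise ValueError(f"no supported thinking type for {model_label}")


# Manual thinking is requested in both calls, but only Sonnet 4.6 allows it.
opus_choice = pick_thinking_type("Claude Opus 4.7+", prefer_manual=True)
sonnet_choice = pick_thinking_type("Claude Sonnet 4.6", prefer_manual=True)
print(opus_choice, sonnet_choice)
```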
Claude Mythos Preview Special Behavior
Claude Mythos Preview has unique behavior:
- Adaptive thinking is the default mode
- thinking: {type: "disabled"} is not supported
- The display defaults to "omitted" (no thinking content returned)
- Pass display: "summarized" to receive summaries of the thinking process
response = client.messages.create(
    model="claude-mythos-preview",
    max_tokens=4096,
    thinking={
        "type": "adaptive",
        "display": "summarized",  # Get thinking summaries
    },
    output_config={"effort": "medium"},
    messages=[
        {"role": "user", "content": "Explain quantum entanglement."}
    ],
)
Best Practices
1. Choose the Right Effort Level
Start with medium effort and adjust based on task complexity. For simple factual queries, low effort is sufficient. For multi-step reasoning, coding, or analysis, use high effort.
2. Combine with Structured Outputs
Extended thinking pairs well with structured outputs. Use thinking for reasoning, then output a structured JSON response:
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={"type": "adaptive"},
    output_config={"effort": "high"},
    messages=[
        {"role": "user", "content": "Extract key entities from this text and return as JSON."}
    ],
)
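Once the response arrives, the JSON lives in the text blocks, not in the thinking blocks. Below is a minimal sketch of pulling it out, assuming the model returned a bare JSON object as its text output. The helper name `parse_json_answer` and the sample payload are illustrative.

```python
import json


def parse_json_answer(content: list[dict]) -> dict:
    """Join the text blocks of a response and parse them as JSON.

    Thinking blocks are skipped; json.JSONDecodeError propagates if the
    text is not valid JSON.
    """
    text = "".join(b["text"] for b in content if b.get("type") == "text")
    return json.loads(text)


# Hand-written sample content, not real model output.
sample_content = [
    {"type": "thinking", "thinking": "The text mentions one person and one city."},
    {"type": "text", "text": '{"people": ["Ada Lovelace"], "places": ["London"]}'},
]

entities = parse_json_answer(sample_content)
print(entities["people"])
```

In production code you would likely catch the decode error and retry or fall back, since a prompt alone does not guarantee valid JSON.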
3. Monitor Token Usage
Thinking tokens count toward your max_tokens limit and are billed as output tokens, so a request that thinks at length leaves less room for the final answer. Use the usage field in the API response to monitor consumption.
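A small sketch of that monitoring, working on the usage field as a plain dictionary. The numbers are made up for illustration, and `output_headroom` is a hypothetical helper name.

```python
def output_headroom(usage: dict, max_tokens: int) -> int:
    """Return how many tokens of max_tokens the response left unused."""
    return max_tokens - usage["output_tokens"]


MAX_TOKENS = 4096

# Illustrative values only, not real API output.
usage = {"input_tokens": 120, "output_tokens": 3150}

remaining = output_headroom(usage, MAX_TOKENS)
print(f"{remaining} tokens of headroom left")

# Flag responses that used more than 90% of the limit.
near_limit = remaining < 0.1 * MAX_TOKENS
if near_limit:
    print("Consider raising max_tokens; the response is close to the limit.")
```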
4. Handle Streaming Responses
When streaming, thinking blocks appear before text blocks. Process them in order:
stream = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={"type": "adaptive"},
    output_config={"effort": "high"},
    messages=[{"role": "user", "content": "Solve this riddle..."}],
    stream=True,
)

# Print thinking and answer text as the deltas arrive
for event in stream:
    if event.type == "content_block_delta" and event.delta.type == "thinking_delta":
        print(event.delta.thinking, end="")
    elif event.type == "content_block_delta" and event.delta.type == "text_delta":
        print(event.delta.text, end="")
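If you need the full reasoning and answer as strings rather than printed fragments, the deltas can be accumulated into separate buffers. The sketch below runs on hand-written event dictionaries shaped like the stream events above, so it works without a network call. `collect_stream` is a hypothetical helper.

```python
def collect_stream(events) -> tuple[str, str]:
    """Accumulate thinking and text deltas from an iterable of event dicts."""
    thinking_parts: list[str] = []
    text_parts: list[str] = []
    for event in events:
        if event.get("type") != "content_block_delta":
            continue  # skip start/stop and other event types
        delta = event["delta"]
        if delta["type"] == "thinking_delta":
            thinking_parts.append(delta["thinking"])
        elif delta["type"] == "text_delta":
            text_parts.append(delta["text"])
    return "".join(thinking_parts), "".join(text_parts)


# Hand-written events for illustration, not captured API output.
fake_events = [
    {"type": "content_block_start"},
    {"type": "content_block_delta", "delta": {"type": "thinking_delta", "thinking": "Consider "}},
    {"type": "content_block_delta", "delta": {"type": "thinking_delta", "thinking": "the clues."}},
    {"type": "content_block_stop"},
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "The answer "}},
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "is a map."}},
]

reasoning, answer = collect_stream(fake_events)
print(reasoning)
print(answer)
```

With the SDK's event objects, the same logic applies using attribute access (event.delta.thinking) instead of dictionary lookups.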
Troubleshooting
- 400 Error on Opus 4.7+: You're using type: "enabled" instead of type: "adaptive". Switch to adaptive thinking.
- No thinking content returned: Check the display parameter. On Mythos, it defaults to "omitted".
- High latency: Reduce the effort level or enable fast mode for quicker responses.
- Token limit exceeded: Increase max_tokens or reduce the task budget.
Conclusion
Extended thinking is a game-changer for complex reasoning tasks. With the introduction of adaptive thinking and the effort parameter, you now have fine-grained control over how Claude allocates its cognitive resources. Whether you're building a code assistant, a research tool, or a decision support system, mastering extended thinking will help you get the most out of Claude.
Key Takeaways
- Adaptive thinking (with the effort parameter) is the recommended approach for Claude Opus 4.7 and later models; manual thinking is no longer supported on these models.
- Manual extended thinking (type: "enabled" with budget_tokens) is deprecated on Opus 4.6 and Sonnet 4.6 but still functional; migrate to adaptive thinking for future compatibility.
- Effort is set as a named level (low, medium, high); start with medium and adjust based on task complexity.
- Task budgets and fast mode provide additional controls for cost management and latency optimization.
- Model behavior varies, so always check the documentation for your specific model, especially Claude Mythos Preview, which has unique display defaults.