Mastering Extended Thinking in Claude: A Practical Guide to Adaptive and Manual Modes
This guide explains how to enable and configure Claude's extended thinking for complex reasoning tasks. You'll learn the difference between adaptive thinking (the only mode supported on Claude Opus 4.7) and manual mode, how to set effort levels, and how to handle thinking blocks in your API responses.
Claude's extended thinking feature gives the model a dedicated "thinking" phase before it produces its final answer, which tends to improve results on complex tasks such as mathematical proofs and multi-step analysis. This guide covers how to implement extended thinking effectively, whether you're using the latest Claude Opus 4.7 or earlier models.
What Is Extended Thinking?
Extended thinking allows Claude to "think out loud" before delivering its final answer. When enabled, the API response includes special thinking content blocks that contain Claude's internal reasoning, followed by the final text response. This provides:
- Transparency into Claude's reasoning process
- Improved accuracy on complex tasks like math, logic, and planning
- Debugging insights when responses are unexpected
Adaptive Thinking vs. Manual Extended Thinking
Claude now offers two approaches to extended thinking, and the right choice depends on your model version.
Adaptive Thinking (Recommended)
Adaptive thinking (thinking: {type: "adaptive"}) lets Claude decide how much thinking each request needs. Instead of a fixed token budget, you steer depth with an effort level, which is set in the request's output_config field rather than inside the thinking object. This is the only supported mode on Claude Opus 4.7 and is recommended for all current models.
Manual Extended Thinking (Deprecated)
Manual extended thinking (thinking: {type: "enabled", budget_tokens: N}) lets you set a fixed token budget for thinking. This mode is:
- No longer supported on Claude Opus 4.7 (returns a 400 error)
- Deprecated but functional on Claude Opus 4.6 and Claude Sonnet 4.6
- Scheduled for removal in a future model release
Key takeaway: If you're building new applications, use adaptive thinking. If you're maintaining legacy code on older models, plan to migrate soon.
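To make the difference between the two modes concrete, the sketch below converts a request that uses manual mode into one that uses adaptive thinking. The function name migrate_to_adaptive is hypothetical (it is not part of the Anthropic SDK), and the sketch assumes the effort level travels in a top-level output_config field alongside thinking; confirm the exact placement against the current API reference.

```python
import copy
from typing import Any, Dict

VALID_EFFORTS = ("low", "medium", "high")


def migrate_to_adaptive(params: Dict[str, Any], effort: str = "medium") -> Dict[str, Any]:
    """Return a copy of `params` with manual thinking swapped for adaptive thinking.

    Requests that do not use manual mode come back unchanged (as a copy).
    """
    if effort not in VALID_EFFORTS:
        raise ValueError(f"effort must be one of {VALID_EFFORTS}, got {effort!r}")

    migrated = copy.deepcopy(params)
    thinking = migrated.get("thinking") or {}
    if thinking.get("type") == "enabled":
        # Drop the fixed budget: adaptive mode has no budget_tokens field.
        migrated["thinking"] = {"type": "adaptive"}
        # Assumption: effort is passed in a top-level output_config field.
        output_config = dict(migrated.get("output_config") or {})
        output_config.setdefault("effort", effort)
        migrated["output_config"] = output_config
    return migrated


legacy = {
    "model": "claude-sonnet-4-6",
    "max_tokens": 16000,
    "thinking": {"type": "enabled", "budget_tokens": 10000},
}
print(migrate_to_adaptive(legacy)["thinking"])  # prints {'type': 'adaptive'}
```

The helper works on a copy so the original request stays available for comparison while you test the migrated one.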
Model-Specific Behavior
Different Claude models handle extended thinking differently:
| Model | Adaptive Thinking | Manual Mode | Notes |
|---|---|---|---|
| Claude Opus 4.7 | ✅ Required | ❌ Not supported | Use effort parameter |
| Claude Opus 4.6 | ✅ Recommended | ✅ Deprecated | Migrate to adaptive |
| Claude Sonnet 4.6 | ✅ Recommended | ✅ Deprecated (interleaved) | Migrate to adaptive |
| Claude Mythos Preview | ✅ Default | ✅ Accepted | type: "disabled" is not supported; display defaults to "omitted" |
How to Use Extended Thinking in the API
Prerequisites
- An Anthropic API key
- The Anthropic Python SDK (pip install anthropic) or TypeScript SDK
Basic Implementation with Adaptive Thinking
Here's how to enable adaptive thinking on Claude Opus 4.7:
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    # Effort is set in output_config, not inside the thinking object.
    output_config={"effort": "high"},  # Options: "low", "medium", "high"
    messages=[
        {
            "role": "user",
            "content": "Are there an infinite number of prime numbers such that n mod 4 == 3?",
        }
    ],
)

# Process the response
for block in response.content:
    if block.type == "thinking":
        print(f"Thinking: {block.thinking}")
    elif block.type == "text":
        print(f"Final answer: {block.text}")
Using Manual Extended Thinking (Legacy Models)
For Claude Opus 4.6 or Sonnet 4.6, you can still use manual mode:
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=16000,
    thinking={
        "type": "enabled",
        "budget_tokens": 10000,  # Set your thinking budget
    },
    messages=[
        {
            "role": "user",
            "content": "Explain the Riemann Hypothesis in simple terms.",
        }
    ],
)
TypeScript Example
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 16000,
  thinking: { type: 'adaptive' },
  // Effort is set in output_config, not inside the thinking object.
  output_config: { effort: 'high' },
  messages: [
    {
      role: 'user',
      content: 'Design a sorting algorithm that works in O(n log n) time.'
    }
  ]
});

for (const block of response.content) {
  if (block.type === 'thinking') {
    console.log('Thinking:', block.thinking);
  } else if (block.type === 'text') {
    console.log('Answer:', block.text);
  }
}
Understanding the Response Format
When extended thinking is enabled, the API response contains a content array with two types of blocks:
{
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me analyze this step by step...",
      "signature": "WaUjzkypQ2mUEVM36O2TxuC06KN8xyfbJwyem2dw3URve/op91XWHOEBLLqIOMfFG/UvLEczmEsUjavL...."
    },
    {
      "type": "text",
      "text": "Based on my analysis..."
    }
  ]
}
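The signature field lets the API verify that a thinking block was generated by Claude when you send that block back in a later request, so pass thinking blocks back unmodified. The thinking blocks appear before the final text response.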
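The structure above can be separated into reasoning and answer with a few lines of code. The sketch below is a minimal illustration that assumes the response has been parsed from JSON into plain dictionaries; split_blocks is a hypothetical helper name, and with SDK response objects you would read block.type, block.thinking and block.text as attributes instead.

```python
from typing import Any, Dict, List, Tuple


def split_blocks(content: List[Dict[str, Any]]) -> Tuple[str, str]:
    """Return (thinking_text, answer_text) from a response `content` array."""
    thinking_parts: List[str] = []
    text_parts: List[str] = []
    for block in content:
        block_type = block.get("type")
        if block_type == "thinking":
            thinking_parts.append(block.get("thinking", ""))
        elif block_type == "text":
            text_parts.append(block.get("text", ""))
        # Any other block type is skipped in this sketch.
    return "\n".join(thinking_parts), "".join(text_parts)


# Same shape as the JSON example above (signature shortened for readability).
example_content = [
    {"type": "thinking", "thinking": "Let me analyze this step by step...", "signature": "..."},
    {"type": "text", "text": "Based on my analysis..."},
]
thinking, answer = split_blocks(example_content)
print(answer)  # prints Based on my analysis...
```

Keeping the two streams apart makes it easy to log the reasoning for debugging while showing only the answer to end users.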
Choosing the Right Effort Level
With adaptive thinking, you control the effort parameter:
- "low": Minimal thinking, faster responses. Good for simple tasks where you just need a quick check.
- "medium": Balanced thinking. Suitable for most analytical tasks.
- "high": Maximum reasoning depth. Best for complex proofs, multi-step logic, or tasks requiring careful analysis.
Start with "medium" and increase to "high" if you need deeper reasoning. Use "low" for high-throughput applications where speed matters more than depth.
Best Practices
1. Set Appropriate max_tokens
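The max_tokens limit covers thinking and the final answer combined. In manual mode it must be greater than budget_tokens; in adaptive mode there is no explicit budget, so size it from the amount of thinking you expect at your chosen effort level. A good rule is to set max_tokens to at least 1.5× your expected thinking tokens.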
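As a worked instance of the 1.5× rule, the sketch below turns an expected thinking-token count into a suggested max_tokens value. The function name suggest_max_tokens is hypothetical, and the multiplier is this guide's rule of thumb rather than an API requirement.

```python
import math


def suggest_max_tokens(expected_thinking_tokens: int, headroom: float = 1.5) -> int:
    """Suggest a max_tokens value that leaves room for the answer after thinking."""
    if expected_thinking_tokens <= 0:
        raise ValueError("expected_thinking_tokens must be positive")
    if headroom <= 1.0:
        # A factor of 1.0 or less would leave no tokens for the final answer.
        raise ValueError("headroom must be greater than 1.0")
    return math.ceil(expected_thinking_tokens * headroom)


print(suggest_max_tokens(10_000))  # prints 15000
```

If responses still end early at the suggested value, raise the headroom factor rather than the thinking estimate, since the shortfall is usually in the answer portion.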
2. Handle Thinking Blocks in Streaming
When streaming, thinking blocks appear as separate events. Make sure your streaming handler can process both thinking and text content blocks:
with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    output_config={"effort": "high"},
    messages=[{"role": "user", "content": "Solve this equation step by step."}],
) as stream:
    for event in stream:
        if event.type == "content_block_start":
            # Label each block once, rather than once per delta.
            print(f"\n[{event.content_block.type}]")
        elif event.type == "content_block_delta":
            if event.delta.type == "thinking_delta":
                print(event.delta.thinking, end="", flush=True)
            elif event.delta.type == "text_delta":
                print(event.delta.text, end="", flush=True)
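If you need the complete thinking and answer text after the stream ends, you can accumulate the deltas instead of printing them. The sketch below assumes events have been parsed into plain dictionaries shaped like the raw streaming payloads; accumulate_deltas is a hypothetical helper, and the sample events are hand-written for illustration, not captured API output.

```python
from typing import Any, Dict, Iterable, List, Tuple


def accumulate_deltas(events: Iterable[Dict[str, Any]]) -> Tuple[str, str]:
    """Collect thinking and text deltas into two complete strings."""
    thinking_parts: List[str] = []
    text_parts: List[str] = []
    for event in events:
        if event.get("type") != "content_block_delta":
            continue  # start/stop and other event types carry no text here
        delta = event.get("delta", {})
        if delta.get("type") == "thinking_delta":
            thinking_parts.append(delta.get("thinking", ""))
        elif delta.get("type") == "text_delta":
            text_parts.append(delta.get("text", ""))
    return "".join(thinking_parts), "".join(text_parts)


# Hand-written sample events (illustrative only).
sample_events = [
    {"type": "content_block_start", "index": 0, "content_block": {"type": "thinking", "thinking": ""}},
    {"type": "content_block_delta", "index": 0, "delta": {"type": "thinking_delta", "thinking": "Isolate x. "}},
    {"type": "content_block_delta", "index": 0, "delta": {"type": "thinking_delta", "thinking": "Divide by 2."}},
    {"type": "content_block_stop", "index": 0},
    {"type": "content_block_delta", "index": 1, "delta": {"type": "text_delta", "text": "x = 4"}},
]
thinking, answer = accumulate_deltas(sample_events)
print(answer)  # prints x = 4
```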
3. Use for Complex Reasoning Tasks
Extended thinking shines on tasks like:
- Mathematical proofs and derivations
- Multi-step logical reasoning
- Code generation with complex algorithms
- Strategic planning and analysis
- Debugging and root cause analysis
4. Don't Use for Simple Queries
For simple factual questions or straightforward tasks, extended thinking adds latency without benefit. Reserve it for problems that genuinely benefit from step-by-step reasoning.
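One way to apply this is to attach the thinking parameter only when the caller marks a request as needing reasoning. The sketch below is a hypothetical helper (build_request is not an SDK function). It assumes that omitting the thinking parameter leaves extended thinking off, which may not hold on every model, since the table earlier notes that Claude Mythos Preview defaults to adaptive thinking. The max_tokens values are arbitrary placeholders.

```python
from typing import Any, Dict


def build_request(prompt: str, needs_reasoning: bool,
                  model: str = "claude-opus-4-7") -> Dict[str, Any]:
    """Build Messages API keyword arguments, enabling thinking only on request."""
    params: Dict[str, Any] = {
        "model": model,
        # Placeholder limits: more room when thinking is on, less when it is off.
        "max_tokens": 16000 if needs_reasoning else 1024,
        "messages": [{"role": "user", "content": prompt}],
    }
    if needs_reasoning:
        params["thinking"] = {"type": "adaptive"}
    return params


simple = build_request("What is the capital of France?", needs_reasoning=False)
print("thinking" in simple)  # prints False
```

The result can be passed to the SDK as client.messages.create(**params). How needs_reasoning gets decided is left to the application, for example per endpoint or per task type.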
Common Pitfalls
- Using manual mode on Opus 4.7: This returns a 400 error. Always use adaptive thinking.
- Setting budget_tokens too low: In manual mode, Claude may run out of thinking tokens before completing its reasoning, leading to truncated responses.
- Forgetting max_tokens: Extended thinking requires max_tokens to be set explicitly, and the value must leave room for both the thinking and the final answer.
- Ignoring the signature field: You rarely need to read the signature yourself, but it is what verifies the authenticity of a thinking block, so keep thinking blocks intact if you store or resend them.
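Several of these pitfalls can be caught before a request is sent. The sketch below is a hypothetical pre-flight check (check_request is not part of the SDK) that covers only the cases listed above; it is not a substitute for the API's own validation.

```python
from typing import Any, Dict, List

OPUS_4_7 = "claude-opus-4-7"  # model ID as used in this guide's examples


def check_request(params: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems found in request parameters."""
    problems: List[str] = []
    max_tokens = params.get("max_tokens")
    thinking = params.get("thinking") or {}

    if max_tokens is None:
        problems.append("max_tokens is not set")

    if thinking.get("type") == "enabled":  # manual mode
        if params.get("model") == OPUS_4_7:
            problems.append("manual mode is rejected on Claude Opus 4.7; use adaptive thinking")
        budget = thinking.get("budget_tokens")
        if budget is None:
            problems.append("manual mode requires budget_tokens")
        elif max_tokens is not None and budget >= max_tokens:
            problems.append("budget_tokens must be smaller than max_tokens")
    return problems


bad = {
    "model": OPUS_4_7,
    "max_tokens": 16000,
    "thinking": {"type": "enabled", "budget_tokens": 10000},
}
print(check_request(bad))
```

Returning a list instead of raising on the first problem lets you report every issue in one pass, which is convenient in tests and CI checks.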
Migrating from Manual to Adaptive Thinking
If you're currently using manual extended thinking, here's your migration checklist:
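- Update your model to Claude Opus 4.7 (or stay on Opus 4.6 or Sonnet 4.6 and switch to adaptive thinking)
- Replace thinking: {type: "enabled", budget_tokens: N} with thinking: {type: "adaptive"} and set an effort level (in output_config) in place of the token budget
- Test with "medium" effort first, then adjust up or down
- Update your error handling to catch 400 errors if you accidentally use manual mode on Opus 4.7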
Key Takeaways
- Adaptive thinking is the future: Use thinking: {type: "adaptive"} with an effort level ("low", "medium", or "high", set in output_config) for all new projects, especially on Claude Opus 4.7.
- Manual mode is deprecated: Avoid thinking: {type: "enabled", budget_tokens: N} in new code; it will be removed in future model releases.
- Choose effort wisely: Match the effort level to task complexity, using "low" for speed and "high" for depth.
- Handle both thinking and text blocks: Your application must process both content block types in API responses.
- Extended thinking adds latency: Use it strategically for complex reasoning, not for simple queries.