Mastering Extended Thinking in Claude: A Guide to Adaptive and Manual Reasoning Modes
Learn how to use Claude's extended thinking feature for complex reasoning tasks. Covers adaptive thinking, manual mode, effort parameters, and practical API examples.
This guide explains how to enable and configure Claude's extended thinking feature, including adaptive thinking (recommended for Opus 4.7) and manual mode (deprecated but functional on older models). You'll learn how to set effort levels, handle thinking content blocks, and optimize for complex tasks like math, code analysis, and multi-step reasoning.
Introduction
Claude's extended thinking feature lets the model reason step by step before delivering a final answer. This internal reasoning is returned as thinking content blocks, which give you varying degrees of transparency into Claude's logic. Whether you're building a complex code analysis tool, a math tutoring app, or a multi-step reasoning agent, extended thinking can improve the quality and accuracy of Claude's outputs on hard problems.
This guide covers everything you need to know: from the basics of enabling extended thinking to the latest adaptive thinking mode, effort parameters, and best practices for different Claude models.
How Extended Thinking Works
When extended thinking is enabled, Claude generates one or more thinking content blocks before its final text response. These blocks contain the model's internal reasoning—its chain-of-thought, intermediate calculations, and logical deductions. The API response looks like this:
{
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me analyze this step by step...",
      "signature": "WaUjzkypQ2mUEVM36O2TxuC06KN8xyfbJwyem2dw3URve/op91XWHOEBLLqIOMfFG/UvLEczmEsUjavL...."
    },
    {
      "type": "text",
      "text": "Based on my analysis..."
    }
  ]
}
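The signature field lets the API verify that a thinking block was generated by Claude, so pass thinking blocks back unmodified, signature included, whenever you return them in a later request. Thinking blocks are normally followed by at least one text block containing the final answer.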
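To make the response shape concrete, here is a minimal sketch that separates reasoning from the final answer. It assumes the content list has already been converted to plain dictionaries like the JSON shown above (the SDK itself returns objects, so you may need to convert first); `split_content` is a hypothetical helper name, not part of any library.

```python
from typing import Any


def split_content(content: list[dict[str, Any]]) -> tuple[str, str]:
    """Return (reasoning, answer) joined from thinking and text blocks."""
    reasoning_parts = []
    answer_parts = []
    for block in content:
        if block.get("type") == "thinking":
            reasoning_parts.append(block.get("thinking", ""))
        elif block.get("type") == "text":
            answer_parts.append(block.get("text", ""))
    return "\n".join(reasoning_parts), "\n".join(answer_parts)


# Sample data mirroring the response structure shown in this guide.
sample = [
    {"type": "thinking", "thinking": "Let me analyze this step by step...", "signature": "..."},
    {"type": "text", "text": "Based on my analysis..."},
]
reasoning, answer = split_content(sample)
print(answer)
```

Blocks of any other type are ignored, so the helper keeps working if a response carries additional block types.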
Adaptive Thinking (Recommended for Claude Opus 4.7)
Adaptive thinking is a more flexible approach to extended thinking, and on Claude Opus 4.7 it is the only supported mode. Instead of setting a fixed token budget, you let Claude decide how much thinking is needed based on the complexity of the task. You steer that decision with the effort parameter.
Effort Levels
The effort parameter accepts three values:
- low: Minimal thinking. Best for simple tasks where you want fast responses.
- medium: Balanced thinking. Good for most everyday complex tasks.
- high: Maximum thinking. Ideal for the hardest problems: math proofs, multi-step reasoning, code generation with many constraints.
Using Adaptive Thinking
Here's how to enable adaptive thinking in the Messages API:
import anthropic

# The client reads the API key from the ANTHROPIC_API_KEY environment variable.
client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={
        "type": "adaptive",
        "effort": "high"
    },
    messages=[
        {
            "role": "user",
            "content": "Prove that the square root of 2 is irrational."
        }
    ]
)

# response.content is a list of content blocks.
print(response.content)
Important: On Claude Opus 4.7, using the old manual mode (thinking: {type: "enabled", budget_tokens: N}) will return a 400 error. You must use adaptive thinking.
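If you build requests in several places, a small wrapper can catch a mistyped effort value before it reaches the API. The sketch below is only an illustration: `build_adaptive_request` and `EFFORT_LEVELS` are hypothetical names, and the thinking shape is copied from this guide's example, so confirm it against the current API reference before relying on it.

```python
from typing import Any

# The three effort values described in this guide.
EFFORT_LEVELS = ("low", "medium", "high")


def build_adaptive_request(prompt: str, effort: str = "medium",
                           model: str = "claude-opus-4-7",
                           max_tokens: int = 4096) -> dict[str, Any]:
    """Return keyword arguments suitable for client.messages.create()."""
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"effort must be one of {EFFORT_LEVELS}, got {effort!r}")
    return {
        "model": model,
        "max_tokens": max_tokens,
        # Shape taken from this guide's example (an assumption, not verified here).
        "thinking": {"type": "adaptive", "effort": effort},
        "messages": [{"role": "user", "content": prompt}],
    }


request = build_adaptive_request(
    "Prove that the square root of 2 is irrational.", effort="high"
)
# With the SDK installed and an API key set, you would then call:
# response = anthropic.Anthropic().messages.create(**request)
```

Returning a plain dictionary keeps the helper testable without network access.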
Manual Extended Thinking (Legacy)
For models like Claude Opus 4.6 and Claude Sonnet 4.6, you can still use manual extended thinking with a fixed token budget. However, this mode is deprecated and will be removed in a future release. Anthropic strongly recommends migrating to adaptive thinking.
Setting a Token Budget
The budget_tokens parameter sets the maximum number of tokens Claude can spend on thinking. It counts against max_tokens, which caps the total response length (thinking + final answer).
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=8192,
    thinking={
        "type": "enabled",
        "budget_tokens": 4096
    },
    messages=[
        {
            "role": "user",
            "content": "Write a Python function to solve a Sudoku puzzle using backtracking."
        }
    ]
)
Budget Tokens vs. Max Tokens
- budget_tokens: The thinking budget. Claude will stop thinking once it reaches this limit.
- max_tokens: The total response limit (thinking + final text). Must be greater than budget_tokens.
Tip: Set max_tokens to at least 1.5x your budget_tokens to leave room for the final answer.
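The relationship between the two limits is simple arithmetic, sketched below. `manual_limits` is a hypothetical helper; it applies the 1.5x rule of thumb when you don't supply max_tokens and rejects any combination where max_tokens does not exceed budget_tokens.

```python
import math
from typing import Any, Optional


def manual_limits(budget_tokens: int, max_tokens: Optional[int] = None) -> dict[str, Any]:
    """Return max_tokens and a manual thinking config that fit together."""
    if budget_tokens <= 0:
        raise ValueError("budget_tokens must be positive")
    if max_tokens is None:
        # Rule of thumb from this guide: leave 1.5x headroom for the answer.
        max_tokens = math.ceil(budget_tokens * 1.5)
    if max_tokens <= budget_tokens:
        raise ValueError("max_tokens must be greater than budget_tokens")
    return {
        "max_tokens": max_tokens,
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
    }


limits = manual_limits(4096)
print(limits["max_tokens"])  # 4096 * 1.5 = 6144
```

The returned dictionary can be unpacked into the request alongside the model and messages.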
Model-Specific Behavior
Different Claude models handle extended thinking differently. Here's a quick reference:
| Model | Adaptive Thinking | Manual Mode | Notes |
|---|---|---|---|
| Claude Opus 4.7 | ✅ Required | ❌ Returns 400 error | Use effort parameter |
| Claude Opus 4.6 | ✅ Recommended | ✅ Deprecated but functional | Migrate to adaptive |
| Claude Sonnet 4.6 | ✅ Recommended | ✅ Deprecated (interleaved mode) | Migrate to adaptive |
| Claude Mythos Preview | ✅ Default | ✅ Accepted | thinking: disabled not supported; use display: "summarized" for summaries |
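The table can be encoded as a pre-flight check that runs before a request is sent. This is a sketch under assumptions: the rules are transcribed from the table above, `check_thinking_config` is a hypothetical helper, and the model ID "claude-opus-4-6" is assumed by analogy with the IDs used in this guide's examples.

```python
# Rules transcribed from the compatibility table in this guide.
MANUAL_REJECTED = {"claude-opus-4-7"}
MANUAL_DEPRECATED = {"claude-opus-4-6", "claude-sonnet-4-6"}  # first ID is assumed
DISABLE_REJECTED = {"claude-mythos-preview"}


def check_thinking_config(model: str, thinking: dict) -> list[str]:
    """Return a list of problems; an empty list means no objection."""
    problems = []
    kind = thinking.get("type")
    if kind == "enabled" and model in MANUAL_REJECTED:
        problems.append("manual mode returns a 400 error on this model; use adaptive thinking")
    if kind == "enabled" and model in MANUAL_DEPRECATED:
        problems.append("manual mode is deprecated on this model; migrate to adaptive thinking")
    if kind == "disabled" and model in DISABLE_REJECTED:
        problems.append("thinking cannot be disabled on this model")
    return problems


for issue in check_thinking_config("claude-opus-4-7", {"type": "enabled", "budget_tokens": 4096}):
    print(issue)
```

Models that are not listed produce no warnings, so the check fails open rather than blocking unknown model IDs.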
Claude Mythos Preview Special Behavior
Claude Mythos Preview has unique behavior:
- Adaptive thinking is the default.
- You cannot disable thinking (thinking: {type: "disabled"} is not supported).
- By default, thinking content is omitted from the response (display = "omitted").
- To receive thinking summaries, pass display: "summarized" in the thinking configuration.
response = client.messages.create(
    model="claude-mythos-preview",
    max_tokens=4096,
    thinking={
        "type": "adaptive",
        "display": "summarized"
    },
    messages=[
        {"role": "user", "content": "Explain quantum entanglement."}
    ]
)
Streaming with Extended Thinking
Extended thinking works with streaming. When streaming, you'll receive thinking content block deltas followed by text deltas. Here's an example using Python:
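with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={"type": "adaptive", "effort": "high"},
    messages=[
        {"role": "user", "content": "Solve this math problem: 1234 * 5678"}
    ]
) as stream:
    last_kind = None  # delta type printed most recently
    for event in stream:
        if event.type != "content_block_delta":
            continue
        kind = event.delta.type
        if kind == "thinking_delta":
            # Print the label once per run of deltas, not once per chunk.
            if kind != last_kind:
                print("\nThinking: ", end="")
            print(event.delta.thinking, end="", flush=True)
        elif kind == "text_delta":
            if kind != last_kind:
                print("\nAnswer: ", end="")
            print(event.delta.text, end="", flush=True)
        last_kind = kind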
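If you would rather collect the streamed output than print it, the same event handling can feed two buffers. The sketch below uses stand-in event objects built with SimpleNamespace so it runs without network access; `collect_stream` is a hypothetical helper, and the stand-in events assume only the type, delta.type, delta.thinking, and delta.text attributes used in the streaming example above.

```python
from types import SimpleNamespace


def collect_stream(events):
    """Accumulate thinking and text deltas into two separate strings."""
    thinking_parts, text_parts = [], []
    for event in events:
        if event.type != "content_block_delta":
            continue  # skip start/stop and other event types
        if event.delta.type == "thinking_delta":
            thinking_parts.append(event.delta.thinking)
        elif event.delta.type == "text_delta":
            text_parts.append(event.delta.text)
    return "".join(thinking_parts), "".join(text_parts)


def fake_delta(kind, **fields):
    """Build a stand-in for a content_block_delta event (illustration only)."""
    return SimpleNamespace(type="content_block_delta",
                           delta=SimpleNamespace(type=kind, **fields))


fake_events = [
    SimpleNamespace(type="content_block_start"),
    fake_delta("thinking_delta", thinking="1234 * 5678 = "),
    fake_delta("thinking_delta", thinking="7006652"),
    fake_delta("text_delta", text="The product is 7,006,652."),
]
streamed_thinking, streamed_text = collect_stream(fake_events)
print(streamed_text)
```

In a real application you would pass the stream object itself in place of fake_events.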
Best Practices
1. Choose the Right Effort Level
- Low effort: Use for simple Q&A, straightforward code generation, or when speed matters.
- Medium effort: Good default for most tasks—complex instructions, moderate reasoning.
- High effort: Reserve for the hardest problems: mathematical proofs, multi-step planning, code with many constraints.
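One way to apply these guidelines consistently is a lookup from task category to effort level. The category names below are hypothetical labels invented for this sketch, as is `pick_effort`; adjust them to whatever task taxonomy your application already uses.

```python
# Hypothetical task categories mapped to the effort levels suggested above.
EFFORT_BY_TASK = {
    "simple_qa": "low",
    "simple_codegen": "low",
    "complex_instructions": "medium",
    "moderate_reasoning": "medium",
    "math_proof": "high",
    "multi_step_planning": "high",
    "constrained_codegen": "high",
}


def pick_effort(task_kind: str, latency_sensitive: bool = False) -> str:
    """Choose an effort level; unknown task kinds fall back to medium."""
    if latency_sensitive:
        return "low"  # speed matters more than depth
    return EFFORT_BY_TASK.get(task_kind, "medium")


print(pick_effort("math_proof"))
print(pick_effort("math_proof", latency_sensitive=True))
```

Falling back to medium for unrecognized tasks follows the guidance that medium is a good default.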
2. Combine with Structured Outputs
Extended thinking pairs well with structured output. Claude can reason internally and then return only the JSON your application needs, as long as the prompt asks for it:
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={"type": "adaptive", "effort": "high"},
    messages=[
        # Ask for JSON explicitly so the final text block is machine-readable.
        {"role": "user", "content": "Extract all dates and amounts from this invoice text and reply with a JSON object only: ..."}
    ]
)
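On the receiving side, the JSON lives in the text block, not the thinking block. Here is a minimal sketch of pulling it out, assuming the content list has been converted to plain dictionaries; `parse_json_answer` is a hypothetical helper and the sample invoice values are made up for illustration.

```python
import json


def parse_json_answer(content):
    """Parse the first text block as JSON, skipping thinking blocks."""
    for block in content:
        if block.get("type") == "text":
            return json.loads(block["text"])
    raise ValueError("response contained no text block")


# Made-up sample response for illustration only.
sample_content = [
    {"type": "thinking", "thinking": "The invoice lists one date and one amount..."},
    {"type": "text", "text": '{"dates": ["2024-03-01"], "amounts": [125.5]}'},
]
parsed = parse_json_answer(sample_content)
print(parsed["amounts"])
```

json.loads raises json.JSONDecodeError if the model wraps the JSON in extra prose, so production code should catch that and retry or fall back.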
3. Monitor Token Usage
Thinking content blocks count toward your output token usage, so extended thinking raises both cost and latency. Monitor usage, especially with high effort or a large budget_tokens.
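A lightweight way to keep an eye on this is to compare reported output tokens against the max_tokens you requested. The sketch below assumes the usage data is available as a dictionary with an output_tokens field that includes thinking tokens; `usage_report` and the 90% warning threshold are hypothetical choices for illustration.

```python
def usage_report(usage: dict, max_tokens: int, warn_ratio: float = 0.9) -> dict:
    """Summarize how much of the output limit a response consumed."""
    output = usage["output_tokens"]
    return {
        "output_tokens": output,
        "share_of_max_tokens": round(output / max_tokens, 3),
        # Near the limit, thinking may be crowding out the final answer.
        "near_limit": output >= warn_ratio * max_tokens,
    }


# Made-up usage numbers for illustration only.
report = usage_report({"input_tokens": 50, "output_tokens": 3900}, max_tokens=4096)
print(report)
```

A response that regularly lands near the limit is a signal to raise max_tokens or lower the effort level.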
4. Handle Thinking Content in Your Application
If you're displaying Claude's response to end users, you may want to:
- Show thinking content as an expandable section (e.g., "Show reasoning").
- Use the thinking content for debugging or quality assurance.
- Omit thinking content entirely and only show the final text.
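The first and third options can share one rendering function with a flag. This is a sketch only: `render_response` is a hypothetical helper, it assumes content blocks as plain dictionaries, and it emits a simple HTML details element as one possible "Show reasoning" control.

```python
import html


def render_response(content, show_reasoning=False):
    """Render the answer as HTML, optionally with collapsible reasoning."""
    reasoning = "\n".join(
        b.get("thinking", "") for b in content if b.get("type") == "thinking"
    )
    answer = "\n".join(
        b.get("text", "") for b in content if b.get("type") == "text"
    )
    parts = []
    if show_reasoning and reasoning:
        parts.append(
            "<details><summary>Show reasoning</summary><pre>"
            + html.escape(reasoning)
            + "</pre></details>"
        )
    # Escape model output before placing it in markup.
    parts.append("<p>" + html.escape(answer) + "</p>")
    return "\n".join(parts)


demo_content = [
    {"type": "thinking", "thinking": "Is 2 < 3? Yes."},
    {"type": "text", "text": "2 is less than 3."},
]
print(render_response(demo_content, show_reasoning=True))
```

Escaping both strings matters because reasoning and answers can contain characters such as < that would otherwise be read as markup.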
5. Migrate from Manual to Adaptive
If you're using manual mode on Claude Opus 4.6 or Sonnet 4.6, start migrating to adaptive thinking now. The transition is straightforward:
Before (manual):
thinking={"type": "enabled", "budget_tokens": 4096}
After (adaptive):
thinking={"type": "adaptive", "effort": "high"}
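If thinking configs are built in many places, a small converter can make the switch mechanical. The sketch below is hypothetical (`to_adaptive` is not a library function) and deliberately does not try to derive an effort level from the old budget, since this guide gives no mapping between the two; you choose the effort explicitly.

```python
EFFORT_LEVELS = ("low", "medium", "high")


def to_adaptive(thinking: dict, effort: str = "high") -> dict:
    """Convert a manual thinking config to the adaptive shape used in this guide."""
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"effort must be one of {EFFORT_LEVELS}, got {effort!r}")
    if thinking.get("type") != "enabled":
        # Not a manual config; return an unchanged copy.
        return dict(thinking)
    # budget_tokens is dropped because adaptive mode has no fixed budget.
    return {"type": "adaptive", "effort": effort}


migrated = to_adaptive({"type": "enabled", "budget_tokens": 4096})
print(migrated)
```

After migrating, compare answer quality at each effort level on a few representative prompts before settling on a default.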
Common Pitfalls
- Setting budget_tokens too low: If the budget is too small, Claude may cut off its reasoning prematurely, leading to lower quality answers.
- Forgetting max_tokens: Always set max_tokens higher than budget_tokens.
- Using manual mode on Opus 4.7: This will return a 400 error. Use adaptive thinking.
- Ignoring the signature: The signature lets the API verify the integrity of thinking content. If you send thinking blocks back in a later request, include them unmodified, signature and all.
Conclusion
Extended thinking is a powerful feature that elevates Claude from a simple text generator to a true reasoning engine. By understanding the differences between adaptive and manual modes, choosing the right effort level, and following best practices, you can build applications that tackle the most complex problems with clarity and accuracy.
Whether you're building a math tutor, a code reviewer, or a research assistant, extended thinking gives you the transparency and control you need to deliver high-quality results.
Key Takeaways
- Adaptive thinking is the future: Use thinking: {type: "adaptive", effort: "high|medium|low"} on Claude Opus 4.7 and newer models. Manual mode is deprecated.
- Effort levels matter: Choose low for speed, medium for balance, and high for the hardest reasoning tasks.
- Model compatibility: Claude Opus 4.7 requires adaptive thinking; older models support both but manual mode is deprecated.
- Token budget: Always set max_tokens higher than budget_tokens in manual mode. In adaptive mode, Claude manages the thinking budget automatically.
- Streaming works: Extended thinking is fully compatible with streaming. Handle thinking_delta and text_delta events separately.