Mastering Adaptive Thinking in Claude: Dynamic Reasoning for Smarter AI Workflows
Learn how to use Claude's adaptive thinking mode to dynamically allocate reasoning tokens based on task complexity, with practical API examples and best practices.
Adaptive thinking lets Claude dynamically decide when and how much to use extended reasoning per request, replacing fixed token budgets. It improves performance on complex tasks, supports interleaved thinking between tool calls, and is the only mode on Claude Opus 4.7.
Introduction
Claude's extended thinking capability has been a game-changer for complex reasoning tasks. But until now, you had to manually set a budget_tokens value—a fixed amount of thinking tokens for every request. This worked well for predictable workloads, but it often wasted tokens on simple queries or fell short on unexpectedly complex ones.
This guide covers everything you need to know: which models support it, how to use it in your code, how to tune it with the effort parameter, and when to stick with the old approach.
What Is Adaptive Thinking?
Adaptive thinking is the recommended way to use extended thinking with Claude Opus 4.7, Opus 4.6, and Sonnet 4.6. It is also the default mode on Claude Mythos Preview.
Instead of setting a fixed budget_tokens value, you set thinking.type to "adaptive". Claude then evaluates each request's complexity and decides:
- Whether to use extended thinking at all
- How many thinking tokens to allocate
high), Claude almost always thinks. At lower effort levels, it may skip thinking for simpler problems, saving you tokens and latency.
Key Benefits
- Better performance on bimodal tasks (mixing simple and complex queries)
- Ideal for long-horizon agentic workflows where reasoning depth varies
- Automatic interleaved thinking—Claude can think between tool calls, improving multi-step reasoning
- No beta header required—just set the parameter and go
Note: Adaptive thinking is eligible for Zero Data Retention (ZDR). If your organization has a ZDR arrangement, data sent through this feature is not stored after the API response is returned.
Supported Models
| Model | Adaptive Thinking Support | Notes |
|---|---|---|
| Claude Mythos Preview | ✅ Default | Thinking auto-applies; thinking: {type: "disabled"} is not supported |
| Claude Opus 4.7 | ✅ Only mode | Manual thinking: {type: "enabled"} is rejected with a 400 error |
| Claude Opus 4.6 | ✅ Supported | Manual budget_tokens is deprecated but still functional |
| Claude Sonnet 4.6 | ✅ Supported | Manual budget_tokens is deprecated but still functional |
| Older models (Sonnet 4.5, Opus 4.5, etc.) | ❌ Not supported | Must use thinking: {type: "enabled"} with budget_tokens |
How to Use Adaptive Thinking
Using adaptive thinking is straightforward. Set thinking.type to "adaptive" in your API request. Here's a Python example:
import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=16000,
thinking={"type": "adaptive"},
messages=[
{
"role": "user",
"content": "Explain why the sum of two even numbers is always even."
}
]
)
for block in response.content:
if block.type == "thinking":
print(f"\nThinking: {block.thinking}")
elif block.type == "text":
print(f"\nResponse: {block.text}")
And the equivalent in TypeScript:
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
model: 'claude-opus-4-7',
max_tokens: 16000,
thinking: { type: 'adaptive' },
messages: [
{
role: 'user',
content: 'Explain why the sum of two even numbers is always even.'
}
]
});
for (const block of response.content) {
if (block.type === 'thinking') {
console.log(\nThinking: ${block.thinking});
} else if (block.type === 'text') {
console.log(\nResponse: ${block.text});
}
}
That's it. No budget_tokens, no beta headers. Claude handles the rest.
Tuning with the Effort Parameter
Adaptive thinking pairs with the optional effort parameter to give you soft control over how much thinking Claude does. The effort level acts as guidance, not a hard limit.
Available Effort Levels
| Effort Level | Behavior |
|---|---|
low | Minimal thinking; Claude may skip thinking for simple tasks |
medium | Moderate thinking; a balanced approach |
high (default) | Maximum thinking; Claude almost always thinks |
Example with Effort
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=16000,
thinking={
"type": "adaptive",
"effort": "medium"
},
messages=[
{
"role": "user",
"content": "Write a Python script to analyze a CSV file."
}
]
)
When to Use Each Effort Level
low: High-throughput tasks like simple classification, extraction, or Q&A where deep reasoning isn't needed.medium: General-purpose use—good for most chat applications, content generation, and moderate analysis.high: Complex reasoning, multi-step agentic workflows, mathematical proofs, code generation, and tasks requiring deep analysis.
Adaptive Thinking in Agentic Workflows
One of the biggest advantages of adaptive thinking is automatic interleaved thinking. In agentic workflows where Claude makes multiple tool calls, adaptive thinking allows Claude to "think" between each call. This leads to:
- Better planning across steps
- More coherent multi-turn reasoning
- Improved error recovery when a tool returns unexpected results
import anthropic
client = anthropic.Anthropic()
def run_agentic_task(user_query):
messages = [{"role": "user", "content": user_query}]
while True:
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=16000,
thinking={"type": "adaptive"},
tools=[
{
"name": "search_database",
"description": "Search the internal database",
"input_schema": {
"type": "object",
"properties": {
"query": {"type": "string"}
},
"required": ["query"]
}
},
{
"name": "calculate",
"description": "Perform a calculation",
"input_schema": {
"type": "object",
"properties": {
"expression": {"type": "string"}
},
"required": ["expression"]
}
}
],
messages=messages
)
# Process response and handle tool calls...
# Adaptive thinking happens automatically between tool calls
Migration Guide: From Fixed Budget to Adaptive
If you're currently using thinking: {type: "enabled", budget_tokens: N}, here's how to migrate:
Step 1: Identify Your Models
Check which models you're using. If you're on Opus 4.7, you must migrate—the old format is rejected. If you're on Opus 4.6 or Sonnet 4.6, the old format still works but is deprecated.
Step 2: Replace the Parameter
Before (deprecated):thinking={
"type": "enabled",
"budget_tokens": 8000
}
After (recommended):
thinking={
"type": "adaptive",
"effort": "high"
}
Step 3: Test and Tune
Run your existing test suite with adaptive thinking. If you notice Claude thinking too much or too little, adjust the effort parameter. For most workloads, high matches or exceeds the performance of a generous fixed budget.
Step 4: Monitor Costs
Adaptive thinking can reduce costs on simple queries (where Claude skips thinking) but may increase thinking on complex ones. Monitor your token usage and adjust effort accordingly.
When to Use Fixed Budget Instead
Adaptive thinking is the recommended approach, but fixed budget_tokens is still available on Opus 4.6 and Sonnet 4.6 for specific use cases:
- Predictable latency: If you need every response to take roughly the same time, a fixed budget ensures consistent thinking duration.
- Strict cost control: If you need to cap thinking tokens per request for budgeting reasons.
- Legacy systems: While you migrate, the old format still works (but plan to update).
Warning: Fixed budget_tokens is deprecated on Opus 4.6 and Sonnet 4.6 and will be removed in a future model release. Migrate to adaptive thinking as soon as possible.
Best Practices
- Start with
higheffort for most workloads. It provides the best reasoning quality and you can dial down if needed. - Use
mediumorlowfor high-throughput applications where speed matters more than deep reasoning. - Combine with tool use for agentic workflows—adaptive thinking's interleaved mode shines here.
- Monitor your token usage after switching. You may see significant savings on simple tasks.
- Test on your specific use case before rolling out to production. While adaptive thinking generally improves performance, results can vary by workload.
Key Takeaways
- Adaptive thinking lets Claude dynamically decide when and how much to think, replacing the need for manual
budget_tokens. - It's the only supported mode on Claude Opus 4.7 and the default on Claude Mythos Preview. Manual
budget_tokensis deprecated on Opus 4.6 and Sonnet 4.6. - The
effortparameter (low,medium,high) gives you soft control over thinking depth without hard token limits. - Automatic interleaved thinking makes adaptive thinking ideal for agentic workflows with multiple tool calls.
- Migrate now if you're using fixed budgets—the old format will be removed in future model releases.