Mastering Adaptive Thinking in Claude: Dynamic Reasoning for Smarter AI Workflows
Learn how to use Claude's adaptive thinking mode for dynamic, cost-efficient reasoning. Covers API setup, effort levels, migration from fixed budgets, and best practices.
Adaptive thinking lets Claude dynamically decide when and how much to use extended reasoning per request, replacing fixed token budgets. It improves performance on bimodal tasks and agentic workflows while reducing unnecessary costs.
Introduction
Claude's extended thinking capability makes a real difference on complex reasoning tasks. Until recently, though, you had to set a budget_tokens value (a fixed number of thinking tokens) manually for every request. This one-size-fits-all approach often wasted tokens on simple queries or fell short on truly complex ones.
Adaptive thinking removes that fixed budget and lets Claude decide how much to reason on each request. This guide covers which models support it, how to configure it, the new effort parameter, and practical migration tips.
What Is Adaptive Thinking?
Adaptive thinking is the recommended way to use extended thinking with Claude Opus 4.7, Opus 4.6, and Sonnet 4.6. It's also the default mode on Claude Mythos Preview.
Instead of a rigid budget_tokens number, you set thinking.type to "adaptive". Claude then:
- Evaluates the complexity of each request
- Decides whether thinking is needed at all
- Allocates thinking tokens dynamically
- Enables interleaved thinking—Claude can think between tool calls, which is especially powerful for agentic workflows
Important: On Claude Opus 4.7, adaptive thinking is the only supported thinking mode. The old thinking: {type: "enabled", budget_tokens: N} is rejected with a 400 error.
Supported Models
| Model | Adaptive Thinking Support | Notes |
|---|---|---|
| Claude Mythos Preview | ✅ Default mode | Thinking auto-applies; cannot disable |
| Claude Opus 4.7 | ✅ Only mode | Manual thinking rejected |
| Claude Opus 4.6 | ✅ Supported | Manual thinking deprecated |
| Claude Sonnet 4.6 | ✅ Supported | Manual thinking deprecated |
| Older models (Sonnet 4.5, Opus 4.5, etc.) | ❌ Not supported | Must use enabled with budget_tokens |
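The support table can be encoded as a small lookup so that calling code picks a suitable thinking configuration per model. This is a minimal sketch: thinking_param_for is a hypothetical helper, the default fallback budget is arbitrary, and every model ID prefix other than claude-opus-4-7 is an assumption made for illustration. Mythos Preview is left out because this guide does not give its model ID.

```python
# Assumed model-ID prefixes; only "claude-opus-4-7" appears in this guide.
ADAPTIVE_PREFIXES = ("claude-opus-4-7", "claude-opus-4-6", "claude-sonnet-4-6")


def thinking_param_for(model: str, fallback_budget: int = 8000) -> dict:
    """Return a `thinking` request value suited to the given model ID."""
    if model.startswith(ADAPTIVE_PREFIXES):
        return {"type": "adaptive"}
    # Older models need manual mode with an explicit token budget.
    return {"type": "enabled", "budget_tokens": fallback_budget}


print(thinking_param_for("claude-opus-4-7"))
print(thinking_param_for("claude-sonnet-4-5"))
```

Keeping the decision in one place means a later model upgrade only touches the prefix tuple.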
How to Use Adaptive Thinking
Basic Setup
Set thinking.type to "adaptive" in your API request. No beta header is required.
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={"type": "adaptive"},
    messages=[
        {"role": "user", "content": "Explain the implications of quantum entanglement on cryptography."}
    ]
)
print(response.content)
TypeScript example:
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 4096,
  thinking: { type: 'adaptive' },
  messages: [
    { role: 'user', content: 'Explain the implications of quantum entanglement on cryptography.' }
  ]
});
console.log(response.content);
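In both examples, response.content is a list of content blocks rather than a single string, so printing it shows thinking and text blocks together. Below is a minimal sketch of separating them, assuming thinking blocks expose their text in a thinking attribute and text blocks in a text attribute. split_blocks is a hypothetical helper, and the stand-in objects take the place of a real API response.

```python
from types import SimpleNamespace


def split_blocks(content):
    """Separate thinking blocks from text blocks in a response's content list."""
    thinking, text = [], []
    for block in content:
        if block.type == "thinking":
            thinking.append(block.thinking)
        elif block.type == "text":
            text.append(block.text)
        # Other block types (for example tool_use) are ignored here.
    return "\n".join(thinking), "\n".join(text)


# Stand-in blocks for illustration; with the SDK you would pass response.content.
fake_content = [
    SimpleNamespace(type="thinking", thinking="Entanglement underpins key distribution."),
    SimpleNamespace(type="text", text="Quantum entanglement affects cryptography in two ways."),
]
reasoning, answer = split_blocks(fake_content)
print(answer)
```

When Claude decides that no thinking is needed, the first returned string is simply empty.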
Using the Effort Parameter
Adaptive thinking pairs with the new effort parameter, which gives you soft control over how much thinking Claude does. The effort level acts as guidance, not a hard limit.
| Effort Level | Behavior | Available On |
|---|---|---|
| max | Always thinks with no constraints on depth | Mythos, Opus 4.7, Opus 4.6, Sonnet 4.6 |
| xhigh | Always thinks deeply with extended exploration | Opus 4.7 only |
| high (default) | Always thinks; deep reasoning on complex tasks | All adaptive models |
| medium | Moderate thinking; may skip for simple tasks | All adaptive models |
| low | Minimal thinking; often skips for simple tasks | All adaptive models |
| none | Never thinks | All adaptive models |
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={"type": "adaptive"},
    # Effort is passed under output_config, separately from the thinking setting.
    output_config={"effort": "high"},  # or "max", "xhigh", "medium", "low", "none"
    messages=[
        {"role": "user", "content": "Design a distributed consensus algorithm for a network with intermittent connectivity."}
    ]
)
Tip: Start with high (the default) for most workloads. Use max or xhigh for research-grade problems. Use medium or low for high-throughput, simpler tasks.
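Because xhigh is listed for Opus 4.7 only, it can help to check an effort level against the table before sending a request. The sketch below mirrors the table as written. check_effort is a hypothetical helper, and the family labels are illustrative strings rather than API model IDs.

```python
# Effort levels taken from the table above; family labels are illustrative only.
COMMON_LEVELS = {"none", "low", "medium", "high", "max"}
EXTRA_LEVELS = {"opus-4.7": {"xhigh"}}


def check_effort(family: str, effort: str) -> str:
    """Return `effort` if the table lists it for `family`, else raise ValueError."""
    allowed = COMMON_LEVELS | EXTRA_LEVELS.get(family, set())
    if effort not in allowed:
        raise ValueError(f"effort {effort!r} is not listed for {family}")
    return effort


print(check_effort("opus-4.7", "xhigh"))
```

Failing early in your own code gives a clearer error than waiting for the API to reject the request.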
Why Migrate from Fixed Budgets?
If you're currently using thinking: {type: "enabled", budget_tokens: N} on Opus 4.6 or Sonnet 4.6, you should plan to migrate. Here's why:
- Better performance on bimodal tasks – Adaptive thinking excels when your workload mixes simple and complex requests. Claude doesn't waste tokens on easy questions.
- Long-horizon agentic workflows – Interleaved thinking (thinking between tool calls) is automatically enabled. This is critical for agents that need to reason, act, observe, and reason again.
- Cost efficiency – You only pay for the thinking Claude actually uses, not a fixed budget that may be too high or too low.
- Future-proof – Manual budget_tokens is deprecated on Opus 4.6 and Sonnet 4.6 and will be removed in a future model release.
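For the reason, act, observe loop mentioned above, the assistant turn is sent back together with the tool results so that its thinking blocks stay attached to the tool calls they preceded. Here is a minimal sketch of assembling that follow-up message list with plain dictionaries. append_tool_turn is a hypothetical helper, and the tool name, IDs, and block contents are placeholder data.

```python
def append_tool_turn(messages, assistant_content, tool_results):
    """Return a new message list with the assistant turn and tool results added.

    assistant_content is passed through unchanged so that any thinking blocks
    remain next to the tool_use blocks they came with.
    """
    result_blocks = [
        {"type": "tool_result", "tool_use_id": tool_use_id, "content": output}
        for tool_use_id, output in tool_results.items()
    ]
    return messages + [
        {"role": "assistant", "content": assistant_content},
        {"role": "user", "content": result_blocks},
    ]


# Placeholder data standing in for a real response's content blocks.
history = [{"role": "user", "content": "What is the weather in Paris?"}]
assistant_content = [
    {"type": "thinking", "thinking": "I should call the weather tool.", "signature": "..."},
    {"type": "tool_use", "id": "toolu_01", "name": "get_weather", "input": {"city": "Paris"}},
]
history = append_tool_turn(history, assistant_content, {"toolu_01": "18 C, cloudy"})
print(len(history))
```

The helper returns a new list instead of mutating its input, which keeps retries simple if a tool call fails.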
Migration Steps
- Change thinking.type from "enabled" to "adaptive"
- Remove budget_tokens entirely
- Optionally add an effort level (start with "high")
- Test on a representative sample of your workload
- Monitor latency and cost; adaptive thinking often reduces both
Before:
thinking={"type": "enabled", "budget_tokens": 16000}
After (recommended):
thinking={"type": "adaptive"},
output_config={"effort": "high"}
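If request parameters are built in many places, the first two migration steps can be applied mechanically. The following is a sketch only: migrate_thinking is a hypothetical helper, the model ID in the example is an assumption, and the optional effort setting is deliberately left to the caller.

```python
def migrate_thinking(request: dict) -> dict:
    """Return a copy of request kwargs with manual thinking switched to adaptive."""
    migrated = dict(request)
    thinking = migrated.get("thinking")
    if isinstance(thinking, dict) and thinking.get("type") == "enabled":
        # Replacing the whole dict also drops budget_tokens.
        migrated["thinking"] = {"type": "adaptive"}
    return migrated


# The model ID here is an assumed example, not taken from this guide.
old = {
    "model": "claude-opus-4-6",
    "max_tokens": 4096,
    "thinking": {"type": "enabled", "budget_tokens": 16000},
}
new = migrate_thinking(old)
print(new["thinking"])
```

Requests that have no thinking setting, or that are already adaptive, pass through unchanged.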
Practical Use Cases
1. Customer Support Bot
A support bot handles everything from password resets to complex billing disputes. With adaptive thinking:
- Simple requests ("What's my account balance?") → no thinking → fast, cheap
- Complex requests ("Why was my transaction declined and how do I appeal?") → deep thinking → accurate
2. Code Generation Assistant
- Simple snippet ("Write a function to reverse a string") → low effort
- Complex refactoring ("Migrate this monolithic Java app to microservices") → high effort
3. Research Agent
Use effort: "max" or "xhigh" for deep analytical tasks like literature review synthesis or mathematical proofs.
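One way to apply these use cases in code is a small routing table from task category to effort level. The category names and the effort_for helper below are hypothetical. The mapping follows the examples above and falls back to high, the stated default.

```python
# Illustrative task categories mapped to the effort levels suggested above.
EFFORT_BY_TASK = {
    "simple_snippet": "low",
    "complex_refactor": "high",
    "research": "max",
}


def effort_for(task_kind: str) -> str:
    """Pick an effort level for a task category, defaulting to "high"."""
    return EFFORT_BY_TASK.get(task_kind, "high")


print(effort_for("simple_snippet"))
```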
Best Practices
- Start with default effort – high works well for most use cases. Only adjust if you have specific latency or depth requirements.
- Combine with tool use – Adaptive thinking's interleaved thinking makes it ideal for agents that call tools, process results, and call more tools.
- Monitor thinking blocks – In streaming mode, look for thinking content blocks to understand how Claude is reasoning.
- Test on representative data – Don't just test on easy examples. Include edge cases that require deep reasoning.
- Consider zero data retention – Adaptive thinking is eligible for ZDR. If your organization has a ZDR arrangement, data is not stored after the API response.
Limitations and Caveats
- Not available on older models – Sonnet 4.5, Opus 4.5, and earlier require manual budget_tokens.
- No hard latency guarantees – Because thinking depth is dynamic, response times can vary. If you need predictable latency, consider using effort: "low" or "none".
- Effort is soft guidance – Claude may think more or less than you expect. It's not a hard constraint.
Key Takeaways
- Adaptive thinking replaces fixed token budgets – Claude dynamically decides when and how much to think, improving performance and reducing waste.
- Use the effort parameter for soft control – Choose from none, low, medium, high, max, or xhigh (Opus 4.7 only) to guide thinking depth.
- Interleaved thinking is automatic – Claude can reason between tool calls, making adaptive thinking ideal for agentic workflows.
- Migrate away from budget_tokens – Manual thinking is deprecated on Opus 4.6 and Sonnet 4.6 and will be removed in future releases.
- Start with effort: "high" – It's the default for a reason. Adjust up or down based on your specific workload needs.