Mastering Adaptive Thinking in Claude: Dynamic Reasoning for Smarter AI Workflows
Learn how to use adaptive thinking with Claude Opus 4.7, Sonnet 4.6, and Mythos Preview. Includes code examples, effort parameters, and migration tips from fixed budget_tokens.
Adaptive thinking lets Claude dynamically decide when and how much to use extended thinking per request, replacing fixed budget_tokens. It's the only mode on Opus 4.7 and default on Mythos Preview. Use thinking.type: 'adaptive' and optionally the effort parameter to control depth.
Introduction
Claude's extended thinking is valuable for complex reasoning tasks. Until now, though, you had to set a budget_tokens value manually: a fixed cap on the tokens Claude could use for internal reasoning before generating a response. This worked well for predictable workloads, but when task complexity varied you often either wasted tokens on simple queries or ran out of thinking room on hard ones. Adaptive thinking addresses this by letting Claude decide, per request, whether and how much to think.
In this guide, you'll learn:
- How adaptive thinking works under the hood
- How to use it in your API calls (with code examples)
- How to control thinking depth with the effort parameter
- Best practices for migrating from fixed budget_tokens
- When to stick with manual thinking (and when not to)
What Is Adaptive Thinking?
Adaptive thinking is a new mode for Claude's extended thinking feature. When enabled, Claude evaluates each incoming request and decides:
- Whether thinking is needed at all
- How many tokens to allocate to thinking
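Because Claude may skip thinking entirely in adaptive mode, response-handling code should not assume a thinking block is present. The following is a minimal sketch, assuming content blocks expose a type attribute (and text blocks a text attribute) as in the SDK examples in this guide. The helper names used_thinking and answer_text are hypothetical, not part of the SDK.

```python
def used_thinking(content) -> bool:
    """Return True if any content block in the response is a thinking block."""
    return any(getattr(block, "type", None) == "thinking" for block in content)


def answer_text(content) -> str:
    """Join the text blocks, whether or not thinking blocks are present."""
    return "".join(
        block.text for block in content if getattr(block, "type", None) == "text"
    )
```

With a real response you would call used_thinking(response.content) to see whether Claude chose to think on that request.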
Key Benefits
- Cost efficiency: No wasted tokens on trivial queries
- Better performance on bimodal tasks: Handles both simple and complex requests in the same conversation
- Ideal for agentic workflows: Adaptive thinking automatically enables interleaved thinking, meaning Claude can think between tool calls—critical for multi-step agents
- No more guesswork: You don't need to tune budget_tokens per use case
Supported Models
| Model | Adaptive Thinking Support | Notes |
|---|---|---|
| Claude Mythos Preview | ✅ Default | thinking: {type: "disabled"} not supported |
| Claude Opus 4.7 | ✅ Only mode | Manual thinking: {type: "enabled"} rejected with 400 error |
| Claude Opus 4.6 | ✅ Supported | Manual mode deprecated, plan to migrate |
| Claude Sonnet 4.6 | ✅ Supported | Manual mode deprecated, plan to migrate |
| Older models (Sonnet 4.5, Opus 4.5, etc.) | ❌ Not supported | Must use thinking.type: "enabled" with budget_tokens |
⚠️ Warning: On Opus 4.6 and Sonnet 4.6, thinking.type: "enabled" and budget_tokens are deprecated and will be removed in a future model release. Migrate to adaptive thinking now.
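If one codebase targets several models, the support table can be encoded as a small lookup. This is a sketch under assumptions: only claude-opus-4-7 appears as a model ID in this guide, so the other ID prefixes are guesses based on the same naming pattern, and the helper name thinking_config is hypothetical. Mythos Preview is left out because no model ID is given for it here.

```python
# Assumed model-ID prefixes; only "claude-opus-4-7" is taken from this guide.
ADAPTIVE_ONLY = ("claude-opus-4-7",)
ADAPTIVE_SUPPORTED = ("claude-opus-4-6", "claude-sonnet-4-6")


def thinking_config(model: str, legacy_budget: int = 8000) -> dict:
    """Return a `thinking` parameter suited to the given model.

    Models in the two tuples above get adaptive thinking; anything else is
    treated as an older model that needs manual mode with a fixed budget.
    """
    if model.startswith(ADAPTIVE_ONLY + ADAPTIVE_SUPPORTED):
        return {"type": "adaptive"}
    return {"type": "enabled", "budget_tokens": legacy_budget}
```

Treating unknown models as legacy is a design choice for this sketch; you may prefer to raise an error instead so that new models are classified explicitly.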
How to Use Adaptive Thinking
Using adaptive thinking is straightforward. Set thinking.type to "adaptive" in your API request. No beta header is required.
Basic Example (Python)
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    messages=[
        {
            "role": "user",
            "content": "Explain why the sum of two even numbers is always even."
        }
    ]
)

for block in response.content:
    if block.type == "thinking":
        print(f"\nThinking: {block.thinking}")
    elif block.type == "text":
        print(f"\nResponse: {block.text}")
Basic Example (TypeScript)
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 16000,
  thinking: { type: 'adaptive' },
  messages: [
    {
      role: 'user',
      content: 'Explain why the sum of two even numbers is always even.'
    }
  ]
});

// Print thinking and text blocks separately
for (const block of response.content) {
  if (block.type === 'thinking') {
    console.log(`\nThinking: ${block.thinking}`);
  } else if (block.type === 'text') {
    console.log(`\nResponse: ${block.text}`);
  }
}
Controlling Thinking Depth with the Effort Parameter
Adaptive thinking can be combined with the optional effort parameter to guide how much thinking Claude should do. The effort level acts as soft guidance, not a hard limit.
| Effort Level | Behavior |
|---|---|
| low | Minimal thinking; may skip thinking for simple problems |
| medium | Balanced thinking allocation |
| high (default) | Claude almost always thinks, and allocates more tokens |
Example with Effort
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    # effort is set in output_config, not inside the thinking block
    output_config={"effort": "medium"},  # Options: low, medium, high
    messages=[
        {
            "role": "user",
            "content": "Design a scalable microservices architecture for an e-commerce platform."
        }
    ]
)
When to Use Each Effort Level
- low: Use for high-throughput, low-complexity tasks where you want minimal latency and cost. Example: simple classification, basic Q&A.
- medium: A good general-purpose choice that balances reasoning depth with speed.
- high: Use for complex reasoning, multi-step agentic workflows, or when you need Claude to thoroughly explore a problem. This is the API default, so it applies whenever you omit effort.
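One way to apply this guidance is to pick the effort level from a coarse task category before building the request. The sketch below is illustrative only: the task categories and the build_request helper are hypothetical, and it assumes effort is passed through the output_config request field rather than inside thinking.

```python
# Hypothetical task categories; adjust these to your own workload.
EFFORT_BY_TASK = {
    "classification": "low",
    "basic_qa": "low",
    "general": "medium",
    "agentic": "high",
    "complex_reasoning": "high",
}


def build_request(task_kind: str, prompt: str, model: str = "claude-opus-4-7") -> dict:
    """Build keyword arguments for client.messages.create()."""
    # Unknown task kinds fall back to "high", the default effort level.
    effort = EFFORT_BY_TASK.get(task_kind, "high")
    return {
        "model": model,
        "max_tokens": 16000,
        "thinking": {"type": "adaptive"},
        # Assumption: effort lives in output_config.
        "output_config": {"effort": effort},
        "messages": [{"role": "user", "content": prompt}],
    }
```

The result can be unpacked into a call, for example client.messages.create(**build_request("classification", "Is this review positive?")).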
Adaptive Thinking in Agentic Workflows
One of the most powerful features of adaptive thinking is interleaved thinking. In agentic workflows where Claude makes multiple tool calls, adaptive thinking allows Claude to think between each tool call, not just at the beginning.
This is critical for:
- Multi-step research agents that need to reason after each search result
- Code generation agents that iterate on solutions
- Decision-making agents that evaluate intermediate results
Example: Agent with Tool Calls
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=32000,
    thinking={"type": "adaptive"},
    tools=[
        {
            "name": "web_search",
            "description": "Search the web for current information",
            "input_schema": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"}
                },
                "required": ["query"]
            }
        }
    ],
    messages=[
        {
            "role": "user",
            "content": "Research the latest AI regulations in the EU and summarize them."
        }
    ]
)
With adaptive thinking, Claude will think before the first tool call, then think again after receiving results, before making the next call or generating the final summary.
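The request above is only the first turn. To let Claude think between calls, your code has to run each requested tool and send the result back. The loop below is a rough sketch of that cycle, not a definitive implementation. It assumes create behaves like client.messages.create, that tool_impls is your own mapping from tool names to Python callables (hypothetical), and that the assistant turn is echoed back unchanged so any thinking blocks are preserved.

```python
from typing import Callable


def run_agent(create: Callable, tool_impls: dict, request: dict, max_turns: int = 5):
    """Drive a tool-use loop until Claude stops asking for tools."""
    messages = list(request["messages"])
    for _ in range(max_turns):
        response = create(**{**request, "messages": messages})
        if response.stop_reason != "tool_use":
            return response  # final answer (or another stop reason)

        # Pass the assistant turn back as-is, thinking blocks included.
        messages.append({"role": "assistant", "content": response.content})

        # Run every requested tool and collect the results for the next turn.
        results = []
        for block in response.content:
            if block.type == "tool_use":
                output = tool_impls[block.name](**block.input)
                results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(output),
                })
        messages.append({"role": "user", "content": results})
    raise RuntimeError("agent did not finish within max_turns")
```

Passing create as a parameter keeps the loop testable with a stub; in production you would pass client.messages.create together with the request arguments from the example above.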
Migrating from Fixed budget_tokens
If you're currently using thinking.type: "enabled" with budget_tokens, here's your migration path:
Step 1: Identify affected models
Check which models you're using. On Opus 4.6 and Sonnet 4.6, manual mode is deprecated and should be migrated. On Opus 4.7, manual mode is rejected outright, so adaptive thinking is required. Older models still require the old syntax.
Step 2: Replace the thinking block
Before (deprecated):
thinking={
    "type": "enabled",
    "budget_tokens": 8000
}
After (recommended):
thinking={"type": "adaptive"},
output_config={"effort": "high"}  # optional, defaults to high
Step 3: Test and adjust
- Start with effort: "high" to match your previous deep-thinking behavior
- If you notice higher token usage, try effort: "medium"
- For simple tasks, effort: "low" can significantly reduce costs
Step 4: Remove any budget_tokens logic
Your application code should no longer calculate or pass budget_tokens. Let Claude handle allocation.
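If requests are assembled as dictionaries in many places, the migration steps can be applied in one spot. The helper below is a hypothetical sketch: migrate_thinking is not an SDK function, and it assumes effort is supplied through output_config.

```python
def migrate_thinking(request: dict, effort: str = "high") -> dict:
    """Return a copy of `request` that uses adaptive thinking.

    Requests that do not use manual thinking are copied without changes.
    """
    migrated = dict(request)
    thinking = migrated.get("thinking") or {}
    if thinking.get("type") == "enabled":
        # Replacing the whole block drops budget_tokens.
        migrated["thinking"] = {"type": "adaptive"}
        # Assumption: effort is set in output_config; keep any other keys there.
        migrated["output_config"] = {**migrated.get("output_config", {}), "effort": effort}
    return migrated
```

The function returns a new dictionary rather than editing in place, so the old and new requests can be compared side by side during testing.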
Best Practices
- Start with effort: "high" for complex reasoning tasks, then dial down if you see overthinking.
- Use adaptive thinking for agentic workflows; interleaved thinking is a major advantage over fixed budgets.
- Monitor token usage in your logs. Adaptive thinking may use more tokens on complex queries than your old fixed budget, but it will save on simple ones.
- Don't mix modes on Opus 4.7: adaptive thinking is the only option. Trying to use type: "enabled" will return a 400 error.
- For Mythos Preview, adaptive thinking is the default. You cannot disable thinking entirely.
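For the monitoring advice, a small in-process log can be enough to compare effort levels before you wire up real metrics. This is a sketch, assuming you record response.usage.output_tokens after each call (thinking tokens are counted as output tokens). The UsageLog class is hypothetical.

```python
from collections import defaultdict
from statistics import mean


class UsageLog:
    """Collect output-token counts per effort level."""

    def __init__(self) -> None:
        self._output_tokens = defaultdict(list)

    def record(self, effort: str, output_tokens: int) -> None:
        self._output_tokens[effort].append(output_tokens)

    def summary(self) -> dict:
        """Return request count, mean, and max output tokens per effort level."""
        return {
            effort: {
                "requests": len(counts),
                "mean_output_tokens": mean(counts),
                "max_output_tokens": max(counts),
            }
            for effort, counts in self._output_tokens.items()
        }
```

After each request you would call something like log.record("medium", response.usage.output_tokens), then inspect log.summary() to decide whether to dial effort up or down.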
When to Use Manual Thinking (and When Not To)
Adaptive thinking is recommended for almost all use cases. However, on Opus 4.6 and Sonnet 4.6, where both modes are still accepted, there are two scenarios where a manual budget_tokens value might still be preferable, at least temporarily:
- Predictable latency: If you can't tolerate variable response times, a fixed budget caps thinking at a known maximum, which makes worst-case thinking time easier to bound.
- Strict cost control: If you need to cap thinking tokens per request for budgeting reasons. The effort parameter is only soft guidance, so it cannot enforce a cap.
Conclusion
Adaptive thinking represents a significant step forward in how Claude handles reasoning. By letting the model decide when and how much to think, you get:
- Better performance on mixed-complexity workloads
- Cost savings on simple queries
- Improved agentic behavior with interleaved thinking
- No more manual token budget tuning
Key Takeaways
- Adaptive thinking replaces fixed budget_tokens on Opus 4.7, Opus 4.6, and Sonnet 4.6. On Opus 4.7, it's the only supported mode.
- Use thinking.type: "adaptive" with an optional effort parameter (low, medium, high) to guide thinking depth.
- Interleaved thinking is automatic in adaptive mode, making it ideal for agentic workflows with multiple tool calls.
- Migrate now if you're using budget_tokens on Opus 4.6 or Sonnet 4.6; the old syntax is deprecated and will be removed.
- Start with effort: "high" for complex tasks, then adjust based on your observed token usage and latency requirements.