Mastering Adaptive Thinking in Claude: Dynamic Reasoning for Smarter AI Workflows
Learn how to use Claude's adaptive thinking mode to dynamically allocate reasoning resources, optimize performance, and reduce costs across complex agentic workflows and bimodal tasks.
Adaptive thinking lets Claude dynamically decide when and how much to use extended thinking based on request complexity, replacing fixed token budgets. It enables interleaved thinking between tool calls, supports effort levels for cost control, and is the default on Claude Mythos Preview and the only mode on Opus 4.7.
Mastering Adaptive Thinking in Claude: Dynamic Reasoning for Smarter AI Workflows
Claude's reasoning capabilities have taken a major leap forward with adaptive thinking — a new approach to extended thinking that replaces the old fixed-budget model. Instead of manually guessing how many thinking tokens your task needs, adaptive thinking lets Claude dynamically decide when and how much to reason, based on the complexity of each individual request.
This guide will walk you through everything you need to know about adaptive thinking: how it works, which models support it, how to implement it in your API calls, and best practices for getting the most out of this powerful feature.
What Is Adaptive Thinking?
Adaptive thinking is the recommended way to use extended thinking with Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6. It is also the default mode on Claude Mythos Preview, where it auto-applies whenever thinking is unset.
Unlike the older thinking: {type: "enabled", budget_tokens: N} approach — where you had to specify a fixed number of tokens for reasoning — adaptive thinking allows Claude to:
- Evaluate request complexity and decide whether thinking is needed
- Allocate thinking dynamically — more for complex problems, less (or none) for simple ones
- Enable interleaved thinking between tool calls, making it ideal for agentic workflows
Important: On Claude Opus 4.7, adaptive thinking is the only supported thinking mode. The old thinking: {type: "enabled", budget_tokens: N} will be rejected with a 400 error.
Supported Models
| Model | API Name | Adaptive Thinking Support |
|---|---|---|
| Claude Mythos Preview | claude-mythos-preview | Default (auto-applies when thinking unset; cannot disable) |
| Claude Opus 4.7 | claude-opus-4-7 | Only mode (must explicitly set thinking: {type: "adaptive"}) |
| Claude Opus 4.6 | claude-opus-4-6 | Supported (old enabled mode deprecated) |
| Claude Sonnet 4.6 | claude-sonnet-4-6 | Supported (old enabled mode deprecated) |
Note: Older models (Sonnet 4.5, Opus 4.5, etc.) do not support adaptive thinking and still requirethinking: {type: "enabled"}withbudget_tokens.
How Adaptive Thinking Works
In adaptive mode, thinking becomes optional for the model. Here's what happens under the hood:
- Claude evaluates each request — it looks at the prompt complexity, the number of steps required, and whether tool calls are involved.
- Decision time — At the default
effortlevel (high), Claude almost always thinks. At lower effort levels, it may skip thinking for simpler problems. - Dynamic allocation — If thinking is needed, Claude allocates just enough tokens to solve the problem effectively, rather than burning through a fixed budget.
- Interleaved thinking — Claude can think between tool calls, not just before them. This is a game-changer for agentic workflows where the model needs to reason after receiving tool results.
Why Interleaved Thinking Matters
Traditional extended thinking only allowed Claude to think before generating a response. In agentic workflows — where Claude calls tools, gets results, and needs to decide what to do next — this was limiting. With interleaved thinking, Claude can:
- Call a search tool
- Receive results
- Think about those results
- Decide the next action
- Call another tool
- Think again after receiving new data
How to Use Adaptive Thinking in the API
Basic Implementation
Here's how to enable adaptive thinking in your API requests:
import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=16000,
thinking={"type": "adaptive"},
messages=[
{
"role": "user",
"content": "Explain why the sum of two even numbers is always even."
}
]
)
for block in response.content:
if block.type == "thinking":
print(f"\nThinking: {block.thinking}")
elif block.type == "text":
print(f"\nResponse: {block.text}")
TypeScript Example
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
model: 'claude-opus-4-7',
max_tokens: 16000,
thinking: { type: 'adaptive' },
messages: [
{
role: 'user',
content: 'Explain why the sum of two even numbers is always even.'
}
]
});
for (const block of response.content) {
if (block.type === 'thinking') {
console.log(\nThinking: ${block.thinking});
} else if (block.type === 'text') {
console.log(\nResponse: ${block.text});
}
}
Controlling Thinking with the Effort Parameter
Adaptive thinking comes with an optional effort parameter that acts as soft guidance for how much thinking Claude should do. This is especially useful when you want to balance reasoning depth against cost and latency.
Effort Levels
| Effort Level | Behavior | Best For |
|---|---|---|
low | Minimal thinking; may skip for simple tasks | High-throughput, low-cost scenarios |
medium | Moderate thinking allocation | Balanced workloads |
high (default) | Maximum thinking; Claude almost always thinks | Complex reasoning, agentic workflows |
Example with Effort Control
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=16000,
thinking={
"type": "adaptive",
"effort": "medium" # Options: "low", "medium", "high"
},
messages=[
{
"role": "user",
"content": "Write a Python script to analyze a CSV file with 10,000 rows."
}
]
)
Pro Tip: Start witheffort: "high"for development and testing. Once you understand your workload's complexity, you can dial it down to"medium"or"low"to optimize costs.
When to Use Adaptive Thinking
Ideal Use Cases
- Bimodal tasks — Workloads that mix simple and complex requests. Adaptive thinking saves tokens on easy questions while going deep on hard ones.
- Long-horizon agentic workflows — Multi-step tasks where Claude needs to reason between tool calls (e.g., research agents, data analysis pipelines).
- Variable complexity workloads — When you can't predict how much thinking each request will need.
- Cost-sensitive applications — The effort parameter gives you fine-grained control over spending.
When to Stick with Fixed Budget
If your workload requires predictable latency or precise control over thinking costs, the old thinking: {type: "enabled", budget_tokens: N} approach is still functional on Opus 4.6 and Sonnet 4.6. However, it is deprecated and no longer recommended — plan to migrate to adaptive thinking.
Migration Guide: Moving from Fixed Budget to Adaptive
If you're currently using thinking: {type: "enabled", budget_tokens: N}, here's how to migrate:
Step 1: Update Your Model
Ensure you're using a supported model (Opus 4.6+, Sonnet 4.6+, or Mythos Preview).
Step 2: Change the Thinking Configuration
Before (deprecated):thinking={
"type": "enabled",
"budget_tokens": 8000
}
After (recommended):
thinking={
"type": "adaptive",
"effort": "high"
}
Step 3: Test and Tune
Run your existing test suite with adaptive thinking. You may find that:
- Simple requests use fewer tokens than before
- Complex requests may use more (but produce better results)
- Overall costs may decrease for bimodal workloads
Step 4: Adjust Effort as Needed
If you notice costs increasing, try reducing the effort level to "medium" or "low".
Best Practices
- Always set
max_tokensgenerously — Adaptive thinking can allocate more tokens than you might expect for complex tasks. Setmax_tokensto at least 2x your expected output length.
- Monitor thinking blocks — In your response handling, always check for
block.type == "thinking"to capture Claude's reasoning. This is valuable for debugging and transparency.
- Start with
effort: "high"— Get a baseline for your workload's thinking needs before optimizing.
- Use for agentic loops — Adaptive thinking shines when Claude needs to reason between tool calls. Build your agents to leverage interleaved thinking.
- Test with your actual workload — Synthetic benchmarks may not reflect real-world performance. Run A/B tests with your production prompts.
- No beta header required — Unlike some Claude features, adaptive thinking doesn't need any special beta headers. Just set
thinking: {type: "adaptive"}.
Key Takeaways
- Adaptive thinking replaces fixed token budgets — Claude dynamically decides when and how much to think based on request complexity, eliminating the need to guess
budget_tokensvalues. - Interleaved thinking enables better agentic workflows — Claude can reason between tool calls, creating more natural and effective multi-step reasoning loops.
- The effort parameter gives you cost control — Use
effort: "low","medium", or"high"to balance reasoning depth against token usage and latency. - Migration is straightforward — Update your thinking configuration from
{type: "enabled", budget_tokens: N}to{type: "adaptive"}and optionally add an effort level. - Opus 4.7 requires adaptive thinking — If you're using the latest Opus model, adaptive thinking is your only option (and it's the best one).