Adaptive Thinking: Claude's Dynamic Reasoning Mode for Smarter AI Responses
Learn how to use Claude's adaptive thinking mode to dynamically allocate reasoning effort based on task complexity, improving performance on agentic workflows and bimodal tasks.
Adaptive thinking lets Claude dynamically decide when and how much to use extended reasoning based on request complexity. It replaces manual budget_tokens on Claude Opus 4.7, is the recommended mode on Opus 4.6 and Sonnet 4.6 (where budget_tokens is deprecated), and enables interleaved thinking between tool calls for better agentic performance.
Claude's extended thinking capability has been a game-changer for complex reasoning tasks. But what if Claude could decide when to think deeply and how much effort to invest—without you having to guess a token budget? That's exactly what adaptive thinking delivers.
Adaptive thinking is the recommended way to use extended thinking with Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6. Instead of manually setting a budget_tokens value, you let Claude dynamically determine when and how much to use extended thinking based on the complexity of each request.
This guide covers everything you need to know: supported models, how it works, code examples, and migration tips.
What Is Adaptive Thinking?
Adaptive thinking is a new mode for Claude's extended thinking feature. In traditional extended thinking, you specify a fixed budget_tokens value (e.g., 16,000 tokens) that applies to every response, whether the request is trivial or hard. With adaptive thinking, Claude evaluates each request individually and decides:
- Whether extended thinking is needed at all
- How much thinking effort to apply
- When to think between tool calls (interleaved thinking)
This makes adaptive thinking especially well suited to:
- Bimodal tasks – workloads that mix simple and complex requests
- Long-horizon agentic workflows – multi-step tasks where Claude uses tools and needs to reason between calls
Supported Models
Adaptive thinking is available on these Claude models:
| Model | API Name | Notes |
|---|---|---|
| Claude Mythos Preview | claude-mythos-preview | Adaptive thinking is default; thinking: {"type": "disabled"} is not supported |
| Claude Opus 4.7 | claude-opus-4-7 | Only supported thinking mode; manual budget_tokens rejected with 400 error |
| Claude Opus 4.6 | claude-opus-4-6 | Supported; manual budget_tokens deprecated |
| Claude Sonnet 4.6 | claude-sonnet-4-6 | Supported; manual budget_tokens deprecated |
Warning: On Opus 4.6 and Sonnet 4.6, thinking.type: "enabled" with budget_tokens is deprecated and will be removed in a future model release. Plan to migrate to adaptive thinking.
Older models (Sonnet 4.5, Opus 4.5, etc.) do not support adaptive thinking and still require thinking.type: "enabled" with budget_tokens.
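The support table above can be expressed as a small lookup that picks a thinking configuration per model. This is a minimal sketch, not part of the SDK: `ADAPTIVE_MODELS`, `thinking_config_for`, the default budget of 16000, and the example older model ID `claude-sonnet-4-5` are illustrative assumptions.

```python
# Hypothetical helper: choose a `thinking` config from the support table above.
ADAPTIVE_MODELS = {
    "claude-mythos-preview",
    "claude-opus-4-7",
    "claude-opus-4-6",
    "claude-sonnet-4-6",
}


def thinking_config_for(model: str, legacy_budget: int = 16000) -> dict:
    """Return an adaptive config where supported, else a fixed-budget config."""
    if model in ADAPTIVE_MODELS:
        return {"type": "adaptive"}
    # Older models (e.g. Sonnet 4.5, Opus 4.5) still need an explicit budget.
    return {"type": "enabled", "budget_tokens": legacy_budget}


print(thinking_config_for("claude-opus-4-7"))
print(thinking_config_for("claude-sonnet-4-5"))  # assumed example ID for an older model
```

Keeping the model list in one place means a model upgrade only requires touching the set, not every call site.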
How Adaptive Thinking Works
In adaptive mode, thinking is optional for the model. Here's what happens under the hood:
- Request arrives – Claude receives your prompt and messages
- Complexity evaluation – Claude analyzes the request's difficulty
- Decision – At the default effort level (high), Claude almost always thinks. At lower effort levels, it may skip thinking for simpler problems
- Interleaved thinking – In agentic workflows, Claude can think between tool calls, improving reasoning across multi-step tasks
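Because thinking is optional in adaptive mode, code that reads a response should not assume a thinking block is present, and in agentic workflows thinking blocks may appear alongside tool calls. The sketch below groups content blocks by type. `summarize_blocks` is a hypothetical helper, the `get_weather` tool is invented for illustration, and the `SimpleNamespace` objects are stand-ins shaped like SDK content blocks rather than real API output.

```python
from types import SimpleNamespace


def summarize_blocks(content) -> dict:
    """Group response content blocks by type; thinking may be absent."""
    summary = {"thinking": [], "tool_use": [], "text": []}
    for block in content:
        if block.type == "thinking":
            summary["thinking"].append(block.thinking)
        elif block.type == "tool_use":
            summary["tool_use"].append(block.name)
        elif block.type == "text":
            summary["text"].append(block.text)
    return summary


# Stand-in blocks (illustrative data only, not real API responses).
no_thinking = [SimpleNamespace(type="text", text="4")]
interleaved = [
    SimpleNamespace(type="thinking", thinking="Need the forecast first."),
    SimpleNamespace(type="tool_use", name="get_weather"),
]

print(summarize_blocks(no_thinking))
print(summarize_blocks(interleaved))
```

An empty `thinking` list is a normal outcome here, not an error condition.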
How to Use Adaptive Thinking (Code Examples)
Using adaptive thinking is straightforward. Set thinking.type to "adaptive" in your API request. No beta header is required.
Python Example
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    messages=[
        {
            "role": "user",
            "content": "Explain why the sum of two even numbers is always even."
        }
    ]
)

# Thinking blocks may be absent if Claude decides not to think.
for block in response.content:
    if block.type == "thinking":
        print(f"\nThinking: {block.thinking}")
    elif block.type == "text":
        print(f"\nResponse: {block.text}")
TypeScript Example
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 16000,
  thinking: { type: 'adaptive' },
  messages: [
    {
      role: 'user',
      content: 'Explain why the sum of two even numbers is always even.'
    }
  ]
});

// Thinking blocks may be absent if Claude decides not to think.
for (const block of response.content) {
  if (block.type === 'thinking') {
    console.log(`\nThinking: ${block.thinking}`);
  } else if (block.type === 'text') {
    console.log(`\nResponse: ${block.text}`);
  }
}
cURL Example
curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-opus-4-7",
    "max_tokens": 16000,
    "thinking": {"type": "adaptive"},
    "messages": [
      {"role": "user", "content": "Explain why the sum of two even numbers is always even."}
    ]
  }'
The Effort Parameter
Adaptive thinking introduces an optional effort parameter that lets you control how aggressively Claude applies thinking. The default is high, which means Claude will almost always think. Lower effort levels can reduce thinking for simple requests, saving tokens and latency.
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    # Effort is set through output_config, not inside the thinking object.
    output_config={"effort": "medium"},  # Options: low, medium, high
    messages=[
        {"role": "user", "content": "What is 2 + 2?"}
    ]
)
At low effort, Claude may skip thinking entirely for trivial questions. At high effort, it will think even for simple requests if it determines that deeper reasoning adds value.
Migrating from Fixed Budget to Adaptive Thinking
If you're currently using thinking.type: "enabled" with budget_tokens, here's how to migrate:
Before (deprecated)
thinking={
    "type": "enabled",
    "budget_tokens": 16000
}
After (recommended)
thinking={"type": "adaptive"},
output_config={"effort": "high"}  # optional, defaults to high
Migration tips:
- Start with effort: "high" to maintain similar thinking behavior
- Test on a subset of your traffic first
- Monitor latency and token usage—adaptive thinking may use fewer tokens for simple requests
- For agentic workflows, expect improved performance due to interleaved thinking
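The thinking-config part of the migration can be sketched as a pure function over request parameters, which makes it easy to apply to a subset of traffic first. This is an illustration under assumptions, not an official migration tool: `migrate_to_adaptive` and the `legacy` request dict are hypothetical, and the sketch only rewrites the `thinking` field, leaving any effort setting to the caller.

```python
import copy


def migrate_to_adaptive(request: dict) -> dict:
    """Return a copy of `request` with fixed-budget thinking swapped for adaptive."""
    migrated = copy.deepcopy(request)
    thinking = migrated.get("thinking")
    if thinking and thinking.get("type") == "enabled":
        # Drop budget_tokens; adaptive mode replaces the manual budget.
        migrated["thinking"] = {"type": "adaptive"}
    return migrated


# Hypothetical legacy request parameters, shaped like messages.create kwargs.
legacy = {
    "model": "claude-opus-4-6",
    "max_tokens": 16000,
    "thinking": {"type": "enabled", "budget_tokens": 16000},
    "messages": [{"role": "user", "content": "Hello"}],
}

print(migrate_to_adaptive(legacy)["thinking"])
```

Returning a copy rather than mutating the input lets you send both variants side by side and compare latency and token usage.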
When to Use Adaptive Thinking
| Use Case | Recommendation |
|---|---|
| Simple Q&A | Use adaptive with low/medium effort to save tokens |
| Complex reasoning | Use adaptive with high effort |
| Agentic workflows (tool use) | Adaptive is ideal—interleaved thinking helps planning |
| Predictable latency | Consider fixed budget on older models (temporary) |
| Cost-sensitive apps | Adaptive can reduce thinking on simple requests |
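One way to apply the table above in code is a lookup from workload category to effort level. This is a rough sketch: the category names, `EFFORT_BY_USE_CASE`, and `choose_effort` are hypothetical, and the choices of "high" for agentic workflows and "medium" for cost-sensitive apps are assumptions, since the table does not name an effort level for those rows.

```python
# Hypothetical mapping from workload category to effort level.
EFFORT_BY_USE_CASE = {
    "simple_qa": "low",           # table: low/medium effort to save tokens
    "complex_reasoning": "high",  # table: high effort
    "agentic": "high",            # assumption: keep the default for tool use
    "cost_sensitive": "medium",   # assumption: trade some depth for cost
}


def choose_effort(use_case: str) -> str:
    """Look up an effort level, falling back to the default of high."""
    return EFFORT_BY_USE_CASE.get(use_case, "high")


print(choose_effort("simple_qa"))
print(choose_effort("unlisted_workload"))
```

Falling back to "high" mirrors the documented default, so an unrecognized category behaves the same as omitting the effort setting.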
Limitations and Considerations
- Zero Data Retention (ZDR): Adaptive thinking is eligible for ZDR. If your organization has a ZDR arrangement, data sent through this feature is not stored after the API response is returned.
- Predictable latency: If your workload requires strict latency guarantees, adaptive thinking's dynamic nature may introduce variability. In such cases, consider using fixed budget_tokens on older models (though this is deprecated).
- Older models: Sonnet 4.5, Opus 4.5, and earlier do not support adaptive thinking. You must use thinking.type: "enabled" with budget_tokens for those models.
Key Takeaways
- Adaptive thinking lets Claude dynamically decide when and how much to use extended reasoning, replacing manual budget_tokens on Opus 4.7, Opus 4.6, and Sonnet 4.6.
- Interleaved thinking enables Claude to reason between tool calls, making it especially effective for agentic workflows and multi-step tasks.
- The effort parameter (low, medium, high) gives you control over how aggressively Claude applies thinking, with high being the default.
- Migration is straightforward – replace thinking.type: "enabled" with thinking.type: "adaptive" and optionally set an effort level.
- No beta header required – adaptive thinking is generally available on supported models.