Mastering Adaptive Thinking in Claude: Dynamic Reasoning for Smarter AI Workflows
Learn how to use Claude's adaptive thinking mode to dynamically allocate reasoning resources, improve agentic performance, and simplify your API code—with practical examples and best practices.
Adaptive thinking lets Claude dynamically decide when and how much to use extended reasoning per request, replacing manual token budgets. It improves performance on complex tasks, supports interleaved thinking between tool calls, and is the only mode for Claude Opus 4.7. Use `thinking: {type: "adaptive"}` with optional effort levels.
Introduction
If you've ever struggled to set the perfect `budget_tokens` for Claude's extended thinking, you're not alone. Too low, and Claude cuts off mid-reasoning. Too high, and you waste tokens on simple queries. With adaptive thinking, Claude handles that decision for you, dynamically allocating reasoning resources based on the complexity of each request.
Adaptive thinking is the recommended way to use extended thinking with Claude Opus 4.7, Opus 4.6, and Sonnet 4.6, and it's the default mode on Claude Mythos Preview. It's especially powerful for bimodal tasks (mixing simple and complex queries) and long-horizon agentic workflows where Claude needs to reason between tool calls.
In this guide, you'll learn exactly how adaptive thinking works, how to implement it in your API calls, and how to fine-tune it with the effort parameter.
What Is Adaptive Thinking?
Adaptive thinking is a replacement for the older `thinking: {type: "enabled", budget_tokens: N}` approach. Instead of making you guess how many thinking tokens a request needs, Claude evaluates each incoming request and decides:
- Whether to use extended thinking at all
- How much thinking to allocate
At the default effort level (`high`), Claude almost always thinks. At lower levels, it may skip thinking for simpler problems, saving you time and money.
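Because Claude may skip thinking on simple requests, response-handling code should not assume a thinking block is always present. A minimal sketch, assuming content blocks expose a `type` attribute as in the examples later in this guide; `used_thinking` is a hypothetical helper name, and the stand-in blocks below take the place of a real API response:

```python
from types import SimpleNamespace


def used_thinking(content_blocks):
    """Return True if any block in a response's content is a thinking block."""
    return any(getattr(block, "type", None) == "thinking" for block in content_blocks)


# Stand-in blocks for illustration; a real call would pass response.content.
simple_reply = [SimpleNamespace(type="text", text="4")]
complex_reply = [
    SimpleNamespace(type="thinking", thinking="Let me work through this..."),
    SimpleNamespace(type="text", text="The proof follows by induction."),
]

print(used_thinking(simple_reply))   # False
print(used_thinking(complex_reply))  # True
```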
Key Benefits
- No more token budget guesswork – Claude self-calibrates per request.
- Interleaved thinking – Claude can think between tool calls, which is critical for agentic loops.
- Better performance on mixed workloads – Simple queries don't waste thinking tokens; complex ones get the depth they need.
- Future-proof – Manual `budget_tokens` is deprecated on Opus 4.6 and Sonnet 4.6 and will be removed in a future release.
Supported Models
| Model | Adaptive Thinking Support | Notes |
|---|---|---|
| Claude Mythos Preview | ✅ Default | Applied automatically when `thinking` is unset; `disabled` is not supported |
| Claude Opus 4.7 | ✅ Only mode | Manual `enabled` is rejected with a 400 error |
| Claude Opus 4.6 | ✅ Supported | Manual `enabled` is deprecated but still functional |
| Claude Sonnet 4.6 | ✅ Supported | Manual `enabled` is deprecated but still functional |
| Older models (Sonnet 4.5, Opus 4.5) | ❌ Not supported | Must use `thinking: {type: "enabled", budget_tokens: N}` |
Warning: On Opus 4.6 and Sonnet 4.6, `thinking.type: "enabled"` with `budget_tokens` is deprecated. Existing configurations still work, but you should migrate to `thinking.type: "adaptive"` as soon as possible.
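If one codebase targets several models, the support table can be folded into a small lookup. The sketch below is illustrative only: `thinking_config_for` is a hypothetical helper, and the model ID prefixes other than `claude-opus-4-7` are assumptions to check against the IDs you actually deploy. Claude Mythos Preview is left out because adaptive thinking applies there when `thinking` is unset.

```python
def thinking_config_for(model: str, budget_tokens: int = 8000) -> dict:
    """Pick a thinking config for a model, following the support table.

    The prefixes below are assumed model IDs, not an authoritative list.
    """
    adaptive_prefixes = ("claude-opus-4-7", "claude-opus-4-6", "claude-sonnet-4-6")
    if model.startswith(adaptive_prefixes):
        return {"type": "adaptive"}
    # Older models (for example Sonnet 4.5, Opus 4.5) need a manual budget.
    return {"type": "enabled", "budget_tokens": budget_tokens}


print(thinking_config_for("claude-opus-4-7"))
print(thinking_config_for("claude-sonnet-4-5"))
```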
How to Use Adaptive Thinking
Using adaptive thinking is straightforward. Set `thinking.type` to `"adaptive"` in your API request. No beta header is required.
Basic Example (Python)
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    messages=[
        {
            "role": "user",
            "content": "Explain why the sum of two even numbers is always even."
        }
    ]
)

# Thinking blocks appear only when Claude chose to think for this request
for block in response.content:
    if block.type == "thinking":
        print(f"\nThinking: {block.thinking}")
    elif block.type == "text":
        print(f"\nResponse: {block.text}")
Basic Example (TypeScript)
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 16000,
  thinking: { type: 'adaptive' },
  messages: [
    {
      role: 'user',
      content: 'Explain why the sum of two even numbers is always even.'
    }
  ]
});

// Thinking blocks appear only when Claude chose to think for this request
for (const block of response.content) {
  if (block.type === 'thinking') {
    console.log(`\nThinking: ${block.thinking}`);
  } else if (block.type === 'text') {
    console.log(`\nResponse: ${block.text}`);
  }
}
Fine-Tuning with the Effort Parameter
Adaptive thinking comes with an optional effort parameter that acts as soft guidance for how much thinking Claude should do. The effort level doesn't enforce a hard limit—it nudges Claude toward more or less reasoning.
Effort Levels
| Effort | Behavior | Use Case |
|---|---|---|
| `low` | Minimal thinking; may skip for simple queries | High-throughput, low-complexity tasks |
| `medium` | Balanced thinking allocation | General-purpose use |
| `high` (default) | Almost always thinks; deep reasoning | Complex analysis, math, coding |
Example with Effort
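# Reuses the `client` created in the basic example
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    # Effort is sent in the top-level output_config field, not inside `thinking`
    output_config={"effort": "medium"},
    messages=[
        {
            "role": "user",
            "content": "Analyze the pros and cons of using microservices vs monoliths for a startup."
        }
    ]
)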
Tip: Start with `effort: "high"` for most workloads. Drop to `"medium"` or `"low"` only if you're certain your queries are simple or you need faster responses.
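One way to apply the effort table is to choose effort per request rather than globally. The sketch below is illustrative: `build_request` and `VALID_EFFORTS` are hypothetical names, and it assumes effort is sent in the top-level `output_config` field, so check that placement against the API reference for your SDK version.

```python
VALID_EFFORTS = ("low", "medium", "high")


def build_request(prompt: str, effort: str = "high", model: str = "claude-opus-4-7") -> dict:
    """Build keyword arguments for client.messages.create with adaptive thinking."""
    if effort not in VALID_EFFORTS:
        raise ValueError(f"effort must be one of {VALID_EFFORTS}, got {effort!r}")
    return {
        "model": model,
        "max_tokens": 16000,
        "thinking": {"type": "adaptive"},
        # Assumption: effort travels in output_config, not inside `thinking`.
        "output_config": {"effort": effort},
        "messages": [{"role": "user", "content": prompt}],
    }


params = build_request("Summarize this changelog in one sentence.", effort="low")
print(params["output_config"])  # {'effort': 'low'}
# With a configured client: response = client.messages.create(**params)
```

Validating the effort value locally catches typos before they cost a round trip to the API.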
Adaptive Thinking in Agentic Workflows
One of the most useful features of adaptive thinking is interleaved thinking: Claude can think between tool calls. This matters in agentic loops, where Claude needs to reason about tool results before deciding the next action.
Example: Agent with Tool Calls
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=32000,
    thinking={"type": "adaptive"},
    tools=[
        {
            "name": "search_database",
            "description": "Search the internal database for user records",
            "input_schema": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"}
                },
                "required": ["query"]
            }
        }
    ],
    messages=[
        {
            "role": "user",
            "content": "Find all users who signed up last week and send them a welcome email."
        }
    ]
)

# This first response usually contains thinking and tool_use blocks.
# Claude reasons about the tool output only after you return it as a
# tool_result in a follow-up request.
for block in response.content:
    if block.type == "thinking":
        print(f"Thinking: {block.thinking}")
    elif block.type == "tool_use":
        print(f"Tool call: {block.name}({block.input})")
    elif block.type == "text":
        print(f"Final response: {block.text}")
Without adaptive thinking, Claude would do all its reasoning upfront before any tool calls. With interleaved thinking, it can reason after each tool result—leading to more accurate multi-step actions.
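Interleaved thinking only happens across requests: your code runs the tool, returns the result, and Claude reasons again on the next call. The loop below is a minimal sketch of that round trip, not a production agent. `run_agent_loop`, `FakeMessages`, and the scripted responses are hypothetical stand-ins so the example runs offline; with a real client you would pass `anthropic.Anthropic()` instead.

```python
from types import SimpleNamespace


def run_agent_loop(client, tools, tool_handlers, user_prompt, max_turns=5):
    """Call the model repeatedly, feeding tool results back until it stops asking for tools."""
    messages = [{"role": "user", "content": user_prompt}]
    for _ in range(max_turns):
        response = client.messages.create(
            model="claude-opus-4-7",
            max_tokens=32000,
            thinking={"type": "adaptive"},
            tools=tools,
            messages=messages,
        )
        if response.stop_reason != "tool_use":
            return response
        # Pass the assistant turn back unchanged, thinking blocks included,
        # so Claude can continue its reasoning after seeing the results.
        messages.append({"role": "assistant", "content": response.content})
        results = []
        for block in response.content:
            if block.type == "tool_use":
                output = tool_handlers[block.name](**block.input)
                results.append(
                    {"type": "tool_result", "tool_use_id": block.id, "content": str(output)}
                )
        messages.append({"role": "user", "content": results})
    raise RuntimeError("agent did not finish within max_turns")


class FakeMessages:
    """Scripted stand-in for client.messages so the loop runs without network access."""

    def __init__(self, scripted):
        self._scripted = list(scripted)
        self.calls = []

    def create(self, **kwargs):
        self.calls.append(list(kwargs["messages"]))  # snapshot of the history sent
        return self._scripted.pop(0)


first = SimpleNamespace(
    stop_reason="tool_use",
    content=[
        SimpleNamespace(type="thinking", thinking="I should query last week's signups."),
        SimpleNamespace(type="tool_use", id="toolu_01", name="search_database",
                        input={"query": "signed_up:last_week"}),
    ],
)
second = SimpleNamespace(
    stop_reason="end_turn",
    content=[SimpleNamespace(type="text", text="Found 2 users.")],
)

fake_client = SimpleNamespace(messages=FakeMessages([first, second]))
final = run_agent_loop(
    fake_client,
    tools=[],
    tool_handlers={"search_database": lambda query: ["ana", "bo"]},
    user_prompt="Find last week's signups.",
)
print(final.content[0].text)  # Found 2 users.
```

The `max_turns` cap is a design choice for the sketch: it keeps a misbehaving tool or prompt from looping indefinitely.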
Migrating from Manual Budget Tokens
If you're currently using `thinking: {type: "enabled", budget_tokens: 8000}`, here's how to migrate:
Before (Deprecated)
thinking={
    "type": "enabled",
    "budget_tokens": 8000
}
After (Recommended)
thinking={"type": "adaptive"},
# Effort is sent in the top-level output_config field, not inside `thinking`
output_config={"effort": "high"}  # optional, defaults to high
What changes:
- You no longer need to set `budget_tokens`.
- Claude decides the thinking depth per request.
- You may see varying thinking token usage; that's expected and intentional.
- For predictable latency or strict cost control, you can still use `budget_tokens` on Opus 4.6 and Sonnet 4.6, but plan to migrate soon.
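If request parameters are built in one place, the migration can be a single transformation. A minimal sketch, assuming requests are plain dicts of keyword arguments; `migrate_request` is a hypothetical helper and the `claude-opus-4-6` model ID is an assumption used for illustration:

```python
def migrate_request(params: dict) -> dict:
    """Return a copy of request kwargs with manual thinking swapped for adaptive."""
    migrated = dict(params)
    thinking = migrated.get("thinking")
    if isinstance(thinking, dict) and thinking.get("type") == "enabled":
        # Drop budget_tokens entirely; adaptive mode sizes thinking per request.
        migrated["thinking"] = {"type": "adaptive"}
    return migrated


old = {
    "model": "claude-opus-4-6",
    "max_tokens": 16000,
    "thinking": {"type": "enabled", "budget_tokens": 8000},
}
print(migrate_request(old)["thinking"])  # {'type': 'adaptive'}
```

The helper returns a copy rather than mutating its input, so the old configuration stays available for side-by-side comparison during rollout.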
Best Practices
- Start with `effort: "high"` – It's the default for a reason. Only lower it after testing your workload.
- Use with tool use – The interleaved thinking capability shines in agentic loops. Always enable adaptive thinking when using tools.
- Monitor thinking blocks – In your response handling, always check for `block.type === "thinking"` to capture reasoning traces.
- Set `max_tokens` generously – Adaptive thinking can use more tokens than a fixed budget. Set `max_tokens` high enough to accommodate both thinking and response.
- Test bimodal workloads – If your app handles both simple Q&A and complex analysis, adaptive thinking will automatically adjust; no need to switch modes.
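The monitoring and `max_tokens` advice above can be combined into one check per response. This is a rough sketch under stated assumptions: `inspect_response` is a hypothetical helper, the stand-in response replaces a real API result, and it relies on `stop_reason` being `"max_tokens"` when output is cut off by the limit.

```python
from types import SimpleNamespace


def inspect_response(response) -> dict:
    """Summarize thinking usage and flag output cut off by max_tokens."""
    thinking_blocks = [b for b in response.content if b.type == "thinking"]
    return {
        "thinking_blocks": len(thinking_blocks),
        "thinking_chars": sum(len(b.thinking) for b in thinking_blocks),
        "truncated": response.stop_reason == "max_tokens",
    }


# Stand-in for a response whose thinking consumed the whole token limit.
cut_off = SimpleNamespace(
    stop_reason="max_tokens",
    content=[SimpleNamespace(type="thinking", thinking="Step 1: consider the edge cases")],
)
print(inspect_response(cut_off))
```

A `truncated` flag that fires regularly is a signal to raise `max_tokens` or lower the effort level for that workload.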
Limitations and Considerations
- Not supported on older models – Sonnet 4.5, Opus 4.5, and earlier require manual `budget_tokens`.
- No hard budget control – If you need strict cost predictability, adaptive thinking may not be ideal. Use `budget_tokens` on supported models as a transitional measure.
- Effort is soft guidance – Setting `effort: "low"` doesn't guarantee Claude won't think deeply; it's a nudge, not a command.
Key Takeaways
- Adaptive thinking replaces manual token budgets – Claude dynamically decides when and how much to think, simplifying your code and often improving performance.
- It's the only thinking mode on Claude Opus 4.7 – Manual `thinking: {type: "enabled"}` is rejected with a 400 error on this model.
- Interleaved thinking supercharges agentic workflows – Claude can reason between tool calls, leading to more accurate multi-step actions.
- Use the `effort` parameter for soft control – Choose from `low`, `medium`, or `high` to guide Claude's reasoning depth.
- Migrate now – Manual `budget_tokens` is deprecated on Opus 4.6 and Sonnet 4.6 and will be removed in a future release.