Adaptive Thinking in Claude: Dynamic Reasoning for Smarter AI Workflows
Learn how to use Claude's adaptive thinking mode to dynamically allocate reasoning tokens, improve agentic workflows, and optimize performance without manual budget settings.
Adaptive thinking lets Claude automatically decide when and how much to use extended reasoning based on task complexity. It replaces manual token budgets, supports interleaved thinking between tool calls, and offers effort levels (low to max) for fine-grained control. Ideal for agentic workflows and bimodal tasks.
Adaptive Thinking in Claude: Dynamic Reasoning for Smarter AI Workflows
Claude's extended thinking capability has been a game-changer for complex reasoning tasks. But manually setting a thinking token budget for every request can be tedious and suboptimal—especially when your workload mixes simple queries with deep analytical problems. Enter adaptive thinking, the recommended way to use extended thinking with Claude Opus 4.7, Opus 4.6, and Sonnet 4.6.
Adaptive thinking shifts the burden of deciding when and how much to think from you to Claude. The model evaluates each request's complexity in real time and allocates reasoning tokens dynamically. This leads to better performance on bimodal tasks (e.g., a mix of quick lookups and deep analysis) and long-horizon agentic workflows where intermediate reasoning between tool calls is critical.
In this guide, you'll learn exactly how adaptive thinking works, how to implement it in your API calls, and how to tune it with the effort parameter for your specific use case.
What Is Adaptive Thinking?
Adaptive thinking is a mode where Claude decides autonomously whether to engage extended reasoning and how many tokens to allocate. Unlike the older thinking: {type: "enabled", budget_tokens: N} approach, you don't set a fixed budget. Instead, Claude examines the input and chooses a reasoning strategy on the fly.
Key characteristics:
- Dynamic allocation: Claude spends more tokens on complex problems, less on simple ones.
- Interleaved thinking: Claude can think between tool calls, making it ideal for agentic loops.
- Effort guidance: You can provide soft hints via the
effortparameter to influence thinking depth.
Supported Models
| Model | Adaptive Thinking Support | Notes |
|---|---|---|
| Claude Mythos Preview | Default (auto-applies) | thinking: {type: "disabled"} not supported |
| Claude Opus 4.7 | Only supported mode | Manual budget_tokens rejected with 400 error |
| Claude Opus 4.6 | Supported | Manual mode deprecated, plan migration |
| Claude Sonnet 4.6 | Supported | Manual mode deprecated, plan migration |
Warning: On Opus 4.6 and Sonnet 4.6,thinking.type: "enabled"withbudget_tokensis deprecated and will be removed in a future release. Migrate to adaptive thinking now.
How Adaptive Thinking Works
When you set thinking.type: "adaptive", Claude treats thinking as optional. At the default effort level (high), Claude almost always thinks—but for trivial requests, it may skip thinking entirely. At lower effort levels, Claude becomes more selective, reserving deep reasoning for genuinely complex tasks.
Adaptive thinking also enables interleaved thinking. In agentic workflows where Claude makes multiple tool calls, it can pause to think between calls, refining its strategy based on intermediate results. This is a major advantage over fixed-budget thinking, which allocates all reasoning tokens upfront and cannot adjust mid-conversation.
How to Use Adaptive Thinking
Using adaptive thinking is straightforward. Set thinking.type to "adaptive" in your API request. Optionally, include the effort parameter to guide reasoning depth.
Basic Usage (Python)
import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=4096,
thinking={
"type": "adaptive",
"effort": "high" # optional, defaults to "high"
},
messages=[
{"role": "user", "content": "Analyze the quarterly financial report and identify three key risks."}
]
)
print(response.content)
TypeScript Example
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
model: 'claude-opus-4-7',
max_tokens: 4096,
thinking: {
type: 'adaptive',
effort: 'high'
},
messages: [
{ role: 'user', content: 'Analyze the quarterly financial report and identify three key risks.' }
]
});
console.log(response.content);
Effort Levels Explained
The effort parameter provides soft guidance. Here's what each level means:
| Effort | Behavior | Available On |
|---|---|---|
max | Always thinks, no constraints on depth | Mythos, Opus 4.7, Opus 4.6, Sonnet 4.6 |
xhigh | Always thinks deeply with extended exploration | Opus 4.7 only |
high (default) | Always thinks; deep reasoning on complex tasks | All adaptive models |
medium | Moderate thinking; may skip for simple queries | All adaptive models |
low | Minimizes thinking; skips for most queries | All adaptive models |
Example: Using Low Effort for Simple Tasks
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=1024,
thinking={
"type": "adaptive",
"effort": "low"
},
messages=[
{"role": "user", "content": "What is the capital of France?"}
]
)
For a trivial question like this, Claude will likely skip extended thinking entirely, saving tokens and reducing latency.
Adaptive Thinking in Agentic Workflows
One of the most powerful use cases for adaptive thinking is agentic workflows—multi-step tasks where Claude uses tools, evaluates results, and decides next actions. With interleaved thinking, Claude can reason between each tool call, adapting its strategy as new information arrives.
Example: Research Agent with Tool Calls
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=8192,
thinking={
"type": "adaptive",
"effort": "high"
},
tools=[
{
"name": "web_search",
"description": "Search the web for current information",
"input_schema": {
"type": "object",
"properties": {
"query": {"type": "string"}
},
"required": ["query"]
}
}
],
messages=[
{"role": "user", "content": "Research the latest AI regulations in the EU and summarize the key changes."}
]
)
In this scenario, Claude will:
- Think about which search queries to run.
- Call the search tool.
- Think again about the results.
- Decide if additional searches are needed.
- Finally compose the summary.
Migrating from Fixed Budget to Adaptive
If you're currently using thinking: {type: "enabled", budget_tokens: 16000}, migrating is simple:
thinking={
"type": "enabled",
"budget_tokens": 16000
}
After (recommended):
thinking={
"type": "adaptive",
"effort": "high" # or "medium", "low", etc.
}
No other changes are required. The max_tokens parameter still controls the total response length, including thinking tokens.
When to Use Each Effort Level
max: Mission-critical analysis, complex code generation, multi-step agentic tasks where you want maximum reasoning.xhigh(Opus 4.7 only): Research-grade deep dives, novel problem solving, or tasks requiring exhaustive exploration.high: General-purpose reasoning. Best for most production workloads.medium: High-volume tasks where cost and latency matter, but you still want reasoning for moderately complex queries.low: Simple Q&A, data extraction, or tasks where speed is paramount.
Limitations and Considerations
- Predictable latency: If your application requires consistent response times, adaptive thinking may introduce variability. For such cases, consider using
effort: "low"or"medium"to reduce variance. - Cost control: Adaptive thinking can use more tokens than a fixed budget on complex tasks. Monitor your token usage and adjust effort levels accordingly.
- Older models: Models like Sonnet 4.5 and Opus 4.5 do not support adaptive thinking. You must use the legacy
budget_tokensapproach with those models. - No beta header required: Unlike some other Claude features, adaptive thinking is generally available—no special headers needed.
Key Takeaways
- Adaptive thinking lets Claude dynamically allocate reasoning tokens based on task complexity, replacing manual budget settings for Opus 4.7, Opus 4.6, and Sonnet 4.6.
- Use the
effortparameter to guide reasoning depth:lowfor speed,highfor deep reasoning,maxfor unrestricted exploration. - Interleaved thinking enables smarter agentic workflows by allowing Claude to reason between tool calls, adapting its strategy in real time.
- Migrate from fixed budgets now—the legacy
budget_tokensapproach is deprecated on Opus 4.6 and Sonnet 4.6 and will be removed in future releases. - Monitor token usage when using
effort: "max"or"high"on complex tasks, as adaptive thinking may consume more tokens than a fixed budget.