Mastering Adaptive Thinking in Claude: Dynamic Reasoning for Smarter AI Workflows
Learn how to use Claude's adaptive thinking mode to dynamically allocate reasoning tokens based on task complexity, optimize agentic workflows, and reduce costs without sacrificing performance.
Adaptive thinking lets Claude dynamically decide when and how much to use extended reasoning per request, replacing fixed token budgets. It improves performance on complex tasks, enables interleaved thinking between tool calls, and supports effort levels to balance depth and cost.
Claude’s extended thinking capability has been a game-changer for complex reasoning tasks—but until recently, you had to manually set a fixed token budget for every request. That approach worked well for predictable workloads, but it often wasted tokens on simple queries or fell short on unexpectedly complex ones.
Enter adaptive thinking: a smarter, more flexible way to use Claude’s reasoning powers. Instead of guessing how many thinking tokens a task needs, you let Claude decide. This guide will walk you through everything you need to know to implement adaptive thinking effectively.
What Is Adaptive Thinking?
Adaptive thinking is the recommended mode for using extended thinking with Claude Opus 4.7, Opus 4.6, Sonnet 4.6, and Mythos Preview. Rather than requiring a fixed `budget_tokens` value, it allows Claude to evaluate each request individually and determine:
- Whether extended thinking is needed at all
- How much thinking to allocate
Important: On Claude Opus 4.7, adaptive thinking is the only supported thinking mode. The old `thinking: {type: "enabled", budget_tokens: N}` syntax will return a 400 error. On Opus 4.6 and Sonnet 4.6, the fixed-budget mode is deprecated and will be removed in a future release.
How Adaptive Thinking Works
When you enable adaptive thinking, Claude’s internal reasoning engine evaluates the complexity of the incoming prompt and decides on the fly how many thinking tokens to use. At the default effort level (high), Claude almost always engages thinking. At lower effort levels, it may skip thinking for straightforward requests.
A key benefit is interleaved thinking: Claude can pause between tool calls to re-evaluate, plan next steps, or refine its approach. This makes adaptive thinking especially powerful for agentic workflows where the model needs to reason iteratively.
Supported Models
| Model | Adaptive Thinking Support | Notes |
|---|---|---|
| Claude Mythos Preview | Default (auto-applies) | `thinking: {type: "disabled"}` not supported |
| Claude Opus 4.7 | Only mode | Manual `enabled` mode rejected with 400 |
| Claude Opus 4.6 | Supported | Manual mode deprecated |
| Claude Sonnet 4.6 | Supported | Manual mode deprecated |
| Older models (Sonnet 4.5, Opus 4.5) | Not supported | Must use `enabled` with `budget_tokens` |
How to Use Adaptive Thinking in the API
Enabling adaptive thinking is straightforward: set `thinking.type` to `"adaptive"` in your API request. Here’s a complete Python example:
```python
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    messages=[
        {
            "role": "user",
            "content": "Explain why the sum of two even numbers is always even."
        }
    ]
)

for block in response.content:
    if block.type == "thinking":
        print(f"\nThinking: {block.thinking}")
    elif block.type == "text":
        print(f"\nResponse: {block.text}")
```
And the equivalent in TypeScript:
```typescript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 16000,
  thinking: { type: 'adaptive' },
  messages: [
    {
      role: 'user',
      content: 'Explain why the sum of two even numbers is always even.'
    }
  ]
});

for (const block of response.content) {
  if (block.type === 'thinking') {
    console.log(`\nThinking: ${block.thinking}`);
  } else if (block.type === 'text') {
    console.log(`\nResponse: ${block.text}`);
  }
}
```
Fine-Tuning with the Effort Parameter
Adaptive thinking becomes even more powerful when combined with the `effort` parameter. This acts as soft guidance for how much thinking Claude should do:
| Effort Level | Behavior |
|---|---|
| `low` | Minimal thinking; skips for simple tasks |
| `medium` | Balanced approach |
| `high` (default) | Almost always thinks; maximum reasoning depth |
```python
response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=8000,
    thinking={
        "type": "adaptive",
        "effort": "medium"
    },
    messages=[
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ]
)
```
Tip: Use `low` effort for high-volume, simple queries where you want to minimize latency and cost. Use `high` (or omit the parameter) for complex analysis, multi-step reasoning, or agentic tasks.
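One way to act on this tip is to choose the effort level per request. The heuristic below is purely illustrative: the keyword list and the length threshold are made-up assumptions, so tune them against your own traffic before relying on them.

```python
def pick_effort(prompt: str) -> str:
    """Crude illustrative heuristic for choosing an effort level.

    The markers and the 400-character threshold are assumptions for
    demonstration, not values recommended by the API.
    """
    complex_markers = ("analyze", "plan", "debug", "prove", "step by step")
    if any(marker in prompt.lower() for marker in complex_markers):
        return "high"
    if len(prompt) > 400:
        return "medium"
    return "low"
```

The result plugs directly into the request, e.g. `thinking={"type": "adaptive", "effort": pick_effort(prompt)}`.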
When to Use Adaptive Thinking vs. Fixed Budget
Adaptive Thinking Is Best For:
- Bimodal workloads — Mix of simple and complex queries
- Agentic workflows — Multi-step tasks with tool calls
- Long-horizon planning — Tasks where reasoning depth varies
- Cost optimization — Avoid over-thinking simple requests
Fixed Budget Still Makes Sense For:
- Predictable latency — Real-time applications where response time must be consistent
- Strict cost control — You need to cap maximum thinking tokens per request
- Legacy integrations — Older models that don’t support adaptive mode
Practical Example: Agentic Workflow with Adaptive Thinking
Here’s how adaptive thinking shines in a multi-step agentic task. The model can think between each tool call:
```python
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=32000,
    thinking={"type": "adaptive"},
    tools=[
        {
            "name": "search_database",
            "description": "Search a database for customer records",
            "input_schema": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"}
                },
                "required": ["query"]
            }
        },
        {
            "name": "calculate_discount",
            "description": "Calculate discount based on customer tier",
            "input_schema": {
                "type": "object",
                "properties": {
                    "tier": {"type": "string"},
                    "amount": {"type": "number"}
                },
                "required": ["tier", "amount"]
            }
        }
    ],
    messages=[
        {
            "role": "user",
            "content": "Find the customer John Doe and calculate his loyalty discount on a $500 purchase."
        }
    ]
)

# Claude will think between tool calls automatically
for block in response.content:
    if block.type == "thinking":
        print(f"Thinking: {block.thinking}")
    elif block.type == "tool_use":
        print(f"Tool call: {block.name}")
    elif block.type == "text":
        print(f"Final response: {block.text}")
```
Migration Guide: Moving from Fixed Budget to Adaptive
If you’re currently using `thinking: {type: "enabled", budget_tokens: N}`, here’s your migration path:

- Identify all API calls using the old syntax on Opus 4.6 or Sonnet 4.6
- Replace `thinking: {type: "enabled", budget_tokens: 16000}` with `thinking: {type: "adaptive"}`
- Optionally add an `effort` parameter if you need to control depth
- Test on a subset of traffic to verify performance and cost
- Update model references: if you’re on Opus 4.7, the old syntax will fail
Key Takeaways
- Adaptive thinking lets Claude dynamically decide when and how much to reason, eliminating the need to guess a fixed token budget for each request.
- It enables interleaved thinking between tool calls, making it ideal for agentic workflows and multi-step tasks.
- The `effort` parameter (`low`, `medium`, `high`) gives you soft control over reasoning depth without hard token limits.
- On Claude Opus 4.7, adaptive thinking is mandatory; the old fixed-budget mode is no longer accepted.
- Migrate now if you’re using the deprecated `enabled` + `budget_tokens` syntax on Opus 4.6 or Sonnet 4.6, as it will be removed in a future release.