Mastering Adaptive Thinking in Claude: Smarter Reasoning Without Manual Budgets
Learn how to use Claude's adaptive thinking mode to dynamically allocate reasoning effort based on task complexity. Includes code examples, effort parameters, and migration tips.
Adaptive thinking lets Claude automatically decide when and how much to use extended reasoning, replacing manual token budgets. You set thinking.type to 'adaptive' and optionally use the effort parameter to guide depth. It's ideal for bimodal tasks and agentic workflows.
Introduction
Claude's extended thinking capability has been a game-changer for complex reasoning tasks. But until now, you had to manually set a budget_tokens value—a fixed number of tokens Claude could use for internal reasoning before generating a response. This worked, but it was rigid: simple requests got more thinking than needed, and complex ones could run out of room.
In this guide, you'll learn:
- What adaptive thinking is and how it differs from manual extended thinking
- Which models support it
- How to use it with code examples (Python and TypeScript)
- How to control thinking depth with the
effortparameter - Best practices and migration tips
What Is Adaptive Thinking?
Adaptive thinking is a new mode for Claude's extended reasoning. Instead of forcing Claude to use a fixed number of thinking tokens on every request, you let Claude decide:
- Whether to think at all (for simple questions, it may skip thinking)
- How much to think (for complex tasks, it can use more tokens)
Adaptive thinking also enables interleaved thinking: Claude can think between tool calls, not just before the first response. This makes it ideal for agents that need to plan, act, observe, and re-plan.
Note: Adaptive thinking is eligible for Zero Data Retention (ZDR). If your organization has a ZDR arrangement, data sent through this feature is not stored after the API response is returned.
Supported Models
| Model | Adaptive Thinking Support | Notes |
|---|---|---|
Claude Mythos Preview (claude-mythos-preview) | Default (auto-applies when thinking is unset) | thinking: {type: "disabled"} is not supported |
Claude Opus 4.7 (claude-opus-4-7) | Only supported mode | Manual thinking: {type: "enabled", budget_tokens: N} is rejected with a 400 error |
Claude Opus 4.6 (claude-opus-4-6) | Supported | Manual mode deprecated; migrate to adaptive |
Claude Sonnet 4.6 (claude-sonnet-4-6) | Supported | Manual mode deprecated; migrate to adaptive |
| Older models (Sonnet 4.5, Opus 4.5, etc.) | Not supported | Must use thinking.type: "enabled" with budget_tokens |
How to Use Adaptive Thinking
Using adaptive thinking is straightforward. Set thinking.type to "adaptive" in your API request. No beta header is required.
Python Example
import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=16000,
thinking={"type": "adaptive"},
messages=[
{
"role": "user",
"content": "Explain why the sum of two even numbers is always even."
}
]
)
for block in response.content:
if block.type == "thinking":
print(f"\nThinking: {block.thinking}")
elif block.type == "text":
print(f"\nResponse: {block.text}")
TypeScript Example
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
model: 'claude-opus-4-7',
max_tokens: 16000,
thinking: { type: 'adaptive' },
messages: [
{
role: 'user',
content: 'Explain why the sum of two even numbers is always even.'
}
]
});
for (const block of response.content) {
if (block.type === 'thinking') {
console.log(\nThinking: ${block.thinking});
} else if (block.type === 'text') {
console.log(\nResponse: ${block.text});
}
}
Controlling Thinking Depth with the Effort Parameter
Adaptive thinking alone lets Claude decide how much to think. But you can provide soft guidance using the effort parameter. This is especially useful when you want to:
- Save costs on simple tasks (low effort)
- Maximize reasoning on critical problems (high effort)
- Balance between speed and depth
Effort Levels
| Effort Level | Behavior |
|---|---|
low | Claude may skip thinking for simple problems; uses minimal thinking tokens |
medium | Balanced approach; thinks when beneficial but conserves tokens |
high | (Default) Claude almost always thinks; uses substantial tokens for complex tasks |
Example with Effort
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=16000,
thinking={
"type": "adaptive",
"effort": "medium" # or "low", "high"
},
messages=[
{
"role": "user",
"content": "Design a scalable architecture for a real-time chat application serving 10 million users."
}
]
)
Tip: Start witheffort: "high"for complex reasoning tasks. Use"medium"or"low"for high-throughput applications where latency matters.
Adaptive Thinking in Agentic Workflows
One of the biggest advantages of adaptive thinking is interleaved thinking—Claude can think between tool calls. This is critical for agents that need to:
- Plan a sequence of actions
- Execute a tool call
- Observe the result
- Re-think and adjust the plan
- Execute the next tool call
Example: Agent with Tool Calls
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=32000,
thinking={"type": "adaptive", "effort": "high"},
tools=[
{
"name": "search_database",
"description": "Search the internal database for customer records",
"input_schema": {
"type": "object",
"properties": {
"query": {"type": "string"}
},
"required": ["query"]
}
},
{
"name": "send_email",
"description": "Send an email to a customer",
"input_schema": {
"type": "object",
"properties": {
"to": {"type": "string"},
"subject": {"type": "string"},
"body": {"type": "string"}
},
"required": ["to", "subject", "body"]
}
}
],
messages=[
{
"role": "user",
"content": "Find the customer with the highest outstanding balance and send them a payment reminder."
}
]
)
In this scenario, Claude can:
- Think about how to approach the task
- Call
search_databaseto find the customer - Think about the results and decide on an appropriate email tone
- Call
send_emailwith the right content
Migration Guide: From Manual to Adaptive
If you're currently using thinking.type: "enabled" with budget_tokens, here's how to migrate:
Step 1: Identify affected models
Check which models you're using. If you're on Opus 4.6 or Sonnet 4.6, you can migrate now. If you're on Opus 4.7, you must migrate (manual mode is rejected).
Step 2: Replace the thinking block
Before (manual):{
"thinking": {
"type": "enabled",
"budget_tokens": 8000
}
}
After (adaptive):
{
"thinking": {
"type": "adaptive",
"effort": "high"
}
}
Step 3: Adjust max_tokens
With manual mode, max_tokens needed to be greater than budget_tokens. With adaptive mode, max_tokens still limits the total output (thinking + visible text), but you don't need to reserve a separate budget. You can often reduce max_tokens for simpler tasks.
Step 4: Test and compare
Run your existing test suite with adaptive mode. For most workloads, you'll see:
- Equal or better reasoning quality
- Lower token usage for simple requests
- More consistent performance on complex tasks
Best Practices
- Start with
effort: "high"for tasks that previously used manual thinking. You can dial it down later. - Use
effort: "low"for high-throughput, low-complexity workloads like simple classification or extraction. - Monitor token usage—adaptive thinking can reduce costs on bimodal workloads but may increase tokens on uniformly complex tasks.
- Combine with tool use for agentic workflows; interleaved thinking is a major advantage.
- Don't set
max_tokenstoo low—Claude needs room for both thinking and visible output. A good starting point is 2x your expected output length.
Limitations and Considerations
- Not supported on older models (Sonnet 4.5, Opus 4.5, etc.). You must continue using manual
budget_tokensfor those. - Predictable latency: If you need consistent response times, manual mode may still be preferable on Opus 4.6/Sonnet 4.6 (though deprecated).
- Cost control: Adaptive thinking can use more tokens than expected on very complex tasks. Use the
effortparameter to limit this.
Key Takeaways
- Adaptive thinking replaces manual token budgets on Claude Opus 4.7, Opus 4.6, Sonnet 4.6, and Mythos Preview. Set
thinking.typeto"adaptive". - Claude dynamically decides how much to think based on request complexity, saving tokens on simple tasks and allocating more for complex ones.
- Use the
effortparameter (low,medium,high) to guide thinking depth without hard constraints. - Interleaved thinking enables Claude to reason between tool calls, making it ideal for agentic workflows.
- Migrate now if you're on Opus 4.6 or Sonnet 4.6—manual mode is deprecated and will be removed in a future release.