Adaptive Thinking in Claude API: Smarter Extended Thinking Without Manual Budgets
Learn how to use adaptive thinking in Claude API to let Claude dynamically decide when and how much to think, improving performance on complex tasks without manual token budgets.
Adaptive thinking lets Claude dynamically allocate thinking tokens based on request complexity, replacing manual budget_tokens. It's the only supported mode on Claude Opus 4.7 and default on Mythos Preview, enabling interleaved thinking for agentic workflows and better performance on bimodal tasks.
Adaptive Thinking in Claude API: Smarter Extended Thinking Without Manual Budgets
Extended thinking has been one of Claude's most powerful capabilities, allowing the model to "think" before responding to complex queries. But manually setting a thinking token budget for every request was often a guessing game—too few tokens and Claude couldn't fully reason through hard problems; too many and you wasted tokens on simple tasks.
Enter adaptive thinking: a new mode that lets Claude dynamically decide when and how much to think, based on the complexity of each individual request. This guide covers everything you need to know to use adaptive thinking effectively in your Claude API applications.
What Is Adaptive Thinking?
Adaptive thinking is the recommended way to use extended thinking with Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6. It's also the default mode on Claude Mythos Preview, where it auto-applies whenever thinking is unset.
Instead of you manually setting a budget_tokens value, Claude evaluates each request and determines:
- Whether to use extended thinking at all
- How much thinking to allocate
Key Benefits
- No more guesswork: You don't need to estimate thinking budgets per request
- Better performance: Adaptive thinking often outperforms fixed budgets on mixed workloads
- Interleaved thinking: Claude can think between tool calls automatically, making agents more effective
- Cost efficiency: Claude skips thinking on simple problems when appropriate
Supported Models
| Model | Adaptive Thinking Support | Notes |
|---|---|---|
Claude Mythos Preview (claude-mythos-preview) | Default mode | Auto-applies when thinking is unset; thinking: {type: "disabled"} not supported |
Claude Opus 4.7 (claude-opus-4-7) | Only supported mode | Manual thinking: {type: "enabled"} is rejected with a 400 error |
Claude Opus 4.6 (claude-opus-4-6) | Supported | Manual budget_tokens deprecated, plan to migrate |
Claude Sonnet 4.6 (claude-sonnet-4-6) | Supported | Manual budget_tokens deprecated, plan to migrate |
Important: Older models (Sonnet 4.5, Opus 4.5, etc.) do not support adaptive thinking and still requirethinking: {type: "enabled"}withbudget_tokens.
How Adaptive Thinking Works
In adaptive mode, thinking is optional for the model. At the default effort level (high), Claude almost always thinks. At lower effort levels, Claude may skip thinking for simpler problems.
Adaptive thinking also automatically enables interleaved thinking—Claude can think between tool calls. This is a game-changer for agentic workflows where the model needs to reason about tool results before taking the next action.
How to Use Adaptive Thinking
Basic Usage
To enable adaptive thinking, set thinking.type to "adaptive" in your API request. No beta header is required.
import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=16000,
thinking={"type": "adaptive"},
messages=[
{
"role": "user",
"content": "Explain why the sum of two even numbers is always even."
}
]
)
for block in response.content:
if block.type == "thinking":
print(f"\nThinking: {block.thinking}")
elif block.type == "text":
print(f"\nResponse: {block.text}")
TypeScript Example
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
model: 'claude-opus-4-7',
max_tokens: 16000,
thinking: { type: 'adaptive' },
messages: [
{
role: 'user',
content: 'Explain why the sum of two even numbers is always even.'
}
]
});
for (const block of response.content) {
if (block.type === 'thinking') {
console.log(\nThinking: ${block.thinking});
} else if (block.type === 'text') {
console.log(\nResponse: ${block.text});
}
}
Using the Effort Parameter
You can combine adaptive thinking with the effort parameter to guide how much thinking Claude does. The effort level acts as soft guidance—Claude will generally follow your hint but may still adjust based on actual request complexity.
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=16000,
thinking={
"type": "adaptive",
"effort": "medium" # Options: low, medium, high
},
messages=[
{
"role": "user",
"content": "Write a comprehensive analysis of quantum computing's impact on cryptography."
}
]
)
Effort Levels
| Effort Level | Behavior | Best For |
|---|---|---|
low | Minimal thinking; may skip for simple tasks | High-throughput, low-complexity workloads |
medium | Balanced thinking allocation | General-purpose use |
high (default) | Almost always thinks | Complex reasoning, agentic workflows |
Adaptive Thinking in Agentic Workflows
One of the most powerful applications of adaptive thinking is in agentic workflows where Claude uses tools. With interleaved thinking enabled automatically, Claude can:
- Receive a complex user request
- Think about how to approach it
- Call a tool (e.g., search, code execution)
- Think about the tool results
- Decide on the next action
- Continue until the task is complete
- Multi-step research tasks where Claude needs to search, read, and synthesize
- Code generation and debugging where Claude writes code, runs it, and iterates
- Data analysis pipelines where Claude queries databases, processes results, and generates reports
Example: Agent with Adaptive Thinking
import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=32000,
thinking={"type": "adaptive"},
tools=[
{
"name": "web_search",
"description": "Search the web for information",
"input_schema": {
"type": "object",
"properties": {
"query": {"type": "string"}
},
"required": ["query"]
}
}
],
messages=[
{
"role": "user",
"content": "Research the latest breakthroughs in fusion energy and summarize the top 3 developments from 2024."
}
]
)
Migration Guide: Moving from Manual Budgets
If you're currently using thinking: {type: "enabled", budget_tokens: N} on Opus 4.6 or Sonnet 4.6, here's how to migrate:
Step 1: Update Your Code
Before (deprecated):thinking={
"type": "enabled",
"budget_tokens": 8000
}
After (recommended):
thinking={
"type": "adaptive"
}
Or with effort control:
thinking={
"type": "adaptive",
"effort": "high"
}
Step 2: Test and Compare
Run your existing workloads with adaptive thinking and compare:
- Response quality: Adaptive thinking often matches or exceeds fixed budgets
- Token usage: You may see cost savings on simpler requests
- Latency: Adaptive thinking may be faster on simple tasks
Step 3: Remove Beta Headers
If you were using any beta headers for extended thinking, they are no longer required with adaptive thinking.
Best Practices
- Start with default effort: Begin with
effort: "high"(the default) and adjust down only if you see unnecessary thinking on simple tasks.
- Use for agentic workflows: Adaptive thinking shines when Claude needs to reason between tool calls. Enable it for any multi-step agent.
- Monitor token usage: While adaptive thinking is efficient, you should still set appropriate
max_tokenslimits to cap total response size.
- Test bimodal workloads: If your application handles both simple Q&A and complex reasoning, adaptive thinking will likely outperform fixed budgets.
- Check model compatibility: Remember that older models (Opus 4.5 and earlier) don't support adaptive thinking—you'll need to use manual budgets for those.
Limitations and Considerations
- Not supported on older models: You cannot use adaptive thinking on Sonnet 4.5, Opus 4.5, or earlier models.
- No precise cost control: If you need predictable latency or exact token budgets, manual
budget_tokensis still available on Opus 4.6 and Sonnet 4.6 (though deprecated). - Effort is soft guidance: The
effortparameter is a hint, not a hard constraint. Claude may still think more or less than you expect.
Key Takeaways
- Adaptive thinking replaces manual budget_tokens on Claude Opus 4.7, Opus 4.6, and Sonnet 4.6, and is the default on Mythos Preview.
- Claude dynamically decides when and how much to think based on request complexity, improving both performance and cost efficiency.
- Interleaved thinking is automatically enabled, making adaptive thinking ideal for agentic workflows with tool calls.
- Use the effort parameter (
low,medium,high) to provide soft guidance on thinking allocation. - Migrate now: Manual
budget_tokensis deprecated on Opus 4.6 and Sonnet 4.6 and will be removed in a future model release.