Mastering Adaptive Thinking in Claude: Smarter Reasoning Without Manual Budgets
Learn how to use Claude's adaptive thinking mode to dynamically allocate reasoning tokens, optimize performance, and simplify your API code—no manual budgets required.
Adaptive thinking lets Claude dynamically decide when and how much to use extended thinking based on request complexity. You replace manual budget_tokens with a simple 'adaptive' flag and optionally set an effort level. It's the recommended mode for Claude Opus 4.7, Opus 4.6, Sonnet 4.6, and Mythos Preview.
Introduction
If you've worked with Claude's extended thinking feature, you know the power of letting the model "think before it speaks." But manually setting a budget_tokens value for every request can be tedious and inefficient—especially when your workload mixes simple questions with complex reasoning tasks.
Enter adaptive thinking. This new mode, now the recommended approach for Claude Opus 4.7, Opus 4.6, Sonnet 4.6, and the Claude Mythos Preview, lets Claude decide when and how much to think. The result? Better performance, simpler code, and smarter resource allocation.
In this guide, you'll learn exactly what adaptive thinking is, which models support it, how to use it in your API calls, and how to fine-tune it with the effort parameter. By the end, you'll be able to replace clunky budget_tokens configurations with a clean, adaptive approach.
What Is Adaptive Thinking?
Adaptive thinking is a new mode for Claude's extended thinking capability. Instead of you specifying a fixed number of thinking tokens (budget_tokens), Claude evaluates each incoming request and dynamically determines:
- Whether to use extended thinking at all
- How many tokens to allocate for thinking
Additionally, adaptive thinking automatically enables interleaved thinking, meaning Claude can think between tool calls in agentic workflows. This makes it ideal for multi-step tasks where the model needs to reason after each tool result.
Supported Models
Not all Claude models support adaptive thinking. Here's the current compatibility:
| Model | Adaptive Thinking Support | Notes |
|---|---|---|
Claude Mythos Preview (claude-mythos-preview) | ✅ Default mode | Thinking auto-applies when unset; thinking: {type: "disabled"} not supported |
Claude Opus 4.7 (claude-opus-4-7) | ✅ Only supported mode | Manual thinking: {type: "enabled", budget_tokens: N} is rejected with a 400 error |
Claude Opus 4.6 (claude-opus-4-6) | ✅ Supported | Manual budget_tokens is deprecated but still functional |
Claude Sonnet 4.6 (claude-sonnet-4-6) | ✅ Supported | Manual budget_tokens is deprecated but still functional |
| Older models (Sonnet 4.5, Opus 4.5, etc.) | ❌ Not supported | Must use thinking.type: "enabled" with budget_tokens |
Warning: On Opus 4.6 and Sonnet 4.6,thinking.type: "enabled"andbudget_tokensare deprecated and will be removed in a future model release. Migrate to adaptive thinking now.
How to Use Adaptive Thinking (Basic)
Using adaptive thinking is straightforward. Instead of:
{
"thinking": {
"type": "enabled",
"budget_tokens": 16000
}
}
You simply write:
{
"thinking": {
"type": "adaptive"
}
}
Here's a complete Python example:
import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=16000,
thinking={"type": "adaptive"},
messages=[
{
"role": "user",
"content": "Explain why the sum of two even numbers is always even."
}
]
)
for block in response.content:
if block.type == "thinking":
print(f"\nThinking: {block.thinking}")
elif block.type == "text":
print(f"\nResponse: {block.text}")
And the equivalent in TypeScript:
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
model: 'claude-opus-4-7',
max_tokens: 16000,
thinking: { type: 'adaptive' },
messages: [
{
role: 'user',
content: 'Explain why the sum of two even numbers is always even.'
}
]
});
for (const block of response.content) {
if (block.type === 'thinking') {
console.log(\nThinking: ${block.thinking});
} else if (block.type === 'text') {
console.log(\nResponse: ${block.text});
}
}
No beta headers are required—adaptive thinking is GA for supported models.
Fine-Tuning with the Effort Parameter
Adaptive thinking also introduces the effort parameter, which gives you soft control over how much thinking Claude should do. Think of it as a dial between "minimal thinking" and "maximum reasoning."
| Effort Level | Behavior |
|---|---|
low | Claude may skip thinking for simple requests; uses minimal tokens when it does think |
medium | Balanced approach; thinks for moderately complex tasks |
high (default) | Claude almost always thinks; allocates significant tokens for reasoning |
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=16000,
thinking={
"type": "adaptive",
"effort": "medium" # or "low", "high"
},
messages=[
{
"role": "user",
"content": "Design a scalable microservices architecture for an e-commerce platform."
}
]
)
When to Use Each Effort Level
low: Use for high-throughput, low-complexity tasks like simple classification, extraction, or Q&A where you want to minimize latency and cost.medium: Good for general-purpose use where you expect a mix of simple and complex requests.high: Best for complex reasoning, multi-step agentic workflows, mathematical proofs, code generation, and any task where quality is more important than speed.
Adaptive Thinking in Agentic Workflows
One of the biggest advantages of adaptive thinking is interleaved thinking. In traditional extended thinking, Claude thinks once before responding. With adaptive mode, Claude can think between tool calls, evaluating results and planning the next step.
This makes adaptive thinking a game-changer for agentic workflows:
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=32000,
thinking={"type": "adaptive", "effort": "high"},
tools=[
{
"name": "search_database",
"description": "Search the product database",
"input_schema": {
"type": "object",
"properties": {
"query": {"type": "string"}
},
"required": ["query"]
}
},
{
"name": "calculate_shipping",
"description": "Calculate shipping cost",
"input_schema": {
"type": "object",
"properties": {
"weight": {"type": "number"},
"destination": {"type": "string"}
},
"required": ["weight", "destination"]
}
}
],
messages=[
{
"role": "user",
"content": "Find the cheapest laptop under $1000 and calculate shipping to New York."
}
]
)
In this scenario, Claude can:
- Think about how to approach the task
- Call
search_databaseto find laptops - Think about the results and decide which is cheapest
- Call
calculate_shippingwith the selected laptop's weight - Think about the final answer and present it to the user
Migrating from Manual Budgets
If you're currently using thinking.type: "enabled" with budget_tokens, here's your migration checklist:
- Identify your models: Check if you're using Opus 4.6, Sonnet 4.6, or newer. Older models still require manual budgets.
- Replace the thinking block: Change
{"type": "enabled", "budget_tokens": N}to{"type": "adaptive"}. - Add an effort level (optional): If you had a high budget, start with
effort: "high". For low budgets, tryeffort: "low". - Test and compare: Run your existing test suite and compare quality and latency.
- Remove beta headers: Adaptive thinking doesn't require any beta headers.
Best Practices
- Start with
effort: "high"for complex tasks, then dial down if you need lower latency or cost. - Monitor token usage—adaptive thinking can sometimes use more tokens than a fixed budget on very complex requests. Set
max_tokensappropriately. - Combine with prompt caching for multi-turn conversations to reduce costs further.
- Use interleaved thinking to your advantage in agentic workflows—let Claude reason after each tool result.
- Test edge cases: Adaptive thinking may skip thinking for very simple requests at lower effort levels. If you always need thinking, stick with
effort: "high".
Conclusion
Adaptive thinking represents a major step forward in how Claude handles reasoning. By removing the need for manual token budgets and introducing the effort parameter, Anthropic has made extended thinking both more powerful and easier to use.
Whether you're building a simple Q&A bot or a complex agentic system, adaptive thinking can help you get better results with less code. Migrate your workloads today to take advantage of this smarter, more flexible approach.
Key Takeaways
- Adaptive thinking replaces manual
budget_tokenswith a dynamic system where Claude decides when and how much to think based on request complexity. - Supported on Claude Opus 4.7, Opus 4.6, Sonnet 4.6, and Mythos Preview—older models still require the manual approach.
- The
effortparameter (low,medium,high) gives you soft control over thinking allocation without hard budgets. - Interleaved thinking is automatic in adaptive mode, making it ideal for agentic workflows with multiple tool calls.
- Migrate now if you're on Opus 4.6 or Sonnet 4.6—manual
budget_tokensis deprecated and will be removed in a future release.