Mastering Adaptive Thinking in Claude: A Complete Guide to Dynamic Reasoning
This guide explains how to use Claude's adaptive thinking mode, which lets the model dynamically decide when and how much to think. You'll learn how to set it up via the API, control thinking depth with the effort parameter, and optimize for cost and latency.
Introduction
Claude's extended thinking capability allows the model to "think" before responding, producing more accurate and nuanced answers, especially on complex tasks like math, coding, and multi-step reasoning. However, manually setting a fixed thinking token budget (budget_tokens) is often inefficient: you either waste tokens on simple queries or under-think hard problems. Adaptive thinking addresses this by letting Claude decide, per request, whether and how much to think.
In this guide, you'll learn:
- How adaptive thinking works under the hood
- How to enable it in your API calls
- How to use the effort parameter to control thinking depth
- Best practices for cost, latency, and performance
- Migration tips if you're coming from budget_tokens
Why Adaptive Thinking?
Traditional extended thinking with a fixed budget_tokens has two major drawbacks:
- Overthinking simple requests – You waste tokens and increase latency when the model thinks deeply about trivial questions.
- Underthinking complex requests – A small budget may truncate reasoning on genuinely hard problems, degrading answer quality.
Adaptive thinking addresses both drawbacks by letting Claude decide:
- Whether to think at all
- How many thinking tokens to allocate
- When to interleave thinking between tool calls (for agentic workflows)
Supported Models
Adaptive thinking is available on:
| Model | API Name | Notes |
|---|---|---|
| Claude Mythos Preview | claude-mythos-preview | Adaptive is default; thinking: disabled not supported |
| Claude Opus 4.7 | claude-opus-4-7 | Only supported mode; manual budget_tokens rejected |
| Claude Opus 4.6 | claude-opus-4-6 | budget_tokens deprecated; migrate to adaptive |
| Claude Sonnet 4.6 | claude-sonnet-4-6 | budget_tokens deprecated; migrate to adaptive |
Warning: On Opus 4.6 and Sonnet 4.6, thinking.type: "enabled" with budget_tokens is deprecated and will be removed in a future release. Plan to migrate to adaptive thinking.
Older models (Sonnet 4.5, Opus 4.5, etc.) do not support adaptive thinking and still require thinking.type: "enabled" with budget_tokens.
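The support table above can be turned into a small lookup so that each model gets a thinking configuration it accepts. This is only a sketch: the helper name thinking_config_for and the default budget of 2048 are assumptions, and any model not listed in the table is treated as an older model.

```python
# Models that accept thinking.type "adaptive", per the table above.
ADAPTIVE_MODELS = {
    "claude-mythos-preview",
    "claude-opus-4-7",
    "claude-opus-4-6",
    "claude-sonnet-4-6",
}


def thinking_config_for(model: str, legacy_budget_tokens: int = 2048) -> dict:
    """Return a `thinking` value for the given model (hypothetical helper)."""
    if model in ADAPTIVE_MODELS:
        return {"type": "adaptive"}
    # Assumption: anything not in the table is an older model that still
    # needs a fixed budget.
    return {"type": "enabled", "budget_tokens": legacy_budget_tokens}


print(thinking_config_for("claude-opus-4-7"))   # {'type': 'adaptive'}
print(thinking_config_for("some-older-model"))  # placeholder model name
```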
How Adaptive Thinking Works
When you set thinking.type: "adaptive", Claude:
- Evaluates request complexity – The model analyzes the prompt, tools, and context to gauge difficulty.
- Decides whether to think – At default effort (high), Claude almost always thinks. At lower effort levels, it may skip thinking for simple problems.
- Allocates thinking tokens dynamically – The model uses as many or as few tokens as needed, up to the model's maximum context window.
- Enables interleaved thinking – Claude can think between tool calls, which is critical for agentic loops where the model needs to reason about tool outputs before acting again.
How to Use Adaptive Thinking
Basic Setup
Set thinking.type to "adaptive" in your API request. No budget_tokens is needed.
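import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={"type": "adaptive"},
    messages=[
        {"role": "user", "content": "Solve this equation: 3x + 7 = 22"}
    ],
)

# The response may start with a thinking block, so print only the text blocks.
for block in response.content:
    if block.type == "text":
        print(block.text)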
TypeScript example:
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 4096,
  thinking: { type: 'adaptive' },
  messages: [
    { role: 'user', content: 'Solve this equation: 3x + 7 = 22' }
  ]
});

// The response may start with a thinking block, so print only the text blocks.
for (const block of response.content) {
  if (block.type === 'text') console.log(block.text);
}
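Because Claude decides per request whether to think, the content list may or may not include a thinking block. A minimal Python sketch for inspecting a response follows. It assumes the response has been converted to plain dictionaries (for example with the SDK object's model_dump()); the helper name split_blocks and the sample data are illustrative.

```python
def split_blocks(content: list[dict]) -> dict:
    """Separate thinking text from answer text in a list of content blocks."""
    thinking = [b["thinking"] for b in content if b["type"] == "thinking"]
    text = [b["text"] for b in content if b["type"] == "text"]
    return {
        "did_think": bool(thinking),
        "thinking": thinking,
        "answer": "".join(text),
    }


# Hand-written sample blocks, not real API output.
sample = [
    {"type": "thinking", "thinking": "Subtract 7, then divide by 3."},
    {"type": "text", "text": "x = 5"},
]
result = split_blocks(sample)
print(result["did_think"], result["answer"])  # True x = 5
```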
Using the Effort Parameter
The effort parameter gives you soft control over how much thinking Claude does. It's optional and defaults to high.
| Effort Level | Behavior | Available On |
|---|---|---|
| max | Always thinks with no constraints on depth | Mythos Preview, Opus 4.7, Opus 4.6, Sonnet 4.6 |
| xhigh | Always thinks deeply with extended exploration | Opus 4.7 |
| high (default) | Always thinks; deep reasoning on complex tasks | All adaptive models |
| medium | Moderate thinking; may skip for very simple queries | All adaptive models |
| low | Minimizes thinking; skips for most queries | All adaptive models |
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={"type": "adaptive"},
    # effort is passed in output_config, alongside the thinking setting
    output_config={"effort": "medium"},  # or "low", "high", "xhigh", "max"
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ],
)
TypeScript example with effort:
const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 4096,
  thinking: { type: 'adaptive' },
  // effort is passed in output_config, alongside the thinking setting
  output_config: { effort: 'medium' },
  messages: [
    { role: 'user', content: 'What is the capital of France?' }
  ]
});
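The availability column in the effort table can be checked in code before a request is sent. The following is a sketch that assumes the table above is complete and that the model is one of the adaptive models; supported_efforts and check_effort are hypothetical helper names, not part of the SDK.

```python
# Levels the table lists for every adaptive model.
BASE_EFFORTS = ["low", "medium", "high", "max"]

# Extra levels by model; per the table, only Opus 4.7 accepts "xhigh".
EXTRA_EFFORTS = {"claude-opus-4-7": ["xhigh"]}


def supported_efforts(model: str) -> list[str]:
    """Return the effort levels the table lists for an adaptive model."""
    return BASE_EFFORTS + EXTRA_EFFORTS.get(model, [])


def check_effort(model: str, effort: str = "high") -> str:
    """Return effort unchanged, or raise if the table does not list it."""
    if effort not in supported_efforts(model):
        raise ValueError(f"{effort!r} is not listed for {model}")
    return effort


print(check_effort("claude-opus-4-7", "xhigh"))  # xhigh
```

Failing early on an unsupported level keeps the error close to the configuration mistake instead of surfacing it as an API error later.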
Adaptive Thinking in Agentic Workflows
One of the biggest advantages of adaptive thinking is interleaved thinking: Claude can think between tool calls. This matters most in agentic workflows where the model needs to:
- Analyze tool outputs before deciding the next action
- Plan multi-step sequences
- Recover from errors or unexpected results
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=8192,
    thinking={"type": "adaptive"},
    tools=[
        {
            "name": "web_search",
            "description": "Search the web for information",
            "input_schema": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"}
                },
                "required": ["query"]
            }
        }
    ],
    messages=[
        {"role": "user", "content": "Research the latest AI breakthroughs in 2025 and summarize them in 3 bullet points."}
    ]
)
With adaptive thinking, Claude will:
- Think about how to break down the research task
- Call the search tool
- Think again about the results
- Call again if needed
- Finally compose the summary
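The think, call, think again cycle above is driven by a loop on the client side. The sketch below shows one way such a loop might look, under these assumptions: responses are plain dictionaries, create is a caller-supplied function that sends the current messages and returns the response, and run_tool is a caller-supplied function that executes a tool. Both callables and the name run_agent_loop are hypothetical.

```python
from typing import Any, Callable


def run_agent_loop(
    create: Callable[[list[dict]], dict],
    run_tool: Callable[[str, dict], Any],
    messages: list[dict],
    max_turns: int = 5,
) -> dict:
    """Call the model until it stops requesting tools, then return the reply."""
    messages = list(messages)  # work on a copy of the caller's history
    for _ in range(max_turns):
        response = create(messages)
        if response["stop_reason"] != "tool_use":
            return response
        # Echo the assistant turn back unchanged, including any thinking
        # blocks, so the model can continue its reasoning after the tool runs.
        messages.append({"role": "assistant", "content": response["content"]})
        results = [
            {
                "type": "tool_result",
                "tool_use_id": block["id"],
                "content": run_tool(block["name"], block["input"]),
            }
            for block in response["content"]
            if block["type"] == "tool_use"
        ]
        messages.append({"role": "user", "content": results})
    raise RuntimeError("agent did not finish within max_turns")
```

In a real client, create would wrap client.messages.create with the model, tools, and thinking settings from the example above and convert the result to a dictionary.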
Migrating from Fixed Budget Tokens
If you're currently using thinking.type: "enabled" with budget_tokens, here's how to migrate:
Before (deprecated)
thinking={
    "type": "enabled",
    "budget_tokens": 2048
}
After (recommended)
thinking={"type": "adaptive"},
output_config={"effort": "high"}  # optional, defaults to high
When to keep the old approach
- Predictable latency: If you need more predictable response times, fixed budget_tokens still works on Opus 4.6 and Sonnet 4.6 (but is deprecated).
- Cost control: If you must cap thinking token usage strictly, adaptive thinking does not support hard caps. Use budget_tokens as a temporary measure while you evaluate adaptive.
Note: On Opus 4.7, thinking.type: "enabled" is rejected with a 400 error. You must use adaptive thinking.
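If request parameters are built in one place, the migration can be applied mechanically. A minimal sketch, assuming requests are held as keyword-argument dictionaries; migrate_to_adaptive is a hypothetical helper and leaves requests without fixed-budget thinking untouched.

```python
import copy


def migrate_to_adaptive(request: dict) -> dict:
    """Return a copy of request kwargs with fixed-budget thinking replaced."""
    migrated = copy.deepcopy(request)
    thinking = migrated.get("thinking")
    if thinking and thinking.get("type") == "enabled":
        # Adaptive thinking takes no budget_tokens, so the budget is dropped.
        migrated["thinking"] = {"type": "adaptive"}
    return migrated


old = {
    "model": "claude-opus-4-6",
    "max_tokens": 4096,
    "thinking": {"type": "enabled", "budget_tokens": 2048},
}
print(migrate_to_adaptive(old)["thinking"])  # {'type': 'adaptive'}
```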
Best Practices
1. Start with high effort
The default high effort works well for most use cases. It provides deep reasoning on complex tasks while still being efficient.
2. Use medium or low for cost-sensitive apps
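If you're building a high-volume chatbot that handles mostly simple queries, medium or low can significantly reduce token usage. Because effort is soft guidance, Claude can still choose to think on the occasional complex question, but verify quality on your own hard cases before lowering it.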
3. Use max or xhigh for critical reasoning
For tasks like mathematical proofs, code generation with complex logic, or multi-step planning, max (or xhigh on Opus 4.7) gives Claude the most room to explore alternative reasoning paths.
4. Combine with tool use for agents
Adaptive thinking shines in agentic workflows. Enable interleaved thinking by setting thinking.type: "adaptive" alongside your tool definitions.
5. Monitor token usage
Even though adaptive thinking is more efficient, it can still use many tokens on hard problems. Use the usage field in the API response to track consumption. Thinking tokens are counted as output tokens, so output_tokens is the number to watch:
print(response.usage.input_tokens, response.usage.output_tokens)
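To follow usage across many requests, the per-response counts can be accumulated. A minimal sketch, assuming each usage object has been converted to a dictionary with input_tokens and output_tokens keys; the UsageTotals class and the sample numbers are illustrative.

```python
from dataclasses import dataclass


@dataclass
class UsageTotals:
    """Running totals over the usage data of several responses."""

    input_tokens: int = 0
    output_tokens: int = 0
    requests: int = 0

    def add(self, usage: dict) -> None:
        self.input_tokens += usage.get("input_tokens", 0)
        self.output_tokens += usage.get("output_tokens", 0)
        self.requests += 1

    def mean_output_tokens(self) -> float:
        return self.output_tokens / self.requests if self.requests else 0.0


totals = UsageTotals()
totals.add({"input_tokens": 25, "output_tokens": 150})  # made-up sample
totals.add({"input_tokens": 40, "output_tokens": 900})  # made-up sample
print(totals.output_tokens, totals.mean_output_tokens())  # 1050 525.0
```

A rising mean on a stable workload is a hint that the effort level is higher than the traffic needs.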
6. Test with your workload
Every application is different. Run A/B tests comparing adaptive thinking (with various effort levels) against your current budget_tokens configuration to find the best balance of cost, latency, and quality.
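A comparison like this can be organized as a small harness. The sketch below assumes a caller-supplied run function that sends one prompt under one configuration and returns a dictionary with output_tokens and latency_s; that function, its return shape, and the name compare_configs are all hypothetical. Quality scoring is left out because it depends on the application.

```python
from statistics import mean
from typing import Callable


def compare_configs(
    configs: dict[str, dict],
    prompts: list[str],
    run: Callable[[dict, str], dict],
) -> dict[str, dict]:
    """Run every prompt under every config and average the recorded metrics."""
    report = {}
    for name, config in configs.items():
        results = [run(config, prompt) for prompt in prompts]
        report[name] = {
            "mean_output_tokens": mean(r["output_tokens"] for r in results),
            "mean_latency_s": mean(r["latency_s"] for r in results),
        }
    return report
```

Passing the same prompt list to every configuration keeps the comparison fair; a representative sample of production traffic is a better input than hand-picked examples.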
Limitations
- No hard token cap: Adaptive thinking does not support a maximum thinking token budget. If you need strict cost control, consider using budget_tokens on models that still support it (Opus 4.6, Sonnet 4.6).
- Not available on older models: Sonnet 4.5, Opus 4.5, and earlier require the old thinking.type: "enabled" approach.
- Effort is soft guidance: The effort parameter is not a hard constraint. Claude may still think more or less than expected.
Key Takeaways
- Adaptive thinking lets Claude dynamically decide when and how much to think, replacing the old fixed budget_tokens approach.
- Supported on Claude Mythos Preview, Opus 4.7, Opus 4.6, and Sonnet 4.6. On Opus 4.7, it's the only option.
- Use the effort parameter (low, medium, high, xhigh, max) to guide thinking depth without hard caps.
- Interleaved thinking enables Claude to reason between tool calls, making it ideal for agentic workflows.
- Migrate now if you're using budget_tokens on Opus 4.6 or Sonnet 4.6; the old approach is deprecated and will be removed.