Mastering Adaptive Thinking in Claude: A Practical Guide to Smarter, Dynamic Reasoning
Learn how to use Claude's adaptive thinking mode to dynamically allocate reasoning tokens, improve agentic workflows, and optimize API costs with the effort parameter.
This guide explains how to use Claude's adaptive thinking mode to let the model dynamically decide when and how much to think, replacing manual budget_tokens. You'll learn how to implement it in the API, control thinking depth with the effort parameter, and optimize performance for agentic tasks.
Introduction
Claude's extended thinking capability has been a game-changer for complex reasoning tasks, but manually setting a thinking token budget (budget_tokens) often leads to either wasted tokens on simple queries or insufficient reasoning on hard problems. Enter adaptive thinking — a new mode that lets Claude dynamically determine when and how much to think, based on the complexity of each request.
This guide will walk you through everything you need to know about adaptive thinking: what it is, how it works, which models support it, and how to use it effectively in your API calls. By the end, you'll be able to replace rigid thinking budgets with a smarter, more efficient approach.
What Is Adaptive Thinking?
Adaptive thinking is the recommended way to use extended thinking with Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6. Instead of you manually setting a budget_tokens value, Claude evaluates each incoming request and decides:
- Whether to use extended thinking at all
- How much thinking to apply
Note: On Claude Opus 4.7, adaptive thinking is the only supported thinking mode. The old thinking: {type: "enabled", budget_tokens: N} syntax is rejected with a 400 error.
Supported Models
| Model | Adaptive Thinking Support | Notes |
|---|---|---|
Claude Mythos Preview (claude-mythos-preview) | Default (auto-applies) | thinking: {type: "disabled"} not supported |
Claude Opus 4.7 (claude-opus-4-7) | Only supported mode | Manual enabled rejected |
Claude Opus 4.6 (claude-opus-4-6) | Supported | Manual enabled deprecated |
Claude Sonnet 4.6 (claude-sonnet-4-6) | Supported | Manual enabled deprecated |
| Older models (Sonnet 4.5, Opus 4.5, etc.) | Not supported | Must use thinking: {type: "enabled"} |
budget_tokens to adaptive thinking as soon as possible — the old syntax will be removed in a future model release.
How Adaptive Thinking Works
In adaptive mode, thinking becomes optional for the model. Here's what happens under the hood:
- Request arrives — Claude receives your prompt
- Complexity assessment — The model evaluates how much reasoning is required
- Dynamic allocation — Claude decides whether to think and, if so, how many tokens to use
- Interleaved thinking — For agentic workflows, Claude can think between tool calls, not just before the final response
How to Use Adaptive Thinking in the API
Using adaptive thinking is straightforward. Set thinking.type to "adaptive" in your API request. Here's a complete Python example:
import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=16000,
thinking={"type": "adaptive"},
messages=[
{
"role": "user",
"content": "Explain why the sum of two even numbers is always even."
}
]
)
for block in response.content:
if block.type == "thinking":
print(f"\nThinking: {block.thinking}")
elif block.type == "text":
print(f"\nResponse: {block.text}")
And here's the equivalent in TypeScript:
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
model: 'claude-opus-4-7',
max_tokens: 16000,
thinking: { type: 'adaptive' },
messages: [
{
role: 'user',
content: 'Explain why the sum of two even numbers is always even.'
}
]
});
for (const block of response.content) {
if (block.type === 'thinking') {
console.log(\nThinking: ${block.thinking});
} else if (block.type === 'text') {
console.log(\nResponse: ${block.text});
}
}
No beta headers are required — adaptive thinking is generally available for supported models.
Controlling Thinking Depth with the Effort Parameter
Adaptive thinking comes with an optional effort parameter that acts as a soft guide for how much thinking Claude should do. The effort parameter accepts one of three values:
"low"— Minimal thinking; Claude may skip thinking for simple problems"medium"— Balanced thinking"high"— Maximum thinking (default); Claude almost always thinks
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=16000,
thinking={
"type": "adaptive",
"effort": "medium" # or "low", "high"
},
messages=[
{
"role": "user",
"content": "Write a Python function to merge two sorted lists."
}
]
)
When to Use Each Effort Level
| Effort | Best For | Trade-off |
|---|---|---|
low | High-volume, simple queries (e.g., classification, extraction) | May miss nuance on borderline cases |
medium | General-purpose tasks with moderate complexity | Good balance of speed and depth |
high | Complex reasoning, math, code generation, agentic workflows | Higher token usage, slower response |
Adaptive Thinking for Agentic Workflows
One of the biggest benefits of adaptive thinking is its support for interleaved thinking in agentic workflows. When you combine adaptive thinking with tools, Claude can think between tool calls — not just before the final answer.
Consider a multi-step task like:
- Search a database for customer records
- Analyze the records for patterns
- Generate a summary report
Here's a simplified example:
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=32000,
thinking={"type": "adaptive", "effort": "high"},
tools=[
{
"name": "search_database",
"description": "Search customer records",
"input_schema": {
"type": "object",
"properties": {
"query": {"type": "string"}
},
"required": ["query"]
}
},
{
"name": "generate_report",
"description": "Generate a summary report",
"input_schema": {
"type": "object",
"properties": {
"data": {"type": "string"}
},
"required": ["data"]
}
}
],
messages=[
{
"role": "user",
"content": "Find all customers who churned last quarter and create a report."
}
]
)
Migrating from Manual Budget Tokens
If you're currently using thinking: {type: "enabled", budget_tokens: 16000}, here's how to migrate:
- Identify your model — Check if it supports adaptive thinking (Opus 4.6+, Sonnet 4.6+, or Mythos Preview)
- Replace the thinking block — Change
{"type": "enabled", "budget_tokens": N}to{"type": "adaptive"} - Optionally add effort — If you previously used a high budget, start with
"effort": "high" - Test and compare — Run your existing test suite to verify quality hasn't degraded
thinking={"type": "enabled", "budget_tokens": 16000}
After (recommended):
thinking={"type": "adaptive", "effort": "high"}
Best Practices
- Start with default effort — Use
"effort": "high"(or omit it) for most workloads, then dial down if you need faster responses - Monitor token usage — Adaptive thinking can sometimes use more tokens than a fixed budget on complex queries, but it saves on simple ones
- Combine with prompt caching — For multi-turn conversations, use prompt caching to reduce latency and costs
- Test with your data — Run A/B tests comparing adaptive vs. fixed budget on your specific use case
- Use for agentic tasks — Adaptive thinking shines in multi-step tool-using workflows; prefer it over fixed budgets for agents
Common Pitfalls
- Assuming zero thinking — Even with
"effort": "low", Claude may still think for complex queries - Ignoring model compatibility — Older models (Sonnet 4.5, Opus 4.5) don't support adaptive thinking; check the model name
- Setting max_tokens too low — Adaptive thinking can use significant tokens; ensure
max_tokensis high enough to accommodate both thinking and response
Key Takeaways
- Adaptive thinking lets Claude dynamically decide when and how much to think, replacing rigid
budget_tokensfor smarter resource allocation. - It's the only supported mode on Claude Opus 4.7 and the recommended mode on Opus 4.6 and Sonnet 4.6; manual
enabledis deprecated. - The
effortparameter (low,medium,high) gives you soft control over thinking depth without hard token limits. - Interleaved thinking makes adaptive mode ideal for agentic workflows where Claude reasons between tool calls.
- Migrate now if you're using
budget_tokenson supported models — the old syntax will be removed in future releases.