Adaptive Thinking in Claude API: Smarter Extended Thinking Without Manual Budgets
Learn how to use adaptive thinking with Claude Opus 4.7, Opus 4.6, and Sonnet 4.6. Dynamic extended thinking, effort control, and code examples included.
Adaptive thinking lets Claude dynamically decide when and how much to use extended thinking, replacing manual budget_tokens. It's the only mode on Opus 4.7 and default on Mythos Preview. Use thinking.type: 'adaptive' with an optional effort parameter.
Introduction
Extended thinking has been one of Claude's most powerful capabilities, allowing the model to reason step-by-step before generating a final answer. However, manually setting a budget_tokens value often led to either wasted tokens (overthinking simple tasks) or insufficient reasoning (underthinking complex ones).
This guide covers everything you need to know: supported models, how adaptive thinking works, the effort parameter, code examples, and migration tips.
Supported Models
Adaptive thinking is available on:
- Claude Mythos Preview (
claude-mythos-preview) — adaptive thinking is the default;thinking: {type: "disabled"}is not supported. - Claude Opus 4.7 (
claude-opus-4-7) — adaptive thinking is the only supported thinking mode. Manualthinking: {type: "enabled", budget_tokens: N}is rejected with a 400 error. - Claude Opus 4.6 (
claude-opus-4-6) — supports adaptive thinking; manualbudget_tokensis deprecated. - Claude Sonnet 4.6 (
claude-sonnet-4-6) — supports adaptive thinking; manualbudget_tokensis deprecated.
Important: Older models (Sonnet 4.5, Opus 4.5, etc.) do not support adaptive thinking and still require thinking: {type: "enabled", budget_tokens: N}.
How Adaptive Thinking Works
In adaptive mode, thinking is optional for Claude. The model evaluates each request and decides:
- Whether to use extended thinking at all
- How much thinking to apply
high), Claude almost always thinks. At lower effort levels, it may skip thinking for simple problems—saving tokens and reducing latency.
Adaptive thinking also automatically enables interleaved thinking: Claude can think between tool calls, making it especially effective for agentic workflows where the model needs to reason after receiving tool results.
Using Adaptive Thinking in the API
Basic Usage
Set thinking.type to "adaptive" in your API request:
import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=16000,
thinking={"type": "adaptive"},
messages=[
{
"role": "user",
"content": "Explain why the sum of two even numbers is always even."
}
]
)
for block in response.content:
if block.type == "thinking":
print(f"\nThinking: {block.thinking}")
elif block.type == "text":
print(f"\nResponse: {block.text}")
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
model: 'claude-opus-4-7',
max_tokens: 16000,
thinking: { type: 'adaptive' },
messages: [
{
role: 'user',
content: 'Explain why the sum of two even numbers is always even.'
}
]
});
for (const block of response.content) {
if (block.type === 'thinking') {
console.log(\nThinking: ${block.thinking});
} else if (block.type === 'text') {
console.log(\nResponse: ${block.text});
}
}
Adaptive Thinking with the Effort Parameter
You can guide how much thinking Claude does using the effort parameter. The effort level accepts values from 0.0 to 1.0 (inclusive), with 0.5 as the default:
- Low effort (e.g., 0.1): Minimal thinking; Claude may skip thinking for simple tasks. Best for high-volume, low-complexity requests.
- Medium effort (e.g., 0.5): Balanced thinking. The default.
- High effort (e.g., 0.9 or 1.0): Maximum thinking. Best for complex reasoning, math, coding, and agentic workflows.
import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=32000,
thinking={
"type": "adaptive",
"effort": 0.9 # High effort for complex reasoning
},
messages=[
{
"role": "user",
"content": "Design a distributed caching system that handles cache invalidation across 100 nodes."
}
]
)
for block in response.content:
if block.type == "thinking":
print(f"\nThinking: {block.thinking}")
elif block.type == "text":
print(f"\nResponse: {block.text}")
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
model: 'claude-opus-4-7',
max_tokens: 32000,
thinking: {
type: 'adaptive',
effort: 0.9
},
messages: [
{
role: 'user',
content: 'Design a distributed caching system that handles cache invalidation across 100 nodes.'
}
]
});
for (const block of response.content) {
if (block.type === 'thinking') {
console.log(\nThinking: ${block.thinking});
} else if (block.type === 'text') {
console.log(\nResponse: ${block.text});
}
}
Adaptive Thinking for Agentic Workflows
One of the biggest advantages of adaptive thinking is interleaved thinking—Claude can think between tool calls. This is a game-changer for agentic workflows where the model needs to:
- Reason about the user's request
- Call a tool
- Think about the tool's output
- Call another tool or produce a final answer
budget_tokens, thinking was a single block at the start. With adaptive thinking, Claude can distribute thinking across the entire conversation.
Example: Multi-step Tool Use
import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=32000,
thinking={"type": "adaptive", "effort": 0.8},
tools=[
{
"name": "search_database",
"description": "Search a database for records",
"input_schema": {
"type": "object",
"properties": {
"query": {"type": "string"}
},
"required": ["query"]
}
},
{
"name": "calculate",
"description": "Perform a calculation",
"input_schema": {
"type": "object",
"properties": {
"expression": {"type": "string"}
},
"required": ["expression"]
}
}
],
messages=[
{
"role": "user",
"content": "Find all orders from last month and calculate the total revenue."
}
]
)
Claude will think, call search_database, think about results, call calculate, think, then respond
for block in response.content:
if block.type == "thinking":
print(f"\nThinking: {block.thinking}")
elif block.type == "text":
print(f"\nResponse: {block.text}")
elif block.type == "tool_use":
print(f"\nTool call: {block.name}({block.input})")
Migration Guide: From Manual to Adaptive Thinking
If you're currently using thinking: {type: "enabled", budget_tokens: N}, here's how to migrate:
Step 1: Update Your Thinking Configuration
Before (deprecated):thinking={
"type": "enabled",
"budget_tokens": 16000
}
After (recommended):
thinking={
"type": "adaptive",
"effort": 0.7 # Adjust based on your workload
}
Step 2: Remove Beta Headers
Adaptive thinking does not require any beta header. If you were using anthropic-beta: thinking-2025-01-02, you can remove it.
Step 3: Test with Your Workloads
- For simple tasks (Q&A, summarization), try low effort (0.3–0.5)
- For complex reasoning (math, code generation), try high effort (0.8–1.0)
- For agentic workflows, use high effort (0.8–1.0) to maximize interleaved thinking
Step 4: Monitor and Adjust
Track token usage and response quality. Adaptive thinking often reduces token consumption for bimodal workloads (mixing simple and complex requests).
When to Use Adaptive Thinking vs. Manual Budget
| Use Case | Recommended Mode |
|---|---|
| Bimodal workloads (mix of simple & complex) | Adaptive thinking |
| Long-horizon agentic workflows | Adaptive thinking |
| Predictable latency requirements | Manual budget (Opus 4.6/Sonnet 4.6 only) |
| Precise cost control | Manual budget (Opus 4.6/Sonnet 4.6 only) |
| Opus 4.7 or Mythos Preview | Adaptive thinking (only option) |
Warning: Manual budget_tokens is deprecated on Opus 4.6 and Sonnet 4.6 and will be removed in a future model release. Plan to migrate to adaptive thinking.
Best Practices
- Start with default effort (0.5) and adjust based on your workload's complexity.
- Use high effort (0.8–1.0) for agentic workflows to leverage interleaved thinking.
- Use lower effort (0.2–0.4) for high-volume, simple tasks to save tokens and reduce latency.
- Set
max_tokensgenerously (e.g., 16000–32000) to allow Claude room to think and respond. - Handle thinking blocks in your response processing to display or log Claude's reasoning.
Key Takeaways
- Adaptive thinking replaces manual
budget_tokens— Claude dynamically decides when and how much to think, optimizing for both performance and cost. - It's the only mode on Opus 4.7 and the default on Mythos Preview. Manual thinking is rejected on Opus 4.7.
- Use the
effortparameter (0.0–1.0) to guide thinking depth. Default is 0.5. - Interleaved thinking enables Claude to reason between tool calls, making it ideal for agentic workflows.
- Migrate now if you're using manual
budget_tokenson Opus 4.6 or Sonnet 4.6 — it's deprecated and will be removed in a future release.