Mastering Adaptive Thinking in Claude API: Dynamic Reasoning for Smarter AI Workflows
Learn how to use Claude's adaptive thinking mode to dynamically allocate reasoning, optimize performance, and simplify API calls across Opus 4.7, Sonnet 4.6, and Mythos Preview.
This guide explains how to use adaptive thinking in Claude API to let the model dynamically decide when and how much to think, replacing manual token budgets. You'll learn setup, effort tuning, and best practices for Opus 4.7, Sonnet 4.6, and Mythos Preview.
Introduction
Claude's extended thinking capability has been a game-changer for complex reasoning tasks. But manually setting a budget_tokens value for every request can be tedious and suboptimal—especially when your workload mixes simple lookups with deep analytical queries. Enter adaptive thinking: a smarter, dynamic approach that lets Claude decide when and how much to think based on the complexity of each request.
Adaptive thinking is now the recommended (and in some cases, only) thinking mode for Claude Opus 4.7, Claude Opus 4.6, Claude Sonnet 4.6, and Claude Mythos Preview. This guide will walk you through everything you need to know to adopt adaptive thinking in your API workflows.
What Is Adaptive Thinking?
Adaptive thinking replaces the old thinking: {type: "enabled", budget_tokens: N} pattern with a single flag: thinking: {type: "adaptive"}. When you enable it, Claude evaluates each incoming request and decides:
- Whether extended thinking is needed at all
- How many thinking tokens to allocate
- When to interleave thinking between tool calls (interleaved thinking)
Key Benefits
- Better performance on bimodal tasks – Mixes of simple and complex queries within the same conversation
- Ideal for long-horizon agentic workflows – Interleaved thinking allows Claude to reason between tool calls
- Simpler API code – No need to calculate or adjust
budget_tokensper request - Future-proof – Manual
budget_tokensis deprecated on Opus 4.6 and Sonnet 4.6, and rejected on Opus 4.7
Supported Models
| Model | Adaptive Thinking Support | Notes |
|---|---|---|
Claude Mythos Preview (claude-mythos-preview) | Default mode | Thinking is auto-applied; cannot be disabled |
Claude Opus 4.7 (claude-opus-4-7) | Only supported mode | Manual thinking: {type: "enabled"} returns a 400 error |
Claude Opus 4.6 (claude-opus-4-6) | Supported | Manual budget_tokens deprecated |
Claude Sonnet 4.6 (claude-sonnet-4-6) | Supported | Manual budget_tokens deprecated |
| Older models (Sonnet 4.5, Opus 4.5, etc.) | Not supported | Must use thinking: {type: "enabled"} with budget_tokens |
Warning: If you're using Opus 4.6 or Sonnet 4.6 with budget_tokens, plan to migrate to adaptive thinking soon. The old syntax will be removed in a future model release.
How to Use Adaptive Thinking
Basic Setup
To enable adaptive thinking, set thinking.type to "adaptive" in your API request. Here's a complete Python example:
import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=16000,
thinking={"type": "adaptive"},
messages=[
{
"role": "user",
"content": "Explain why the sum of two even numbers is always even."
}
]
)
for block in response.content:
if block.type == "thinking":
print(f"\nThinking: {block.thinking}")
elif block.type == "text":
print(f"\nResponse: {block.text}")
TypeScript Example
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
model: 'claude-opus-4-7',
max_tokens: 16000,
thinking: { type: 'adaptive' },
messages: [
{
role: 'user',
content: 'Explain why the sum of two even numbers is always even.'
}
]
});
for (const block of response.content) {
if (block.type === 'thinking') {
console.log(\nThinking: ${block.thinking});
} else if (block.type === 'text') {
console.log(\nResponse: ${block.text});
}
}
That's it! No budget_tokens, no beta headers, no extra configuration. Claude handles the rest.
Controlling Thinking Depth with the Effort Parameter
Adaptive thinking also supports an optional effort parameter that acts as a soft guide for how much thinking Claude should do. This is useful when you want to nudge the model toward deeper reasoning without setting a hard token limit.
Effort Levels
| Effort Level | Behavior |
|---|---|
low | Minimal thinking; Claude may skip thinking for simple queries |
medium | Balanced approach; moderate reasoning for most tasks |
high (default) | Claude almost always thinks, even for simple problems |
Example with Effort
response = client.messages.create(
model="claude-opus-4-7",
max_tokens=16000,
thinking={
"type": "adaptive",
"effort": "medium" # Optional: defaults to "high"
},
messages=[
{
"role": "user",
"content": "Design a scalable microservice architecture for an e-commerce platform."
}
]
)
When to use each effort level:
low– High-volume, latency-sensitive applications where most queries are simple (e.g., FAQ bots, basic classification)medium– General-purpose assistants that handle a mix of simple and complex questionshigh– Research, analysis, coding, or agentic workflows where thorough reasoning is critical
Adaptive Thinking in Agentic Workflows
One of the most powerful features of adaptive thinking is interleaved thinking—the ability for Claude to think between tool calls. This is automatically enabled when you use adaptive thinking.
Consider a multi-step agent that needs to:
- Search a database
- Analyze results
- Call an external API
- Synthesize a final answer
budget_tokens, you'd have to allocate a single thinking budget upfront, which might be too small for the final synthesis or too large for the initial search. Adaptive thinking dynamically adjusts at each step.
Example: Agent with Interleaved Thinking
response = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=32000,
thinking={"type": "adaptive"},
tools=[
{
"name": "search_database",
"description": "Search the internal database",
"input_schema": {
"type": "object",
"properties": {
"query": {"type": "string"}
},
"required": ["query"]
}
},
{
"name": "call_external_api",
"description": "Call an external service",
"input_schema": {
"type": "object",
"properties": {
"endpoint": {"type": "string"},
"payload": {"type": "object"}
},
"required": ["endpoint", "payload"]
}
}
],
messages=[
{
"role": "user",
"content": "Find the latest sales data, enrich it with weather data from the external API, and summarize the trends."
}
]
)
Claude will think before each tool call, after receiving results, and before producing the final summary—all without you having to manage token budgets.
Migrating from Manual Budget Tokens
If you're currently using thinking: {type: "enabled", budget_tokens: 8000}, here's your migration checklist:
- Identify all API calls using the old syntax on Opus 4.6, Sonnet 4.6, or Opus 4.7
- Replace
thinking: {type: "enabled", budget_tokens: N}withthinking: {type: "adaptive"} - Optionally add an
effortparameter if you need to control thinking depth - Test with a representative sample of your workload to ensure quality remains consistent
- Monitor token usage—you may see cost savings on simple queries and better quality on complex ones
Before vs. After
Before (deprecated):thinking={
"type": "enabled",
"budget_tokens": 8000
}
After (recommended):
thinking={
"type": "adaptive"
}
Best Practices
- Start with the default
higheffort – It provides the most consistent quality. Only lower effort if you have specific latency or cost constraints. - Use adaptive thinking for agentic workflows – Interleaved thinking is a major advantage over manual budgets.
- Don't mix adaptive and manual thinking – On Opus 4.7, manual thinking is rejected. On other models, stick to one mode per request.
- Monitor your thinking blocks – The API response still includes
thinkingcontent blocks, so you can inspect what Claude reasoned. - Test across models – Behavior may vary slightly between Opus 4.7 and Sonnet 4.6. Validate your use case on each.
Limitations and Considerations
- Not supported on older models – Sonnet 4.5, Opus 4.5, and earlier still require
budget_tokens. - No hard latency guarantees – Because thinking is dynamic, response times may vary more than with a fixed budget.
- Effort is a soft guide – Claude may still think more or less than you'd expect at a given effort level.
- Mythos Preview auto-applies thinking – You cannot disable thinking on this model.
Conclusion
Adaptive thinking represents a major step forward in how Claude handles reasoning. By removing the need to manually set token budgets, it simplifies your code, improves performance on mixed workloads, and enables more natural agentic behavior through interleaved thinking.
Whether you're building a simple Q&A bot or a complex multi-step agent, adaptive thinking is now the default choice for Claude's most capable models. Migrate today to future-proof your applications and unlock Claude's full reasoning potential.
Key Takeaways
- Adaptive thinking replaces manual
budget_tokens– Setthinking: {type: "adaptive"}and let Claude decide how much to think per request. - Supported on Opus 4.7, Opus 4.6, Sonnet 4.6, and Mythos Preview – Older models still require the legacy syntax.
- Use the
effortparameter to softly guide thinking depth:low,medium, orhigh(default). - Interleaved thinking is automatic – Claude can reason between tool calls, making adaptive thinking ideal for agentic workflows.
- Manual
budget_tokensis deprecated – Migrate now to avoid breaking changes in future model releases.