BeClaude Guide
2026-05-06

Mastering Adaptive Thinking in Claude: Dynamic Reasoning for Smarter AI Workflows

Learn how to use Claude's adaptive thinking mode to dynamically allocate reasoning tokens based on task complexity, with practical API examples and best practices.

Quick Answer

Adaptive thinking lets Claude dynamically decide when and how much to use extended reasoning per request, replacing fixed token budgets. It improves performance on complex tasks, supports interleaved thinking between tool calls, and is the only mode on Claude Opus 4.7.

adaptive thinking · Claude API · extended thinking · agentic workflows · reasoning optimization

Introduction

Claude's extended thinking capability has been a game-changer for complex reasoning tasks. But until now, you had to manually set a budget_tokens value—a fixed amount of thinking tokens for every request. This worked well for predictable workloads, but it often wasted tokens on simple queries or fell short on unexpectedly complex ones.

Adaptive thinking removes that tradeoff. Instead of a one-size-fits-all budget, Claude now decides dynamically—per request—whether to use extended thinking and how much reasoning to apply. The result: smarter token usage, better performance on mixed workloads, and a simpler API.

This guide covers everything you need to know: which models support it, how to use it in your code, how to tune it with the effort parameter, and when to stick with the old approach.

What Is Adaptive Thinking?

Adaptive thinking is the recommended way to use extended thinking with Claude Opus 4.7, Opus 4.6, and Sonnet 4.6. It is also the default mode on Claude Mythos Preview.

Instead of setting a fixed budget_tokens value, you set thinking.type to "adaptive". Claude then evaluates each request's complexity and decides:

  • Whether to use extended thinking at all
  • How many thinking tokens to allocate

At the default effort level (high), Claude almost always thinks. At lower effort levels, it may skip thinking for simpler problems, saving you tokens and latency.

Key Benefits

  • Better performance on bimodal tasks (mixing simple and complex queries)
  • Ideal for long-horizon agentic workflows where reasoning depth varies
  • Automatic interleaved thinking—Claude can think between tool calls, improving multi-step reasoning
  • No beta header required—just set the parameter and go

Note: Adaptive thinking is eligible for Zero Data Retention (ZDR). If your organization has a ZDR arrangement, data sent through this feature is not stored after the API response is returned.

Supported Models

| Model | Adaptive Thinking Support | Notes |
| --- | --- | --- |
| Claude Mythos Preview | ✅ Default | Thinking auto-applies; thinking: {type: "disabled"} is not supported |
| Claude Opus 4.7 | ✅ Only mode | Manual thinking: {type: "enabled"} is rejected with a 400 error |
| Claude Opus 4.6 | ✅ Supported | Manual budget_tokens is deprecated but still functional |
| Claude Sonnet 4.6 | ✅ Supported | Manual budget_tokens is deprecated but still functional |
| Older models (Sonnet 4.5, Opus 4.5, etc.) | ❌ Not supported | Must use thinking: {type: "enabled"} with budget_tokens |
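The compatibility rules in the table above can be encoded as a small client-side guard so a bad combination fails fast, before the API returns a 400. This is a sketch, not an SDK feature, and the model ID strings are assumptions based on the table—check them against your provider's model list.

```python
# Sketch: validate a thinking config against the support table above.
# Model ID strings are assumptions, not verified identifiers.

ADAPTIVE_MODELS = {"claude-opus-4-7", "claude-opus-4-6", "claude-sonnet-4-6"}
ADAPTIVE_ONLY_MODELS = {"claude-opus-4-7"}  # manual "enabled" is rejected here


def validate_thinking(model: str, thinking: dict) -> None:
    """Raise ValueError locally instead of waiting for a 400 from the API."""
    kind = thinking.get("type")
    if kind == "adaptive" and model not in ADAPTIVE_MODELS:
        raise ValueError(f"{model} does not support adaptive thinking")
    if kind == "enabled" and model in ADAPTIVE_ONLY_MODELS:
        raise ValueError(f"{model} only accepts adaptive thinking")
```

A guard like this is most useful in shared request-building code, where the model name arrives from configuration rather than being hard-coded.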

How to Use Adaptive Thinking

Using adaptive thinking is straightforward. Set thinking.type to "adaptive" in your API request. Here's a Python example:

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    messages=[
        {
            "role": "user",
            "content": "Explain why the sum of two even numbers is always even."
        }
    ]
)

for block in response.content:
    if block.type == "thinking":
        print(f"\nThinking: {block.thinking}")
    elif block.type == "text":
        print(f"\nResponse: {block.text}")

And the equivalent in TypeScript:

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 16000,
  thinking: { type: 'adaptive' },
  messages: [
    {
      role: 'user',
      content: 'Explain why the sum of two even numbers is always even.'
    }
  ]
});

for (const block of response.content) {
  if (block.type === 'thinking') {
    console.log(`\nThinking: ${block.thinking}`);
  } else if (block.type === 'text') {
    console.log(`\nResponse: ${block.text}`);
  }
}

That's it. No budget_tokens, no beta headers. Claude handles the rest.

Tuning with the Effort Parameter

Adaptive thinking pairs with the optional effort parameter to give you soft control over how much thinking Claude does. The effort level acts as guidance, not a hard limit.

Available Effort Levels

| Effort Level | Behavior |
| --- | --- |
| low | Minimal thinking; Claude may skip thinking for simple tasks |
| medium | Moderate thinking; a balanced approach |
| high (default) | Maximum thinking; Claude almost always thinks |

Example with Effort

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={
        "type": "adaptive",
        "effort": "medium"
    },
    messages=[
        {
            "role": "user",
            "content": "Write a Python script to analyze a CSV file."
        }
    ]
)

When to Use Each Effort Level

  • low: High-throughput tasks like simple classification, extraction, or Q&A where deep reasoning isn't needed.
  • medium: General-purpose use—good for most chat applications, content generation, and moderate analysis.
  • high: Complex reasoning, multi-step agentic workflows, mathematical proofs, code generation, and tasks requiring deep analysis.
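If a service handles a mix of task types, the guidance above can be turned into a simple routing table. A sketch—the task categories here are illustrative labels for your own pipeline, not an API concept:

```python
# Sketch: choose an effort level per task category, following the
# guidance above. Categories are illustrative, not part of the API.

EFFORT_BY_TASK = {
    "classification": "low",
    "extraction": "low",
    "chat": "medium",
    "content_generation": "medium",
    "agentic": "high",
    "code_generation": "high",
    "math": "high",
}


def thinking_config(task_type: str) -> dict:
    # Default to high: the safest choice when the task type is unknown.
    effort = EFFORT_BY_TASK.get(task_type, "high")
    return {"type": "adaptive", "effort": effort}
```

The returned dict can be passed directly as the thinking parameter in messages.create.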

Adaptive Thinking in Agentic Workflows

One of the biggest advantages of adaptive thinking is automatic interleaved thinking. In agentic workflows where Claude makes multiple tool calls, adaptive thinking allows Claude to "think" between each call. This leads to:

  • Better planning across steps
  • More coherent multi-turn reasoning
  • Improved error recovery when a tool returns unexpected results

Here's a conceptual example of an agentic loop with adaptive thinking:

import anthropic

client = anthropic.Anthropic()

def run_agentic_task(user_query):
    messages = [{"role": "user", "content": user_query}]
    while True:
        response = client.messages.create(
            model="claude-opus-4-7",
            max_tokens=16000,
            thinking={"type": "adaptive"},
            tools=[
                {
                    "name": "search_database",
                    "description": "Search the internal database",
                    "input_schema": {
                        "type": "object",
                        "properties": {"query": {"type": "string"}},
                        "required": ["query"]
                    }
                },
                {
                    "name": "calculate",
                    "description": "Perform a calculation",
                    "input_schema": {
                        "type": "object",
                        "properties": {"expression": {"type": "string"}},
                        "required": ["expression"]
                    }
                }
            ],
            messages=messages
        )
        # Process response and handle tool calls...
        # Adaptive thinking happens automatically between tool calls
        if response.stop_reason != "tool_use":
            return response
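The elided "process response and handle tool calls" step usually amounts to dispatching each tool_use block to a local handler and collecting the outputs as tool_result content for the next turn. A sketch of that dispatch—the handler functions are stand-ins for your own search_database and calculate implementations:

```python
# Sketch: route tool_use blocks from a response to local handlers and
# build the tool_result content for the next user message.

def handle_tool_calls(content_blocks, handlers):
    results = []
    for block in content_blocks:
        if block["type"] != "tool_use":
            continue  # skip thinking and text blocks
        handler = handlers[block["name"]]
        output = handler(**block["input"])
        results.append({
            "type": "tool_result",
            "tool_use_id": block["id"],
            "content": str(output),
        })
    return results
```

In the loop above, you would append the returned list as a {"role": "user", "content": results} message and iterate while the stop reason is tool_use.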

Migration Guide: From Fixed Budget to Adaptive

If you're currently using thinking: {type: "enabled", budget_tokens: N}, here's how to migrate:

Step 1: Identify Your Models

Check which models you're using. If you're on Opus 4.7, you must migrate—the old format is rejected. If you're on Opus 4.6 or Sonnet 4.6, the old format still works but is deprecated.

Step 2: Replace the Parameter

Before (deprecated):
thinking={
    "type": "enabled",
    "budget_tokens": 8000
}
After (recommended):
thinking={
    "type": "adaptive",
    "effort": "high"
}
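If many call sites build the thinking parameter, the swap in Step 2 can be centralized in a small converter. This is a sketch with a guessed budget-to-effort mapping—there is no documented equivalence between a token budget and an effort level, so treat the thresholds as a starting point for your own tuning:

```python
# Sketch: rewrite a deprecated fixed-budget thinking config to adaptive.
# The budget -> effort thresholds are a guess, not an official mapping.

def migrate_thinking(thinking: dict) -> dict:
    if thinking.get("type") != "enabled":
        return thinking  # already adaptive (or disabled): leave untouched
    budget = thinking.get("budget_tokens", 0)
    if budget <= 2000:
        effort = "low"
    elif budget <= 8000:
        effort = "medium"
    else:
        effort = "high"
    return {"type": "adaptive", "effort": effort}
```

Running your test suite against the converted configs (Step 3) is what validates the thresholds for your workload.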

Step 3: Test and Tune

Run your existing test suite with adaptive thinking. If you notice Claude thinking too much or too little, adjust the effort parameter. For most workloads, high matches or exceeds the performance of a generous fixed budget.

Step 4: Monitor Costs

Adaptive thinking can reduce costs on simple queries (where Claude skips thinking) but may increase thinking on complex ones. Monitor your token usage and adjust effort accordingly.
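A lightweight way to watch this is to aggregate the usage object each response carries. A sketch of a tracker (field names follow the Messages API usage object; note that thinking tokens are billed as part of output tokens):

```python
# Sketch: aggregate token usage across requests to observe how
# adaptive thinking shifts costs after migration.

class UsageTracker:
    def __init__(self):
        self.requests = 0
        self.input_tokens = 0
        self.output_tokens = 0  # includes thinking tokens

    def record(self, usage: dict) -> None:
        self.requests += 1
        self.input_tokens += usage["input_tokens"]
        self.output_tokens += usage["output_tokens"]

    def avg_output_tokens(self) -> float:
        return self.output_tokens / self.requests if self.requests else 0.0
```

Comparing the average before and after the switch, per effort level, tells you whether to dial effort up or down.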

When to Use Fixed Budget Instead

Adaptive thinking is the recommended approach, but fixed budget_tokens is still available on Opus 4.6 and Sonnet 4.6 for specific use cases:

  • Predictable latency: If you need every response to take roughly the same time, a fixed budget ensures consistent thinking duration.
  • Strict cost control: If you need to cap thinking tokens per request for budgeting reasons.
  • Legacy systems: While you migrate, the old format still works (but plan to update).

Warning: Fixed budget_tokens is deprecated on Opus 4.6 and Sonnet 4.6 and will be removed in a future model release. Migrate to adaptive thinking as soon as possible.

Best Practices

  • Start with high effort for most workloads. It provides the best reasoning quality and you can dial down if needed.
  • Use medium or low for high-throughput applications where speed matters more than deep reasoning.
  • Combine with tool use for agentic workflows—adaptive thinking's interleaved mode shines here.
  • Monitor your token usage after switching. You may see significant savings on simple tasks.
  • Test on your specific use case before rolling out to production. While adaptive thinking generally improves performance, results can vary by workload.

Key Takeaways

  • Adaptive thinking lets Claude dynamically decide when and how much to think, replacing the need for manual budget_tokens.
  • It's the only supported mode on Claude Opus 4.7 and the default on Claude Mythos Preview. Manual budget_tokens is deprecated on Opus 4.6 and Sonnet 4.6.
  • The effort parameter (low, medium, high) gives you soft control over thinking depth without hard token limits.
  • Automatic interleaved thinking makes adaptive thinking ideal for agentic workflows with multiple tool calls.
  • Migrate now if you're using fixed budgets—the old format will be removed in future model releases.