BeClaude
Guide · 2026-04-24

Mastering Adaptive Thinking in Claude: Dynamic Reasoning for Smarter AI Workflows

Learn how to use Claude's adaptive thinking mode to dynamically allocate reasoning effort based on task complexity, with practical API examples and best practices.

Quick Answer

Adaptive thinking lets Claude dynamically decide when and how much to use extended reasoning, eliminating manual token budgets. This guide shows how to enable it, tune effort levels, and optimize for agentic workflows.

Tags: adaptive thinking, Claude API, extended thinking, agentic workflows, effort parameter

Introduction

Claude's extended thinking capability has been a game-changer for complex reasoning tasks. But until recently, you had to manually set a budget_tokens value—guessing how much thinking your task required. Too low, and Claude might not reason deeply enough. Too high, and you waste tokens on simple queries.

Adaptive thinking solves this problem. Instead of a fixed token budget, Claude evaluates each request's complexity and dynamically decides whether—and how much—to use extended thinking. This feature is now the recommended approach for Claude Opus 4.7, Opus 4.6, Sonnet 4.6, and the new Mythos Preview model.

In this guide, you'll learn:

  • How adaptive thinking works under the hood
  • Which models support it
  • How to implement it in your API calls
  • How to tune effort levels for different workloads
  • Best practices for agentic workflows

What Is Adaptive Thinking?

Adaptive thinking is a thinking mode in which Claude determines the appropriate amount of reasoning for each request on its own. Instead of specifying thinking: {type: "enabled", budget_tokens: 20000}, you set thinking: {type: "adaptive"} and let Claude decide.

Key Benefits

  • No more guesswork: Eliminates the trial-and-error of setting token budgets
  • Cost efficiency: Claude skips deep thinking for simple queries, saving tokens
  • Better performance: Particularly effective for bimodal tasks (mix of simple and complex) and long-horizon agentic workflows
  • Interleaved thinking: Automatically enables thinking between tool calls—critical for multi-step agents

Supported Models

Model | Support Level | Notes
Claude Mythos Preview (claude-mythos-preview) | Default mode | Thinking auto-applies when unset; disabled not supported
Claude Opus 4.7 (claude-opus-4-7) | Only mode | Manual enabled with budget_tokens rejected (400 error)
Claude Opus 4.6 (claude-opus-4-6) | Supported | Manual mode deprecated; migrate to adaptive
Claude Sonnet 4.6 (claude-sonnet-4-6) | Supported | Manual mode deprecated; migrate to adaptive

Warning: On Opus 4.6 and Sonnet 4.6, thinking.type: "enabled" with budget_tokens is deprecated and will be removed in a future release. Plan your migration now.
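
If your application talks to several models, one way to encode the table above is a small lookup that picks the thinking value per model. This is a minimal sketch, not an official helper: the function name thinking_config, the fallback budget of 10,000 tokens, and the older model ID used in the demo are all illustrative assumptions.

```python
# Model IDs are copied from the support table; everything else here is a
# hypothetical helper for illustration.
ADAPTIVE_MODELS = {
    "claude-mythos-preview",
    "claude-opus-4-7",
    "claude-opus-4-6",
    "claude-sonnet-4-6",
}


def thinking_config(model: str, fallback_budget: int = 10_000) -> dict:
    """Return a `thinking` value suited to the given model ID."""
    if model in ADAPTIVE_MODELS:
        return {"type": "adaptive"}
    # Older models: manual mode with an explicit token budget.
    # The default budget is a placeholder, not a recommendation.
    return {"type": "enabled", "budget_tokens": fallback_budget}


print(thinking_config("claude-opus-4-7"))    # {'type': 'adaptive'}
print(thinking_config("claude-sonnet-4-5"))  # {'type': 'enabled', 'budget_tokens': 10000}
```

Keeping this decision in one place means a later deprecation only requires editing the set, not every call site.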

How Adaptive Thinking Works

In adaptive mode, thinking is optional for the model. Claude evaluates each request and decides:

  • Whether to think at all – For simple questions (e.g., "What's 2+2?"), Claude may skip thinking entirely.
  • How much to think – For complex reasoning, Claude allocates more thinking tokens.
  • When to think – In multi-turn or tool-using workflows, Claude can think between each step (interleaved thinking).
At the default effort level (high), Claude almost always thinks. Lower effort levels may skip thinking for straightforward problems.
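
Because thinking is optional in adaptive mode, your code should not assume a thinking block is always present. The sketch below checks a response's content for one. It uses stand-in objects that mimic the shape of response.content so it runs without an API key; the helper names count_block_types and did_think are made up for this example.

```python
from collections import Counter
from types import SimpleNamespace


def count_block_types(content) -> Counter:
    """Count content blocks by their `type` attribute."""
    return Counter(block.type for block in content)


def did_think(content) -> bool:
    """True if the response contains at least one thinking block."""
    return count_block_types(content)["thinking"] > 0


# Stand-ins for `response.content`; a real response would come from the API.
simple_answer = [SimpleNamespace(type="text", text="4")]
reasoned_answer = [
    SimpleNamespace(type="thinking", thinking="Let a = 2m and b = 2n..."),
    SimpleNamespace(type="text", text="The sum 2m + 2n = 2(m + n) is even."),
]

print(did_think(simple_answer))    # False
print(did_think(reasoned_answer))  # True
```

Logging this flag per request is a cheap way to see how often Claude actually chooses to think on your traffic.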

How to Use Adaptive Thinking

Basic Implementation (Python)

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    messages=[
        {
            "role": "user",
            "content": "Explain why the sum of two even numbers is always even."
        }
    ]
)

for block in response.content:
    if block.type == "thinking":
        print(f"\nThinking: {block.thinking}")
    elif block.type == "text":
        print(f"\nResponse: {block.text}")

TypeScript Example

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 16000,
  thinking: { type: 'adaptive' },
  messages: [
    {
      role: 'user',
      content: 'Explain why the sum of two even numbers is always even.'
    }
  ]
});

// Template literals need backticks for ${...} interpolation to work.
for (const block of response.content) {
  if (block.type === 'thinking') {
    console.log(`\nThinking: ${block.thinking}`);
  } else if (block.type === 'text') {
    console.log(`\nResponse: ${block.text}`);
  }
}

Tuning Effort Levels

Adaptive thinking pairs with the effort parameter, which controls how much reasoning Claude applies. Effort is set as a named level rather than a number, and it is passed in output_config alongside thinking, not inside it:

Effort Level | Behavior | Use Case
low | Minimal thinking; Claude may skip thinking for most queries | High-throughput, low-complexity tasks
medium | Balanced approach; thinks for moderately complex queries | General-purpose use
high (default) | Claude almost always thinks deeply | Complex reasoning, math, coding, analysis

Some models accept additional levels above high; check the effort documentation for the model you are using.

Example with Effort Control

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    output_config={"effort": "medium"},  # One step below the default "high"
    messages=[
        {
            "role": "user",
            "content": "Design a distributed caching system for a global e-commerce platform."
        }
    ]
)

Adaptive Thinking for Agentic Workflows

One of the most powerful features of adaptive thinking is interleaved thinking—the ability to think between tool calls. This is automatically enabled in adaptive mode.

Why It Matters for Agents

In traditional agentic workflows, Claude makes a tool call, gets a result, then decides the next step. Without interleaved thinking, Claude might rush into actions without sufficient reasoning. With adaptive thinking, Claude can:

  • Receive a complex user request
  • Think about the approach
  • Make a tool call (e.g., search the web)
  • Think about the results
  • Decide to call another tool
  • Think about how to combine results
  • Deliver a final response

Example: Research Agent

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=32000,
    thinking={"type": "adaptive"},
    tools=[
        {
            "name": "web_search",
            "description": "Search the web for current information",
            "input_schema": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"}
                },
                "required": ["query"]
            }
        }
    ],
    messages=[
        {
            "role": "user",
            "content": "Research the latest AI breakthroughs in 2025 and summarize them."
        }
    ]
)

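In this workflow, Claude can think before each search query, evaluate the results, and decide whether additional searches are needed, all without you managing a thinking budget. Note that the request above covers only the first turn: your code still has to run each requested tool and send its result back in a follow-up request.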
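
The think, call, think cycle described above needs a loop on your side. Here is a minimal sketch of one, under a few assumptions: run_tool_loop is a hypothetical helper, create stands for any callable shaped like client.messages.create, and the demo uses a scripted fake instead of the real API so it runs offline. The detail that matters is that the assistant turn is sent back unchanged, so thinking blocks stay attached to the tool calls they preceded.

```python
from types import SimpleNamespace


def run_tool_loop(create, tool_handlers, request, max_turns=5):
    """Call `create` repeatedly until the model stops asking for tools.

    `tool_handlers` maps tool names to plain Python functions.
    """
    messages = list(request["messages"])
    for _ in range(max_turns):
        response = create(**{**request, "messages": messages})
        if response.stop_reason != "tool_use":
            return response
        # Echo the assistant turn back as-is, thinking blocks included.
        messages.append({"role": "assistant", "content": response.content})
        results = [
            {
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": tool_handlers[block.name](**block.input),
            }
            for block in response.content
            if block.type == "tool_use"
        ]
        messages.append({"role": "user", "content": results})
    raise RuntimeError("tool loop did not finish within max_turns")


# Offline demo: a scripted stand-in for the API, not real model output.
scripted = iter([
    SimpleNamespace(stop_reason="tool_use", content=[
        SimpleNamespace(type="thinking", thinking="I should search first."),
        SimpleNamespace(type="tool_use", id="t1", name="web_search",
                        input={"query": "AI breakthroughs"}),
    ]),
    SimpleNamespace(stop_reason="end_turn", content=[
        SimpleNamespace(type="text", text="Summary."),
    ]),
])
calls = []


def fake_create(**kwargs):
    calls.append(kwargs["messages"])
    return next(scripted)


final = run_tool_loop(
    fake_create,
    {"web_search": lambda query: f"results for {query}"},
    {"messages": [{"role": "user", "content": "Research AI breakthroughs."}]},
)
print(final.content[0].text)  # Summary.
```

With the real SDK you would pass client.messages.create as create and include model, max_tokens, thinking, and tools in request.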

Migrating from Manual to Adaptive Thinking

If you're currently using thinking.type: "enabled" with budget_tokens, here's your migration path:

Step 1: Identify All API Calls

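Search your codebase for budget_tokens, thinking.type: "enabled", or thinking: {type: "enabled"}. Searching for budget_tokens alone catches most call sites regardless of quoting style.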
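
If you would rather script the search than rely on your editor, something like the following could work. It is a rough sketch: find_manual_thinking is a made-up helper, it only looks at .py and .ts files by default, and it matches the literal string budget_tokens, so expect to review the hits by hand.

```python
import re
import tempfile
from pathlib import Path

# Literal marker of manual thinking mode in Python or TypeScript source.
MANUAL_THINKING = re.compile(r"budget_tokens")


def find_manual_thinking(root, suffixes=(".py", ".ts")):
    """Return (path, line_number, line) for each line mentioning budget_tokens."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            if MANUAL_THINKING.search(line):
                hits.append((str(path), number, line.strip()))
    return hits


# Demo on a throwaway directory with one old-style and one new-style file.
with tempfile.TemporaryDirectory() as demo_dir:
    Path(demo_dir, "old.py").write_text(
        'thinking={"type": "enabled", "budget_tokens": 20000}\n'
    )
    Path(demo_dir, "new.py").write_text('thinking={"type": "adaptive"}\n')
    for path, number, line in find_manual_thinking(demo_dir):
        print(f"{Path(path).name}:{number}: {line}")
```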

Step 2: Replace with Adaptive

Change:
thinking={"type": "enabled", "budget_tokens": 20000}
To:
thinking={"type": "adaptive"}
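
When request parameters are built as dictionaries in one place, the same replacement can be applied programmatically. The sketch below assumes that pattern; migrate_thinking is a hypothetical name, and it leaves requests without manual thinking untouched.

```python
def migrate_thinking(request: dict) -> dict:
    """Return a copy of `request` with manual thinking switched to adaptive."""
    migrated = dict(request)
    thinking = request.get("thinking")
    if isinstance(thinking, dict) and thinking.get("type") == "enabled":
        # The budget is dropped on purpose: adaptive mode takes no budget_tokens.
        migrated["thinking"] = {"type": "adaptive"}
    return migrated


old_request = {
    "model": "claude-opus-4-6",
    "max_tokens": 16000,
    "thinking": {"type": "enabled", "budget_tokens": 20000},
}
print(migrate_thinking(old_request)["thinking"])  # {'type': 'adaptive'}
```

The function returns a new dictionary rather than mutating its input, which makes it easy to send both variants side by side while you compare results.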

Step 3: Add Effort Parameter (Optional)

If you need to control thinking depth, add the effort parameter. It takes a named level and goes in output_config, next to thinking:
thinking={"type": "adaptive"}, output_config={"effort": "medium"}

Step 4: Test and Monitor

Run your workloads and compare:
  • Response quality
  • Token usage
  • Latency
You may find adaptive thinking uses fewer tokens for simple queries while maintaining quality on complex ones.
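
To make that comparison concrete, you could record output tokens and latency for each run and compare averages. The sketch below assumes you collect one record per request, for example from response.usage.output_tokens and a timer around the call. The helper names are hypothetical and the numbers are made up purely to show the record shape.

```python
from statistics import mean


def summarize(runs):
    """Average output tokens and latency over a list of run records."""
    return {
        "output_tokens": mean(r["output_tokens"] for r in runs),
        "latency_s": mean(r["latency_s"] for r in runs),
    }


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return {key: 100 * (after[key] - before[key]) / before[key] for key in before}


# Placeholder records, NOT measurements: one simple and one complex query each.
manual_runs = [
    {"output_tokens": 4000, "latency_s": 20.0},
    {"output_tokens": 6000, "latency_s": 30.0},
]
adaptive_runs = [
    {"output_tokens": 1000, "latency_s": 5.0},
    {"output_tokens": 7000, "latency_s": 35.0},
]

change = percent_change(summarize(manual_runs), summarize(adaptive_runs))
print(change)  # {'output_tokens': -20.0, 'latency_s': -20.0}
```

Averages hide the spread, so for bimodal workloads it is worth looking at simple and complex queries separately as well.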

Best Practices

  • Start with default effort (high): This gives you the full benefit of adaptive thinking. Reduce effort only if you need faster responses or lower costs.
  • Use with tool-using agents: Adaptive thinking shines in multi-step workflows. With interleaved thinking, Claude can reason between actions instead of only at the start.
  • Monitor token usage: While adaptive thinking is efficient, complex tasks may still use significant tokens. Set max_tokens appropriately.
  • Combine with prompt engineering: Clear, well-structured prompts help Claude accurately assess complexity and allocate thinking appropriately.
  • Test bimodal workloads: Adaptive thinking is especially effective when your application handles both simple and complex queries. Run A/B tests against fixed budgets.

Limitations and Considerations

  • Not available on older models: Sonnet 4.5, Opus 4.5, and earlier models still require manual budget_tokens.
  • Variable latency: Because thinking depth varies per request, response times vary too. If you need tight latency bounds, consider a lower effort level, or a fixed budget on models that still accept one.
  • Cost control: While adaptive thinking can save tokens, it may also use more than expected on unexpectedly complex inputs. Set max_tokens as a safety net.
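
Since max_tokens acts as the safety net, it helps to detect when a response actually hit it. A small sketch follows, again using stand-in objects so it runs offline; hit_token_cap and has_final_text are illustrative names, and the assumption is that thinking output counts toward the same max_tokens limit as the final answer.

```python
from types import SimpleNamespace


def hit_token_cap(response) -> bool:
    """True if generation stopped because max_tokens was reached."""
    return response.stop_reason == "max_tokens"


def has_final_text(response) -> bool:
    """True if the response contains at least one non-empty text block."""
    return any(b.type == "text" and b.text.strip() for b in response.content)


# Stand-in for a response that spent its whole allowance on thinking.
truncated = SimpleNamespace(
    stop_reason="max_tokens",
    content=[SimpleNamespace(type="thinking", thinking="Long reasoning...")],
)
print(hit_token_cap(truncated), has_final_text(truncated))  # True False
```

A response that hits the cap without any final text is a signal to raise max_tokens or lower effort for that class of request.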

Conclusion

Adaptive thinking represents a significant evolution in how Claude handles reasoning. By letting the model decide when and how much to think, you can get better performance and simpler code, often at lower cost. Whether you're building a simple Q&A bot or a complex agentic system, adaptive thinking is now the recommended approach on the models that support it.

Key Takeaways

  • Adaptive thinking replaces manual token budgets: Set thinking: {type: "adaptive"} and let Claude handle the rest.
  • Interleaved thinking is automatic: Claude reasons between tool calls, making it ideal for agentic workflows.
  • Effort parameter gives you control: Choose an effort level (low, medium, or the default high) to balance speed and reasoning depth.
  • Migrate now: Manual budget_tokens is deprecated on Opus 4.6 and Sonnet 4.6, and unsupported on Opus 4.7.
  • Test bimodal workloads: Adaptive thinking excels when your application mixes simple and complex queries.