BeClaude
Guide2026-05-03

Mastering Adaptive Thinking in Claude: Dynamic Reasoning for Smarter AI Workflows

Learn how to use adaptive thinking with Claude Opus 4.7, Sonnet 4.6, and Mythos Preview. Includes code examples, effort parameters, and migration tips from fixed budget_tokens.

Quick Answer

Adaptive thinking lets Claude dynamically decide when and how much to use extended thinking per request, replacing fixed budget_tokens. It's the only mode on Opus 4.7 and default on Mythos Preview. Use thinking.type: 'adaptive' and optionally the effort parameter to control depth.

adaptive thinkingextended thinkingClaude APIagentic workflowseffort parameter

Introduction

Claude's extended thinking capability has always been a game-changer for complex reasoning tasks. But until now, you had to manually set a budget_tokens value—a fixed number of tokens Claude could use for internal reasoning before generating a response. This worked well for predictable workloads, but for tasks with varying complexity, you often either wasted tokens on simple queries or ran out of thinking room on hard ones.

Adaptive thinking changes that. Instead of a one-size-fits-all budget, Claude now dynamically decides whether and how much to think based on the complexity of each individual request. This feature is the recommended way to use extended thinking on Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6, and it's the default mode on Claude Mythos Preview.

In this guide, you'll learn:

  • How adaptive thinking works under the hood
  • How to use it in your API calls (with code examples)
  • How to control thinking depth with the effort parameter
  • Best practices for migrating from fixed budget_tokens
  • When to stick with manual thinking (and when not to)

What Is Adaptive Thinking?

Adaptive thinking is a new mode for Claude's extended thinking feature. When enabled, Claude evaluates each incoming request and decides:

  • Whether thinking is needed at all
  • How many tokens to allocate to thinking
This evaluation happens per-request, so a simple question like "What's 2+2?" might skip thinking entirely, while a multi-step math proof or a complex agentic workflow might trigger deep reasoning.

Key Benefits

  • Cost efficiency: No wasted tokens on trivial queries
  • Better performance on bimodal tasks: Handles both simple and complex requests in the same conversation
  • Ideal for agentic workflows: Adaptive thinking automatically enables interleaved thinking, meaning Claude can think between tool calls—critical for multi-step agents
  • No more guesswork: You don't need to tune budget_tokens per use case

Supported Models

ModelAdaptive Thinking SupportNotes
Claude Mythos Preview✅ Defaultthinking: {type: "disabled"} not supported
Claude Opus 4.7✅ Only modeManual thinking: {type: "enabled"} rejected with 400 error
Claude Opus 4.6✅ SupportedManual mode deprecated, plan to migrate
Claude Sonnet 4.6✅ SupportedManual mode deprecated, plan to migrate
Older models (Sonnet 4.5, Opus 4.5, etc.)❌ Not supportedMust use thinking.type: "enabled" with budget_tokens
⚠️ Warning: On Opus 4.6 and Sonnet 4.6, thinking.type: "enabled" and budget_tokens are deprecated and will be removed in a future model release. Migrate to adaptive thinking now.

How to Use Adaptive Thinking

Using adaptive thinking is straightforward. Set thinking.type to "adaptive" in your API request. No beta header is required.

Basic Example (Python)

import anthropic

client = anthropic.Anthropic()

response = client.messages.create( model="claude-opus-4-7", max_tokens=16000, thinking={"type": "adaptive"}, messages=[ { "role": "user", "content": "Explain why the sum of two even numbers is always even." } ] )

for block in response.content: if block.type == "thinking": print(f"\nThinking: {block.thinking}") elif block.type == "text": print(f"\nResponse: {block.text}")

Basic Example (TypeScript)

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const response = await client.messages.create({ model: 'claude-opus-4-7', max_tokens: 16000, thinking: { type: 'adaptive' }, messages: [ { role: 'user', content: 'Explain why the sum of two even numbers is always even.' } ] });

for (const block of response.content) { if (block.type === 'thinking') { console.log(\nThinking: ${block.thinking}); } else if (block.type === 'text') { console.log(\nResponse: ${block.text}); } }

Controlling Thinking Depth with the Effort Parameter

Adaptive thinking can be combined with the optional effort parameter to guide how much thinking Claude should do. The effort level acts as soft guidance, not a hard limit.

Effort LevelBehavior
lowMinimal thinking; may skip thinking for simple problems
mediumBalanced thinking allocation
high (default)Claude almost always thinks, and allocates more tokens

Example with Effort

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={
        "type": "adaptive",
        "effort": "medium"  # Options: low, medium, high
    },
    messages=[
        {
            "role": "user",
            "content": "Design a scalable microservices architecture for an e-commerce platform."
        }
    ]
)

When to Use Each Effort Level

  • low: Use for high-throughput, low-complexity tasks where you want minimal latency and cost. Example: simple classification, basic Q&A.
  • medium: Good default for most workloads. Balances reasoning depth with speed.
  • high: Use for complex reasoning, multi-step agentic workflows, or when you need Claude to thoroughly explore a problem. This is the default and works well for most use cases.

Adaptive Thinking in Agentic Workflows

One of the most powerful features of adaptive thinking is interleaved thinking. In agentic workflows where Claude makes multiple tool calls, adaptive thinking allows Claude to think between each tool call, not just at the beginning.

This is critical for:

  • Multi-step research agents that need to reason after each search result
  • Code generation agents that iterate on solutions
  • Decision-making agents that evaluate intermediate results

Example: Agent with Tool Calls

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=32000,
    thinking={"type": "adaptive"},
    tools=[
        {
            "name": "web_search",
            "description": "Search the web for current information",
            "input_schema": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"}
                },
                "required": ["query"]
            }
        }
    ],
    messages=[
        {
            "role": "user",
            "content": "Research the latest AI regulations in the EU and summarize them."
        }
    ]
)

With adaptive thinking, Claude will think before the first tool call, then think again after receiving results, before making the next call or generating the final summary.

Migrating from Fixed budget_tokens

If you're currently using thinking.type: "enabled" with budget_tokens, here's your migration path:

Step 1: Identify affected models

Check which models you're using. Only Opus 4.6 and Sonnet 4.6 are affected. Older models still require the old syntax.

Step 2: Replace the thinking block

Before (deprecated):
thinking={
    "type": "enabled",
    "budget_tokens": 8000
}
After (recommended):
thinking={
    "type": "adaptive",
    "effort": "high"  # optional, defaults to high
}

Step 3: Test and adjust

  • Start with effort: "high" to match your previous deep-thinking behavior
  • If you notice higher token usage, try effort: "medium"
  • For simple tasks, effort: "low" can significantly reduce costs

Step 4: Remove any budget_tokens logic

Your application code should no longer calculate or pass budget_tokens. Let Claude handle allocation.

Best Practices

  • Start with effort: "high" for complex reasoning tasks, then dial down if you see overthinking.
  • Use adaptive thinking for agentic workflows—interleaved thinking is a major advantage over fixed budgets.
  • Monitor token usage in your logs. Adaptive thinking may use more tokens on complex queries than your old fixed budget, but it will save on simple ones.
  • Don't mix modes on Opus 4.7—adaptive thinking is the only option. Trying to use type: "enabled" will return a 400 error.
  • For Mythos Preview, adaptive thinking is the default. You cannot disable thinking entirely.

When to Use Manual Thinking (and When Not To)

Adaptive thinking is recommended for almost all use cases. However, there are two scenarios where manual budget_tokens might still be preferable (temporarily):

  • Predictable latency: If you need guaranteed response times and can't tolerate variability, fixed budgets give you deterministic thinking time.
  • Strict cost control: If you need to cap thinking tokens per request for budgeting reasons.
But remember: manual thinking is deprecated on Opus 4.6 and Sonnet 4.6 and will be removed. Plan your migration now.

Conclusion

Adaptive thinking represents a significant step forward in how Claude handles reasoning. By letting the model decide when and how much to think, you get:

  • Better performance on mixed-complexity workloads
  • Cost savings on simple queries
  • Improved agentic behavior with interleaved thinking
  • No more manual token budget tuning
Whether you're building a simple Q&A bot or a complex multi-agent system, adaptive thinking should be your default choice for Claude Opus 4.7, Opus 4.6, Sonnet 4.6, and Mythos Preview.

Key Takeaways

  • Adaptive thinking replaces fixed budget_tokens on Opus 4.7, Opus 4.6, and Sonnet 4.6. On Opus 4.7, it's the only supported mode.
  • Use thinking.type: "adaptive" with an optional effort parameter (low, medium, high) to guide thinking depth.
  • Interleaved thinking is automatic in adaptive mode, making it ideal for agentic workflows with multiple tool calls.
  • Migrate now if you're using budget_tokens on Opus 4.6 or Sonnet 4.6—the old syntax is deprecated and will be removed.
  • Start with effort: "high" for complex tasks, then adjust based on your observed token usage and latency requirements.