BeClaude
Guide2026-05-05

Mastering Adaptive Thinking in Claude: Dynamic Reasoning for Smarter AI Workflows

Learn how to use Claude's adaptive thinking mode to dynamically allocate reasoning effort. Includes API setup, effort parameter tuning, and best practices for agentic workflows.

Quick Answer

Adaptive thinking lets Claude dynamically decide when and how much to use extended reasoning per request. Instead of a fixed token budget, you set an effort level (low/medium/high) and Claude adjusts thinking depth automatically—ideal for bimodal tasks and long-horizon agents.

adaptive thinkingClaude APIextended thinkingagentic workflowsClaude Opus

Introduction

Claude's extended thinking capability has been a game-changer for complex reasoning tasks. But until recently, you had to manually set a budget_tokens value—a fixed number of tokens Claude could use for internal reasoning. This one-size-fits-all approach often led to either wasted tokens on simple queries or insufficient thinking on truly complex problems.

Adaptive thinking solves this. Now, Claude dynamically decides whether to think and how much to think based on the complexity of each individual request. This guide will walk you through everything you need to know to implement adaptive thinking in your Claude API applications.

What Is Adaptive Thinking?

Adaptive thinking is the recommended way to use extended thinking with Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6. It's also the default mode on Claude Mythos Preview.

Instead of locking in a fixed thinking budget, you set an effort level (low, medium, or high), and Claude decides the rest. For simple questions, Claude may skip thinking entirely. For complex, multi-step problems, Claude can allocate substantial reasoning tokens.

Key benefit: Adaptive thinking automatically enables interleaved thinking—Claude can think between tool calls, making it exceptionally effective for agentic workflows where the model needs to reason, act, observe, and reason again.

Supported Models

ModelAdaptive Thinking SupportNotes
Claude Mythos Preview✅ DefaultAuto-applies when thinking is unset; thinking: {type: "disabled"} not supported
Claude Opus 4.7✅ Only modeManual budget_tokens rejected with 400 error
Claude Opus 4.6✅ SupportedManual mode deprecated; migrate to adaptive
Claude Sonnet 4.6✅ SupportedManual mode deprecated; migrate to adaptive
Older models (Sonnet 4.5, Opus 4.5)❌ Not supportedMust use thinking.type: "enabled" with budget_tokens

How Adaptive Thinking Works

When you set thinking.type to "adaptive", Claude evaluates each incoming request independently:

  • Complexity assessment: Claude analyzes the prompt to gauge reasoning depth needed.
  • Thinking decision: At default high effort, Claude almost always thinks. At lower effort levels, it may skip thinking for straightforward queries.
  • Dynamic allocation: Thinking tokens are allocated per-request, not from a fixed pool.
  • Interleaved thinking: In multi-turn or tool-using scenarios, Claude can think between each action, enabling more coherent agentic behavior.

Getting Started: Basic API Usage

Here's how to enable adaptive thinking in your API calls. The key change is setting thinking.type to "adaptive" instead of "enabled".

Python Example

import anthropic

client = anthropic.Anthropic()

response = client.messages.create( model="claude-opus-4-7", max_tokens=16000, thinking={"type": "adaptive"}, messages=[ { "role": "user", "content": "Explain why the sum of two even numbers is always even." } ] )

for block in response.content: if block.type == "thinking": print(f"\nThinking: {block.thinking}") elif block.type == "text": print(f"\nResponse: {block.text}")

TypeScript Example

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const response = await client.messages.create({ model: 'claude-opus-4-7', max_tokens: 16000, thinking: { type: 'adaptive' }, messages: [ { role: 'user', content: 'Explain why the sum of two even numbers is always even.' } ] });

for (const block of response.content) { if (block.type === 'thinking') { console.log(\nThinking: ${block.thinking}); } else if (block.type === 'text') { console.log(\nResponse: ${block.text}); } }

Controlling Thinking Depth with the Effort Parameter

Adaptive thinking becomes truly powerful when you combine it with the effort parameter. This acts as soft guidance for how much thinking Claude should allocate.

Effort LevelBehaviorBest For
lowMinimal thinking; may skip for simple tasksHigh-throughput, low-latency applications
mediumBalanced approachGeneral-purpose use
high (default)Almost always thinks; deep reasoningComplex analysis, code generation, agentic tasks

Example with Effort Control

response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=16000,
    thinking={
        "type": "adaptive",
        "effort": "medium"  # Options: low, medium, high
    },
    messages=[
        {
            "role": "user",
            "content": "Write a Python function to merge two sorted lists."
        }
    ]
)
Note: On Claude Opus 4.7, the effort parameter is not yet supported. On Opus 4.6 and Sonnet 4.6, it is available and recommended.

Adaptive Thinking for Agentic Workflows

This is where adaptive thinking truly shines. In agentic workflows, Claude typically:

  • Receives a user request
  • Thinks about how to approach it
  • Calls a tool (e.g., searches a database, runs code)
  • Observes the result
  • Thinks again about next steps
  • Calls another tool or produces a final answer
With manual budget_tokens, Claude had to allocate all thinking tokens upfront—often running out before the final reasoning step. Adaptive thinking with interleaved thinking solves this by allowing Claude to think between each tool call.

Agentic Workflow Example

response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=32000,
    thinking={
        "type": "adaptive",
        "effort": "high"
    },
    tools=[
        {
            "name": "search_knowledge_base",
            "description": "Search the internal knowledge base",
            "input_schema": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"}
                },
                "required": ["query"]
            }
        },
        {
            "name": "run_sql_query",
            "description": "Execute a SQL query on the database",
            "input_schema": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"}
                },
                "required": ["query"]
            }
        }
    ],
    messages=[
        {
            "role": "user",
            "content": "Find all customers who purchased more than $1000 in the last month and send them a discount code."
        }
    ]
)

In this scenario, Claude will:

  • Think about how to break down the task
  • Call search_knowledge_base to find the discount policy
  • Think about the SQL query needed
  • Call run_sql_query to find qualifying customers
  • Think about how to format the results
  • Produce the final answer

Migration Guide: Moving from Manual to Adaptive

If you're currently using thinking.type: "enabled" with budget_tokens, here's your migration path:

Step 1: Identify Your Models

Check which models you're using. Only Opus 4.6, Sonnet 4.6, and newer support adaptive thinking.

Step 2: Update Your API Calls

Before (deprecated):
thinking={
    "type": "enabled",
    "budget_tokens": 8000
}
After (recommended):
thinking={
    "type": "adaptive",
    "effort": "high"  # Optional; defaults to high
}

Step 3: Test and Monitor

  • Start with effort: "high" to match your current behavior
  • Monitor latency and token usage
  • Experiment with medium or low for simpler tasks

Step 4: Handle Older Models

For models that don't support adaptive thinking (Sonnet 4.5, Opus 4.5, etc.), continue using thinking.type: "enabled" with budget_tokens.

Best Practices

1. Match Effort to Task Complexity

  • Simple Q&A: Use effort: "low" for faster responses
  • Code generation: Use effort: "medium" as a starting point
  • Complex analysis/agents: Use effort: "high" for maximum reasoning depth

2. Set Appropriate max_tokens

Adaptive thinking can use more tokens than you expect for complex tasks. Set max_tokens generously (e.g., 16,000–32,000) to avoid truncated responses.

3. Handle Thinking Blocks in Your Code

Always iterate over response.content and handle both thinking and text blocks:
for block in response.content:
    if block.type == "thinking":
        # Optionally log or display thinking
        log_debug(f"Claude is thinking: {block.thinking[:100]}...")
    elif block.type == "text":
        return block.text

4. Use for Bimodal Workloads

Adaptive thinking excels when your application handles both simple and complex requests. The model automatically adjusts, saving tokens on easy queries while going deep on hard ones.

Limitations and Considerations

  • No ZDR on older models: Adaptive thinking is eligible for Zero Data Retention on supported models.
  • Effort is guidance, not a hard limit: Claude may use more or less thinking than the effort level suggests.
  • Opus 4.7 restrictions: On Opus 4.7, adaptive thinking is the only mode; you cannot disable thinking or use manual budgets.
  • Latency variability: Because thinking depth varies per request, response times may be less predictable than with fixed budgets.

Key Takeaways

  • Adaptive thinking lets Claude dynamically decide when and how much to think, replacing the need for manual budget_tokens configuration.
  • Use the effort parameter (low/medium/high) to guide thinking depth without hard-coding token limits.
  • Interleaved thinking makes adaptive mode ideal for agentic workflows where Claude reasons between tool calls.
  • Migrate from thinking.type: "enabled" to thinking.type: "adaptive" on Opus 4.6, Sonnet 4.6, and newer models—manual budgets are deprecated.
  • On Claude Opus 4.7, adaptive thinking is the only supported mode; manual budgets are rejected with a 400 error.