Guide · 2026-05-05

Mastering Adaptive Thinking in Claude: Dynamic Reasoning for Smarter AI Workflows

Learn how to use Claude's adaptive thinking mode to dynamically allocate reasoning effort. Includes API setup, effort parameter tuning, and best practices for agentic workflows.

Quick Answer

Adaptive thinking lets Claude dynamically decide when and how much to use extended reasoning per request. Instead of a fixed token budget, you set an effort level (low/medium/high) and Claude adjusts thinking depth automatically—ideal for bimodal tasks and long-horizon agents.

adaptive thinking, Claude API, extended thinking, agentic workflows, Claude Opus

Introduction

Claude's extended thinking capability improves results on complex reasoning tasks. Until recently, though, you had to set a budget_tokens value manually: a fixed number of tokens Claude could use for internal reasoning. This one-size-fits-all approach often wasted tokens on simple queries or left too little room for thinking on hard problems.

Adaptive thinking solves this. Now, Claude dynamically decides whether to think and how much to think based on the complexity of each individual request. This guide will walk you through everything you need to know to implement adaptive thinking in your Claude API applications.

What Is Adaptive Thinking?

Adaptive thinking is the recommended way to use extended thinking with Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6. It's also the default mode on Claude Mythos Preview.

Instead of locking in a fixed thinking budget, you set an effort level (low, medium, or high), and Claude decides the rest. For simple questions, Claude may skip thinking entirely. For complex, multi-step problems, Claude can allocate substantial reasoning tokens.

Key benefit: Adaptive thinking automatically enables interleaved thinking—Claude can think between tool calls, making it exceptionally effective for agentic workflows where the model needs to reason, act, observe, and reason again.

Supported Models

Model	Adaptive Thinking Support	Notes
Claude Mythos Preview	✅ Default	Auto-applies when thinking is unset; `thinking: {type: "disabled"}` not supported
Claude Opus 4.7	✅ Only mode	Manual `budget_tokens` rejected with 400 error
Claude Opus 4.6	✅ Supported	Manual mode deprecated; migrate to adaptive
Claude Sonnet 4.6	✅ Supported	Manual mode deprecated; migrate to adaptive
Older models (Sonnet 4.5, Opus 4.5)	❌ Not supported	Must use `thinking.type: "enabled"` with `budget_tokens`
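
If one codebase talks to several models, the table above can be turned into a small lookup. The sketch below is only an illustration: the helper name thinking_config_for is hypothetical, and the model-ID prefixes are assumptions that follow the naming pattern used in this guide's examples. Claude Mythos Preview is left out because it applies adaptive thinking when the parameter is unset.

```python
# Hypothetical helper. The model-ID prefixes are assumptions based on the
# naming pattern used elsewhere in this guide; adjust them to your model IDs.
ADAPTIVE_MODEL_PREFIXES = (
    "claude-opus-4-7",
    "claude-opus-4-6",
    "claude-sonnet-4-6",
)


def thinking_config_for(model: str, budget_tokens: int = 8000) -> dict:
    """Return a `thinking` parameter suited to the given model ID."""
    if model.startswith(ADAPTIVE_MODEL_PREFIXES):
        # Adaptive-capable models: no token budget is passed.
        return {"type": "adaptive"}
    # Older models (e.g. Sonnet 4.5, Opus 4.5): manual mode with a fixed budget.
    return {"type": "enabled", "budget_tokens": budget_tokens}


print(thinking_config_for("claude-opus-4-7"))
print(thinking_config_for("claude-sonnet-4-5", budget_tokens=4000))
```

The returned dictionary can be passed as the thinking argument of client.messages.create.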

How Adaptive Thinking Works

When you set thinking.type to "adaptive", Claude evaluates each incoming request independently:

Complexity assessment: Claude analyzes the prompt to gauge reasoning depth needed.
Thinking decision: At default high effort, Claude almost always thinks. At lower effort levels, it may skip thinking for straightforward queries.
Dynamic allocation: Thinking tokens are allocated per-request, not from a fixed pool.
Interleaved thinking: In multi-turn or tool-using scenarios, Claude can think between each action, enabling more coherent agentic behavior.
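
Because the thinking decision is made per request, it can help to check after the fact whether a given response contains any thinking blocks. The following is a minimal sketch with hypothetical helper names; it only assumes that each content block exposes a type attribute, as in the examples in this guide. The sample list stands in for response.content so the snippet runs without an API call.

```python
from collections import Counter
from types import SimpleNamespace


def count_block_types(content) -> Counter:
    """Count content blocks by their `type` attribute."""
    return Counter(block.type for block in content)


def did_think(content) -> bool:
    """Return True if at least one thinking block is present."""
    return any(block.type == "thinking" for block in content)


# Stand-in for response.content (illustrative data, not real API output).
sample = [
    SimpleNamespace(type="thinking", thinking="Even numbers are 2k..."),
    SimpleNamespace(type="text", text="The sum is 2(a + b), which is even."),
]

print(count_block_types(sample))
print(did_think(sample))
```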

Getting Started: Basic API Usage

Here's how to enable adaptive thinking in your API calls. The key change is setting thinking.type to "adaptive" instead of "enabled".

Python Example

import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    messages=[
        {
            "role": "user",
            "content": "Explain why the sum of two even numbers is always even."
        }
    ]
)
for block in response.content:
    if block.type == "thinking":
        print(f"\nThinking: {block.thinking}")
    elif block.type == "text":
        print(f"\nResponse: {block.text}")

TypeScript Example

import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
    model: 'claude-opus-4-7',
    max_tokens: 16000,
    thinking: { type: 'adaptive' },
    messages: [
        {
            role: 'user',
            content: 'Explain why the sum of two even numbers is always even.'
        }
    ]
});
for (const block of response.content) {
    if (block.type === 'thinking') {
        console.log(`\nThinking: ${block.thinking}`);
    } else if (block.type === 'text') {
        console.log(`\nResponse: ${block.text}`);
    }
}

Controlling Thinking Depth with the Effort Parameter

Adaptive thinking becomes truly powerful when you combine it with the effort parameter. This acts as soft guidance for how much thinking Claude should allocate.

Effort Level	Behavior	Best For
`low`	Minimal thinking; may skip for simple tasks	High-throughput, low-latency applications
`medium`	Balanced approach	General-purpose use
`high` (default)	Almost always thinks; deep reasoning	Complex analysis, code generation, agentic tasks

Example with Effort Control

response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    # Effort is passed in output_config, not inside the thinking block
    output_config={"effort": "medium"},  # Options: low, medium, high
    messages=[
        {
            "role": "user",
            "content": "Write a Python function to merge two sorted lists."
        }
    ]
)

Note: On Claude Opus 4.7, the effort parameter is not yet supported. On Opus 4.6 and Sonnet 4.6, it is available and recommended.

Adaptive Thinking for Agentic Workflows

This is where adaptive thinking truly shines. In agentic workflows, Claude typically:

Receives a user request
Thinks about how to approach it
Calls a tool (e.g., searches a database, runs code)
Observes the result
Thinks again about next steps
Calls another tool or produces a final answer

With manual budget_tokens, Claude had to allocate all thinking tokens upfront—often running out before the final reasoning step. Adaptive thinking with interleaved thinking solves this by allowing Claude to think between each tool call.

Agentic Workflow Example

response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=32000,
    thinking={"type": "adaptive"},
    # Effort is passed in output_config, not inside the thinking block
    output_config={"effort": "high"},
    tools=[
        {
            "name": "search_knowledge_base",
            "description": "Search the internal knowledge base",
            "input_schema": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"}
                },
                "required": ["query"]
            }
        },
        {
            "name": "run_sql_query",
            "description": "Execute a SQL query on the database",
            "input_schema": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"}
                },
                "required": ["query"]
            }
        }
    ],
    messages=[
        {
            "role": "user",
            "content": "Find all customers who purchased more than $1000 in the last month and send them a discount code."
        }
    ]
)

In this scenario, as your application runs each requested tool and returns its result, Claude might:

Think about how to break down the task
Call search_knowledge_base to find the discount policy
Think about the SQL query needed
Call run_sql_query to find qualifying customers
Think about how to format the results
Produce the final answer
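
The steps above only happen if your application keeps the conversation going: a single messages.create call stops as soon as Claude requests a tool. The loop below is a sketch of one way to drive that exchange, not a definitive implementation. The function name run_agent_loop and its parameters are hypothetical; it assumes client behaves like anthropic.Anthropic() and that tool_handlers maps each tool name to a Python callable you provide.

```python
def run_agent_loop(client, request: dict, tool_handlers: dict, max_turns: int = 10):
    """Call the Messages API repeatedly until Claude stops requesting tools.

    `request` holds the keyword arguments for client.messages.create
    (model, max_tokens, thinking, tools, messages, ...).
    """
    messages = list(request["messages"])
    for _ in range(max_turns):
        response = client.messages.create(**{**request, "messages": messages})
        if response.stop_reason != "tool_use":
            return response  # Final answer (or another terminal stop reason)

        # Echo the assistant turn back unchanged, thinking blocks included,
        # so Claude's reasoning carries over between tool calls.
        messages.append({"role": "assistant", "content": response.content})

        # Run every requested tool and collect the results for the next turn.
        results = []
        for block in response.content:
            if block.type == "tool_use":
                output = tool_handlers[block.name](**block.input)
                results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(output),
                })
        messages.append({"role": "user", "content": results})

    raise RuntimeError("tool loop did not finish within max_turns")
```

With the request from the example above collected into a dictionary, a call would look like run_agent_loop(client, request, {"search_knowledge_base": search_fn, "run_sql_query": sql_fn}), where search_fn and sql_fn are your own functions.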

Migration Guide: Moving from Manual to Adaptive

If you're currently using thinking.type: "enabled" with budget_tokens, here's your migration path:

Step 1: Identify Your Models

Check which models you're using. Only Opus 4.6, Sonnet 4.6, and newer support adaptive thinking.

Step 2: Update Your API Calls

Before (deprecated):

thinking={
    "type": "enabled",
    "budget_tokens": 8000
}

After (recommended):

thinking={"type": "adaptive"},
output_config={"effort": "high"}  # Optional; defaults to high
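
If your request parameters are built as dictionaries, the swap can be applied programmatically. This is a minimal sketch under that assumption; migrate_to_adaptive is a hypothetical helper name, and it only rewrites the thinking entry, leaving every other parameter untouched.

```python
import copy


def migrate_to_adaptive(request: dict) -> dict:
    """Return a copy of `request` with manual thinking replaced by adaptive."""
    migrated = copy.deepcopy(request)
    thinking = migrated.get("thinking")
    if thinking and thinking.get("type") == "enabled":
        # Dropping the old dict also drops budget_tokens.
        migrated["thinking"] = {"type": "adaptive"}
    return migrated


old_request = {
    "model": "claude-opus-4-6",
    "max_tokens": 16000,
    "thinking": {"type": "enabled", "budget_tokens": 8000},
    "messages": [{"role": "user", "content": "Hello"}],
}
new_request = migrate_to_adaptive(old_request)
print(new_request["thinking"])
```

Working on a copy keeps the original request available for side-by-side comparison during testing.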

Step 3: Test and Monitor

Start with effort: "high" to match your current behavior
Monitor latency and token usage
Experiment with medium or low for simpler tasks
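
For the monitoring step, a thin wrapper around the API call is often enough to start with. The sketch below is illustrative: timed_call is a hypothetical name, and it assumes the response exposes usage.input_tokens and usage.output_tokens, as Messages API responses do. To my understanding, thinking tokens are counted within output tokens, so output_tokens is the figure to watch when comparing effort levels.

```python
import time


def timed_call(create_fn, **request):
    """Call `create_fn(**request)` and report latency plus token usage."""
    start = time.perf_counter()
    response = create_fn(**request)
    elapsed = time.perf_counter() - start
    stats = {
        "latency_s": elapsed,
        "input_tokens": response.usage.input_tokens,
        "output_tokens": response.usage.output_tokens,
    }
    return response, stats
```

Usage would look like response, stats = timed_call(client.messages.create, model=..., max_tokens=..., thinking=..., messages=...), with the stats dictionary sent to whatever logging or metrics system you already use.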

Step 4: Handle Older Models

For models that don't support adaptive thinking (Sonnet 4.5, Opus 4.5, etc.), continue using thinking.type: "enabled" with budget_tokens.

Best Practices

1. Match Effort to Task Complexity

Simple Q&A: Use effort: "low" for faster responses
Code generation: Use effort: "medium" as a starting point
Complex analysis/agents: Use effort: "high" for maximum reasoning depth
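
One way to apply this mapping is a lookup keyed by task category. The category names below are hypothetical labels that your own application would assign; the effort values come from the list above, and unknown categories fall back to "high", the default effort level.

```python
# Hypothetical task categories mapped to the effort levels suggested above.
EFFORT_BY_TASK = {
    "simple_qa": "low",
    "code_generation": "medium",
    "complex_analysis": "high",
    "agent": "high",
}


def pick_effort(task_kind: str) -> str:
    """Return an effort level for a task category, defaulting to "high"."""
    return EFFORT_BY_TASK.get(task_kind, "high")


print(pick_effort("simple_qa"))
print(pick_effort("unknown_category"))
```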

2. Set Appropriate max_tokens

Adaptive thinking can use more tokens than you expect for complex tasks. Set max_tokens generously (e.g., 16,000–32,000) to avoid truncated responses.
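
Truncation can also be detected at runtime: a response cut off by the limit reports a stop_reason of "max_tokens". The sketch below retries once with a larger limit in that case. The name create_with_headroom, the doubling strategy, and the 32,000 ceiling are illustrative choices, not an official recommendation.

```python
def create_with_headroom(create_fn, request: dict, ceiling: int = 32000):
    """Retry once with a doubled max_tokens if the first reply was cut off."""
    response = create_fn(**request)
    if response.stop_reason == "max_tokens" and request["max_tokens"] < ceiling:
        # Double the limit, but never exceed the configured ceiling.
        bigger = {**request, "max_tokens": min(request["max_tokens"] * 2, ceiling)}
        response = create_fn(**bigger)
    return response
```

Here create_fn would typically be client.messages.create. Retrying repeats the whole request, so this trades extra cost for a complete answer.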

3. Handle Thinking Blocks in Your Code

Always iterate over response.content and handle both thinking and text blocks:

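import logging

def extract_text(response):
    text_parts = []
    for block in response.content:
        if block.type == "thinking":
            # Optionally log or display a preview of the thinking
            logging.debug("Claude is thinking: %s...", block.thinking[:100])
        elif block.type == "text":
            # Collect every text block instead of stopping at the first
            text_parts.append(block.text)
    return "".join(text_parts)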

4. Use for Bimodal Workloads

Adaptive thinking excels when your application handles both simple and complex requests. The model automatically adjusts, saving tokens on easy queries while going deep on hard ones.

Limitations and Considerations

ZDR eligibility: Adaptive thinking is eligible for Zero Data Retention on supported models.
Effort is guidance, not a hard limit: Claude may use more or less thinking than the effort level suggests.
Opus 4.7 restrictions: On Opus 4.7, adaptive thinking is the only mode; you cannot disable thinking or use manual budgets.
Latency variability: Because thinking depth varies per request, response times may be less predictable than with fixed budgets.

Key Takeaways

Adaptive thinking lets Claude dynamically decide when and how much to think, replacing the need for manual budget_tokens configuration.
Use the effort parameter (low/medium/high) to guide thinking depth without hard-coding token limits.
Interleaved thinking makes adaptive mode ideal for agentic workflows where Claude reasons between tool calls.
Migrate from thinking.type: "enabled" to thinking.type: "adaptive" on Opus 4.6, Sonnet 4.6, and newer models—manual budgets are deprecated.
On Claude Opus 4.7, adaptive thinking is the only supported mode; manual budgets are rejected with a 400 error.