GuideBeginnerAgents2026-05-12

Adaptive Thinking in Claude: Dynamic Reasoning for Smarter AI Workflows

Learn how to use Claude's adaptive thinking mode to dynamically allocate reasoning tokens, improve agentic workflows, and optimize performance without manual budget settings.

Quick Answer

Adaptive thinking lets Claude automatically decide when and how much to use extended reasoning based on task complexity. It replaces manual token budgets, supports interleaved thinking between tool calls, and offers effort levels (low to max) for fine-grained control. Ideal for agentic workflows and bimodal tasks.

adaptive thinkingextended thinkingClaude APIagentic workflowsreasoning optimization

Adaptive Thinking in Claude: Dynamic Reasoning for Smarter AI Workflows

Claude's extended thinking capability has been a game-changer for complex reasoning tasks. But manually setting a thinking token budget for every request can be tedious and suboptimal—especially when your workload mixes simple queries with deep analytical problems. Enter adaptive thinking, the recommended way to use extended thinking with Claude Opus 4.7, Opus 4.6, and Sonnet 4.6.

Adaptive thinking shifts the burden of deciding when and how much to think from you to Claude. The model evaluates each request's complexity in real time and allocates reasoning tokens dynamically. This leads to better performance on bimodal tasks (e.g., a mix of quick lookups and deep analysis) and long-horizon agentic workflows where intermediate reasoning between tool calls is critical.

In this guide, you'll learn exactly how adaptive thinking works, how to implement it in your API calls, and how to tune it with the effort parameter for your specific use case.

What Is Adaptive Thinking?

Adaptive thinking is a mode where Claude decides autonomously whether to engage extended reasoning and how many tokens to allocate. Unlike the older thinking: {type: "enabled", budget_tokens: N} approach, you don't set a fixed budget. Instead, Claude examines the input and chooses a reasoning strategy on the fly.

Key characteristics:

Dynamic allocation: Claude spends more tokens on complex problems, less on simple ones.
Interleaved thinking: Claude can think between tool calls, making it ideal for agentic loops.
Effort guidance: You can provide soft hints via the effort parameter to influence thinking depth.

Supported Models

Model	Adaptive Thinking Support	Notes
Claude Mythos Preview	Default (auto-applies)	`thinking: {type: "disabled"}` not supported
Claude Opus 4.7	Only supported mode	Manual `budget_tokens` rejected with 400 error
Claude Opus 4.6	Supported	Manual mode deprecated, plan migration
Claude Sonnet 4.6	Supported	Manual mode deprecated, plan migration

Warning: On Opus 4.6 and Sonnet 4.6, thinking.type: "enabled" with budget_tokens is deprecated and will be removed in a future release. Migrate to adaptive thinking now.

How Adaptive Thinking Works

When you set thinking.type: "adaptive", Claude treats thinking as optional. At the default effort level (high), Claude almost always thinks—but for trivial requests, it may skip thinking entirely. At lower effort levels, Claude becomes more selective, reserving deep reasoning for genuinely complex tasks.

Adaptive thinking also enables interleaved thinking. In agentic workflows where Claude makes multiple tool calls, it can pause to think between calls, refining its strategy based on intermediate results. This is a major advantage over fixed-budget thinking, which allocates all reasoning tokens upfront and cannot adjust mid-conversation.

How to Use Adaptive Thinking

Using adaptive thinking is straightforward. Set thinking.type to "adaptive" in your API request. Optionally, include the effort parameter to guide reasoning depth.

Basic Usage (Python)

import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={
        "type": "adaptive",
        "effort": "high"  # optional, defaults to "high"
    },
    messages=[
        {"role": "user", "content": "Analyze the quarterly financial report and identify three key risks."}
    ]
)
print(response.content)

TypeScript Example

import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 4096,
  thinking: {
    type: 'adaptive',
    effort: 'high'
  },
  messages: [
    { role: 'user', content: 'Analyze the quarterly financial report and identify three key risks.' }
  ]
});
console.log(response.content);

Effort Levels Explained

The effort parameter provides soft guidance. Here's what each level means:

Effort	Behavior	Available On
`max`	Always thinks, no constraints on depth	Mythos, Opus 4.7, Opus 4.6, Sonnet 4.6
`xhigh`	Always thinks deeply with extended exploration	Opus 4.7 only
`high` (default)	Always thinks; deep reasoning on complex tasks	All adaptive models
`medium`	Moderate thinking; may skip for simple queries	All adaptive models
`low`	Minimizes thinking; skips for most queries	All adaptive models

Example: Using Low Effort for Simple Tasks

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    thinking={
        "type": "adaptive",
        "effort": "low"
    },
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)

For a trivial question like this, Claude will likely skip extended thinking entirely, saving tokens and reducing latency.

Adaptive Thinking in Agentic Workflows

One of the most powerful use cases for adaptive thinking is agentic workflows—multi-step tasks where Claude uses tools, evaluates results, and decides next actions. With interleaved thinking, Claude can reason between each tool call, adapting its strategy as new information arrives.

Example: Research Agent with Tool Calls

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=8192,
    thinking={
        "type": "adaptive",
        "effort": "high"
    },
    tools=[
        {
            "name": "web_search",
            "description": "Search the web for current information",
            "input_schema": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"}
                },
                "required": ["query"]
            }
        }
    ],
    messages=[
        {"role": "user", "content": "Research the latest AI regulations in the EU and summarize the key changes."}
    ]
)

In this scenario, Claude will:

Think about which search queries to run.
Call the search tool.
Think again about the results.
Decide if additional searches are needed.
Finally compose the summary.

Each thinking step is adaptive—Claude spends more tokens on complex analysis and less on straightforward lookups.

Migrating from Fixed Budget to Adaptive

If you're currently using thinking: {type: "enabled", budget_tokens: 16000}, migrating is simple:

Before (deprecated):

thinking={
    "type": "enabled",
    "budget_tokens": 16000
}

After (recommended):

thinking={
    "type": "adaptive",
    "effort": "high"  # or "medium", "low", etc.
}

No other changes are required. The max_tokens parameter still controls the total response length, including thinking tokens.

When to Use Each Effort Level

max: Mission-critical analysis, complex code generation, multi-step agentic tasks where you want maximum reasoning.
xhigh (Opus 4.7 only): Research-grade deep dives, novel problem solving, or tasks requiring exhaustive exploration.
high: General-purpose reasoning. Best for most production workloads.
medium: High-volume tasks where cost and latency matter, but you still want reasoning for moderately complex queries.
low: Simple Q&A, data extraction, or tasks where speed is paramount.

Limitations and Considerations

Predictable latency: If your application requires consistent response times, adaptive thinking may introduce variability. For such cases, consider using effort: "low" or "medium" to reduce variance.
Cost control: Adaptive thinking can use more tokens than a fixed budget on complex tasks. Monitor your token usage and adjust effort levels accordingly.
Older models: Models like Sonnet 4.5 and Opus 4.5 do not support adaptive thinking. You must use the legacy budget_tokens approach with those models.
No beta header required: Unlike some other Claude features, adaptive thinking is generally available—no special headers needed.

Key Takeaways

Adaptive thinking lets Claude dynamically allocate reasoning tokens based on task complexity, replacing manual budget settings for Opus 4.7, Opus 4.6, and Sonnet 4.6.
Use the effort parameter to guide reasoning depth: low for speed, high for deep reasoning, max for unrestricted exploration.
Interleaved thinking enables smarter agentic workflows by allowing Claude to reason between tool calls, adapting its strategy in real time.
Migrate from fixed budgets now—the legacy budget_tokens approach is deprecated on Opus 4.6 and Sonnet 4.6 and will be removed in future releases.
Monitor token usage when using effort: "max" or "high" on complex tasks, as adaptive thinking may consume more tokens than a fixed budget.