Guide2026-05-06

Mastering Adaptive Thinking in Claude: Dynamic Reasoning for Smarter AI Workflows

Learn how to use Claude's adaptive thinking mode to dynamically allocate reasoning resources, improve agentic performance, and simplify your API code—with practical examples and best practices.

Quick Answer

Adaptive thinking lets Claude dynamically decide when and how much to use extended reasoning per request, replacing manual token budgets. It improves performance on complex tasks, supports interleaved thinking between tool calls, and is the only mode for Claude Opus 4.7. Use `thinking: {type: "adaptive"}` with optional effort levels.

adaptive thinkingClaude APIextended thinkingagentic workflowsreasoning

Introduction

If you've ever struggled to set the perfect budget_tokens for Claude's extended thinking, you're not alone. Too low, and Claude cuts off mid-reasoning. Too high, and you waste tokens on simple queries. With adaptive thinking, Claude handles that decision for you—dynamically allocating reasoning resources based on the complexity of each request.

Adaptive thinking is the recommended way to use extended thinking with Claude Opus 4.7, Opus 4.6, and Sonnet 4.6, and it's the default mode on Claude Mythos Preview. It's especially powerful for bimodal tasks (mixing simple and complex queries) and long-horizon agentic workflows where Claude needs to reason between tool calls.

In this guide, you'll learn exactly how adaptive thinking works, how to implement it in your API calls, and how to fine-tune it with the effort parameter.

What Is Adaptive Thinking?

Adaptive thinking is a replacement for the older thinking: {type: "enabled", budget_tokens: N} approach. Instead of you guessing how many tokens Claude needs to think, Claude evaluates each incoming request and decides:

Whether to use extended thinking at all
How much thinking to allocate

At the default effort level (high), Claude almost always thinks. At lower levels, it may skip thinking for simpler problems—saving you time and money.

Key Benefits

No more token budget guesswork – Claude self-calibrates per request.
Interleaved thinking – Claude can think between tool calls, which is critical for agentic loops.
Better performance on mixed workloads – Simple queries don't waste thinking tokens; complex ones get the depth they need.
Future-proof – Manual budget_tokens is deprecated on Opus 4.6 and Sonnet 4.6 and will be removed in future releases.

Supported Models

Model	Adaptive Thinking Support	Notes
Claude Mythos Preview	✅ Default	Auto-applies when thinking is unset; `disabled` not supported
Claude Opus 4.7	✅ Only mode	Manual `enabled` rejected with 400 error
Claude Opus 4.6	✅ Supported	Manual `enabled` deprecated, still functional
Claude Sonnet 4.6	✅ Supported	Manual `enabled` deprecated, still functional
Older models (Sonnet 4.5, Opus 4.5)	❌ Not supported	Must use `thinking: {type: "enabled", budget_tokens: N}`

Warning: On Opus 4.6 and Sonnet 4.6, thinking.type: "enabled" with budget_tokens is deprecated. Existing configurations still work, but you should migrate to thinking.type: "adaptive" as soon as possible.

How to Use Adaptive Thinking

Using adaptive thinking is straightforward. Set thinking.type to "adaptive" in your API request. No beta header is required.

Basic Example (Python)

import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    messages=[
        {
            "role": "user",
            "content": "Explain why the sum of two even numbers is always even."
        }
    ]
)
for block in response.content:
    if block.type == "thinking":
        print(f"\nThinking: {block.thinking}")
    elif block.type == "text":
        print(f"\nResponse: {block.text}")

Basic Example (TypeScript)

import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 16000,
  thinking: { type: 'adaptive' },
  messages: [
    {
      role: 'user',
      content: 'Explain why the sum of two even numbers is always even.'
    }
  ]
});
for (const block of response.content) {
  if (block.type === 'thinking') {
    console.log(\nThinking: ${block.thinking});
  } else if (block.type === 'text') {
    console.log(\nResponse: ${block.text});
  }
}

Fine-Tuning with the Effort Parameter

Adaptive thinking comes with an optional effort parameter that acts as soft guidance for how much thinking Claude should do. The effort level doesn't enforce a hard limit—it nudges Claude toward more or less reasoning.

Effort Levels

Effort	Behavior	Use Case
`low`	Minimal thinking; may skip for simple queries	High-throughput, low-complexity tasks
`medium`	Balanced thinking allocation	General-purpose use
`high` (default)	Almost always thinks; deep reasoning	Complex analysis, math, coding

Example with Effort

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={
        "type": "adaptive",
        "effort": "medium"
    },
    messages=[
        {
            "role": "user",
            "content": "Analyze the pros and cons of using microservices vs monoliths for a startup."
        }
    ]
)

Tip: Start with effort: "high" for most workloads. Drop to "medium" or "low" only if you're certain your queries are simple or you need faster responses.

Adaptive Thinking in Agentic Workflows

One of the most powerful features of adaptive thinking is interleaved thinking—Claude can think between tool calls. This is a game-changer for agentic loops where Claude needs to reason about tool results before deciding the next action.

Example: Agent with Tool Calls

import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=32000,
    thinking={"type": "adaptive"},
    tools=[
        {
            "name": "search_database",
            "description": "Search the internal database for user records",
            "input_schema": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"}
                },
                "required": ["query"]
            }
        }
    ],
    messages=[
        {
            "role": "user",
            "content": "Find all users who signed up last week and send them a welcome email."
        }
    ]
)
Claude will think, call the tool, think again about the results, then respond
for block in response.content:
    if block.type == "thinking":
        print(f"Thinking: {block.thinking}")
    elif block.type == "tool_use":
        print(f"Tool call: {block.name}({block.input})")
    elif block.type == "text":
        print(f"Final response: {block.text}")

Without adaptive thinking, Claude would do all its reasoning upfront before any tool calls. With interleaved thinking, it can reason after each tool result—leading to more accurate multi-step actions.

Migrating from Manual Budget Tokens

If you're currently using thinking: {type: "enabled", budget_tokens: 8000}, here's how to migrate:

Before (Deprecated)

thinking={
    "type": "enabled",
    "budget_tokens": 8000
}

After (Recommended)

thinking={
    "type": "adaptive",
    "effort": "high"  # optional, defaults to high
}

What changes:

You no longer need to set budget_tokens.
Claude decides the thinking depth per request.
You may see varying thinking token usage—that's expected and intentional.
For predictable latency or strict cost control, you can still use budget_tokens on Opus 4.6 and Sonnet 4.6, but plan to migrate soon.

Best Practices

Start with effort: "high" – It's the default for a reason. Only lower it after testing your workload.
Use with tool use – The interleaved thinking capability shines in agentic loops. Always enable adaptive thinking when using tools.
Monitor thinking blocks – In your response handling, always check for block.type === "thinking" to capture reasoning traces.
Set max_tokens generously – Adaptive thinking can use more tokens than a fixed budget. Set max_tokens high enough to accommodate both thinking and response.
Test bimodal workloads – If your app handles both simple Q&A and complex analysis, adaptive thinking will automatically adjust—no need to switch modes.

Limitations and Considerations

Not supported on older models – Sonnet 4.5, Opus 4.5, and earlier require manual budget_tokens.
No hard budget control – If you need strict cost predictability, adaptive thinking may not be ideal. Use budget_tokens on supported models as a transitional measure.
Effort is soft guidance – Setting effort: "low" doesn't guarantee Claude won't think deeply; it's a nudge, not a command.

Key Takeaways

Adaptive thinking replaces manual token budgets – Claude dynamically decides when and how much to think, simplifying your code and often improving performance.
It's the only thinking mode on Claude Opus 4.7 – Manual thinking: {type: "enabled"} is rejected with a 400 error on this model.
Interleaved thinking supercharges agentic workflows – Claude can reason between tool calls, leading to more accurate multi-step actions.
Use the effort parameter for soft control – Choose from low, medium, or high to guide Claude's reasoning depth.
Migrate now – Manual budget_tokens is deprecated on Opus 4.6 and Sonnet 4.6 and will be removed in a future release.