Guide · 2026-05-05

Mastering Adaptive Thinking in Claude: Smarter Reasoning Without Manual Budgets

Learn how to use Claude's adaptive thinking mode to dynamically allocate reasoning tokens, optimize performance, and simplify your API code—no manual budgets required.

Quick Answer

Adaptive thinking lets Claude dynamically decide when and how much to use extended thinking based on request complexity. You replace manual budget_tokens with a simple 'adaptive' flag and optionally set an effort level. It's the recommended mode for Claude Opus 4.7, Opus 4.6, Sonnet 4.6, and Mythos Preview.

Tags: adaptive thinking, Claude API, extended thinking, effort parameter, agentic workflows

Introduction

If you've worked with Claude's extended thinking feature, you know the power of letting the model "think before it speaks." But manually setting a budget_tokens value for every request can be tedious and inefficient—especially when your workload mixes simple questions with complex reasoning tasks.

Enter adaptive thinking. This new mode, now the recommended approach for Claude Opus 4.7, Opus 4.6, Sonnet 4.6, and the Claude Mythos Preview, lets Claude decide when and how much to think. The result? Better performance, simpler code, and smarter resource allocation.

In this guide, you'll learn exactly what adaptive thinking is, which models support it, how to use it in your API calls, and how to fine-tune it with the effort parameter. By the end, you'll be able to replace clunky budget_tokens configurations with a clean, adaptive approach.

What Is Adaptive Thinking?

Adaptive thinking is a new mode for Claude's extended thinking capability. Instead of you specifying a fixed number of thinking tokens (budget_tokens), Claude evaluates each incoming request and dynamically determines:

Whether to use extended thinking at all
How many tokens to allocate for thinking

This is especially powerful for bimodal workloads, where some requests are trivial (e.g., "What's 2+2?") and others require deep reasoning (e.g., "Prove Fermat's Last Theorem"). With a single fixed budget, the ceiling is either far higher than simple tasks need or too low for complex ones. Adaptive thinking solves both problems.

Additionally, adaptive thinking automatically enables interleaved thinking, meaning Claude can think between tool calls in agentic workflows. This makes it ideal for multi-step tasks where the model needs to reason after each tool result.

Supported Models

Not all Claude models support adaptive thinking. Here's the current compatibility:

Model	Adaptive Thinking Support	Notes
Claude Mythos Preview (`claude-mythos-preview`)	✅ Default mode	Thinking auto-applies when unset; `thinking: {type: "disabled"}` not supported
Claude Opus 4.7 (`claude-opus-4-7`)	✅ Only supported mode	Manual `thinking: {type: "enabled", budget_tokens: N}` is rejected with a 400 error
Claude Opus 4.6 (`claude-opus-4-6`)	✅ Supported	Manual `budget_tokens` is deprecated but still functional
Claude Sonnet 4.6 (`claude-sonnet-4-6`)	✅ Supported	Manual `budget_tokens` is deprecated but still functional
Older models (Sonnet 4.5, Opus 4.5, etc.)	❌ Not supported	Must use `thinking.type: "enabled"` with `budget_tokens`

Warning: On Opus 4.6 and Sonnet 4.6, thinking.type: "enabled" and budget_tokens are deprecated and will be removed in a future model release. Migrate to adaptive thinking now.
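
The compatibility table above can be encoded as a small lookup so that calling code never sends an unsupported thinking configuration. This is a minimal sketch rather than an official helper: the function name thinking_config_for and the fallback budget of 16000 are hypothetical, the model IDs are copied from the table, and any model not listed as adaptive is assumed to need a manual budget.

```python
# Hypothetical helper: choose a `thinking` request field from the table above.
ADAPTIVE_MODELS = {
    "claude-mythos-preview",
    "claude-opus-4-7",
    "claude-opus-4-6",
    "claude-sonnet-4-6",
}


def thinking_config_for(model: str, fallback_budget: int = 16000) -> dict:
    """Return adaptive config for supported models, a manual budget otherwise."""
    if model in ADAPTIVE_MODELS:
        return {"type": "adaptive"}
    # Older models (for example Sonnet 4.5) still need an explicit budget.
    return {"type": "enabled", "budget_tokens": fallback_budget}


print(thinking_config_for("claude-opus-4-7"))
print(thinking_config_for("claude-sonnet-4-5"))
```

The returned dict can be passed directly as the thinking argument of client.messages.create.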

How to Use Adaptive Thinking (Basic)

Using adaptive thinking is straightforward. Instead of:

{
  "thinking": {
    "type": "enabled",
    "budget_tokens": 16000
  }
}

You simply write:

{
  "thinking": {
    "type": "adaptive"
  }
}

Here's a complete Python example:

import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    messages=[
        {
            "role": "user",
            "content": "Explain why the sum of two even numbers is always even."
        }
    ]
)
for block in response.content:
    if block.type == "thinking":
        print(f"\nThinking: {block.thinking}")
    elif block.type == "text":
        print(f"\nResponse: {block.text}")

And the equivalent in TypeScript:

import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 16000,
  thinking: { type: 'adaptive' },
  messages: [
    {
      role: 'user',
      content: 'Explain why the sum of two even numbers is always even.'
    }
  ]
});
for (const block of response.content) {
  if (block.type === 'thinking') {
    // Template literals need backticks for ${...} interpolation.
    console.log(`\nThinking: ${block.thinking}`);
  } else if (block.type === 'text') {
    console.log(`\nResponse: ${block.text}`);
  }
}

No beta headers are required—adaptive thinking is GA for supported models.
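
Because Claude may decide not to think at all in adaptive mode, response-handling code should not assume that a thinking block is present. The sketch below shows one way to separate the two block types defensively. The helper name split_blocks is hypothetical, and the stand-in objects only mimic the type, thinking, and text attributes used in the examples above, so the snippet runs without an API call.

```python
from types import SimpleNamespace


def split_blocks(content):
    """Separate thinking text from answer text; either list may be empty."""
    thinking, answer = [], []
    for block in content:
        if block.type == "thinking":
            thinking.append(block.thinking)
        elif block.type == "text":
            answer.append(block.text)
    return thinking, answer


# Stand-in for response.content when Claude answered without thinking.
fake_content = [SimpleNamespace(type="text", text="4")]
thinking, answer = split_blocks(fake_content)
print("thinking blocks:", len(thinking))
print("answer:", " ".join(answer))
```

With a real response, pass response.content in place of fake_content.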

Fine-Tuning with the Effort Parameter

Adaptive thinking also introduces the effort parameter, which gives you soft control over how much thinking Claude should do. Think of it as a dial between "minimal thinking" and "maximum reasoning."

Effort Level	Behavior
`low`	Claude may skip thinking for simple requests; uses minimal tokens when it does think
`medium`	Balanced approach; thinks for moderately complex tasks
`high` (default)	Claude almost always thinks; allocates significant tokens for reasoning

Here's how to use it:

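import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    # Effort is sent in output_config rather than inside the thinking object.
    output_config={"effort": "medium"},  # or "low", "high"
    messages=[
        {
            "role": "user",
            "content": "Design a scalable microservices architecture for an e-commerce platform."
        }
    ]
)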

When to Use Each Effort Level

low: Use for high-throughput, low-complexity tasks like simple classification, extraction, or Q&A where you want to minimize latency and cost.
medium: Good for general-purpose use where you expect a mix of simple and complex requests.
high: Best for complex reasoning, multi-step agentic workflows, mathematical proofs, code generation, and any task where quality is more important than speed.
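
The guidance above can be captured as a lookup from task category to effort level. This is only an illustrative sketch: the category names and the choose_effort function are hypothetical, and the mapping simply restates the recommendations in this section.

```python
# Hypothetical task categories mapped to the effort levels described above.
EFFORT_BY_TASK = {
    "classification": "low",
    "extraction": "low",
    "simple_qa": "low",
    "general": "medium",
    "agentic": "high",
    "proof": "high",
    "code_generation": "high",
}


def choose_effort(task_kind: str) -> str:
    """Fall back to "high", the default level, for unknown task kinds."""
    return EFFORT_BY_TASK.get(task_kind, "high")


print(choose_effort("extraction"))
print(choose_effort("agentic"))
```

Falling back to "high" for unknown categories favors quality over cost; a cost-sensitive service might prefer "medium" as its fallback instead.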

Adaptive Thinking in Agentic Workflows

One of the biggest advantages of adaptive thinking is interleaved thinking. In traditional extended thinking, Claude thinks once before responding. With adaptive mode, Claude can think between tool calls, evaluating results and planning the next step.

This makes adaptive thinking well suited to agentic workflows, such as the following request with two tools:

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=32000,
    thinking={"type": "adaptive"},
    # Effort is sent in output_config rather than inside the thinking object.
    output_config={"effort": "high"},
    tools=[
        {
            "name": "search_database",
            "description": "Search the product database",
            "input_schema": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"}
                },
                "required": ["query"]
            }
        },
        {
            "name": "calculate_shipping",
            "description": "Calculate shipping cost",
            "input_schema": {
                "type": "object",
                "properties": {
                    "weight": {"type": "number"},
                    "destination": {"type": "string"}
                },
                "required": ["weight", "destination"]
            }
        }
    ],
    messages=[
        {
            "role": "user",
            "content": "Find the cheapest laptop under $1000 and calculate shipping to New York."
        }
    ]
)

In this scenario, Claude can:

Think about how to approach the task
Call search_database to find laptops
Think about the results and decide which is cheapest
Call calculate_shipping with the selected laptop's weight
Think about the final answer and present it to the user
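
The sequence above implies a loop on the client side: each time Claude stops to request a tool, your code runs the tool and sends the result back. The following is a minimal sketch of such a loop, under stated assumptions. The names run_tool_loop, tool_impls, and fake_create are hypothetical, and fake_create is an offline stand-in so the example runs without network access. The loop returns the whole assistant turn, thinking blocks included, unchanged in the next request.

```python
from types import SimpleNamespace


def run_tool_loop(create, tool_impls, messages, max_turns=5, **params):
    """Call `create` until the model stops asking for tools.

    `create` is expected to behave like client.messages.create.
    `tool_impls` maps tool names to plain Python callables.
    """
    messages = list(messages)
    for _ in range(max_turns):
        response = create(messages=messages, **params)
        if response.stop_reason != "tool_use":
            return response
        # Echo the full assistant turn back, thinking blocks included.
        messages.append({"role": "assistant", "content": response.content})
        results = []
        for block in response.content:
            if block.type == "tool_use":
                output = tool_impls[block.name](**block.input)
                results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(output),
                })
        messages.append({"role": "user", "content": results})
    raise RuntimeError("tool loop did not finish within max_turns")


# Offline stand-in for the API: one tool call, then a final answer.
def fake_create(messages, **params):
    if len(messages) == 1:
        call = SimpleNamespace(type="tool_use", id="t1", name="search_database",
                               input={"query": "laptop under 1000"})
        return SimpleNamespace(stop_reason="tool_use", content=[call])
    done = SimpleNamespace(type="text", text="done")
    return SimpleNamespace(stop_reason="end_turn", content=[done])


final = run_tool_loop(
    fake_create,
    {"search_database": lambda query: ["Laptop A, $899"]},
    [{"role": "user", "content": "Find the cheapest laptop under $1000."}],
)
print(final.content[0].text)
```

In real use, pass client.messages.create as create and supply model, max_tokens, thinking, and tools as keyword arguments.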

Migrating from Manual Budgets

If you're currently using thinking.type: "enabled" with budget_tokens, here's your migration checklist:

Identify your models: Check if you're using Opus 4.6, Sonnet 4.6, or newer. Older models still require manual budgets.
Replace the thinking block: Change {"type": "enabled", "budget_tokens": N} to {"type": "adaptive"}.
Add an effort level (optional): If you had a high budget, start with effort: "high". For low budgets, try effort: "low".
Test and compare: Run your existing test suite and compare quality and latency.
Remove beta headers: Adaptive thinking doesn't require any beta headers.
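
Steps two and three of the checklist can be partly automated. The sketch below rewrites an existing request dictionary and suggests a starting effort level from the old budget. The function name migrate_thinking and both threshold values are illustrative assumptions, not documented cutoffs, so treat the suggestion as a starting point for the testing step.

```python
import copy


def migrate_thinking(params: dict, high_budget: int = 16000, low_budget: int = 4000):
    """Return (new_params, suggested_effort) for a messages.create kwargs dict.

    The budget thresholds are illustrative assumptions, not documented cutoffs.
    """
    new_params = copy.deepcopy(params)
    thinking = new_params.get("thinking") or {}
    if thinking.get("type") != "enabled":
        return new_params, None  # nothing to migrate
    budget = thinking.get("budget_tokens", 0)
    new_params["thinking"] = {"type": "adaptive"}
    if budget >= high_budget:
        effort = "high"
    elif budget <= low_budget:
        effort = "low"
    else:
        effort = "medium"
    return new_params, effort


old = {
    "model": "claude-opus-4-6",
    "max_tokens": 40000,
    "thinking": {"type": "enabled", "budget_tokens": 32000},
}
new, effort = migrate_thinking(old)
print(new["thinking"], effort)
```

The helper copies its input, so the original request dictionary stays available for side-by-side comparison runs.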

Best Practices

Start with effort: "high" for complex tasks, then dial down if you need lower latency or cost.
Monitor token usage: adaptive thinking can sometimes use more tokens than a fixed budget would allow on very complex requests. Set max_tokens high enough to cover both the thinking and the final answer.
Combine with prompt caching for multi-turn conversations to reduce costs further.
Use interleaved thinking to your advantage in agentic workflows—let Claude reason after each tool result.
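Test edge cases: Adaptive thinking may skip thinking for very simple requests at lower effort levels. If you need thinking on nearly every request, stay on effort: "high", where Claude almost always thinks, and make sure your code still handles a response with no thinking block.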
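
For the token-monitoring practice, one simple check is to compare each response's output token count against the max_tokens ceiling you set. The sketch below assumes thinking tokens are counted as output tokens and that a truncated response reports a stop_reason of "max_tokens". The function name check_output_budget and the 0.9 warning ratio are hypothetical, and the stand-in response object exists only so the snippet runs offline.

```python
from types import SimpleNamespace


def check_output_budget(response, max_tokens: int, warn_ratio: float = 0.9) -> str:
    """Classify how close a response came to its max_tokens ceiling."""
    used = response.usage.output_tokens
    if response.stop_reason == "max_tokens":
        return "truncated"
    if used >= warn_ratio * max_tokens:
        return "near_limit"
    return "ok"


# Stand-in for a real response object with only the fields the check reads.
fake_response = SimpleNamespace(
    stop_reason="end_turn",
    usage=SimpleNamespace(output_tokens=15000),
)
print(check_output_budget(fake_response, max_tokens=16000))
```

Logging this status per request makes it easier to see whether max_tokens needs raising before answers start getting cut off.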

Conclusion

Adaptive thinking represents a major step forward in how Claude handles reasoning. By removing the need for manual token budgets and introducing the effort parameter, Anthropic has made extended thinking both more powerful and easier to use.

Whether you're building a simple Q&A bot or a complex agentic system, adaptive thinking can help you get better results with less code. Migrate your workloads today to take advantage of this smarter, more flexible approach.

Key Takeaways

Adaptive thinking replaces manual budget_tokens with a dynamic system where Claude decides when and how much to think based on request complexity.
Supported on Claude Opus 4.7, Opus 4.6, Sonnet 4.6, and Mythos Preview—older models still require the manual approach.
The effort parameter (low, medium, high) gives you soft control over thinking allocation without hard budgets.
Interleaved thinking is automatic in adaptive mode, making it ideal for agentic workflows with multiple tool calls.
Migrate now if you're on Opus 4.6 or Sonnet 4.6—manual budget_tokens is deprecated and will be removed in a future release.