BeClaude
Guide | Beginner | Pricing | 2026-05-12

Mastering Adaptive Thinking in Claude: A Complete Guide to Dynamic Reasoning

Learn how to use Claude's adaptive thinking mode for dynamic, cost-efficient reasoning. Includes API setup, effort parameters, code examples, and best practices.

Quick Answer

This guide explains how to use Claude's adaptive thinking mode, which lets the model dynamically decide when and how much to think. You'll learn how to set it up via the API, control thinking depth with the effort parameter, and optimize for cost and latency.

Tags: adaptive thinking, extended thinking, Claude API, reasoning, agentic workflows

Introduction

Claude's extended thinking capability allows the model to "think" before responding, producing more accurate and nuanced answers—especially on complex tasks like math, coding, and multi-step reasoning. However, manually setting a fixed thinking token budget (budget_tokens) often leads to inefficiency: you either waste tokens on simple queries or under-think hard problems.

Adaptive thinking solves this by letting Claude dynamically determine when and how much to think, based on the complexity of each request. It is the recommended mode for Opus 4.6 and Sonnet 4.6, the only supported thinking mode on Claude Opus 4.7, and the default on Claude Mythos Preview.

In this guide, you'll learn:

  • How adaptive thinking works under the hood
  • How to enable it in your API calls
  • How to use the effort parameter to control thinking depth
  • Best practices for cost, latency, and performance
  • Migration tips if you're coming from budget_tokens

Let's dive in.

Why Adaptive Thinking?

Traditional extended thinking with a fixed budget_tokens has two major drawbacks:

  • Overthinking simple requests – You waste tokens and increase latency when the model thinks deeply about trivial questions.
  • Underthinking complex requests – A small budget may truncate reasoning on genuinely hard problems, degrading answer quality.

Adaptive thinking addresses both issues. Claude evaluates each request individually and decides:

  • Whether to think at all
  • How many thinking tokens to allocate
  • When to interleave thinking between tool calls (for agentic workflows)

This leads to better performance on bimodal workloads (a mix of simple and complex queries) and on long-horizon agentic workflows.

Supported Models

Adaptive thinking is available on:

Model | API Name | Notes
Claude Mythos Preview | claude-mythos-preview | Adaptive is default; thinking: disabled not supported
Claude Opus 4.7 | claude-opus-4-7 | Only supported mode; manual budget_tokens rejected
Claude Opus 4.6 | claude-opus-4-6 | budget_tokens deprecated; migrate to adaptive
Claude Sonnet 4.6 | claude-sonnet-4-6 | budget_tokens deprecated; migrate to adaptive

Warning: On Opus 4.6 and Sonnet 4.6, thinking.type: "enabled" with budget_tokens is deprecated and will be removed in a future release. Plan to migrate to adaptive thinking.

Older models (Sonnet 4.5, Opus 4.5, etc.) do not support adaptive thinking and still require thinking.type: "enabled" with budget_tokens.
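If one codebase targets several models, the table above can be encoded as a small lookup. The sketch below is only an illustration: the helper name thinking_config and the default budget of 2048 are assumptions, and the model sets simply mirror the table, so they need updating as the supported list changes.

```python
# Model capabilities copied from the table above; keep in sync as models change.
ADAPTIVE_ONLY = {"claude-mythos-preview", "claude-opus-4-7"}
ADAPTIVE_PREFERRED = {"claude-opus-4-6", "claude-sonnet-4-6"}


def thinking_config(model: str, legacy_budget_tokens: int = 2048) -> dict:
    """Return a `thinking` request parameter suited to the given model.

    Hypothetical helper: models listed above get adaptive thinking,
    anything else falls back to the fixed-budget form.
    """
    if model in ADAPTIVE_ONLY or model in ADAPTIVE_PREFERRED:
        return {"type": "adaptive"}
    # Older models: fixed-budget extended thinking.
    return {"type": "enabled", "budget_tokens": legacy_budget_tokens}


print(thinking_config("claude-opus-4-7"))
print(thinking_config("claude-sonnet-4-5", legacy_budget_tokens=4096))
```

The returned dictionary can be passed as the thinking argument of a messages.create call.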

How Adaptive Thinking Works

When you set thinking.type: "adaptive", Claude:

  • Evaluates request complexity – The model analyzes the prompt, tools, and context to gauge difficulty.
  • Decides whether to think – At default effort (high), Claude almost always thinks. At lower effort levels, it may skip thinking for simple problems.
  • Allocates thinking tokens dynamically – The model uses as many or as few tokens as needed. Thinking counts toward the request's max_tokens, which bounds total output (thinking plus response).
  • Enables interleaved thinking – Claude can think between tool calls, which is critical for agentic loops where the model needs to reason about tool outputs before acting again.

How to Use Adaptive Thinking

Basic Setup

Set thinking.type to "adaptive" in your API request. No budget_tokens is needed.

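Python example:
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={"type": "adaptive"},
    messages=[
        {"role": "user", "content": "Solve this equation: 3x + 7 = 22"}
    ],
)

# The response may begin with a thinking block, so print only the text blocks.
for block in response.content:
    if block.type == "text":
        print(block.text)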

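TypeScript example:
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 4096,
  thinking: { type: 'adaptive' },
  messages: [
    { role: 'user', content: 'Solve this equation: 3x + 7 = 22' }
  ]
});

// The response may begin with a thinking block, so print only the text blocks.
for (const block of response.content) {
  if (block.type === 'text') {
    console.log(block.text);
  }
}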
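Because Claude may emit a thinking block before the answer, it can help to separate the two when handling a response. The following is a minimal sketch, assuming content blocks expose a type attribute plus thinking or text fields as the SDK objects do. The helper name split_blocks is hypothetical, and fake_content is a stand-in for response.content so the snippet runs without an API call.

```python
from types import SimpleNamespace


def split_blocks(content):
    """Separate thinking text from answer text in a response's content list."""
    thinking, answer = [], []
    for block in content:
        if block.type == "thinking":
            thinking.append(block.thinking)
        elif block.type == "text":
            answer.append(block.text)
        # Other block types (e.g. tool_use, redacted_thinking) are ignored here.
    return "\n".join(thinking), "\n".join(answer)


# Stand-in for response.content, used only to demonstrate the helper.
fake_content = [
    SimpleNamespace(type="thinking", thinking="Subtract 7, then divide by 3."),
    SimpleNamespace(type="text", text="x = 5"),
]
thinking_text, answer_text = split_blocks(fake_content)
print(answer_text)  # x = 5
```

With a real response, call split_blocks(response.content) instead of passing the stand-in list.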

Using the Effort Parameter

The effort parameter gives you soft control over how much thinking Claude does. It's optional and defaults to high.

Effort Level | Behavior | Available On
max | Always thinks with no constraints on depth | Mythos Preview, Opus 4.7, Opus 4.6, Sonnet 4.6
xhigh | Always thinks deeply with extended exploration | Opus 4.7
high (default) | Always thinks; deep reasoning on complex tasks | All adaptive models
medium | Moderate thinking; may skip for very simple queries | All adaptive models
low | Minimizes thinking; skips for most queries | All adaptive models
Python example with effort:
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={"type": "adaptive"},
    # effort is passed in output_config, not inside the thinking object
    output_config={"effort": "medium"},  # or "low", "high", "xhigh", "max"
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
TypeScript example with effort:
const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 4096,
  thinking: { type: 'adaptive' },
  // effort is passed in output_config, not inside the thinking object
  output_config: { effort: 'medium' },
  messages: [
    { role: 'user', content: 'What is the capital of France?' }
  ]
});
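Since xhigh is listed only for Opus 4.7, an application that switches models may want to validate the effort level before sending a request. This is a sketch under that assumption: check_effort is a hypothetical helper, and the availability mapping is copied from the effort table above rather than queried from the API.

```python
# Availability copied from the effort table above.
ALL_ADAPTIVE = {
    "claude-mythos-preview",
    "claude-opus-4-7",
    "claude-opus-4-6",
    "claude-sonnet-4-6",
}
EFFORT_MODELS = {
    "low": ALL_ADAPTIVE,
    "medium": ALL_ADAPTIVE,
    "high": ALL_ADAPTIVE,
    "xhigh": {"claude-opus-4-7"},
    "max": ALL_ADAPTIVE,
}


def check_effort(model: str, effort: str = "high") -> str:
    """Return `effort` if the table lists it for `model`, else raise ValueError."""
    if effort not in EFFORT_MODELS:
        raise ValueError(f"unknown effort level: {effort!r}")
    if model not in EFFORT_MODELS[effort]:
        raise ValueError(f"effort {effort!r} is not listed for {model}")
    return effort


print(check_effort("claude-opus-4-7", "xhigh"))  # xhigh
```

Failing early in client code gives a clearer error message than waiting for the API to reject the request.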

Adaptive Thinking in Agentic Workflows

One of the biggest advantages of adaptive thinking is interleaved thinking: Claude can think between tool calls. This matters in agentic workflows where the model needs to:

  • Analyze tool outputs before deciding the next action
  • Plan multi-step sequences
  • Recover from errors or unexpected results

Example: A research agent that searches and summarizes
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=8192,
    thinking={"type": "adaptive"},
    tools=[
        {
            "name": "web_search",
            "description": "Search the web for information",
            "input_schema": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"}
                },
                "required": ["query"]
            }
        }
    ],
    messages=[
        {"role": "user", "content": "Research the latest AI breakthroughs in 2025 and summarize them in 3 bullet points."}
    ]
)

With adaptive thinking, Claude will:

  • Think about how to break down the research task
  • Call the search tool
  • Think again about the results
  • Call again if needed
  • Finally compose the summary
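The steps above only happen if your code runs the tool and sends the result back. The loop below is a rough sketch of that plumbing, not a definitive implementation. run_agent and handlers are hypothetical names, create is assumed to behave like client.messages.create, and fake_create is a scripted stand-in so the example runs without network access. The assistant turn is sent back unmodified so that any thinking blocks stay attached to the tool call.

```python
from types import SimpleNamespace


def run_agent(create, tools, handlers, prompt, model="claude-opus-4-7", max_turns=5):
    """Drive a tool-use loop until the model stops requesting tools."""
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_turns):
        response = create(
            model=model,
            max_tokens=8192,
            thinking={"type": "adaptive"},
            tools=tools,
            messages=messages,
        )
        if response.stop_reason != "tool_use":
            return response
        # Echo the assistant turn back unchanged, thinking blocks included.
        messages.append({"role": "assistant", "content": response.content})
        results = []
        for block in response.content:
            if block.type == "tool_use":
                output = handlers[block.name](**block.input)
                results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": str(output),
                })
        messages.append({"role": "user", "content": results})
    raise RuntimeError("agent did not finish within max_turns")


def fake_create(**kwargs):
    """Scripted stand-in: asks for one search, then answers."""
    if len(kwargs["messages"]) == 1:
        return SimpleNamespace(stop_reason="tool_use", content=[
            SimpleNamespace(type="thinking", thinking="I should search first."),
            SimpleNamespace(type="tool_use", id="toolu_1", name="web_search",
                            input={"query": "AI 2025"}),
        ])
    return SimpleNamespace(stop_reason="end_turn", content=[
        SimpleNamespace(type="text", text="Summary goes here."),
    ])


final = run_agent(
    fake_create,
    tools=[],
    handlers={"web_search": lambda query: f"results for {query}"},
    prompt="Research AI.",
)
print(final.content[0].text)  # Summary goes here.
```

In real use, pass client.messages.create as create, the tool definitions from the example above as tools, and a handlers entry that actually performs the search.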

Migrating from Fixed Budget Tokens

If you're currently using thinking.type: "enabled" with budget_tokens, here's how to migrate:

Before (deprecated)

thinking={
    "type": "enabled",
    "budget_tokens": 2048
}

After (recommended)

thinking={
    "type": "adaptive"
},
output_config={
    "effort": "high"  # optional, defaults to high
}

When to keep the old approach

  • Predictable latency: If you need guaranteed response times, fixed budget_tokens still works on Opus 4.6 and Sonnet 4.6 (but is deprecated).
  • Cost control: If you must cap thinking token usage strictly, adaptive thinking does not support hard caps. Use budget_tokens as a temporary measure while you evaluate adaptive.
Note: On Opus 4.7, thinking.type: "enabled" is rejected with a 400 error. You must use adaptive thinking.
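If request parameters are built in many places, the switch can be applied in one spot. The sketch below assumes requests are assembled as keyword-argument dictionaries before being passed to messages.create. migrate_request is a hypothetical helper that only rewrites the thinking entry and leaves effort at its default.

```python
def migrate_request(kwargs: dict) -> dict:
    """Return a copy of request kwargs with fixed-budget thinking made adaptive."""
    migrated = dict(kwargs)
    thinking = kwargs.get("thinking")
    if thinking and thinking.get("type") == "enabled":
        # budget_tokens is dropped; adaptive thinking takes no budget.
        migrated["thinking"] = {"type": "adaptive"}
    return migrated


old_request = {
    "model": "claude-opus-4-6",
    "max_tokens": 4096,
    "thinking": {"type": "enabled", "budget_tokens": 2048},
}
new_request = migrate_request(old_request)
print(new_request["thinking"])  # {'type': 'adaptive'}
```

The original dictionary is left untouched, which makes it easy to send both variants during a side-by-side comparison.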

Best Practices

1. Start with high effort

The default high effort works well for most use cases. It provides deep reasoning on complex tasks while still being efficient.

2. Use medium or low for cost-sensitive apps

If you're building a high-volume chatbot that handles mostly simple queries, medium or low can significantly reduce token usage, and Claude can still choose to think when a harder question arrives. Check answer quality on your own traffic before committing to a lower level.

3. Use max or xhigh for critical reasoning

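For tasks like mathematical proofs, code generation with complex logic, or multi-step planning, max (or xhigh on Opus 4.7) gives Claude the most room to explore alternative reasoning paths. Because effort is soft guidance, it does not guarantee exhaustive exploration.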

4. Combine with tool use for agents

Adaptive thinking shines in agentic workflows. Enable interleaved thinking by setting thinking.type: "adaptive" alongside your tool definitions.

5. Monitor token usage

Even though adaptive thinking is more efficient, it can still use many tokens on hard problems. Use the usage field in the API response to track consumption. Thinking tokens are billed as output tokens, so they are counted in output_tokens rather than reported in a separate field:

print(response.usage.input_tokens, response.usage.output_tokens)

6. Test with your workload

Every application is different. Run A/B tests comparing adaptive thinking (with various effort levels) against your current budget_tokens configuration to find the best balance of cost, latency, and quality.
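One lightweight way to compare configurations is to log output tokens and latency per request and average them by configuration. The sketch below assumes you record those values yourself, for example from response.usage.output_tokens and a timer around each call. summarize_runs is a hypothetical helper and the sample numbers are invented purely to show the output shape.

```python
from collections import defaultdict
from statistics import mean


def summarize_runs(runs):
    """Average output tokens and latency per configuration label.

    `runs` is an iterable of (label, output_tokens, latency_seconds) tuples.
    """
    grouped = defaultdict(list)
    for label, output_tokens, latency in runs:
        grouped[label].append((output_tokens, latency))
    return {
        label: {
            "n": len(rows),
            "mean_output_tokens": mean(tokens for tokens, _ in rows),
            "mean_latency_s": mean(seconds for _, seconds in rows),
        }
        for label, rows in grouped.items()
    }


# Invented sample measurements, for illustration only.
sample_runs = [
    ("adaptive-medium", 100, 1.0),
    ("adaptive-medium", 300, 2.0),
    ("budget-2048", 500, 3.0),
]
summary = summarize_runs(sample_runs)
print(summary["adaptive-medium"])
```

Averages hide tail behavior, so it is worth also looking at the slowest and most expensive requests, and at answer quality, before choosing a configuration.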

Limitations

  • No hard thinking-token cap: Adaptive thinking does not accept a thinking-specific budget. max_tokens still bounds total output (thinking plus response); if you need a strict cap on thinking alone, consider using budget_tokens on models that still support it (Opus 4.6, Sonnet 4.6).
  • Not available on older models: Sonnet 4.5, Opus 4.5, and earlier require the old thinking.type: "enabled" approach.
  • Effort is soft guidance: The effort parameter is not a hard constraint. Claude may still think more or less than expected.

Key Takeaways

  • Adaptive thinking lets Claude dynamically decide when and how much to think, replacing the old fixed budget_tokens approach.
  • Supported on Claude Mythos Preview, Opus 4.7, Opus 4.6, and Sonnet 4.6. On Opus 4.7, it's the only option.
  • Use the effort parameter (low, medium, high, xhigh, max) to guide thinking depth without hard caps.
  • Interleaved thinking enables Claude to reason between tool calls, making it ideal for agentic workflows.
  • Migrate now if you're using budget_tokens on Opus 4.6 or Sonnet 4.6: the old approach is deprecated and will be removed.

Adaptive thinking is a powerful tool for getting the most out of Claude's reasoning capabilities while keeping costs and latency under control. Start experimenting today!