Guide | Beginner | Pricing | 2026-05-12

Mastering Adaptive Thinking in Claude: A Complete Guide to Dynamic Reasoning

Learn how to use Claude's adaptive thinking mode for dynamic, cost-efficient reasoning. Includes API setup, effort parameters, code examples, and best practices.

Quick Answer

This guide explains how to use Claude's adaptive thinking mode, which lets the model dynamically decide when and how much to think. You'll learn how to set it up via the API, control thinking depth with the effort parameter, and optimize for cost and latency.

Tags: adaptive thinking, extended thinking, Claude API, reasoning, agentic workflows

Introduction

Claude's extended thinking capability allows the model to "think" before responding, producing more accurate and nuanced answers—especially on complex tasks like math, coding, and multi-step reasoning. However, manually setting a fixed thinking token budget (budget_tokens) often leads to inefficiency: you either waste tokens on simple queries or under-think hard problems.

Adaptive thinking solves this by letting Claude dynamically determine when and how much to think, based on the complexity of each request. It is now the recommended mode for Claude Opus 4.7, Opus 4.6, and Sonnet 4.6, and is the default on Claude Mythos Preview.

In this guide, you'll learn:

How adaptive thinking works under the hood
How to enable it in your API calls
How to use the effort parameter to control thinking depth
Best practices for cost, latency, and performance
Migration tips if you're coming from budget_tokens

Let's dive in.

Why Adaptive Thinking?

Traditional extended thinking with a fixed budget_tokens has two major drawbacks:

Overthinking simple requests – You waste tokens and increase latency when the model thinks deeply about trivial questions.
Underthinking complex requests – A small budget may truncate reasoning on genuinely hard problems, degrading answer quality.

Adaptive thinking addresses both issues. Claude evaluates each request individually and decides:

Whether to think at all
How many thinking tokens to allocate
When to interleave thinking between tool calls (for agentic workflows)

This leads to better performance on bimodal tasks (mixing simple and complex queries) and long-horizon agentic workflows.

Supported Models

Adaptive thinking is available on:

Model	API Name	Notes
Claude Mythos Preview	`claude-mythos-preview`	Adaptive is default; `thinking: disabled` not supported
Claude Opus 4.7	`claude-opus-4-7`	Only supported mode; manual `budget_tokens` rejected
Claude Opus 4.6	`claude-opus-4-6`	`budget_tokens` deprecated; migrate to adaptive
Claude Sonnet 4.6	`claude-sonnet-4-6`	`budget_tokens` deprecated; migrate to adaptive

Warning: On Opus 4.6 and Sonnet 4.6, thinking.type: "enabled" with budget_tokens is deprecated and will be removed in a future release. Plan to migrate to adaptive thinking.

Older models (Sonnet 4.5, Opus 4.5, etc.) do not support adaptive thinking and still require thinking.type: "enabled" with budget_tokens.
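
If one codebase targets several models, the table above can be folded into a small lookup. The following is a minimal sketch: the helper name thinking_config, the default budget of 2048, and the placeholder name "an-older-model" are illustrative assumptions, while the four adaptive model names are copied from the table.

```python
# API names copied from the table above; the helper itself is hypothetical.
ADAPTIVE_MODELS = {
    "claude-mythos-preview",
    "claude-opus-4-7",
    "claude-opus-4-6",
    "claude-sonnet-4-6",
}


def thinking_config(model: str, legacy_budget: int = 2048) -> dict:
    """Return a `thinking` payload suited to the given model."""
    if model in ADAPTIVE_MODELS:
        return {"type": "adaptive"}
    # Older models still need an explicit thinking budget.
    return {"type": "enabled", "budget_tokens": legacy_budget}


print(thinking_config("claude-opus-4-7"))
# "an-older-model" is a placeholder, not a real API name.
print(thinking_config("an-older-model"))
```

Keeping the choice in one function means a later model upgrade only touches the set of names, not every call site.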

How Adaptive Thinking Works

When you set thinking.type: "adaptive", Claude:

Evaluates request complexity – The model analyzes the prompt, tools, and context to gauge difficulty.
Decides whether to think – At default effort (high), Claude almost always thinks. At lower effort levels, it may skip thinking for simple problems.
Allocates thinking tokens dynamically – The model uses as many or as few tokens as needed. Thinking counts toward output, so max_tokens still bounds the combined length of thinking and response.
Enables interleaved thinking – Claude can think between tool calls, which is critical for agentic loops where the model needs to reason about tool outputs before acting again.
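
Because Claude decides per request whether to think, a response may or may not contain thinking blocks. One way to check is to look at the block types in the response content. The sketch below assumes blocks expose a type of "thinking" (with a thinking attribute) or "text" (with a text attribute); it uses stand-in objects so it runs without an API call, and split_blocks is a hypothetical helper name.

```python
from types import SimpleNamespace


def split_blocks(content):
    """Separate thinking text from answer text in a response's content list."""
    thinking = [b.thinking for b in content if b.type == "thinking"]
    answer = [b.text for b in content if b.type == "text"]
    return thinking, answer


# Stand-in for response.content so the sketch runs offline.
fake_content = [
    SimpleNamespace(type="thinking", thinking="Subtract 7, then divide by 3."),
    SimpleNamespace(type="text", text="x = 5"),
]

thinking, answer = split_blocks(fake_content)
print("thought:", bool(thinking))
print("answer:", " ".join(answer))
```

With a real response, pass response.content instead of fake_content. An empty thinking list means Claude skipped thinking for that request.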

How to Use Adaptive Thinking

Basic Setup

Set thinking.type to "adaptive" in your API request. No budget_tokens is needed.

Python example:

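import anthropic

# Reads the API key from the ANTHROPIC_API_KEY environment variable.
client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={"type": "adaptive"},
    messages=[
        {"role": "user", "content": "Solve this equation: 3x + 7 = 22"}
    ]
)

# The response may begin with a thinking block, so print only the text blocks.
for block in response.content:
    if block.type == "text":
        print(block.text)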

TypeScript example:

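import Anthropic from '@anthropic-ai/sdk';

// Reads the API key from the ANTHROPIC_API_KEY environment variable.
const client = new Anthropic();

const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 4096,
  thinking: { type: 'adaptive' },
  messages: [
    { role: 'user', content: 'Solve this equation: 3x + 7 = 22' }
  ]
});

// The response may begin with a thinking block, so print only the text blocks.
for (const block of response.content) {
  if (block.type === 'text') console.log(block.text);
}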

Using the Effort Parameter

The effort parameter gives you soft control over how much thinking Claude does. It's optional and defaults to high.

Effort Level	Behavior	Available On
`max`	Always thinks with no constraints on depth	Mythos Preview, Opus 4.7, Opus 4.6, Sonnet 4.6
`xhigh`	Always thinks deeply with extended exploration	Opus 4.7
`high` (default)	Always thinks; deep reasoning on complex tasks	All adaptive models
`medium`	Moderate thinking; may skip for very simple queries	All adaptive models
`low`	Minimizes thinking; skips for most queries	All adaptive models
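
Since not every effort level is listed for every model, it can help to validate the combination before sending a request. This is a minimal sketch that only transcribes the table above into a lookup; check_effort is a hypothetical helper, not part of any SDK.

```python
ADAPTIVE = {
    "claude-mythos-preview",
    "claude-opus-4-7",
    "claude-opus-4-6",
    "claude-sonnet-4-6",
}

# Effort level -> models listed for it in the table above.
EFFORT_SUPPORT = {
    "max": ADAPTIVE,
    "xhigh": {"claude-opus-4-7"},
    "high": ADAPTIVE,
    "medium": ADAPTIVE,
    "low": ADAPTIVE,
}


def check_effort(model: str, effort: str = "high") -> str:
    """Return `effort` unchanged if the table lists it for `model`, else raise."""
    if effort not in EFFORT_SUPPORT:
        raise ValueError(f"unknown effort level: {effort!r}")
    if model not in EFFORT_SUPPORT[effort]:
        raise ValueError(f"{effort!r} is not listed for {model}")
    return effort


print(check_effort("claude-opus-4-7", "xhigh"))
```

Failing locally with a clear message is usually easier to debug than an API error returned after the request has been sent.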

Python example with effort:

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={"type": "adaptive"},
    # Effort is passed in output_config, not inside the thinking object.
    output_config={"effort": "medium"},  # or "low", "high", "xhigh", "max"
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)

TypeScript example with effort:

const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 4096,
  thinking: { type: 'adaptive' },
  // Effort is passed in output_config, not inside the thinking object.
  output_config: { effort: 'medium' },
  messages: [
    { role: 'user', content: 'What is the capital of France?' }
  ]
});

Adaptive Thinking in Agentic Workflows

One of the biggest advantages of adaptive thinking is interleaved thinking: Claude can think between tool calls. This matters for agentic workflows where the model needs to:

Analyze tool outputs before deciding the next action
Plan multi-step sequences
Recover from errors or unexpected results

Example: A research agent that searches and summarizes

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=8192,
    thinking={"type": "adaptive"},
    tools=[
        {
            "name": "web_search",
            "description": "Search the web for information",
            "input_schema": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"}
                },
                "required": ["query"]
            }
        }
    ],
    messages=[
        {"role": "user", "content": "Research the latest AI breakthroughs in 2025 and summarize them in 3 bullet points."}
    ]
)

With adaptive thinking, Claude will:

Think about how to break down the research task
Call the search tool
Think again about the results
Call again if needed
Finally compose the summary
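
The request above is only the first turn; the steps just listed happen across several calls, with your code running the tool in between. The loop below is a minimal sketch of that cycle. It assumes create behaves like client.messages.create and that run_tool is your own callback returning a string; both names are hypothetical, and the scripted responses stand in for the API so the sketch runs offline.

```python
from types import SimpleNamespace


def run_agent(create, run_tool, messages, max_turns=5, **request):
    """Loop until Claude stops asking for tools or max_turns is reached."""
    messages = list(messages)
    for _ in range(max_turns):
        response = create(messages=messages, **request)
        if response.stop_reason != "tool_use":
            return response
        # Send the assistant turn back unchanged so thinking blocks are kept.
        messages.append({"role": "assistant", "content": response.content})
        results = [
            {
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": run_tool(block.name, block.input),
            }
            for block in response.content
            if block.type == "tool_use"
        ]
        messages.append({"role": "user", "content": results})
    raise RuntimeError("agent did not finish within max_turns")


# Scripted stand-in for the API: one tool call, then a final answer.
_script = iter([
    SimpleNamespace(stop_reason="tool_use", content=[
        SimpleNamespace(type="thinking", thinking="I should search first."),
        SimpleNamespace(type="tool_use", id="toolu_1", name="web_search",
                        input={"query": "AI breakthroughs 2025"}),
    ]),
    SimpleNamespace(stop_reason="end_turn", content=[
        SimpleNamespace(type="text", text="Summary goes here."),
    ]),
])

final = run_agent(
    create=lambda **kwargs: next(_script),
    run_tool=lambda name, args: f"results for {args['query']}",
    messages=[{"role": "user", "content": "Research AI breakthroughs."}],
)
print(final.content[0].text)
```

With a real client, pass create=client.messages.create along with model, max_tokens, thinking, and tools as keyword arguments. The max_turns guard keeps a misbehaving loop from running indefinitely.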

Migrating from Fixed Budget Tokens

If you're currently using thinking.type: "enabled" with budget_tokens, here's how to migrate:

Before (deprecated)

thinking={
    "type": "enabled",
    "budget_tokens": 2048
}

After (recommended)

thinking={"type": "adaptive"},
output_config={"effort": "high"}  # optional, defaults to high

When to keep the old approach

Predictable latency: If you need guaranteed response times, fixed budget_tokens still works on Opus 4.6 and Sonnet 4.6 (but is deprecated).
Cost control: If you must cap thinking token usage strictly, adaptive thinking does not support hard caps. Use budget_tokens as a temporary measure while you evaluate adaptive.

Note: On Opus 4.7, thinking.type: "enabled" is rejected with a 400 error. You must use adaptive thinking.
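
If requests are assembled as dictionaries, the migration can be applied mechanically. The sketch below is one possible approach, not an official tool: migrate_thinking is a hypothetical helper that swaps a fixed-budget thinking config for adaptive and leaves every other field alone.

```python
def migrate_thinking(request: dict) -> dict:
    """Return a copy of `request` with a fixed thinking budget swapped for adaptive."""
    migrated = dict(request)
    thinking = request.get("thinking") or {}
    if thinking.get("type") == "enabled":
        # budget_tokens is dropped: adaptive mode takes no budget.
        migrated["thinking"] = {"type": "adaptive"}
    return migrated


old = {
    "model": "claude-opus-4-6",
    "max_tokens": 4096,
    "thinking": {"type": "enabled", "budget_tokens": 2048},
}
print(migrate_thinking(old)["thinking"])
```

The function returns a new dictionary rather than editing in place, so the old and new configurations can be compared side by side during rollout.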

Best Practices

1. Start with `high` effort

The default high effort works well for most use cases. It provides deep reasoning on complex tasks while still being efficient.

2. Use `medium` or `low` for cost-sensitive apps

If you're building a high-volume chatbot that handles mostly simple queries, medium or low can significantly reduce token usage. Because effort is soft guidance, Claude can still think on the occasional complex question, but confirm on your own traffic that answer quality holds.

3. Use `max` or `xhigh` for critical reasoning

For tasks like mathematical proofs, code generation with complex logic, or multi-step planning, max (or xhigh on Opus 4.7) gives Claude the most room to explore alternative reasoning paths, at the cost of more tokens and higher latency.

4. Combine with tool use for agents

Adaptive thinking shines in agentic workflows. Enable interleaved thinking by setting thinking.type: "adaptive" alongside your tool definitions.

5. Monitor token usage

Even though adaptive thinking is more efficient, it can still use many tokens on hard problems. Use the usage field in the API response to track consumption. Thinking tokens are billed as output tokens, so they are included in output_tokens rather than reported in a separate field:

print(response.usage.input_tokens, response.usage.output_tokens)
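
A single response says little about spend; totals across many calls are more useful. The following sketch keeps running totals of the input and output token counts from each response's usage object. UsageTotals is a hypothetical helper, and the stand-in usage objects carry made-up numbers so the example runs offline.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class UsageTotals:
    calls: int = 0
    input_tokens: int = 0
    output_tokens: int = 0

    def add(self, usage) -> None:
        """Fold one response's usage object into the running totals."""
        self.calls += 1
        self.input_tokens += usage.input_tokens
        self.output_tokens += usage.output_tokens


totals = UsageTotals()
# Stand-ins for response.usage; the numbers are invented for illustration.
totals.add(SimpleNamespace(input_tokens=25, output_tokens=150))
totals.add(SimpleNamespace(input_tokens=40, output_tokens=900))
print(totals.calls, totals.input_tokens, totals.output_tokens)
```

In real code, call totals.add(response.usage) after each request and log the totals per effort level or per endpoint.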

6. Test with your workload

Every application is different. Run A/B tests comparing adaptive thinking (with various effort levels) against your current budget_tokens configuration to find the best balance of cost, latency, and quality.
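
A comparison like this can start as a small harness that runs the same prompts under each configuration and records output tokens and latency. The sketch below makes several assumptions: call(prompt, config) is a hypothetical wrapper around your own API request, the stub returns made-up token counts so the example runs offline, and answer quality is not measured and would need a separate evaluation.

```python
import time
from statistics import mean
from types import SimpleNamespace


def compare_configs(call, prompts, configs):
    """Run every prompt under every config; report mean output tokens and latency."""
    report = {}
    for name, config in configs.items():
        tokens, seconds = [], []
        for prompt in prompts:
            start = time.perf_counter()
            response = call(prompt, config)
            seconds.append(time.perf_counter() - start)
            tokens.append(response.usage.output_tokens)
        report[name] = {
            "mean_output_tokens": mean(tokens),
            "mean_seconds": mean(seconds),
        }
    return report


def stub_call(prompt, config):
    # Made-up numbers: pretend the fixed budget always spends more tokens.
    used = 300 if config["thinking"]["type"] == "enabled" else 120
    return SimpleNamespace(usage=SimpleNamespace(output_tokens=used))


configs = {
    "adaptive": {"thinking": {"type": "adaptive"}},
    "fixed-2048": {"thinking": {"type": "enabled", "budget_tokens": 2048}},
}
report = compare_configs(stub_call, ["What is 2 + 2?", "Prove it."], configs)
for name, stats in report.items():
    print(name, stats["mean_output_tokens"])
```

Replace stub_call with a function that sends the prompt using the given config and returns the real response. Sample prompts from production traffic so the mix of simple and hard queries is realistic.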

Limitations

No hard thinking-token cap: Adaptive thinking does not accept a dedicated thinking budget. max_tokens still bounds total output (thinking plus response text), but if you need a strict cap on thinking alone, consider using budget_tokens on models that still support it (Opus 4.6, Sonnet 4.6).
Not available on older models: Sonnet 4.5, Opus 4.5, and earlier require the old thinking.type: "enabled" approach.
Effort is soft guidance: The effort parameter is not a hard constraint. Claude may still think more or less than expected.

Key Takeaways

Adaptive thinking lets Claude dynamically decide when and how much to think, replacing the old fixed budget_tokens approach.
Supported on Claude Mythos Preview, Opus 4.7, Opus 4.6, and Sonnet 4.6. On Opus 4.7, it's the only option.
Use the effort parameter (low, medium, high, xhigh, max) to guide thinking depth without hard caps.
Interleaved thinking enables Claude to reason between tool calls, making it ideal for agentic workflows.
Migrate now if you're using budget_tokens on Opus 4.6 or Sonnet 4.6—the old approach is deprecated and will be removed.

Adaptive thinking is a powerful tool for getting the most out of Claude's reasoning capabilities while keeping costs and latency under control. Start experimenting today!