Claude Guide · Beginner · 2026-05-06

Mastering Adaptive Thinking in Claude: Smarter Reasoning Without Manual Budgets

Learn how to use Claude's adaptive thinking mode to dynamically allocate reasoning effort based on task complexity. Includes code examples, effort parameters, and migration tips.

Quick Answer

Adaptive thinking lets Claude automatically decide when and how much to use extended reasoning, replacing manual token budgets. You set thinking.type to 'adaptive' and optionally use the effort parameter to guide depth. It's ideal for bimodal tasks and agentic workflows.

Tags: adaptive thinking, Claude API, extended thinking, reasoning, agentic workflows

Introduction

Claude's extended thinking capability has been a game-changer for complex reasoning tasks. But until now, you had to manually set a budget_tokens value—a fixed number of tokens Claude could use for internal reasoning before generating a response. This worked, but it was rigid: simple requests got more thinking than needed, and complex ones could run out of room.

Adaptive thinking changes that. Instead of you guessing how much thinking a task requires, Claude evaluates each request and dynamically decides whether—and how much—to think. It's smarter, more efficient, and now the default (or only) mode on Claude Opus 4.7, Claude Opus 4.6, Claude Sonnet 4.6, and Claude Mythos Preview.

In this guide, you'll learn:

  • What adaptive thinking is and how it differs from manual extended thinking
  • Which models support it
  • How to use it with code examples (Python and TypeScript)
  • How to control thinking depth with the effort parameter
  • Best practices and migration tips
Let's dive in.

What Is Adaptive Thinking?

Adaptive thinking is a new mode for Claude's extended reasoning. Instead of forcing Claude to use a fixed number of thinking tokens on every request, you let Claude decide:

  • Whether to think at all (for simple questions, it may skip thinking)
  • How much to think (for complex tasks, it can use more tokens)
This is especially powerful for bimodal workloads—where some requests are trivial and others require deep reasoning—and for long-horizon agentic workflows, where Claude needs to reason between multiple tool calls.

Adaptive thinking also enables interleaved thinking: Claude can think between tool calls, not just before the first response. This makes it ideal for agents that need to plan, act, observe, and re-plan.

Note: Adaptive thinking is eligible for Zero Data Retention (ZDR). If your organization has a ZDR arrangement, data sent through this feature is not stored after the API response is returned.

Supported Models

| Model | Adaptive Thinking Support | Notes |
|---|---|---|
| Claude Mythos Preview (claude-mythos-preview) | Default (auto-applies when thinking is unset) | thinking: {type: "disabled"} is not supported |
| Claude Opus 4.7 (claude-opus-4-7) | Only supported mode | Manual thinking: {type: "enabled", budget_tokens: N} is rejected with a 400 error |
| Claude Opus 4.6 (claude-opus-4-6) | Supported | Manual mode deprecated; migrate to adaptive |
| Claude Sonnet 4.6 (claude-sonnet-4-6) | Supported | Manual mode deprecated; migrate to adaptive |
| Older models (Sonnet 4.5, Opus 4.5, etc.) | Not supported | Must use thinking.type: "enabled" with budget_tokens |
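If your codebase targets several of these models at once, the support matrix above can be captured in a small helper that picks the right thinking configuration per model. This is a minimal sketch: the model IDs come from the table, but the `thinking_config` helper and its 8000-token fallback budget are illustrative choices, not part of the SDK.

```python
# Hypothetical helper: choose a thinking config based on the support
# table above. The 8000-token fallback budget is an arbitrary example.

ADAPTIVE_MODELS = {
    "claude-mythos-preview",
    "claude-opus-4-7",
    "claude-opus-4-6",
    "claude-sonnet-4-6",
}

def thinking_config(model: str, effort: str = "high", budget_tokens: int = 8000) -> dict:
    """Return a `thinking` block suited to the given model."""
    if model in ADAPTIVE_MODELS:
        return {"type": "adaptive", "effort": effort}
    # Older models (Sonnet 4.5, Opus 4.5, ...) still need a manual budget.
    return {"type": "enabled", "budget_tokens": budget_tokens}
```

You can then pass `thinking=thinking_config(model)` at every call site and change models without touching request code.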

How to Use Adaptive Thinking

Using adaptive thinking is straightforward. Set thinking.type to "adaptive" in your API request. No beta header is required.

Python Example

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    messages=[
        {
            "role": "user",
            "content": "Explain why the sum of two even numbers is always even."
        }
    ]
)

for block in response.content:
    if block.type == "thinking":
        print(f"\nThinking: {block.thinking}")
    elif block.type == "text":
        print(f"\nResponse: {block.text}")

TypeScript Example

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 16000,
  thinking: { type: 'adaptive' },
  messages: [
    {
      role: 'user',
      content: 'Explain why the sum of two even numbers is always even.'
    }
  ]
});

for (const block of response.content) {
  if (block.type === 'thinking') {
    console.log(`\nThinking: ${block.thinking}`);
  } else if (block.type === 'text') {
    console.log(`\nResponse: ${block.text}`);
  }
}

Controlling Thinking Depth with the Effort Parameter

Adaptive thinking alone lets Claude decide how much to think. But you can provide soft guidance using the effort parameter. This is especially useful when you want to:

  • Save costs on simple tasks (low effort)
  • Maximize reasoning on critical problems (high effort)
  • Balance between speed and depth

Effort Levels

| Effort Level | Behavior |
|---|---|
| low | Claude may skip thinking for simple problems; uses minimal thinking tokens |
| medium | Balanced approach; thinks when beneficial but conserves tokens |
| high | (Default) Claude almost always thinks; uses substantial tokens for complex tasks |

Example with Effort

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={
        "type": "adaptive",
        "effort": "medium"  # or "low", "high"
    },
    messages=[
        {
            "role": "user",
            "content": "Design a scalable architecture for a real-time chat application serving 10 million users."
        }
    ]
)
Tip: Start with effort: "high" for complex reasoning tasks. Use "medium" or "low" for high-throughput applications where latency matters.

Adaptive Thinking in Agentic Workflows

One of the biggest advantages of adaptive thinking is interleaved thinking—Claude can think between tool calls. This is critical for agents that need to:

  • Plan a sequence of actions
  • Execute a tool call
  • Observe the result
  • Re-think and adjust the plan
  • Execute the next tool call
Without adaptive thinking, Claude would exhaust its thinking budget upfront and might not have tokens left for intermediate reasoning. With adaptive thinking, it dynamically allocates thinking tokens throughout the conversation.

Example: Agent with Tool Calls

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=32000,
    thinking={"type": "adaptive", "effort": "high"},
    tools=[
        {
            "name": "search_database",
            "description": "Search the internal database for customer records",
            "input_schema": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"}
                },
                "required": ["query"]
            }
        },
        {
            "name": "send_email",
            "description": "Send an email to a customer",
            "input_schema": {
                "type": "object",
                "properties": {
                    "to": {"type": "string"},
                    "subject": {"type": "string"},
                    "body": {"type": "string"}
                },
                "required": ["to", "subject", "body"]
            }
        }
    ],
    messages=[
        {
            "role": "user",
            "content": "Find the customer with the highest outstanding balance and send them a payment reminder."
        }
    ]
)

In this scenario, Claude can:

  • Think about how to approach the task
  • Call search_database to find the customer
  • Think about the results and decide on an appropriate email tone
  • Call send_email with the right content
All with dynamic thinking allocation between each step.
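The tool-handling half of that loop lives in your code: when Claude emits a tool_use block, you run the tool and send back a tool_result block in the next user message. Here is a simplified sketch of that dispatch step, operating on dict-shaped content blocks; the `search_database` and `send_email` handlers are stubs for illustration, not real implementations.

```python
# Simplified sketch of the agent loop's tool-dispatch step.
# The handlers below are stand-ins; a real agent would query a database
# and send mail. Content blocks are modeled here as plain dicts.

def search_database(query: str) -> str:
    return f"1 record found for '{query}'"  # stubbed result

def send_email(to: str, subject: str, body: str) -> str:
    return f"email queued for {to}"  # stubbed result

HANDLERS = {"search_database": search_database, "send_email": send_email}

def handle_tool_use(block: dict) -> dict:
    """Turn a tool_use content block into the tool_result block
    that goes back to Claude in the next user message."""
    result = HANDLERS[block["name"]](**block["input"])
    return {
        "type": "tool_result",
        "tool_use_id": block["id"],
        "content": result,
    }
```

A full agent wraps this in a loop: call the API, dispatch any tool_use blocks, append the tool_result messages, and repeat until Claude stops requesting tools. Adaptive thinking lets Claude reason before each iteration.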

Migration Guide: From Manual to Adaptive

If you're currently using thinking.type: "enabled" with budget_tokens, here's how to migrate:

Step 1: Identify affected models

Check which models you're using. If you're on Opus 4.6 or Sonnet 4.6, you can migrate now. If you're on Opus 4.7, you must migrate (manual mode is rejected).

Step 2: Replace the thinking block

Before (manual):
{
  "thinking": {
    "type": "enabled",
    "budget_tokens": 8000
  }
}
After (adaptive):
{
  "thinking": {
    "type": "adaptive",
    "effort": "high"
  }
}
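If you have many call sites, the swap is mechanical enough to script. The helper below is a hypothetical convenience, not an official migration tool; it defaults to "high" effort, matching the recommendation for tasks that previously used a manual budget.

```python
# Hypothetical migration helper: rewrite a manual thinking block into
# an adaptive one. Effort defaults to "high", the recommended starting
# point for tasks that previously used a manual budget.

def migrate_thinking(thinking: dict, effort: str = "high") -> dict:
    if thinking.get("type") == "enabled":
        return {"type": "adaptive", "effort": effort}
    return thinking  # already adaptive (or disabled): leave untouched
```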

Step 3: Adjust max_tokens

With manual mode, max_tokens needed to be greater than budget_tokens. With adaptive mode, max_tokens still limits the total output (thinking + visible text), but you don't need to reserve a separate budget. You can often reduce max_tokens for simpler tasks.

Step 4: Test and compare

Run your existing test suite with adaptive mode. For most workloads, you'll see:

  • Equal or better reasoning quality
  • Lower token usage for simple requests
  • More consistent performance on complex tasks
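A concrete way to compare runs is to diff the usage counters returned with each response. This sketch assumes each `usage` is a dict with `input_tokens` and `output_tokens` fields; the `usage_delta` helper itself is an illustration, not part of the SDK.

```python
# Sketch: compare total token usage between a manual-mode run and an
# adaptive-mode run of the same test case. Assumes each `usage` dict
# carries input_tokens and output_tokens counts.

def usage_delta(manual: dict, adaptive: dict) -> dict:
    total = lambda u: u["input_tokens"] + u["output_tokens"]
    saved = total(manual) - total(adaptive)
    return {
        "manual_total": total(manual),
        "adaptive_total": total(adaptive),
        "tokens_saved": saved,
        "pct_saved": round(100 * saved / total(manual), 1),
    }
```

Running this over your test suite makes the bimodal effect visible: large savings on simple cases, roughly flat usage on uniformly hard ones.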

Best Practices

  • Start with effort: "high" for tasks that previously used manual thinking. You can dial it down later.
  • Use effort: "low" for high-throughput, low-complexity workloads like simple classification or extraction.
  • Monitor token usage—adaptive thinking can reduce costs on bimodal workloads but may increase tokens on uniformly complex tasks.
  • Combine with tool use for agentic workflows; interleaved thinking is a major advantage.
  • Don't set max_tokens too low—Claude needs room for both thinking and visible output. A good starting point is 2x your expected output length.
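The first two practices can be condensed into a simple routing heuristic. This is purely illustrative: the workload categories are this guide's own labels, not API concepts, and the mapping is an assumption you should tune against your own traffic.

```python
# Illustrative heuristic mapping workload type to an effort level,
# following the best practices above. The category names are this
# guide's own labels, not part of the API.

EFFORT_BY_WORKLOAD = {
    "classification": "low",
    "extraction": "low",
    "summarization": "medium",
    "agentic": "high",
    "complex_reasoning": "high",
}

def choose_effort(workload: str) -> str:
    # Unknown workloads default to "high", the safer starting point.
    return EFFORT_BY_WORKLOAD.get(workload, "high")
```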

Limitations and Considerations

  • Not supported on older models (Sonnet 4.5, Opus 4.5, etc.). You must continue using manual budget_tokens for those.
  • Predictable latency: If you need consistent response times, manual mode may still be preferable on Opus 4.6/Sonnet 4.6 (though deprecated).
  • Cost control: Adaptive thinking can use more tokens than expected on very complex tasks. Use the effort parameter to limit this.

Key Takeaways

  • Adaptive thinking replaces manual token budgets on Claude Opus 4.7, Opus 4.6, Sonnet 4.6, and Mythos Preview. Set thinking.type to "adaptive".
  • Claude dynamically decides how much to think based on request complexity, saving tokens on simple tasks and allocating more for complex ones.
  • Use the effort parameter (low, medium, high) to guide thinking depth without hard constraints.
  • Interleaved thinking enables Claude to reason between tool calls, making it ideal for agentic workflows.
  • Migrate now if you're on Opus 4.6 or Sonnet 4.6—manual mode is deprecated and will be removed in a future release.