BeClaude
GuideBeginnerPricing2026-05-20

Mastering Adaptive Thinking in Claude: A Practical Guide to Smarter AI Reasoning

Learn how to use Claude's adaptive thinking mode to dynamically allocate reasoning effort, improve agentic workflows, and optimize API costs with practical code examples.

Quick Answer

This guide explains how to use Claude's adaptive thinking mode to let the model dynamically decide when and how much to reason, replacing manual token budgets for better performance in agentic and bimodal tasks.

adaptive thinkingextended thinkingClaude APIagentic workflowsreasoning optimization

Introduction

Claude's extended thinking capability has been a game-changer for complex reasoning tasks, but manually setting a budget_tokens value often leads to either wasted tokens on simple queries or insufficient reasoning on hard problems. Enter adaptive thinking — a smarter, dynamic approach that lets Claude decide when and how much to think based on the complexity of each request.

This guide covers everything you need to know to implement adaptive thinking in your Claude API projects, from basic setup to advanced optimization with the effort parameter.

What Is Adaptive Thinking?

Adaptive thinking is the recommended way to use extended thinking with Claude Opus 4.7, Opus 4.6, Sonnet 4.6, and the Claude Mythos Preview. Instead of requiring you to set a fixed thinking token budget, it allows Claude to evaluate each request independently and determine:

  • Whether extended thinking is needed at all
  • How much thinking to allocate
This is especially powerful for bimodal workloads (where some queries are trivial and others are complex) and long-horizon agentic workflows (where the model needs to reason between tool calls).

Key Benefits

  • Better performance on mixed-complexity tasks compared to fixed budgets
  • Automatic interleaved thinking — Claude can think between tool calls, improving multi-step reasoning
  • Simpler API calls — no need to guess a budget_tokens value
  • Cost efficiency — lower effort levels can skip thinking for simple problems

Supported Models

ModelAdaptive Thinking SupportNotes
Claude Mythos Preview (claude-mythos-preview)✅ Default modeThinking cannot be disabled
Claude Opus 4.7 (claude-opus-4-7)✅ Only supported modeManual enabled mode rejected with 400 error
Claude Opus 4.6 (claude-opus-4-6)✅ SupportedManual enabled deprecated
Claude Sonnet 4.6 (claude-sonnet-4-6)✅ SupportedManual enabled deprecated
Older models (Sonnet 4.5, Opus 4.5, etc.)❌ Not supportedMust use thinking.type: "enabled" with budget_tokens
Warning: On Opus 4.6 and Sonnet 4.6, thinking.type: "enabled" with budget_tokens is deprecated and will be removed in a future model release. Migrate to adaptive thinking as soon as possible.

How Adaptive Thinking Works

When you set thinking.type to "adaptive", Claude's internal router evaluates each incoming request and decides:

  • Is this a thinking-worthy problem? Simple factual questions may skip thinking entirely.
  • How deep should the reasoning go? Complex math, code generation, or multi-step planning gets more thinking tokens.
At the default effort level (high), Claude almost always thinks. At lower effort levels, it may skip thinking for simpler problems, saving tokens and latency.

Additionally, adaptive thinking automatically enables interleaved thinking — the model can pause between tool calls to reason about results, plan next steps, or adjust its strategy. This is critical for agentic workflows where the model needs to iterate.

Getting Started: Basic Implementation

Prerequisites

  • An Anthropic API key
  • The Anthropic Python SDK (pip install anthropic) or TypeScript SDK (npm install @anthropic-ai/sdk)
  • Access to a supported model

Python Example

import anthropic

client = anthropic.Anthropic()

response = client.messages.create( model="claude-opus-4-7", max_tokens=16000, thinking={"type": "adaptive"}, messages=[ { "role": "user", "content": "Explain why the sum of two even numbers is always even." } ] )

for block in response.content: if block.type == "thinking": print(f"\nThinking: {block.thinking}") elif block.type == "text": print(f"\nResponse: {block.text}")

TypeScript Example

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const response = await client.messages.create({ model: 'claude-opus-4-7', max_tokens: 16000, thinking: { type: 'adaptive' }, messages: [ { role: 'user', content: 'Explain why the sum of two even numbers is always even.' } ] });

for (const block of response.content) { if (block.type === 'thinking') { console.log(\nThinking: ${block.thinking}); } else if (block.type === 'text') { console.log(\nResponse: ${block.text}); } }

Controlling Thinking Depth with the effort Parameter

Adaptive thinking can be combined with an optional effort parameter that acts as a soft guide for how much thinking Claude should do. This is especially useful when you want to balance reasoning quality against latency or cost.

Effort Levels

Effort LevelBehaviorUse Case
lowMinimal thinking; may skip for simple queriesHigh-throughput, low-latency applications
mediumBalanced reasoningGeneral-purpose use
highMaximum reasoning (default)Complex problem-solving, agentic tasks

Python Example with Effort

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={
        "type": "adaptive",
        "effort": "medium"  # or "low", "high"
    },
    messages=[
        {
            "role": "user",
            "content": "Design a scalable microservices architecture for an e-commerce platform."
        }
    ]
)

TypeScript Example with Effort

const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 16000,
  thinking: {
    type: 'adaptive',
    effort: 'medium'
  },
  messages: [
    {
      role: 'user',
      content: 'Design a scalable microservices architecture for an e-commerce platform.'
    }
  ]
});
Note: The effort parameter is a beta feature. Its behavior may change in future releases. Always test with your specific workload.

Adaptive Thinking in Agentic Workflows

One of the most powerful applications of adaptive thinking is in agentic workflows where Claude uses tools to accomplish multi-step tasks. With interleaved thinking enabled automatically, Claude can:

  • Receive a complex request (e.g., "Research the latest AI papers and summarize them")
  • Think about the approach before making the first tool call
  • Execute a search tool and receive results
  • Think about the results to decide next steps
  • Call additional tools or synthesize the final answer

Example: Research Agent with Adaptive Thinking

import anthropic

client = anthropic.Anthropic()

response = client.messages.create( model="claude-opus-4-7", max_tokens=32000, thinking={"type": "adaptive"}, tools=[ { "name": "web_search", "description": "Search the web for information", "input_schema": { "type": "object", "properties": { "query": {"type": "string"} }, "required": ["query"] } } ], messages=[ { "role": "user", "content": "Find the top 3 AI breakthroughs announced in 2025 and explain their significance." } ] )

The response may contain interleaved thinking and tool_use blocks

for block in response.content: if block.type == "thinking": print(f"[Thinking] {block.thinking[:200]}...") elif block.type == "tool_use": print(f"[Tool Call] {block.name}({block.input})") elif block.type == "text": print(f"[Final] {block.text}")

Migrating from Fixed Budget to Adaptive Thinking

If you're currently using thinking.type: "enabled" with budget_tokens, here's how to migrate:

Before (Deprecated)

thinking={
    "type": "enabled",
    "budget_tokens": 8000
}

After (Recommended)

thinking={
    "type": "adaptive"
}

Or with effort control:

thinking={
    "type": "adaptive",
    "effort": "high"
}

Migration Checklist

  • ✅ Update your model to a supported version (Opus 4.7, Opus 4.6, Sonnet 4.6, or Mythos Preview)
  • ✅ Replace thinking.type: "enabled" with thinking.type: "adaptive"
  • ✅ Remove the budget_tokens parameter
  • ✅ Optionally add the effort parameter for finer control
  • ✅ Test with your workloads to ensure performance meets expectations

Best Practices

1. Start with Default Effort

Begin with effort: "high" (or omit it entirely) to get the full benefit of adaptive thinking. Only reduce effort if you have specific latency or cost constraints.

2. Use Adaptive Thinking for Agentic Tasks

If your application involves multi-step reasoning with tools, adaptive thinking is almost always superior to fixed budgets because of interleaved thinking.

3. Monitor Thinking Tokens

Even though you don't set a budget, you can still inspect the thinking blocks in the response to understand how much reasoning Claude performed. Use this to validate that the effort level matches your needs.

4. Combine with Prompt Caching

Adaptive thinking works well with prompt caching. Cache your system prompts and tool definitions to reduce latency and cost, while letting Claude dynamically allocate thinking tokens.

5. Test Bimodal Workloads

If your application handles both simple and complex queries, adaptive thinking shines. Test with a representative sample to see how often Claude skips thinking for easy questions.

Limitations and Considerations

  • Predictable latency: If you need guaranteed response times, adaptive thinking may not be ideal because thinking depth varies per request. Consider using fixed budget_tokens on Opus 4.6/Sonnet 4.6 if latency predictability is critical.
  • Cost control: While adaptive thinking can save tokens on simple queries, it may use more tokens on complex ones than a conservative fixed budget. Monitor your usage.
  • Beta features: The effort parameter is in beta. Its behavior may change, and it may not be available in all regions or for all models.

Conclusion

Adaptive thinking represents a significant evolution in how Claude handles reasoning. By letting the model decide when and how much to think, you can achieve better performance on mixed workloads, simplify your code, and unlock more natural agentic behaviors through interleaved thinking.

Whether you're building a simple Q&A bot or a complex multi-agent system, adaptive thinking is now the recommended path forward. Migrate your existing integrations today to take full advantage of Claude's latest reasoning capabilities.

Key Takeaways

  • Adaptive thinking lets Claude dynamically decide when and how much to reason, replacing manual budget_tokens for better performance on bimodal and agentic workloads.
  • Supported on Claude Opus 4.7, Opus 4.6, Sonnet 4.6, and Mythos Preview — older models still require fixed budgets.
  • Interleaved thinking is automatically enabled, allowing Claude to reason between tool calls in agentic workflows.
  • The optional effort parameter (low, medium, high) provides soft control over thinking depth without a fixed token budget.
  • Migrate now if you're using deprecated thinking.type: "enabled" — it will be removed in future model releases.