BeClaude
Guide | Beginner | Pricing | 2026-05-22

Mastering Adaptive Thinking in Claude: Dynamic Reasoning for Smarter AI Workflows

Learn how to use Claude's adaptive thinking mode for dynamic, cost-efficient reasoning. Covers API setup, effort levels, migration from fixed budgets, and best practices.

Quick Answer

Adaptive thinking lets Claude dynamically decide when and how much to use extended reasoning per request, replacing fixed token budgets. It improves performance on bimodal tasks and agentic workflows while reducing unnecessary costs.

Tags: adaptive thinking, extended thinking, Claude API, reasoning, agentic workflows

Introduction

Claude's extended thinking capability has been a game-changer for complex reasoning tasks. But until recently, you had to manually set a budget_tokens value—a fixed number of thinking tokens—for every request. This one-size-fits-all approach often wasted tokens on simple queries or fell short on truly complex ones.

Adaptive thinking removes that guesswork. Claude evaluates each request on the fly and decides whether to think and how deeply. In practice that means fewer thinking tokens spent on easy requests, more reasoning where a task needs it, and one less parameter to tune.

This guide covers everything you need to know: which models support it, how to configure it, the new effort parameter, and practical migration tips.

What Is Adaptive Thinking?

Adaptive thinking is the recommended way to use extended thinking with Claude Opus 4.7, Opus 4.6, and Sonnet 4.6. It's also the default mode on Claude Mythos Preview.

Instead of a rigid budget_tokens number, you set thinking.type to "adaptive". Claude then:

  • Evaluates the complexity of each request
  • Decides whether thinking is needed at all
  • Allocates thinking tokens dynamically
  • Enables interleaved thinking: Claude can think between tool calls, which is especially powerful for agentic workflows

Important: On Claude Opus 4.7, adaptive thinking is the only supported thinking mode. The old thinking: {type: "enabled", budget_tokens: N} is rejected with a 400 error.

Supported Models

| Model | Adaptive Thinking Support | Notes |
|---|---|---|
| Claude Mythos Preview | ✅ Default mode | Thinking auto-applies; cannot disable |
| Claude Opus 4.7 | ✅ Only mode | Manual thinking rejected |
| Claude Opus 4.6 | ✅ Supported | Manual thinking deprecated |
| Claude Sonnet 4.6 | ✅ Supported | Manual thinking deprecated |
| Older models (Sonnet 4.5, Opus 4.5, etc.) | ❌ Not supported | Must use enabled with budget_tokens |
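
The support table can be turned into a small lookup so that request-building code never sends a manual budget to a model that rejects it. This is only a sketch: the helper name `thinking_for` is hypothetical, and every model ID except "claude-opus-4-7" (which this guide uses in its own examples) is an assumed identifier that should be checked against your account's model list. Claude Mythos Preview is left out because the guide gives no ID for it.

```python
# Model IDs other than "claude-opus-4-7" are assumptions for illustration.
ADAPTIVE_MODELS = {"claude-opus-4-7", "claude-opus-4-6", "claude-sonnet-4-6"}
MANUAL_ONLY_MODELS = {"claude-sonnet-4-5", "claude-opus-4-5"}


def thinking_for(model: str, fallback_budget: int = 8000) -> dict:
    """Return a `thinking` value consistent with the support table."""
    if model in ADAPTIVE_MODELS:
        return {"type": "adaptive"}
    if model in MANUAL_ONLY_MODELS:
        # Older models still need an explicit thinking budget.
        return {"type": "enabled", "budget_tokens": fallback_budget}
    # Refuse to guess for models this sketch does not know about.
    raise ValueError(f"No thinking configuration known for {model!r}")
```

Raising on unknown models is a deliberate choice here: silently falling back to a manual budget would produce a 400 error on a model that only accepts adaptive thinking.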

How to Use Adaptive Thinking

Basic Setup

Set thinking.type to "adaptive" in your API request. No beta header is required.

Python example:
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={"type": "adaptive"},
    messages=[
        {"role": "user", "content": "Explain the implications of quantum entanglement on cryptography."}
    ]
)

print(response.content)

TypeScript example:
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 4096,
  thinking: { type: 'adaptive' },
  messages: [
    { role: 'user', content: 'Explain the implications of quantum entanglement on cryptography.' }
  ]
});

console.log(response.content);
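
Both examples print the raw content list. With adaptive thinking that list may or may not contain a thinking block, so it can help to separate reasoning from the final answer. The following is a rough Python sketch, assuming content blocks expose `type` plus a `thinking` or `text` attribute; `split_blocks` is a hypothetical helper, and the demo uses stand-in objects rather than a live API call.

```python
from types import SimpleNamespace


def split_blocks(content):
    """Separate thinking text from answer text in a response's content list."""
    thinking, answer = [], []
    for block in content:
        if block.type == "thinking":
            thinking.append(block.thinking)
        elif block.type == "text":
            answer.append(block.text)
        # Other block types (for example tool_use) are ignored in this sketch.
    return "".join(thinking), "".join(answer)


# Stand-in objects shaped like SDK content blocks, so the sketch runs offline.
demo = [
    SimpleNamespace(type="thinking", thinking="Compare QKD with RSA."),
    SimpleNamespace(type="text", text="Entanglement enables key distribution."),
]
reasoning, reply = split_blocks(demo)
```

With a real response, `split_blocks(response.content)` would return an empty string for the reasoning part whenever Claude decided not to think.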

Using the Effort Parameter

Adaptive thinking pairs with the new effort parameter, which gives you soft control over how much thinking Claude does. The effort level acts as guidance, not a hard limit.

| Effort Level | Behavior | Available On |
|---|---|---|
| max | Always thinks with no constraints on depth | Mythos, Opus 4.7, Opus 4.6, Sonnet 4.6 |
| xhigh | Always thinks deeply with extended exploration | Opus 4.7 only |
| high (default) | Always thinks; deep reasoning on complex tasks | All adaptive models |
| medium | Moderate thinking; may skip for simple tasks | All adaptive models |
| low | Minimal thinking; often skips for simple tasks | All adaptive models |
| none | Never thinks | All adaptive models |

Python example with effort:
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={"type": "adaptive"},
    output_config={"effort": "high"},  # or "max", "xhigh", "medium", "low", "none"
    messages=[
        {"role": "user", "content": "Design a distributed consensus algorithm for a network with intermittent connectivity."}
    ]
)
Tip: Start with high (the default) for most workloads. Use max or xhigh for research-grade problems. Use medium or low for high-throughput, simpler tasks.
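
Because xhigh is listed as available on Opus 4.7 only, it may be worth validating the effort level before a request goes out. A minimal sketch, assuming the level names from the table above and treating "claude-opus-4-7" as the only xhigh-capable model ID; `checked_effort` is a hypothetical helper, not part of any SDK.

```python
# Effort levels as listed in this guide's table.
EFFORT_LEVELS = ("none", "low", "medium", "high", "xhigh", "max")
# Assumption: the Opus 4.7 model ID used in this guide's examples.
XHIGH_MODELS = {"claude-opus-4-7"}


def checked_effort(model: str, effort: str = "high") -> str:
    """Return `effort` unchanged if the guide's table allows it for `model`."""
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"Unknown effort level: {effort!r}")
    if effort == "xhigh" and model not in XHIGH_MODELS:
        raise ValueError(f"'xhigh' is not listed as available on {model!r}")
    return effort
```

Failing early in client code gives a clearer error than waiting for the API to reject the request.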

Why Migrate from Fixed Budgets?

If you're currently using thinking: {type: "enabled", budget_tokens: N} on Opus 4.6 or Sonnet 4.6, you should plan to migrate. Here's why:

  • Better performance on bimodal tasks – Adaptive thinking excels when your workload mixes simple and complex requests. Claude doesn't waste tokens on easy questions.
  • Long-horizon agentic workflows – Interleaved thinking (thinking between tool calls) is automatically enabled. This is critical for agents that need to reason, act, observe, and reason again.
  • Cost efficiency – Claude spends thinking tokens only where a request needs them, instead of working against a fixed cap that may be too high for simple requests or too low for hard ones.
  • Future-proof – Manual budget_tokens is deprecated on Opus 4.6 and Sonnet 4.6 and will be removed in a future model release.
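
To make the reason, act, observe, reason again cycle concrete, here is a sketch of a tool loop with adaptive thinking switched on. It assumes the usual Messages API tool-use shapes (`stop_reason == "tool_use"`, `tool_use` blocks with `id`, `name`, and `input`, and `tool_result` blocks sent back in a user turn). `run_agent`, `create`, and `tools_impl` are hypothetical names; `create` stands for any callable shaped like `client.messages.create`.

```python
def run_agent(create, tools_impl, messages, **request):
    """Call the model repeatedly until it stops asking for tools.

    `create` is any callable shaped like client.messages.create;
    `tools_impl` maps a tool name to a plain Python function.
    """
    messages = list(messages)  # work on a copy of the caller's history
    while True:
        response = create(
            messages=messages, thinking={"type": "adaptive"}, **request
        )
        if response.stop_reason != "tool_use":
            return response
        # Send the assistant turn back unchanged so thinking blocks stay intact.
        messages.append({"role": "assistant", "content": response.content})
        results = [
            {
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": str(tools_impl[block.name](**block.input)),
            }
            for block in response.content
            if block.type == "tool_use"
        ]
        messages.append({"role": "user", "content": results})
```

A call might look like `run_agent(client.messages.create, {"get_weather": get_weather}, history, model="claude-opus-4-7", max_tokens=4096, tools=tool_specs)`, where `get_weather`, `history`, and `tool_specs` are placeholders for your own code. A production loop would also want a cap on the number of iterations.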

Migration Steps

  • Change thinking.type from "enabled" to "adaptive"
  • Remove budget_tokens entirely
  • Optionally add an effort level (start with "high")
  • Test on a representative sample of your workload
  • Monitor latency and cost—adaptive thinking often reduces both

Before (deprecated):
thinking={"type": "enabled", "budget_tokens": 16000}

After (recommended):
thinking={"type": "adaptive"},
output_config={"effort": "high"}
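
If many call sites build their request arguments as dictionaries, the migration steps can be applied in one place. The sketch below assumes effort is passed under an `output_config` key; that key name is my assumption about the Messages API, so it is exposed as a parameter you can change if your SDK version expects something else. `migrate_request` is a hypothetical helper, and the model ID in the usage line is illustrative.

```python
def migrate_request(kwargs, effort="high", effort_key="output_config"):
    """Return a copy of request kwargs switched from a fixed budget to adaptive."""
    migrated = dict(kwargs)
    thinking = migrated.get("thinking")
    if isinstance(thinking, dict) and thinking.get("type") == "enabled":
        # Replacing the whole dict drops budget_tokens along with the old type.
        migrated["thinking"] = {"type": "adaptive"}
        config = dict(migrated.get(effort_key, {}))
        config.setdefault("effort", effort)  # keep an effort the caller already set
        migrated[effort_key] = config
    return migrated


# Illustrative request; "claude-opus-4-6" is an assumed model ID.
old = {"model": "claude-opus-4-6", "thinking": {"type": "enabled", "budget_tokens": 16000}}
new = migrate_request(old)
```

Requests that do not use manual thinking pass through unchanged, which makes the helper safe to apply across a mixed codebase during a gradual rollout.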

Practical Use Cases

1. Customer Support Bot

A support bot handles everything from password resets to complex billing disputes. With adaptive thinking:

  • Simple requests ("What's my account balance?") → no thinking → fast, cheap
  • Complex requests ("Why was my transaction declined and how do I appeal?") → deep thinking → accurate

2. Code Generation Assistant

  • Simple snippet ("Write a function to reverse a string") → low effort
  • Complex refactoring ("Migrate this monolithic Java app to microservices") → high effort

3. Research Agent

Use effort: "max" or "xhigh" for deep analytical tasks like literature review synthesis or mathematical proofs.
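
One way to wire these use cases together is a small routing table from task category to effort level. The categories below are hypothetical labels that your own classifier or request metadata would have to supply; the effort values follow the suggestions in this section, and anything unlisted falls back to the default so that adaptive thinking decides on its own.

```python
# Hypothetical task labels mapped to the effort levels suggested above.
EFFORT_BY_TASK = {
    "code_snippet": "low",
    "refactoring": "high",
    "research": "max",
}


def effort_for(task_kind: str, default: str = "high") -> str:
    """Pick an effort level for a task label, falling back to the default."""
    return EFFORT_BY_TASK.get(task_kind, default)
```

Support-bot traffic is deliberately absent from the table: with the default effort, simple and complex tickets are left to adaptive thinking rather than to a hand-written rule.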

Best Practices

  • Start with default effort – high works well for most use cases. Only adjust if you have specific latency or depth requirements.
  • Combine with tool use – Adaptive thinking's interleaved thinking makes it ideal for agents that call tools, process results, and call more tools.
  • Monitor thinking blocks – In streaming mode, look for thinking content blocks to understand how Claude is reasoning.
  • Test on representative data – Don't just test on easy examples. Include edge cases that require deep reasoning.
  • Consider zero data retention – Adaptive thinking is eligible for ZDR. If your organization has a ZDR arrangement, data is not stored after the API response.
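
For the "monitor thinking blocks" practice, a stream consumer can collect reasoning and answer text separately. This sketch assumes the streaming event shapes I know from the Messages API: `content_block_delta` events whose `delta.type` is `thinking_delta` (carrying `delta.thinking`) or `text_delta` (carrying `delta.text`). `collect_stream` is a hypothetical helper, and the demo feeds it stand-in events instead of a live stream.

```python
from types import SimpleNamespace


def collect_stream(events):
    """Accumulate thinking text and answer text from streaming events."""
    thinking, text = [], []
    for event in events:
        if event.type != "content_block_delta":
            continue  # skip start/stop and any other event types
        if event.delta.type == "thinking_delta":
            thinking.append(event.delta.thinking)
        elif event.delta.type == "text_delta":
            text.append(event.delta.text)
    return "".join(thinking), "".join(text)


def _delta(**fields):
    """Build a stand-in content_block_delta event for the offline demo."""
    return SimpleNamespace(type="content_block_delta", delta=SimpleNamespace(**fields))


demo_events = [
    SimpleNamespace(type="message_start"),
    _delta(type="thinking_delta", thinking="Check edge "),
    _delta(type="thinking_delta", thinking="cases."),
    _delta(type="text_delta", text="Done."),
]
streamed_thinking, streamed_text = collect_stream(demo_events)
```

Against the real API, the iterable would be the event stream returned when a request is made with streaming enabled. Logging the length of the collected thinking text per request is a cheap way to see how often Claude chooses to think at each effort level.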

Limitations and Caveats

  • Not available on older models – Sonnet 4.5, Opus 4.5, and earlier require manual budget_tokens.
  • No hard latency guarantees – Because thinking depth is dynamic, response times can vary. If you need predictable latency, consider using effort: "low" or "none".
  • Effort is soft guidance – Claude may think more or less than you expect. It's not a hard constraint.

Key Takeaways

  • Adaptive thinking replaces fixed token budgets – Claude dynamically decides when and how much to think, improving performance and reducing waste.
  • Use the effort parameter for soft control – Choose from none, low, medium, high, max, or xhigh (Opus 4.7 only) to guide thinking depth.
  • Interleaved thinking is automatic – Claude can reason between tool calls, making adaptive thinking ideal for agentic workflows.
  • Migrate away from budget_tokens – Manual thinking is deprecated on Opus 4.6 and Sonnet 4.6 and will be removed in future releases.
  • Start with effort: "high" – It's the default for a reason. Adjust up or down based on your specific workload needs.