GuideBeginnerPricing2026-05-22

Mastering Adaptive Thinking in Claude: Dynamic Reasoning for Smarter AI Workflows

Learn how to use Claude's adaptive thinking mode for dynamic, cost-efficient reasoning. Covers API setup, effort levels, migration from fixed budgets, and best practices.

Quick Answer

Adaptive thinking lets Claude dynamically decide when and how much to use extended reasoning per request, replacing fixed token budgets. It improves performance on bimodal tasks and agentic workflows while reducing unnecessary costs.

adaptive thinkingextended thinkingClaude APIreasoningagentic workflows

Introduction

Claude's extended thinking capability has been a game-changer for complex reasoning tasks. But until recently, you had to manually set a budget_tokens value—a fixed number of thinking tokens—for every request. This one-size-fits-all approach often wasted tokens on simple queries or fell short on truly complex ones.

Adaptive thinking changes the game. Instead of you guessing how much thinking a task needs, Claude evaluates each request on the fly and decides whether—and how deeply—to think. The result? Better performance, lower costs, and simpler code.

This guide covers everything you need to know: which models support it, how to configure it, the new effort parameter, and practical migration tips.

What Is Adaptive Thinking?

Adaptive thinking is the recommended way to use extended thinking with Claude Opus 4.7, Opus 4.6, and Sonnet 4.6. It's also the default mode on Claude Mythos Preview.

Instead of a rigid budget_tokens number, you set thinking.type to "adaptive". Claude then:

Evaluates the complexity of each request
Decides whether thinking is needed at all
Allocates thinking tokens dynamically
Enables interleaved thinking—Claude can think between tool calls, which is especially powerful for agentic workflows

Important: On Claude Opus 4.7, adaptive thinking is the only supported thinking mode. The old thinking: {type: "enabled", budget_tokens: N} is rejected with a 400 error.

Supported Models

Model	Adaptive Thinking Support	Notes
Claude Mythos Preview	✅ Default mode	Thinking auto-applies; cannot disable
Claude Opus 4.7	✅ Only mode	Manual thinking rejected
Claude Opus 4.6	✅ Supported	Manual thinking deprecated
Claude Sonnet 4.6	✅ Supported	Manual thinking deprecated
Older models (Sonnet 4.5, Opus 4.5, etc.)	❌ Not supported	Must use `enabled` with `budget_tokens`

How to Use Adaptive Thinking

Basic Setup

Set thinking.type to "adaptive" in your API request. No beta header is required.

Python example:

import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={"type": "adaptive"},
    messages=[
        {"role": "user", "content": "Explain the implications of quantum entanglement on cryptography."}
    ]
)
print(response.content)

TypeScript example:

import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 4096,
  thinking: { type: 'adaptive' },
  messages: [
    { role: 'user', content: 'Explain the implications of quantum entanglement on cryptography.' }
  ]
});
console.log(response.content);

Using the Effort Parameter

Adaptive thinking pairs with the new effort parameter, which gives you soft control over how much thinking Claude does. The effort level acts as guidance, not a hard limit.

Effort Level	Behavior	Available On
`max`	Always thinks with no constraints on depth	Mythos, Opus 4.7, Opus 4.6, Sonnet 4.6
`xhigh`	Always thinks deeply with extended exploration	Opus 4.7 only
`high` (default)	Always thinks; deep reasoning on complex tasks	All adaptive models
`medium`	Moderate thinking; may skip for simple tasks	All adaptive models
`low`	Minimal thinking; often skips for simple tasks	All adaptive models
`none`	Never thinks	All adaptive models

Python example with effort:

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={"type": "adaptive"},
    thinking_config={"effort": "high"},  # or "max", "xhigh", "medium", "low", "none"
    messages=[
        {"role": "user", "content": "Design a distributed consensus algorithm for a network with intermittent connectivity."}
    ]
)

Tip: Start with high (the default) for most workloads. Use max or xhigh for research-grade problems. Use medium or low for high-throughput, simpler tasks.

Why Migrate from Fixed Budgets?

If you're currently using thinking: {type: "enabled", budget_tokens: N} on Opus 4.6 or Sonnet 4.6, you should plan to migrate. Here's why:

Better performance on bimodal tasks – Adaptive thinking excels when your workload mixes simple and complex requests. Claude doesn't waste tokens on easy questions.

Long-horizon agentic workflows – Interleaved thinking (thinking between tool calls) is automatically enabled. This is critical for agents that need to reason, act, observe, and reason again.

Cost efficiency – You only pay for the thinking Claude actually uses, not a fixed budget that may be too high or too low.

Future-proof – Manual budget_tokens is deprecated on Opus 4.6 and Sonnet 4.6 and will be removed in a future model release.

Migration Steps

Change thinking.type from "enabled" to "adaptive"
Remove budget_tokens entirely
Optionally add an effort level (start with "high")
Test on a representative sample of your workload
Monitor latency and cost—adaptive thinking often reduces both

Before (deprecated):

thinking={"type": "enabled", "budget_tokens": 16000}

After (recommended):

thinking={"type": "adaptive"},
thinking_config={"effort": "high"}

Practical Use Cases

1. Customer Support Bot

A support bot handles everything from password resets to complex billing disputes. With adaptive thinking:

Simple requests ("What's my account balance?") → no thinking → fast, cheap
Complex requests ("Why was my transaction declined and how do I appeal?") → deep thinking → accurate

2. Code Generation Assistant

Simple snippet ("Write a function to reverse a string") → low effort
Complex refactoring ("Migrate this monolithic Java app to microservices") → high effort

3. Research Agent

Use effort: "max" or "xhigh" for deep analytical tasks like literature review synthesis or mathematical proofs.

Best Practices

Start with default effort – high works well for most use cases. Only adjust if you have specific latency or depth requirements.
Combine with tool use – Adaptive thinking's interleaved thinking makes it ideal for agents that call tools, process results, and call more tools.
Monitor thinking blocks – In streaming mode, look for thinking content blocks to understand how Claude is reasoning.
Test on representative data – Don't just test on easy examples. Include edge cases that require deep reasoning.
Consider zero data retention – Adaptive thinking is eligible for ZDR. If your organization has a ZDR arrangement, data is not stored after the API response.

Limitations and Caveats

Not available on older models – Sonnet 4.5, Opus 4.5, and earlier require manual budget_tokens.
No hard latency guarantees – Because thinking depth is dynamic, response times can vary. If you need predictable latency, consider using effort: "low" or "none".
Effort is soft guidance – Claude may think more or less than you expect. It's not a hard constraint.

Key Takeaways

Adaptive thinking replaces fixed token budgets – Claude dynamically decides when and how much to think, improving performance and reducing waste.
Use the effort parameter for soft control – Choose from none, low, medium, high, max, or xhigh (Opus 4.7 only) to guide thinking depth.
Interleaved thinking is automatic – Claude can reason between tool calls, making adaptive thinking ideal for agentic workflows.
Migrate away from budget_tokens – Manual thinking is deprecated on Opus 4.6 and Sonnet 4.6 and will be removed in future releases.
Start with effort: "high" – It's the default for a reason. Adjust up or down based on your specific workload needs.