Claude Guide · 2026-05-06

Mastering Adaptive Thinking in Claude: Dynamic Reasoning for Smarter AI Workflows

Learn how to use Claude's adaptive thinking mode to dynamically allocate reasoning tokens based on task complexity, optimize agentic workflows, and reduce costs without sacrificing performance.

Quick Answer

Adaptive thinking lets Claude dynamically decide when and how much to use extended reasoning per request, replacing fixed token budgets. It improves performance on complex tasks, enables interleaved thinking between tool calls, and supports effort levels to balance depth and cost.

Tags: adaptive thinking, Claude API, extended thinking, agentic workflows, token optimization


Claude’s extended thinking capability has been a game-changer for complex reasoning tasks—but until recently, you had to manually set a fixed token budget for every request. That approach worked well for predictable workloads, but it often wasted tokens on simple queries or fell short on unexpectedly complex ones.

Enter adaptive thinking: a smarter, more flexible way to use Claude’s reasoning powers. Instead of guessing how many thinking tokens a task needs, you let Claude decide. This guide will walk you through everything you need to know to implement adaptive thinking effectively.

What Is Adaptive Thinking?

Adaptive thinking is the recommended mode for using extended thinking with Claude Opus 4.7, Opus 4.6, Sonnet 4.6, and Mythos Preview. Rather than requiring a fixed budget_tokens value, it allows Claude to evaluate each request individually and determine:

  • Whether extended thinking is needed at all
  • How much thinking to allocate

This dynamic approach leads to better performance on bimodal workloads (a mix of simple and complex queries) and on long-horizon agentic workflows where reasoning depth naturally varies across steps.

Important: On Claude Opus 4.7, adaptive thinking is the only supported thinking mode. The old thinking: {type: "enabled", budget_tokens: N} syntax will return a 400 error. On Opus 4.6 and Sonnet 4.6, the fixed-budget mode is deprecated and will be removed in a future release.

How Adaptive Thinking Works

When you enable adaptive thinking, Claude’s internal reasoning engine evaluates the complexity of the incoming prompt and decides on the fly how many thinking tokens to use. At the default effort level (high), Claude almost always engages thinking. At lower effort levels, it may skip thinking for straightforward requests.

A key benefit is interleaved thinking: Claude can pause between tool calls to re-evaluate, plan next steps, or refine its approach. This makes adaptive thinking especially powerful for agentic workflows where the model needs to reason iteratively.

Supported Models

| Model | Adaptive Thinking Support | Notes |
| --- | --- | --- |
| Claude Mythos Preview | Default (auto-applies) | thinking: {type: "disabled"} not supported |
| Claude Opus 4.7 | Only mode | Manual enabled rejected with 400 |
| Claude Opus 4.6 | Supported | Manual mode deprecated |
| Claude Sonnet 4.6 | Supported | Manual mode deprecated |
| Older models (Sonnet 4.5, Opus 4.5) | Not supported | Must use enabled with budget_tokens |
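
Based on the table above, a small helper can pick an appropriate thinking parameter per model. This is an illustrative sketch: the model ID strings and the helper name are my assumptions, not official SDK constants.

```python
# Hypothetical helper: choose a thinking config consistent with the
# support table above. Model IDs here are assumed, not authoritative.

ADAPTIVE_MODELS = {
    "claude-mythos-preview",
    "claude-opus-4-7",
    "claude-opus-4-6",
    "claude-sonnet-4-6",
}

def thinking_config(model: str, budget_tokens: int = 16000) -> dict:
    """Return a `thinking` parameter suitable for the given model."""
    if model in ADAPTIVE_MODELS:
        return {"type": "adaptive"}
    # Older models (e.g. Sonnet 4.5, Opus 4.5) still require a fixed budget.
    return {"type": "enabled", "budget_tokens": budget_tokens}

print(thinking_config("claude-opus-4-7"))
# {'type': 'adaptive'}
print(thinking_config("claude-sonnet-4-5"))
# {'type': 'enabled', 'budget_tokens': 16000}
```

Pass the result straight into the `thinking` argument of `client.messages.create(...)`.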

How to Use Adaptive Thinking in the API

Enabling adaptive thinking is straightforward. Set thinking.type to "adaptive" in your API request. Here’s a complete Python example:

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    messages=[
        {
            "role": "user",
            "content": "Explain why the sum of two even numbers is always even."
        }
    ]
)

for block in response.content:
    if block.type == "thinking":
        print(f"\nThinking: {block.thinking}")
    elif block.type == "text":
        print(f"\nResponse: {block.text}")

And the equivalent in TypeScript:

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 16000,
  thinking: { type: 'adaptive' },
  messages: [
    {
      role: 'user',
      content: 'Explain why the sum of two even numbers is always even.'
    }
  ]
});

for (const block of response.content) {
  if (block.type === 'thinking') {
    console.log(`\nThinking: ${block.thinking}`);
  } else if (block.type === 'text') {
    console.log(`\nResponse: ${block.text}`);
  }
}

Fine-Tuning with the Effort Parameter

Adaptive thinking becomes even more powerful when combined with the effort parameter. This acts as soft guidance for how much thinking Claude should do:

| Effort Level | Behavior |
| --- | --- |
| low | Minimal thinking; skips for simple tasks |
| medium | Balanced approach |
| high (default) | Almost always thinks; maximum reasoning depth |

Here’s how to set a lower effort level for cost-sensitive workloads:

response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=8000,
    thinking={
        "type": "adaptive",
        "effort": "medium"
    },
    messages=[
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ]
)
Tip: Use low effort for high-volume, simple queries where you want to minimize latency and cost. Use high (or omit the parameter) for complex analysis, multi-step reasoning, or agentic tasks.
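
One way to apply this tip in practice is to route each request to an effort level before calling the API. The heuristic below is a hypothetical sketch; the function name and length thresholds are my own, not an official recommendation.

```python
# Hypothetical router: pick an effort level from crude request features.
# Thresholds are illustrative and should be tuned on real traffic.

def pick_effort(prompt: str, uses_tools: bool = False) -> str:
    """Choose an adaptive-thinking effort level for a request."""
    if uses_tools or len(prompt) > 2000:
        return "high"    # agentic or long, complex input
    if len(prompt) > 200:
        return "medium"  # moderate analysis
    return "low"         # short, lookup-style query

thinking = {"type": "adaptive", "effort": pick_effort("What is the capital of France?")}
print(thinking)  # {'type': 'adaptive', 'effort': 'low'}
```

The resulting dict can be passed directly as the `thinking` argument, so simple queries run cheap while tool-using requests keep full reasoning depth.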

When to Use Adaptive Thinking vs. Fixed Budget

Adaptive Thinking Is Best For:

  • Bimodal workloads — Mix of simple and complex queries
  • Agentic workflows — Multi-step tasks with tool calls
  • Long-horizon planning — Tasks where reasoning depth varies
  • Cost optimization — Avoid over-thinking simple requests

Fixed Budget Still Makes Sense For:

  • Predictable latency — Real-time applications where response time must be consistent
  • Strict cost control — You need to cap maximum thinking tokens per request
  • Legacy integrations — Older models that don’t support adaptive mode

Practical Example: Agentic Workflow with Adaptive Thinking

Here’s how adaptive thinking shines in a multi-step agentic task. The model can think between each tool call:

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=32000,
    thinking={"type": "adaptive"},
    tools=[
        {
            "name": "search_database",
            "description": "Search a database for customer records",
            "input_schema": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"}
                },
                "required": ["query"]
            }
        },
        {
            "name": "calculate_discount",
            "description": "Calculate discount based on customer tier",
            "input_schema": {
                "type": "object",
                "properties": {
                    "tier": {"type": "string"},
                    "amount": {"type": "number"}
                },
                "required": ["tier", "amount"]
            }
        }
    ],
    messages=[
        {
            "role": "user",
            "content": "Find the customer John Doe and calculate his loyalty discount on a $500 purchase."
        }
    ]
)

# Claude will think between tool calls automatically
for block in response.content:
    if block.type == "thinking":
        print(f"Thinking: {block.thinking}")
    elif block.type == "tool_use":
        print(f"Tool call: {block.name}")
    elif block.type == "text":
        print(f"Final response: {block.text}")
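
The snippet above only inspects the first response. In a real agentic loop you also execute the requested tools and send the results back in a follow-up user message. The dispatcher below is a minimal sketch: it assumes content blocks expose .type, .name, .input, and .id (as the SDK's tool_use blocks do), and the discount handler is a hypothetical stand-in.

```python
from types import SimpleNamespace  # stand-in for the SDK's content block objects

def run_tools(content_blocks, handlers):
    """Execute tool_use blocks and build tool_result blocks for the next turn."""
    results = []
    for block in content_blocks:
        if block.type == "tool_use":
            output = handlers[block.name](**block.input)
            results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": str(output),
            })
    return results

# Demo with a stubbed block and a hypothetical discount handler
demo_blocks = [SimpleNamespace(
    type="tool_use",
    name="calculate_discount",
    input={"tier": "gold", "amount": 500},
    id="tu_1",
)]
demo_handlers = {"calculate_discount": lambda tier, amount: amount * 0.10}
print(run_tools(demo_blocks, demo_handlers))
# [{'type': 'tool_result', 'tool_use_id': 'tu_1', 'content': '50.0'}]
```

The returned tool_result blocks go back in a user message; with adaptive thinking, Claude may then emit fresh thinking blocks before deciding on the next tool call.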

Migration Guide: Moving from Fixed Budget to Adaptive

If you’re currently using thinking: {type: "enabled", budget_tokens: N}, here’s your migration path:

  1. Identify all API calls using the old syntax on Opus 4.6 or Sonnet 4.6.
  2. Replace thinking: {type: "enabled", budget_tokens: 16000} with thinking: {type: "adaptive"}.
  3. Optionally add an effort parameter if you need to control depth.
  4. Test on a subset of traffic to verify performance and cost.
  5. Update model references — if you’re on Opus 4.7, the old syntax will fail.

Key Takeaways

  • Adaptive thinking lets Claude dynamically decide when and how much to reason, eliminating the need to guess a fixed token budget for each request.
  • It enables interleaved thinking between tool calls, making it ideal for agentic workflows and multi-step tasks.
  • The effort parameter (low, medium, high) gives you soft control over reasoning depth without hard token limits.
  • On Claude Opus 4.7, adaptive thinking is mandatory — the old fixed-budget mode is no longer accepted.
  • Migrate now if you’re using the deprecated enabled + budget_tokens syntax on Opus 4.6 or Sonnet 4.6, as it will be removed in a future release.