Guide | Beginner | Pricing | 2026-05-14

Mastering Adaptive Thinking in Claude: Dynamic Reasoning for Smarter AI Workflows

Learn how to use Claude's adaptive thinking mode to dynamically allocate reasoning resources, optimize performance, and reduce costs across complex agentic workflows and bimodal tasks.

Quick Answer

Adaptive thinking lets Claude dynamically decide when and how much to use extended thinking based on request complexity, replacing fixed token budgets. It enables interleaved thinking between tool calls, supports effort levels for cost control, and is the default on Claude Mythos Preview and the only mode on Opus 4.7.

extended thinking, adaptive thinking, Claude API, agentic workflows, reasoning optimization

Adaptive thinking is a new approach to extended thinking in Claude that replaces the older fixed-budget model. Instead of guessing how many thinking tokens a task needs, you let Claude decide when and how much to reason, based on the complexity of each individual request.

This guide will walk you through everything you need to know about adaptive thinking: how it works, which models support it, how to implement it in your API calls, and best practices for getting the most out of this powerful feature.

What Is Adaptive Thinking?

Adaptive thinking is the recommended way to use extended thinking with Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6. It is also the default mode on Claude Mythos Preview, where it auto-applies whenever thinking is unset.

Unlike the older thinking: {type: "enabled", budget_tokens: N} approach — where you had to specify a fixed number of tokens for reasoning — adaptive thinking allows Claude to:

Evaluate request complexity and decide whether thinking is needed
Allocate thinking dynamically — more for complex problems, less (or none) for simple ones
Enable interleaved thinking between tool calls, making it ideal for agentic workflows

Important: On Claude Opus 4.7, adaptive thinking is the only supported thinking mode. The old thinking: {type: "enabled", budget_tokens: N} will be rejected with a 400 error.

Supported Models

Model	API Name	Adaptive Thinking Support
Claude Mythos Preview	`claude-mythos-preview`	Default (auto-applies when thinking unset; cannot disable)
Claude Opus 4.7	`claude-opus-4-7`	Only mode (must explicitly set `thinking: {type: "adaptive"}`)
Claude Opus 4.6	`claude-opus-4-6`	Supported (old `enabled` mode deprecated)
Claude Sonnet 4.6	`claude-sonnet-4-6`	Supported (old `enabled` mode deprecated)

Note: Older models (Sonnet 4.5, Opus 4.5, etc.) do not support adaptive thinking and still require thinking: {type: "enabled"} with budget_tokens.
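
The support table and note above can be folded into a small lookup. This is a minimal sketch rather than an official helper: the names `ADAPTIVE_MODELS` and `thinking_config_for` are hypothetical, the model list simply mirrors the table in this guide, and the default budget of 8000 is an arbitrary placeholder.

```python
# Model names are copied from the table above; the helper itself is hypothetical.
ADAPTIVE_MODELS = {
    "claude-mythos-preview",
    "claude-opus-4-7",
    "claude-opus-4-6",
    "claude-sonnet-4-6",
}


def thinking_config_for(model: str, budget_tokens: int = 8000) -> dict:
    """Return a thinking config that matches the support table in this guide."""
    if model in ADAPTIVE_MODELS:
        return {"type": "adaptive"}
    # Older models: fall back to the fixed-budget form described in the note.
    return {"type": "enabled", "budget_tokens": budget_tokens}
```

Keeping the model list in one place means a single edit when a model changes status, instead of hunting for thinking configs across call sites.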

How Adaptive Thinking Works

In adaptive mode, thinking becomes optional for the model. Here's what happens under the hood:

Claude evaluates each request — it looks at the prompt complexity, the number of steps required, and whether tool calls are involved.
Decision time — At the default effort level (high), Claude almost always thinks. At lower effort levels, it may skip thinking for simpler problems.
Dynamic allocation — If thinking is needed, Claude allocates just enough tokens to solve the problem effectively, rather than burning through a fixed budget.
Interleaved thinking — Claude can think between tool calls, not just before them. This is a game-changer for agentic workflows where the model needs to reason after receiving tool results.

Why Interleaved Thinking Matters

Traditional extended thinking only allowed Claude to think before generating a response. In agentic workflows — where Claude calls tools, gets results, and needs to decide what to do next — this was limiting. With interleaved thinking, Claude can:

Call a search tool
Receive results
Think about those results
Decide the next action
Call another tool
Think again after receiving new data

This creates a much more natural and effective reasoning loop for complex multi-step tasks.
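
The loop above can be sketched without touching the network. In this illustration `run_agent_loop`, `fake_model`, and `SCRIPT` are hypothetical names, the blocks are simplified dicts rather than real SDK objects, and `fake_model` stands in for an actual API call. The point it tries to show is that each assistant turn, thinking blocks included, is appended to the history unchanged before the tool results are sent back.

```python
from typing import Callable


def run_agent_loop(
    call_model: Callable[[list], list],
    tools: dict,
    user_prompt: str,
    max_turns: int = 5,
) -> list:
    """Drive a tool-use loop; call_model stands in for a real API call."""
    messages = [{"role": "user", "content": user_prompt}]
    for _ in range(max_turns):
        blocks = call_model(messages)
        # Append the assistant turn unchanged, thinking blocks included.
        messages.append({"role": "assistant", "content": blocks})
        tool_uses = [b for b in blocks if b["type"] == "tool_use"]
        if not tool_uses:
            break  # No tool requested: the model has produced its answer.
        results = [
            {
                "type": "tool_result",
                "tool_use_id": b["id"],
                "content": tools[b["name"]](**b["input"]),
            }
            for b in tool_uses
        ]
        messages.append({"role": "user", "content": results})
    return messages


# Scripted stand-in for the model: think, call a tool, think again, answer.
SCRIPT = [
    [
        {"type": "thinking", "thinking": "I should search first."},
        {"type": "tool_use", "id": "t1", "name": "search",
         "input": {"query": "even numbers"}},
    ],
    [
        {"type": "thinking", "thinking": "The result is enough to answer."},
        {"type": "text", "text": "Done."},
    ],
]


def fake_model(messages: list) -> list:
    # The number of assistant turns so far selects the next scripted reply.
    turns_so_far = sum(1 for m in messages if m["role"] == "assistant")
    return SCRIPT[turns_so_far]


history = run_agent_loop(
    fake_model,
    {"search": lambda query: f"results for {query}"},
    "Why is the sum of two even numbers even?",
)
```

In a real agent, `call_model` would wrap `client.messages.create(...)` with `thinking={"type": "adaptive"}` and convert the returned content into whatever shape the loop expects.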

How to Use Adaptive Thinking in the API

Basic Implementation

Here's how to enable adaptive thinking in your API requests:

import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    messages=[
        {
            "role": "user",
            "content": "Explain why the sum of two even numbers is always even."
        }
    ]
)
for block in response.content:
    if block.type == "thinking":
        print(f"\nThinking: {block.thinking}")
    elif block.type == "text":
        print(f"\nResponse: {block.text}")
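
Because thinking is optional in adaptive mode, a response may contain no thinking block at all. The sketch below shows one way to handle both cases; `split_blocks` is a hypothetical helper, and `SimpleNamespace` objects stand in for the SDK's content blocks so the example runs offline.

```python
from types import SimpleNamespace


def split_blocks(content) -> tuple[list[str], str]:
    """Separate thinking text from answer text in a response's content list."""
    thinking = [b.thinking for b in content if b.type == "thinking"]
    text = "".join(b.text for b in content if b.type == "text")
    return thinking, text


# Stand-ins for response.content: one response with thinking, one without.
with_thinking = [
    SimpleNamespace(type="thinking", thinking="2a + 2b = 2(a + b)"),
    SimpleNamespace(type="text", text="The sum is divisible by 2."),
]
without_thinking = [SimpleNamespace(type="text", text="4")]
```

With a live response you would call `split_blocks(response.content)`; an empty first element simply means Claude answered without thinking.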

TypeScript Example

import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 16000,
  thinking: { type: 'adaptive' },
  messages: [
    {
      role: 'user',
      content: 'Explain why the sum of two even numbers is always even.'
    }
  ]
});
for (const block of response.content) {
  if (block.type === 'thinking') {
    console.log(`\nThinking: ${block.thinking}`);
  } else if (block.type === 'text') {
    console.log(`\nResponse: ${block.text}`);
  }
}

Controlling Thinking with the Effort Parameter

Adaptive thinking comes with an optional effort parameter that acts as soft guidance for how much thinking Claude should do. This is especially useful when you want to balance reasoning depth against cost and latency.

Effort Levels

Effort Level	Behavior	Best For
`low`	Minimal thinking; may skip for simple tasks	High-throughput, low-cost scenarios
`medium`	Moderate thinking allocation	Balanced workloads
`high` (default)	Maximum thinking; Claude almost always thinks	Complex reasoning, agentic workflows

Example with Effort Control

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    # Effort is set through output_config, next to the thinking config
    output_config={"effort": "medium"},  # Options: "low", "medium", "high"
    messages=[
        {
            "role": "user",
            "content": "Write a Python script to analyze a CSV file with 10,000 rows."
        }
    ]
)

Pro Tip: Start with effort: "high" for development and testing. Once you understand your workload's complexity, you can dial it down to "medium" or "low" to optimize costs.
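
Once you have a baseline, one option is to choose the effort level per request instead of globally. The sketch below is purely illustrative: `pick_effort` is a hypothetical helper, and the thresholds are assumptions you would replace with whatever your own measurements suggest.

```python
def pick_effort(needs_tools: bool, step_count: int) -> str:
    """Map a rough task description to one of the three effort levels.

    The cut-offs are placeholders, not recommendations.
    """
    if needs_tools or step_count >= 5:
        return "high"
    if step_count >= 2:
        return "medium"
    return "low"
```

The returned string is meant to be passed as the effort value of the request.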

When to Use Adaptive Thinking

Ideal Use Cases

Bimodal tasks — Workloads that mix simple and complex requests. Adaptive thinking saves tokens on easy questions while going deep on hard ones.
Long-horizon agentic workflows — Multi-step tasks where Claude needs to reason between tool calls (e.g., research agents, data analysis pipelines).
Variable complexity workloads — When you can't predict how much thinking each request will need.
Cost-sensitive applications — The effort parameter gives you fine-grained control over spending.

When to Stick with Fixed Budget

If your workload requires predictable latency or precise control over thinking costs, the old thinking: {type: "enabled", budget_tokens: N} approach is still functional on Opus 4.6 and Sonnet 4.6. However, it is deprecated and no longer recommended — plan to migrate to adaptive thinking.

Migration Guide: Moving from Fixed Budget to Adaptive

If you're currently using thinking: {type: "enabled", budget_tokens: N}, here's how to migrate:

Step 1: Update Your Model

Ensure you're using a supported model (Opus 4.6+, Sonnet 4.6+, or Mythos Preview).

Step 2: Change the Thinking Configuration

Before (deprecated):

thinking={
    "type": "enabled",
    "budget_tokens": 8000
}

After (recommended):

thinking={"type": "adaptive"},
output_config={"effort": "high"}  # optional; "high" is the default
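
If thinking configs are scattered across a codebase, the conversion can be centralized. This is a minimal sketch with a hypothetical helper name, `migrate_thinking_config`. It drops `budget_tokens` because adaptive mode takes no budget, and it leaves effort out entirely since that setting is optional.

```python
def migrate_thinking_config(old: dict) -> dict:
    """Convert a fixed-budget thinking config to the adaptive form."""
    if old.get("type") == "enabled":
        # budget_tokens has no adaptive counterpart, so it is dropped.
        return {"type": "adaptive"}
    # Anything else (for example an already adaptive config) is copied as-is.
    return dict(old)
```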

Step 3: Test and Tune

Run your existing test suite with adaptive thinking. You may find that:

Simple requests use fewer tokens than before
Complex requests may use more (but produce better results)
Overall costs may decrease for bimodal workloads
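
To check these expectations against your own suite, you can compare output-token counts from a run before and after the switch. The sketch below assumes you have already collected one count per request in the same order for both runs; `compare_usage` is a hypothetical helper, and the sample numbers are made up for illustration, not measurements.

```python
def compare_usage(before: list[int], after: list[int]) -> dict:
    """Summarize per-request output-token counts from two runs of one suite."""
    if len(before) != len(after):
        raise ValueError("both runs must cover the same requests")
    deltas = [a - b for b, a in zip(before, after)]
    return {
        "total_before": sum(before),
        "total_after": sum(after),
        "cheaper": sum(1 for d in deltas if d < 0),   # requests that used fewer tokens
        "costlier": sum(1 for d in deltas if d > 0),  # requests that used more tokens
    }


# Illustrative numbers only: three simple requests and one complex one.
summary = compare_usage([8200, 8100, 8300, 8150], [300, 250, 11000, 400])
```

With a live run, the counts could come from the usage information returned with each response.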

Step 4: Adjust Effort as Needed

If you notice costs increasing, try reducing the effort level to "medium" or "low".

Best Practices

Always set max_tokens generously: thinking tokens count toward max_tokens, and adaptive thinking can allocate more of them than you might expect on complex tasks. Set max_tokens to at least 2x your expected output length so the visible answer is not cut short.

Monitor thinking blocks — In your response handling, always check for block.type == "thinking" to capture Claude's reasoning. This is valuable for debugging and transparency.

Start with effort: "high" — Get a baseline for your workload's thinking needs before optimizing.

Use for agentic loops — Adaptive thinking shines when Claude needs to reason between tool calls. Build your agents to leverage interleaved thinking.

Test with your actual workload — Synthetic benchmarks may not reflect real-world performance. Run A/B tests with your production prompts.

No beta header required — Unlike some Claude features, adaptive thinking doesn't need any special beta headers. Just set thinking: {type: "adaptive"}.
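
Returning to the max_tokens advice in this list, the sketch below shows one way to size the limit and to notice when a response was cut off. Both helper names are hypothetical, and `SimpleNamespace` stands in for a real response object. The check relies on the response's `stop_reason` being `"max_tokens"` when the limit was hit.

```python
from types import SimpleNamespace


def suggest_max_tokens(expected_output_tokens: int, factor: float = 2.0) -> int:
    """Apply the 'at least 2x expected output' rule of thumb from this guide."""
    return int(expected_output_tokens * factor)


def was_truncated(response) -> bool:
    """True when the response stopped because it ran into max_tokens."""
    return response.stop_reason == "max_tokens"


# Stand-ins for API responses so the sketch runs offline.
cut_off = SimpleNamespace(stop_reason="max_tokens")
complete = SimpleNamespace(stop_reason="end_turn")
```

If `was_truncated` fires often for a workload, raising max_tokens or lowering the effort level are the two obvious levers to try.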

Key Takeaways

Adaptive thinking replaces fixed token budgets — Claude dynamically decides when and how much to think based on request complexity, eliminating the need to guess budget_tokens values.
Interleaved thinking enables better agentic workflows — Claude can reason between tool calls, creating more natural and effective multi-step reasoning loops.
The effort parameter gives you cost control — Use effort: "low", "medium", or "high" to balance reasoning depth against token usage and latency.
Migration is straightforward — Update your thinking configuration from {type: "enabled", budget_tokens: N} to {type: "adaptive"} and optionally add an effort level.
Opus 4.7 requires adaptive thinking — If you're using the latest Opus model, adaptive thinking is your only option (and it's the best one).

Adaptive thinking represents a fundamental shift in how Claude approaches reasoning. By letting the model decide when to think deeply and when to respond quickly, you get better results, lower costs, and simpler code. Start experimenting with it today to see the difference in your AI workflows.