GuideBeginnerPricing2026-05-20

Mastering Adaptive Thinking in Claude: A Practical Guide to Smarter AI Reasoning

Learn how to use Claude's adaptive thinking mode to dynamically allocate reasoning effort, improve agentic workflows, and optimize API costs with practical code examples.

Quick Answer

This guide explains how to use Claude's adaptive thinking mode to let the model dynamically decide when and how much to reason, replacing manual token budgets for better performance in agentic and bimodal tasks.

adaptive thinkingextended thinkingClaude APIagentic workflowsreasoning optimization

Introduction

Claude's extended thinking capability has been a game-changer for complex reasoning tasks, but manually setting a budget_tokens value often leads to either wasted tokens on simple queries or insufficient reasoning on hard problems. Enter adaptive thinking — a smarter, dynamic approach that lets Claude decide when and how much to think based on the complexity of each request.

This guide covers everything you need to know to implement adaptive thinking in your Claude API projects, from basic setup to advanced optimization with the effort parameter.

What Is Adaptive Thinking?

Adaptive thinking is the recommended way to use extended thinking with Claude Opus 4.7, Opus 4.6, Sonnet 4.6, and the Claude Mythos Preview. Instead of requiring you to set a fixed thinking token budget, it allows Claude to evaluate each request independently and determine:

Whether extended thinking is needed at all
How much thinking to allocate

This is especially powerful for bimodal workloads (where some queries are trivial and others are complex) and long-horizon agentic workflows (where the model needs to reason between tool calls).

Key Benefits

Better performance on mixed-complexity tasks compared to fixed budgets
Automatic interleaved thinking — Claude can think between tool calls, improving multi-step reasoning
Simpler API calls — no need to guess a budget_tokens value
Cost efficiency — lower effort levels can skip thinking for simple problems

Supported Models

Model	Adaptive Thinking Support	Notes
Claude Mythos Preview (`claude-mythos-preview`)	✅ Default mode	Thinking cannot be disabled
Claude Opus 4.7 (`claude-opus-4-7`)	✅ Only supported mode	Manual `enabled` mode rejected with 400 error
Claude Opus 4.6 (`claude-opus-4-6`)	✅ Supported	Manual `enabled` deprecated
Claude Sonnet 4.6 (`claude-sonnet-4-6`)	✅ Supported	Manual `enabled` deprecated
Older models (Sonnet 4.5, Opus 4.5, etc.)	❌ Not supported	Must use `thinking.type: "enabled"` with `budget_tokens`

Warning: On Opus 4.6 and Sonnet 4.6, thinking.type: "enabled" with budget_tokens is deprecated and will be removed in a future model release. Migrate to adaptive thinking as soon as possible.

How Adaptive Thinking Works

When you set thinking.type to "adaptive", Claude's internal router evaluates each incoming request and decides:

Is this a thinking-worthy problem? Simple factual questions may skip thinking entirely.
How deep should the reasoning go? Complex math, code generation, or multi-step planning gets more thinking tokens.

At the default effort level (high), Claude almost always thinks. At lower effort levels, it may skip thinking for simpler problems, saving tokens and latency.

Additionally, adaptive thinking automatically enables interleaved thinking — the model can pause between tool calls to reason about results, plan next steps, or adjust its strategy. This is critical for agentic workflows where the model needs to iterate.

Getting Started: Basic Implementation

Prerequisites

An Anthropic API key
The Anthropic Python SDK (pip install anthropic) or TypeScript SDK (npm install @anthropic-ai/sdk)
Access to a supported model

Python Example

import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    messages=[
        {
            "role": "user",
            "content": "Explain why the sum of two even numbers is always even."
        }
    ]
)
for block in response.content:
    if block.type == "thinking":
        print(f"\nThinking: {block.thinking}")
    elif block.type == "text":
        print(f"\nResponse: {block.text}")

TypeScript Example

import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 16000,
  thinking: { type: 'adaptive' },
  messages: [
    {
      role: 'user',
      content: 'Explain why the sum of two even numbers is always even.'
    }
  ]
});
for (const block of response.content) {
  if (block.type === 'thinking') {
    console.log(\nThinking: ${block.thinking});
  } else if (block.type === 'text') {
    console.log(\nResponse: ${block.text});
  }
}

Controlling Thinking Depth with the `effort` Parameter

Adaptive thinking can be combined with an optional effort parameter that acts as a soft guide for how much thinking Claude should do. This is especially useful when you want to balance reasoning quality against latency or cost.

Effort Levels

Effort Level	Behavior	Use Case
`low`	Minimal thinking; may skip for simple queries	High-throughput, low-latency applications
`medium`	Balanced reasoning	General-purpose use
`high`	Maximum reasoning (default)	Complex problem-solving, agentic tasks

Python Example with Effort

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={
        "type": "adaptive",
        "effort": "medium"  # or "low", "high"
    },
    messages=[
        {
            "role": "user",
            "content": "Design a scalable microservices architecture for an e-commerce platform."
        }
    ]
)

TypeScript Example with Effort

const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 16000,
  thinking: {
    type: 'adaptive',
    effort: 'medium'
  },
  messages: [
    {
      role: 'user',
      content: 'Design a scalable microservices architecture for an e-commerce platform.'
    }
  ]
});

Note: The effort parameter is a beta feature. Its behavior may change in future releases. Always test with your specific workload.

Adaptive Thinking in Agentic Workflows

One of the most powerful applications of adaptive thinking is in agentic workflows where Claude uses tools to accomplish multi-step tasks. With interleaved thinking enabled automatically, Claude can:

Receive a complex request (e.g., "Research the latest AI papers and summarize them")
Think about the approach before making the first tool call
Execute a search tool and receive results
Think about the results to decide next steps
Call additional tools or synthesize the final answer

Example: Research Agent with Adaptive Thinking

import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=32000,
    thinking={"type": "adaptive"},
    tools=[
        {
            "name": "web_search",
            "description": "Search the web for information",
            "input_schema": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"}
                },
                "required": ["query"]
            }
        }
    ],
    messages=[
        {
            "role": "user",
            "content": "Find the top 3 AI breakthroughs announced in 2025 and explain their significance."
        }
    ]
)
The response may contain interleaved thinking and tool_use blocks
for block in response.content:
    if block.type == "thinking":
        print(f"[Thinking] {block.thinking[:200]}...")
    elif block.type == "tool_use":
        print(f"[Tool Call] {block.name}({block.input})")
    elif block.type == "text":
        print(f"[Final] {block.text}")

Migrating from Fixed Budget to Adaptive Thinking

If you're currently using thinking.type: "enabled" with budget_tokens, here's how to migrate:

Before (Deprecated)

thinking={
    "type": "enabled",
    "budget_tokens": 8000
}

After (Recommended)

thinking={
    "type": "adaptive"
}

Or with effort control:

thinking={
    "type": "adaptive",
    "effort": "high"
}

Migration Checklist

✅ Update your model to a supported version (Opus 4.7, Opus 4.6, Sonnet 4.6, or Mythos Preview)
✅ Replace thinking.type: "enabled" with thinking.type: "adaptive"
✅ Remove the budget_tokens parameter
✅ Optionally add the effort parameter for finer control
✅ Test with your workloads to ensure performance meets expectations

Best Practices

1. Start with Default Effort

Begin with effort: "high" (or omit it entirely) to get the full benefit of adaptive thinking. Only reduce effort if you have specific latency or cost constraints.

2. Use Adaptive Thinking for Agentic Tasks

If your application involves multi-step reasoning with tools, adaptive thinking is almost always superior to fixed budgets because of interleaved thinking.

3. Monitor Thinking Tokens

Even though you don't set a budget, you can still inspect the thinking blocks in the response to understand how much reasoning Claude performed. Use this to validate that the effort level matches your needs.

4. Combine with Prompt Caching

Adaptive thinking works well with prompt caching. Cache your system prompts and tool definitions to reduce latency and cost, while letting Claude dynamically allocate thinking tokens.

5. Test Bimodal Workloads

If your application handles both simple and complex queries, adaptive thinking shines. Test with a representative sample to see how often Claude skips thinking for easy questions.

Limitations and Considerations

Predictable latency: If you need guaranteed response times, adaptive thinking may not be ideal because thinking depth varies per request. Consider using fixed budget_tokens on Opus 4.6/Sonnet 4.6 if latency predictability is critical.
Cost control: While adaptive thinking can save tokens on simple queries, it may use more tokens on complex ones than a conservative fixed budget. Monitor your usage.
Beta features: The effort parameter is in beta. Its behavior may change, and it may not be available in all regions or for all models.

Conclusion

Adaptive thinking represents a significant evolution in how Claude handles reasoning. By letting the model decide when and how much to think, you can achieve better performance on mixed workloads, simplify your code, and unlock more natural agentic behaviors through interleaved thinking.

Whether you're building a simple Q&A bot or a complex multi-agent system, adaptive thinking is now the recommended path forward. Migrate your existing integrations today to take full advantage of Claude's latest reasoning capabilities.

Key Takeaways

Adaptive thinking lets Claude dynamically decide when and how much to reason, replacing manual budget_tokens for better performance on bimodal and agentic workloads.
Supported on Claude Opus 4.7, Opus 4.6, Sonnet 4.6, and Mythos Preview — older models still require fixed budgets.
Interleaved thinking is automatically enabled, allowing Claude to reason between tool calls in agentic workflows.
The optional effort parameter (low, medium, high) provides soft control over thinking depth without a fixed token budget.
Migrate now if you're using deprecated thinking.type: "enabled" — it will be removed in future model releases.