BeClaude
Guide2026-04-29

Mastering Adaptive Thinking in Claude: Dynamic Reasoning for Smarter AI Workflows

Learn how to use Claude's adaptive thinking mode to dynamically allocate reasoning effort, improve agentic workflows, and optimize costs across Opus and Sonnet models.

Quick Answer

Adaptive thinking lets Claude dynamically decide when and how much to use extended thinking based on request complexity, replacing manual token budgets. It supports interleaved thinking for tool calls and is the default on Claude Mythos Preview and Opus 4.7.

adaptive thinkingextended thinkingClaude APIagentic workflowsreasoning optimization

Mastering Adaptive Thinking in Claude: Dynamic Reasoning for Smarter AI Workflows

When building with Claude, one of the most powerful capabilities at your disposal is extended thinking—the model's ability to reason step-by-step before generating a response. But until recently, you had to manually set a thinking token budget, which often led to either wasted tokens on simple tasks or insufficient reasoning on complex ones.

Enter adaptive thinking: a smarter, dynamic approach that lets Claude decide when and how much to think. This guide will walk you through everything you need to know to leverage adaptive thinking effectively in your applications.

What Is Adaptive Thinking?

Adaptive thinking is the recommended way to use extended thinking with Claude Opus 4.7, Claude Opus 4.6, Claude Sonnet 4.6, and Claude Mythos Preview. Instead of requiring you to specify a fixed budget_tokens value, adaptive thinking allows Claude to evaluate each request's complexity and determine whether—and how much—extended thinking is needed.

This is a game-changer for several reasons:

  • No more guesswork: You don't need to estimate how much thinking a task requires.
  • Cost efficiency: Simpler requests use fewer tokens; complex ones get the reasoning they deserve.
  • Better performance: For bimodal tasks (mixing simple and complex queries) and long-horizon agentic workflows, adaptive thinking often outperforms fixed budgets.
  • Interleaved thinking: Claude can think between tool calls, making it ideal for multi-step agentic tasks.

Supported Models

Adaptive thinking is supported on the following models:

ModelAPI NameNotes
Claude Mythos Previewclaude-mythos-previewAdaptive thinking is the default; thinking: {"type": "disabled"} is not supported
Claude Opus 4.7claude-opus-4-7Only supported thinking mode; manual thinking: {"type": "enabled"} returns a 400 error
Claude Opus 4.6claude-opus-4-6Manual budget_tokens is deprecated but still functional
Claude Sonnet 4.6claude-sonnet-4-6Manual budget_tokens is deprecated but still functional
Important: On Opus 4.6 and Sonnet 4.6, thinking.type: "enabled" with budget_tokens is deprecated and will be removed in a future model release. Plan to migrate to adaptive thinking with the effort parameter.

Older models (Sonnet 4.5, Opus 4.5, etc.) do not support adaptive thinking and still require thinking.type: "enabled" with budget_tokens.

How Adaptive Thinking Works

In adaptive mode, thinking is optional for the model. Here's what happens under the hood:

  • Claude receives your request.
  • It evaluates the complexity of the task.
  • At the default effort level (high), Claude almost always thinks.
  • At lower effort levels, Claude may skip thinking for simpler problems.
  • When thinking is triggered, Claude uses interleaved thinking—it can reason between tool calls, not just before the final response.
This interleaved capability is especially powerful for agentic workflows where the model needs to:
  • Analyze tool outputs before deciding the next action
  • Re-evaluate its approach after each step
  • Maintain coherent reasoning across multiple tool calls

How to Use Adaptive Thinking in the API

Using adaptive thinking is straightforward. Set thinking.type to "adaptive" in your API request. No beta header is required.

Python Example

import anthropic

client = anthropic.Anthropic()

response = client.messages.create( model="claude-opus-4-7", max_tokens=16000, thinking={"type": "adaptive"}, messages=[ { "role": "user", "content": "Explain why the sum of two even numbers is always even." } ] )

for block in response.content: if block.type == "thinking": print(f"\nThinking: {block.thinking}") elif block.type == "text": print(f"\nResponse: {block.text}")

TypeScript Example

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const response = await client.messages.create({ model: 'claude-opus-4-7', max_tokens: 16000, thinking: { type: 'adaptive' }, messages: [ { role: 'user', content: 'Explain why the sum of two even numbers is always even.' } ] });

for (const block of response.content) { if (block.type === 'thinking') { console.log(\nThinking: ${block.thinking}); } else if (block.type === 'text') { console.log(\nResponse: ${block.text}); } }

cURL Example

curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-opus-4-7",
    "max_tokens": 16000,
    "thinking": {"type": "adaptive"},
    "messages": [
      {"role": "user", "content": "Explain why the sum of two even numbers is always even."}
    ]
  }'

Controlling Thinking Effort

Adaptive thinking comes with an effort parameter that lets you fine-tune how aggressively Claude applies extended thinking. The effort parameter accepts values like low, medium, and high (with high being the default).

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={
        "type": "adaptive",
        "effort": "medium"  # Options: low, medium, high
    },
    messages=[
        {
            "role": "user",
            "content": "What are the key differences between quantum computing and classical computing?"
        }
    ]
)

Use cases for different effort levels:

  • Low: Simple Q&A, quick lookups, straightforward translations
  • Medium: Balanced reasoning for general-purpose tasks
  • High: Complex problem-solving, mathematical proofs, multi-step reasoning

Adaptive Thinking for Agentic Workflows

One of the most compelling use cases for adaptive thinking is in agentic workflows—where Claude uses tools to accomplish multi-step tasks. With interleaved thinking enabled automatically, Claude can:

  • Receive a complex request
  • Think about the approach
  • Call a tool (e.g., search, code execution)
  • Think about the tool output
  • Decide the next action
  • Call another tool
  • Synthesize the final response
This makes adaptive thinking ideal for:
  • Research assistants that search and synthesize information
  • Coding agents that write, test, and debug code
  • Data analysis pipelines that query databases and generate reports
  • Customer support bots that access knowledge bases and ticketing systems

Example: Agentic Workflow with Adaptive Thinking

import anthropic

client = anthropic.Anthropic()

response = client.messages.create( model="claude-opus-4-7", max_tokens=32000, thinking={"type": "adaptive", "effort": "high"}, tools=[ { "name": "search_knowledge_base", "description": "Search the company knowledge base for information", "input_schema": { "type": "object", "properties": { "query": {"type": "string"} }, "required": ["query"] } } ], messages=[ { "role": "user", "content": "Find our return policy and summarize it for a customer." } ] )

Migrating from Fixed Budgets

If you're currently using thinking.type: "enabled" with budget_tokens, here's how to migrate:

Before (deprecated)

thinking={
    "type": "enabled",
    "budget_tokens": 8000
}

After (recommended)

thinking={
    "type": "adaptive",
    "effort": "high"
}

Migration Checklist

  • ✅ Update your model to Opus 4.6+, Sonnet 4.6+, or Mythos Preview
  • ✅ Replace thinking.type: "enabled" with thinking.type: "adaptive"
  • ✅ Remove budget_tokens parameter
  • ✅ Optionally add effort parameter
  • ✅ Test with your workloads to verify performance
  • ✅ Monitor token usage and adjust effort as needed

Best Practices

1. Start with High Effort

For most production workloads, start with effort: "high" to ensure Claude applies thinking when needed. You can dial it down after observing actual usage patterns.

2. Combine with Structured Outputs

Adaptive thinking works well with structured outputs. Use it when you need Claude to reason about complex data before returning a structured response.

3. Monitor Token Usage

Even though adaptive thinking optimizes token usage automatically, it's still good practice to monitor your API costs. The effort parameter gives you a lever to control costs.

4. Use with Prompt Caching

Adaptive thinking is compatible with prompt caching. For long-running agentic workflows, cache your system prompts and tool definitions to reduce latency and costs.

5. Test Bimodal Workloads

If your application handles both simple and complex requests, adaptive thinking shines. Test it against fixed budgets to see the cost savings on simple queries and quality improvements on complex ones.

Limitations and Considerations

  • Not available on older models: Sonnet 4.5, Opus 4.5, and earlier models don't support adaptive thinking.
  • Predictable latency: If your workload requires predictable response times, adaptive thinking may introduce variability since thinking time depends on complexity.
  • Zero Data Retention: Adaptive thinking is eligible for ZDR arrangements, meaning data is not stored after the API response is returned.

Conclusion

Adaptive thinking represents a significant evolution in how Claude handles reasoning. By letting the model dynamically allocate thinking resources, you get better performance, lower costs, and simpler code. Whether you're building a simple Q&A bot or a complex agentic system, adaptive thinking is the recommended approach for modern Claude applications.

Key Takeaways

  • Adaptive thinking replaces manual token budgets on Opus 4.7, Opus 4.6, Sonnet 4.6, and Mythos Preview, letting Claude dynamically decide when to use extended thinking.
  • Interleaved thinking is automatically enabled, allowing Claude to reason between tool calls—ideal for agentic workflows.
  • Use the effort parameter (low, medium, high) to control how aggressively Claude applies thinking, with high being the default.
  • Migrate from deprecated budget_tokens by switching to thinking.type: "adaptive" and optionally setting an effort level.
  • Adaptive thinking drives better performance for bimodal tasks and complex multi-step workflows while optimizing token usage.