GuideBeginnerAgents2026-05-22

Mastering Adaptive Thinking in Claude API: Dynamic Reasoning for Smarter AI Workflows

Learn how to use Claude's adaptive thinking mode to dynamically allocate reasoning, optimize performance, and simplify API calls across Opus 4.7, Sonnet 4.6, and Mythos Preview.

Quick Answer

This guide explains how to use adaptive thinking in Claude API to let the model dynamically decide when and how much to think, replacing manual token budgets. You'll learn setup, effort tuning, and best practices for Opus 4.7, Sonnet 4.6, and Mythos Preview.

adaptive thinkingextended thinkingClaude APIeffort parameteragentic workflows

Introduction

Claude's extended thinking capability has been a game-changer for complex reasoning tasks. But manually setting a budget_tokens value for every request can be tedious and suboptimal—especially when your workload mixes simple lookups with deep analytical queries. Enter adaptive thinking: a smarter, dynamic approach that lets Claude decide when and how much to think based on the complexity of each request.

Adaptive thinking is now the recommended (and in some cases, only) thinking mode for Claude Opus 4.7, Claude Opus 4.6, Claude Sonnet 4.6, and Claude Mythos Preview. This guide will walk you through everything you need to know to adopt adaptive thinking in your API workflows.

What Is Adaptive Thinking?

Adaptive thinking replaces the old thinking: {type: "enabled", budget_tokens: N} pattern with a single flag: thinking: {type: "adaptive"}. When you enable it, Claude evaluates each incoming request and decides:

Whether extended thinking is needed at all
How many thinking tokens to allocate
When to interleave thinking between tool calls (interleaved thinking)

This means you no longer have to guess a token budget upfront. For simple questions, Claude may skip thinking entirely, saving time and cost. For complex, multi-step tasks, it can allocate substantial reasoning resources automatically.

Key Benefits

Better performance on bimodal tasks – Mixes of simple and complex queries within the same conversation
Ideal for long-horizon agentic workflows – Interleaved thinking allows Claude to reason between tool calls
Simpler API code – No need to calculate or adjust budget_tokens per request
Future-proof – Manual budget_tokens is deprecated on Opus 4.6 and Sonnet 4.6, and rejected on Opus 4.7

Supported Models

Model	Adaptive Thinking Support	Notes
Claude Mythos Preview (`claude-mythos-preview`)	Default mode	Thinking is auto-applied; cannot be disabled
Claude Opus 4.7 (`claude-opus-4-7`)	Only supported mode	Manual `thinking: {type: "enabled"}` returns a 400 error
Claude Opus 4.6 (`claude-opus-4-6`)	Supported	Manual `budget_tokens` deprecated
Claude Sonnet 4.6 (`claude-sonnet-4-6`)	Supported	Manual `budget_tokens` deprecated
Older models (Sonnet 4.5, Opus 4.5, etc.)	Not supported	Must use `thinking: {type: "enabled"}` with `budget_tokens`

Warning: If you're using Opus 4.6 or Sonnet 4.6 with budget_tokens, plan to migrate to adaptive thinking soon. The old syntax will be removed in a future model release.

How to Use Adaptive Thinking

Basic Setup

To enable adaptive thinking, set thinking.type to "adaptive" in your API request. Here's a complete Python example:

import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    messages=[
        {
            "role": "user",
            "content": "Explain why the sum of two even numbers is always even."
        }
    ]
)
for block in response.content:
    if block.type == "thinking":
        print(f"\nThinking: {block.thinking}")
    elif block.type == "text":
        print(f"\nResponse: {block.text}")

TypeScript Example

import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 16000,
  thinking: { type: 'adaptive' },
  messages: [
    {
      role: 'user',
      content: 'Explain why the sum of two even numbers is always even.'
    }
  ]
});
for (const block of response.content) {
  if (block.type === 'thinking') {
    console.log(\nThinking: ${block.thinking});
  } else if (block.type === 'text') {
    console.log(\nResponse: ${block.text});
  }
}

That's it! No budget_tokens, no beta headers, no extra configuration. Claude handles the rest.

Controlling Thinking Depth with the Effort Parameter

Adaptive thinking also supports an optional effort parameter that acts as a soft guide for how much thinking Claude should do. This is useful when you want to nudge the model toward deeper reasoning without setting a hard token limit.

Effort Levels

Effort Level	Behavior
`low`	Minimal thinking; Claude may skip thinking for simple queries
`medium`	Balanced approach; moderate reasoning for most tasks
`high` (default)	Claude almost always thinks, even for simple problems

Example with Effort

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={
        "type": "adaptive",
        "effort": "medium"  # Optional: defaults to "high"
    },
    messages=[
        {
            "role": "user",
            "content": "Design a scalable microservice architecture for an e-commerce platform."
        }
    ]
)

When to use each effort level:

low – High-volume, latency-sensitive applications where most queries are simple (e.g., FAQ bots, basic classification)
medium – General-purpose assistants that handle a mix of simple and complex questions
high – Research, analysis, coding, or agentic workflows where thorough reasoning is critical

Adaptive Thinking in Agentic Workflows

One of the most powerful features of adaptive thinking is interleaved thinking—the ability for Claude to think between tool calls. This is automatically enabled when you use adaptive thinking.

Consider a multi-step agent that needs to:

Search a database
Analyze results
Call an external API
Synthesize a final answer

With manual budget_tokens, you'd have to allocate a single thinking budget upfront, which might be too small for the final synthesis or too large for the initial search. Adaptive thinking dynamically adjusts at each step.

Example: Agent with Interleaved Thinking

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=32000,
    thinking={"type": "adaptive"},
    tools=[
        {
            "name": "search_database",
            "description": "Search the internal database",
            "input_schema": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"}
                },
                "required": ["query"]
            }
        },
        {
            "name": "call_external_api",
            "description": "Call an external service",
            "input_schema": {
                "type": "object",
                "properties": {
                    "endpoint": {"type": "string"},
                    "payload": {"type": "object"}
                },
                "required": ["endpoint", "payload"]
            }
        }
    ],
    messages=[
        {
            "role": "user",
            "content": "Find the latest sales data, enrich it with weather data from the external API, and summarize the trends."
        }
    ]
)

Claude will think before each tool call, after receiving results, and before producing the final summary—all without you having to manage token budgets.

Migrating from Manual Budget Tokens

If you're currently using thinking: {type: "enabled", budget_tokens: 8000}, here's your migration checklist:

Identify all API calls using the old syntax on Opus 4.6, Sonnet 4.6, or Opus 4.7
Replace thinking: {type: "enabled", budget_tokens: N} with thinking: {type: "adaptive"}
Optionally add an effort parameter if you need to control thinking depth
Test with a representative sample of your workload to ensure quality remains consistent
Monitor token usage—you may see cost savings on simple queries and better quality on complex ones

Before vs. After

Before (deprecated):

thinking={
    "type": "enabled",
    "budget_tokens": 8000
}

After (recommended):

thinking={
    "type": "adaptive"
}

Best Practices

Start with the default high effort – It provides the most consistent quality. Only lower effort if you have specific latency or cost constraints.
Use adaptive thinking for agentic workflows – Interleaved thinking is a major advantage over manual budgets.
Don't mix adaptive and manual thinking – On Opus 4.7, manual thinking is rejected. On other models, stick to one mode per request.
Monitor your thinking blocks – The API response still includes thinking content blocks, so you can inspect what Claude reasoned.
Test across models – Behavior may vary slightly between Opus 4.7 and Sonnet 4.6. Validate your use case on each.

Limitations and Considerations

Not supported on older models – Sonnet 4.5, Opus 4.5, and earlier still require budget_tokens.
No hard latency guarantees – Because thinking is dynamic, response times may vary more than with a fixed budget.
Effort is a soft guide – Claude may still think more or less than you'd expect at a given effort level.
Mythos Preview auto-applies thinking – You cannot disable thinking on this model.

Conclusion

Adaptive thinking represents a major step forward in how Claude handles reasoning. By removing the need to manually set token budgets, it simplifies your code, improves performance on mixed workloads, and enables more natural agentic behavior through interleaved thinking.

Whether you're building a simple Q&A bot or a complex multi-step agent, adaptive thinking is now the default choice for Claude's most capable models. Migrate today to future-proof your applications and unlock Claude's full reasoning potential.

Key Takeaways

Adaptive thinking replaces manual budget_tokens – Set thinking: {type: "adaptive"} and let Claude decide how much to think per request.
Supported on Opus 4.7, Opus 4.6, Sonnet 4.6, and Mythos Preview – Older models still require the legacy syntax.
Use the effort parameter to softly guide thinking depth: low, medium, or high (default).
Interleaved thinking is automatic – Claude can reason between tool calls, making adaptive thinking ideal for agentic workflows.
Manual budget_tokens is deprecated – Migrate now to avoid breaking changes in future model releases.