GuideBeginnerAgents2026-05-17

Adaptive Thinking in Claude API: Smarter Extended Thinking Without Manual Budgets

Learn how to use adaptive thinking in Claude API to let Claude dynamically decide when and how much to think, improving performance on complex tasks without manual token budgets.

Quick Answer

Adaptive thinking lets Claude dynamically allocate thinking tokens based on request complexity, replacing manual budget_tokens. It's the only supported mode on Claude Opus 4.7 and default on Mythos Preview, enabling interleaved thinking for agentic workflows and better performance on bimodal tasks.

extended thinkingadaptive thinkingClaude APIagentic workflowstoken management

Adaptive Thinking in Claude API: Smarter Extended Thinking Without Manual Budgets

Extended thinking has been one of Claude's most powerful capabilities, allowing the model to "think" before responding to complex queries. But manually setting a thinking token budget for every request was often a guessing game—too few tokens and Claude couldn't fully reason through hard problems; too many and you wasted tokens on simple tasks.

Enter adaptive thinking: a new mode that lets Claude dynamically decide when and how much to think, based on the complexity of each individual request. This guide covers everything you need to know to use adaptive thinking effectively in your Claude API applications.

What Is Adaptive Thinking?

Adaptive thinking is the recommended way to use extended thinking with Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6. It's also the default mode on Claude Mythos Preview, where it auto-applies whenever thinking is unset.

Instead of you manually setting a budget_tokens value, Claude evaluates each request and determines:

Whether to use extended thinking at all
How much thinking to allocate

This makes adaptive thinking especially powerful for bimodal workloads (where some requests are trivial and others are complex) and long-horizon agentic workflows (where thinking needs to happen between tool calls).

Key Benefits

No more guesswork: You don't need to estimate thinking budgets per request
Better performance: Adaptive thinking often outperforms fixed budgets on mixed workloads
Interleaved thinking: Claude can think between tool calls automatically, making agents more effective
Cost efficiency: Claude skips thinking on simple problems when appropriate

Supported Models

Model	Adaptive Thinking Support	Notes
Claude Mythos Preview (`claude-mythos-preview`)	Default mode	Auto-applies when thinking is unset; `thinking: {type: "disabled"}` not supported
Claude Opus 4.7 (`claude-opus-4-7`)	Only supported mode	Manual `thinking: {type: "enabled"}` is rejected with a 400 error
Claude Opus 4.6 (`claude-opus-4-6`)	Supported	Manual `budget_tokens` deprecated, plan to migrate
Claude Sonnet 4.6 (`claude-sonnet-4-6`)	Supported	Manual `budget_tokens` deprecated, plan to migrate

Important: Older models (Sonnet 4.5, Opus 4.5, etc.) do not support adaptive thinking and still require thinking: {type: "enabled"} with budget_tokens.

How Adaptive Thinking Works

In adaptive mode, thinking is optional for the model. At the default effort level (high), Claude almost always thinks. At lower effort levels, Claude may skip thinking for simpler problems.

Adaptive thinking also automatically enables interleaved thinking—Claude can think between tool calls. This is a game-changer for agentic workflows where the model needs to reason about tool results before taking the next action.

How to Use Adaptive Thinking

Basic Usage

To enable adaptive thinking, set thinking.type to "adaptive" in your API request. No beta header is required.

import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    messages=[
        {
            "role": "user",
            "content": "Explain why the sum of two even numbers is always even."
        }
    ]
)
for block in response.content:
    if block.type == "thinking":
        print(f"\nThinking: {block.thinking}")
    elif block.type == "text":
        print(f"\nResponse: {block.text}")

TypeScript Example

import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 16000,
  thinking: { type: 'adaptive' },
  messages: [
    {
      role: 'user',
      content: 'Explain why the sum of two even numbers is always even.'
    }
  ]
});
for (const block of response.content) {
  if (block.type === 'thinking') {
    console.log(\nThinking: ${block.thinking});
  } else if (block.type === 'text') {
    console.log(\nResponse: ${block.text});
  }
}

Using the Effort Parameter

You can combine adaptive thinking with the effort parameter to guide how much thinking Claude does. The effort level acts as soft guidance—Claude will generally follow your hint but may still adjust based on actual request complexity.

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={
        "type": "adaptive",
        "effort": "medium"  # Options: low, medium, high
    },
    messages=[
        {
            "role": "user",
            "content": "Write a comprehensive analysis of quantum computing's impact on cryptography."
        }
    ]
)

Effort Levels

Effort Level	Behavior	Best For
`low`	Minimal thinking; may skip for simple tasks	High-throughput, low-complexity workloads
`medium`	Balanced thinking allocation	General-purpose use
`high` (default)	Almost always thinks	Complex reasoning, agentic workflows

Adaptive Thinking in Agentic Workflows

One of the most powerful applications of adaptive thinking is in agentic workflows where Claude uses tools. With interleaved thinking enabled automatically, Claude can:

Receive a complex user request
Think about how to approach it
Call a tool (e.g., search, code execution)
Think about the tool results
Decide on the next action
Continue until the task is complete

This makes adaptive thinking ideal for:

Multi-step research tasks where Claude needs to search, read, and synthesize
Code generation and debugging where Claude writes code, runs it, and iterates
Data analysis pipelines where Claude queries databases, processes results, and generates reports

Example: Agent with Adaptive Thinking

import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=32000,
    thinking={"type": "adaptive"},
    tools=[
        {
            "name": "web_search",
            "description": "Search the web for information",
            "input_schema": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"}
                },
                "required": ["query"]
            }
        }
    ],
    messages=[
        {
            "role": "user",
            "content": "Research the latest breakthroughs in fusion energy and summarize the top 3 developments from 2024."
        }
    ]
)

Migration Guide: Moving from Manual Budgets

If you're currently using thinking: {type: "enabled", budget_tokens: N} on Opus 4.6 or Sonnet 4.6, here's how to migrate:

Step 1: Update Your Code

Before (deprecated):

thinking={
    "type": "enabled",
    "budget_tokens": 8000
}

After (recommended):

thinking={
    "type": "adaptive"
}

Or with effort control:

thinking={
    "type": "adaptive",
    "effort": "high"
}

Step 2: Test and Compare

Run your existing workloads with adaptive thinking and compare:

Response quality: Adaptive thinking often matches or exceeds fixed budgets
Token usage: You may see cost savings on simpler requests
Latency: Adaptive thinking may be faster on simple tasks

Step 3: Remove Beta Headers

If you were using any beta headers for extended thinking, they are no longer required with adaptive thinking.

Best Practices

Start with default effort: Begin with effort: "high" (the default) and adjust down only if you see unnecessary thinking on simple tasks.

Use for agentic workflows: Adaptive thinking shines when Claude needs to reason between tool calls. Enable it for any multi-step agent.

Monitor token usage: While adaptive thinking is efficient, you should still set appropriate max_tokens limits to cap total response size.

Test bimodal workloads: If your application handles both simple Q&A and complex reasoning, adaptive thinking will likely outperform fixed budgets.

Check model compatibility: Remember that older models (Opus 4.5 and earlier) don't support adaptive thinking—you'll need to use manual budgets for those.

Limitations and Considerations

Not supported on older models: You cannot use adaptive thinking on Sonnet 4.5, Opus 4.5, or earlier models.
No precise cost control: If you need predictable latency or exact token budgets, manual budget_tokens is still available on Opus 4.6 and Sonnet 4.6 (though deprecated).
Effort is soft guidance: The effort parameter is a hint, not a hard constraint. Claude may still think more or less than you expect.

Key Takeaways

Adaptive thinking replaces manual budget_tokens on Claude Opus 4.7, Opus 4.6, and Sonnet 4.6, and is the default on Mythos Preview.
Claude dynamically decides when and how much to think based on request complexity, improving both performance and cost efficiency.
Interleaved thinking is automatically enabled, making adaptive thinking ideal for agentic workflows with tool calls.
Use the effort parameter (low, medium, high) to provide soft guidance on thinking allocation.
Migrate now: Manual budget_tokens is deprecated on Opus 4.6 and Sonnet 4.6 and will be removed in a future model release.