BeClaude
GuideBeginnerPricing2026-05-15

Adaptive Thinking in Claude: Smarter, Dynamic Reasoning for Complex Tasks

Learn how to use Claude's adaptive thinking mode to dynamically allocate reasoning effort, improve agentic workflows, and optimize API costs without manual token budgets.

Quick Answer

This guide explains how to enable and use adaptive thinking in Claude's API, a mode that lets Claude dynamically decide when and how much to reason, replacing manual token budgets for smarter, cost-effective performance.

adaptive thinkingextended thinkingClaude APIagentic workflowstoken optimization

Adaptive Thinking in Claude: Smarter, Dynamic Reasoning for Complex Tasks

Claude's adaptive thinking is a game-changer for developers who want to harness the power of extended thinking without the overhead of manually setting token budgets. Instead of guessing how much reasoning a task needs, you let Claude decide—dynamically, per request. This guide walks you through what adaptive thinking is, why it matters, and how to use it effectively in your API calls.

What Is Adaptive Thinking?

Adaptive thinking is the recommended way to use extended thinking with Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6. It is also the default mode on Claude Mythos Preview. Instead of you setting a fixed budget_tokens value, Claude evaluates each request's complexity and determines whether—and how much—to engage extended thinking.

This is especially powerful for bimodal tasks (where some requests are trivial and others require deep reasoning) and long-horizon agentic workflows (where Claude needs to think between tool calls).

Key Benefits

  • No manual token budgeting – Claude handles the allocation automatically.
  • Interleaved thinking – Claude can think between tool calls, improving multi-step agent performance.
  • Cost efficiency – Simpler requests use fewer thinking tokens; complex ones get the reasoning they need.
  • Better performance – For many workloads, adaptive thinking outperforms fixed-budget extended thinking.

Supported Models

ModelAdaptive Thinking SupportNotes
Claude Mythos Preview (claude-mythos-preview)Default (auto-applies)thinking: {"type": "disabled"} is not supported
Claude Opus 4.7 (claude-opus-4-7)Only supported modeManual thinking: {"type": "enabled"} is rejected (400 error)
Claude Opus 4.6 (claude-opus-4-6)SupportedManual budget_tokens is deprecated
Claude Sonnet 4.6 (claude-sonnet-4-6)SupportedManual budget_tokens is deprecated
Important: Older models (Sonnet 4.5, Opus 4.5, etc.) do not support adaptive thinking. They require thinking.type: "enabled" with budget_tokens.

How Adaptive Thinking Works

In adaptive mode, thinking is optional for Claude. At the default effort level (high), Claude almost always thinks. At lower effort levels, Claude may skip thinking for simpler problems, saving tokens and reducing latency.

Adaptive thinking also automatically enables interleaved thinking—Claude can think between tool calls. This is a huge advantage for agentic workflows where the model needs to reason after receiving tool results before deciding the next action.

How to Use Adaptive Thinking in the API

Using adaptive thinking is straightforward. Set thinking.type to "adaptive" in your API request.

Python Example

import anthropic

client = anthropic.Anthropic()

response = client.messages.create( model="claude-opus-4-7", max_tokens=16000, thinking={"type": "adaptive"}, messages=[ { "role": "user", "content": "Explain why the sum of two even numbers is always even." } ] )

for block in response.content: if block.type == "thinking": print(f"\nThinking: {block.thinking}") elif block.type == "text": print(f"\nResponse: {block.text}")

TypeScript Example

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const response = await client.messages.create({ model: 'claude-opus-4-7', max_tokens: 16000, thinking: { type: 'adaptive' }, messages: [ { role: 'user', content: 'Explain why the sum of two even numbers is always even.' } ] });

for (const block of response.content) { if (block.type === 'thinking') { console.log(\nThinking: ${block.thinking}); } else if (block.type === 'text') { console.log(\nResponse: ${block.text}); } }

cURL Example

curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-opus-4-7",
    "max_tokens": 16000,
    "thinking": {"type": "adaptive"},
    "messages": [
      {"role": "user", "content": "Explain why the sum of two even numbers is always even."}
    ]
  }'

Controlling Thinking with the Effort Parameter

You can fine-tune how much thinking Claude does by combining adaptive thinking with the effort parameter. The effort level acts as soft guidance—not a hard budget.

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={
        "type": "adaptive",
        "effort": "medium"  # Options: low, medium, high
    },
    messages=[
        {"role": "user", "content": "Write a detailed analysis of quantum computing's impact on cryptography."}
    ]
)

Effort Levels Explained

Effort LevelBehaviorUse Case
lowMinimal thinking; may skip for simple tasksHigh-throughput, low-complexity requests
mediumBalanced reasoningGeneral-purpose use
high (default)Almost always thinksComplex reasoning, agentic workflows

Task Budgets (Beta)

For even finer control, you can set a task budget—a soft cap on thinking tokens for the entire conversation or a specific task. This is useful when you want to limit costs while still benefiting from adaptive allocation.

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={
        "type": "adaptive",
        "budget_tokens": 8000  # Soft cap on thinking tokens
    },
    messages=[...]
)
Note: Task budgets are in beta and may change. They act as guidance, not hard limits.

Fast Mode (Research Preview)

For latency-sensitive applications, you can enable fast mode to reduce thinking time at the cost of some reasoning depth.

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={
        "type": "adaptive",
        "fast_mode": True  # Research preview
    },
    messages=[...]
)

Best Practices for Adaptive Thinking

1. Start with Default Effort

Begin with effort: "high" (or omit it) to see how Claude behaves. Then adjust down if you're overpaying for simple tasks.

2. Use Adaptive Thinking for Agentic Workflows

Adaptive thinking shines when Claude needs to reason between tool calls. The interleaved thinking capability means Claude can think after receiving tool results, leading to better multi-step decisions.

3. Combine with Prompt Caching

Adaptive thinking works well with prompt caching. Cache your system prompts and tool definitions, then let Claude think adaptively on each user request.

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    system=[
        {
            "type": "text",
            "text": "You are a helpful assistant with access to tools.",
            "cache_control": {"type": "ephemeral"}
        }
    ],
    messages=[...]
)

4. Monitor Thinking Tokens

Even though you don't set a budget, you can still monitor how many thinking tokens Claude uses by inspecting the response's usage field.

print(f"Input tokens: {response.usage.input_tokens}")
print(f"Output tokens: {response.usage.output_tokens}")

Thinking tokens are included in output_tokens

5. Migrate from Manual Budgets

If you're currently using thinking: {"type": "enabled", "budget_tokens": N} on Opus 4.6 or Sonnet 4.6, plan to migrate to adaptive thinking. The manual approach is deprecated and will be removed in a future model release.

Common Pitfalls

  • Using adaptive thinking on unsupported models: Older models (Sonnet 4.5, Opus 4.5) will ignore the adaptive setting or error out.
  • Setting both adaptive and manual thinking: You cannot mix adaptive with enabled in the same request.
  • Expecting hard budgets: The effort parameter and task budgets are soft guidance, not hard limits.

Conclusion

Adaptive thinking represents a major step forward in how Claude handles reasoning. By letting the model decide when and how much to think, you get better performance, lower costs, and simpler code. Whether you're building a simple Q&A bot or a complex agentic system, adaptive thinking is the recommended approach for Claude Opus 4.7, Opus 4.6, and Sonnet 4.6.

Key Takeaways

  • Adaptive thinking lets Claude dynamically allocate reasoning effort per request, replacing manual token budgets for supported models.
  • It enables interleaved thinking between tool calls, making it ideal for agentic workflows.
  • Use the effort parameter (low, medium, high) to guide how much thinking Claude does.
  • Manual budget_tokens is deprecated on Opus 4.6 and Sonnet 4.6—migrate to adaptive thinking.
  • Monitor usage to understand how many tokens Claude is spending on thinking, and adjust effort levels accordingly.