Guide · 2026-04-30

Mastering Extended Thinking in Claude: A Complete Guide to Adaptive Reasoning

Learn how to use Claude's extended thinking capabilities for complex tasks. Covers adaptive thinking, effort parameters, budget tokens, and practical API examples.

Quick Answer

This guide explains how to enable and optimize Claude's extended thinking feature for complex reasoning tasks. You'll learn about adaptive thinking, effort parameters, budget tokens, and how to implement them in your API calls.

extended thinking, adaptive thinking, Claude API, reasoning, AI optimization

Introduction

Claude's extended thinking feature lets the model work through a problem step by step and expose its reasoning before delivering a final answer. Whether you're building a research assistant, a code analysis tool, or a complex decision-making system, knowing how to configure it helps you balance reasoning depth against latency and token cost.

This guide covers everything you need to know: from the basics of enabling extended thinking to advanced configuration with adaptive thinking and effort parameters.

How Extended Thinking Works

When extended thinking is enabled, Claude creates special thinking content blocks in its response. These blocks contain the model's internal reasoning: its step-by-step analysis before crafting the final answer. In the API response, the thinking blocks come first, followed by the text content blocks.

Here's what a typical response looks like:

{
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me analyze this step by step...",
      "signature": "WaUjzkypQ2mUEVM36O2TxuC06KN8xyfbJwyem2dw3URve/op91XWHOEBLLqIOMfFG/UvLEczmEsUjavL...."
    },
    {
      "type": "text",
      "text": "Based on my analysis..."
    }
  ]
}

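Each thinking block carries a signature field. It is an opaque value that the API uses to verify that the block was generated by Claude and has not been altered when you send it back in a later request, so pass thinking blocks back unmodified rather than editing or parsing the signature.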
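
To make the response shape above concrete, here is a minimal sketch that separates reasoning from the answer. It works on the parsed JSON rather than on SDK objects; the helper name `split_blocks` and the shortened signature string are illustrative placeholders, not part of the API.

```python
import json

# Sample payload mirroring the response shape shown above.
# The signature value is a shortened placeholder, not a real signature.
RAW_RESPONSE = """
{
  "content": [
    {"type": "thinking",
     "thinking": "Let me analyze this step by step...",
     "signature": "placeholder-signature"},
    {"type": "text", "text": "Based on my analysis..."}
  ]
}
"""


def split_blocks(content):
    """Return (reasoning, answer) joined from thinking and text blocks."""
    thinking_parts, text_parts = [], []
    for block in content:
        kind = block.get("type")
        if kind == "thinking":
            thinking_parts.append(block["thinking"])
        elif kind == "text":
            text_parts.append(block["text"])
        # Any other block type is ignored in this sketch.
    return "\n".join(thinking_parts), "\n".join(text_parts)


reasoning, answer = split_blocks(json.loads(RAW_RESPONSE)["content"])
print("Reasoning:", reasoning)
print("Answer:", answer)
```

Keeping the two streams separate makes it easy to log the reasoning while showing only the answer to end users.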

Adaptive Thinking: The Modern Approach

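For Claude Opus 4.7 and later models, adaptive thinking is the recommended way to manage reasoning, and per the table under Model-Specific Behavior it is also the preferred mode on Claude Opus 4.6 and Claude Sonnet 4.6. Instead of manually setting a fixed token budget, you use the effort parameter to tell Claude how much reasoning effort to apply, and the model decides how much thinking each request needs.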

Effort Parameter

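The effort parameter accepts the following values (this guide covers three; check the API reference for any additional levels your model supports):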

"low": Minimal reasoning, suitable for simple tasks
"medium": Balanced reasoning, good for most complex tasks
"high": Maximum reasoning, ideal for very complex problems

Task Budgets (Beta)

For finer control, you can combine adaptive thinking with a task_budget parameter. This lets you specify a maximum number of tokens Claude should use for thinking, while still allowing adaptive allocation within that budget.

import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=32000,
    thinking={
        "type": "adaptive",
        "task_budget": 16000  # Optional (beta): max thinking tokens
    },
    output_config={"effort": "high"},  # effort is passed via output_config
    messages=[
        {"role": "user", "content": "Analyze the implications of quantum computing on cryptography"}
    ]
)

Fast Mode (Research Preview)

Fast mode is an experimental feature that reduces thinking time for simpler queries. It's useful when you need quick responses but still want some reasoning capability. Enable it by setting fast_mode: true in the thinking configuration.

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=32000,
    thinking={
        "type": "adaptive",
        "fast_mode": True
    },
    output_config={"effort": "medium"},  # effort is passed via output_config
    messages=[
        {"role": "user", "content": "What's 15% of 200?"}
    ]
)

Manual Extended Thinking (Legacy)

For older models (Claude Opus 4.6, Claude Sonnet 4.6, and earlier), you can still use manual extended thinking with a fixed token budget. However, this approach is deprecated and will be removed in future model releases.

# Legacy approach - not recommended for new projects
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=16000,
    thinking={
        "type": "enabled",
        "budget_tokens": 8000  # Fixed thinking budget
    },
    messages=[
        {"role": "user", "content": "Solve this complex math problem..."}
    ]
)
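
Before sending a legacy request, it can help to sanity-check the budget locally. The sketch below assumes two constraints as we understand them from the API documentation: budget_tokens must be at least 1,024 and must be smaller than max_tokens. The function name `validate_manual_thinking` is hypothetical; verify both limits against the current reference for your model.

```python
MIN_BUDGET_TOKENS = 1024  # assumed minimum thinking budget; verify for your model


def validate_manual_thinking(max_tokens, budget_tokens):
    """Return a legacy thinking config, or raise ValueError if the budget is unusable."""
    if budget_tokens < MIN_BUDGET_TOKENS:
        raise ValueError(
            f"budget_tokens must be at least {MIN_BUDGET_TOKENS}, got {budget_tokens}"
        )
    if budget_tokens >= max_tokens:
        # The thinking budget is spent out of max_tokens, so it has to leave
        # room for the final answer.
        raise ValueError(
            f"budget_tokens ({budget_tokens}) must be less than max_tokens ({max_tokens})"
        )
    return {"type": "enabled", "budget_tokens": budget_tokens}


# Same numbers as the legacy example above.
legacy_thinking = validate_manual_thinking(max_tokens=16000, budget_tokens=8000)
print(legacy_thinking)
```

Failing fast on the client side gives a clearer error message than waiting for the API to reject the request.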

Model-Specific Behavior

Different Claude models handle extended thinking differently:

Model	Adaptive Thinking	Manual Thinking	Notes
Claude Opus 4.7+	✅ Recommended	❌ Not supported	Use `effort` parameter
Claude Mythos Preview	✅ Default	✅ Accepted	`thinking: {"type": "disabled"}` not supported
Claude Opus 4.6	✅ Recommended	✅ Deprecated	Adaptive preferred
Claude Sonnet 4.6	✅ Recommended	✅ Deprecated	Interleaved mode available

For Claude Mythos Preview, adaptive thinking is the default. You cannot disable thinking entirely. If you don't want thinking content in the response, set display: "omitted" instead of type: "disabled". Use display: "summarized" to receive summaries of the thinking process.
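
As a rough sketch of the rule above, the helper below builds a thinking configuration that never uses type "disabled" and optionally sets a display mode. It assumes that display is a key inside the thinking object, which is how the paragraph above reads; the function name `mythos_thinking` is hypothetical.

```python
DISPLAY_MODES = ("summarized", "omitted")  # the two modes named in this guide


def mythos_thinking(display=None):
    """Build an adaptive thinking config, optionally controlling what is returned."""
    config = {"type": "adaptive"}
    if display is not None:
        if display not in DISPLAY_MODES:
            raise ValueError(
                f"display must be one of {DISPLAY_MODES}, got {display!r}"
            )
        config["display"] = display  # assumed placement: inside the thinking object
    return config


print(mythos_thinking())              # default behaviour
print(mythos_thinking("omitted"))     # no thinking content in the response
print(mythos_thinking("summarized"))  # summaries of the thinking process
```

Rejecting unknown display values locally keeps a stray "disabled" from ever reaching the API.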

Practical API Examples

Basic Adaptive Thinking

import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=32000,
    thinking={"type": "adaptive"},
    output_config={"effort": "medium"},  # effort is passed via output_config
    messages=[
        {"role": "user", "content": "Explain the theory of relativity in simple terms"}
    ]
)
# Access thinking content
for block in response.content:
    if block.type == "thinking":
        print(f"Thinking: {block.thinking}")
    elif block.type == "text":
        print(f"Answer: {block.text}")

TypeScript Example

import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 32000,
  thinking: { type: 'adaptive' },
  output_config: { effort: 'high' }, // effort is passed via output_config
  messages: [
    { role: 'user', content: 'Debug this code and explain the fix...' }
  ]
});
for (const block of response.content) {
  if (block.type === 'thinking') {
    console.log('Thinking:', block.thinking);
  } else if (block.type === 'text') {
    console.log('Answer:', block.text);
  }
}

Using with Streaming

Extended thinking works with streaming responses. Thinking blocks are streamed first, as thinking_delta events, followed by the text blocks as text_delta events:

import anthropic
client = anthropic.Anthropic()
with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=32000,
    thinking={"type": "adaptive"},
    output_config={"effort": "high"},  # effort is passed via output_config
    messages=[
        {"role": "user", "content": "Write a detailed analysis of..."}
    ]
) as stream:
    for event in stream:
        if event.type == "content_block_delta" and event.delta.type == "thinking_delta":
            print(event.delta.thinking, end="")
        elif event.type == "content_block_delta" and event.delta.type == "text_delta":
            print(event.delta.text, end="")
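
If you need the full reasoning and the full answer as separate strings rather than printed output, the same event loop can accumulate them. The sketch below is one possible approach, exercised with stand-in event objects so it runs without a network call; `collect_stream` and `_delta` are hypothetical helpers, and the fake events imitate only the fields used in the loop above.

```python
from types import SimpleNamespace


def collect_stream(events):
    """Accumulate thinking and text deltas from a stream of events."""
    thinking_parts, text_parts = [], []
    for event in events:
        if event.type != "content_block_delta":
            continue  # skip message_start, content_block_stop, and so on
        delta = event.delta
        if delta.type == "thinking_delta":
            thinking_parts.append(delta.thinking)
        elif delta.type == "text_delta":
            text_parts.append(delta.text)
        # Other delta types are ignored in this sketch.
    return "".join(thinking_parts), "".join(text_parts)


def _delta(kind, **fields):
    """Build a stand-in content_block_delta event for local testing."""
    return SimpleNamespace(
        type="content_block_delta",
        delta=SimpleNamespace(type=kind, **fields),
    )


fake_events = [
    SimpleNamespace(type="message_start"),
    _delta("thinking_delta", thinking="Step 1. "),
    _delta("thinking_delta", thinking="Step 2."),
    _delta("text_delta", text="Final "),
    _delta("text_delta", text="answer."),
    SimpleNamespace(type="message_stop"),
]

streamed_thinking, streamed_answer = collect_stream(fake_events)
print(streamed_thinking)
print(streamed_answer)
```

In a real call you would pass the stream object from the with block in place of fake_events.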

Best Practices

Choose the right effort level: Start with "medium" for most tasks. Use "high" only for very complex problems that require deep reasoning. Use "low" for simple queries where speed matters.

Set appropriate max_tokens: Extended thinking consumes tokens from your max_tokens budget. Ensure you allocate enough tokens for both thinking and the final response. A good rule of thumb: set max_tokens to at least 2x your expected thinking budget.

Use task budgets for cost control: If you're concerned about token usage, set a task_budget to cap the thinking tokens while still allowing adaptive allocation.

Handle thinking content appropriately: The thinking blocks contain the model's internal reasoning. You can display them to users for transparency, or omit them for a cleaner UI.

Combine with other features: Extended thinking works well with tools, structured outputs, and citations. For example, you can have Claude think through a problem before calling a tool or generating a structured response.
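
The first two practices above can be sketched as a pair of small helpers. This is an illustration under stated assumptions, not an official heuristic: the function names and the boolean inputs are hypothetical, and the sizing rule simply applies the "at least 2x the expected thinking budget" rule of thumb while also making sure thinking plus answer fits.

```python
EFFORT_LEVELS = ("low", "medium", "high")


def choose_effort(very_complex, speed_matters=False):
    """Pick an effort level following the guidance above."""
    if very_complex:
        return "high"    # deep reasoning for very complex problems
    if speed_matters:
        return "low"     # simple queries where speed matters
    return "medium"      # sensible starting point for most tasks


def plan_max_tokens(expected_thinking, expected_answer):
    """Return a max_tokens value that covers both thinking and the answer."""
    if expected_thinking < 0 or expected_answer < 0:
        raise ValueError("token estimates must be non-negative")
    # Rule of thumb: at least 2x the thinking estimate, and never less
    # than thinking and answer combined.
    return max(2 * expected_thinking, expected_thinking + expected_answer)


print(choose_effort(very_complex=False))   # medium
print(plan_max_tokens(8000, 2000))         # 16000
print(plan_max_tokens(8000, 12000))        # 20000
```

The token estimates are yours to supply; the helper only turns them into a max_tokens value that leaves room for the final response.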

Troubleshooting

400 Error on Opus 4.7+: If you get a 400 error, you're likely using manual thinking (type: "enabled") on a model that only supports adaptive thinking. Switch to type: "adaptive" with the effort parameter.

Thinking content missing: If you're not seeing thinking blocks in the response, check that you've enabled extended thinking in your API call. Also, some models (like Mythos Preview) may omit thinking content by default.

Token limits: If Claude stops thinking prematurely, increase your max_tokens or task_budget values.
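
The three checks above can be combined into a small diagnostic helper. This is a sketch under assumptions: the list of adaptive-only models is taken from the table in this guide, the function name `diagnose` is hypothetical, and the only API field relied on is the response's stop_reason, which is "max_tokens" when generation was cut off by the token limit.

```python
# Assumption: models that reject manual thinking, per the table in this guide.
ADAPTIVE_ONLY_MODELS = ("claude-opus-4-7",)


def diagnose(model, thinking, stop_reason=None):
    """Return a list of likely problems for a request/response pair."""
    findings = []
    if thinking is None:
        findings.append("thinking is not configured, so no thinking blocks are returned")
    elif thinking.get("type") == "enabled" and model.startswith(ADAPTIVE_ONLY_MODELS):
        findings.append("manual thinking on an adaptive-only model: use type 'adaptive'")
    if stop_reason == "max_tokens":
        findings.append("response hit the token limit: increase max_tokens")
    return findings


for issue in diagnose(
    "claude-opus-4-7",
    {"type": "enabled", "budget_tokens": 8000},
    stop_reason="max_tokens",
):
    print("-", issue)
```

Running a check like this before or after a call turns a bare 400 error or a truncated answer into an actionable message.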

Key Takeaways

Extended thinking enhances Claude's reasoning by allowing step-by-step analysis before generating the final answer, with transparent thinking blocks in the response.
Adaptive thinking is the modern approach for Claude Opus 4.7+ models. Use the effort parameter (low, medium, high) instead of manual token budgets.
Manual thinking is deprecated on Claude Opus 4.6 and Claude Sonnet 4.6 and is not supported on Claude Opus 4.7+. Migrate your code to adaptive thinking for future compatibility.
Combine with streaming and tools for powerful applications. Extended thinking works seamlessly with streaming responses and tool use.
Control costs with task budgets and appropriate effort levels to balance reasoning depth with token consumption.