Guide · Beginner · 2026-05-06

Mastering Claude's Extended Thinking: From Manual Control to Adaptive Reasoning

Learn how to leverage Claude's extended thinking capabilities for complex reasoning tasks. Covers adaptive thinking, effort parameters, and practical API implementation examples.

Quick Answer

Extended thinking gives Claude enhanced reasoning for complex tasks. For Claude Opus 4.7+, use adaptive thinking with the effort parameter instead of manual budget_tokens. This guide covers setup, model-specific behavior, and practical implementation.

Tags: extended thinking, Claude API, adaptive thinking, reasoning, AI best practices


Extended thinking is one of Claude's most powerful features—it gives the model enhanced reasoning capabilities for complex tasks while providing varying levels of transparency into its step-by-step thought process. Whether you're building a mathematical proof assistant, a multi-step analysis tool, or a complex decision-making system, understanding extended thinking is essential for getting the most out of Claude.

In this guide, we'll cover everything you need to know: what extended thinking is, how it works under the hood, the shift from manual to adaptive thinking, and practical code examples you can implement today.

What Is Extended Thinking?

When extended thinking is enabled, Claude creates internal "thinking content blocks" where it outputs its reasoning before crafting a final response. Think of it as Claude's scratchpad—it works through the problem step by step, then uses those insights to produce a polished answer.

The API response includes both thinking content blocks and text content blocks. Here's what the default response format looks like:

{
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me analyze this step by step...",
      "signature": "WaUjzkypQ2mUEVM36O2TxuC06KN8xyfbJwyem2dw3URve/op91XWHOEBLLqIOMfFG/UvLEczmEsUjavL...."
    },
    {
      "type": "text",
      "text": "Based on my analysis..."
    }
  ]
}

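The signature field is an opaque value used for verification: when thinking blocks are passed back to the API, it lets the API confirm that they were generated by Claude and have not been modified. Treat it as a token to store and return as-is, not something to parse.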
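
As a minimal sketch of consuming this format, the helper below separates the reasoning from the final answer. It works on the plain JSON shape shown above (a dict with a content list). The function name split_blocks is an illustrative choice, not part of any SDK.

```python
from typing import Any


def split_blocks(response: dict[str, Any]) -> tuple[list[str], list[str]]:
    """Return (thinking_texts, answer_texts) from a response dict."""
    thinking: list[str] = []
    answers: list[str] = []
    for block in response.get("content", []):
        if block.get("type") == "thinking":
            thinking.append(block.get("thinking", ""))
        elif block.get("type") == "text":
            answers.append(block.get("text", ""))
        # Any other block type is ignored in this sketch.
    return thinking, answers


# Sample payload mirroring the response format shown above.
sample = {
    "content": [
        {
            "type": "thinking",
            "thinking": "Let me analyze this step by step...",
            "signature": "...",
        },
        {"type": "text", "text": "Based on my analysis..."},
    ]
}

thinking, answers = split_blocks(sample)
print(thinking)
print(answers)
```

Keeping the two lists separate makes it easy to log the reasoning while showing only the answer to end users.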

The Big Shift: From Manual to Adaptive Thinking

If you've used extended thinking before, you're likely familiar with the manual approach: setting thinking: {type: "enabled", budget_tokens: N}. That's changing.

What's New?

For Claude Opus 4.7 and later models, manual extended thinking is no longer supported. Attempting to use it returns a 400 error. Instead, you must use adaptive thinking with the effort parameter:

{
  "thinking": {
    "type": "adaptive",
    "effort": "high"
  }
}

Why the Change?

Adaptive thinking removes the guesswork. Instead of you estimating how many tokens Claude should spend on reasoning (the old budget_tokens approach), Claude decides how much thinking is needed based on the complexity of the task. This means:

Better resource allocation: Simple tasks don't waste tokens on overthinking
Improved results: Complex tasks get the reasoning depth they deserve
Simpler code: No more trial-and-error with budget_tokens values

Model-Specific Behavior

Here's how different models handle extended thinking:

Model	Recommended Approach	Notes
Claude Opus 4.7+	Adaptive thinking only	Manual mode returns 400 error
Claude Opus 4.6	Adaptive thinking recommended	Manual mode deprecated but functional
Claude Sonnet 4.6	Adaptive thinking recommended	Manual mode deprecated; interleaved mode deprecated
Claude Mythos Preview	Adaptive thinking default	Manual mode accepted; thinking disabled not supported
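
The table can be turned into a small pre-flight check that runs before a request is sent. This is only a sketch: the model ID pattern (claude-<family>-<major>-<minor>) and the function name manual_mode_status are assumptions made for illustration, and Claude Mythos Preview is left out because its model ID is not given here.

```python
import re


def manual_mode_status(model: str) -> str:
    """Classify manual budget_tokens support for a model ID.

    Returns "rejected", "deprecated", or "unknown". The ID format is an
    assumption; adjust the pattern to the IDs you actually use.
    """
    match = re.fullmatch(r"claude-(opus|sonnet)-(\d+)-(\d+)", model)
    if match is None:
        return "unknown"
    family = match.group(1)
    version = (int(match.group(2)), int(match.group(3)))
    if family == "opus" and version >= (4, 7):
        return "rejected"    # Table: manual mode returns a 400 error
    if version == (4, 6):
        return "deprecated"  # Table: Opus 4.6 and Sonnet 4.6
    return "unknown"         # Not covered by the table


for model_id in ("claude-opus-4-7", "claude-opus-4-6", "claude-sonnet-4-6"):
    print(model_id, manual_mode_status(model_id))
```

Failing fast on "rejected" in your own code gives a clearer message than waiting for the API's 400 response.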

How to Use Extended Thinking in Practice

Basic Setup with Adaptive Thinking (Recommended)

Here's how to enable extended thinking using the adaptive approach with Python:

import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=16000,
    thinking={
        "type": "adaptive",
        "effort": "high"  # Options: "low", "medium", "high"
    },
    messages=[
        {
            "role": "user",
            "content": "Are there an infinite number of prime numbers such that n mod 4 == 3?"
        }
    ]
)
# Process the response
for block in response.content:
    if block.type == "thinking":
        print(f"\nThinking: {block.thinking}")
    elif block.type == "text":
        print(f"\nFinal answer: {block.text}")

TypeScript Example

import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
  model: 'claude-sonnet-4-6',
  max_tokens: 16000,
  thinking: {
    type: 'adaptive',
    effort: 'high'
  },
  messages: [
    {
      role: 'user',
      content: 'Design a distributed caching system that handles cache invalidation across regions.'
    }
  ]
});
for (const block of response.content) {
  if (block.type === 'thinking') {
    console.log('Thinking:', block.thinking);
  } else if (block.type === 'text') {
    console.log('Final answer:', block.text);
  }
}

Working with the Effort Parameter

The effort parameter accepts three values:

low: Minimal reasoning—good for simple tasks where you just want a quick check
medium: Balanced reasoning—suitable for most tasks
high: Maximum reasoning—ideal for complex analysis, proofs, or multi-step problems

Start with medium and adjust based on your results. For most applications, high is unnecessary and may increase latency.
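
The "start with medium and adjust" advice can be sketched as a simple tuning rule. The policy below is hypothetical: the function name adjust_effort and the two feedback flags are illustrative, and only the three effort values come from the list above.

```python
EFFORT_LEVELS = ("low", "medium", "high")


def adjust_effort(current: str, *, answer_ok: bool, too_slow: bool) -> str:
    """Pick the next effort level from feedback on the last response.

    Quality takes priority: step up when the answer was not good enough,
    step down only when the answer was fine but latency was a problem.
    """
    index = EFFORT_LEVELS.index(current)  # ValueError for unknown levels
    if not answer_ok:
        index = min(index + 1, len(EFFORT_LEVELS) - 1)
    elif too_slow:
        index = max(index - 1, 0)
    return EFFORT_LEVELS[index]


effort = "medium"  # recommended starting point
effort = adjust_effort(effort, answer_ok=False, too_slow=False)
print(effort)
```

In practice the feedback would come from your own evaluation of sample responses, not from the API.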

Advanced: Task Budgets and Fast Mode

Task Budgets (Beta)

Task budgets allow you to set a maximum token limit for the entire thinking process. This is useful when you need predictable costs:

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=32000,
    thinking={
        "type": "adaptive",
        "effort": "high",
        "budget_tokens": 20000  # Maximum tokens for thinking
    },
    messages=[...]
)

Fast Mode (Research Preview)

Fast mode trades some reasoning depth for significantly faster responses. This is useful for real-time applications where speed matters more than perfect reasoning:

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=16000,
    thinking={
        "type": "adaptive",
        "effort": "medium",
        "fast_mode": True  # Research preview
    },
    messages=[...]
)

Best Practices for Extended Thinking

1. Set Appropriate max_tokens

Extended thinking consumes tokens from your max_tokens budget. If you set max_tokens too low, Claude may run out of room before completing its reasoning. A good rule of thumb: set max_tokens to at least 1.5x your expected thinking budget.
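
The rule of thumb is plain arithmetic, sketched here as a helper. The name min_max_tokens is illustrative, and the 1.5 factor is the guideline from the paragraph above, not an API requirement.

```python
import math


def min_max_tokens(expected_thinking_tokens: int, factor: float = 1.5) -> int:
    """Smallest max_tokens that satisfies the 1.5x rule of thumb."""
    if expected_thinking_tokens < 0:
        raise ValueError("expected_thinking_tokens must be non-negative")
    return math.ceil(expected_thinking_tokens * factor)


# About 20,000 tokens of expected thinking calls for at least 30,000.
print(min_max_tokens(20000))
```

Rounding up keeps the result on the safe side when the product is not a whole number.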

2. Handle Thinking Blocks Properly

Always check the type field when processing responses. Thinking blocks contain Claude's intermediate reasoning, which you may want to:

Display to users for transparency
Log for debugging
Use for verification (via the signature)

3. Use Streaming for Real-Time Feedback

Extended thinking works with streaming. You'll receive thinking deltas as they're generated, followed by text deltas:

with client.messages.stream(
    model="claude-sonnet-4-6",
    max_tokens=16000,
    thinking={"type": "adaptive", "effort": "high"},
    messages=[{"role": "user", "content": "Explain quantum entanglement"}]
) as stream:
    for event in stream:
        if event.type == "content_block_delta" and event.delta.type == "thinking_delta":
            print(event.delta.thinking, end="")
        elif event.type == "content_block_delta" and event.delta.type == "text_delta":
            print(event.delta.text, end="")
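
If you need the full reasoning and answer as strings instead of printing them piece by piece, the same event checks can feed two buffers. The sketch below runs against stand-in event objects built with SimpleNamespace so it works without a network call. The helper names collect_stream and make_delta are illustrative and not part of the SDK.

```python
from types import SimpleNamespace


def collect_stream(events) -> tuple[str, str]:
    """Accumulate thinking and text deltas into two separate strings."""
    thinking_parts: list[str] = []
    text_parts: list[str] = []
    for event in events:
        if event.type != "content_block_delta":
            continue  # skip message_start, message_stop, and similar events
        if event.delta.type == "thinking_delta":
            thinking_parts.append(event.delta.thinking)
        elif event.delta.type == "text_delta":
            text_parts.append(event.delta.text)
        # Other delta types (such as signature deltas) are ignored here.
    return "".join(thinking_parts), "".join(text_parts)


def make_delta(kind: str, **fields) -> SimpleNamespace:
    """Build a stand-in for a content_block_delta event (test data only)."""
    return SimpleNamespace(
        type="content_block_delta",
        delta=SimpleNamespace(type=kind, **fields),
    )


fake_events = [
    SimpleNamespace(type="message_start"),
    make_delta("thinking_delta", thinking="Consider "),
    make_delta("thinking_delta", thinking="two particles."),
    make_delta("signature_delta", signature="..."),
    make_delta("text_delta", text="Entanglement links "),
    make_delta("text_delta", text="quantum states."),
    SimpleNamespace(type="message_stop"),
]

thinking_text, answer_text = collect_stream(fake_events)
print(thinking_text)
print(answer_text)
```

With a real stream, you would pass the stream object from the with block in place of fake_events.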

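4. Preserve Signatures When Passing Thinking Blocks Back

The signature is an opaque value that the API uses to confirm a thinking block was generated by Claude. The Python SDK does not provide a client-side function for checking it. The check happens on the server when you send thinking blocks back, for example in a multi-turn conversation. If integrity matters in your application (e.g., financial analysis, legal reasoning), return the blocks exactly as you received them:

import anthropic

client = anthropic.Anthropic()
params = {
    "model": "claude-sonnet-4-6",
    "max_tokens": 16000,
    "thinking": {"type": "adaptive", "effort": "high"},
}
messages = [{"role": "user", "content": "Review this contract clause for ambiguity."}]
response = client.messages.create(messages=messages, **params)

# Send the assistant turn back unmodified, thinking blocks and signatures included
messages.append({"role": "assistant", "content": response.content})
messages.append({"role": "user", "content": "Now summarize the main risk."})
follow_up = client.messages.create(messages=messages, **params)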

Common Pitfalls to Avoid

Using manual mode on Opus 4.7+: This will return a 400 error. Always use adaptive thinking for newer models.
Setting budget_tokens too low: If you must use manual mode on older models, ensure your budget is realistic for the task complexity.
Altering or dropping the signature: When you pass thinking blocks back to the API, include them unmodified, signature and all.
Forgetting max_tokens: Extended thinking consumes from your max_tokens limit. Plan accordingly.

Real-World Use Cases

Extended thinking shines in scenarios that require deep reasoning:

Mathematical proofs: Claude can work through complex proofs step by step
Code analysis: Understanding and refactoring large codebases
Legal document review: Analyzing contracts clause by clause
Scientific research: Evaluating hypotheses and synthesizing findings
Strategic planning: Multi-step business or technical strategy development

Key Takeaways

Adaptive thinking is the future: For Claude Opus 4.7 and later, use thinking: {type: "adaptive", effort: "..."} instead of manual budget_tokens
Choose effort wisely: Start with medium and adjust based on task complexity—high isn't always better
Set max_tokens generously: Extended thinking consumes from your token budget; plan for 1.5x your expected thinking needs
Preserve signatures in production: The signature field lets the API verify thinking blocks, so pass them back unmodified
Stream for better UX: Use streaming to show thinking progress in real-time to users