# Mastering Claude's Extended Thinking: From Manual Control to Adaptive Reasoning
Claude's extended thinking enhances reasoning for complex tasks by showing step-by-step thought processes. For Claude Opus 4.7+, use adaptive thinking with the effort parameter instead of manual budget_tokens. This guide covers setup, model-specific behaviors, and practical API examples.
Claude's extended thinking capability is one of its most powerful features for tackling complex, multi-step problems. Whether you're building a research assistant, a code analysis tool, or a mathematical reasoning engine, understanding how to leverage extended thinking can dramatically improve your results.
This guide covers everything you need to know about implementing extended thinking in your Claude-powered applications, from the latest adaptive thinking approach to model-specific behaviors and practical code examples.
## What Is Extended Thinking?
Extended thinking gives Claude enhanced reasoning capabilities for complex tasks. When enabled, Claude generates internal reasoning steps before crafting its final response. These reasoning steps are returned as thinking content blocks in the API response, providing varying levels of transparency into Claude's thought process.
Think of it as Claude "showing its work" — you get to see the intermediate analysis, logical deductions, and problem-solving strategies it uses before arriving at a final answer.
## The Shift: From Manual to Adaptive Thinking
An important change has occurred in the extended thinking landscape. For Claude Opus 4.7 and later models, the traditional manual extended thinking configuration (`thinking: {type: "enabled", budget_tokens: N}`) is no longer supported and returns a 400 error.
Instead, Anthropic has introduced adaptive thinking (`thinking: {type: "adaptive"}`) with an `effort` parameter. This approach automatically determines how much thinking is appropriate based on the complexity of your request.
### Why Adaptive Thinking?
Adaptive thinking solves a fundamental problem with manual budget_tokens: you never really know how much thinking a particular query requires. Set the budget too low, and Claude might cut off its reasoning prematurely. Set it too high, and you waste tokens and increase latency.
With adaptive thinking, Claude dynamically allocates reasoning resources based on the task's complexity. You control the effort level, not the budget.
## Model-Specific Behavior
Different Claude models handle extended thinking differently. Here's a breakdown:
| Model | Recommended Approach | Notes |
|---|---|---|
| Claude Opus 4.7+ | Adaptive thinking only | Manual mode returns 400 error |
| Claude Opus 4.6 | Adaptive thinking recommended | Manual mode deprecated but functional |
| Claude Sonnet 4.6 | Adaptive thinking recommended | Manual mode deprecated but functional (interleaved mode) |
| Claude Mythos Preview | Adaptive thinking default | Manual mode also accepted; thinking: {type: "disabled"} not supported |
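Because the right configuration depends on the model generation, it can help to centralize the choice in one helper. The sketch below is illustrative only: the model-name prefix matching is an assumption drawn from the table above, not an official Anthropic mechanism, so verify it against your own model list.

```python
def thinking_config(model: str, effort: str = "high", budget_tokens: int = 10000) -> dict:
    """Pick a thinking configuration based on the model generation.

    Illustrative heuristic only: matching on model-name prefixes is an
    assumption, not an official API -- verify against your model list.
    """
    # Opus 4.7+ rejects manual mode, so always use adaptive there.
    if model.startswith("claude-opus-4-7"):
        return {"type": "adaptive", "effort": effort}
    # 4.6-era models still accept manual mode, but adaptive is recommended.
    if model.startswith(("claude-opus-4-6", "claude-sonnet-4-6")):
        return {"type": "adaptive", "effort": effort}
    # Fallback for older models that only support manual budgets.
    return {"type": "enabled", "budget_tokens": budget_tokens}
```

The returned dict can be passed directly as the `thinking` argument to `client.messages.create(...)`.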
## How Extended Thinking Works

When extended thinking is enabled, the API response includes thinking content blocks followed by text content blocks. Here's the default response format:

```json
{
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me analyze this step by step...",
      "signature": "WaUjzkypQ2mUEVM36O2TxuC06KN8xyfbJwyem2dw3URve/op91XWHOEBLLqIOMfFG/UvLEczmEsUjavL...."
    },
    {
      "type": "text",
      "text": "Based on my analysis..."
    }
  ]
}
```
The `signature` field is used to verify the integrity of the thinking content.
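One practical consequence: if you store thinking blocks or send them back in a multi-turn conversation, the `signature` must be preserved byte-for-byte. Here is a minimal sketch of rebuilding an assistant message without disturbing signatures; the dict shapes mirror the response format above, and the exact round-trip requirements should be confirmed against the official docs.

```python
def preserve_thinking_turn(content_blocks: list[dict]) -> dict:
    """Build an assistant message that keeps thinking blocks and their
    signatures intact for the next request. Sketch only: confirm the
    round-trip requirements against the official API documentation."""
    preserved = []
    for block in content_blocks:
        if block["type"] == "thinking":
            # Keep the signature untouched; altering it invalidates the block.
            preserved.append({
                "type": "thinking",
                "thinking": block["thinking"],
                "signature": block["signature"],
            })
        else:
            preserved.append(block)
    return {"role": "assistant", "content": preserved}
```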
## Using Extended Thinking in the API

### Adaptive Thinking (Recommended for Claude Opus 4.7+)

Here's how to use adaptive thinking with the `effort` parameter in Python:
```python
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=32000,
    thinking={
        "type": "adaptive",
        "effort": "high"  # Options: "low", "medium", "high"
    },
    messages=[
        {
            "role": "user",
            "content": "Prove that there are infinitely many prime numbers congruent to 3 modulo 4."
        }
    ]
)

# Process the response
for block in response.content:
    if block.type == "thinking":
        print(f"Thinking: {block.thinking[:200]}...")
    elif block.type == "text":
        print(f"Answer: {block.text}")
```
### TypeScript Example
```typescript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

async function main() {
  const response = await client.messages.create({
    model: 'claude-opus-4-7',
    max_tokens: 32000,
    thinking: {
      type: 'adaptive',
      effort: 'high'
    },
    messages: [
      {
        role: 'user',
        content: 'Design a distributed caching system that handles cache invalidation across multiple data centers.'
      }
    ]
  });

  for (const block of response.content) {
    if (block.type === 'thinking') {
      console.log(`Thinking: ${block.thinking.substring(0, 200)}...`);
    } else if (block.type === 'text') {
      console.log(`Answer: ${block.text}`);
    }
  }
}

main();
```
### Manual Extended Thinking (Legacy Models Only)

For models that still support manual mode (Claude Opus 4.6 and Claude Sonnet 4.6):
```python
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=16000,
    thinking={
        "type": "enabled",
        "budget_tokens": 10000  # Allocate 10,000 tokens for thinking
    },
    messages=[
        {
            "role": "user",
            "content": "Analyze the time complexity of this algorithm and suggest optimizations: [code snippet]"
        }
    ]
)
```
## Choosing the Right Effort Level

The `effort` parameter in adaptive thinking accepts three values:

- `low`: Minimal thinking, suitable for simple queries where you want fast responses
- `medium`: Balanced approach for most tasks
- `high`: Maximum reasoning depth for complex problems
### When to Use Each Level
| Effort Level | Best For | Example Queries |
|---|---|---|
| Low | Simple factual questions, quick lookups | "What's the capital of France?" |
| Medium | Standard analysis, moderate complexity | "Compare these two database architectures" |
| High | Complex reasoning, proofs, deep analysis | "Analyze the implications of the Riemann Hypothesis" |
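If you want to choose an effort level programmatically, a rough keyword heuristic along these lines can serve as a starting point. The keyword lists below are purely illustrative assumptions, not a recommendation from Anthropic:

```python
def pick_effort(query: str) -> str:
    """Map a query to an effort level with a crude keyword heuristic.

    Illustrative sketch only: the keyword lists are assumptions and
    should be tuned (or replaced) for a real application.
    """
    q = query.lower()
    # Proof-style or deep-analysis requests warrant maximum reasoning depth.
    if any(kw in q for kw in ("prove", "derive", "optimize", "deadlock")):
        return "high"
    # Standard analytical tasks get the balanced setting.
    if any(kw in q for kw in ("compare", "analyze", "review")):
        return "medium"
    # Everything else defaults to fast, minimal thinking.
    return "low"
```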
## Working with Thinking Content Blocks

When processing responses with extended thinking, you'll need to handle the thinking blocks appropriately:

```python
def process_thinking_response(response):
    """Extract and display thinking and text content from Claude's response."""
    thinking_content = []
    final_answer = ""

    for block in response.content:
        if block.type == "thinking":
            thinking_content.append({
                "thinking": block.thinking,
                "signature": block.signature
            })
        elif block.type == "text":
            final_answer += block.text

    return {
        "thinking_steps": thinking_content,
        "final_answer": final_answer
    }
```
## Best Practices for Extended Thinking

### 1. Set Appropriate max_tokens

Always set `max_tokens` higher than your expected thinking budget. The total tokens consumed will be thinking tokens + output tokens. A good rule of thumb:
```python
# For manual mode
budget_tokens = 10000
max_tokens = budget_tokens + 5000  # 5,000 tokens for the final answer

# For adaptive mode
max_tokens = 32000  # Generous buffer for complex reasoning
```
### 2. Use Streaming for Real-Time Feedback
For long-running thinking operations, streaming provides real-time visibility:
```python
with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=32000,
    thinking={"type": "adaptive", "effort": "high"},
    messages=[{"role": "user", "content": "Complex query here..."}]
) as stream:
    for event in stream:
        if event.type == "content_block_delta" and event.delta.type == "thinking_delta":
            print(event.delta.thinking, end="", flush=True)
        elif event.type == "content_block_delta" and event.delta.type == "text_delta":
            print(event.delta.text, end="", flush=True)
```
### 3. Handle the Signature Field
The signature field in thinking blocks is crucial for verification. Always preserve it if you're storing or forwarding thinking content:
```python
# Store thinking content with signature for later verification
thinking_data = {
    "thinking": block.thinking,
    "signature": block.signature
}
```
### 4. Consider Zero Data Retention (ZDR)
Extended thinking is eligible for Zero Data Retention arrangements. If your organization has ZDR enabled, data sent through this feature is not stored after the API response is returned — important for sensitive applications.
## Common Pitfalls to Avoid

### ❌ Using Manual Mode on Opus 4.7+
```python
# This will return a 400 error on Claude Opus 4.7+
response = client.messages.create(
    model="claude-opus-4-7",
    thinking={"type": "enabled", "budget_tokens": 10000},  # ERROR!
    ...
)
```
### ✅ Correct Approach for Opus 4.7+
```python
response = client.messages.create(
    model="claude-opus-4-7",
    thinking={"type": "adaptive", "effort": "high"},  # Correct!
    ...
)
```
### ❌ Setting max_tokens Too Low

If `max_tokens` is less than the thinking budget, Claude will truncate its reasoning prematurely.
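You can detect this after the fact: the Messages API reports a stop reason on the response, and a value of `"max_tokens"` means the output was cut off at the limit. A small guard, sketched under the assumption that your response object exposes `stop_reason` the way the Python SDK does:

```python
def check_truncation(response) -> bool:
    """Return True if the response was cut off by the max_tokens limit.

    `response` is anything with a `stop_reason` attribute, such as the
    object returned by client.messages.create() in the Python SDK.
    """
    if response.stop_reason == "max_tokens":
        # Reasoning or the answer was truncated; retry with a larger max_tokens.
        return True
    return False
```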
## Real-World Use Cases

### Mathematical Proofs and Problem Solving
Extended thinking excels at mathematical reasoning where step-by-step verification is crucial:
```python
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=32000,
    thinking={"type": "adaptive", "effort": "high"},
    messages=[{
        "role": "user",
        "content": "Solve the following optimization problem: minimize f(x,y) = x^2 + y^2 subject to x + y = 1"
    }]
)
```
### Code Analysis and Debugging
For complex codebases, extended thinking helps Claude trace through logic:
```python
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=32000,
    thinking={"type": "adaptive", "effort": "high"},
    messages=[{
        "role": "user",
        "content": "Review this distributed system code for race conditions and deadlocks: [code]"
    }]
)
```
### Research and Analysis
Extended thinking is invaluable for synthesizing information from multiple sources:
```python
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=32000,
    thinking={"type": "adaptive", "effort": "high"},
    messages=[{
        "role": "user",
        "content": "Compare and contrast the economic policies of Keynesian economics vs. Monetarism, focusing on their approaches to inflation control."
    }]
)
```
## Key Takeaways

- **Adaptive thinking is the future**: For Claude Opus 4.7 and later, use `thinking: {type: "adaptive", effort: "..."}` instead of manual `budget_tokens`. Manual mode returns a 400 error on these models.
- **Choose effort wisely**: Use `low` for simple queries, `medium` for standard tasks, and `high` for complex reasoning. Let Claude decide how many tokens to allocate.
- **Always set generous `max_tokens`**: Ensure `max_tokens` is significantly higher than your expected thinking + output token consumption to prevent premature truncation.
- **Handle thinking blocks properly**: Process `thinking` content blocks separately from `text` blocks, and always preserve the `signature` field for verification purposes.
- **Stream for long operations**: Use streaming to provide real-time feedback to users during extended thinking operations, especially with high effort levels.