Guide | Beginner | 2026-05-06

Mastering Claude's Extended Thinking: A Complete Guide to Enhanced Reasoning

Learn how to leverage Claude's extended thinking capabilities for complex tasks. Covers adaptive thinking, effort parameters, budget tokens, and practical API implementation examples.

Quick Answer

Extended thinking gives Claude enhanced reasoning for complex tasks by showing its step-by-step thought process. Use adaptive thinking (thinking: {type: 'adaptive'}) with the effort parameter for Claude Opus 4.7+, or manual mode (thinking: {type: 'enabled', budget_tokens: N}) for older models.

Tags: Extended Thinking, Claude API, Adaptive Thinking, Reasoning, AI Development

Extended thinking is one of Claude's most powerful features, giving the model enhanced reasoning capabilities for tackling complex problems. Whether you're analyzing mathematical proofs, debugging intricate code, or making multi-step logical deductions, extended thinking provides transparency into Claude's thought process before it delivers its final answer.

This guide covers everything you need to know to implement extended thinking effectively in your applications.

Understanding Extended Thinking

When extended thinking is enabled, Claude generates internal reasoning in the form of thinking content blocks before producing its final response. These blocks contain step-by-step analysis, intermediate conclusions, and the model's reasoning chain.

The API response structure looks like this:

{
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me analyze this step by step...",
      "signature": "WaUjzkypQ2mUEVM36O2TxuC06KN8xyfbJwyem2dw3URve/op91XWHOEBLLqIOMfFG/UvLEczmEsUjavL...."
    },
    {
      "type": "text",
      "text": "Based on my analysis..."
    }
  ]
}

The thinking block includes a signature, an opaque string that the API uses to verify that the thinking content was generated by Claude and has not been altered when the block is sent back in a later request.
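As a minimal sketch of reading that structure, the helper below separates thinking content from the final answer. It works on plain dictionaries shaped like the JSON above; `split_blocks` and the placeholder signature are illustrative names, not part of any SDK.

```python
from typing import Any


def split_blocks(content: list[dict[str, Any]]) -> tuple[list[str], str]:
    """Separate thinking text from final-answer text in a content list."""
    thinking = [b["thinking"] for b in content if b.get("type") == "thinking"]
    answer = "".join(b["text"] for b in content if b.get("type") == "text")
    return thinking, answer


# A response body shaped like the example above (placeholder signature).
sample = {
    "content": [
        {
            "type": "thinking",
            "thinking": "Let me analyze this step by step...",
            "signature": "placeholder-signature",
        },
        {"type": "text", "text": "Based on my analysis..."},
    ]
}

thinking_parts, final_answer = split_blocks(sample["content"])
print(len(thinking_parts), "thinking block(s)")
print(final_answer)
```

Filtering on the type field rather than on position keeps the helper working when a response contains no thinking block at all.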

Adaptive Thinking vs. Manual Extended Thinking

Claude offers two approaches to extended thinking, and the right choice depends on your model version:

Adaptive Thinking (Recommended for Claude Opus 4.7+)

Adaptive thinking lets Claude decide how much thinking a task needs based on its complexity. You steer this with the effort parameter, which Anthropic's API takes as a named level (for example low, medium, or high) set under output_config alongside the thinking setting.

Key benefits:

Claude automatically allocates thinking resources
No need to guess budget tokens
More efficient for varied workloads
Required for Claude Opus 4.7 and later models

Manual Extended Thinking (Legacy)

Manual mode requires you to specify a budget_tokens value, which sets the maximum number of tokens Claude can use for thinking.

Important: Manual extended thinking (thinking: {type: "enabled", budget_tokens: N}) is no longer supported on Claude Opus 4.7 or later models and returns a 400 error. It remains functional but deprecated on Claude Opus 4.6 and Claude Sonnet 4.6.

Model Compatibility Matrix

Model	Adaptive Thinking	Manual Thinking	Notes
Claude Opus 4.7+	✅ Required	❌ Returns 400 error	Use `effort` parameter
Claude Mythos Preview	✅ Default	✅ Accepted	`thinking: {type: "disabled"}` not supported
Claude Opus 4.6	✅ Recommended	⚠️ Deprecated	Manual mode still functional
Claude Sonnet 4.6	✅ Recommended	⚠️ Deprecated	Interleaved mode deprecated
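The matrix can be turned into a small lookup that picks a thinking configuration per model. This is only a sketch: the prefix list is hand-maintained and covers just the models named above, `claude-opus-4-6` and `claude-sonnet-4-5` are assumed identifiers following the pattern used elsewhere in this guide, and `thinking_config` is a hypothetical helper.

```python
# Model ID prefixes treated as adaptive-capable (assumption: mirrors the table above).
ADAPTIVE_MODEL_PREFIXES = (
    "claude-opus-4-7",
    "claude-opus-4-6",
    "claude-sonnet-4-6",
)


def thinking_config(model: str, budget_tokens: int = 10000) -> dict:
    """Return a `thinking` request parameter suited to the given model."""
    if model.startswith(ADAPTIVE_MODEL_PREFIXES):
        # Adaptive is required on Opus 4.7+ and recommended on the 4.6 models.
        return {"type": "adaptive"}
    # Anything else falls back to manual mode with an explicit budget.
    return {"type": "enabled", "budget_tokens": budget_tokens}


print(thinking_config("claude-opus-4-7"))
print(thinking_config("claude-sonnet-4-5", budget_tokens=4000))
```

Because the check is prefix-based, newer models need to be added to the tuple by hand; an unknown model silently gets manual mode, which may or may not be what you want.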

Implementing Extended Thinking in Your Code

Using Adaptive Thinking (Claude Opus 4.7+)

Here's how to use adaptive thinking with the effort parameter:

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=20000,
    thinking={"type": "adaptive"},
    # Effort is a named level set under output_config, not inside thinking
    output_config={"effort": "high"},
    messages=[
        {
            "role": "user",
            "content": "Prove that the square root of 2 is irrational using a proof by contradiction."
        }
    ]
)

# Process the response
for block in response.content:
    if block.type == "thinking":
        print(f"Thinking: {block.thinking[:200]}...")
        print(f"Signature: {block.signature[:50]}...")
    elif block.type == "text":
        print(f"Final answer: {block.text}")

Using Manual Extended Thinking (Legacy Models)

For models that still support manual mode:

import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
  model: 'claude-sonnet-4-6',
  max_tokens: 16000,
  thinking: {
    type: 'enabled',
    budget_tokens: 10000  // Maximum tokens for thinking
  },
  messages: [
    {
      role: 'user',
      content: 'Design a distributed caching system that handles cache invalidation across multiple regions.'
    }
  ]
});
for (const block of response.content) {
  if (block.type === 'thinking') {
    console.log(`Thinking summary: ${block.thinking.substring(0, 100)}...`);
  } else if (block.type === 'text') {
    console.log(`Final answer: ${block.text}`);
  }
}

Best Practices for Extended Thinking

1. Set Appropriate Max Tokens

In manual mode, your max_tokens value must be greater than your budget_tokens value. The difference is the number of tokens left for the final response.

# Good: max_tokens > budget_tokens
thinking={"type": "enabled", "budget_tokens": 10000},
max_tokens=16000  # 6000 tokens reserved for final response

# Bad: will cause an error
thinking={"type": "enabled", "budget_tokens": 10000},
max_tokens=8000  # Error: budget_tokens exceeds max_tokens
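The same rule can be checked locally before a request is sent. The helper below is a sketch with a hypothetical name, `response_headroom`; it only restates the arithmetic above and does not call the API.

```python
def response_headroom(max_tokens: int, budget_tokens: int) -> int:
    """Return the tokens left for the final response in manual mode.

    Raises ValueError when the thinking budget leaves no room for a response.
    """
    if budget_tokens >= max_tokens:
        raise ValueError(
            f"budget_tokens ({budget_tokens}) must be less than max_tokens ({max_tokens})"
        )
    return max_tokens - budget_tokens


print(response_headroom(max_tokens=16000, budget_tokens=10000))

try:
    response_headroom(max_tokens=8000, budget_tokens=10000)
except ValueError as err:
    print(f"Rejected: {err}")
```

Failing fast on the client side gives a clearer message than waiting for the API to reject the request.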

2. Choose the Right Effort Level

With adaptive thinking, start with a moderate effort level and adjust based on task complexity:

low: Simple tasks, quick lookups, straightforward answers
medium: Moderate complexity, multi-step reasoning
high: Complex problems, mathematical proofs, architectural design
max: Maximum reasoning for the most challenging tasks, on models that support it

3. Handle Thinking Blocks in Streaming Mode

When streaming, thinking blocks appear before text blocks:

import anthropic

client = anthropic.Anthropic()

with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=20000,
    thinking={"type": "adaptive"},
    output_config={"effort": "high"},
    messages=[{"role": "user", "content": "Solve this complex optimization problem..."}]
) as stream:
    for event in stream:
        if event.type == "content_block_start":
            # Label each block once instead of prefixing every delta
            print(f"\n[{event.content_block.type}]")
        elif event.type == "content_block_delta":
            if event.delta.type == "thinking_delta":
                print(event.delta.thinking, end="", flush=True)
            elif event.delta.type == "text_delta":
                print(event.delta.text, end="", flush=True)
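If you need the full thinking and answer text rather than live output, the deltas can be accumulated instead of printed. The sketch below uses stand-in event objects built with SimpleNamespace so it runs without network access; `collect_stream` is a hypothetical helper, and the fake events only mimic the type and delta fields used in the streaming loop above.

```python
from types import SimpleNamespace as NS


def collect_stream(events) -> tuple[str, str]:
    """Accumulate thinking and text deltas from a sequence of stream events."""
    thinking, text = [], []
    for event in events:
        if event.type != "content_block_delta":
            continue  # Skip start/stop and other bookkeeping events
        if event.delta.type == "thinking_delta":
            thinking.append(event.delta.thinking)
        elif event.delta.type == "text_delta":
            text.append(event.delta.text)
        # Other delta types (for example the signature) are ignored here
    return "".join(thinking), "".join(text)


# Stand-in events for illustration only; a real stream yields SDK objects.
fake_events = [
    NS(type="content_block_start"),
    NS(type="content_block_delta", delta=NS(type="thinking_delta", thinking="Step 1. ")),
    NS(type="content_block_delta", delta=NS(type="thinking_delta", thinking="Step 2.")),
    NS(type="content_block_delta", delta=NS(type="signature_delta", signature="placeholder")),
    NS(type="content_block_stop"),
    NS(type="content_block_delta", delta=NS(type="text_delta", text="The answer ")),
    NS(type="content_block_delta", delta=NS(type="text_delta", text="is 42.")),
]

thinking_text, answer_text = collect_stream(fake_events)
print(thinking_text)
print(answer_text)
```

Collecting into lists and joining once at the end avoids repeated string concatenation on long streams.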

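4. Preserve Thinking Signatures

Treat the signature as an opaque value. Client code cannot check it locally; the API validates it when a thinking block is sent back in a later request, so in production pass thinking blocks back unmodified. A basic sanity check is to confirm that each thinking block carries a signature:

def verify_thinking_block(block):
    """Return True if block is a thinking block that carries a signature."""
    if block.type == "thinking":
        # The signature is opaque to client code; the API validates it
        # when the block is passed back in a later request.
        return bool(getattr(block, "signature", None))
    return False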
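One way to keep signatures valid across turns is to send the assistant's content blocks back exactly as they were received. The sketch below builds the message list for a follow-up request from plain dictionaries; `next_request_messages` is a hypothetical helper and the signature value is a placeholder.

```python
import copy


def next_request_messages(history: list, assistant_content: list, user_reply: str) -> list:
    """Append the assistant turn (blocks untouched) and the next user turn."""
    return history + [
        # Copy the blocks so later edits to local data cannot alter what is sent
        {"role": "assistant", "content": copy.deepcopy(assistant_content)},
        {"role": "user", "content": user_reply},
    ]


history = [{"role": "user", "content": "Is 91 prime?"}]
assistant_content = [
    {"type": "thinking", "thinking": "91 = 7 * 13...", "signature": "placeholder-signature"},
    {"type": "text", "text": "No, 91 = 7 * 13."},
]

messages = next_request_messages(history, assistant_content, "And 97?")
print(len(messages), "messages;", messages[1]["content"][0]["signature"])
```

The helper never truncates, reorders, or rewrites the thinking text, since any such change would leave the signature no longer matching its content.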

Common Pitfalls to Avoid

Using manual mode on Claude Opus 4.7+: This returns a 400 error. Always use adaptive thinking.
Setting budget_tokens too high: This wastes tokens and increases costs. Start conservative.
Forgetting to handle both block types: Your code must process both thinking and text blocks.
Ignoring the signature: In production, pass thinking blocks back with their signatures intact so the API can confirm the thinking content hasn't been modified.

Real-World Use Cases

Extended thinking excels in scenarios requiring deep reasoning:

Mathematical proofs and theorem verification
Complex code generation with architectural decisions
Multi-step logical reasoning and analysis
Scientific research and hypothesis testing
Legal document analysis with precedent comparison

Key Takeaways

Adaptive thinking (thinking: {type: "adaptive"}) is the modern approach and required for Claude Opus 4.7+ models. Use the effort parameter (a named level such as low, medium, or high) to control reasoning depth.
Manual extended thinking (thinking: {type: "enabled", budget_tokens: N}) is deprecated on Claude Opus 4.6 and Sonnet 4.6, and returns errors on Claude Opus 4.7+. Migrate to adaptive thinking.
In manual mode, always set max_tokens higher than budget_tokens to leave room for the final response. The difference caps the response length.
Handle both thinking and text content blocks in your response processing, especially when streaming.
Keep thinking blocks and their signatures unmodified in production so the integrity of Claude's reasoning chain can be verified.