Guide · 2026-05-05

Mastering Extended Thinking in Claude: A Complete Guide to Enhanced Reasoning

Learn how to use Claude's extended thinking feature for complex tasks. Covers adaptive thinking, effort parameters, manual mode, and practical API examples for all supported models.

Quick Answer

Extended thinking gives Claude enhanced step-by-step reasoning for complex tasks. Use adaptive thinking (thinking: {type: 'adaptive'}) with the effort parameter on Claude Opus 4.7+, or manual mode on older models. The API returns thinking content blocks followed by text responses.

Tags: extended thinking, Claude API, adaptive thinking, reasoning, Claude Opus

Claude's extended thinking feature lets the model reason step by step before it delivers a final answer. For tasks such as complex mathematical proofs, multi-layered business problems, or sophisticated code, that extra reasoning tends to produce more accurate and nuanced responses.

This guide covers everything you need to know about implementing extended thinking in your Claude applications, from the latest adaptive thinking approach to legacy manual configurations.

Understanding Extended Thinking

When you enable extended thinking, Claude generates internal reasoning content blocks before crafting its final response. These thinking blocks contain the model's step-by-step analysis, which it then uses to inform the final answer. The API response includes both thinking content blocks and text content blocks.

Here's what a typical response looks like:

{
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me analyze this step by step...",
      "signature": "WaUjzkypQ2mUEVM36O2TxuC06KN8xyfbJwyem2dw3URve/op91XWHOEBLLqIOMfFG/UvLEczmEsUjavL...."
    },
    {
      "type": "text",
      "text": "Based on my analysis..."
    }
  ]
}

The signature field is an opaque value. It lets the API verify that a thinking block was generated by Claude and has not been altered when you send the block back in a later request.
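As a minimal sketch of reading that structure, the helper below separates thinking blocks from text blocks. It works on the plain dictionary form shown above rather than on SDK objects. The name `split_blocks` and the placeholder signature value are hypothetical and exist only for illustration.

```python
from typing import Any, Dict, List, Tuple


def split_blocks(response: Dict[str, Any]) -> Tuple[List[str], List[str]]:
    """Return (thinking_texts, answer_texts) from a response dictionary."""
    thinking: List[str] = []
    answers: List[str] = []
    for block in response.get("content", []):
        if block.get("type") == "thinking":
            thinking.append(block.get("thinking", ""))
        elif block.get("type") == "text":
            answers.append(block.get("text", ""))
        # Any other block type is ignored in this sketch.
    return thinking, answers


# Sample data mirroring the response shape shown above.
sample = {
    "content": [
        {
            "type": "thinking",
            "thinking": "Let me analyze this step by step...",
            "signature": "placeholder-signature",  # hypothetical value
        },
        {"type": "text", "text": "Based on my analysis..."},
    ]
}

thinking, answers = split_blocks(sample)
print(answers[0])
```

Keeping the two lists separate makes it straightforward to log the reasoning while showing only the answer to end users.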

Adaptive Thinking vs. Manual Extended Thinking

Anthropic has evolved how extended thinking works across model versions. Understanding the differences is critical for building robust applications.

Adaptive Thinking (Recommended for All Current Models)

Adaptive thinking (thinking: {type: "adaptive"}) is the modern approach. Instead of specifying a fixed token budget, you control the reasoning depth using an effort parameter. This gives Claude the flexibility to allocate thinking resources dynamically based on task complexity.

Key benefits:

No need to guess token budgets
Claude automatically adjusts reasoning depth
More efficient token usage
Future-proof for upcoming models

Manual Extended Thinking (Legacy)

Manual extended thinking (thinking: {type: "enabled", budget_tokens: N}) lets you specify exactly how many tokens Claude can use for thinking. While still functional on some models, it's being phased out.

Supported Models and Their Behavior

Here's a quick reference for which thinking mode works with each model:

Model	Adaptive Thinking	Manual Mode	Notes
Claude Opus 4.7+	✅ Required	❌ Returns 400 error	Use `effort` parameter
Claude Opus 4.6	✅ Recommended	✅ Deprecated	Manual mode still works
Claude Sonnet 4.6	✅ Recommended	✅ Deprecated (interleaved)	Manual mode still works
Claude Mythos Preview	✅ Default	✅ Accepted	`thinking: {type: "disabled"}` not supported
Claude Sonnet 3.7	❌ Not available	✅ Available	API shapes identical

Important: For Claude Opus 4.7 and later models, passing thinking: {type: "enabled", budget_tokens: N} will return a 400 error. You must use adaptive thinking.
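One way to keep this matrix out of application logic is a small lookup that returns a thinking configuration per model. The sketch below transcribes the table above; the keys are informal labels chosen for this example rather than API model IDs, and `thinking_config` is a hypothetical helper name.

```python
# Support matrix transcribed from the table in this guide.
# Keys are informal labels for this sketch, not real API model IDs.
SUPPORT = {
    "opus-4.7": {"adaptive": True, "manual": False},
    "opus-4.6": {"adaptive": True, "manual": True},
    "sonnet-4.6": {"adaptive": True, "manual": True},
    "mythos-preview": {"adaptive": True, "manual": True},
    "sonnet-3.7": {"adaptive": False, "manual": True},
}


def thinking_config(model_label: str, budget_tokens: int = 10000) -> dict:
    """Pick a thinking configuration, preferring adaptive mode when available."""
    support = SUPPORT.get(model_label)
    if support is None:
        raise ValueError(f"unknown model label: {model_label!r}")
    if support["adaptive"]:
        return {"type": "adaptive"}
    # Fall back to manual mode only where adaptive thinking is unavailable.
    return {"type": "enabled", "budget_tokens": budget_tokens}


print(thinking_config("opus-4.7"))
print(thinking_config("sonnet-3.7", budget_tokens=8000))
```

Preferring adaptive mode wherever the table allows it means the manual branch is only reached for models that have no alternative.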

How to Use Extended Thinking in the API

Using Adaptive Thinking (Claude Opus 4.7+)

For the latest models, use adaptive thinking with the effort parameter. The effort value ranges from 0 to 1, where higher values allocate more reasoning resources.

Python Example:

import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=32000,
    thinking={
        "type": "adaptive",
        "effort": 0.8  # High reasoning effort
    },
    messages=[
        {
            "role": "user",
            "content": "Prove that there are infinitely many prime numbers congruent to 3 modulo 4."
        }
    ]
)
# Process the response
for block in response.content:
    if block.type == "thinking":
        print(f"Thinking: {block.thinking[:200]}...")
        print(f"Signature: {block.signature}")
    elif block.type == "text":
        print(f"Final answer: {block.text}")

TypeScript Example:

import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 32000,
  thinking: {
    type: 'adaptive',
    effort: 0.8
  },
  messages: [
    {
      role: 'user',
      content: 'Design a distributed caching system that handles cache invalidation across multiple data centers.'
    }
  ]
});
for (const block of response.content) {
  if (block.type === 'thinking') {
    console.log(`Thinking: ${block.thinking.substring(0, 200)}...`);
    console.log(`Signature: ${block.signature}`);
  } else if (block.type === 'text') {
    console.log(`Final answer: ${block.text}`);
  }
}

Using Manual Extended Thinking (Older Models)

For models like Claude Sonnet 3.7, Claude Opus 4.6, or Claude Sonnet 4.6, you can still use manual mode. The example below uses Claude Sonnet 4.6:

import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=16000,
    thinking={
        "type": "enabled",
        "budget_tokens": 10000  # Allocate 10K tokens for thinking
    },
    messages=[
        {
            "role": "user",
            "content": "Analyze the time complexity of this algorithm and suggest optimizations: [code]"
        }
    ]
)

The Effort Parameter: Fine-Tuning Reasoning Depth

The effort parameter in adaptive thinking is your primary control for balancing reasoning quality against speed and cost.

Low effort (0.1–0.3): Quick reasoning for simple tasks. Use for straightforward questions where deep analysis isn't needed.
Medium effort (0.4–0.7): Balanced reasoning for most complex tasks. Good for code review, data analysis, and problem-solving.
High effort (0.8–1.0): Maximum reasoning depth. Use for mathematical proofs, complex system design, and multi-step logical deductions.

Practical tip: Start with medium effort (0.5) and increase only if you need deeper analysis. Higher effort consumes more tokens and increases latency.

Task Budgets and Fast Mode (Beta)

For advanced users, Claude offers two beta features that work with extended thinking:

Task Budgets

Task budgets let you set a maximum number of thinking tokens Claude can use for a specific task. This is useful when you want to cap costs while still using adaptive thinking.

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=32000,
    thinking={
        "type": "adaptive",
        "effort": 0.8,
        "budget_tokens": 15000  # Cap thinking at 15K tokens
    },
    messages=[...]
)

Fast Mode (Research Preview)

Fast mode reduces thinking time for scenarios where you need quicker responses. It's ideal for interactive applications where latency matters more than maximum reasoning depth.

Best Practices for Extended Thinking

1. Set Appropriate max_tokens

Always set max_tokens higher than your thinking budget. A good rule of thumb: max_tokens = thinking_budget + expected_output_tokens.
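The rule of thumb above is simple arithmetic, sketched here as a helper. The function name `required_max_tokens` is hypothetical, and the output estimate is something you supply yourself.

```python
def required_max_tokens(thinking_budget: int, expected_output_tokens: int) -> int:
    """Apply the rule of thumb: max_tokens = thinking_budget + expected_output_tokens."""
    if thinking_budget < 0:
        raise ValueError("thinking_budget must not be negative")
    if expected_output_tokens <= 0:
        raise ValueError("expected_output_tokens must be positive")
    return thinking_budget + expected_output_tokens


# A 10,000-token thinking budget plus roughly 6,000 tokens of answer text.
print(required_max_tokens(10000, 6000))  # prints 16000
```

With a 10,000-token thinking budget and about 6,000 tokens of expected output, this gives the 16,000 used in the manual mode example earlier in this guide.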

2. Handle Thinking Content Blocks

Your application should properly process both thinking and text blocks. The thinking blocks contain sensitive reasoning that you may or may not want to display to end users.

3. Preserve Signatures

Treat the signature field as opaque. If you send thinking blocks back to the API in a later turn, include them unmodified together with their original signature so the API can verify them.

4. Use Streaming for Real-Time Feedback

Extended thinking works with streaming. You can show thinking progress to users as it happens:

with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=32000,
    thinking={"type": "adaptive", "effort": 0.8},
    messages=[...]
) as stream:
    for event in stream:
        if event.type == "content_block_delta" and event.delta.type == "thinking_delta":
            print(event.delta.thinking, end="")
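To test stream handling without calling the API, the accumulation logic can be pulled into a function and fed simulated events. This is a sketch under assumptions: the fake events imitate only the `type` and `delta` attributes used in the loop above, `collect_stream` is a hypothetical name, and the `text_delta` branch assumes answer text arrives in the same delta pattern as thinking.

```python
from types import SimpleNamespace
from typing import Iterable, Tuple


def collect_stream(events: Iterable) -> Tuple[str, str]:
    """Accumulate thinking and answer text from a sequence of stream events."""
    thinking_parts = []
    text_parts = []
    for event in events:
        if event.type != "content_block_delta":
            continue  # skip start/stop and other bookkeeping events
        delta = event.delta
        if delta.type == "thinking_delta":
            thinking_parts.append(delta.thinking)
        elif delta.type == "text_delta":  # assumed shape for answer text
            text_parts.append(delta.text)
    return "".join(thinking_parts), "".join(text_parts)


def fake_delta(kind: str, **fields) -> SimpleNamespace:
    """Build a simulated content_block_delta event for local testing."""
    return SimpleNamespace(
        type="content_block_delta",
        delta=SimpleNamespace(type=kind, **fields),
    )


fake_events = [
    SimpleNamespace(type="content_block_start"),
    fake_delta("thinking_delta", thinking="Step 1. "),
    fake_delta("thinking_delta", thinking="Step 2."),
    fake_delta("text_delta", text="Answer."),
    SimpleNamespace(type="message_stop"),
]

thinking_text, answer_text = collect_stream(fake_events)
print(thinking_text)
print(answer_text)
```

The same function can be passed the real stream object inside the `with` block, since it only relies on iterating events and reading those attributes.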

5. Choose the Right Model

Claude Opus 4.7+: Best for maximum reasoning depth. Use adaptive thinking.
Claude Sonnet 4.6: Good balance of speed and reasoning. Manual mode still works.
Claude Sonnet 3.7: Use manual mode only. No adaptive thinking support.

Common Pitfalls to Avoid

Using manual mode on Claude Opus 4.7+: This will return a 400 error. Always use adaptive thinking.
Setting budget_tokens too low: Claude may run out of thinking tokens before completing its reasoning.
Setting max_tokens too low: Thinking tokens count toward max_tokens, so a tight limit can cut the response off mid-reasoning.
Altering thinking blocks: If you send thinking blocks back to the API, keep the thinking text and its signature exactly as received.
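Some of these problems can be caught before a request is sent. The sketch below checks a request dictionary for a missing `max_tokens`, manual mode on a model that rejects it, and a thinking budget that does not fit under `max_tokens`. The function name `check_request` and the `rejects_manual_mode` flag are hypothetical; the caller decides which models the flag applies to.

```python
from typing import List


def check_request(params: dict, rejects_manual_mode: bool = False) -> List[str]:
    """Return a list of likely problems in a request, empty if none are found."""
    problems = []
    thinking = params.get("thinking") or {}
    max_tokens = params.get("max_tokens")

    if max_tokens is None:
        problems.append("max_tokens is not set")

    if thinking.get("type") == "enabled":
        if rejects_manual_mode:
            problems.append("manual mode is rejected by this model; use adaptive thinking")
        budget = thinking.get("budget_tokens")
        if budget is None:
            problems.append("manual mode requires budget_tokens")
        elif max_tokens is not None and budget >= max_tokens:
            problems.append("budget_tokens must be lower than max_tokens")

    return problems


request = {
    "max_tokens": 8000,
    "thinking": {"type": "enabled", "budget_tokens": 10000},
}
for problem in check_request(request):
    print(problem)
```

Returning a list instead of raising on the first issue lets a caller report every problem in one pass.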

Real-World Use Cases

Extended thinking excels in scenarios requiring deep analysis:

Mathematical proofs and theorem verification
Complex code generation with multiple constraints
Legal document analysis and contract review
Scientific research and hypothesis testing
Multi-step business strategy formulation
System architecture design with trade-off analysis

Key Takeaways

Use adaptive thinking (thinking: {type: "adaptive"}) with the effort parameter for Claude Opus 4.7+ and newer models. Manual mode is no longer supported on these models and returns a 400 error.
The effort parameter (0–1) controls reasoning depth: start with 0.5 for most tasks and increase for complex problems.
Always set max_tokens higher than your thinking budget to prevent premature truncation of responses.
Process thinking content blocks separately from text blocks in your application, and keep each thinking block's signature intact if you send the block back to the API.
Streaming is fully supported with extended thinking, enabling real-time display of Claude's reasoning process to end users.