Guide2026-05-02

Mastering Claude's Extended Thinking: From Manual Mode to Adaptive Reasoning

Learn how to use Claude's extended thinking feature for complex reasoning tasks, including adaptive thinking, effort control, and practical API examples across supported models.

Quick Answer

This guide explains how to enable and configure Claude's extended thinking for step-by-step reasoning. You'll learn about adaptive thinking (recommended for Claude Opus 4.7+), manual mode (deprecated), effort parameters, and how to handle thinking blocks in API responses.

extended thinkingClaude APIadaptive thinkingreasoningClaude Opus 4.7

Mastering Claude's Extended Thinking: From Manual Mode to Adaptive Reasoning

Claude's extended thinking feature unlocks a new level of reasoning capability. When enabled, Claude generates internal thought processes before delivering its final answer—giving you insight into its reasoning chain and dramatically improving performance on complex tasks like mathematical proofs, multi-step analysis, and code debugging.

This guide covers everything you need to know: which models support extended thinking, how to configure it (including the new adaptive thinking mode), and practical code examples you can use today.

What Is Extended Thinking?

Extended thinking allows Claude to "think out loud" before responding. Instead of producing an immediate answer, Claude first generates one or more thinking content blocks containing its internal reasoning, then follows with the final text content block.

This process:

Improves accuracy on complex, multi-step problems
Provides transparency into Claude's reasoning chain
Enables debugging of incorrect or unexpected outputs
Supports structured outputs like code, JSON, or mathematical proofs

Supported Models: A Quick Reference

Extended thinking behavior varies by model. Here's the current landscape:

Model	Recommended Mode	Manual Mode (`type: "enabled"`)	Notes
Claude Opus 4.7+	Adaptive (`type: "adaptive"`)	❌ Returns 400 error	Use `effort` parameter instead of `budget_tokens`
Claude Opus 4.6	Adaptive (recommended)	✅ Deprecated, still functional	Will be removed in future
Claude Sonnet 4.6	Adaptive (recommended)	✅ Deprecated, interleaved mode	Will be removed in future
Claude Mythos Preview	Adaptive (default)	✅ Accepted	`thinking: {type: "disabled"}` not supported; use `display: "summarized"` for summaries
Claude Sonnet 3.7	Manual (`type: "enabled"`)	✅ Supported	No adaptive mode available

Important: If you're using Claude Opus 4.7 or later, you must use adaptive thinking. The old manual mode with budget_tokens will return a 400 error.

How Extended Thinking Works

When extended thinking is enabled, the API response includes a thinking content block before the text block. Here's the default response structure:

{
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me analyze this step by step...",
      "signature": "WaUjzkypQ2mUEVM36O2TxuC06KN8xyfbJwyem2dw3URve/op91XWHOEBLLqIOMfFG/UvLEczmEsUjavL...."
    },
    {
      "type": "text",
      "text": "Based on my analysis..."
    }
  ]
}

The signature field is used for verification and must be included if you're streaming responses or using citations.

Using Adaptive Thinking (Recommended for Claude Opus 4.7+)

Adaptive thinking is the modern approach. Instead of manually setting a token budget, you specify an effort level that controls how much Claude thinks before responding.

The `effort` Parameter

The effort parameter accepts three values:

"low" – Minimal thinking, faster responses
"medium" – Balanced reasoning (default)
"high" – Maximum reasoning depth

Python Example

import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=32000,
    thinking={
        "type": "adaptive",
        "effort": "high"
    },
    messages=[
        {
            "role": "user",
            "content": "Prove that there are infinitely many prime numbers congruent to 3 mod 4."
        }
    ]
)
Process thinking blocks
for block in response.content:
    if block.type == "thinking":
        print(f"Thinking: {block.thinking[:200]}...")
        print(f"Signature: {block.signature}")
    elif block.type == "text":
        print(f"Final answer: {block.text}")

TypeScript Example

import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 32000,
  thinking: {
    type: 'adaptive',
    effort: 'high'
  },
  messages: [
    {
      role: 'user',
      content: 'Design a distributed rate limiter and explain your reasoning.'
    }
  ]
});
for (const block of response.content) {
  if (block.type === 'thinking') {
    console.log(Thinking: ${block.thinking.substring(0, 200)}...);
  } else if (block.type === 'text') {
    console.log(Answer: ${block.text});
  }
}

Using Manual Extended Thinking (Legacy)

For older models (Claude Sonnet 3.7, Claude Opus 4.5, etc.), you can still use manual mode with a budget_tokens value:

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=16000,
    thinking={
        "type": "enabled",
        "budget_tokens": 10000
    },
    messages=[
        {"role": "user", "content": "Solve this complex math problem..."}
    ]
)

The budget_tokens sets the maximum number of tokens Claude can use for thinking. The remaining tokens (up to max_tokens) are reserved for the final response.

Advanced Configuration Options

Task Budgets (Beta)

For adaptive thinking, you can optionally set a task_budget to limit total token usage:

thinking = {
    "type": "adaptive",
    "effort": "high",
    "task_budget": {
        "type": "token_count",
        "value": 5000
    }
}

Fast Mode (Research Preview)

Fast mode reduces thinking time for speed-sensitive applications. It's available as a beta feature:

thinking = {
    "type": "adaptive",
    "effort": "medium",
    "fast_mode": True
}

Controlling Thinking Display

For Claude Mythos Preview, you can control how thinking is displayed:

response = client.messages.create(
    model="claude-mythos-preview",
    thinking={
        "type": "adaptive",
        "display": "summarized"  # Returns summaries instead of full thinking
    },
    messages=[...]
)

Streaming with Extended Thinking

When streaming, thinking blocks appear before text blocks. You must handle the signature for verification:

import anthropic
client = anthropic.Anthropic()
with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=32000,
    thinking={"type": "adaptive", "effort": "high"},
    messages=[{"role": "user", "content": "Explain quantum entanglement"}]
) as stream:
    for event in stream:
        if event.type == "content_block_start":
            if event.content_block.type == "thinking":
                print("\n[Thinking started]")
            elif event.content_block.type == "text":
                print("\n[Answer started]")
        elif event.type == "content_block_delta":
            if event.delta.type == "thinking_delta":
                print(event.delta.thinking, end="")
            elif event.delta.type == "text_delta":
                print(event.delta.text, end="")

Best Practices

1. Choose the Right Effort Level

Low effort: Use for simple Q&A, quick lookups, or when speed matters
Medium effort: Default for most tasks—good balance of speed and depth
High effort: Use for complex reasoning, proofs, code generation, or when accuracy is critical

2. Set Appropriate `max_tokens`

Always set max_tokens higher than your expected thinking + response tokens. A good rule of thumb:

max_tokens = budget_tokens × 1.5 (for manual mode)
max_tokens = 32000+ for adaptive mode with high effort

3. Handle Thinking Blocks in Your Application

Always iterate over response.content and check block.type to separate thinking from final output. Never assume the first block is the answer.

4. Use Signatures for Verification

When you need to verify that thinking content hasn't been tampered with (e.g., in regulated environments), store the signature field and validate it using Anthropic's verification tools.

Common Pitfalls

Mistake	Solution
Using `type: "enabled"` on Claude Opus 4.7+	Switch to `type: "adaptive"` with `effort` parameter
Setting `budget_tokens` too low	Ensure budget is at least 10% of `max_tokens`
Forgetting to handle streaming signatures	Always capture and verify `signature` in streaming mode
Using `thinking: {type: "disabled"}` on Mythos	Not supported; use `display: "summarized"` instead

Key Takeaways

Adaptive thinking (type: "adaptive") is the recommended mode for Claude Opus 4.7+ and newer models. Use the effort parameter (low, medium, high) to control reasoning depth.
Manual mode (type: "enabled" with budget_tokens) is deprecated on Claude Opus 4.6 and Sonnet 4.6, and returns an error on Claude Opus 4.7+. Migrate to adaptive thinking.
Always handle thinking blocks in your code by checking block.type—the final answer is always in a text block that follows thinking blocks.
Streaming requires signature handling for verification. Capture the signature from thinking blocks when using streaming responses.
Choose effort wisely: high effort for complex reasoning, low effort for speed-sensitive tasks, medium as your default.

Mastering Claude's Extended Thinking: From Manual Mode to Adaptive Reasoning

What Is Extended Thinking?

Supported Models: A Quick Reference

How Extended Thinking Works

Using Adaptive Thinking (Recommended for Claude Opus 4.7+)

The effort Parameter

Python Example

Process thinking blocks

TypeScript Example

Using Manual Extended Thinking (Legacy)

Advanced Configuration Options

Task Budgets (Beta)

Fast Mode (Research Preview)

Controlling Thinking Display

Streaming with Extended Thinking

Best Practices

1. Choose the Right Effort Level

2. Set Appropriate max_tokens

3. Handle Thinking Blocks in Your Application

4. Use Signatures for Verification

Common Pitfalls

Key Takeaways

The `effort` Parameter

2. Set Appropriate `max_tokens`