Guide · 2026-04-24

Mastering Extended Thinking in Claude: A Practical Guide to Enhanced Reasoning

Learn how to use Claude's extended thinking feature for complex tasks. Covers adaptive thinking, manual mode, effort parameters, and code examples for API integration.

Quick Answer

This guide explains how to enable and configure Claude's extended thinking for step-by-step reasoning. You'll learn about adaptive thinking (recommended for Opus 4.7+), manual mode, effort parameters, and how to handle thinking blocks in API responses.

extended thinking, Claude API, adaptive thinking, reasoning, Claude Opus

Introduction

Claude's extended thinking feature lets the model reason through a problem before it answers. When enabled, Claude generates internal "thinking blocks": step-by-step reasoning that it uses to produce a more accurate, better-argued final answer. This is especially valuable for complex tasks like mathematical proofs, multi-step analysis, code debugging, and strategic planning.

In this guide, you'll learn:

What extended thinking is and how it works
The difference between adaptive thinking and manual mode
How to configure extended thinking for different Claude models
Practical code examples for the Messages API
Best practices for using thinking blocks in your applications

How Extended Thinking Works

When extended thinking is enabled, Claude's API response includes thinking content blocks before the final text content block. These thinking blocks contain Claude's internal reasoning, the "scratchpad" it uses to work through a problem.

Here's what a typical response looks like:

{
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me analyze this step by step...",
      "signature": "WaUjzkypQ2mUEVM36O2TxuC06KN8xyfbJwyem2dw3URve/op91XWHOEBLLqIOMfFG/UvLEczmEsUjavL...."
    },
    {
      "type": "text",
      "text": "Based on my analysis..."
    }
  ]
}

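The signature field is an opaque value used to verify that the thinking block is authentic. Do not parse or modify it. When streaming, capture it along with the thinking text so the block can be reassembled intact.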

Adaptive Thinking vs. Manual Extended Thinking

Adaptive Thinking (Recommended for Opus 4.7+)

For Claude Opus 4.7 and later models, Anthropic has introduced adaptive thinking. Instead of manually setting a token budget, you use the effort parameter to control how much Claude should "think" about a problem.

Key benefits:

No need to guess a token budget
Claude dynamically allocates thinking tokens based on task complexity
Simpler API calls

Manual Extended Thinking (Deprecated for Opus 4.7+)

Manual extended thinking (thinking: {type: "enabled", budget_tokens: N}) is the older approach. It is still supported on Claude Opus 4.6 and Claude Sonnet 4.6, but it is deprecated and will be removed in a future release. Do not use it on Opus 4.7 or later, where it returns a 400 error.

Model Compatibility

Model	Recommended Mode	Manual Mode Support
Claude Opus 4.7+	Adaptive thinking (`type: "adaptive"`)	❌ Returns 400 error
Claude Opus 4.6	Adaptive thinking (recommended)	✅ Deprecated but functional
Claude Sonnet 4.6	Adaptive thinking (recommended)	✅ Deprecated but functional (interleaved mode)
Claude Mythos Preview	Adaptive thinking (default)	✅ `type: "enabled"` accepted; `type: "disabled"` not supported
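The first three rows of the table can be turned into a local pre-flight check, so a bad pairing is caught before a request is sent. This is only a sketch: the function name check_thinking_mode is hypothetical, and the model ID "claude-opus-4-6" is an assumed spelling that does not appear in this guide's examples.

```python
# Model IDs are illustrative; "claude-opus-4-6" is an assumed spelling.
ADAPTIVE_ONLY = {"claude-opus-4-7"}
MANUAL_STILL_ACCEPTED = {"claude-opus-4-6", "claude-sonnet-4-6"}


def check_thinking_mode(model: str, thinking: dict) -> str:
    """Classify a model/thinking pairing as 'ok', 'deprecated', 'rejected', or 'unknown'."""
    if model not in ADAPTIVE_ONLY | MANUAL_STILL_ACCEPTED:
        return "unknown"  # not covered by this sketch (e.g. preview models)
    mode = thinking.get("type")
    if mode == "adaptive":
        return "ok"
    if mode == "enabled":
        # Manual mode: a 400 error on Opus 4.7+, deprecated but functional on 4.6.
        return "rejected" if model in ADAPTIVE_ONLY else "deprecated"
    return "unknown"


print(check_thinking_mode("claude-opus-4-7", {"type": "enabled", "budget_tokens": 10000}))
print(check_thinking_mode("claude-sonnet-4-6", {"type": "enabled", "budget_tokens": 10000}))
```

A check like this is only as current as its two sets, so the table above (or the official model documentation) should remain the source of truth.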

How to Use Extended Thinking in the API

Using Adaptive Thinking (Opus 4.7+)

With adaptive thinking, you specify an effort level. The effort parameter controls how much reasoning Claude applies. Higher effort means more thinking tokens, which can improve accuracy on very complex tasks but increases latency and cost.

Python Example:

import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=20000,
    thinking={
        "type": "adaptive",
        "effort": "high"  # Options: "low", "medium", "high"
    },
    messages=[
        {
            "role": "user",
            "content": "Prove that there are infinitely many prime numbers of the form 4k+3."
        }
    ]
)
# Print the thinking and answer blocks
for block in response.content:
    if block.type == "thinking":
        print("Thinking:", block.thinking)
    elif block.type == "text":
        print("Answer:", block.text)

TypeScript Example:

import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 20000,
  thinking: {
    type: 'adaptive',
    effort: 'high'
  },
  messages: [
    {
      role: 'user',
      content: 'Prove that there are infinitely many prime numbers of the form 4k+3.'
    }
  ]
});
for (const block of response.content) {
  if (block.type === 'thinking') {
    console.log('Thinking:', block.thinking);
  } else if (block.type === 'text') {
    console.log('Answer:', block.text);
  }
}

Using Manual Extended Thinking (Opus 4.6 / Sonnet 4.6)

If you're using Opus 4.6 or Sonnet 4.6, you can still use manual mode with a budget_tokens parameter:

import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=16000,
    thinking={
        "type": "enabled",
        "budget_tokens": 10000  # Claude can use up to 10,000 tokens for thinking
    },
    messages=[
        {
            "role": "user",
            "content": "Are there an infinite number of prime numbers such that n mod 4 == 3?"
        }
    ]
)
for block in response.content:
    if block.type == "thinking":
        print("Thinking:", block.thinking)
    elif block.type == "text":
        print("Answer:", block.text)

Important: The budget_tokens must be less than max_tokens. A good rule of thumb is to set budget_tokens to about 60-80% of max_tokens.
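The rule of thumb above can be encoded so the budget is derived from max_tokens rather than typed by hand. The sketch below uses a hypothetical helper, manual_thinking, with an integer percentage to avoid floating-point rounding. It enforces only the constraint stated in this guide (budget_tokens less than max_tokens), not any other limit the API may apply.

```python
def manual_thinking(max_tokens: int, percent: int = 70) -> dict:
    """Build a manual-mode thinking config with budget_tokens = percent% of max_tokens."""
    budget = max_tokens * percent // 100
    if not 0 < budget < max_tokens:
        raise ValueError(
            f"budget_tokens ({budget}) must be positive and less than max_tokens ({max_tokens})"
        )
    return {"type": "enabled", "budget_tokens": budget}


# 70% of 16000 tokens is reserved for thinking; the rest is left for the answer.
print(manual_thinking(16000))
```

Passing percent=100 raises a ValueError, which mirrors the request the API would otherwise reject.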

Streaming with Extended Thinking

When streaming, thinking deltas arrive before text deltas. The example below prints both as they arrive; if you need to verify or reuse a thinking block later, also capture its signature from the stream:

import anthropic
client = anthropic.Anthropic()
with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=20000,
    thinking={
        "type": "adaptive",
        "effort": "medium"
    },
    messages=[
        {
            "role": "user",
            "content": "Explain the Riemann Hypothesis in simple terms."
        }
    ]
) as stream:
    for event in stream:
        if event.type == "content_block_delta" and event.delta.type == "thinking_delta":
            print(event.delta.thinking, end="")
        elif event.type == "content_block_delta" and event.delta.type == "text_delta":
            print(event.delta.text, end="")
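The loop above prints deltas but keeps nothing. A minimal sketch of accumulating the pieces, including the signature, might look like the following. It makes two assumptions: events are modeled as plain dicts rather than SDK objects so the example runs offline, and the signature arrives in a delta of type "signature_delta" with a "signature" field. Verify both against the streaming documentation before relying on them. The function name collect_stream is hypothetical.

```python
def collect_stream(events):
    """Accumulate thinking text, signature, and answer text from stream events."""
    thinking, signature, text = [], [], []
    for event in events:
        if event.get("type") != "content_block_delta":
            continue  # ignore start/stop and other bookkeeping events
        delta = event["delta"]
        if delta["type"] == "thinking_delta":
            thinking.append(delta["thinking"])
        elif delta["type"] == "signature_delta":  # assumed event shape
            signature.append(delta["signature"])
        elif delta["type"] == "text_delta":
            text.append(delta["text"])
    return {
        "thinking": "".join(thinking),
        "signature": "".join(signature),
        "text": "".join(text),
    }


# Hand-written sample events, not real API output.
events = [
    {"type": "content_block_start", "index": 0},
    {"type": "content_block_delta", "delta": {"type": "thinking_delta", "thinking": "Step 1. "}},
    {"type": "content_block_delta", "delta": {"type": "thinking_delta", "thinking": "Step 2."}},
    {"type": "content_block_delta", "delta": {"type": "signature_delta", "signature": "sig-placeholder"}},
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "The answer."}},
]
print(collect_stream(events))
```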

Best Practices

1. Choose the Right Effort Level

Low effort: Use for simple tasks where you want faster responses and lower cost. Good for straightforward Q&A.
Medium effort: A balanced choice for most complex tasks. Recommended as a starting point.
High effort: Use for very complex reasoning tasks like mathematical proofs, multi-step analysis, or strategic planning. Expect higher latency and cost.
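One way to apply these levels consistently is a small lookup that maps a task category to an effort value. In this sketch, the category names and the helper adaptive_thinking are hypothetical, and the returned dict follows the shape used in this guide's examples. Unknown categories fall back to "medium", the recommended starting point.

```python
# Hypothetical task categories mapped to the effort levels described above.
EFFORT_BY_TASK = {
    "simple_qa": "low",
    "analysis": "medium",
    "proof": "high",
    "strategic_planning": "high",
}


def adaptive_thinking(task_kind: str) -> dict:
    """Return a thinking config for the task, defaulting to medium effort."""
    effort = EFFORT_BY_TASK.get(task_kind, "medium")
    return {"type": "adaptive", "effort": effort}


print(adaptive_thinking("proof"))
print(adaptive_thinking("something_else"))
```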

2. Set Appropriate max_tokens

Extended thinking consumes tokens from your max_tokens budget. If you set max_tokens too low, Claude may run out of tokens before completing its reasoning. For complex tasks, start with max_tokens of 16000-32000.
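A truncated response can be detected and retried with a larger limit. The sketch below assumes the response has been converted to a plain dict and that a stop_reason of "max_tokens" signals truncation. The helper name next_max_tokens and the doubling policy are illustrative choices, with the ceiling taken from the upper end of the range suggested above.

```python
from typing import Optional


def next_max_tokens(response: dict, current: int, ceiling: int = 32000) -> Optional[int]:
    """Return a larger max_tokens to retry with, or None if no retry applies."""
    if response.get("stop_reason") != "max_tokens":
        return None  # the response finished normally
    if current >= ceiling:
        return None  # already at the ceiling; do not grow further
    return min(current * 2, ceiling)


print(next_max_tokens({"stop_reason": "max_tokens"}, 16000))
```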

3. Handle Thinking Blocks in Your Application

If you're displaying Claude's response to users, you have options:

Show thinking blocks: Great for educational or debugging contexts
Hide thinking blocks: Show only the final text response for a cleaner UX
Summarize thinking: On Claude Mythos Preview, you can pass display: "summarized" to receive summaries instead of raw thinking
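The first two options can be handled by one rendering function with a display mode. This is a sketch over plain dicts shaped like the content blocks shown earlier. The name render and the "[thinking]" label are hypothetical, and summarized output is left out because it depends on a server-side option.

```python
def render(content: list, mode: str = "hide") -> str:
    """Render content blocks as text, optionally including thinking blocks."""
    if mode not in ("show", "hide"):
        raise ValueError(f"unknown mode: {mode!r}")
    parts = []
    for block in content:
        if block["type"] == "thinking":
            if mode == "show":
                parts.append("[thinking]\n" + block["thinking"])
        elif block["type"] == "text":
            parts.append(block["text"])
    return "\n\n".join(parts)


content = [
    {"type": "thinking", "thinking": "Let me analyze this step by step...", "signature": "sig-placeholder"},
    {"type": "text", "text": "Based on my analysis..."},
]
print(render(content, mode="hide"))
print(render(content, mode="show"))
```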

4. Use with Structured Outputs

Extended thinking works well with structured outputs (JSON mode). Claude can reason about the structure before generating the final JSON:

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={
        "type": "adaptive",
        "effort": "medium"
    },
    messages=[
        {
            "role": "user",
            "content": "Extract all dates, names, and amounts from this invoice and return as JSON."
        }
    ]
)
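Once the response arrives, the JSON has to be read from the text block rather than from the thinking blocks. The sketch below assumes the final text block contains bare JSON with no surrounding prose or code fences, which a prompt alone does not guarantee. The helper name extract_json and the sample invoice values are made up for illustration.

```python
import json


def extract_json(content: list) -> dict:
    """Parse the last text block as JSON, skipping any thinking blocks."""
    text_blocks = [block["text"] for block in content if block["type"] == "text"]
    if not text_blocks:
        raise ValueError("response contains no text block")
    return json.loads(text_blocks[-1])


# Hand-written sample content, not real API output.
content = [
    {"type": "thinking", "thinking": "The invoice lists one date and one amount...", "signature": "sig-placeholder"},
    {"type": "text", "text": '{"dates": ["2026-01-15"], "names": ["Example Corp"], "amounts": [120.5]}'},
]
print(extract_json(content))
```

If the text block is not valid JSON, json.loads raises json.JSONDecodeError, which is a reasonable place to retry or fall back.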

Common Pitfalls

Using manual mode on Opus 4.7+: This returns a 400 error. Always use adaptive thinking for new models.
Setting budget_tokens too high: If budget_tokens equals or exceeds max_tokens, the API will reject the request.
Ignoring the signature: When streaming, always capture the signature for verification purposes.
Not accounting for token usage: Extended thinking consumes tokens from your rate limits and billing. Monitor your usage carefully.
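For the last pitfall, a simple running total makes token consumption visible. This sketch assumes each response has been converted to a plain dict with a usage object containing input_tokens and output_tokens. The helper name total_usage is hypothetical, and the sample numbers are invented.

```python
def total_usage(responses: list) -> dict:
    """Sum input and output token counts across a list of responses."""
    totals = {"input_tokens": 0, "output_tokens": 0}
    for response in responses:
        usage = response.get("usage", {})
        for key in totals:
            totals[key] += usage.get(key, 0)
    return totals


# Invented sample data for illustration.
responses = [
    {"usage": {"input_tokens": 120, "output_tokens": 4200}},
    {"usage": {"input_tokens": 95, "output_tokens": 1800}},
]
print(total_usage(responses))
```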

Key Takeaways

Adaptive thinking is the future: For Claude Opus 4.7 and later, use thinking: {type: "adaptive", effort: "low"|"medium"|"high"} instead of manual budget tokens.
Manual mode is deprecated: It still works on Opus 4.6 and Sonnet 4.6 but will be removed. Migrate to adaptive thinking as soon as possible.
Extended thinking improves complex reasoning: Use it for math, code analysis, multi-step logic, and strategic planning tasks.
Handle thinking blocks appropriately: You can display, hide, or summarize Claude's internal reasoning depending on your use case.
Monitor token usage: Extended thinking consumes tokens from your max_tokens budget and affects rate limits and costs.