Guide · Beginner · API · 2026-05-22

Mastering Extended Thinking in Claude: A Practical Guide to Adaptive and Manual Modes

Learn how to use Claude's extended thinking feature for complex reasoning tasks. Covers adaptive thinking, manual mode, effort parameters, and code examples for the API.

Quick Answer

This guide explains how to enable and configure Claude's extended thinking for complex reasoning tasks. You'll learn the difference between adaptive thinking (the only supported mode on Claude Opus 4.7) and manual mode, how to set effort levels, and how to handle thinking blocks in your API responses.

extended thinking, Claude API, reasoning, adaptive thinking, model capabilities

Claude's extended thinking feature gives Claude a dedicated "thinking" phase before it produces its final answer, which can yield deeper, more accurate responses on complex tasks such as mathematical proofs and multi-step analysis. This guide covers how to implement extended thinking effectively, whether you're using the latest Claude Opus 4.7 or earlier models.

What Is Extended Thinking?

Extended thinking allows Claude to "think out loud" before delivering its final answer. When enabled, the API response includes special thinking content blocks that contain Claude's internal reasoning, followed by the final text response. This provides:

Transparency into Claude's reasoning process
Improved accuracy on complex tasks like math, logic, and planning
Debugging insights when responses are unexpected

Adaptive Thinking vs. Manual Extended Thinking

Claude now offers two approaches to extended thinking, and the right choice depends on your model version.

Adaptive Thinking (Recommended)

Adaptive thinking (thinking: {type: "adaptive"}) lets Claude decide how much thinking is needed for each request. You control the effort level rather than a fixed token budget. This is the only supported mode on Claude Opus 4.7 and is recommended for all current models.

Manual Extended Thinking (Deprecated)

Manual extended thinking (thinking: {type: "enabled", budget_tokens: N}) lets you set a fixed token budget for thinking. This mode is:

No longer supported on Claude Opus 4.7 (returns a 400 error)
Deprecated but functional on Claude Opus 4.6 and Claude Sonnet 4.6
Scheduled for removal in a future model release

Key takeaway: If you're building new applications, use adaptive thinking. If you're maintaining legacy code on older models, plan to migrate soon.

Model-Specific Behavior

Different Claude models handle extended thinking differently:

Model	Adaptive Thinking	Manual Mode	Notes
Claude Opus 4.7	✅ Required	❌ Not supported	Use `effort` parameter
Claude Opus 4.6	✅ Recommended	✅ Deprecated	Migrate to adaptive
Claude Sonnet 4.6	✅ Recommended	✅ Deprecated (interleaved)	Migrate to adaptive
Claude Mythos Preview	✅ Default	✅ Accepted	`disabled` not supported; display defaults to `"omitted"`
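
The table above can be encoded as a small lookup so that request-building code picks a valid thinking value per model. This is a minimal sketch rather than an official helper: the SUPPORT dictionary, the thinking_config function, and the use of display names as keys are all assumptions made for illustration, and the config shapes simply mirror the ones used in this guide.

```python
# Hypothetical lookup mirroring the table above. Keys are display names,
# not API model IDs.
SUPPORT = {
    "Claude Opus 4.7": {"adaptive": True, "manual": False},
    "Claude Opus 4.6": {"adaptive": True, "manual": True},
    "Claude Sonnet 4.6": {"adaptive": True, "manual": True},
    "Claude Mythos Preview": {"adaptive": True, "manual": True},
}


def thinking_config(model_name, effort="medium", budget_tokens=None):
    """Return a `thinking` dict in the shape this guide uses.

    Adaptive mode is the default. Manual mode is used only when the caller
    passes budget_tokens and the table marks manual mode as accepted.
    """
    modes = SUPPORT.get(model_name)
    if modes is None:
        raise ValueError(f"unknown model: {model_name}")
    if budget_tokens is not None:
        if not modes["manual"]:
            raise ValueError(f"{model_name} does not accept manual mode")
        return {"type": "enabled", "budget_tokens": budget_tokens}
    return {"type": "adaptive", "effort": effort}


print(thinking_config("Claude Opus 4.7", effort="high"))
```

Raising early on an unsupported combination surfaces the problem before a request is sent, instead of relying on the API's 400 response.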

How to Use Extended Thinking in the API

Prerequisites

An Anthropic API key
The Anthropic Python SDK (pip install anthropic) or TypeScript SDK

Basic Implementation with Adaptive Thinking

Here's how to enable adaptive thinking on Claude Opus 4.7:

import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={
        "type": "adaptive",
        "effort": "high"  # Options: "low", "medium", "high"
    },
    messages=[
        {
            "role": "user",
            "content": "Are there an infinite number of prime numbers such that n mod 4 == 3?"
        }
    ]
)
# Process the response
for block in response.content:
    if block.type == "thinking":
        print(f"Thinking: {block.thinking}")
    elif block.type == "text":
        print(f"Final answer: {block.text}")

Using Manual Extended Thinking (Legacy Models)

For Claude Opus 4.6 or Sonnet 4.6, you can still use manual mode:

# Reuses the `client` created in the adaptive example above
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=16000,
    thinking={
        "type": "enabled",
        "budget_tokens": 10000  # Set your thinking budget
    },
    messages=[
        {
            "role": "user",
            "content": "Explain the Riemann Hypothesis in simple terms."
        }
    ]
)

TypeScript Example

import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 16000,
  thinking: {
    type: 'adaptive',
    effort: 'high'
  },
  messages: [
    {
      role: 'user',
      content: 'Design a sorting algorithm that works in O(n log n) time.'
    }
  ]
});
for (const block of response.content) {
  if (block.type === 'thinking') {
    console.log('Thinking:', block.thinking);
  } else if (block.type === 'text') {
    console.log('Answer:', block.text);
  }
}

Understanding the Response Format

When extended thinking is enabled, the API response contains a content array with two types of blocks:

{
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me analyze this step by step...",
      "signature": "WaUjzkypQ2mUEVM36O2TxuC06KN8xyfbJwyem2dw3URve/op91XWHOEBLLqIOMfFG/UvLEczmEsUjavL...."
    },
    {
      "type": "text",
      "text": "Based on my analysis..."
    }
  ]
}

The signature field lets the API verify that a thinking block was generated by Claude when you send it back in a later request, so pass thinking blocks back unmodified. The thinking blocks appear before the final text response.
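
To make the response shape concrete, here is a minimal sketch that separates reasoning from the final answer. It works on plain dictionaries shaped like the JSON above, not on SDK objects. The split_blocks name and the placeholder signature value are illustrative assumptions.

```python
def split_blocks(content):
    """Separate thinking text from answer text in a response `content` list.

    `content` is a list of dicts shaped like the JSON above. Blocks of any
    other type are ignored.
    """
    thinking_parts = []
    text_parts = []
    for block in content:
        if block.get("type") == "thinking":
            thinking_parts.append(block.get("thinking", ""))
        elif block.get("type") == "text":
            text_parts.append(block.get("text", ""))
    return "\n".join(thinking_parts), "\n".join(text_parts)


# Sample data copied from the response format above, with a placeholder signature
sample = [
    {
        "type": "thinking",
        "thinking": "Let me analyze this step by step...",
        "signature": "<signature>",
    },
    {"type": "text", "text": "Based on my analysis..."},
]
reasoning, answer = split_blocks(sample)
print(answer)  # prints: Based on my analysis...
```

Keeping the two streams separate makes it straightforward to log the reasoning for debugging while showing only the answer to end users.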

Choosing the Right Effort Level

With adaptive thinking, you control the effort parameter:

"low": Minimal thinking, faster responses. Good for simple tasks where you just need a quick check.
"medium": Balanced thinking. Suitable for most analytical tasks.
"high": Maximum reasoning depth. Best for complex proofs, multi-step logic, or tasks requiring careful analysis.

Rule of thumb: Start with "medium" and increase to "high" if you need deeper reasoning. Use "low" for high-throughput applications where speed matters more than depth.
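
The rule of thumb can be written down as a small selection helper. This is a sketch under assumptions: pick_effort, escalate, and the two boolean flags are hypothetical names, and how you decide that a task "needs deep reasoning" is left to your application.

```python
EFFORT_LEVELS = ["low", "medium", "high"]


def pick_effort(speed_critical=False, needs_deep_reasoning=False):
    """Medium by default, low for throughput, high when depth is needed."""
    if needs_deep_reasoning:
        return "high"
    if speed_critical:
        return "low"
    return "medium"


def escalate(effort):
    """Return the next effort level up, staying at "high" once reached."""
    index = EFFORT_LEVELS.index(effort)
    return EFFORT_LEVELS[min(index + 1, len(EFFORT_LEVELS) - 1)]


print(pick_effort())       # medium
print(escalate("medium"))  # high
```

Depth takes priority over speed here, so a task flagged as both speed-critical and reasoning-heavy gets "high". Flip the order of the checks if latency matters more in your application.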

Best Practices

1. Set Appropriate `max_tokens`

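Thinking tokens count toward max_tokens, so in manual mode budget_tokens must be less than max_tokens, and in adaptive mode max_tokens should leave room for both the thinking and the final answer. A good rule is to set max_tokens to at least 1.5× your expected thinking tokens.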
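
The arithmetic behind this rule is simple enough to check in code. The helper names below are hypothetical, and the 1.5 factor is the rule of thumb from the text, not a documented limit.

```python
import math


def suggested_max_tokens(expected_thinking_tokens, factor=1.5):
    """Apply the 1.5x rule of thumb from the text, rounding up."""
    return math.ceil(expected_thinking_tokens * factor)


def budget_fits(max_tokens, budget_tokens):
    """Manual mode check: the thinking budget must be below max_tokens."""
    return budget_tokens < max_tokens


print(suggested_max_tokens(10000))  # 15000
print(budget_fits(16000, 10000))    # True
```

With an expected 10,000 thinking tokens the rule suggests at least 15,000 for max_tokens, which is consistent with the 16000 used in the examples in this guide.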

2. Handle Thinking Blocks in Streaming

When streaming, thinking blocks appear as separate events. Make sure your streaming handler can process both thinking and text content blocks:

with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive", "effort": "high"},
    messages=[{"role": "user", "content": "Solve this equation step by step."}]
) as stream:
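    for event in stream:
        # Print a label once per block, then append each delta as it arrives
        if event.type == "content_block_start":
            if event.content_block.type == "thinking":
                print("\nThinking: ", end="")
            elif event.content_block.type == "text":
                print("\nAnswer: ", end="")
        elif event.type == "content_block_delta":
            if event.delta.type == "thinking_delta":
                print(event.delta.thinking, end="", flush=True)
            elif event.delta.type == "text_delta":
                print(event.delta.text, end="", flush=True)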
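
If you need the complete reasoning and answer after the stream ends, the deltas can be accumulated instead of printed. The sketch below runs offline on hand-written event dictionaries. Their shape is an assumption that mirrors the attribute names used in the handler above (type, delta.type, delta.thinking, delta.text), and accumulate is a hypothetical helper name.

```python
def accumulate(events):
    """Collect streamed deltas into (thinking, answer) strings.

    Only content_block_delta events carrying a thinking_delta or a
    text_delta are used; every other event is skipped.
    """
    thinking, answer = [], []
    for event in events:
        if event.get("type") != "content_block_delta":
            continue
        delta = event.get("delta", {})
        if delta.get("type") == "thinking_delta":
            thinking.append(delta.get("thinking", ""))
        elif delta.get("type") == "text_delta":
            answer.append(delta.get("text", ""))
    return "".join(thinking), "".join(answer)


# Hand-written events standing in for a real stream
events = [
    {"type": "content_block_delta",
     "delta": {"type": "thinking_delta", "thinking": "2x = 8, "}},
    {"type": "content_block_delta",
     "delta": {"type": "thinking_delta", "thinking": "so x = 4."}},
    {"type": "content_block_stop"},
    {"type": "content_block_delta",
     "delta": {"type": "text_delta", "text": "x = 4"}},
]
thinking_text, answer_text = accumulate(events)
print(answer_text)  # x = 4
```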

3. Use for Complex Reasoning Tasks

Extended thinking shines on tasks like:

Mathematical proofs and derivations
Multi-step logical reasoning
Code generation with complex algorithms
Strategic planning and analysis
Debugging and root cause analysis

4. Don't Use for Simple Queries

For simple factual questions or straightforward tasks, extended thinking adds latency without benefit. Reserve it for problems that genuinely benefit from step-by-step reasoning.

Common Pitfalls

Using manual mode on Opus 4.7: This returns a 400 error. Always use adaptive thinking.
Setting budget_tokens too low: Claude may run out of thinking tokens before completing its reasoning, leading to truncated responses.
Forgetting max_tokens: Extended thinking requires max_tokens to be set explicitly.
Ignoring the signature field: While optional for most use cases, the signature is important for verifying the authenticity of thinking blocks.
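
Several of these pitfalls can be caught before a request is sent. The following is a sketch of a local pre-flight check, not part of any SDK: preflight is a hypothetical helper, the request is a plain dictionary of the arguments you would pass to client.messages.create, and MIN_BUDGET is an assumed threshold chosen for illustration.

```python
MIN_BUDGET = 1024  # assumed threshold for "too low", for illustration only


def preflight(request):
    """Return a list of warnings for the pitfalls listed above."""
    warnings = []
    thinking = request.get("thinking") or {}
    if "max_tokens" not in request:
        warnings.append("max_tokens is not set")
    if thinking.get("type") == "enabled":
        if request.get("model") == "claude-opus-4-7":
            warnings.append("manual mode is rejected on Opus 4.7; use adaptive")
        budget = thinking.get("budget_tokens", 0)
        if budget < MIN_BUDGET:
            warnings.append("budget_tokens may be too low to finish reasoning")
        if "max_tokens" in request and budget >= request["max_tokens"]:
            warnings.append("budget_tokens must be below max_tokens")
    return warnings


bad_request = {
    "model": "claude-opus-4-7",
    "thinking": {"type": "enabled", "budget_tokens": 500},
}
for warning in preflight(bad_request):
    print(warning)
```

Returning warnings instead of raising lets the caller decide whether to log the issues, fix them automatically, or abort.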

Migrating from Manual to Adaptive Thinking

If you're currently using manual extended thinking, here's your migration checklist:

Update your model to Claude Opus 4.7 (or stay on Claude Opus 4.6 or Sonnet 4.6 and switch to adaptive)
Replace budget_tokens with the effort parameter
Test with "medium" effort first, then adjust up or down
Update your error handling to catch 400 errors if you accidentally use manual mode on Opus 4.7
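
The second and third checklist items amount to a small config rewrite. A minimal sketch, assuming the config shapes used in this guide; migrate_thinking is a hypothetical helper name.

```python
def migrate_thinking(thinking, effort="medium"):
    """Convert a manual thinking config to the adaptive shape.

    Configs that are not manual are returned as an unchanged copy. The old
    budget_tokens value is dropped and not mapped to an effort level,
    because this guide gives no conversion between the two.
    """
    if thinking.get("type") == "enabled":
        return {"type": "adaptive", "effort": effort}
    return dict(thinking)


old = {"type": "enabled", "budget_tokens": 10000}
print(migrate_thinking(old))  # {'type': 'adaptive', 'effort': 'medium'}
```

Starting at "medium" follows the checklist. Adjust the effort argument after testing against your own prompts.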

Key Takeaways

Adaptive thinking is the future: Use thinking: {type: "adaptive", effort: "high|medium|low"} for all new projects, especially on Claude Opus 4.7.
Manual mode is deprecated: Avoid thinking: {type: "enabled", budget_tokens: N} on new code; it will be removed in future model releases.
Choose effort wisely: Match the effort level to task complexity—"low" for speed, "high" for depth.
Handle both thinking and text blocks: Your application must process both content block types in API responses.
Extended thinking adds latency: Use it strategically for complex reasoning, not for simple queries.
