Guide, 2026-04-22

Mastering Extended Thinking in Claude: Adaptive vs. Manual Thinking for Complex Reasoning

Learn how to use Claude's extended thinking feature for enhanced reasoning. Covers adaptive thinking, manual mode, effort parameters, and code examples for Opus 4.7 and Sonnet 4.6.

Quick Answer

This guide explains how to enable and configure Claude's extended thinking for complex tasks, covering adaptive thinking (recommended for Opus 4.7+) and manual mode (deprecated but functional on older models). You'll learn to set effort levels, handle thinking blocks in responses, and choose the right configuration for your use case.

extended thinking, adaptive thinking, Claude API, reasoning, token budget

Introduction

Claude's extended thinking capability supports deeper reasoning on complex tasks by letting the model reason step by step before delivering a final answer. It is most useful for multi-step problem solving, mathematical proofs, code analysis, and any scenario where visibility into the model's reasoning adds value.

Adaptive thinking is a more flexible approach that replaces the older manual extended thinking mode. It is the recommended option on every model covered here and the only one accepted on Claude Opus 4.7 and later. This guide covers both approaches, explains when to use each, and provides practical API examples to get you started.

How Extended Thinking Works

When extended thinking is enabled, Claude generates thinking content blocks in its response before the final text output. These blocks contain the model's internal reasoning — step-by-step analysis, considerations, and intermediate conclusions.

A typical response looks like this:

{
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me break this problem down. First, I need to consider...",
      "signature": "WaUjzkypQ2mUEVM36O2TxuC06KN8xyfbJwyem2dw3URve/op91XWHOEBLLqIOMfFG/UvLEczmEsUjavL...."
    },
    {
      "type": "text",
      "text": "Based on my step-by-step analysis, the answer is..."
    }
  ]
}

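The signature field is an opaque value used to verify that the thinking block was generated by Claude. Your application does not need to interpret it. If you send a thinking block back to the API in a later request, pass it unmodified with its signature.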
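As a minimal sketch, the response shape above can be split into reasoning and answer text with plain dictionary access. The `split_blocks` helper and the sample payload are illustrative and not part of the SDK; SDK response objects expose the same fields as attributes (`block.type`, `block.thinking`, `block.text`) rather than dict keys.

```python
def split_blocks(content):
    """Return (reasoning, answer) text from a list of content-block dicts."""
    reasoning_parts = []
    answer_parts = []
    for block in content:
        if block.get("type") == "thinking":
            reasoning_parts.append(block.get("thinking", ""))
        elif block.get("type") == "text":
            answer_parts.append(block.get("text", ""))
        # Any other block type is ignored in this sketch.
    return "\n".join(reasoning_parts), "\n".join(answer_parts)


# Hypothetical payload shaped like the example response above.
sample_response = {
    "content": [
        {"type": "thinking", "thinking": "First, consider...", "signature": "abc123"},
        {"type": "text", "text": "The answer is 5."},
    ]
}

reasoning, answer = split_blocks(sample_response["content"])
print("Thinking:", reasoning)
print("Answer:", answer)
```

Joining with newlines keeps multiple thinking blocks readable if a response contains more than one.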

Adaptive Thinking (Recommended for Opus 4.7+)

Adaptive thinking is the modern, recommended approach for Claude Opus 4.7 and later models. Instead of setting a fixed token budget, you specify an effort level that tells Claude how much reasoning to apply. The model dynamically allocates thinking tokens based on the complexity of the task.

Effort Parameter

The effort parameter accepts values from 0.0 to 1.0:

Effort Value	Behavior
`0.0`	Minimal thinking — fast, direct responses
`0.5`	Balanced reasoning — good for most tasks
`1.0`	Maximum reasoning — best for complex problems

API Example (Python)

import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={
        "type": "adaptive",
        "effort": 0.8  # High reasoning effort
    },
    messages=[
        {
            "role": "user",
            "content": "Prove that the square root of 2 is irrational."
        }
    ]
)
# Access thinking blocks
for block in response.content:
    if block.type == "thinking":
        print("Thinking:", block.thinking)
    elif block.type == "text":
        print("Answer:", block.text)

API Example (TypeScript)

import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 4096,
  thinking: {
    type: 'adaptive',
    effort: 0.8
  },
  messages: [
    { role: 'user', content: 'Prove that the square root of 2 is irrational.' }
  ]
});
for (const block of response.content) {
  if (block.type === 'thinking') {
    console.log('Thinking:', block.thinking);
  } else if (block.type === 'text') {
    console.log('Answer:', block.text);
  }
}

Manual Extended Thinking (Deprecated)

Manual extended thinking (thinking: {type: "enabled", budget_tokens: N}) is the older approach where you specify a fixed token budget for reasoning. This mode is no longer supported on Claude Opus 4.7 and later models (returns a 400 error). It remains functional but deprecated on Claude Opus 4.6 and Claude Sonnet 4.6.

When to Use Manual Mode

You are using Claude Opus 4.6 or Claude Sonnet 4.6
You need precise control over reasoning token allocation
You are migrating existing code and need backward compatibility

API Example (Python)

import anthropic

client = anthropic.Anthropic()

# Manual mode: a fixed reasoning budget, kept below max_tokens
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=8192,
    thinking={
        "type": "enabled",
        "budget_tokens": 4096  # Fixed reasoning budget
    },
    messages=[
        {
            "role": "user",
            "content": "Design a distributed caching system."
        }
    ]
)

Model-Specific Behavior

Different Claude models handle extended thinking differently:

Model	Adaptive Thinking	Manual Mode	Notes
Claude Opus 4.7+	✅ Recommended	❌ Returns 400 error	Use `effort` parameter
Claude Opus 4.6	✅ Recommended	⚠️ Deprecated, functional	Migrate to adaptive
Claude Sonnet 4.6	✅ Recommended	⚠️ Deprecated, functional	Interleaved mode available
Claude Mythos Preview	✅ Default	✅ Accepted	Adaptive is default; pass `display: "summarized"` for summaries
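The table above can be turned into a small lookup that picks a thinking configuration per model. This is a sketch under stated assumptions: `choose_thinking_config` and `MANUAL_MODE_STATUS` are hypothetical names, `claude-opus-4-6` is an assumed model ID (the other two IDs appear in this guide's examples), and the field names mirror the examples in this guide rather than a verified API reference. Claude Mythos Preview is left out and falls through to the adaptive default.

```python
# Manual-mode status per model, transcribed from the table above.
# "claude-opus-4-6" is an assumed ID; the others come from this guide's examples.
MANUAL_MODE_STATUS = {
    "claude-opus-4-7": "rejected",
    "claude-opus-4-6": "deprecated",
    "claude-sonnet-4-6": "deprecated",
}


def choose_thinking_config(model, effort=0.5, budget_tokens=None):
    """Pick a `thinking` dict for `model`, preferring adaptive thinking."""
    status = MANUAL_MODE_STATUS.get(model, "unknown")
    if budget_tokens is not None and status == "deprecated":
        # Legacy path: keep the fixed budget where manual mode still works.
        return {"type": "enabled", "budget_tokens": budget_tokens}
    # Everything else, including unknown models, gets adaptive thinking.
    return {"type": "adaptive", "effort": effort}


print(choose_thinking_config("claude-opus-4-7", budget_tokens=4096))
print(choose_thinking_config("claude-sonnet-4-6", budget_tokens=4096))
```

Falling back to adaptive thinking instead of raising keeps migrated call sites working when a model no longer accepts a budget.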

Streaming with Extended Thinking

When streaming responses, thinking blocks appear before text blocks. You must handle both block types in your stream handler:

import anthropic

client = anthropic.Anthropic()

# stream=True returns an iterator of events instead of a complete message
stream = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={"type": "adaptive", "effort": 0.7},
    messages=[{"role": "user", "content": "Solve this equation step by step: 3x + 7 = 22"}],
    stream=True
)
for event in stream:
    if event.type != "content_block_delta":
        continue  # ignore start/stop events
    if event.delta.type == "thinking_delta":
        print(event.delta.thinking, end="", flush=True)
    elif event.delta.type == "text_delta":
        print(event.delta.text, end="", flush=True)
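The same event handling can be exercised offline by separating the accumulation logic from the network call. The sketch below assumes events shaped like those in the streaming example above (`event.type`, `event.delta.type`, `event.delta.thinking`, `event.delta.text`); `collect_stream` and `fake_delta` are hypothetical helpers, and the fake events are stand-ins rather than SDK types.

```python
from types import SimpleNamespace


def collect_stream(events):
    """Accumulate thinking and text deltas from an iterable of stream events."""
    thinking_parts, text_parts = [], []
    for event in events:
        if event.type != "content_block_delta":
            continue  # skip start/stop and other event types
        if event.delta.type == "thinking_delta":
            thinking_parts.append(event.delta.thinking)
        elif event.delta.type == "text_delta":
            text_parts.append(event.delta.text)
    return "".join(thinking_parts), "".join(text_parts)


def fake_delta(kind, **fields):
    """Build a stand-in delta event for testing; not an SDK type."""
    return SimpleNamespace(
        type="content_block_delta",
        delta=SimpleNamespace(type=kind, **fields),
    )


fake_events = [
    SimpleNamespace(type="message_start"),
    fake_delta("thinking_delta", thinking="Subtract 7: 3x = 15. "),
    fake_delta("thinking_delta", thinking="Divide by 3: x = 5."),
    fake_delta("text_delta", text="x = 5"),
    SimpleNamespace(type="message_stop"),
]

thinking, text = collect_stream(fake_events)
print("Thinking:", thinking)
print("Answer:", text)
```

Keeping the accumulator free of API calls makes it straightforward to unit test the handler before wiring it to a live stream.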

Best Practices

1. Choose the Right Effort Level

Low effort (0.0–0.3): Simple Q&A, direct instructions, factual lookups
Medium effort (0.4–0.7): Code generation, analysis, moderate problem solving
High effort (0.8–1.0): Complex proofs, multi-step reasoning, creative tasks
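One way to apply these bands consistently is a small lookup keyed by task type. This is only a sketch: the task labels and the specific values chosen within each band are assumptions, and the 0.0 to 1.0 scale is taken from this guide rather than verified against the API reference.

```python
# One representative value per band listed above; labels are hypothetical.
EFFORT_BY_TASK = {
    "lookup": 0.2,   # low band (0.0-0.3): simple Q&A, factual lookups
    "codegen": 0.6,  # medium band (0.4-0.7): code generation, analysis
    "proof": 0.9,    # high band (0.8-1.0): proofs, multi-step reasoning
}


def effort_for(task_kind, default=0.5):
    """Return an effort value for `task_kind`, falling back to `default`."""
    effort = EFFORT_BY_TASK.get(task_kind, default)
    if not 0.0 <= effort <= 1.0:
        raise ValueError(f"effort must be between 0.0 and 1.0, got {effort}")
    return effort


print(effort_for("proof"))
print(effort_for("unlisted-task"))
```

Centralizing the mapping means effort levels can be tuned in one place as you measure latency and answer quality.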

2. Set Appropriate max_tokens

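In manual mode, your max_tokens value must be larger than budget_tokens, because the final text output consumes tokens on top of the thinking tokens. A good rule of thumb: max_tokens = thinking_budget + expected_output_tokens. With adaptive thinking there is no fixed budget, but thinking still counts against max_tokens, so leave generous headroom at higher effort levels.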
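The rule of thumb above is simple arithmetic. The helper below, a hypothetical `max_tokens_for`, works it through for a 4096-token thinking budget and an expected 2048-token answer.

```python
def max_tokens_for(thinking_budget, expected_output_tokens):
    """Apply max_tokens = thinking_budget + expected_output_tokens."""
    if thinking_budget < 0:
        raise ValueError("thinking_budget must not be negative")
    if expected_output_tokens <= 0:
        raise ValueError("expected_output_tokens must be positive")
    return thinking_budget + expected_output_tokens


print(max_tokens_for(4096, 2048))  # 6144
```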

3. Handle Thinking Blocks in UI

If you're building a chat interface, consider:

Displaying thinking blocks in a collapsible section
Showing a "thinking..." indicator while streaming
Allowing users to toggle visibility of reasoning
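A collapsible section can be sketched with the HTML `<details>` element. The `render_blocks` function below is a hypothetical helper that works on content-block dicts shaped like the example response earlier in this guide; a real interface would likely use its own templating and styling.

```python
from html import escape


def render_blocks(content, show_thinking=False):
    """Render content blocks as HTML, with reasoning in a collapsible section."""
    parts = []
    for block in content:
        if block.get("type") == "thinking":
            # The `open` attribute expands the section by default.
            open_attr = " open" if show_thinking else ""
            parts.append(
                f"<details{open_attr}><summary>Reasoning</summary>"
                f"<pre>{escape(block.get('thinking', ''))}</pre></details>"
            )
        elif block.get("type") == "text":
            parts.append(f"<p>{escape(block.get('text', ''))}</p>")
    return "\n".join(parts)


demo_blocks = [
    {"type": "thinking", "thinking": "Check whether a < b first."},
    {"type": "text", "text": "Done"},
]
collapsed_html = render_blocks(demo_blocks)
expanded_html = render_blocks(demo_blocks, show_thinking=True)
print(collapsed_html)
```

Escaping the model output before embedding it avoids rendering any markup that appears inside the reasoning or the answer.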

4. Use Signatures for Verification

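The signature lets the API verify that thinking content hasn't been tampered with. It is opaque, so your application cannot validate it directly. If you store thinking blocks (e.g., for audit in regulated environments) or send them back in a later request, keep each block unmodified together with its signature field.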
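A minimal sketch of keeping thinking blocks intact when carrying them into a follow-up request. The `append_assistant_turn` helper is hypothetical, and whether thinking blocks must be returned at all depends on the flow you are using (for example, tool use), so treat this as an illustration of "do not edit the block" rather than a required step.

```python
import copy


def append_assistant_turn(messages, response_content):
    """Return a new message list with the assistant's blocks appended unchanged."""
    # Deep copy so later edits to the caller's data cannot alter stored blocks.
    turn = {"role": "assistant", "content": copy.deepcopy(response_content)}
    return messages + [turn]


history = [{"role": "user", "content": "Prove that the square root of 2 is irrational."}]
blocks = [
    {"type": "thinking", "thinking": "Assume it is rational...", "signature": "sig-placeholder"},
    {"type": "text", "text": "Suppose sqrt(2) = p/q in lowest terms..."},
]

updated = append_assistant_turn(history, blocks)
print(updated[-1]["content"][0]["signature"])
```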

Common Pitfalls

Forgetting to increase max_tokens: If your thinking budget is 4000 tokens and max_tokens is 4000, Claude will have zero tokens left for the final answer.
Using manual mode on Opus 4.7+: This will return a 400 error. Always use adaptive thinking for new models.
Ignoring thinking blocks in streaming: Your stream handler must account for both thinking_delta and text_delta events.
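The first two pitfalls can be caught before a request is sent. The sketch below is a hypothetical pre-flight check over a plain dict of request parameters; it only recognizes the `claude-opus-4-7` prefix used in this guide's examples, so later models would need their own entries, and it does not cover the streaming pitfall.

```python
def find_pitfalls(params):
    """Return a list of warnings for a dict of request parameters."""
    problems = []
    thinking = params.get("thinking") or {}
    model = params.get("model", "")
    max_tokens = params.get("max_tokens", 0)

    if thinking.get("type") == "enabled":
        # Only the Opus 4.7 prefix is checked here; extend for later models.
        if model.startswith("claude-opus-4-7"):
            problems.append("manual mode is rejected on Opus 4.7+; use adaptive thinking")
        if max_tokens <= thinking.get("budget_tokens", 0):
            problems.append("max_tokens leaves no room for the final answer")
    return problems


risky_request = {
    "model": "claude-opus-4-7",
    "max_tokens": 4000,
    "thinking": {"type": "enabled", "budget_tokens": 4000},
}
for problem in find_pitfalls(risky_request):
    print("Warning:", problem)
```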

Conclusion

Extended thinking is a powerful tool for unlocking Claude's full reasoning potential. With adaptive thinking on Opus 4.7+, you get dynamic, efficient reasoning without manual token budgeting. On older models, manual mode still works, but code that relies on it should be migrated to adaptive thinking as soon as possible.

By choosing the right effort level, handling thinking blocks properly, and following best practices, you can build applications that leverage Claude's deep reasoning capabilities transparently and effectively.

Key Takeaways

Adaptive thinking (thinking: {type: "adaptive", effort: N}) is the recommended approach for Claude Opus 4.7+ and should be used for all new projects.
Manual extended thinking (budget_tokens) is deprecated on Opus 4.6/Sonnet 4.6 and returns an error on Opus 4.7+ — migrate existing code to adaptive thinking.
Effort levels range from 0.0 (minimal reasoning) to 1.0 (maximum reasoning); choose based on task complexity.
Always set max_tokens higher than your thinking budget to leave room for the final text output.
Handle thinking blocks in your application code, especially when streaming, to provide a smooth user experience.