BeClaude
Guide · 2026-04-29

Mastering Extended Thinking in Claude: A Practical Guide to Adaptive & Manual Thinking Modes

Learn how to enable and optimize Claude's extended thinking capabilities for complex reasoning tasks. Covers adaptive thinking, manual budgets, effort parameters, and code examples.

Quick Answer

This guide teaches you how to use Claude's extended thinking feature to enhance reasoning for complex tasks. You'll learn about adaptive thinking with effort parameters, manual token budgets, and how to implement them in the Messages API with practical code examples.

Tags: extended thinking, adaptive thinking, Claude API, reasoning, token budget

Introduction

Claude's extended thinking feature lets the model work through complex problems step by step before delivering a final answer. Whether you're building a code analysis tool, a mathematical solver, or a multi-step decision engine, extended thinking gives Claude room to reason before it commits to a response, which tends to produce more accurate and nuanced answers.

This guide covers everything you need to know: from the basics of enabling extended thinking to advanced configuration with adaptive thinking and effort parameters. You'll find practical code examples, model-specific behavior notes, and best practices to get the most out of this powerful feature.

How Extended Thinking Works

When extended thinking is enabled, Claude generates thinking content blocks in its response. These blocks contain the model's internal reasoning—its step-by-step thought process—before it produces the final text output.

Here's what a typical response looks like:

{
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me analyze this step by step...",
      "signature": "WaUjzkypQ2mUEVM36O2TxuC06KN8xyfbJwyem2dw3URve/op91XWHOEBLLqIOMfFG/UvLEczmEsUjavL...."
    },
    {
      "type": "text",
      "text": "Based on my analysis..."
    }
  ]
}

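Each thinking block includes a signature field. The API uses it to verify that a thinking block passed back in a later request (for example, during multi-turn tool use) was generated by Claude and has not been modified. The final answer appears in the text blocks that follow.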
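To make the block structure concrete, here is a minimal sketch that separates the reasoning from the final answer. It works on plain dictionaries shaped like the JSON above rather than on SDK objects; `split_blocks` is a hypothetical helper name, and the signature value is a placeholder.

```python
from typing import Any


def split_blocks(content: list[dict[str, Any]]) -> tuple[list[str], list[str]]:
    """Separate thinking text from answer text, preserving order within each group."""
    thinking: list[str] = []
    answer: list[str] = []
    for block in content:
        if block.get("type") == "thinking":
            thinking.append(block.get("thinking", ""))
        elif block.get("type") == "text":
            answer.append(block.get("text", ""))
        # Any other block type is ignored in this sketch.
    return thinking, answer


sample_content = [
    {"type": "thinking", "thinking": "Let me analyze this step by step...", "signature": "..."},
    {"type": "text", "text": "Based on my analysis..."},
]

thoughts, answers = split_blocks(sample_content)
print(thoughts)
print(answers)
```

Keeping the two groups separate makes it easy to log the reasoning for debugging while showing only the answer to end users.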

Adaptive Thinking (Recommended for Claude 4 Models)

For Claude Opus 4.7 and later models, Anthropic has introduced adaptive thinking—a smarter way to allocate thinking resources. Instead of manually setting a fixed token budget, you tell Claude how much effort to apply, and it dynamically adjusts its thinking depth.

Using the Effort Parameter

The effort parameter accepts values on a scale from 0.0 to 1.0:

  • Low effort (0.0–0.3): Quick reasoning for simple tasks
  • Medium effort (0.4–0.7): Balanced depth for most complex tasks
  • High effort (0.8–1.0): Maximum reasoning for very challenging problems

Python example:

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={
        "type": "adaptive",
        "effort": 0.8,  # High effort for complex reasoning
    },
    messages=[
        {"role": "user", "content": "Solve this complex math problem: integrate ∫(x^2 * sin(x)) dx"}
    ],
)

# Access thinking blocks
for block in response.content:
    if block.type == "thinking":
        print("Thinking:", block.thinking)
    elif block.type == "text":
        print("Answer:", block.text)

TypeScript example:

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

async function main() {
  const response = await client.messages.create({
    model: 'claude-opus-4-7',
    max_tokens: 4096,
    thinking: {
      type: 'adaptive',
      effort: 0.8
    },
    messages: [
      { role: 'user', content: 'Analyze the pros and cons of quantum computing for cryptography.' }
    ]
  });

  for (const block of response.content) {
    if (block.type === 'thinking') {
      console.log('Thinking:', block.thinking);
    } else if (block.type === 'text') {
      console.log('Answer:', block.text);
    }
  }
}

main();
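
Because an out-of-range effort value is easy to pass by accident, it can help to validate the configuration before sending a request. The sketch below assumes the 0.0 to 1.0 scale and the field names exactly as described in this guide; `adaptive_thinking` is a hypothetical helper, and its result is a plain dictionary intended for the `thinking` argument.

```python
def adaptive_thinking(effort: float) -> dict:
    """Build an adaptive thinking config, rejecting effort values outside 0.0-1.0."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(effort, bool) or not isinstance(effort, (int, float)):
        raise TypeError("effort must be a number")
    if not 0.0 <= effort <= 1.0:
        raise ValueError(f"effort must be between 0.0 and 1.0, got {effort}")
    # Field names follow this guide's examples (an assumption, not verified here).
    return {"type": "adaptive", "effort": float(effort)}


config = adaptive_thinking(0.8)
print(config)
```

Failing early on the client side gives a clearer error message than waiting for the API to reject the request.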

Task Budgets (Beta)

For even finer control, you can combine adaptive thinking with a task budget—a maximum token limit for the thinking process. This is useful when you want to cap costs while still benefiting from adaptive allocation.

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=8192,
    thinking={
        "type": "adaptive",
        "effort": 0.9,
        "task_budget_tokens": 4000  # Cap thinking at 4000 tokens
    },
    messages=[
        {"role": "user", "content": "Write a comprehensive essay on the history of AI."}
    ]
)
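
When a task budget is set, it is worth checking how much of max_tokens remains for the final answer. The arithmetic below assumes that thinking tokens count toward max_tokens and that the task budget caps thinking as described above; `answer_headroom` is a hypothetical helper.

```python
def answer_headroom(max_tokens: int, task_budget_tokens: int) -> int:
    """Return the tokens left for the final answer if thinking uses its full budget."""
    if task_budget_tokens >= max_tokens:
        raise ValueError("task budget leaves no room for the final answer")
    return max_tokens - task_budget_tokens


# Values from the example above: 8192 total, thinking capped at 4000.
headroom = answer_headroom(max_tokens=8192, task_budget_tokens=4000)
print(headroom)  # 4192
```

For a long essay like the one requested above, roughly 4,000 tokens of headroom may be tight, so raising max_tokens or lowering the cap is a reasonable adjustment.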

Fast Mode (Beta Research Preview)

For scenarios where you need faster responses but still want some reasoning, fast mode reduces thinking overhead. This is ideal for real-time applications where latency matters.

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=2048,
    thinking={
        "type": "adaptive",
        "effort": 0.5,
        "fast_mode": True  # Enable faster thinking
    },
    messages=[
        {"role": "user", "content": "Summarize this article in 3 bullet points."}
    ]
)

Manual Extended Thinking (Legacy)

For Claude Opus 4.6 and Claude Sonnet 4.6, you can still use manual extended thinking with a fixed token budget. However, this approach is deprecated and will be removed in future model releases. Anthropic strongly recommends migrating to adaptive thinking.

Setting a Token Budget

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    thinking={
        "type": "enabled",
        "budget_tokens": 2000  # Fixed thinking budget
    },
    messages=[
        {"role": "user", "content": "Debug this Python code: [code here]"}
    ]
)
Important: Manual extended thinking (type: "enabled") is not supported on Claude Opus 4.7 or later models. Attempting to use it will result in a 400 error.
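
A fixed budget is easy to misconfigure, so a small pre-flight check can be useful. The sketch below assumes two constraints from Anthropic's extended thinking documentation: a minimum budget of 1,024 tokens, and a budget smaller than max_tokens. Both should be confirmed against the current API reference; `manual_thinking` is a hypothetical helper.

```python
MIN_BUDGET_TOKENS = 1024  # Assumed minimum; confirm against the API reference


def manual_thinking(budget_tokens: int, max_tokens: int) -> dict:
    """Build a manual thinking config after basic sanity checks."""
    if budget_tokens < MIN_BUDGET_TOKENS:
        raise ValueError(f"budget_tokens must be at least {MIN_BUDGET_TOKENS}")
    if budget_tokens >= max_tokens:
        raise ValueError("budget_tokens must be smaller than max_tokens")
    return {"type": "enabled", "budget_tokens": budget_tokens}


# Values from the example above.
manual_config = manual_thinking(budget_tokens=2000, max_tokens=4096)
print(manual_config)
```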

Model-Specific Behavior

Different Claude models handle extended thinking differently. Here's a quick reference:

| Model | Adaptive Thinking | Manual Thinking | Notes |
|---|---|---|---|
| Claude Opus 4.7+ | ✅ Recommended | ❌ Not supported | Use effort parameter |
| Claude Mythos Preview | ✅ Default | ✅ Accepted | disabled not supported; use display: "summarized" for summaries |
| Claude Opus 4.6 | ✅ Recommended | ⚠️ Deprecated | Manual still functional |
| Claude Sonnet 4.6 | ✅ Recommended | ⚠️ Deprecated | Interleaved mode deprecated |
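
The reference above can also be encoded as a lookup so that application code fails early when it picks an unsupported mode. This is a sketch only: the dictionary keys are hypothetical family labels chosen for illustration (not API model IDs), and the support levels are transcribed from the table.

```python
# Support levels transcribed from the table above; keys are illustrative labels.
SUPPORT = {
    "opus-4.7+": {"adaptive": "recommended", "manual": "unsupported"},
    "mythos-preview": {"adaptive": "default", "manual": "accepted"},
    "opus-4.6": {"adaptive": "recommended", "manual": "deprecated"},
    "sonnet-4.6": {"adaptive": "recommended", "manual": "deprecated"},
}


def check_mode(family: str, mode: str) -> str:
    """Return the support level, raising if the mode cannot be used at all."""
    level = SUPPORT[family][mode]
    if level == "unsupported":
        raise ValueError(f"{mode} thinking is not supported on {family}")
    return level


level = check_mode("sonnet-4.6", "manual")
print(level)  # deprecated
```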

Claude Mythos Preview Special Behavior

Claude Mythos Preview has unique behavior:

  • Adaptive thinking is the default mode
  • thinking: {type: "disabled"} is not supported
  • The display defaults to "omitted" (no thinking content returned)
  • Pass display: "summarized" to receive summaries of the thinking process
response = client.messages.create(
    model="claude-mythos-preview",
    max_tokens=4096,
    thinking={
        "type": "adaptive",
        "effort": 0.7,
        "display": "summarized"  # Get thinking summaries
    },
    messages=[
        {"role": "user", "content": "Explain quantum entanglement."}
    ]
)

Best Practices

1. Choose the Right Effort Level

Start with a moderate effort (0.5–0.7) and adjust based on task complexity. For simple factual queries, low effort (0.2–0.3) is sufficient. For multi-step reasoning, coding, or analysis, use high effort (0.8–1.0).

2. Combine with Structured Outputs

Extended thinking pairs well with structured outputs. Use thinking for reasoning, then output a structured JSON response:

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={"type": "adaptive", "effort": 0.8},
    messages=[
        {"role": "user", "content": "Extract key entities from this text and return as JSON."}
    ]
)
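
When the answer is expected to be JSON, only the text blocks should be parsed, since thinking blocks hold free-form reasoning. A minimal sketch, assuming the model returns bare JSON in its text blocks (no surrounding prose or code fences) and using plain dictionaries with made-up sample data in place of SDK objects; `parse_json_answer` is a hypothetical helper:

```python
import json
from typing import Any


def parse_json_answer(content: list[dict[str, Any]]) -> Any:
    """Join the text blocks and parse them as JSON, skipping thinking blocks."""
    text = "".join(b["text"] for b in content if b.get("type") == "text")
    return json.loads(text)  # Raises json.JSONDecodeError on malformed output


# Illustrative sample only, not real model output.
sample = [
    {"type": "thinking", "thinking": "The text mentions one person and one city."},
    {"type": "text", "text": '{"people": ["Ada Lovelace"], "places": ["London"]}'},
]

entities = parse_json_answer(sample)
print(entities)
```

In production code, catching `json.JSONDecodeError` and retrying or reporting the raw text is a sensible addition.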

3. Monitor Token Usage

Thinking tokens count toward your max_tokens limit and are billed as output tokens, so set max_tokens high enough to leave room for the final answer after the model has finished reasoning. Use the usage field in the API response to monitor consumption.
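
A quick way to watch consumption is to compare output_tokens with max_tokens. The sketch below assumes that output_tokens in the usage field includes thinking tokens, and it uses a plain dictionary with made-up numbers as a stand-in for the SDK's usage object; `output_budget_used` is a hypothetical helper.

```python
def output_budget_used(usage: dict, max_tokens: int) -> float:
    """Fraction of max_tokens consumed by output (thinking plus answer)."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    return usage["output_tokens"] / max_tokens


sample_usage = {"input_tokens": 120, "output_tokens": 3072}  # Made-up numbers
fraction = output_budget_used(sample_usage, max_tokens=4096)
print(f"{fraction:.0%}")  # 75%
```

A fraction close to 1.0 suggests the response may have been cut off, which is a signal to raise max_tokens.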

4. Handle Streaming Responses

When streaming, thinking blocks appear before text blocks. Process them in order:

stream = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={"type": "adaptive", "effort": 0.8},
    messages=[{"role": "user", "content": "Solve this riddle..."}],
    stream=True
)

for event in stream:
    if event.type == "content_block_delta" and event.delta.type == "thinking_delta":
        print(event.delta.thinking, end="")
    elif event.type == "content_block_delta" and event.delta.type == "text_delta":
        print(event.delta.text, end="")
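
If you need the complete reasoning and answer as strings rather than printing deltas as they arrive, the same loop can accumulate them. The sketch below runs offline: it uses `SimpleNamespace` objects as stand-ins for stream events, with the event and delta type names taken from the loop above. `collect_stream` and `_delta` are hypothetical helpers, and the event contents are made up.

```python
from types import SimpleNamespace


def collect_stream(events) -> tuple[str, str]:
    """Accumulate thinking and text deltas from an iterable of stream events."""
    thinking_parts: list[str] = []
    text_parts: list[str] = []
    for event in events:
        if event.type != "content_block_delta":
            continue  # Skip start/stop and other bookkeeping events
        if event.delta.type == "thinking_delta":
            thinking_parts.append(event.delta.thinking)
        elif event.delta.type == "text_delta":
            text_parts.append(event.delta.text)
    return "".join(thinking_parts), "".join(text_parts)


def _delta(kind: str, **fields) -> SimpleNamespace:
    """Build a stand-in for a content_block_delta event (test data only)."""
    return SimpleNamespace(type="content_block_delta",
                           delta=SimpleNamespace(type=kind, **fields))


fake_events = [
    SimpleNamespace(type="message_start"),
    _delta("thinking_delta", thinking="Consider the clues. "),
    _delta("thinking_delta", thinking="It shrinks as it burns."),
    _delta("text_delta", text="A candle."),
    SimpleNamespace(type="message_stop"),
]

thinking, text = collect_stream(fake_events)
print(thinking)
print(text)
```

With a real stream, the same function can be called as `collect_stream(stream)`, assuming the events expose the attributes used in the loop above.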

Troubleshooting

  • 400 Error on Opus 4.7+: You're using type: "enabled" instead of type: "adaptive". Switch to adaptive thinking.
  • No thinking content returned: Check the display parameter. On Mythos, it defaults to "omitted".
  • High latency: Reduce the effort parameter or enable fast_mode for quicker responses.
  • Token limit exceeded: Increase max_tokens or reduce task_budget_tokens.

Conclusion

Extended thinking improves Claude's results on complex reasoning tasks. With adaptive thinking and the effort parameter, you can control how much reasoning Claude applies to each request. Whether you're building a code assistant, a research tool, or a decision support system, choosing the right thinking mode and effort level will help you balance answer quality against cost and latency.

Key Takeaways

  • Adaptive thinking (with the effort parameter) is the recommended approach for Claude Opus 4.7 and later models—manual thinking is no longer supported on these models.
  • Manual extended thinking (type: "enabled" with budget_tokens) is deprecated on Opus 4.6 and Sonnet 4.6 but still functional; migrate to adaptive thinking for future compatibility.
  • Effort values range from 0.0 (low reasoning) to 1.0 (maximum reasoning); start with 0.5–0.7 and adjust based on task complexity.
  • Task budgets and fast mode provide additional controls for cost management and latency optimization.
  • Model behavior varies—always check the documentation for your specific model, especially Claude Mythos Preview which has unique display defaults.