Guide | Beginner | API | 2026-05-14

Mastering Extended Thinking in Claude: A Guide to Adaptive Reasoning for Complex Tasks

Learn how to use Claude's extended thinking feature for enhanced reasoning. Covers adaptive thinking, effort parameters, budget tokens, and practical API examples for Opus and Sonnet models.

Quick Answer

This guide explains how to enable and configure Claude's extended thinking feature—including adaptive thinking with effort parameters and manual budget tokens—to get step-by-step reasoning for complex tasks, with code examples for the Messages API.

extended thinking, adaptive thinking, Claude API, reasoning, complex tasks

Claude's extended thinking feature unlocks a new level of reasoning capability, allowing the model to show its step-by-step thought process before delivering a final answer. Whether you're debugging intricate code, solving advanced mathematical proofs, or analyzing multi-layered business logic, extended thinking gives you both transparency and power.

This guide covers everything you need to know: how extended thinking works, the differences between adaptive and manual modes, which models support what, and practical API examples to get you started.

What Is Extended Thinking?

When extended thinking is enabled, Claude generates internal reasoning—visible as thinking content blocks in the API response—before producing its final text response. This allows you to:

Inspect Claude's reasoning chain for debugging or trust verification.
Improve accuracy on complex, multi-step problems.
Control reasoning depth via budget tokens or effort parameters.

The response format looks like this:

{
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me analyze this step by step...",
      "signature": "WaUjzkypQ2mUEVM36O2TxuC06KN8xyfbJwyem2dw3URve/op91XWHOEBLLqIOMfFG/UvLEczmEsUjavL...."
    },
    {
      "type": "text",
      "text": "Based on my analysis..."
    }
  ]
}

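Note: The thinking block includes a signature, an opaque value the API uses to confirm that the block was generated by Claude. Do not parse or modify it. When you send thinking blocks back in a later request (for example, during tool use), return them unchanged.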
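As a minimal sketch of consuming this format, the snippet below separates the reasoning from the final answer in a content list shaped like the JSON above. The helper name split_blocks is hypothetical, and it operates on plain dictionaries rather than SDK response objects.

```python
def split_blocks(content):
    """Return (thinking_text, answer_text) from a list of content-block dicts."""
    thinking_parts = []
    text_parts = []
    for block in content:
        if block.get("type") == "thinking":
            thinking_parts.append(block.get("thinking", ""))
        elif block.get("type") == "text":
            text_parts.append(block.get("text", ""))
        # Any other block type is ignored in this sketch.
    return "\n".join(thinking_parts), "\n".join(text_parts)


# Sample data mirroring the response format shown above.
sample_content = [
    {
        "type": "thinking",
        "thinking": "Let me analyze this step by step...",
        "signature": "...",
    },
    {"type": "text", "text": "Based on my analysis..."},
]

thinking, answer = split_blocks(sample_content)
print(answer)
```

Keeping the two streams separate makes it easy to log the reasoning for debugging while showing only the answer to end users.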

Adaptive Thinking vs. Manual Extended Thinking

Claude offers two modes for extended thinking:

Adaptive Thinking (Recommended)

Adaptive thinking (thinking: {type: "adaptive"}) lets Claude decide how much reasoning to use based on the complexity of the task. You can optionally control the effort level:

"low" – minimal reasoning, faster responses.
"medium" – balanced reasoning (default).
"high" – deep reasoning for very complex tasks.

This is the only supported mode on Claude Opus 4.7 and is recommended for Opus 4.6 and Sonnet 4.6 as well.

Manual Extended Thinking (Deprecated on Opus 4.7)

Manual mode (thinking: {type: "enabled", budget_tokens: N}) allows you to set a fixed token budget for reasoning. It is:

No longer supported on Claude Opus 4.7 (returns a 400 error).
Deprecated but functional on Opus 4.6 and Sonnet 4.6.
Still accepted on Claude Mythos Preview (though adaptive is the default).

Migration tip: If you're currently using manual mode on Opus 4.6 or Sonnet 4.6, plan to switch to adaptive thinking before the next model update.

Model-Specific Behavior

Model	Adaptive Thinking	Manual Thinking	Notes
Claude Opus 4.7	✅ Supported	❌ Returns 400 error	Use `effort` parameter to control depth
Claude Opus 4.6	✅ Recommended	✅ Deprecated but works	Migrate to adaptive soon
Claude Sonnet 4.6	✅ Recommended	✅ Deprecated (interleaved mode)	Migrate to adaptive soon
Claude Mythos Preview	✅ Default	✅ Accepted	`thinking: {type: "disabled"}` not supported; use `display: "summarized"` for summaries
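The table can be encoded as a small lookup so that a request never sends a manual budget to a model that rejects it. This is a sketch under stated assumptions: thinking_config and MANUAL_MODE_SUPPORT are hypothetical names, the model ID "claude-opus-4-6" is an assumed spelling, and Claude Mythos Preview is left out because its model ID is not given here.

```python
# Manual-mode support transcribed from the table above.
# "claude-opus-4-6" is an assumed model ID, not confirmed by this guide.
MANUAL_MODE_SUPPORT = {
    "claude-opus-4-7": False,   # manual mode returns a 400 error
    "claude-opus-4-6": True,    # deprecated but works
    "claude-sonnet-4-6": True,  # deprecated but works
}


def thinking_config(model, budget_tokens=None):
    """Build a `thinking` dict, using adaptive mode unless a budget is given."""
    if budget_tokens is None:
        return {"type": "adaptive"}
    # Unknown models are treated as not supporting manual mode.
    if not MANUAL_MODE_SUPPORT.get(model, False):
        raise ValueError(
            f"{model} does not accept a manual thinking budget; use adaptive thinking"
        )
    return {"type": "enabled", "budget_tokens": budget_tokens}


print(thinking_config("claude-opus-4-7"))
print(thinking_config("claude-sonnet-4-6", budget_tokens=10000))
```

Failing locally with a ValueError is cheaper than discovering the 400 error after a network round trip.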

How to Use Extended Thinking in the API

Basic Example with Adaptive Thinking (Opus 4.7)

Here's how to enable adaptive thinking with medium effort on Claude Opus 4.7:

import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    # Effort is passed in output_config rather than inside thinking
    output_config={"effort": "medium"},
    messages=[
        {
            "role": "user",
            "content": "Prove that there are infinitely many prime numbers congruent to 3 modulo 4."
        }
    ]
)
# Print thinking and final response
for block in response.content:
    if block.type == "thinking":
        print(f"\nThinking summary:\n{block.thinking}")
    elif block.type == "text":
        print(f"\nFinal answer:\n{block.text}")

Example with Manual Extended Thinking (Sonnet 4.6)

For models that still support manual mode, you can set a specific token budget:

import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=16000,
    thinking={
        "type": "enabled",
        "budget_tokens": 10000
    },
    messages=[
        {
            "role": "user",
            "content": "Design a distributed caching system that handles cache invalidation across 100 nodes."
        }
    ]
)
for block in response.content:
    if block.type == "thinking":
        print(f"\nThinking:\n{block.thinking}")
    elif block.type == "text":
        print(f"\nResponse:\n{block.text}")

Important: budget_tokens must be less than max_tokens. Reasoning tokens are billed as output tokens and count toward the max_tokens limit.

TypeScript Example

import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
async function main() {
  const response = await client.messages.create({
    model: 'claude-opus-4-7',
    max_tokens: 16000,
    thinking: {
      type: 'adaptive'
    },
    // Effort is passed in output_config rather than inside thinking
    output_config: {
      effort: 'high'
    },
    messages: [
      {
        role: 'user',
        content: 'Explain the P vs NP problem and its implications for cryptography.'
      }
    ]
  });
  for (const block of response.content) {
    if (block.type === 'thinking') {
      console.log('Thinking:', block.thinking);
    } else if (block.type === 'text') {
      console.log('Response:', block.text);
    }
  }
}
main();

Best Practices for Extended Thinking

1. Choose the Right Effort Level

Low effort: Use for straightforward tasks like summarization or simple Q&A where you want faster responses.
Medium effort: Default for most complex tasks like code generation, analysis, or planning.
High effort: Reserve for tasks requiring deep reasoning, such as mathematical proofs, complex debugging, or multi-step strategic planning.
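One way to apply these guidelines consistently is a lookup from task category to effort level. The sketch below is illustrative only: the category labels and the names EFFORT_BY_TASK and pick_effort are hypothetical, chosen to mirror the examples listed above.

```python
# Hypothetical task categories mapped to the effort levels suggested above.
EFFORT_BY_TASK = {
    "summarization": "low",
    "simple_qa": "low",
    "code_generation": "medium",
    "analysis": "medium",
    "planning": "medium",
    "math_proof": "high",
    "complex_debugging": "high",
    "strategic_planning": "high",
}


def pick_effort(task_kind, default="medium"):
    """Return the suggested effort for a task kind, falling back to a default."""
    return EFFORT_BY_TASK.get(task_kind, default)


print(pick_effort("math_proof"))
print(pick_effort("unlisted_task"))
```

Centralizing the choice in one table makes it simple to adjust levels later, after measuring latency and answer quality on your own workload.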

2. Set Appropriate `max_tokens`

Thinking tokens are drawn from the same max_tokens limit as the final answer. If you set budget_tokens to 10,000, set max_tokens to at least 10,000 plus the expected output length. As a rough starting point, use 1.5x your budget_tokens (15,000 in this case) and raise it if answers come back truncated.
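The arithmetic can be captured in two small helpers. This is a sketch, not part of the SDK: plan_max_tokens and check_budget are hypothetical names, and the 5,000-token answer estimate is an example value.

```python
def plan_max_tokens(budget_tokens, expected_output_tokens):
    """Return a max_tokens value that leaves room for thinking plus the answer."""
    if budget_tokens <= 0 or expected_output_tokens <= 0:
        raise ValueError("token counts must be positive")
    return budget_tokens + expected_output_tokens


def check_budget(budget_tokens, max_tokens):
    """Raise if the thinking budget does not fit under max_tokens."""
    if budget_tokens >= max_tokens:
        raise ValueError("budget_tokens must be less than max_tokens")


# A 10,000-token budget with an answer estimated at 5,000 tokens.
max_tokens = plan_max_tokens(10_000, 5_000)
check_budget(10_000, max_tokens)
print(max_tokens)  # 15000
```

Here the result matches the 1.5x rule of thumb; for longer answers, the explicit sum gives a safer limit than a fixed multiplier.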

3. Handle Thinking Blocks in Streaming

When streaming, thinking blocks appear as separate events. Process them accordingly:

import anthropic
client = anthropic.Anthropic()
with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    # Effort is passed in output_config rather than inside thinking
    output_config={"effort": "medium"},
    messages=[{"role": "user", "content": "Solve this: 2x + 5 = 13"}]
) as stream:
    for event in stream:
        if event.type == "content_block_delta" and event.delta.type == "thinking_delta":
            print(event.delta.thinking, end="")
        elif event.type == "content_block_delta" and event.delta.type == "text_delta":
            print(event.delta.text, end="")
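If you need the complete reasoning and answer after the stream ends, rather than printing as you go, the deltas can be accumulated into two buffers. The sketch below runs offline: collect_stream is a hypothetical helper, and the events are SimpleNamespace stand-ins that carry only the fields the loop above reads.

```python
from types import SimpleNamespace


def collect_stream(events):
    """Accumulate thinking and text deltas into two separate strings."""
    thinking_chunks = []
    text_chunks = []
    for event in events:
        if event.type != "content_block_delta":
            continue  # skip start/stop and other event types
        if event.delta.type == "thinking_delta":
            thinking_chunks.append(event.delta.thinking)
        elif event.delta.type == "text_delta":
            text_chunks.append(event.delta.text)
        # Other delta types (such as signature deltas) are ignored here.
    return "".join(thinking_chunks), "".join(text_chunks)


def delta_event(**fields):
    """Build a stand-in content_block_delta event for this demonstration."""
    return SimpleNamespace(type="content_block_delta", delta=SimpleNamespace(**fields))


fake_events = [
    SimpleNamespace(type="content_block_start"),
    delta_event(type="thinking_delta", thinking="2x = 8, "),
    delta_event(type="thinking_delta", thinking="so x = 4."),
    delta_event(type="signature_delta", signature="..."),
    delta_event(type="text_delta", text="x = 4"),
]

thinking, answer = collect_stream(fake_events)
print(thinking)
print(answer)
```

The same function should work on a live stream, since it only relies on the type and delta attributes used in the streaming loop.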

4. Use with Structured Outputs

Extended thinking can be combined with requests for structured output such as JSON. Claude reasons first, then emits the structured data in its text block:

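import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    # Effort is passed in output_config rather than inside thinking
    output_config={"effort": "high"},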
    messages=[
        {
            "role": "user",
            "content": "Analyze this dataset and return a JSON object with key findings."
        }
    ]
)
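Because the response contains thinking blocks as well as text, the JSON has to be read from the text blocks only. A minimal sketch, assuming the model returned bare JSON with no surrounding prose; parse_json_answer is a hypothetical helper, and the sample findings are made-up placeholder data.

```python
import json


def parse_json_answer(content):
    """Parse JSON from the text blocks of a response, skipping thinking blocks.

    Returns None if the text is not valid JSON.
    """
    text = "".join(
        block.get("text", "") for block in content if block.get("type") == "text"
    )
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


# Placeholder content blocks standing in for a real API response.
sample_content = [
    {"type": "thinking", "thinking": "The data shows two trends...", "signature": "..."},
    {"type": "text", "text": '{"findings": ["trend A", "trend B"]}'},
]

result = parse_json_answer(sample_content)
print(result["findings"])
```

Returning None on a parse failure lets the caller decide whether to retry the request or surface the raw text.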

5. Monitor Token Usage

Extended thinking increases token consumption, so monitor usage carefully, especially at high effort or with large budget_tokens values. The usage field in the response reports input_tokens and output_tokens; thinking tokens are billed as part of output_tokens.
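For a running total across several requests, the usage figures can be summed as below. This sketch takes plain dictionaries with input_tokens and output_tokens keys, mirroring the fields of the usage object. The helper name total_usage and the sample numbers are hypothetical.

```python
def total_usage(usages):
    """Sum input and output tokens across a list of usage dicts."""
    totals = {"input_tokens": 0, "output_tokens": 0}
    for usage in usages:
        totals["input_tokens"] += usage.get("input_tokens", 0)
        totals["output_tokens"] += usage.get("output_tokens", 0)
    return totals


# Made-up usage figures from three requests.
history = [
    {"input_tokens": 120, "output_tokens": 3400},
    {"input_tokens": 95, "output_tokens": 8100},
    {"input_tokens": 210, "output_tokens": 1500},
]

print(total_usage(history))
```

A sudden rise in output_tokens after raising the effort level is a useful signal that reasoning depth, rather than answer length, is driving cost.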

Common Pitfalls to Avoid

Using manual mode on Opus 4.7: This will return a 400 error. Always use adaptive thinking.
Setting budget_tokens equal to or higher than max_tokens: This causes an error. Ensure max_tokens > budget_tokens.
Modifying the signature field: Treat the signature as opaque. If you send thinking blocks back in a follow-up request, pass them through unchanged.
Expecting thinking blocks in all responses: Some models (like Mythos Preview) may omit thinking content by default; use display: "summarized" to get summaries.

Key Takeaways

Adaptive thinking is the future: Claude Opus 4.7 requires adaptive thinking (type: "adaptive" with effort parameter). Manual mode is deprecated on older models and will be removed.
Control reasoning depth with effort: Use low, medium, or high effort to balance speed and depth.
Thinking blocks are transparent: The API returns reasoning as separate thinking content blocks with signatures for verification.
Token budgeting matters: Extended thinking consumes tokens from your max_tokens budget—plan accordingly.
Model behavior varies: Always check which mode your model supports. Claude Mythos Preview has unique defaults and display options.

By mastering extended thinking, you can tackle the most demanding tasks with Claude—from advanced mathematics to complex system design—with full visibility into the model's reasoning process.
