BeClaude
Guide · Beginner · API · 2026-05-22

Mastering Extended Thinking in Claude: A Complete Guide to Adaptive Reasoning

Learn how to use Claude's extended thinking feature for complex reasoning tasks. Covers adaptive thinking, effort parameters, budget tokens, and practical API examples.

Quick Answer

This guide explains how to enable and optimize Claude's extended thinking capability, including adaptive thinking with the effort parameter, manual budget tokens, and best practices for complex reasoning tasks.

Tags: extended thinking, adaptive thinking, Claude API, reasoning, budget tokens

Claude's extended thinking feature unlocks enhanced reasoning capabilities for complex tasks, giving the model room to "think through" problems step-by-step before delivering a final answer. Whether you're building a research assistant, a code analysis tool, or a multi-step reasoning agent, understanding how to configure and use extended thinking is essential.

This guide covers everything from the basics of enabling extended thinking to advanced configuration with adaptive thinking and effort parameters. You'll learn practical API patterns, model-specific behavior, and best practices to get the most out of Claude's reasoning abilities.

What Is Extended Thinking?

Extended thinking allows Claude to generate internal reasoning content blocks before producing its final response. These thinking blocks contain the model's step-by-step analysis, which it then uses to craft a more accurate and well-reasoned answer.

When extended thinking is enabled, the API response includes one or more thinking content blocks followed by text content blocks. Each thinking block carries Claude's internal reasoning plus a signature, an opaque value the API uses to verify that the block was generated by Claude when you send it back in a later request.

Here's what a typical response looks like:

```json
{
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me analyze this step by step...",
      "signature": "WaUjzkypQ2mUEVM36O2TxuC06KN8xyfbJwyem2dw3URve/op91XWHOEBLLqIOMfFG/UvLEczmEsUjavL...."
    },
    {
      "type": "text",
      "text": "Based on my analysis..."
    }
  ]
}
```
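
As a minimal sketch of consuming that shape, the helper below separates the reasoning from the answer text. It works on plain dictionaries that mirror the JSON above, so it runs without an API key; `split_blocks` is an illustrative name, not part of the SDK.

```python
def split_blocks(content):
    """Separate thinking text from answer text in a response's content list."""
    thinking, answer = [], []
    for block in content:
        if block.get("type") == "thinking":
            thinking.append(block.get("thinking", ""))
        elif block.get("type") == "text":
            answer.append(block.get("text", ""))
    # Thinking blocks are joined with newlines; text blocks are concatenated as-is.
    return "\n".join(thinking), "".join(answer)


example = [
    {"type": "thinking", "thinking": "Let me analyze this step by step...", "signature": "..."},
    {"type": "text", "text": "Based on my analysis..."},
]
reasoning, final = split_blocks(example)
print(final)  # Based on my analysis...
```

Unknown block types are skipped rather than rejected, which keeps the helper tolerant of block types it does not know about.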

Adaptive Thinking vs. Manual Extended Thinking

Claude offers two modes for extended thinking: adaptive thinking (recommended) and manual extended thinking (deprecated on most models).

Adaptive Thinking (Recommended)

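Adaptive thinking lets Claude decide how much reasoning to apply based on the complexity of each request. You steer the reasoning depth with the effort parameter, which takes a named level (for example low, medium, or high) rather than a number or a token count. In the Messages API, effort is set in output_config next to thinking: {type: "adaptive"}, not inside the thinking object.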

Key benefits:
  • Automatically adjusts reasoning depth per request
  • No need to guess a token budget
  • Supported on Claude Opus 4.7, Opus 4.6, and Sonnet 4.6
  • Future-proof: manual mode is being phased out

Manual Extended Thinking (Deprecated)

Manual extended thinking requires you to specify a budget_tokens value, which caps the number of tokens Claude can spend on reasoning. Claude Opus 4.7 no longer accepts this mode (the request returns a 400 error), and it is deprecated on the other models, where it will be removed in future releases.
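
A manual budget can be sanity-checked before a request is sent. The sketch below assumes two rules from Anthropic's extended thinking documentation that this guide does not spell out: a minimum budget of 1,024 tokens, and budget_tokens being smaller than max_tokens. Treat both as assumptions to confirm against the current API reference; `check_budget` is an illustrative name.

```python
MIN_BUDGET_TOKENS = 1024  # assumed minimum; confirm against the current API docs


def check_budget(budget_tokens, max_tokens):
    """Return a list of problems with a manual thinking budget (empty if none)."""
    problems = []
    if budget_tokens < MIN_BUDGET_TOKENS:
        problems.append(f"budget_tokens must be at least {MIN_BUDGET_TOKENS}")
    if budget_tokens >= max_tokens:
        problems.append("budget_tokens must be smaller than max_tokens")
    return problems


print(check_budget(16000, 8192))
```

Returning a list instead of raising lets a caller report every problem at once.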

Model-Specific Behavior

Different Claude models handle extended thinking differently. Here's what you need to know:

| Model | Adaptive Thinking | Manual Thinking | Notes |
|---|---|---|---|
| Claude Opus 4.7 | ✅ Required | ❌ Returns 400 error | Use thinking: {type: "adaptive"} with effort |
| Claude Opus 4.6 | ✅ Recommended | ✅ Deprecated but functional | Migrate to adaptive thinking |
| Claude Sonnet 4.6 | ✅ Recommended | ✅ Deprecated, interleaved mode | Migrate to adaptive thinking |
| Claude Mythos Preview | ✅ Default | ✅ Accepted | thinking: {type: "disabled"} not supported; display defaults to "omitted" |
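
The table can be turned into a small guard that picks a thinking configuration per model. This is a sketch only: `thinking_config` is a hypothetical helper, and of the model IDs used, "claude-opus-4-7" and "claude-sonnet-4-6" come from this guide's examples while "claude-opus-4-6" is assumed by analogy.

```python
import warnings

# Model IDs are assumptions for illustration; check the IDs available to you.
ADAPTIVE_ONLY = {"claude-opus-4-7"}  # manual mode returns a 400 error
MANUAL_DEPRECATED = {"claude-opus-4-6", "claude-sonnet-4-6"}


def thinking_config(model, budget_tokens=None):
    """Return a `thinking` dict for the model, following the table above."""
    if budget_tokens is None:
        # No budget requested: adaptive thinking is the recommended mode.
        return {"type": "adaptive"}
    if model in ADAPTIVE_ONLY:
        raise ValueError(f"{model} rejects manual thinking; use adaptive thinking")
    if model in MANUAL_DEPRECATED:
        warnings.warn(f"manual thinking is deprecated on {model}", DeprecationWarning)
    return {"type": "enabled", "budget_tokens": budget_tokens}


print(thinking_config("claude-opus-4-7"))
```

Failing early in client code gives a clearer message than waiting for the API's 400 response.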

How to Enable Extended Thinking

Using Adaptive Thinking (Python)

```python
import anthropic

client = anthropic.Anthropic()

# Code under review, embedded in the prompt below
snippet = (
    "def authenticate(user, password):\n"
    "    if user == 'admin' and password == 'secret123':\n"
    "        return True\n"
    "    return False\n"
)

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={"type": "adaptive"},
    output_config={"effort": "high"},  # effort is a named level, set outside `thinking`
    messages=[
        {
            "role": "user",
            "content": (
                "Analyze the following code for potential security "
                "vulnerabilities and suggest fixes:\n\n" + snippet
            ),
        }
    ],
)

# Access thinking and text content
for block in response.content:
    if block.type == "thinking":
        print("Thinking:", block.thinking[:200] + "...")
    elif block.type == "text":
        print("Response:", block.text)
```

Using Manual Extended Thinking (Python - Legacy)

```python
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=20000,  # must be larger than budget_tokens
    thinking={
        "type": "enabled",
        "budget_tokens": 16000,  # Max tokens for reasoning
    },
    messages=[
        {
            "role": "user",
            "content": (
                "Explain the P vs NP problem in computer science, "
                "including its implications for cryptography."
            ),
        }
    ],
)
```

Using Adaptive Thinking (TypeScript)

```typescript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const response = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 4096,
  thinking: { type: 'adaptive' },
  output_config: { effort: 'high' }, // effort is a named level, set outside `thinking`
  messages: [
    {
      role: 'user',
      content:
        'Design a distributed caching system that handles cache invalidation across multiple regions. Consider consistency, latency, and failure modes.',
    },
  ],
});

// Print each content block according to its type
for (const block of response.content) {
  if (block.type === 'thinking') {
    console.log('Thinking:', block.thinking);
  } else if (block.type === 'text') {
    console.log('Response:', block.text);
  }
}
```

Understanding the Effort Parameter

The effort parameter gives you coarse-grained control over reasoning depth when adaptive thinking is on. It takes a named level rather than a number:

  • low: Minimal reasoning – Claude skips most internal deliberation and produces a quick response. Best for simple, factual queries.
  • medium: Moderate reasoning – A good balance for everyday complex tasks like code review or data analysis.
  • high: Deep reasoning – Suited to research, mathematical proofs, or multi-step problem solving.
  • max: Maximum reasoning, on models that support it – Use for the most challenging problems where accuracy is critical and latency is acceptable.

Pro tip: Start with effort: "medium" and increase if you need deeper analysis. Higher effort levels increase response time and token usage.
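
The "start moderate, then increase" advice can be sketched as a small escalation loop. Everything here is illustrative: `escalate`, `ask`, and `is_good_enough` are hypothetical names, the effort values are placeholders for whatever settings your model accepts, and `fake_ask` stands in for a real API call so the example runs offline.

```python
def escalate(ask, is_good_enough, levels):
    """Try each effort setting in order and return the first acceptable answer.

    ask(level)             -> answer produced at that effort setting
    is_good_enough(answer) -> True when the answer passes your own check
    levels                 -> ordered effort settings, lowest first
    """
    if not levels:
        raise ValueError("levels must not be empty")
    answer = None
    for level in levels:
        answer = ask(level)
        if is_good_enough(answer):
            return level, answer
    # Nothing passed the check; return the attempt made at the highest setting.
    return levels[-1], answer


def fake_ask(level):
    # Stand-in for a request sent at the given effort setting.
    return "detailed proof" if level == "high" else "short answer"


level, answer = escalate(fake_ask, lambda a: "proof" in a, ["medium", "high"])
print(level, answer)  # high detailed proof
```

Each retry is a separate billed request, so this pattern makes sense only when a cheap acceptance check is available.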

Best Practices for Extended Thinking

1. Set Appropriate Max Tokens

Thinking tokens count against max_tokens, so max_tokens must be large enough to hold both the reasoning and the final response. A good rule of thumb:

max_tokens = thinking tokens (budget_tokens in manual mode, or observed thinking usage in adaptive mode) + expected response tokens

For complex tasks, start with max_tokens: 8192 and adjust based on observed usage.
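
The rule of thumb is simple arithmetic, sketched below. `plan_max_tokens` and its optional `ceiling` argument (a stand-in for a model's output limit) are hypothetical names introduced for illustration.

```python
def plan_max_tokens(thinking_tokens, response_tokens, ceiling=None):
    """Return a max_tokens value that covers both reasoning and the final answer."""
    if thinking_tokens < 0 or response_tokens <= 0:
        raise ValueError("token counts must be positive")
    total = thinking_tokens + response_tokens
    if ceiling is not None and total > ceiling:
        raise ValueError(f"need {total} tokens but the ceiling is {ceiling}")
    return total


# 6000 tokens of reasoning plus a 2192-token answer
print(plan_max_tokens(6000, 2192))  # 8192
```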

2. Handle Thinking Blocks in Streaming

When streaming responses, thinking blocks appear before text blocks. Process them accordingly:

```python
import anthropic

client = anthropic.Anthropic()

thinking_chunks = []

with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={"type": "adaptive"},
    output_config={"effort": "high"},
    messages=[{"role": "user", "content": "Solve this complex math problem..."}],
) as stream:
    for event in stream:
        if event.type == "content_block_delta" and event.delta.type == "thinking_delta":
            # Accumulate thinking content
            thinking_chunks.append(event.delta.thinking)
        elif event.type == "content_block_delta" and event.delta.type == "text_delta":
            # Output final response
            print(event.delta.text, end="")
```
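
The event-handling logic can be tested without a network call by feeding it stand-in events. In the sketch below, `collect_stream` is a hypothetical helper and the `SimpleNamespace` objects only imitate the attribute layout used in the streaming loop (`event.type`, `event.delta.type`, `event.delta.thinking`, `event.delta.text`); they are not SDK classes.

```python
from types import SimpleNamespace


def collect_stream(events):
    """Accumulate thinking and text deltas from a sequence of stream events."""
    thinking_parts, text_parts = [], []
    for event in events:
        if event.type != "content_block_delta":
            continue  # ignore message_start, content_block_stop, and so on
        if event.delta.type == "thinking_delta":
            thinking_parts.append(event.delta.thinking)
        elif event.delta.type == "text_delta":
            text_parts.append(event.delta.text)
    return "".join(thinking_parts), "".join(text_parts)


def _delta(kind, **fields):
    # Build a fake content_block_delta event for offline testing.
    return SimpleNamespace(
        type="content_block_delta",
        delta=SimpleNamespace(type=kind, **fields),
    )


fake_events = [
    SimpleNamespace(type="message_start"),
    _delta("thinking_delta", thinking="2 + 2 "),
    _delta("thinking_delta", thinking="is 4."),
    _delta("text_delta", text="The answer "),
    _delta("text_delta", text="is 4."),
]
thinking, text = collect_stream(fake_events)
print(text)  # The answer is 4.
```

Keeping the accumulation in a plain function makes it easy to unit-test separately from the network code.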

3. Use for Multi-Step Reasoning Tasks

Extended thinking shines in tasks that require:

  • Complex mathematical calculations
  • Multi-step code analysis and debugging
  • Research synthesis from multiple sources
  • Strategic planning and decision trees
  • Legal or regulatory analysis

4. Monitor Token Usage

Extended thinking can significantly increase token consumption. Monitor your usage and adjust effort/budget accordingly. For production systems, consider:

  • Setting lower effort values for simple queries
  • Caching responses for repeated questions
  • Using prompt caching to reduce costs on long context windows
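
Caching responses for repeated questions can be as simple as an in-memory dictionary keyed by the request parameters. The class below is a sketch under that assumption; `ResponseCache` and `get_or_call` are hypothetical names, and a production system would likely want expiry and a shared store.

```python
import hashlib
import json


class ResponseCache:
    """In-memory cache keyed by model, thinking config, and prompt."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model, thinking, prompt):
        # sort_keys makes the key independent of dict ordering.
        payload = json.dumps(
            {"model": model, "thinking": thinking, "prompt": prompt},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_or_call(self, model, thinking, prompt, call):
        """Return the cached result, or run call() once and store what it returns."""
        key = self._key(model, thinking, prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = call()
        self._store[key] = result
        return result
```

Including the thinking configuration in the key keeps answers produced at different reasoning depths from being mixed up.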

Common Pitfalls to Avoid

  • Setting budget_tokens too low: If the budget is too small, Claude may cut off reasoning prematurely, leading to incomplete or inaccurate responses.
  • Forgetting to update max_tokens: Extended thinking tokens count toward your max_tokens limit. If max_tokens is too low, the response may be truncated.
  • Using manual mode on Opus 4.7: This returns a 400 error. Always use adaptive thinking with the effort parameter.
  • Ignoring the signature: The signature field lets the API verify that a thinking block was produced by Claude. When you send thinking blocks back in a later request, pass them unmodified; do not edit them or drop the signature.
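
Truncation from a low max_tokens can be detected by inspecting the response. The sketch below works on a response represented as a plain dictionary and relies on the Messages API reporting stop_reason as "max_tokens" when the limit is hit; `check_response` and its return labels are illustrative.

```python
def check_response(response):
    """Classify a response dict as not truncated, partially answered, or cut off."""
    content = response.get("content", [])
    has_text = any(block.get("type") == "text" for block in content)
    if response.get("stop_reason") != "max_tokens":
        return "not_truncated"
    # The limit was hit: either mid-answer, or before any answer text appeared.
    return "partial_answer" if has_text else "no_answer"


cut_off = {
    "stop_reason": "max_tokens",
    "content": [{"type": "thinking", "thinking": "Step 1...", "signature": "..."}],
}
print(check_response(cut_off))  # no_answer
```

A "no_answer" result means the reasoning used up the whole limit, which is the signal to raise max_tokens or lower the effort.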

Key Takeaways

  • Adaptive thinking (thinking: {type: "adaptive"}) is the recommended approach for all current Claude models, with the effort parameter (a named level such as low, medium, or high) controlling reasoning depth.
  • Manual extended thinking (thinking: {type: "enabled", budget_tokens: N}) is rejected with a 400 error on Claude Opus 4.7 and is deprecated on other models, where it will be removed in future releases.
  • Model behavior varies: Claude Opus 4.7 requires adaptive thinking, while Claude Mythos Preview defaults to it. Always check the model's documentation.
  • Set max_tokens generously to accommodate both thinking and response tokens. A good starting point is 8192 tokens for complex tasks.
  • Stream thinking blocks carefully – they arrive before text blocks and require different handling in streaming mode.

Extended thinking is a powerful tool for unlocking Claude's full reasoning potential. By understanding the configuration options and best practices outlined in this guide, you can build applications that leverage deep, transparent reasoning for even the most challenging problems.