BeClaude
Guide | Beginner | Pricing | 2026-05-21

Mastering Claude's Effort Parameter: Control Thinking Depth and Token Spend

Learn how to use Claude's effort parameter to balance response thoroughness, speed, and cost. Includes code examples, effort levels, and best practices for Opus and Sonnet models.

Quick Answer

This guide explains Claude's effort parameter, which lets you control how eagerly Claude spends tokens on responses. You'll learn the five effort levels (low, medium, high, xhigh, max), how to set them in API calls, and when to use each for optimal balance of performance, speed, and cost.

effort parameter, token optimization, extended thinking, Claude API, cost management

Claude's effort parameter gives you fine-grained control over how many tokens your model spends on each response. Whether you're building a high-volume chat application, a complex agentic system, or a cost-sensitive tool, understanding effort is key to getting the best performance-to-cost ratio.

In this guide, you'll learn:

  • What the effort parameter is and how it works
  • The five effort levels and when to use each
  • How effort interacts with extended thinking
  • Practical code examples for Python and TypeScript
  • Best practices for different use cases

What Is the Effort Parameter?

The effort parameter lets you control how "eager" Claude is about spending tokens when responding to requests. By default, Claude uses high effort, spending as many tokens as needed for excellent results. You can raise the effort to max for the absolute highest capability, or lower it to low for faster, cheaper responses.

Key advantages:

  • Works without extended thinking — effort affects all tokens, including text responses and tool calls
  • Controls tool call frequency — lower effort means fewer tool calls, saving tokens
  • Single-model flexibility — you can trade off between thoroughness and efficiency without switching models

Supported Models

The effort parameter is available on:

  • Claude Mythos Preview
  • Claude Opus 4.7
  • Claude Opus 4.6
  • Claude Sonnet 4.6
  • Claude Opus 4.5

For Opus 4.6 and Sonnet 4.6, effort replaces the deprecated budget_tokens parameter.
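
As a rough sketch of what that migration could look like, the helper below rewrites a legacy request that used a fixed thinking budget. It assumes the legacy shape thinking: {type: "enabled", budget_tokens: N}, and it assumes effort is sent in the request body as output_config.effort. The function name, the model ID, and the sample prompt are illustrative only, so check them against the current API reference before relying on them.

```python
from copy import deepcopy

VALID_EFFORT = {"low", "medium", "high", "xhigh", "max"}


def migrate_to_effort(params: dict, effort: str) -> dict:
    """Return a copy of `params` that uses adaptive thinking plus an effort
    level instead of a fixed thinking budget. (Hypothetical helper.)"""
    if effort not in VALID_EFFORT:
        raise ValueError(f"unknown effort level: {effort!r}")
    migrated = deepcopy(params)
    thinking = migrated.get("thinking") or {}
    if "budget_tokens" in thinking:
        # Drop the fixed budget and let the model size its own thinking.
        migrated["thinking"] = {"type": "adaptive"}
    # Assumption: effort travels in the request body as output_config.effort.
    migrated.setdefault("output_config", {})["effort"] = effort
    return migrated


legacy = {
    "model": "claude-sonnet-4-6",  # assumed model ID
    "max_tokens": 4096,
    "thinking": {"type": "enabled", "budget_tokens": 2048},
    "messages": [{"role": "user", "content": "Summarize this changelog."}],
}
migrated = migrate_to_effort(legacy, "medium")
```

The caller picks the effort level explicitly; the sketch deliberately avoids guessing a level from the old budget, since the two are not directly comparable.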

Effort Levels Explained

| Level | Description | Typical Use Case |
|---|---|---|
| low | Most efficient. Significant token savings with some capability reduction. | Simple tasks, high-volume chat, subagents |
| medium | Balanced approach with moderate token savings. | Agentic tasks needing speed/cost balance |
| high (default) | High capability. Equivalent to omitting the parameter. | Complex reasoning, coding, agentic tasks |
| xhigh | Extended capability for long-horizon work. Available on Opus 4.7. | Long-running agentic/coding tasks (>30 min) |
| max | Absolute maximum capability with no constraints. | Deepest reasoning, most thorough analysis |

Important: Effort is a behavioral signal, not a strict token budget. At lower levels, Claude will still think on sufficiently difficult problems, but it will think less than it would at higher levels for the same problem.
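
One way to keep the level names and the xhigh restriction in a single place is a small lookup like the one below. It is only a sketch: it encodes just what the table states (xhigh on Opus 4.7), uses display names rather than API model IDs, and the fall-back-to-the-next-lower-level policy is an assumption of this example, not documented API behavior.

```python
# Levels in ascending order of token spend, as listed in the table.
EFFORT_ORDER = ["low", "medium", "high", "xhigh", "max"]

# Levels that only some models accept, per the table.
# Keys are effort levels; values are model display names (not API IDs).
RESTRICTED_LEVELS = {"xhigh": {"Claude Opus 4.7"}}


def _accepts(level: str, model_name: str) -> bool:
    allowed = RESTRICTED_LEVELS.get(level)
    return allowed is None or model_name in allowed


def resolve_effort(requested: str, model_name: str) -> str:
    """Return `requested` if the model accepts it, otherwise the next lower
    level that it does accept. (Hypothetical fallback policy.)"""
    if requested not in EFFORT_ORDER:
        raise ValueError(f"unknown effort level: {requested!r}")
    if _accepts(requested, model_name):
        return requested
    index = EFFORT_ORDER.index(requested)
    for level in reversed(EFFORT_ORDER[:index]):
        if _accepts(level, model_name):
            return level
    raise ValueError(f"no usable effort level for {model_name!r}")
```

With this policy, a request for xhigh on Sonnet 4.6 resolves to high, while the same request on Opus 4.7 is passed through unchanged.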

How Effort Works with Extended Thinking

When you combine effort with adaptive thinking (thinking: {type: "adaptive"}), Claude automatically adjusts its thinking depth based on the problem complexity. This is the recommended configuration for most use cases.

At high (default) and max effort, Claude will almost always think. At lower levels, it may skip thinking for simpler problems, saving tokens and reducing latency.
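
To see how often thinking is actually skipped at a given level, you can count responses that contain a thinking block. The sketch below works on plain dictionaries shaped like the content array of a Messages API response; the helper names and the sample data are made up for illustration.

```python
THINKING_TYPES = {"thinking", "redacted_thinking"}


def used_thinking(content_blocks: list) -> bool:
    """True if any content block in one response is a thinking block."""
    return any(block.get("type") in THINKING_TYPES for block in content_blocks)


def thinking_rate(responses: list) -> float:
    """Fraction of responses (each a list of content blocks) that used thinking."""
    if not responses:
        return 0.0
    return sum(used_thinking(blocks) for blocks in responses) / len(responses)


# Hypothetical content arrays from four responses at the same effort level.
sample = [
    [{"type": "text", "text": "Paris."}],
    [{"type": "thinking", "thinking": "..."}, {"type": "text", "text": "42"}],
    [{"type": "text", "text": "Yes."}],
    [{"type": "thinking", "thinking": "..."}, {"type": "text", "text": "Done."}],
]
rate = thinking_rate(sample)
```

Comparing this rate across effort levels on your own traffic gives a concrete picture of how much latency the lower levels are saving.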

Code Examples

Python (using the Anthropic SDK)

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Effort is passed in the request body via output_config.
# Model IDs below are for Sonnet 4.6 and Opus 4.6; verify both against
# the API reference for your SDK version.

# Low effort: fast and cheap, for simple tasks
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system="You are a helpful assistant.",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    output_config={"effort": "low"},
)

# Medium effort: balanced, for agentic tasks
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=4096,
    system="You are a coding assistant.",
    messages=[{"role": "user", "content": "Write a Python function to sort a list of dictionaries by a key."}],
    output_config={"effort": "medium"},
)

# High effort (default): complex reasoning
# Omitting output_config.effort is equivalent to "high".
response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=8192,
    messages=[{"role": "user", "content": "Explain the implications of quantum computing on cryptography."}],
)

# Max effort: deepest reasoning
response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=16384,
    messages=[{"role": "user", "content": "Prove the Riemann Hypothesis."}],
    output_config={"effort": "max"},
)

TypeScript (using the Anthropic SDK)

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

// Effort is passed in the request body via output_config.
// Verify the model IDs against the API reference for your SDK version.

// Low effort
const response = await client.messages.create({
  model: 'claude-sonnet-4-6',
  max_tokens: 1024,
  system: 'You are a helpful assistant.',
  messages: [{ role: 'user', content: 'What is the capital of France?' }],
  output_config: { effort: 'low' },
});

// Medium effort with adaptive thinking
const response2 = await client.messages.create({
  model: 'claude-sonnet-4-6',
  max_tokens: 4096,
  thinking: { type: 'adaptive' },
  messages: [{ role: 'user', content: 'Debug this code: ...' }],
  output_config: { effort: 'medium' },
});

// Max effort for complex analysis
const response3 = await client.messages.create({
  model: 'claude-opus-4-6',
  max_tokens: 16384,
  thinking: { type: 'adaptive' },
  messages: [{ role: 'user', content: 'Analyze this legal contract...' }],
  output_config: { effort: 'max' },
});

Recommended Effort Levels for Sonnet 4.6

Sonnet 4.6 defaults to high effort. To avoid unexpected latency and cost, explicitly set effort when using this model:

  • Medium effort (recommended default): Best balance for most applications. Suitable for agentic coding, tool-heavy workflows, and code generation.
  • Low effort: For high-volume or latency-sensitive workloads. Suitable for chat and non-coding use cases where speed matters more than depth.
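
A small wrapper can make sure no Sonnet 4.6 request goes out without an explicit effort level. The sketch below is one possible shape for that: the helper name is hypothetical, and it assumes effort is carried in the request body as output_config.effort.

```python
def with_explicit_effort(params: dict, default: str = "medium") -> dict:
    """Return a copy of `params` that always carries an effort level.

    An effort level already set by the caller is kept; otherwise `default`
    is filled in. (Hypothetical helper; the input is not mutated.)
    """
    result = dict(params)
    output_config = dict(result.get("output_config") or {})
    output_config.setdefault("effort", default)
    result["output_config"] = output_config
    return result


base = {
    "model": "claude-sonnet-4-6",  # assumed model ID
    "max_tokens": 2048,
    "messages": [{"role": "user", "content": "Refactor this function."}],
}
request = with_explicit_effort(base)
chat_request = with_explicit_effort({**base, "output_config": {"effort": "low"}})
```

The resulting dictionary can be unpacked into the SDK call, for example client.messages.create(**request).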

Best Practices

1. Start with medium, then adjust

For new applications, begin with medium effort. Monitor response quality and token usage, then adjust up or down based on your specific needs.
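
The "adjust" step can be made mechanical once you have measurements per level. The sketch below picks the cheapest level that still clears a quality bar; the LevelStats fields, the pass-rate metric, and the numbers are all hypothetical placeholders for whatever your own evaluation produces.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LevelStats:
    effort: str
    mean_output_tokens: float
    pass_rate: float  # fraction of eval cases judged acceptable, 0.0 to 1.0


def cheapest_acceptable(stats: List[LevelStats], min_pass_rate: float) -> Optional[str]:
    """Return the effort level with the lowest mean token use among those
    meeting `min_pass_rate`, or None if no level qualifies."""
    acceptable = [s for s in stats if s.pass_rate >= min_pass_rate]
    if not acceptable:
        return None
    return min(acceptable, key=lambda s: s.mean_output_tokens).effort


# Hypothetical measurements, not real benchmark results.
measured = [
    LevelStats("low", mean_output_tokens=310, pass_rate=0.82),
    LevelStats("medium", mean_output_tokens=540, pass_rate=0.93),
    LevelStats("high", mean_output_tokens=900, pass_rate=0.95),
]
choice = cheapest_acceptable(measured, min_pass_rate=0.90)
```

With these placeholder numbers the function settles on medium, because low misses the 0.90 bar and high costs more tokens for a small quality gain.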

2. Use adaptive thinking with effort

Combine effort with thinking: {type: "adaptive"} for the best experience. This lets Claude decide when to think deeply and when to respond quickly, saving tokens on simple queries.

3. Match effort to task complexity

  • Simple Q&A, classification, extraction: low
  • Multi-step agents, code generation: medium
  • Complex reasoning, analysis: high
  • Research-grade problems, deep analysis: max
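
The mapping above translates directly into a lookup table. In this sketch the task-type keys are invented labels for illustration; only the effort values come from the list.

```python
# Task-type labels are hypothetical; effort values follow the list above.
EFFORT_BY_TASK = {
    "qa": "low",
    "classification": "low",
    "extraction": "low",
    "agent": "medium",
    "code_generation": "medium",
    "reasoning": "high",
    "analysis": "high",
    "research": "max",
    "deep_analysis": "max",
}


def effort_for(task_type: str, default: str = "medium") -> str:
    """Look up the suggested effort level, falling back to `default`
    for task types that are not in the table."""
    return EFFORT_BY_TASK.get(task_type, default)
```

Falling back to medium for unknown task types matches the "start with medium" advice; a stricter application might raise an error instead.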

4. Consider cost implications

Lower effort levels can significantly reduce token spend, especially on tool calls. For high-volume applications, even a 20% reduction in tokens per call can lead to substantial savings.
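
To put a number on that, here is a back-of-the-envelope calculation. The call volume, tokens per call, and price are placeholders chosen for easy arithmetic, not real pricing; substitute your own figures.

```python
def monthly_output_cost(calls_per_month: int,
                        output_tokens_per_call: float,
                        usd_per_million_output_tokens: float) -> float:
    """Output-token cost per month in USD."""
    total_tokens = calls_per_month * output_tokens_per_call
    return total_tokens * usd_per_million_output_tokens / 1_000_000


# Placeholder inputs (hypothetical, not actual prices or traffic).
CALLS = 1_000_000
TOKENS_PER_CALL = 800
PRICE = 10.0  # USD per million output tokens, assumed for illustration

baseline = monthly_output_cost(CALLS, TOKENS_PER_CALL, PRICE)
reduced = monthly_output_cost(CALLS, TOKENS_PER_CALL * 0.80, PRICE)  # 20% fewer tokens
savings = baseline - reduced
```

With these inputs the baseline is about 8,000 USD per month and a 20% token reduction saves about 1,600 USD per month; the saving scales linearly with call volume.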

5. Test with representative workloads

Effort affects behavior differently depending on the problem. Always test with your actual use case to find the optimal level.
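
A minimal harness for such a test might look like the following. It takes the model call as an injected function so the same loop works against the real API or a stub; the function names and the result layout are assumptions of this sketch, and the only response field it reads is usage.output_tokens.

```python
import time
from typing import Callable, Dict, List


def sweep_effort(prompts: List[str],
                 levels: List[str],
                 call_model: Callable[[str, str], dict]) -> Dict[str, dict]:
    """Run every prompt at every effort level and report mean output tokens
    and mean wall-clock seconds per level."""
    if not prompts:
        raise ValueError("need at least one prompt")
    results = {}
    for level in levels:
        total_tokens = 0
        total_seconds = 0.0
        for prompt in prompts:
            start = time.perf_counter()
            response = call_model(prompt, level)
            total_seconds += time.perf_counter() - start
            total_tokens += response["usage"]["output_tokens"]
        results[level] = {
            "mean_output_tokens": total_tokens / len(prompts),
            "mean_seconds": total_seconds / len(prompts),
        }
    return results


def fake_call_model(prompt: str, effort: str) -> dict:
    """Stub standing in for a real API call: pretends that higher effort
    produces proportionally more output tokens."""
    scale = {"low": 1, "medium": 2, "high": 3}[effort]
    return {"usage": {"output_tokens": scale * len(prompt.split())}}


report = sweep_effort(["a b", "c d e f"], ["low", "high"], fake_call_model)
```

In real use, replace fake_call_model with a function that sends the prompt at the given effort level and returns the response as a dictionary, and pair the token counts with a quality check of your own.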

Common Pitfalls

  • Assuming low effort means no thinking: Claude will still think on difficult problems, just less deeply.
  • Forgetting to set effort on Sonnet 4.6: Defaults to high, which may be more expensive than needed.
  • Using effort without adaptive thinking: Effort works without thinking, but combining the two lets Claude skip thinking on simple requests and think deeply on hard ones.
  • Expecting strict token budgets: Effort is a behavioral signal, not a hard limit.
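
Because effort is not a hard limit, max_tokens remains the only hard cap on output, and a response can still be cut off at any effort level. The sketch below checks for that case using the stop_reason field of a response; the helper names and the doubling policy are illustrative assumptions.

```python
def hit_token_cap(response: dict) -> bool:
    """True if generation stopped because it reached max_tokens."""
    return response.get("stop_reason") == "max_tokens"


def next_max_tokens(current: int, ceiling: int) -> int:
    """Suggest a larger max_tokens for a retry: double it, but never
    exceed `ceiling`. (Hypothetical retry policy.)"""
    return min(current * 2, ceiling)


# Hypothetical response dictionaries for illustration.
truncated = {"stop_reason": "max_tokens", "usage": {"output_tokens": 1024}}
complete = {"stop_reason": "end_turn", "usage": {"output_tokens": 412}}
```

If truncation shows up often at a given effort level, raising max_tokens or lowering the effort level are both reasonable responses, depending on whether the extra output is useful.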

Key Takeaways

  • Effort controls token spend across all response types, including text, tool calls, and extended thinking — without requiring thinking to be enabled.
  • Five levels (low, medium, high, xhigh, max) let you trade off between speed/cost and capability, all with a single model.
  • Combine effort with adaptive thinking for the best balance of performance and efficiency.
  • Explicitly set effort on Sonnet 4.6 to avoid unexpected latency and cost from the default high setting.
  • Start with medium effort for most applications, then adjust based on observed quality and token usage.