BeClaude
Guide · Beginner · Agents · 2026-05-20

Claude API Feature Overview: A Practical Guide to Model Capabilities, Tools, and Context Management

Explore the five core areas of the Claude API: model capabilities, tools, context management, files, and tool infrastructure. Learn how to use each feature with code examples and best practices.

Quick Answer

This guide breaks down the Claude API into five feature areas: model capabilities (thinking, structured outputs), tools (web search, code execution), context management (prompt caching, compaction), files (PDF, images), and tool infrastructure (discovery, orchestration). You'll learn when and how to use each feature with practical code snippets.

Claude API · Extended Thinking · Tool Use · Context Management · Structured Outputs

Claude API Feature Overview: A Practical Guide

Whether you're building a simple chatbot or a complex agentic system, understanding the Claude API's feature landscape is essential. The API surface is organized into five core areas: model capabilities, tools, tool infrastructure, context management, and files/assets. This guide walks through each area with practical advice and code examples to help you choose the right features for your use case.

1. Model Capabilities: Steering Claude's Output

Model capabilities control how Claude reasons and what it returns. These are the most fundamental features you'll use.

Extended Thinking & Adaptive Thinking

For complex reasoning tasks, Claude can "think" before responding. With extended thinking, you set a fixed thinking budget. Adaptive thinking (recommended for Opus 4.7) lets Claude decide dynamically how much to think, using the effort parameter.

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-opus-4-20250514",
    max_tokens=4096,
    thinking={
        "type": "enabled",
        "budget_tokens": 2048  # Extended thinking budget (must stay below max_tokens)
    },
    messages=[
        {"role": "user", "content": "Solve this complex math problem: integrate x^2 * sin(x) dx"}
    ]
)

# Access the thinking block
for block in response.content:
    if block.type == "thinking":
        print("Thinking:", block.thinking)
    elif block.type == "text":
        print("Answer:", block.text)
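A budget that is too small, or not smaller than max_tokens, is a common cause of rejected requests. The sketch below shows one way to build the thinking parameter defensively. It assumes the constraints documented at the time of writing (a 1,024-token minimum and a budget below max_tokens); the helper name thinking_config is hypothetical and not part of the SDK.

```python
MIN_THINKING_BUDGET = 1024  # assumed API minimum; verify against the current docs


def thinking_config(max_tokens: int, budget_tokens: int) -> dict:
    """Build an extended-thinking parameter, rejecting invalid budgets early."""
    if budget_tokens < MIN_THINKING_BUDGET:
        raise ValueError(f"budget_tokens must be at least {MIN_THINKING_BUDGET}")
    if budget_tokens >= max_tokens:
        raise ValueError("budget_tokens must be smaller than max_tokens")
    return {"type": "enabled", "budget_tokens": budget_tokens}


print(thinking_config(max_tokens=4096, budget_tokens=2048))
```

The returned dict can be passed directly as the thinking argument of client.messages.create.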

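For adaptive thinking, set the thinking type to adaptive and steer the depth with the effort parameter:

response = client.messages.create(
    model="claude-opus-4-7",  # adaptive thinking requires a model that supports it
    max_tokens=4096,
    thinking={"type": "adaptive"},  # Claude decides how much to think
    output_config={"effort": "high"},  # Options include: low, medium, high
    messages=[{"role": "user", "content": "Explain quantum entanglement in simple terms"}]
)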

Structured Outputs

When you need Claude to return JSON that conforms to a specific schema, use structured outputs. Define a JSON schema and the response is constrained to match it.

response = client.messages.create(
    model="claude-sonnet-4-5",  # structured outputs require a model that supports them
    max_tokens=1024,
    messages=[{"role": "user", "content": "Extract the invoice number, date, and total amount from this invoice: ..."}],
    output_config={
        "format": {
            "type": "json_schema",
            "schema": {
                "type": "object",
                "properties": {
                    "invoice_number": {"type": "string"},
                    "date": {"type": "string"},
                    "total_amount": {"type": "number"}
                },
                "required": ["invoice_number", "date", "total_amount"],
                "additionalProperties": False
            }
        }
    }
)
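Even with a schema in place, it is worth validating the result at the boundary of your application, for example in case the response was cut off at max_tokens. The following is a minimal sketch using only the standard library. It assumes the JSON arrives as the text of the first content block (response.content[0].text); the helper name parse_invoice and the sample string are made up for illustration.

```python
import json

# Field names mirror the schema in the example above
REQUIRED_FIELDS = {"invoice_number": str, "date": str, "total_amount": (int, float)}


def parse_invoice(raw_text: str) -> dict:
    """Parse the model's JSON text and check required fields and basic types."""
    data = json.loads(raw_text)
    for field, expected in REQUIRED_FIELDS.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        value = data[field]
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, expected):
            raise ValueError(f"wrong type for field: {field}")
    return data


sample = '{"invoice_number": "INV-001", "date": "2026-01-15", "total_amount": 129.5}'
print(parse_invoice(sample)["total_amount"])
```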

Citations

Ground Claude's responses in source documents. When you provide reference material, Claude can cite specific passages.

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {
                # Source documents are passed as content blocks inside the message
                "type": "document",
                "source": {
                    "type": "text",
                    "media_type": "text/plain",
                    "data": "... full paper text ..."
                },
                "citations": {"enabled": True}
            },
            {"type": "text", "text": "Summarize the key findings from the attached research paper."}
        ]
    }]
)
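With citations enabled, the answer comes back as several text blocks, some of which carry a citations list pointing at the quoted source passage. The sketch below pairs each cited claim with its source text. It works on plain dicts (for SDK objects, something like block.model_dump() should give an equivalent shape); the helper name and the sample data are hypothetical.

```python
def collect_citations(content: list[dict]) -> list[tuple[str, str]]:
    """Return (claim_text, cited_text) pairs from dict-form content blocks."""
    pairs = []
    for block in content:
        if block.get("type") != "text":
            continue
        # Uncited text blocks have no citations list (or a null one)
        for citation in block.get("citations") or []:
            pairs.append((block["text"], citation["cited_text"]))
    return pairs


# Made-up sample in the shape returned for plain-text documents
sample_content = [
    {"type": "text", "text": "The study found a 12% improvement.",
     "citations": [{"type": "char_location", "cited_text": "We observed a 12% gain.",
                    "document_index": 0, "start_char_index": 0, "end_char_index": 23}]},
    {"type": "text", "text": " No other effects were reported."},
]
for claim, source in collect_citations(sample_content):
    print(f"{claim!r} <- {source!r}")
```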

2. Tools: Letting Claude Take Action

Tools extend Claude's capabilities beyond text generation. You can give Claude access to web search, code execution, file operations, and more.

Built-in Tools

Anthropic provides several pre-built tools:

  • Web Search Tool: Fetch real-time information from the web
  • Code Execution Tool: Run Python code in a sandboxed environment
  • Text Editor Tool: Read, write, and edit files
  • Computer Use Tool (beta): Control a virtual desktop

Example using the web search tool:

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=2048,
    tools=[{
        "type": "web_search_20250305",  # server tools use versioned type names
        "name": "web_search"
    }],
    messages=[{"role": "user", "content": "What's the latest news on AI regulation in the EU?"}]
)

Custom Tool Definition

Define your own tools with a JSON schema:

tools = [
    {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City name"}
            },
            "required": ["location"]
        }
    }
]

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}]
)

# Handle tool use
if response.stop_reason == "tool_use":
    for block in response.content:
        if block.type == "tool_use":
            print(f"Tool called: {block.name}")
            print(f"Arguments: {block.input}")
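Printing the call is only half of the loop: the tool's output has to go back to Claude as a tool_result block in the next user message, matched to the call by tool_use_id. The sketch below shows that step on dict-form content blocks. The get_weather body, the HANDLERS registry, and the sample block are stand-ins for illustration.

```python
def get_weather(location: str) -> str:
    """Stand-in implementation; a real tool would call a weather service."""
    return f"Sunny, 22 degrees C in {location}"


HANDLERS = {"get_weather": get_weather}  # hypothetical registry: tool name -> function


def build_tool_results(content: list[dict]) -> dict:
    """Run each tool_use block and wrap the outputs in a single user message."""
    results = []
    for block in content:
        if block.get("type") != "tool_use":
            continue
        handler = HANDLERS.get(block["name"])
        if handler is None:
            # Tell the model the call failed instead of dropping it silently
            results.append({"type": "tool_result", "tool_use_id": block["id"],
                            "content": f"Unknown tool: {block['name']}", "is_error": True})
        else:
            results.append({"type": "tool_result", "tool_use_id": block["id"],
                            "content": handler(**block["input"])})
    return {"role": "user", "content": results}


sample = [{"type": "tool_use", "id": "toolu_01", "name": "get_weather",
           "input": {"location": "Tokyo"}}]
print(build_tool_results(sample))
```

To continue the conversation, append the assistant's response and then this user message to the messages list, and call client.messages.create again with the same tools.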

3. Context Management: Keeping Sessions Efficient

Long conversations or large documents can consume significant context. Claude offers several features to manage this.

Prompt Caching

Cache frequently used context (system prompts, document chunks) to reduce latency and cost.

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "You are a helpful assistant with knowledge of our product documentation.",
            "cache_control": {"type": "ephemeral"}
        }
    ],
    messages=[{"role": "user", "content": "How do I reset my password?"}]
)

# Check cache metrics
print(f"Cache created: {response.usage.cache_creation_input_tokens}")
print(f"Cache read: {response.usage.cache_read_input_tokens}")
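The two cache counters are easier to act on as a single ratio. The sketch below computes the share of input tokens served from cache, assuming (as the usage fields are documented at the time of writing) that input_tokens counts only the tokens that were neither read from nor written to the cache. The helper name and the sample numbers are illustrative.

```python
def cache_hit_ratio(usage: dict) -> float:
    """Fraction of all input tokens that were read from the prompt cache."""
    read = usage.get("cache_read_input_tokens") or 0
    created = usage.get("cache_creation_input_tokens") or 0
    uncached = usage.get("input_tokens") or 0
    total = read + created + uncached
    return read / total if total else 0.0


# Made-up usage numbers for a request that mostly hit the cache
print(cache_hit_ratio({"input_tokens": 50,
                       "cache_creation_input_tokens": 0,
                       "cache_read_input_tokens": 950}))
```

A ratio near zero on repeated requests usually means the cached prefix is changing between calls or is too short to be cached.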

Context Compaction

For long-running sessions, compact the conversation history to stay within context limits.

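# Compaction runs server-side (beta): once the conversation passes a token
# threshold, earlier turns are summarized into a compaction block.
conversation_history.append({"role": "user", "content": "Continue our discussion..."})

response = client.beta.messages.create(
    betas=["compact-2026-01-12"],
    model="claude-opus-4-7",  # compaction requires a model that supports it
    max_tokens=1024,
    context_management={"edits": [{"type": "compact_20260112"}]},
    messages=conversation_history  # List of messages
)

# Keep the full response content (including any compaction block) for the next request
conversation_history.append({"role": "assistant", "content": response.content})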
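Where compaction is not available, a cruder client-side fallback is to drop the oldest turns. Note that this is plain trimming, not the API's compaction feature: nothing is summarized, older context is simply lost. The sketch assumes message content is a plain string, uses character count as a rough stand-in for tokens, and assumes the conversation must open with a user turn; trim_history is a hypothetical helper.

```python
def trim_history(messages: list[dict], max_chars: int) -> list[dict]:
    """Keep the newest messages whose combined text fits within max_chars."""
    kept, used = [], 0
    for message in reversed(messages):
        size = len(message["content"])
        # Always keep the latest message, even if it alone exceeds the budget
        if kept and used + size > max_chars:
            break
        kept.append(message)
        used += size
    kept.reverse()
    # Drop leading assistant turns so the history starts with a user message
    while len(kept) > 1 and kept[0]["role"] != "user":
        kept.pop(0)
    return kept


history = [
    {"role": "user", "content": "a" * 10},
    {"role": "assistant", "content": "b" * 10},
    {"role": "user", "content": "c" * 10},
    {"role": "assistant", "content": "d" * 10},
    {"role": "user", "content": "e" * 10},
]
print(len(trim_history(history, max_chars=35)))
```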

4. Files and Assets: Working with Documents

Claude can process various file types directly.

PDF Support

import base64

with open("report.pdf", "rb") as f:
    pdf_data = base64.b64encode(f.read()).decode("utf-8")

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=2048,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "document",
                "source": {
                    "type": "base64",
                    "media_type": "application/pdf",
                    "data": pdf_data
                }
            },
            {"type": "text", "text": "Summarize this PDF."}
        ]
    }]
)
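When PDFs come from user uploads, a cheap sanity check before encoding avoids sending non-PDF bytes under a PDF media type. The sketch below checks for the standard %PDF- file header and builds the same document block as above; pdf_block is a hypothetical helper and the stub bytes are not a valid PDF, just enough to exercise the check.

```python
import base64


def pdf_block(pdf_bytes: bytes) -> dict:
    """Wrap raw PDF bytes in a base64 document content block."""
    if not pdf_bytes.startswith(b"%PDF-"):
        raise ValueError("data does not look like a PDF (missing %PDF- header)")
    return {
        "type": "document",
        "source": {
            "type": "base64",
            "media_type": "application/pdf",
            "data": base64.b64encode(pdf_bytes).decode("utf-8"),
        },
    }


block = pdf_block(b"%PDF-1.7\n% stub bytes for illustration only\n")
print(block["source"]["media_type"])
```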

Image Understanding

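import base64

# Load and encode the image ("photo.jpg" is a placeholder path)
with open("photo.jpg", "rb") as f:
    image_base64 = base64.b64encode(f.read()).decode("utf-8")

response = client.messages.create(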
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": "image/jpeg",
                    "data": image_base64
                }
            },
            {"type": "text", "text": "Describe this image in detail."}
        ]
    }]
)
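The media_type must match the actual image format, and only a handful of formats are accepted (JPEG, PNG, GIF, and WebP at the time of writing). The sketch below derives the media type from the file extension and rejects anything else. The helper name and the extension table are assumptions for illustration; checking the file's real contents would be more robust than trusting the extension.

```python
import base64
import os

# Extension -> media type for the image formats assumed to be supported
EXTENSION_TO_MEDIA_TYPE = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def image_block(filename: str, image_bytes: bytes) -> dict:
    """Build a base64 image content block, inferring media_type from the name."""
    extension = os.path.splitext(filename)[1].lower()
    media_type = EXTENSION_TO_MEDIA_TYPE.get(extension)
    if media_type is None:
        raise ValueError(f"unsupported image extension: {extension or '(none)'}")
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            "data": base64.b64encode(image_bytes).decode("utf-8"),
        },
    }


print(image_block("diagram.PNG", b"stub-bytes")["source"]["media_type"])
```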

5. Tool Infrastructure: Orchestration at Scale

When building complex agents, you need infrastructure for tool discovery, routing, and execution.

MCP (Model Context Protocol)

MCP provides a standardized way to connect Claude to external tools and data sources. Use remote MCP servers for production deployments.

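# Configure the MCP connector (beta; the beta header and field names are versioned)
from anthropic import Anthropic

client = Anthropic()

# Tools are discovered dynamically from the MCP server
response = client.beta.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=2048,
    betas=["mcp-client-2025-11-20"],
    mcp_servers=[{
        "type": "url",
        "url": "https://my-mcp-server.example.com/tools",
        "name": "my-mcp-server"
    }],
    tools=[{"type": "mcp_toolset", "mcp_server_name": "my-mcp-server"}],
    messages=[{"role": "user", "content": "Query my database for recent orders."}]
)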

Parallel Tool Use

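Claude can request several tool calls in a single response. This behavior is on by default, so no flag is needed to enable it:

# weather_tool, news_tool, and calendar_tool are tool definitions like get_weather above
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=2048,
    tools=[weather_tool, news_tool, calendar_tool],
    # To force one call at a time instead, pass:
    # tool_choice={"type": "auto", "disable_parallel_tool_use": True}
    messages=[{"role": "user", "content": "What's the weather today and do I have any meetings?"}]
)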
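When a response contains several tool_use blocks, the calls are independent, so your code can run them concurrently and must return all results together in one user message. The sketch below does this with a thread pool on dict-form blocks. The handler functions, tool names, and sample IDs are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor


def run_tools_concurrently(tool_uses: list[dict], handlers: dict) -> dict:
    """Execute independent tool calls in parallel and return one user message."""
    def run(block: dict) -> dict:
        try:
            output = handlers[block["name"]](**block["input"])
            return {"type": "tool_result", "tool_use_id": block["id"],
                    "content": str(output)}
        except Exception as exc:  # report failures back to the model
            return {"type": "tool_result", "tool_use_id": block["id"],
                    "content": f"{type(exc).__name__}: {exc}", "is_error": True}

    with ThreadPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(run, tool_uses))  # map preserves input order
    return {"role": "user", "content": results}


# Hypothetical handlers standing in for real integrations
handlers = {
    "get_weather": lambda location: f"Sunny in {location}",
    "get_calendar": lambda date: f"2 meetings on {date}",
}
tool_uses = [
    {"type": "tool_use", "id": "toolu_01", "name": "get_weather",
     "input": {"location": "Tokyo"}},
    {"type": "tool_use", "id": "toolu_02", "name": "get_calendar",
     "input": {"date": "today"}},
]
message = run_tools_concurrently(tool_uses, handlers)
print([result["tool_use_id"] for result in message["content"]])
```

Threads suit I/O-bound tools such as HTTP calls; for CPU-heavy tools a process pool may be a better fit.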

Feature Availability by Platform

Not all features are available everywhere. Here's a quick reference:

Feature              Claude API   AWS Bedrock   Vertex AI
Extended Thinking    GA           GA            GA
Adaptive Thinking    GA           GA            GA
Structured Outputs   GA           GA            GA
Prompt Caching       GA           GA            Beta
Batch Processing     GA           GA            GA
Computer Use         Beta         Beta          -
MCP                  Beta         Beta          -

GA = Generally Available, Beta = Preview with possible changes

Best Practices

  • Start simple: Begin with model capabilities (thinking, structured outputs) before adding tools.
  • Use caching for static context: Cache system prompts and reference documents to reduce costs.
  • Monitor token usage: Use the usage field in responses to track input/output tokens.
  • Handle tool calls gracefully: Always check stop_reason and handle tool_use blocks.
  • Use batch processing for high volume: For workloads that don't need an immediate response, the batch API processes requests asynchronously and saves 50% on costs.
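As an illustration of the batch processing tip above, each batch entry pairs a custom_id of your choosing with the same parameters you would send to a normal request. The sketch below builds that list in plain Python; build_batch_requests and the prompts are hypothetical, and the list is only constructed here, not submitted.

```python
def build_batch_requests(prompts: list[str], model: str,
                         max_tokens: int = 1024) -> list[dict]:
    """Turn a list of prompts into batch request entries with stable IDs."""
    return [
        {
            "custom_id": f"request-{index}",  # used to match results to inputs
            "params": {
                "model": model,
                "max_tokens": max_tokens,
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        for index, prompt in enumerate(prompts)
    ]


requests = build_batch_requests(
    ["Summarize document A", "Summarize document B"],
    model="claude-sonnet-4-20250514",
)
print([request["custom_id"] for request in requests])
# With the SDK installed, the list can be submitted with:
# client.messages.batches.create(requests=requests)
```

Results arrive asynchronously and are not guaranteed to come back in submission order, which is why each entry carries its own custom_id.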

Key Takeaways

  • Claude's API is organized into five areas: model capabilities, tools, tool infrastructure, context management, and files/assets.
  • Use extended/adaptive thinking for complex reasoning and structured outputs for reliable data extraction.
  • Leverage built-in tools (web search, code execution) or define custom tools with JSON schemas.
  • Manage context efficiently with prompt caching and compaction to keep costs low and responses fast.
  • Check feature availability per platform (Claude API, Bedrock, Vertex AI) before building, as some features are in beta.