Guide | Beginner | Pricing | 2026-05-15

Claude API Feature Overview: A Practical Guide to Model Capabilities, Tools, and Context Management

Explore the five core areas of the Claude API surface: model capabilities, tools, context management, files, and tool infrastructure. Learn how to steer Claude's reasoning, use tools, and optimize costs with practical code examples.

Quick Answer

This guide breaks down the Claude API into five areas: model capabilities (thinking, structured outputs), tools (web search, code execution), context management (prompt caching, compaction), files (PDF, images), and tool infrastructure (MCP, orchestration). You'll learn how to use each area with practical code examples.

Tags: Claude API, Model Capabilities, Tools, Context Management, Batch Processing

Introduction

Claude's API surface is organized into five core areas: Model capabilities, Tools, Tool infrastructure, Context management, and Files and assets. Each area gives you different levers to control how Claude reasons, interacts with external systems, and handles long-running conversations. This guide walks through each area with practical code examples and explains how features map to availability (Beta, GA, or Deprecated) across platforms like Claude API, AWS Bedrock, Vertex AI, and Microsoft Foundry.

1. Model Capabilities: Steering Claude's Reasoning and Output

Model capabilities control how Claude processes input and formats responses. Key features include:

Extended Thinking & Adaptive Thinking: Let Claude reason step-by-step before answering. With Adaptive Thinking (GA on Claude API), Claude decides dynamically how much to think, and the effort parameter guides how deep that reasoning goes.
Structured Outputs: Enforce JSON or other structured formats for machine-readable responses.
Citations: Ground responses in source documents with exact sentence references.
Multilingual Support: Claude works across dozens of languages.
Zero Data Retention (ZDR): Many features are ZDR-eligible, meaning they can be used under an arrangement in which your data is not retained.
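
To make the Structured Outputs item more concrete, here is a minimal client-side sketch that checks a JSON reply against an expected shape before your code relies on it. It does not call the API. The `REQUIRED_FIELDS` mapping, the `parse_structured_reply` helper, and the sample reply are hypothetical names and data chosen for illustration, not part of the Claude API.

```python
import json

# Hypothetical expected shape: field name -> required Python type.
REQUIRED_FIELDS = {"title": str, "sentiment": str, "score": float}


def parse_structured_reply(raw_text: str) -> dict:
    """Parse a JSON reply and verify each required field and its type."""
    data = json.loads(raw_text)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], expected_type):
            raise ValueError(f"field {field!r} should be {expected_type.__name__}")
    return data


# Sample text standing in for a model response (made up for this sketch).
sample_reply = '{"title": "Launch review", "sentiment": "positive", "score": 0.92}'
parsed = parse_structured_reply(sample_reply)
print(parsed["sentiment"], parsed["score"])
```

A check like this is still useful when the API enforces a schema, because it fails loudly at the boundary if the schema and your code drift apart.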

Example: Using Adaptive Thinking with Effort Parameter

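import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# This example uses a manual thinking budget, which this model supports.
# "effort" is not a key of the manual-budget thinking object. On models that
# support adaptive thinking, effort is passed separately (to our knowledge as
# thinking={"type": "adaptive"} plus output_config={"effort": "high"});
# verify the exact form against the current API reference.
response = client.messages.create(
    model="claude-opus-4-20250514",
    max_tokens=4096,  # must be larger than budget_tokens
    thinking={
        "type": "enabled",
        "budget_tokens": 2048,  # upper bound on tokens spent reasoning
    },
    messages=[
        {"role": "user", "content": "Analyze the pros and cons of using microservices vs monoliths for a startup."}
    ]
)

# With thinking enabled, thinking blocks precede the answer, so print only text blocks.
for block in response.content:
    if block.type == "text":
        print(block.text)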

Batch Processing for Cost Savings

Batch API calls cost 50% less than standard API calls. Use this for large volumes of non-urgent requests.

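import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Message batches live under client.messages.batches
batch = client.messages.batches.create(
    requests=[
        {
            "custom_id": "req-001",  # your own ID, used to match results to requests
            "params": {
                "model": "claude-sonnet-4-20250514",
                "max_tokens": 256,
                "messages": [{"role": "user", "content": "Summarize this article."}]
            }
        },
        # Add more requests...
    ]
)
print(f"Batch ID: {batch.id}")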
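
For larger workloads you will usually generate the `requests` list rather than write it by hand. The sketch below builds one from a list of prompts and keeps a lookup from `custom_id` back to the prompt, which helps because batch results are matched by ID rather than by position. The `build_batch_requests` helper and the `req-NNN` ID scheme are assumptions made for this example.

```python
def build_batch_requests(prompts, model="claude-sonnet-4-20250514", max_tokens=256):
    """Return (requests, id_to_prompt) for a list of prompt strings."""
    requests = []
    id_to_prompt = {}
    for index, prompt in enumerate(prompts, start=1):
        custom_id = f"req-{index:03d}"  # hypothetical ID scheme; any unique string works
        requests.append({
            "custom_id": custom_id,
            "params": {
                "model": model,
                "max_tokens": max_tokens,
                "messages": [{"role": "user", "content": prompt}],
            },
        })
        id_to_prompt[custom_id] = prompt
    return requests, id_to_prompt


requests, id_to_prompt = build_batch_requests(
    ["Summarize article A.", "Summarize article B."]
)
print([request["custom_id"] for request in requests])
```

The resulting `requests` list has the same shape as the `requests` argument in the batch example above.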

2. Tools: Let Claude Take Actions

Tools extend Claude's capabilities to interact with the outside world. The API supports:

Web Search Tool: Let Claude search the web for real-time information.
Code Execution Tool: Run Python code in a sandboxed environment.
Computer Use Tool: Control a virtual desktop environment.
Memory Tool: Store and retrieve information across conversations.
Bash Tool: Execute shell commands.
Text Editor Tool: Read, write, and edit files.
Advisor Tool: Get guidance on complex tasks.

Example: Using the Web Search Tool

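import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=[
        {
            # Server tools use a versioned type and take no description;
            # check the docs for the current version string.
            "type": "web_search_20250305",
            "name": "web_search",
            "max_uses": 3  # optional cap on searches per request
        }
    ],
    messages=[
        {"role": "user", "content": "What is the latest news about Claude AI?"}
    ]
)

# Claude will automatically decide when to call the tool.
# The response mixes text, tool-use, and search-result blocks, so print only the text.
for block in response.content:
    if block.type == "text":
        print(block.text)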

Tool Use Best Practices

Parallel Tool Use: Claude can call multiple tools simultaneously for efficiency.
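Strict Tool Use: Guarantee that the inputs Claude passes to a tool conform to that tool's JSON schema. (To force Claude to call a specific tool, use the tool_choice parameter instead.)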
Tool Runner (SDK): Automate tool execution with built-in SDK helpers.
Fine-grained Tool Streaming: Stream tool calls and results incrementally.
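
When Claude uses tools in parallel, a single assistant turn can contain several `tool_use` blocks, and your code returns one `tool_result` block per call, matched by ID. The sketch below shows that dispatch step offline, with plain dicts standing in for API response blocks. The `get_weather` and `add_numbers` tools, the `TOOL_HANDLERS` registry, and the sample IDs are hypothetical.

```python
def get_weather(city: str) -> str:
    """Hypothetical local tool: returns a canned forecast."""
    return f"Sunny in {city}"


def add_numbers(a: float, b: float) -> str:
    """Hypothetical local tool: adds two numbers and returns the sum as text."""
    return str(a + b)


# Registry mapping tool names to the functions that implement them.
TOOL_HANDLERS = {"get_weather": get_weather, "add_numbers": add_numbers}


def run_tool_calls(content_blocks):
    """Execute every tool_use block and return matching tool_result blocks."""
    results = []
    for block in content_blocks:
        if block.get("type") != "tool_use":
            continue  # skip text and other block types
        handler = TOOL_HANDLERS.get(block["name"])
        if handler is None:
            results.append({
                "type": "tool_result",
                "tool_use_id": block["id"],
                "content": f"Unknown tool: {block['name']}",
                "is_error": True,
            })
            continue
        results.append({
            "type": "tool_result",
            "tool_use_id": block["id"],
            "content": handler(**block["input"]),
        })
    return results


# Simulated assistant turn containing two parallel tool calls.
assistant_content = [
    {"type": "text", "text": "I'll check both."},
    {"type": "tool_use", "id": "toolu_01", "name": "get_weather", "input": {"city": "Paris"}},
    {"type": "tool_use", "id": "toolu_02", "name": "add_numbers", "input": {"a": 2, "b": 3}},
]
tool_results = run_tool_calls(assistant_content)
print(tool_results)
```

In a real loop, `tool_results` would be sent back as the content of the next user message so Claude can continue with the outputs.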

3. Tool Infrastructure: Discovery and Orchestration at Scale

When you have many tools, you need infrastructure to manage them. Claude supports:

MCP (Model Context Protocol): A standard for connecting Claude to external tools and data sources.
Remote MCP Servers: Connect to tools hosted on remote servers.
MCP Connector: Bridge between Claude and your existing tool ecosystem.
Tool Search: Let Claude discover the right tool from a large catalog.
Tool Combinations: Chain multiple tools together for complex workflows.
Programmatic Tool Calling: Let Claude invoke your tools from code it writes in the code execution environment, instead of spending one model round trip per tool call.
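
To illustrate the idea behind Tool Search, the sketch below ranks a small tool catalog by keyword overlap with a query, so that only the best matches need to be loaded into a request. This is a local illustration of the concept, not the API's tool search feature; the `search_tools` function, the scoring rule, and the catalog entries are all assumptions.

```python
def search_tools(catalog, query, limit=3):
    """Rank tools by how many query words appear in their name or description."""
    query_words = set(query.lower().split())
    scored = []
    for tool in catalog:
        text = f"{tool['name'].replace('_', ' ')} {tool['description']}".lower()
        score = len(query_words & set(text.split()))
        if score > 0:
            scored.append((score, tool["name"]))
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:limit]]


# Hypothetical catalog; a real one might hold hundreds of tool definitions.
CATALOG = [
    {"name": "database_query", "description": "Run a read-only SQL query against the users database"},
    {"name": "send_email", "description": "Send an email to a customer"},
    {"name": "create_ticket", "description": "Create a support ticket for a customer issue"},
]

matches = search_tools(CATALOG, "query the users database")
print(matches)
```

Keyword overlap is deliberately naive; a production catalog would more likely use embeddings or the platform's own search tool.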

Example: Setting Up a Remote MCP Server

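import Anthropic from "@anthropic-ai/sdk";

// The SDK does not export an MCPClient. Remote MCP servers are attached to a
// Messages request through the MCP connector (beta). The field names and beta
// flag below follow the connector docs as we understand them; verify them
// against the current reference before use.
const client = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

const response = await client.beta.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  betas: ["mcp-client-2025-04-04"],
  // Connect to a remote MCP server
  mcp_servers: [
    {
      type: "url",
      url: "https://my-mcp-server.example.com",
      name: "my-mcp-server", // label chosen for this example
      authorization_token: process.env.MCP_SERVER_TOKEN,
    },
  ],
  // Claude decides when to call tools the server exposes, such as database_query
  messages: [
    { role: "user", content: "Use database_query to list the first 10 users." },
  ],
});
console.log(response.content);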

4. Context Management: Keeping Long Sessions Efficient

Long conversations can consume many tokens. Context management features help:

Context Windows: Up to 1M tokens for processing large documents and codebases.
Compaction: Reduce token usage by summarizing or pruning older messages.
Context Editing: Manually remove or modify parts of the conversation history.
Prompt Caching: Reuse cached prompts across requests to reduce latency and cost.
Token Counting: Estimate token usage before sending a request.
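
As a rough picture of what Compaction does, the sketch below replaces older messages with a single placeholder summary once a conversation exceeds a token budget. It is a client-side illustration only, not the API's compaction feature. The `estimate_tokens` heuristic (about four characters per token), the `compact_history` helper, and the placeholder text are assumptions; a real implementation would ask a model to write the summary and would use the token counting endpoint for accurate sizes.

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic (assumption): about 4 characters per token."""
    return max(1, len(text) // 4)


def compact_history(messages, max_tokens, keep_recent=2):
    """Collapse older messages into one placeholder if the history is too long."""
    total = sum(estimate_tokens(message["content"]) for message in messages)
    if total <= max_tokens or len(messages) <= keep_recent:
        return messages  # nothing to compact
    older, recent = messages[:-keep_recent], messages[-keep_recent:]
    summary = {
        "role": "user",
        "content": f"[Summary of {len(older)} earlier messages omitted for brevity]",
    }
    return [summary] + recent


history = [
    {"role": "user", "content": "a" * 400},
    {"role": "assistant", "content": "b" * 400},
    {"role": "user", "content": "c" * 40},
    {"role": "assistant", "content": "d" * 40},
]
compacted = compact_history(history, max_tokens=100)
print(len(history), "->", len(compacted))
```

The sketch keeps the most recent turns verbatim because they usually carry the context the next reply depends on.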

Example: Using Prompt Caching

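import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Cache a system prompt for reuse.
# Note: prompts below the model's minimum cacheable length (about 1,024 tokens
# for this model family) are processed normally without being cached, so a real
# cached prefix would be much longer than this one.
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=512,
    system=[
        {
            "type": "text",
            "text": "You are a helpful assistant specialized in Python programming.",
            "cache_control": {"type": "ephemeral"}  # Cache everything up to this block
        }
    ],
    messages=[
        {"role": "user", "content": "Explain list comprehensions."}
    ]
)
print(response.content[0].text)

# Usage fields show whether the prefix was written to or read from the cache
print(response.usage.cache_creation_input_tokens, response.usage.cache_read_input_tokens)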
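
To see why caching reduces cost, the sketch below compares sending the same prompt prefix uncached on every request with writing it to the cache once and reading it afterwards. The multipliers (1.25x the base input price for a cache write, 0.1x for a cache read) reflect published pricing for the default cache lifetime as we understand it, and the $3.00 per million tokens base price is only an illustrative figure; check current pricing before relying on either. The `cached_prefix_cost` helper is hypothetical and assumes every request arrives while the cache entry is still alive.

```python
def cached_prefix_cost(prefix_tokens, num_requests, base_price_per_mtok,
                       write_multiplier=1.25, read_multiplier=0.10):
    """Return (uncached_cost, cached_cost) in dollars for a shared prompt prefix."""
    per_token = base_price_per_mtok / 1_000_000
    # Without caching, every request pays full price for the prefix.
    uncached = prefix_tokens * num_requests * per_token
    # With caching, the first request pays the write price, the rest pay the read price.
    cached = prefix_tokens * per_token * (
        write_multiplier + read_multiplier * (num_requests - 1)
    )
    return uncached, cached


uncached, cached = cached_prefix_cost(
    prefix_tokens=10_000, num_requests=20, base_price_per_mtok=3.00
)
print(f"uncached: ${uncached:.4f}, cached: ${cached:.4f}")
```

With these assumed numbers, caching costs more than no caching for a single request and pays off from the second request onward.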

5. Files and Assets: Managing Documents and Data

Claude can process various file types:

PDF Support: Extract text and layout from PDFs.
Images and Vision: Analyze images with multimodal models.
Files API: Upload and reference files in conversations.

Example: Processing a PDF with Citations

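import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "document",
                    "source": {
                        "type": "base64",
                        "media_type": "application/pdf",
                        "data": "<base64-encoded-pdf>"  # replace with your PDF as a base64 string
                    },
                    "citations": {"enabled": True}  # citations are off unless enabled per document
                },
                {
                    "type": "text",
                    "text": "Summarize this document and cite key statistics."
                }
            ]
        }
    ]
)

# With citations enabled, the answer arrives as several text blocks,
# some of which carry a list of citations pointing back into the PDF.
for block in response.content:
    if block.type == "text":
        print(block.text)
        for citation in block.citations or []:
            print("  cited:", citation.cited_text)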
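
Images follow the same pattern as the PDF example: the file is base64-encoded and placed in a content block next to the text prompt. The sketch below builds such a block offline. The `make_image_block` helper is a hypothetical convenience function, and the bytes used here are only the PNG file signature standing in for a real image.

```python
import base64


def make_image_block(image_bytes: bytes, media_type: str = "image/png") -> dict:
    """Wrap raw image bytes in a base64 image content block."""
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            "data": base64.standard_b64encode(image_bytes).decode("utf-8"),
        },
    }


# Placeholder bytes (PNG signature only); read a real file in practice.
fake_png = b"\x89PNG\r\n\x1a\n"
image_block = make_image_block(fake_png)

# Content list for a user message: image first, then the question about it.
content = [image_block, {"type": "text", "text": "Describe this image."}]
print(content[0]["source"]["media_type"], len(content))
```

The `content` list can be used as the `content` of a user message in `client.messages.create`.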

Feature Availability Across Platforms

Not all features are available everywhere. Here's a quick reference:

Feature	Claude API	AWS Bedrock	Vertex AI	Microsoft Foundry
Context Windows (1M tokens)	GA	GA	GA	Beta
Adaptive Thinking	GA	GA	GA	Beta
Batch Processing	GA	GA	GA	GA
Citations	GA	GA	GA	Beta
Web Search Tool	Beta	Beta	Beta	Beta
Code Execution Tool	Beta	Beta	Beta	Beta
Prompt Caching	GA	GA	GA	Beta
Structured Outputs	GA	GA	GA	GA

Features marked as Beta may change significantly and are not guaranteed for production use. Always check the feature's documentation for the latest status.
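
If your application targets several platforms, it can help to encode the table above as data and check it before enabling a feature. The sketch below copies the table's values as they appear in this guide. The `AVAILABILITY` structure and the `is_ga` and `ga_everywhere` helpers are hypothetical, and the statuses will go stale, so treat the documentation as the source of truth.

```python
PLATFORMS = ("Claude API", "AWS Bedrock", "Vertex AI", "Microsoft Foundry")

# Statuses copied from the table in this guide, in PLATFORMS order.
_ROWS = {
    "Context Windows (1M tokens)": ("GA", "GA", "GA", "Beta"),
    "Adaptive Thinking": ("GA", "GA", "GA", "Beta"),
    "Batch Processing": ("GA", "GA", "GA", "GA"),
    "Citations": ("GA", "GA", "GA", "Beta"),
    "Web Search Tool": ("Beta", "Beta", "Beta", "Beta"),
    "Code Execution Tool": ("Beta", "Beta", "Beta", "Beta"),
    "Prompt Caching": ("GA", "GA", "GA", "Beta"),
    "Structured Outputs": ("GA", "GA", "GA", "GA"),
}
AVAILABILITY = {
    feature: dict(zip(PLATFORMS, statuses)) for feature, statuses in _ROWS.items()
}


def is_ga(feature: str, platform: str) -> bool:
    """True if the feature is listed as GA on the given platform."""
    return AVAILABILITY[feature][platform] == "GA"


def ga_everywhere() -> list:
    """Features listed as GA on every platform in the table."""
    return sorted(
        feature
        for feature, statuses in AVAILABILITY.items()
        if all(status == "GA" for status in statuses.values())
    )


print(is_ga("Prompt Caching", "Microsoft Foundry"))
print(ga_everywhere())
```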

Key Takeaways

Claude's API is organized into five areas: Model capabilities, tools, tool infrastructure, context management, and files. Start with model capabilities and tools, then explore the others for optimization.
Use Adaptive Thinking for complex reasoning: Set the effort parameter to control thinking depth without manual tuning.
Batch processing cuts costs by 50%: Use the Batch API for large, non-urgent workloads.
Leverage tools for real-world actions: Web search, code execution, and memory tools let Claude interact with external systems.
Manage context efficiently: Use prompt caching, compaction, and context editing to keep long sessions fast and cost-effective.