Guide | Beginner | Best Practices | 2026-05-12

Claude API Feature Overview: A Practical Guide to Model Capabilities, Tools, and Context Management

Explore Claude's five core API areas—model capabilities, tools, context management, and more. Learn how to use each feature with code examples and best practices.

Quick Answer

This guide walks you through Claude's API surface—model capabilities, tools, context management, files, and tool infrastructure—with practical code examples and feature availability details for each platform.

Tags: Claude API, model capabilities, tools, context management, batch processing

Claude's API is designed to be both powerful and flexible, offering a rich set of features that go far beyond simple text generation. Whether you're building a chatbot, an automated research assistant, or a code review tool, understanding the five core areas of the API surface will help you get the most out of Claude.

This guide provides a practical, actionable walkthrough of each area, with code examples and best practices. By the end, you'll know exactly which features to use for your use case and how to combine them effectively.

The Five Core Areas of the Claude API

Claude's API surface is organized into five areas:

Model capabilities – Control how Claude reasons and formats responses.
Tools – Let Claude take actions on the web or in your environment.
Tool infrastructure – Handle discovery and orchestration at scale.
Context management – Keep long-running sessions efficient.
Files and assets – Manage the documents and data you provide to Claude.

If you're new to the API, start with model capabilities and tools. Return to the other sections when you're ready to optimize cost, latency, or scale.

1. Model Capabilities: Steering Claude's Outputs

Model capabilities are the direct ways you control what Claude produces. This includes reasoning depth, response format, and input modalities.

Extended Thinking (Adaptive Thinking)

Claude can now dynamically decide when and how much to "think" before responding. This is especially useful for complex reasoning tasks like math, code generation, or multi-step analysis.

Example: Using adaptive thinking with the effort parameter

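import anthropic

client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-opus-4-20250514",
    max_tokens=4096,  # must be larger than budget_tokens
    thinking={
        "type": "enabled",
        "budget_tokens": 2048
        # `effort` (low, medium, high) is not a field of this object;
        # check the API reference for how to set it on your model.
    },
    messages=[
        {"role": "user", "content": "Explain the proof of Fermat's Last Theorem in simple terms."}
    ]
)
# With thinking enabled, thinking blocks precede the answer; print only the text
for block in response.content:
    if block.type == "text":
        print(block.text)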

Tip: Use effort: "low" for simple tasks to save tokens and reduce latency. Use effort: "high" for complex reasoning.

Structured Outputs

You can ask Claude to return responses in a structured format like JSON, which is perfect for programmatic consumption.

Example: Requesting JSON output

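import json

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    system="Always respond in valid JSON format.",
    messages=[
        {"role": "user", "content": "List three famous scientists and their discoveries as a JSON array."}
    ]
)
# A system prompt alone does not guarantee valid JSON, so guard the parse
try:
    data = json.loads(response.content[0].text)
    print(data)
except json.JSONDecodeError as err:
    print(f"Response was not valid JSON: {err}")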
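Even when asked for JSON, a model may wrap the payload in a Markdown code fence or add a sentence around it. The sketch below shows one way to make the parse more tolerant. The helper name `extract_json` is hypothetical and not part of the Anthropic SDK; it only post-processes the response text.

```python
import json
import re

_TICKS = "`" * 3  # a Markdown code fence marker
_FENCE = re.compile(_TICKS + r"(?:json)?\s*(.*?)" + _TICKS, re.DOTALL)


def extract_json(text):
    """Return the first JSON object or array found in model output."""
    fenced = _FENCE.search(text)
    candidate = (fenced.group(1) if fenced else text).strip()
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        pass
    # Fall back to scanning for the first position where a JSON value decodes
    decoder = json.JSONDecoder()
    for i, ch in enumerate(candidate):
        if ch in "[{":
            try:
                value, _ = decoder.raw_decode(candidate[i:])
                return value
            except json.JSONDecodeError:
                continue
    raise ValueError("no JSON object or array found in model output")


print(extract_json('Here you go: {"name": "Curie"} Hope that helps'))
```

In the example above, `extract_json(response.content[0].text)` could replace the bare `json.loads` call if stray text around the payload is a concern.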

Batch Processing for Cost Savings

Batch processing lets you send large volumes of requests asynchronously, costing 50% less than standard API calls. This is ideal for data enrichment, content generation at scale, or backfill tasks.

Example: Creating a batch

# In the Python SDK, batches live under client.messages
batch = client.messages.batches.create(
    requests=[
        {
            "custom_id": "req-001",
            "params": {
                "model": "claude-sonnet-4-20250514",
                "max_tokens": 256,
                "messages": [{"role": "user", "content": "Summarize: E=mc^2"}]
            }
        },
        {
            "custom_id": "req-002",
            "params": {
                "model": "claude-sonnet-4-20250514",
                "max_tokens": 256,
                "messages": [{"role": "user", "content": "Summarize: Quantum entanglement"}]
            }
        }
    ]
)
print(f"Batch created with ID: {batch.id}")

Note: Batch processing is not eligible for Zero Data Retention (ZDR). Use it for non-sensitive workloads.
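For larger jobs, the request list is usually generated rather than written by hand. The following is a minimal sketch of that step; `build_batch_requests` and its `prefix` argument are hypothetical names chosen for illustration. It assigns each prompt a unique `custom_id`, which is what lets you match results back to inputs later.

```python
def build_batch_requests(prompts, model="claude-sonnet-4-20250514",
                         max_tokens=256, prefix="req"):
    """Turn a list of prompt strings into batch request entries."""
    requests = []
    for i, prompt in enumerate(prompts, start=1):
        requests.append({
            "custom_id": f"{prefix}-{i:03d}",  # must be unique within the batch
            "params": {
                "model": model,
                "max_tokens": max_tokens,
                "messages": [{"role": "user", "content": prompt}],
            },
        })
    return requests


requests = build_batch_requests(
    ["Summarize: E=mc^2", "Summarize: Quantum entanglement"]
)
print([r["custom_id"] for r in requests])
```

The resulting list can be passed as the `requests` argument of the batch creation call. Because results are not guaranteed to come back in submission order, keying on `custom_id` is the safer way to join them to your inputs.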

Citations

Citations allow Claude to ground its responses in source documents, providing exact references to the sentences it used. This is critical for legal, academic, or fact-checking applications.

Example: Enabling citations

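response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                # Source documents go inside the user message as document blocks
                {
                    "type": "document",
                    "source": {
                        "type": "text",
                        "media_type": "text/plain",
                        "data": "Global temperatures have risen by 1.2°C since pre-industrial levels..."
                    },
                    "title": "Climate Report 2024",
                    "citations": {"enabled": True}
                },
                {"type": "text", "text": "What does the report say about temperature rise?"}
            ]
        }
    ]
)
# Citations are attached to the text blocks that rely on the document
for block in response.content:
    if block.type == "text" and block.citations:
        print(block.text, block.citations)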

2. Tools: Let Claude Take Action

Tools extend Claude's capabilities beyond text generation. You can define custom tools, use built-in tools, or let Claude call external APIs.

Defining Custom Tools

You can define tools using a JSON schema. Claude will decide when to call them based on the conversation.

Example: A weather lookup tool

tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
                "units": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["city"]
        }
    }
]
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=tools,
    messages=[
        {"role": "user", "content": "What's the weather in Tokyo?"}
    ]
)
# If Claude decides to call the tool, the response contains a tool_use block
print(response.content)

Built-in Tools

Claude provides several built-in tools:

Web search tool – Fetch real-time information from the web.
Code execution tool – Run Python code in a sandboxed environment.
Computer use tool – Let Claude interact with a virtual desktop.
Memory tool – Store and retrieve information across sessions.
Bash tool – Execute shell commands.

Example: Using the web search tool

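response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    # Server tools are declared with a versioned type and a name
    tools=[{"type": "web_search_20250305", "name": "web_search", "max_uses": 3}],
    messages=[
        {"role": "user", "content": "What is the latest news about AI regulation?"}
    ]
)
# The response interleaves search blocks with text; print only the text
for block in response.content:
    if block.type == "text":
        print(block.text)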

3. Tool Infrastructure: Discovery and Orchestration at Scale

When you have many tools, managing them becomes a challenge. Claude's tool infrastructure handles:

Tool discovery – Automatically find the right tool for a task.
Tool combinations – Use multiple tools together in a single workflow.
Fine-grained tool streaming – Stream tool calls and results in real time.
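Programmatic tool calling – Let Claude invoke your tools from code it runs in the code execution environment, instead of spending one model turn per call. (To force a specific tool on a request, use the tool_choice parameter.)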
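When a particular tool must run on a given request, the Messages API accepts a `tool_choice` parameter. The sketch below builds the request arguments as a plain dictionary so it can be inspected without an API call. `force_tool` is a hypothetical helper name, and the tool definition is abbreviated from the weather example.

```python
def force_tool(request_kwargs, tool_name):
    """Return a copy of the request kwargs that requires `tool_name`."""
    declared = {tool.get("name") for tool in request_kwargs.get("tools", [])}
    if tool_name not in declared:
        raise ValueError(f"{tool_name!r} is not among the declared tools")
    forced = dict(request_kwargs)  # shallow copy; the original is untouched
    forced["tool_choice"] = {"type": "tool", "name": tool_name}
    return forced


base = {
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "tools": [{
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "input_schema": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }],
    "messages": [{"role": "user", "content": "What's the weather in Tokyo?"}],
}

request = force_tool(base, "get_weather")
print(request["tool_choice"])
```

The resulting dictionary can be passed as `client.messages.create(**request)`. Other `tool_choice` types include `auto` (the default when tools are present) and `any`, which requires some tool call without naming one.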

4. Context Management: Keeping Sessions Efficient

Long conversations or large documents can quickly consume your context window. Claude provides several features to manage this.

Context Windows

Claude supports context windows of up to 1 million tokens on supported models, enough to process entire codebases or lengthy books. Check the limit for the specific model and platform you use.

Prompt Caching

Reduce latency and cost by caching frequently used context (e.g., system prompts or reference documents).

Example: Using prompt caching

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "You are a helpful assistant with knowledge of our company policies.",
            "cache_control": {"type": "ephemeral"}
        }
    ],
    messages=[
        {"role": "user", "content": "What is our return policy?"}
    ]
)

Compaction and Context Editing

Compaction – Summarize older parts of a conversation to free up tokens.
Context editing – Manually remove or modify parts of the context.
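To make the idea of compaction concrete, here is a client-side illustration that replaces older turns with a single summary message. This is not the API's built-in compaction feature, only a sketch of the concept. `compact_history` is a hypothetical helper, and the default summarizer is a naive truncation that a real system would replace with a model-generated summary.

```python
def compact_history(messages, keep_last=4, summarize=None):
    """Replace all but the last `keep_last` messages with one summary."""
    if keep_last < 1:
        raise ValueError("keep_last must be at least 1")
    if len(messages) <= keep_last:
        return list(messages)
    older, recent = messages[:-keep_last], messages[-keep_last:]
    if summarize is None:
        # Naive placeholder: keep the first 80 characters of each message
        def summarize(msgs):
            return " | ".join(
                f"{m['role']}: {str(m['content'])[:80]}" for m in msgs
            )
    summary = {
        "role": "user",
        "content": "Summary of earlier conversation: " + summarize(older),
    }
    return [summary] + list(recent)


history = [
    {"role": "user" if i % 2 == 0 else "assistant", "content": f"message {i}"}
    for i in range(6)
]
for message in compact_history(history, keep_last=2):
    print(message)
```

The trade-off is the usual one for summarization: fewer tokens per request in exchange for losing detail from the earlier turns.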

5. Files and Assets: Managing Documents and Data

Claude can work with various file types, including PDFs, images, and code files.

PDF Support

Claude can read and analyze PDF documents, extracting text and layout information.

Example: Uploading a PDF

import base64

# Read the PDF as bytes; it is base64-encoded in the request below
with open("report.pdf", "rb") as f:
    pdf_data = f.read()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "document", "source": {"type": "base64", "media_type": "application/pdf", "data": base64.b64encode(pdf_data).decode()}},
                {"type": "text", "text": "Summarize this report."}
            ]
        }
    ]
)

Images and Vision

Claude can analyze images, diagrams, and screenshots.

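import base64

# image_data must be a base64-encoded string; chart.png is a placeholder path
with open("chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode()
response = client.messages.create(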
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "image", "source": {"type": "base64", "media_type": "image/png", "data": image_data}},
                {"type": "text", "text": "Describe this chart."}
            ]
        }
    ]
)
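To avoid hard-coding the media type, a small helper can build the image block from a file path. This is a sketch: `image_block` and `SUPPORTED_IMAGE_TYPES` are hypothetical names, and the media type is guessed from the file extension rather than from the file contents.

```python
import base64
import mimetypes
from pathlib import Path

# Image formats accepted for vision input
SUPPORTED_IMAGE_TYPES = {"image/png", "image/jpeg", "image/gif", "image/webp"}


def image_block(path):
    """Build an image content block from a local file."""
    media_type, _ = mimetypes.guess_type(str(path))
    if media_type not in SUPPORTED_IMAGE_TYPES:
        raise ValueError(
            f"unsupported or unknown image type for {path}: {media_type}"
        )
    data = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": data},
    }
```

The returned dictionary can be placed directly in a message's `content` list, for example `[image_block("chart.png"), {"type": "text", "text": "Describe this chart."}]`.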

Feature Availability by Platform

Not all features are available on every platform. Here's a quick reference:

Feature	Claude API	AWS Bedrock	Vertex AI	Microsoft Foundry
Context windows (1M tokens)	GA	GA	GA	Beta
Adaptive thinking	GA	GA	GA	Beta
Batch processing	GA	GA	GA	GA
Citations	GA	GA	GA	GA
Prompt caching	GA	GA	GA	Beta
Web search tool	GA	GA	GA	Beta
Code execution tool	GA	GA	GA	Beta

Note: Features marked as "Beta" may have limited availability or breaking changes. Always check the latest documentation.

Putting It All Together: A Practical Workflow

Here's a real-world example combining multiple features:

import base64
import anthropic

client = anthropic.Anthropic()

# Step 1: Read the PDF and base64-encode it
with open("research_paper.pdf", "rb") as f:
    pdf_b64 = base64.b64encode(f.read()).decode()

# Step 2: Ask Claude to analyze it with citations and web search
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=2048,  # must be larger than the thinking budget
    thinking={"type": "enabled", "budget_tokens": 1024},
    tools=[{"type": "web_search_20250305", "name": "web_search"}],
    messages=[
        {
            "role": "user",
            "content": [
                # The PDF travels inside the user message as a document block
                {
                    "type": "document",
                    "source": {"type": "base64", "media_type": "application/pdf", "data": pdf_b64},
                    "title": "Research Paper",
                    "citations": {"enabled": True}
                },
                {
                    "type": "text",
                    "text": "Summarize this paper and find recent related research on the web. Provide citations for both."
                }
            ]
        }
    ]
)

# The response mixes thinking, tool, and text blocks; print only the text
for block in response.content:
    if block.type == "text":
        print(block.text)

This workflow uses:

Files (PDF upload)
Model capabilities (adaptive thinking)
Tools (web search)
Citations (grounding responses)

Key Takeaways

Claude's API is organized into five areas: model capabilities, tools, tool infrastructure, context management, and files. Start with model capabilities and tools, then optimize with context management.
Use adaptive thinking for complex tasks – The effort parameter lets you control reasoning depth, saving tokens on simple queries.
Batch processing cuts costs by 50% – Ideal for large-scale, non-urgent workloads. Remember it's not ZDR eligible.
Leverage built-in tools – Web search, code execution, and memory tools can dramatically expand what Claude can do without custom code.
Check feature availability per platform – Not all features are GA everywhere. Always verify before building production systems.