Claude API Features Overview: A Practical Guide to Model Capabilities, Tools, and Context Management
Explore Claude's API surface: model capabilities, tools, context management, and files. Learn how to steer reasoning, use tools, and optimize cost with practical code examples.
This guide walks you through Claude's five API areas: model capabilities (thinking, structured outputs), tools (web fetch, code execution), tool infrastructure (discovery and orchestration), context management (prompt caching, compaction), and files (PDF, images). You'll learn when to use each and see practical Python code examples.
Introduction
Claude's API is more than a single endpoint: its features are organized into five core areas. Whether you're building a simple chatbot, a complex agent, or a document analysis pipeline, understanding these areas helps you choose the right tools for the job.
This guide covers the Claude API surface as documented in the official overview, with practical advice and code examples for each area. By the end, you'll know how to steer Claude's reasoning, let it take actions, manage long conversations, and handle files efficiently.
The Five API Areas
The Claude API surface is organized into:
- Model capabilities — Control how Claude reasons and formats responses.
- Tools — Let Claude take actions on the web or in your environment.
- Tool infrastructure — Handle discovery and orchestration at scale.
- Context management — Keep long-running sessions efficient.
- Files and assets — Manage documents and data you provide to Claude.
1. Model Capabilities
Model capabilities control how Claude reasons and how it formats its output. These include:
Extended Thinking & Adaptive Thinking
Claude can "think" before responding, which improves reasoning on complex tasks. With extended thinking, you set a fixed token budget through budget_tokens, as in the example below. With adaptive thinking, Claude decides when and how much to think, which is ideal for Opus 4.7; use the effort parameter to control thinking depth.
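import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-opus-4-20250514",
    max_tokens=4096,
    # budget_tokens must be smaller than max_tokens
    thinking={"type": "enabled", "budget_tokens": 2048},
    messages=[
        {"role": "user", "content": "Solve this complex math problem: integrate x^2 * sin(x) dx"}
    ]
)

# With thinking enabled, thinking blocks come before the answer, so print only the text blocks
for block in response.content:
    if block.type == "text":
        print(block.text)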
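The budget in the request above has two constraints worth checking before you send it. As far as I know, the API expects budget_tokens to be at least 1,024 and strictly smaller than max_tokens. The sketch below encodes both rules; the helper name thinking_config is hypothetical, and the minimum should be confirmed against the current documentation.

```python
MIN_THINKING_BUDGET = 1024  # assumed documented minimum; verify against current docs


def thinking_config(max_tokens: int, budget_tokens: int) -> dict:
    """Return a `thinking` parameter dict, or raise if the budget is invalid."""
    if budget_tokens < MIN_THINKING_BUDGET:
        raise ValueError(f"budget_tokens must be at least {MIN_THINKING_BUDGET}")
    if budget_tokens >= max_tokens:
        # The thinking budget is spent out of max_tokens, so it has to
        # leave room for the visible answer.
        raise ValueError("budget_tokens must be smaller than max_tokens")
    return {"type": "enabled", "budget_tokens": budget_tokens}


print(thinking_config(max_tokens=4096, budget_tokens=2048))
```

The returned dict can be passed directly as the thinking argument of client.messages.create.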
Structured Outputs
Claude can return structured data (JSON) directly, making it easy to integrate with applications.
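import json

# Note: response_format is not a Messages API parameter, so it is omitted here.
# The structured outputs feature takes a JSON schema; check the current docs for
# the exact parameter. This example asks for JSON in the prompt instead.
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "List three famous physicists and their key contributions. "
                       "Respond with a JSON array only, with no surrounding prose."
        }
    ]
)

# Parsing can still fail if the model adds prose, so handle json.JSONDecodeError in production
data = json.loads(response.content[0].text)
print(data)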
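Whatever mechanism produces the JSON, it helps to validate the parsed result before handing it to the rest of an application. The following is a minimal sketch using only the standard library. The expected shape (a list of objects with name and contribution keys) and the function name parse_physicists are assumptions made for this example, not something the API defines.

```python
import json

REQUIRED_KEYS = {"name", "contribution"}  # hypothetical shape for this example


def parse_physicists(raw: str) -> list:
    """Parse a JSON array of objects and check each one has the expected keys."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed output
    if not isinstance(data, list):
        raise ValueError("expected a JSON array")
    for item in data:
        if not isinstance(item, dict):
            raise ValueError("expected every array element to be an object")
        missing = REQUIRED_KEYS - set(item)
        if missing:
            raise ValueError(f"missing keys: {sorted(missing)}")
    return data


sample = '[{"name": "Marie Curie", "contribution": "Research on radioactivity"}]'
print(parse_physicists(sample)[0]["name"])
```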
Batch Processing
For large volumes of requests, use batch processing to save 50% on costs. Batches are processed asynchronously.
import anthropic
client = anthropic.Anthropic()
batch = client.messages.batches.create(
    requests=[
        {
            "custom_id": "req-1",
            "params": {
                "model": "claude-sonnet-4-20250514",
                "max_tokens": 1024,
                "messages": [{"role": "user", "content": "What is the capital of France?"}]
            }
        },
        {
            "custom_id": "req-2",
            "params": {
                "model": "claude-sonnet-4-20250514",
                "max_tokens": 1024,
                "messages": [{"role": "user", "content": "What is the capital of Japan?"}]
            }
        }
    ]
)
print(f"Batch ID: {batch.id}")
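For more than a handful of prompts, the request list is easier to generate than to write by hand. The sketch below builds the same structure as the example above and shows one way to match results back to requests by custom_id, which matters because batch results should not be assumed to arrive in request order. Both helper names are hypothetical, and the result dicts stand in for parsed lines of a results file.

```python
def build_batch_requests(prompts, model="claude-sonnet-4-20250514", max_tokens=1024):
    """Turn a list of prompt strings into batch request entries with unique IDs."""
    return [
        {
            "custom_id": f"req-{i}",
            "params": {
                "model": model,
                "max_tokens": max_tokens,
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        for i, prompt in enumerate(prompts, start=1)
    ]


def index_by_custom_id(results):
    """Map custom_id to its result so lookups do not depend on ordering."""
    return {result["custom_id"]: result for result in results}


requests = build_batch_requests(
    ["What is the capital of France?", "What is the capital of Japan?"]
)
print([request["custom_id"] for request in requests])
```

The list returned by build_batch_requests can be passed as the requests argument of client.messages.batches.create.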
Citations
Claude can ground responses in source documents by providing detailed references. This is critical for legal, medical, or research applications where accuracy and provenance matter.
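As an illustration, a request that uses citations attaches a document block with citations enabled, and the response's text blocks then carry the cited passages. The sketch below builds such a block and formats citations from a hand-written response block, so it runs without calling the API. The field names (citations.enabled, cited_text, document_title) reflect my understanding of the citations documentation and should be treated as assumptions; the helper names are hypothetical.

```python
def cited_document(text: str, title: str) -> dict:
    """Build a plain-text document block with citations turned on."""
    return {
        "type": "document",
        "source": {"type": "text", "media_type": "text/plain", "data": text},
        "title": title,
        "citations": {"enabled": True},
    }


def format_citations(text_block: dict) -> list:
    """Render the citations attached to one response text block, if any."""
    return [
        f'"{citation["cited_text"]}" ({citation.get("document_title", "untitled")})'
        for citation in text_block.get("citations") or []
    ]


# Hand-written stand-in for one text block of a response (not real API output).
sample_block = {
    "type": "text",
    "text": "Grass is green.",
    "citations": [
        {"type": "char_location", "cited_text": "The grass is green.", "document_title": "Facts"}
    ],
}
print(format_citations(sample_block))
```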
2. Tools
Tools let Claude interact with the outside world. The API supports several built-in tools:
- Web fetch tool — Retrieve web pages
- Code execution tool — Run Python code in a sandbox
- Text editor tool — Edit files programmatically
- Computer use tool — Control a virtual desktop
- Bash tool — Execute shell commands
Example: Using the Web Fetch Tool
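# Web fetch is a server tool: Anthropic runs it, so it takes no description or input schema.
# Assumption: the versioned type and beta flag below match the web fetch docs; verify before use.
response = client.beta.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=2048,
    betas=["web-fetch-2025-09-10"],
    tools=[
        {
            "type": "web_fetch_20250910",
            "name": "web_fetch",
            "max_uses": 3
        }
    ],
    messages=[
        # Claude fetches only URLs that appear in the conversation, so include one
        {"role": "user", "content": "Summarize the latest AI news at https://www.anthropic.com/news"}
    ]
)

# Claude decides whether to call web_fetch; the reply mixes tool blocks and text blocks
for block in response.content:
    if block.type == "text":
        print(block.text)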
Parallel Tool Use
Claude can call multiple tools in parallel, speeding up complex workflows.
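When a single assistant turn contains several tool_use blocks, your code can run them concurrently and return all the results in one user message. The sketch below shows that pattern with a thread pool. The two tools and the hand-written content blocks are hypothetical stand-ins so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical local tools, keyed by tool name.
TOOLS = {
    "get_weather": lambda args: f"Sunny in {args['city']}",
    "get_time": lambda args: f"12:00 in {args['city']}",
}


def run_tool_calls(content_blocks: list) -> dict:
    """Run every tool_use block concurrently and build the follow-up user message."""
    calls = [block for block in content_blocks if block["type"] == "tool_use"]

    def run(call):
        output = TOOLS[call["name"]](call["input"])
        return {"type": "tool_result", "tool_use_id": call["id"], "content": output}

    with ThreadPoolExecutor() as pool:
        results = list(pool.map(run, calls))  # map preserves the order of calls
    return {"role": "user", "content": results}


# Hand-written stand-in for an assistant turn that requests two tools at once.
assistant_content = [
    {"type": "text", "text": "Let me check both."},
    {"type": "tool_use", "id": "toolu_1", "name": "get_weather", "input": {"city": "Paris"}},
    {"type": "tool_use", "id": "toolu_2", "name": "get_time", "input": {"city": "Tokyo"}},
]
print(run_tool_calls(assistant_content))
```

Each tool_result refers back to its request through tool_use_id, so the results can be matched even though the calls ran in parallel.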
Strict Tool Use
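Strict tool use constrains Claude's tool calls to the input schema you define, so arguments arrive with the expected fields and types. That reduces malformed calls in tool-driven applications. Forcing Claude to call one particular tool is a separate control, set with the tool_choice parameter.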
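Strict tool use is usually described in terms of schema conformance: tool arguments are guaranteed to match the tool's input_schema. The sketch below shows a tool definition marked strict, plus a tiny local check you might keep as a safety net. The strict flag name is an assumption based on my reading of the docs, the tool is hypothetical, and schema_problems is a deliberately minimal check, not a JSON Schema validator.

```python
weather_tool = {
    "name": "get_weather",
    "description": "Get the current weather for a city",
    "strict": True,  # assumption: flag name per the strict tool use docs; verify
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
        "additionalProperties": False,
    },
}


def schema_problems(tool: dict, tool_input: dict) -> list:
    """Report missing required keys and unexpected keys in a tool input."""
    schema = tool["input_schema"]
    problems = [f"missing: {key}" for key in schema.get("required", []) if key not in tool_input]
    if schema.get("additionalProperties") is False:
        problems += [f"unexpected: {key}" for key in tool_input if key not in schema["properties"]]
    return problems


print(schema_problems(weather_tool, {"city": "Paris"}))
print(schema_problems(weather_tool, {"town": "Paris"}))
```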
3. Tool Infrastructure
When building at scale, you need tool discovery and orchestration. The API provides:
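- Tool Runner (SDK): Automates the tool execution loop
- Server tools: Tools that Anthropic runs on its own servers, such as web fetch and code execution
- Fine-grained tool streaming: Stream tool call parameters incrementally
- Programmatic tool calling: Lets Claude call your tools from code it runs in the code execution environment, instead of one model round trip per call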
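Example: Forcing a Specific Tool
# Force Claude to call a specific tool with tool_choice instead of letting it choose.
# fetch_page is a hypothetical client-side tool: your code runs it and returns a tool_result.
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=[
        {
            "name": "fetch_page",
            "description": "Fetch a web page and return its text",
            "input_schema": {
                "type": "object",
                "properties": {"url": {"type": "string"}},
                "required": ["url"]
            }
        }
    ],
    tool_choice={"type": "tool", "name": "fetch_page"},
    messages=[
        {"role": "user", "content": "Summarize the documentation at https://docs.anthropic.com"}
    ]
)

# The reply contains a tool_use block holding the arguments Claude chose
tool_use = next(block for block in response.content if block.type == "tool_use")
print(tool_use.input)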
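To make the Tool Runner item above concrete, here is a minimal sketch of the loop that such a helper automates: send the conversation, run any requested tools, append the results, and repeat until the model stops asking for tools. This is not the SDK's implementation. The send callable, the add tool, and fake_send (a scripted stand-in for the API so the sketch runs offline) are all hypothetical.

```python
def run_tool_loop(send, tools, messages, max_turns=5):
    """Call `send` until the reply stops asking for tools, then return the reply."""
    for _ in range(max_turns):
        reply = send(messages)
        if reply["stop_reason"] != "tool_use":
            return reply
        # Echo the assistant turn, then answer each tool_use with a tool_result.
        messages.append({"role": "assistant", "content": reply["content"]})
        results = [
            {
                "type": "tool_result",
                "tool_use_id": block["id"],
                "content": tools[block["name"]](**block["input"]),
            }
            for block in reply["content"]
            if block["type"] == "tool_use"
        ]
        messages.append({"role": "user", "content": results})
    raise RuntimeError("tool loop did not finish within max_turns")


def fake_send(messages):
    """Scripted stand-in for the API: request one tool, then answer with its result."""
    last = messages[-1]
    if isinstance(last["content"], str):
        return {"stop_reason": "tool_use", "content": [
            {"type": "tool_use", "id": "toolu_1", "name": "add", "input": {"a": 2, "b": 3}}
        ]}
    result = last["content"][0]["content"]
    return {"stop_reason": "end_turn", "content": [{"type": "text", "text": f"The sum is {result}."}]}


history = [{"role": "user", "content": "What is 2 + 3?"}]
final = run_tool_loop(fake_send, {"add": lambda a, b: str(a + b)}, history)
print(final["content"][0]["text"])
```

The max_turns guard is a design choice worth keeping in real code, since a model that keeps requesting tools would otherwise loop indefinitely.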
4. Context Management
Long conversations consume tokens. Claude offers several features to manage context efficiently:
- Context windows — Up to 1M tokens for large documents
- Compaction — Reduce token usage without losing key information
- Prompt caching — Cache repeated system prompts or large documents to reduce latency and cost
- Token counting — Estimate token usage before sending
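To see why these features matter, consider what you would otherwise do by hand: drop old turns until the conversation fits a budget. The sketch below does exactly that. It is a simple client-side trim, not the API's compaction feature, and it uses a crude four-characters-per-token estimate as a stated assumption; the token counting endpoint gives real numbers. Both function names are hypothetical.

```python
def estimate_tokens(message: dict) -> int:
    # Crude assumption: about 4 characters per token. Use the token
    # counting endpoint when you need accurate figures.
    return max(1, len(message["content"]) // 4)


def trim_history(messages: list, budget: int) -> list:
    """Keep the most recent messages whose estimated size fits within budget."""
    kept, used = [], 0
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()
    # A conversation sent to the API should start with a user turn.
    while kept and kept[0]["role"] != "user":
        kept.pop(0)
    return kept


history = [
    {"role": "user", "content": "a" * 40},
    {"role": "assistant", "content": "b" * 40},
    {"role": "user", "content": "c" * 40},
    {"role": "assistant", "content": "d" * 40},
    {"role": "user", "content": "e" * 40},
]
print(len(trim_history(history, budget=30)))
```

Unlike this trim, which discards old turns outright, compaction aims to reduce tokens without losing key information.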
Example: Prompt Caching
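# cache_control marks the end of the prefix to cache.
# Note: prompts below the minimum cacheable length (about 1,024 tokens on most models)
# are processed without caching, so in practice cache a long system prompt or document.
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "You are a helpful assistant with expertise in Python programming.",
            "cache_control": {"type": "ephemeral"}
        }
    ],
    messages=[
        {"role": "user", "content": "Write a Python function to sort a list of dictionaries by a key"}
    ]
)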
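To check whether caching is actually working, look at the usage data returned with each response. The sketch below summarizes it, assuming the usage fields input_tokens, cache_creation_input_tokens, and cache_read_input_tokens, and assuming that input_tokens counts only the uncached part of the prompt. The function name cache_summary is hypothetical, and the sample numbers are made up for illustration.

```python
def cache_summary(usage: dict) -> dict:
    """Summarize how much of the prompt was served from the cache."""
    read = usage.get("cache_read_input_tokens") or 0
    written = usage.get("cache_creation_input_tokens") or 0
    fresh = usage.get("input_tokens") or 0
    total = read + written + fresh
    return {
        "total_input_tokens": total,
        "cache_hit_ratio": read / total if total else 0.0,
    }


# Made-up usage numbers for a second request that hits a warm cache.
sample_usage = {
    "input_tokens": 50,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 1950,
}
print(cache_summary(sample_usage))
```

With the SDK, a dict like sample_usage can be obtained from response.usage.model_dump(). A hit ratio near zero on repeated requests usually means the cached prefix is too short or changes between calls.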
5. Files and Assets
Claude can process various file types:
- PDF support — Extract text and layout from PDFs
- Images and vision — Analyze images (photos, diagrams, screenshots)
- Files API — Upload and reference files in conversations
Example: Processing a PDF
import base64

# Read the PDF and base64-encode it for the request body
with open("document.pdf", "rb") as f:
    pdf_data = base64.b64encode(f.read()).decode("utf-8")

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=2048,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "document",
                    "source": {
                        "type": "base64",
                        "media_type": "application/pdf",
                        "data": pdf_data
                    }
                },
                {
                    "type": "text",
                    "text": "Summarize this document in bullet points."
                }
            ]
        }
    ]
)
print(response.content[0].text)
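Images follow the same pattern as the PDF example, with an image block in place of the document block. The sketch below picks the block type from the file name and builds the base64 source for either kind of file. The helper name file_block is hypothetical, and the image type list reflects the formats I understand to be supported (JPEG, PNG, GIF, WebP).

```python
import base64
import mimetypes

IMAGE_TYPES = {"image/jpeg", "image/png", "image/gif", "image/webp"}


def file_block(filename: str, data: bytes) -> dict:
    """Build a document or image content block from raw file bytes."""
    media_type, _ = mimetypes.guess_type(filename)
    if media_type == "application/pdf":
        block_type = "document"
    elif media_type in IMAGE_TYPES:
        block_type = "image"
    else:
        raise ValueError(f"unsupported file type: {media_type}")
    return {
        "type": block_type,
        "source": {
            "type": "base64",
            "media_type": media_type,
            "data": base64.b64encode(data).decode("utf-8"),
        },
    }


block = file_block("diagram.png", b"not a real image")
print(block["type"], block["source"]["media_type"])
```

The returned block goes in the content list of a user message, next to a text block holding the question.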
Feature Availability Across Platforms
Not all features are available on every platform. The API docs classify features as:
- Beta — Preview, may change significantly
- Generally Available (GA) — Stable, production-ready
- Deprecated — Still functional, migration path provided
- Retired — No longer available
Availability is tracked across these platforms:
- Claude API (Anthropic first-party)
- Claude Platform on AWS (Anthropic-operated)
- Amazon Bedrock (AWS-operated)
- Vertex AI (Google-operated)
- Microsoft Foundry (Anthropic-operated on Azure)
Best Practices
- Start simple — Begin with model capabilities and tools before adding context management or complex infrastructure.
- Use batch processing for high volume — Save 50% on costs by batching asynchronous requests.
- Leverage prompt caching — Cache system prompts and large documents to reduce latency and token usage.
- Monitor token usage — Use the token counting endpoint to estimate costs before sending requests.
- Choose the right thinking mode — Use adaptive thinking for Opus 4.7, and fixed budgets for predictable reasoning depth.
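The 50% batch saving mentioned above is easy to put into numbers. The sketch below estimates request cost from token counts and per-million-token prices. The prices in the usage line are placeholders chosen for round arithmetic, not current rates, and the function name estimate_cost is hypothetical.

```python
BATCH_DISCOUNT = 0.5  # batch requests cost 50% less, per the guide


def estimate_cost(input_tokens, output_tokens, input_price, output_price, batch=False):
    """Estimate cost in USD; prices are USD per million tokens."""
    cost = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    return cost * BATCH_DISCOUNT if batch else cost


# Placeholder prices (3 and 15 USD per million tokens); check the pricing page.
standard = estimate_cost(1_000_000, 200_000, input_price=3, output_price=15)
batched = estimate_cost(1_000_000, 200_000, input_price=3, output_price=15, batch=True)
print(standard, batched)
```

Feeding this function the figures from the token counting endpoint gives a cost estimate before any request is sent.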
Key Takeaways
- Claude's API is organized into five areas: model capabilities, tools, tool infrastructure, context management, and files.
- Model capabilities include extended thinking, structured outputs, batch processing (50% cheaper), and citations.
- Tools enable Claude to fetch web pages, execute code, edit files, and control a virtual desktop.
- Context management features like prompt caching and compaction help keep long-running sessions efficient and cost-effective.
- Feature availability varies by platform (Claude API, Bedrock, Vertex AI, etc.), so always check the GA/Beta status before building.