Mastering Claude’s API: A Practical Guide to Features, Tools, and Context Management
Learn to navigate Claude's API surface—model capabilities, tools, context management, and files. Includes code examples and best practices for building production-ready applications.
This guide walks you through Claude’s five API feature areas: model capabilities, tools, tool infrastructure, context management, and files. You’ll learn how to control reasoning depth, use tools, manage long sessions, and handle documents—with practical Python examples.
Introduction
Claude’s API is more than just a text-in, text-out interface. It’s a rich ecosystem of features designed to help you build intelligent, scalable, and cost-effective applications. Whether you’re creating a customer support bot, a code assistant, or a document analysis tool, understanding the API’s five core areas will unlock Claude’s full potential.
This guide covers:
- Model capabilities – controlling reasoning and output format
- Tools – letting Claude act on the web or in your environment
- Tool infrastructure – discovery and orchestration at scale
- Context management – keeping long-running sessions efficient
- Files and assets – managing documents and data
---
1. Model Capabilities: Steering Claude’s Output
Model capabilities are the direct levers you pull to control how Claude thinks and responds. The key features include:
- Context windows – up to 1M tokens for processing large documents or long conversations
- Adaptive thinking – Claude dynamically decides when and how much to “think” (recommended for Opus 4.7)
- Structured outputs – enforce JSON schemas or other formats
- Batch processing – send large volumes of requests asynchronously at 50% cost savings
- Citations – ground responses in source documents with exact references
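The batch processing item above can be made concrete with a small payload builder. This is a minimal sketch, assuming the Message Batches request shape of a `custom_id` plus a `params` object per entry; the helper name `build_batch_requests` and the sample prompts are illustrative, not part of the SDK.

```python
from typing import Any


def build_batch_requests(
    prompts: dict[str, str], model: str, max_tokens: int = 1024
) -> list[dict[str, Any]]:
    """Turn {custom_id: prompt} into a `requests` list for a message batch."""
    return [
        {
            "custom_id": custom_id,  # used to match results back to inputs
            "params": {
                "model": model,
                "max_tokens": max_tokens,
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        for custom_id, prompt in prompts.items()
    ]


# Hypothetical prompts, for illustration only.
requests = build_batch_requests(
    {"doc-1": "Summarize document 1.", "doc-2": "Summarize document 2."},
    model="claude-sonnet-4-20250514",
)

# Submitting the batch needs an API key and network access:
# import anthropic
# batch = anthropic.Anthropic().messages.batches.create(requests=requests)
```

Building the payload separately from the submit call keeps the batch contents easy to inspect and unit test before any tokens are spent.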
Example: Using Adaptive Thinking with the Effort Parameter
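import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    # Model ID as given in this guide; confirm it against the current models list.
    model="claude-opus-4-7-20250417",
    max_tokens=1024,
    system="You are a helpful assistant that answers concisely.",
    messages=[
        {"role": "user", "content": "Explain quantum entanglement in simple terms."}
    ],
    # Adaptive thinking: Claude decides when and how much to think.
    # (A fixed budget_tokens belongs to manual "enabled" thinking instead.)
    thinking={"type": "adaptive"},
    # Effort is set in output_config, not inside thinking.
    # Options: low, medium, high. Requires a recent SDK version.
    output_config={"effort": "high"},
)

# Thinking blocks may precede the answer, so print only the text blocks.
for block in response.content:
    if block.type == "text":
        print(block.text)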
Best practice: Use effort to balance reasoning depth and latency. For simple Q&A, low is sufficient; for complex analysis, use high.
---
2. Tools: Letting Claude Take Action
Tools extend Claude’s capabilities beyond text generation. Claude can call functions you define, search the web, execute code, or even control a computer.
Tool Categories
| Tool Type | Example Use Case |
|---|---|
| Web search | Fetch real-time information from the internet |
| Code execution | Run Python code and Bash commands in a sandbox |
| File operations | Read, write, or transform files |
| Computer use | Control a virtual desktop (beta) |
| Custom tools | Your own API endpoints or database queries |
Example: Defining a Custom Tool
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=[
        {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "input_schema": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City name, e.g., 'San Francisco'"
                    }
                },
                "required": ["location"]
            }
        }
    ],
    messages=[
        {"role": "user", "content": "What's the weather in Tokyo?"}
    ]
)

# Claude will respond with a tool_use block
print(response.content)
Pro tip: Use parallel tool use to let Claude call multiple tools in a single turn—great for gathering data from several sources at once.
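Once Claude returns one or more `tool_use` blocks, your code runs the tools and sends the results back in a follow-up user message. The sketch below shows that step on plain dictionaries, assuming the content has been converted to dict form (for example with `response.model_dump()["content"]`). The `get_weather` stub, the `HANDLERS` table, and the sample content are hypothetical.

```python
from typing import Any, Callable


def get_weather(location: str) -> str:
    # Hypothetical stub; a real handler would call a weather service.
    return f"Sunny in {location} (placeholder data)"


# Maps tool names to the local functions that implement them.
HANDLERS: dict[str, Callable[..., str]] = {"get_weather": get_weather}


def build_tool_results(content: list[dict[str, Any]]) -> dict[str, Any]:
    """Run every tool_use block and return the follow-up user message."""
    results = []
    for block in content:
        if block.get("type") != "tool_use":
            continue  # skip text and other block types
        handler = HANDLERS.get(block["name"])
        if handler is None:
            results.append({
                "type": "tool_result",
                "tool_use_id": block["id"],
                "content": f"Unknown tool: {block['name']}",
                "is_error": True,
            })
        else:
            results.append({
                "type": "tool_result",
                "tool_use_id": block["id"],
                "content": handler(**block["input"]),
            })
    return {"role": "user", "content": results}


# Sample assistant content with a single tool call (made up for illustration).
example_content = [
    {"type": "text", "text": "Let me check the weather."},
    {"type": "tool_use", "id": "toolu_example", "name": "get_weather",
     "input": {"location": "Tokyo"}},
]
follow_up = build_tool_results(example_content)
```

To continue the conversation, append the assistant turn and then `follow_up` to `messages` and call the API again. Because the function loops over every block, it also covers the parallel case where Claude requests several tools in one turn.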
---
3. Tool Infrastructure: Discovery and Orchestration
When you have many tools, you need a way to manage them efficiently. Claude’s tool infrastructure includes:
- Tool Runner (SDK) – automatically handles tool call execution and result injection
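- Strict tool use – guarantees that the inputs Claude passes to a tool conform to the tool's JSON schema (to force a specific tool, for example for routing, use the tool_choice parameter)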
- Tool search – dynamically discover tools based on user intent
- Fine-grained tool streaming – stream tool call parameters as they are generated, without waiting for the full JSON input to be buffered and validated
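For routing, where every request must go through one particular tool, the relevant request parameter is `tool_choice`. The sketch below adds it to a set of request keyword arguments; `force_tool` and `base_request` are illustrative names, and the guard against unknown tool names is a design choice of this example rather than API behavior.

```python
from typing import Any


def force_tool(request: dict[str, Any], tool_name: str) -> dict[str, Any]:
    """Return a copy of the request kwargs that requires a call to `tool_name`."""
    known = {tool["name"] for tool in request.get("tools", [])}
    if tool_name not in known:
        raise ValueError(f"{tool_name!r} is not a defined tool: {sorted(known)}")
    # The original dict is left untouched; only the copy gains tool_choice.
    return {**request, "tool_choice": {"type": "tool", "name": tool_name}}


base_request = {
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "tools": [
        {
            "name": "get_weather",
            "description": "Get current weather",
            "input_schema": {
                "type": "object",
                "properties": {"location": {"type": "string"}},
                "required": ["location"],
            },
        }
    ],
    "messages": [{"role": "user", "content": "Weather in Paris?"}],
}
routed = force_tool(base_request, "get_weather")

# Sending the request needs an API key and network access:
# import anthropic
# response = anthropic.Anthropic().messages.create(**routed)
```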
Example: Using Tool Runner
import anthropic
from anthropic import beta_tool

client = anthropic.Anthropic()


# Define your tool as a Python function; the decorator derives the
# tool schema from the type hints and docstring.
@beta_tool
def get_weather(location: str) -> str:
    """Get current weather.

    Args:
        location: City name, e.g. 'Paris'
    """
    # Placeholder result; replace with a real weather lookup.
    return f"Sunny in {location}"


# Create a runner that automatically handles tool calls
# (the tool runner is a beta feature of the SDK)
runner = client.beta.messages.tool_runner(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=[get_weather],
    messages=[{"role": "user", "content": "Weather in Paris?"}],
)

# Loop until Claude stops calling tools, then print the final answer
final_message = runner.until_done()
print(final_message.content[0].text)
Best practice: Use Tool Runner for multi-turn interactions where Claude may need to call tools multiple times to fulfill a request.
---
4. Context Management: Keeping Sessions Efficient
Long conversations or large documents can quickly consume tokens. Claude offers several features to manage context:
- Context windows – up to 1M tokens (Sonnet and Opus models)
- Prompt caching – reuse common prefixes (system prompts, large documents) to reduce latency and cost
- Compaction – summarize or compress older messages to stay within context limits
- Context editing – selectively remove or modify parts of the conversation
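The idea behind compaction can be illustrated on the client side. The following is only an approximation under stated assumptions, not the platform's built-in compaction feature: messages are plain-text dictionaries, the `summarize` callable is supplied by you (in practice it might be another model call), and tool_use/tool_result pairs are not handled.

```python
from typing import Callable

Message = dict[str, str]


def compact_history(
    messages: list[Message],
    keep_last: int,
    summarize: Callable[[list[Message]], str],
) -> list[Message]:
    """Replace all but the last `keep_last` messages with one summary message."""
    if keep_last <= 0:
        raise ValueError("keep_last must be positive")
    if len(messages) <= keep_last:
        return list(messages)  # nothing to compact
    older, recent = messages[:-keep_last], messages[-keep_last:]
    summary: Message = {
        "role": "user",
        "content": "Summary of the earlier conversation: " + summarize(older),
    }
    return [summary, *recent]


def naive_summary(older: list[Message]) -> str:
    # Hypothetical stand-in for a real summarizer.
    return f"{len(older)} earlier messages omitted."


history = [
    {"role": "user" if i % 2 == 0 else "assistant", "content": f"message {i}"}
    for i in range(10)
]
compacted = compact_history(history, keep_last=4, summarize=naive_summary)
```

Keeping the most recent turns verbatim preserves the details Claude is most likely to need, while the summary keeps older context at a fraction of the token cost.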
Example: Using Prompt Caching
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            # In practice, place the long instructions or reference documents here:
            # a prefix shorter than the model's minimum cacheable length is not cached.
            "text": "You are a legal document analyst. Answer based on the provided documents.",
            "cache_control": {"type": "ephemeral"}
        }
    ],
    messages=[
        {"role": "user", "content": "Summarize the key clauses in this contract."}
    ]
)

print(response.usage)
# Note: cache_creation_input_tokens and cache_read_input_tokens will appear
Cost-saving tip: Cache large system prompts or reference documents. Subsequent calls with the same prefix will be faster and cheaper.
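A quick back-of-the-envelope calculation shows why caching pays off. The sketch below assumes cache writes are billed at 1.25x and cache reads at 0.1x the base input price, and that every call lands within the cache lifetime; treat both multipliers as assumptions to check against current pricing. Costs are in relative units, where one uncached input token costs 1.

```python
def relative_prefix_cost(
    prefix_tokens: int,
    calls: int,
    write_multiplier: float = 1.25,  # assumed price of the first (cache write) call
    read_multiplier: float = 0.10,   # assumed price of later (cache read) calls
) -> tuple[float, float]:
    """Return (uncached, cached) cost of a shared prefix across `calls` requests."""
    if calls < 1:
        raise ValueError("calls must be at least 1")
    uncached = float(prefix_tokens * calls)
    cached = prefix_tokens * (write_multiplier + read_multiplier * (calls - 1))
    return uncached, cached


# Hypothetical workload: a 50,000-token reference document reused in 10 calls.
uncached, cached = relative_prefix_cost(prefix_tokens=50_000, calls=10)
savings = 1 - cached / uncached
print(f"uncached={uncached:,.0f} cached={cached:,.0f} savings={savings:.1%}")
```

With a single call the cached variant is slightly more expensive because of the write premium, so caching only helps when the same prefix is actually reused.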
---
5. Files and Assets: Working with Documents
Claude can process a variety of file types, including:
- PDFs – extract text, tables, and images
- Images – vision analysis (JPG, PNG, GIF, WebP)
- Code files – review and analysis of source code sent as plain text
- Spreadsheets – CSV, Excel (via conversion)
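PDFs are sent inline as base64 inside a `document` content block. A minimal sketch of building that block from raw bytes follows; the helper name `pdf_document_block` and the placeholder bytes are illustrative, and the block shape mirrors the request example in this section.

```python
import base64
from typing import Any


def pdf_document_block(pdf_bytes: bytes, citations: bool = False) -> dict[str, Any]:
    """Wrap raw PDF bytes in a base64 `document` content block."""
    block: dict[str, Any] = {
        "type": "document",
        "source": {
            "type": "base64",
            "media_type": "application/pdf",
            "data": base64.standard_b64encode(pdf_bytes).decode("ascii"),
        },
    }
    if citations:
        block["citations"] = {"enabled": True}
    return block


# Placeholder bytes stand in for a real file here; in practice you would
# read them from disk, e.g. pathlib.Path("report.pdf").read_bytes().
block = pdf_document_block(b"%PDF-1.4 placeholder", citations=True)
```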
Example: Processing a PDF with Citations
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "document",
                    "source": {
                        "type": "base64",
                        "media_type": "application/pdf",
                        "data": "<base64-encoded-pdf>"
                    },
                    "citations": {"enabled": True}
                },
                {
                    "type": "text",
                    "text": "What is the main conclusion of this report?"
                }
            ]
        }
    ]
)

# Citations will include page numbers and exact text snippets
print(response.content)
Note: Citations are especially useful for legal, academic, or compliance use cases where you need to verify Claude’s answers against source material.
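To verify answers against the source, you can pull the citations out of the response content. The sketch below works on the dict form of the content (for example `response.model_dump()["content"]`) and assumes PDF citations carry `cited_text`, `start_page_number`, and `end_page_number` fields; the sample content is made up for illustration.

```python
from typing import Any


def collect_citations(content: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Pair each cited text block with the quotes and pages that support it."""
    found = []
    for block in content:
        if block.get("type") != "text":
            continue
        # Blocks without citations may omit the key or set it to None.
        for citation in block.get("citations") or []:
            found.append({
                "claim": block["text"],
                "quote": citation.get("cited_text"),
                "pages": (
                    citation.get("start_page_number"),
                    citation.get("end_page_number"),
                ),
            })
    return found


# Made-up sample content; real responses come from the API.
example_content = [
    {"type": "text", "text": "The report concludes that ", "citations": None},
    {
        "type": "text",
        "text": "revenue grew 12% year over year.",
        "citations": [
            {
                "type": "page_location",
                "cited_text": "Revenue increased 12% compared with the prior year.",
                "document_index": 0,
                "start_page_number": 3,
                "end_page_number": 4,
            }
        ],
    },
]
citations = collect_citations(example_content)
```

Each entry links a claim in the answer to the exact quote and page range, which is the form a reviewer or audit log typically needs.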
---
Feature Availability by Platform
Not all features are available everywhere. Here’s a quick reference:
| Feature | Claude API | AWS | Bedrock | Vertex AI |
|---|---|---|---|---|
| 1M context | GA | GA | GA | GA |
| Adaptive thinking | GA | GA | GA | GA |
| Batch processing | GA | GA | GA | GA |
| Citations | GA | GA | GA | Beta |
| Prompt caching | GA | GA | GA | GA |
| Computer use | Beta | Beta | Beta | — |
---
Putting It All Together: A Practical Workflow
Here’s a real-world pattern combining multiple features:
- Send a large PDF (context management + files)
- Ask Claude to analyze it with citations (model capabilities)
- Let Claude call a custom tool to look up additional data (tools)
- Cache the system prompt to save costs (context management)
- Stream the response for a better user experience
import anthropic

client = anthropic.Anthropic()

with client.messages.stream(
    model="claude-sonnet-4-20250514",
    max_tokens=2048,
    system=[
        {
            "type": "text",
            "text": "You are a financial analyst. Answer with citations.",
            "cache_control": {"type": "ephemeral"}
        }
    ],
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "document",
                    "source": {
                        "type": "base64",
                        "media_type": "application/pdf",
                        "data": "<base64-pdf>"
                    },
                    "citations": {"enabled": True}
                },
                {
                    "type": "text",
                    "text": "What are the top three risks mentioned?"
                }
            ]
        }
    ],
    tools=[
        {
            "name": "get_stock_price",
            "description": "Get current stock price for a ticker",
            "input_schema": {
                "type": "object",
                "properties": {
                    "ticker": {"type": "string"}
                },
                "required": ["ticker"]
            }
        }
    ]
) as stream:
    for event in stream:
        # Deltas can also carry tool input JSON or citations, which have no
        # .text attribute, so print text deltas only.
        if event.type == "content_block_delta" and event.delta.type == "text_delta":
            print(event.delta.text, end="")
---
Key Takeaways
- Claude’s API is organized into five areas: model capabilities, tools, tool infrastructure, context management, and files. Start with capabilities and tools, then optimize with the others.
- Adaptive thinking lets you control reasoning depth: use the effort parameter to balance quality and speed.
- Tools extend Claude beyond text: web search, code execution, custom functions, and even computer control are available.
- Prompt caching and batch processing are your best friends for reducing cost and latency at scale.
- Citations are essential for any application that requires verifiable, grounded answers—especially in regulated industries.