Mastering the Claude API: A Complete Guide to Features, Tools, and Context Management
Explore Claude's API surface including model capabilities, tools, context management, and file handling. Learn practical implementation with code examples for building production-ready AI applications.
This guide walks you through Claude's five API areas: model capabilities, tools, tool infrastructure, context management, and files. You'll learn how to control reasoning, use tools, manage context windows, and handle documents—with code examples for each.
Claude's API is more than just a text generation endpoint. It's a comprehensive platform designed to give you fine-grained control over how Claude reasons, interacts with external systems, and processes information. Whether you're building a simple chatbot or a complex agentic workflow, understanding the five core areas of the API surface is essential.
This guide covers everything you need to know to build production-ready applications with Claude. We'll explore model capabilities, tools, context management, and file handling—with practical code examples you can use today.
The Five Pillars of the Claude API
Claude's API surface is organized into five areas:
- Model capabilities – Control how Claude reasons and formats responses
- Tools – Let Claude take actions on the web or in your environment
- Tool infrastructure – Handle discovery and orchestration at scale
- Context management – Keep long-running sessions efficient
- Files and assets – Manage documents and data you provide to Claude
Model Capabilities: Steering Claude's Output
Model capabilities give you direct control over Claude's reasoning depth, response format, and input modalities. These are the building blocks for any application.
Extended Thinking and Adaptive Thinking
Claude supports extended thinking—letting the model reason step-by-step before producing a final answer. With adaptive thinking, Claude dynamically decides when and how much to think. This is the recommended mode for Claude Opus 4.7.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-opus-4-20250514",
    max_tokens=4096,  # must be larger than budget_tokens
    thinking={
        "type": "enabled",
        "budget_tokens": 2048,  # Max tokens for thinking
    },
    messages=[
        {"role": "user", "content": "Solve this complex math problem step by step: 15! / (12! * 3!)"}
    ],
)

# The thinking content arrives in its own content block, ahead of the answer
for block in response.content:
    if block.type == "thinking":
        print(block.thinking)
    elif block.type == "text":
        print(block.text)
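Two constraints on a fixed thinking budget are easy to trip over: the budget has a minimum, and it must stay below max_tokens. The guard below is a minimal sketch of that check. `build_thinking_config` is a hypothetical helper, not part of the SDK, and the 1,024-token minimum is an assumption to confirm against the current API reference.

```python
MIN_THINKING_BUDGET = 1024  # assumed minimum; confirm against the current API reference


def build_thinking_config(max_tokens: int, budget_tokens: int) -> dict:
    """Return a thinking block for messages.create, or raise if the budget is invalid."""
    if budget_tokens < MIN_THINKING_BUDGET:
        raise ValueError(f"budget_tokens must be at least {MIN_THINKING_BUDGET}")
    if budget_tokens >= max_tokens:
        raise ValueError("budget_tokens must be smaller than max_tokens")
    return {"type": "enabled", "budget_tokens": budget_tokens}


# Same numbers as the request above
print(build_thinking_config(max_tokens=4096, budget_tokens=2048))
```

Validating before the call turns a rejected request into a local error that names the offending value.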
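For adaptive thinking, set the thinking type to adaptive and steer the depth with the effort parameter, which is passed separately from the thinking block. Both need a model that supports them:
response = client.messages.create(
    model="claude-opus-4-20250514",  # replace with a model that supports adaptive thinking
    max_tokens=4096,
    # Parameter shapes assumed from the current Messages API; verify them against the API reference
    thinking={"type": "adaptive"},
    output_config={"effort": "high"},  # Options include low, medium, high
    messages=[
        {"role": "user", "content": "Analyze the pros and cons of quantum computing for cryptography."}
    ],
)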
Structured Outputs
For production applications, you often need Claude to return structured data. Structured outputs constrain the response to a JSON schema that you pass in the request:
response = client.messages.create(
    model="claude-sonnet-4-20250514",  # replace with a model that supports structured outputs
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Extract the key entities from this text: 'Apple acquired the startup for $500 million in 2023.'"}
    ],
    # Parameter shape assumed from the current Messages API; verify it against the API reference
    output_config={
        "format": {
            "type": "json_schema",
            "schema": {
                "type": "object",
                "properties": {
                    "entities": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "name": {"type": "string"},
                                "type": {"type": "string", "enum": ["company", "person", "amount", "date"]},
                                "value": {"type": "string"},
                            },
                            "required": ["name", "type", "value"],
                            "additionalProperties": False,
                        },
                    }
                },
                "required": ["entities"],
                "additionalProperties": False,
            },
        }
    },
)

# The first text block holds JSON that conforms to the schema
print(response.content[0].text)
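The structured response arrives as JSON text, so the application still has to parse it before use. The sketch below parses that text and re-checks the fields and enum values from the schema, which is a cheap safeguard at the boundary of your own code. `parse_entities` is a hypothetical helper, and the sample string is hand-written rather than real model output.

```python
import json

ALLOWED_TYPES = {"company", "person", "amount", "date"}
REQUIRED_FIELDS = {"name", "type", "value"}


def parse_entities(raw_json: str) -> list[dict]:
    """Parse the JSON text of a response and check it against the entity schema."""
    payload = json.loads(raw_json)
    entities = payload["entities"]
    for entity in entities:
        missing = REQUIRED_FIELDS - entity.keys()
        if missing:
            raise ValueError(f"entity is missing fields: {sorted(missing)}")
        if entity["type"] not in ALLOWED_TYPES:
            raise ValueError(f"unexpected entity type: {entity['type']}")
    return entities


# Hand-written sample standing in for response.content[0].text
sample = (
    '{"entities": ['
    '{"name": "Apple", "type": "company", "value": "Apple"}, '
    '{"name": "price", "type": "amount", "value": "$500 million"}]}'
)
for entity in parse_entities(sample):
    print(entity["type"], "->", entity["value"])
```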
Citations for Grounded Responses
When Claude needs to reference source documents, use the Citations feature. This grounds responses in specific passages, making outputs more verifiable and trustworthy.
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                # Source documents travel as content blocks inside the user turn
                {
                    "type": "document",
                    "source": {
                        "type": "text",
                        "media_type": "text/plain",
                        "data": "...",
                    },
                    "title": "Q3 Financial Report",
                    "citations": {"enabled": True},
                },
                {
                    "type": "text",
                    "text": "Based on the attached report, what were the Q3 revenue figures?",
                },
            ],
        }
    ],
)

# Citations appear on the text blocks of the response
for block in response.content:
    if block.type == "text" and block.citations:
        for citation in block.citations:
            # Plain-text documents yield character-offset citations
            print(f"Cited: {citation.document_title} - {citation.start_char_index}:{citation.end_char_index}")
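A citation that points at a character range can be verified by slicing the source text with the same offsets. The sketch below assumes zero-based start offsets and exclusive end offsets, which you should confirm for the citation type your documents produce. `quote_for` is a hypothetical helper and the report text is made up for illustration.

```python
def quote_for(document_text: str, start: int, end: int) -> str:
    """Return the passage that a character-range citation points at."""
    if not 0 <= start < end <= len(document_text):
        raise ValueError("citation range falls outside the document")
    return document_text[start:end]


# Made-up sample text, not taken from a real report
report = "Q3 revenue was $4.2M. Q2 revenue was $3.9M."
print(quote_for(report, 0, 21))
```

Showing the quoted passage next to the answer lets a reviewer confirm a claim without opening the full document.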
Tools: Let Claude Take Action
Tools are how Claude interacts with the outside world. The API supports several built-in tools and custom tool definitions.
Using Built-in Tools
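Anthropic defines several tools you can enable. Web search and code execution run on Anthropic's servers; computer use, the text editor, and memory are executed by your application, which sends the results back to Claude: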
- Web search tool – Search the internet for current information
- Code execution tool – Run Python code in a sandboxed environment
- Computer use tool – Control a virtual desktop (beta)
- Text editor tool – Read and write files in a workspace
- Memory tool – Store and retrieve information across conversations
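response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=4096,
    tools=[
        # Built-in tools use versioned type strings. The versions below are examples;
        # check the tool reference for current values and any required beta header.
        {
            "type": "web_search_20250305",
            "name": "web_search",
        },
        {
            "type": "code_execution_20250825",
            "name": "code_execution",
        },
    ],
    messages=[
        {"role": "user", "content": "Search for the latest Claude API updates and then write a Python script to test the streaming feature."}
    ],
)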
Custom Tool Definitions
You can define your own tools using a JSON schema. This is how you connect Claude to your own APIs, databases, or business logic.
import json

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=[
        {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "input_schema": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "City name, e.g., San Francisco",
                    },
                    "units": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit",
                    },
                },
                "required": ["city"],
            },
        }
    ],
    messages=[
        {"role": "user", "content": "What's the weather in Tokyo?"}
    ],
)

# Handle tool calls
for block in response.content:
    if block.type == "tool_use":
        tool_name = block.name
        tool_input = block.input
        # Call your actual API here
        print(f"Tool called: {tool_name} with {json.dumps(tool_input)}")
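Detecting a tool call is only half of the exchange: the result has to go back to Claude in a user message containing a tool_result block that echoes the call's ID. The sketch below shows that second half. `TOOL_HANDLERS` and `run_tool_call` are hypothetical names, the weather data is canned, and the tool_use block is written as a plain dict for illustration (the SDK returns objects with the same fields).

```python
import json


def get_weather(city: str, units: str = "celsius") -> dict:
    """Stand-in for a real weather lookup; returns canned data."""
    return {"city": city, "temperature": 21, "units": units}


# Maps tool names from the request to local Python callables
TOOL_HANDLERS = {"get_weather": get_weather}


def run_tool_call(block: dict) -> dict:
    """Execute one tool_use block and wrap the outcome as a tool_result block."""
    handler = TOOL_HANDLERS.get(block["name"])
    if handler is None:
        return {
            "type": "tool_result",
            "tool_use_id": block["id"],
            "content": f"Unknown tool: {block['name']}",
            "is_error": True,
        }
    result = handler(**block["input"])
    return {
        "type": "tool_result",
        "tool_use_id": block["id"],
        "content": json.dumps(result),
    }


# Example tool_use block with a placeholder ID
tool_use = {"type": "tool_use", "id": "toolu_example", "name": "get_weather", "input": {"city": "Tokyo"}}
follow_up = {"role": "user", "content": [run_tool_call(tool_use)]}
print(follow_up)
```

Append the assistant turn and then `follow_up` to the message list and call the API again so Claude can turn the tool output into a final answer.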
Parallel Tool Use
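Claude can request several tool calls in a single response, which is critical for efficiency in agentic workflows. Parallel tool use is on by default, so the only flag involved is the one that turns it off:
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=4096,
    tools=[tool1, tool2, tool3],  # tool definitions shaped like get_weather above
    # No opt-in is needed for parallel calls. To force one call at a time, pass:
    # tool_choice={"type": "auto", "disable_parallel_tool_use": True}
    messages=[
        {"role": "user", "content": "Check the weather in London, Paris, and Berlin simultaneously."}
    ],
)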
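When one response contains several tool_use blocks, the application can run them concurrently and return every result in a single user message. The sketch below does this with a thread pool. `fetch_weather` and `run_in_parallel` are hypothetical, the tool calls are plain dicts with placeholder IDs, and the lookup is a stub standing in for a slow network call.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_weather(city: str) -> str:
    """Placeholder for a slow network call."""
    return f"Weather for {city}: sunny"


def run_in_parallel(tool_calls: list[dict]) -> dict:
    """Run every tool call concurrently and return one user message with all results."""
    with ThreadPoolExecutor(max_workers=max(1, len(tool_calls))) as pool:
        # pool.map preserves input order, so outputs line up with tool_calls
        outputs = list(pool.map(lambda call: fetch_weather(**call["input"]), tool_calls))
    results = [
        {"type": "tool_result", "tool_use_id": call["id"], "content": output}
        for call, output in zip(tool_calls, outputs)
    ]
    return {"role": "user", "content": results}


calls = [
    {"id": "toolu_1", "name": "get_weather", "input": {"city": "London"}},
    {"id": "toolu_2", "name": "get_weather", "input": {"city": "Paris"}},
    {"id": "toolu_3", "name": "get_weather", "input": {"city": "Berlin"}},
]
message = run_in_parallel(calls)
print(len(message["content"]))
```

Threads suit I/O-bound tools such as HTTP lookups; CPU-heavy tools would need a process pool or an external worker instead.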
Context Management: Keeping Sessions Efficient
Long-running conversations can consume significant tokens. Claude's context management features help you stay efficient.
Context Windows and Compaction
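Some Claude models support context windows of up to 1 million tokens, enough to process entire codebases or lengthy documents; the limit depends on the model and platform. For ongoing sessions, context compaction replaces older messages with a summary so the history stays within budget. A system prompt can prepare Claude for summarized history, but it does not shrink the message list by itself:
# Tell Claude that earlier turns may arrive in summarized form.
# The summarizing and pruning still has to happen before the request is sent.
system_prompt = """
You are a helpful assistant. When the conversation becomes very long,
earlier parts may be replaced by a summary. Treat that summary as the conversation history.
"""

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=4096,
    system=system_prompt,
    messages=[
        # ... many messages
    ],
)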
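One way to do the pruning on the client side is to keep the most recent turns verbatim and fold everything older into a single summary message. This is a minimal sketch of that idea, not the API's own compaction mechanism. `compact_history` is a hypothetical helper, and the summarizer is injected so a real model call can replace the naive stand-in used here.

```python
from typing import Callable

Message = dict  # {"role": "user" | "assistant", "content": str}


def compact_history(
    messages: list[Message],
    summarize: Callable[[list[Message]], str],
    keep_recent: int = 4,
) -> list[Message]:
    """Replace all but the last keep_recent messages with a single summary turn."""
    if keep_recent < 1:
        raise ValueError("keep_recent must be at least 1")
    if len(messages) <= keep_recent:
        return list(messages)
    older, recent = messages[:-keep_recent], messages[-keep_recent:]
    summary = {
        "role": "user",
        "content": f"Summary of earlier conversation: {summarize(older)}",
    }
    return [summary] + recent


def naive_summary(older: list[Message]) -> str:
    """Stand-in summarizer; a real one would ask a model to condense the turns."""
    return f"{len(older)} earlier messages omitted"


history = [
    {"role": "user" if i % 2 == 0 else "assistant", "content": f"turn {i}"}
    for i in range(10)
]
compacted = compact_history(history, naive_summary)
print(len(compacted))
```

Keeping the newest turns untouched preserves the exact wording Claude is most likely to need, while the summary carries the older context at a fraction of the token cost.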
Prompt Caching
For repeated system prompts or large context blocks, prompt caching reduces latency and cost. Cache frequently used content:
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            # Kept short for readability; prompts below the model's minimum
            # cacheable length are processed without caching
            "text": "You are a customer support agent for Acme Corp.",
            "cache_control": {"type": "ephemeral"},  # Cache this system prompt
        }
    ],
    messages=[
        {"role": "user", "content": "How do I reset my password?"}
    ],
)
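To confirm that caching is taking effect, compare the cache counters reported in each response's usage data. The sketch below computes the share of input tokens served from the cache. `cache_hit_ratio` is a hypothetical helper, it takes a plain dict (for an SDK response, something like `response.usage.model_dump()` should produce one), and the numbers are illustrative rather than measured.

```python
def cache_hit_ratio(usage: dict) -> float:
    """Share of input tokens that were served from the prompt cache."""
    read = usage.get("cache_read_input_tokens") or 0
    created = usage.get("cache_creation_input_tokens") or 0
    uncached = usage.get("input_tokens") or 0
    total = read + created + uncached
    return read / total if total else 0.0


# Illustrative numbers: the first call writes the cache, the second reads it
first_call = {"input_tokens": 20, "cache_creation_input_tokens": 1500, "cache_read_input_tokens": 0}
second_call = {"input_tokens": 20, "cache_creation_input_tokens": 0, "cache_read_input_tokens": 1500}

print(cache_hit_ratio(first_call))
print(round(cache_hit_ratio(second_call), 3))
```

A ratio that stays at zero across repeated calls usually means the cached prefix is too short or changes between requests.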
Token Counting
Always monitor your token usage to avoid surprises:
# Count tokens before sending
count = client.messages.count_tokens(
    model="claude-sonnet-4-20250514",
    messages=[
        {"role": "user", "content": "Hello, how are you?"}
    ],
)
print(f"Input tokens: {count.input_tokens}")
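A token count is most useful when it is compared against a limit before the request goes out. The sketch below checks whether the prompt plus the requested output fits in the context window. Both helpers are hypothetical, and the 200,000-token window is an assumed value that should be replaced with the figure for your model.

```python
CONTEXT_WINDOW = 200_000  # assumed window size; look up the value for your model


def remaining_headroom(
    input_tokens: int, max_output_tokens: int, context_window: int = CONTEXT_WINDOW
) -> int:
    """Tokens left after reserving room for the prompt and the requested output."""
    return context_window - input_tokens - max_output_tokens


def fits_in_context(
    input_tokens: int, max_output_tokens: int, context_window: int = CONTEXT_WINDOW
) -> bool:
    """True when the prompt and the requested output fit in the window together."""
    return remaining_headroom(input_tokens, max_output_tokens, context_window) >= 0


# In practice input_tokens would come from count.input_tokens
print(fits_in_context(input_tokens=150_000, max_output_tokens=4096))
print(fits_in_context(input_tokens=198_000, max_output_tokens=4096))
```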
Working with Files and Assets
Claude can process various file types, including PDFs, images, and code files.
PDF Support
import base64

# Read the PDF and encode it for transport in the request body
with open("report.pdf", "rb") as f:
    pdf_data = base64.b64encode(f.read()).decode("utf-8")

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=4096,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "document",
                    "source": {
                        "type": "base64",
                        "media_type": "application/pdf",
                        "data": pdf_data,
                    },
                },
                {
                    "type": "text",
                    "text": "Summarize this PDF.",
                },
            ],
        }
    ],
)
Image and Vision
Claude can analyze images for tasks like object detection, OCR, or visual reasoning:
# Requires `import base64`, as in the PDF example
with open("chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data,
                    },
                },
                {
                    "type": "text",
                    "text": "What does this chart show?",
                },
            ],
        }
    ],
)
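The PDF and image requests differ only in the block type and media type, so one helper can build either block from a file path. The sketch below picks the block type from the guessed MIME type. `file_to_block` is a hypothetical helper, the list of accepted image types is an assumption to check against the vision documentation, and the demo writes placeholder bytes rather than a real image.

```python
import base64
import mimetypes
import tempfile
from pathlib import Path

# Assumed set of accepted image types; confirm against the vision documentation
IMAGE_TYPES = {"image/jpeg", "image/png", "image/gif", "image/webp"}


def file_to_block(path: str) -> dict:
    """Build an image or document content block from a local file."""
    media_type, _ = mimetypes.guess_type(path)
    if media_type in IMAGE_TYPES:
        block_type = "image"
    elif media_type == "application/pdf":
        block_type = "document"
    else:
        raise ValueError(f"unsupported file type: {media_type}")
    data = base64.b64encode(Path(path).read_bytes()).decode("utf-8")
    return {
        "type": block_type,
        "source": {"type": "base64", "media_type": media_type, "data": data},
    }


# Demo with a throwaway file; the bytes are a placeholder, not a valid PNG
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "chart.png"
    sample.write_bytes(b"placeholder bytes")
    block = file_to_block(str(sample))
    print(block["type"], block["source"]["media_type"])
```

The returned block can be placed in a message's content list ahead of the text prompt, as in the requests above.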
Feature Availability and Lifecycle
Not all features are available on every platform. Claude's features follow a lifecycle:
| Classification | Description |
|---|---|
| Beta | Preview features for feedback. May change significantly. Not for production. |
| Generally Available (GA) | Stable, fully supported, recommended for production. |
| Deprecated | Still functional but not recommended. Migration path provided. |
| Retired | No longer available. |
Best Practices for Production
- Start simple – Begin with model capabilities and tools before adding complex infrastructure.
- Use streaming – For responsive UIs, enable streaming to get partial results.
- Monitor token usage – Use token counting and prompt caching to manage costs.
- Handle errors gracefully – Implement retry logic for rate limits and timeouts.
- Test with different models – Claude Opus 4.7 excels at complex reasoning; Claude Sonnet 4 is faster and cheaper for simpler tasks.
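The retry advice above can be sketched as exponential backoff with jitter. `with_retries` and `TransientError` are hypothetical; with the real SDK you would catch its rate-limit and timeout exceptions instead, and note that the Python SDK already performs a small number of automatic retries that you can tune through its `max_retries` client option.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Stand-in for the rate-limit and timeout errors raised by your client."""


def with_retries(
    call: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call, retrying transient failures with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))  # jitter spreads out retries
    raise AssertionError("unreachable")


attempts = {"count": 0}


def flaky_request() -> str:
    """Fails twice, then succeeds, to exercise the retry path."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransientError("rate limited")
    return "ok"


# Sleep is stubbed out so the demo finishes instantly
print(with_retries(flaky_request, sleep=lambda _seconds: None))
```

Injecting `sleep` keeps the helper testable, and the jitter prevents many clients from retrying in lockstep after a shared rate-limit event.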
Key Takeaways
- Claude's API is organized into five core areas: model capabilities, tools, tool infrastructure, context management, and files/assets.
- Use extended thinking and adaptive thinking to control reasoning depth, and structured outputs for reliable JSON responses.
- Tools (both built-in and custom) let Claude interact with external systems, with support for parallel calls.
- Context management features like prompt caching and compaction help keep long-running sessions efficient and cost-effective.
- Claude supports multiple file types including PDFs and images, making it suitable for document analysis and visual reasoning tasks.