Mastering the Claude API: A Complete Guide to Features, Tools, and Workflows
This guide walks you through Claude's five API areas: model capabilities, tools, tool infrastructure, context management, and files. You'll learn how to steer reasoning, use tools, manage long sessions, and process documents—with code examples for each.
Claude's API is more than a simple text-in, text-out interface. It's a rich platform organized into five core areas that let you control reasoning, take actions, manage context, and handle files. Whether you're building a chatbot, an agent, or a document analysis pipeline, understanding these areas is key to getting the most out of Claude.
This guide covers each area with practical examples and best practices. By the end, you'll know how to choose the right features for your use case and how to combine them for production-ready applications.
The Five Pillars of the Claude API
Claude's API surface is organized into:
- Model capabilities – Control how Claude reasons and formats responses.
- Tools – Let Claude take actions on the web or in your environment.
- Tool infrastructure – Handle discovery and orchestration at scale.
- Context management – Keep long-running sessions efficient.
- Files and assets – Manage documents and data you provide to Claude.
Model Capabilities: Steering Claude's Output
Model capabilities are the direct levers you pull to shape Claude's responses. They include reasoning depth, response format, and input modalities.
Extended Thinking and Adaptive Thinking
Claude can "think" before it responds. With extended thinking, you set a fixed budget for reasoning tokens. With adaptive thinking (recommended for Opus 4.7), Claude decides dynamically how much to think based on the task.
```python
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-20250514",
    max_tokens=4096,
    thinking={
        "type": "enabled",
        "budget_tokens": 2048  # Fixed budget for extended thinking
    },
    messages=[
        {"role": "user", "content": "Solve this complex math problem: integrate x^2 * sin(x) dx"}
    ]
)

# The response includes both the thinking and the final answer
print(response.content[0].thinking)
print(response.content[1].text)
```
For adaptive thinking, use the effort parameter instead:
```python
response = client.messages.create(
    model="claude-opus-4-20250514",
    max_tokens=4096,
    thinking={
        "type": "enabled",
        "effort": "high"  # Options: low, medium, high
    },
    messages=[
        {"role": "user", "content": "Analyze the ethical implications of autonomous vehicles."}
    ]
)
```
Structured Outputs
You can force Claude to output valid JSON or follow a specific schema using the structured_outputs parameter:
```python
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Extract the name, date, and amount from this invoice: ..."}
    ],
    structured_outputs={
        "json_schema": {
            "name": "invoice",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "date": {"type": "string"},
                    "amount": {"type": "number"}
                },
                "required": ["name", "date", "amount"]
            }
        }
    }
)
```
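Even with a schema enforced, a defensive check on the client side is cheap insurance before the parsed result flows into downstream systems. A minimal validator mirroring the invoice schema above (plain Python, not part of the API):

```python
import json

def validate_invoice(data: dict) -> bool:
    """Check the required fields and types from the invoice schema."""
    return (
        isinstance(data.get("name"), str)
        and isinstance(data.get("date"), str)
        and isinstance(data.get("amount"), (int, float))
        and not isinstance(data.get("amount"), bool)  # bool is a subclass of int
    )

# Example: parse a JSON reply before using it
invoice = json.loads('{"name": "Acme Corp", "date": "2025-01-15", "amount": 249.99}')
assert validate_invoice(invoice)
```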
Citations for Grounded Responses
When you need Claude to reference specific passages in source documents, use the Citations feature. It returns exact sentence-level references:
```python
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "document",
                    "source": {
                        "type": "text",
                        "media_type": "text/plain",
                        "data": "...full research paper text..."
                    },
                    "citations": {"enabled": True}
                },
                {
                    "type": "text",
                    "text": "Summarize the key findings from the attached research paper."
                }
            ]
        }
    ]
)

# Citations are attached to the text blocks that use them
for block in response.content:
    if block.type == "text" and block.citations:
        for citation in block.citations:
            print(f"Cited: {citation.cited_text}")
            print(f"From document: {citation.document_index}")
```
Tools: Letting Claude Take Action
Tools are how Claude interacts with the outside world—performing web searches, running code, or calling your own APIs.
Built-in Tools
Claude comes with several server-side tools you can enable by adding them to the `tools` list:
- Web search tool – Search the internet for up-to-date information.
- Code execution tool – Run Python code in a sandboxed environment.
- Computer use tool – Control a virtual desktop (beta).
- Memory tool – Store and retrieve information across conversations.
```python
response = client.beta.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=4096,
    betas=["code-execution-2025-05-22"],  # code execution is beta-gated
    tools=[
        {"type": "web_search_20250305", "name": "web_search"},
        {"type": "code_execution_20250522", "name": "code_execution"}
    ],
    messages=[
        {"role": "user", "content": "What's the current population of Tokyo? Also, calculate the area of a circle with radius 5."}
    ]
)
```
Custom Tools (Function Calling)
You can define your own tools using a JSON schema. Claude will decide when to call them:
```python
tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
                "units": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["city"]
        }
    }
]

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=tools,
    messages=[
        {"role": "user", "content": "What's the weather in Paris?"}
    ]
)

# Handle the tool call
if response.stop_reason == "tool_use":
    for block in response.content:
        if block.type == "tool_use":
            tool_name = block.name
            tool_input = block.input
            print(f"Claude wants to call {tool_name} with {tool_input}")
            # Execute your function and return the result
```
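To close the loop, you execute the function yourself and send the result back to Claude in a `tool_result` content block on the next user turn. A sketch, where `get_weather` is a hypothetical local stub:

```python
def get_weather(city: str, units: str = "celsius") -> dict:
    # Hypothetical implementation; a real one would call a weather service
    return {"city": city, "temp": 18, "units": units}

def build_tool_result(tool_use_id: str, tool_input: dict) -> dict:
    """Execute the tool and wrap its output in a tool_result content block."""
    result = get_weather(**tool_input)
    return {
        "type": "tool_result",
        "tool_use_id": tool_use_id,
        "content": str(result),
    }

# The result goes back in a new user message, after echoing Claude's turn:
# messages.append({"role": "assistant", "content": response.content})
# messages.append({"role": "user", "content": [build_tool_result(block.id, block.input)]})
```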
Parallel Tool Use
Claude can call multiple tools in a single turn, which is great for independent operations:
```python
# Parallel tool use is on by default: Claude may emit several tool_use
# blocks in a single response. To force sequential calls instead, pass
# tool_choice={"type": "auto", "disable_parallel_tool_use": True}.
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=4096,
    tools=[weather_tool, news_tool, stock_tool],
    messages=[
        {"role": "user", "content": "Get the weather in London, the latest tech news, and Apple's stock price."}
    ]
)
```
Context Management: Keeping Sessions Efficient
Long conversations consume tokens. Claude provides several mechanisms to manage context windows efficiently.
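Before reaching for API-side features, a simple client-side pattern is to trim old turns so each request stays within budget. A rough sketch (character counts stand in for real token counts, which the API's token-counting endpoint can provide):

```python
def trim_history(messages: list, max_chars: int = 8000) -> list:
    """Keep the most recent messages whose combined size fits the budget.

    Walks backwards from the newest message; always keeps at least one.
    """
    kept, total = [], 0
    for msg in reversed(messages):
        size = len(str(msg["content"]))
        if kept and total + size > max_chars:
            break
        kept.append(msg)
        total += size
    return list(reversed(kept))

history = [{"role": "user", "content": "x" * 6000},
           {"role": "assistant", "content": "y" * 3000},
           {"role": "user", "content": "latest question"}]
print(len(trim_history(history)))  # → 2: the oldest turn is dropped
```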
Context Windows
Recent Claude models support context windows of up to 1 million tokens (in beta, on supported models)—enough to process entire codebases or lengthy documents. Use the Models API to list the models available to you; each model's context and output limits are documented in the model comparison table:

```python
for model in client.models.list():
    print(f"{model.id}: {model.display_name}")
```
Prompt Caching
Reduce latency and cost by caching frequently used context (like system prompts or document chunks):
```python
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "You are a helpful assistant specialized in Python programming.",
            "cache_control": {"type": "ephemeral"}
        }
    ],
    messages=[
        {"role": "user", "content": "Explain decorators in Python."}
    ]
)

# Check whether the cache was written to or read from (fields live on usage)
print(f"Cache created: {response.usage.cache_creation_input_tokens}")
print(f"Cache read: {response.usage.cache_read_input_tokens}")
```

Note that a prefix is only cached once it meets the model's minimum cacheable length (1,024 tokens for most models), so a short system prompt like the one above would need surrounding context to benefit.
Batch Processing for Cost Savings
For large volumes of requests, use batch processing. Batch API calls cost 50% less than standard calls:
```python
# Submit a batch of requests (processed asynchronously)
batch = client.messages.batches.create(
    requests=[
        {
            "custom_id": "req-001",
            "params": {
                "model": "claude-sonnet-4-20250514",
                "max_tokens": 1024,
                "messages": [{"role": "user", "content": "Summarize this document..."}]
            }
        },
        {
            "custom_id": "req-002",
            "params": {
                "model": "claude-sonnet-4-20250514",
                "max_tokens": 1024,
                "messages": [{"role": "user", "content": "Translate this to French..."}]
            }
        }
    ]
)

# Poll until processing ends, then stream the results
batch = client.messages.batches.retrieve(batch.id)
if batch.processing_status == "ended":
    for result in client.messages.batches.results(batch.id):
        print(result.custom_id, result.result.type)
```
Files and Assets: Working with Documents
Claude can process PDFs, images, and other file types directly.
PDF Support
```python
import base64

with open("report.pdf", "rb") as f:
    pdf_data = base64.b64encode(f.read()).decode("utf-8")

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=4096,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "document",
                    "source": {
                        "type": "base64",
                        "media_type": "application/pdf",
                        "data": pdf_data
                    }
                },
                {
                    "type": "text",
                    "text": "Summarize the key findings from this report."
                }
            ]
        }
    ]
)
```
Images and Vision
Claude can analyze images for tasks like OCR, object detection, and visual reasoning:
```python
with open("chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data
                    }
                },
                {
                    "type": "text",
                    "text": "Describe what this chart shows."
                }
            ]
        }
    ]
)
```
Putting It All Together: A Practical Workflow
Here's a real-world example that combines multiple features: a research assistant that searches the web, reads PDFs, and provides cited answers.
```python
import anthropic
import base64

client = anthropic.Anthropic()

# Step 1: Load a PDF document
with open("research_paper.pdf", "rb") as f:
    pdf_data = base64.b64encode(f.read()).decode("utf-8")

# Step 2: Ask Claude to search for related information and analyze the PDF
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=4096,
    tools=[
        {"type": "web_search_20250305", "name": "web_search"},
        {
            "name": "get_paper_metadata",
            "description": "Get metadata for a research paper",
            "input_schema": {
                "type": "object",
                "properties": {
                    "title": {"type": "string"},
                    "authors": {"type": "array", "items": {"type": "string"}}
                },
                "required": ["title"]
            }
        }
    ],
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "document",
                    "source": {
                        "type": "base64",
                        "media_type": "application/pdf",
                        "data": pdf_data
                    },
                    "citations": {"enabled": True}
                },
                {
                    "type": "text",
                    "text": "Summarize this paper and find recent news about the same topic."
                }
            ]
        }
    ]
)

# Step 3: Process the response
for block in response.content:
    if block.type == "text":
        print(block.text)
    elif block.type == "tool_use":
        print(f"Tool called: {block.name}")
        print(f"Input: {block.input}")
```
Feature Availability and Lifecycle
Features on the Claude Platform go through stages:
| Classification | Description |
|---|---|
| Beta | Preview features for feedback. May change or be discontinued. Not for production. |
| Generally Available (GA) | Stable, fully supported, recommended for production. |
| Deprecated | Still functional but not recommended. Migration path provided. |
| Retired | No longer available. |
Key Takeaways
- Claude's API is organized into five areas: model capabilities, tools, tool infrastructure, context management, and files. Start with the first two.
- Use extended thinking or adaptive thinking for complex reasoning tasks. Adaptive thinking with the `effort` parameter is recommended for Opus 4.7.
- Leverage built-in tools (web search, code execution) and custom tools for agentic workflows. Take advantage of parallel tool calls for independent operations.
- Optimize cost and latency with prompt caching, batch processing (50% cost reduction), and context compaction for long sessions.
- Process files natively – Claude supports PDFs, images, and other formats. Use Citations for verifiable, grounded responses with exact source references.