Building with Claude: A Practical Guide to the Messages API, Tools, and Managed Agents
This guide walks you through building with Claude—from your first API call to advanced features like tool use, extended thinking, and managed agents. You'll learn practical code examples and best practices for production deployment.
Introduction
Claude is more than a chatbot. With the Claude API, you can build intelligent applications that reason, use tools, process images, and even run code. Whether you're creating a customer support agent, a code assistant, or a data analysis pipeline, Claude provides the infrastructure to go from idea to production quickly.
This guide covers the essential building blocks: the Messages API, tool use, extended thinking, and managed agents. You'll get practical code examples and actionable advice for each feature.
Getting Started: Your First API Call
Before diving into advanced features, you need an API key and a working client. Official SDKs are available for multiple programming languages; Python and TypeScript are the most popular.
Python Quickstart
```python
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)
print(message.content[0].text)
```
TypeScript Quickstart
```typescript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

async function main() {
  const message = await client.messages.create({
    model: 'claude-opus-4-7',
    max_tokens: 1024,
    messages: [{ role: 'user', content: 'Hello, Claude' }],
  });
  console.log(message.content[0].text);
}

main();
```
Key parameters:
- `model`: Choose from `claude-opus-4-7` (most capable), `claude-sonnet-4-6` (best balance), or `claude-haiku-4-5` (fastest).
- `max_tokens`: Controls response length.
- `messages`: An array of conversation turns.
Core API Features
Messages API
The Messages API is the primary way to interact with Claude. You control every turn, manage conversation state, and handle tool calls yourself. This gives you maximum flexibility.
Example with conversation history:

```python
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=2048,
    messages=[
        {"role": "user", "content": "What is the capital of France?"},
        {"role": "assistant", "content": "The capital of France is Paris."},
        {"role": "user", "content": "Tell me more about its history."}
    ]
)
```
Extended Thinking
For complex reasoning tasks, enable extended thinking. Claude can "think" step-by-step before responding, improving accuracy on math, logic, and analysis.
```python
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={
        "type": "enabled",
        "budget_tokens": 2048  # Reserve tokens for thinking
    },
    messages=[
        {"role": "user", "content": "Solve this equation step by step: 3x + 7 = 22"}
    ]
)
```
Best practice: Use extended thinking when you need deep reasoning. For simple queries, disable it to save tokens and reduce latency.
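When thinking is enabled, the response's `content` list interleaves thinking blocks with the final text blocks. A small helper (a sketch, assuming blocks expose `type`, `thinking`, and `text` attributes as in the Python SDK) separates the reasoning trace from the answer you show users:

```python
def split_thinking(content_blocks):
    """Separate thinking blocks from final-answer text blocks."""
    thinking_parts, answer_parts = [], []
    for block in content_blocks:
        if block.type == "thinking":
            thinking_parts.append(block.thinking)
        elif block.type == "text":
            answer_parts.append(block.text)
    return "".join(thinking_parts), "".join(answer_parts)

# Usage with the response above:
# reasoning, answer = split_thinking(response.content)
```

Logging the reasoning trace separately is useful for debugging prompts without exposing intermediate steps to end users.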
Streaming
For real-time applications, stream responses token by token. This provides a better user experience by showing progress.
```python
stream = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a short poem about AI."}],
    stream=True
)
for chunk in stream:
    if chunk.type == "content_block_delta":
        print(chunk.delta.text, end="", flush=True)
```
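If you also need the full reply after streaming (for logging or storage), you can accumulate the deltas yourself. A minimal sketch, assuming events shaped like the `content_block_delta` chunks above:

```python
def collect_text(chunks):
    """Concatenate the text deltas from a stream of message events."""
    parts = []
    for chunk in chunks:
        if chunk.type == "content_block_delta":
            parts.append(chunk.delta.text)
    return "".join(parts)

# Usage: full_text = collect_text(stream)
```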
Tool Use: Giving Claude Superpowers
Tools allow Claude to interact with external systems—databases, APIs, file systems, or even execute code. This is how you build autonomous agents.
Defining a Tool
```python
def get_weather(location: str) -> str:
    """Get current weather for a location."""
    # In production, call a real weather API
    return f"The weather in {location} is sunny, 72°F."

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    tools=[
        {
            "name": "get_weather",
            "description": "Get current weather for a location",
            "input_schema": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City and state, e.g., San Francisco, CA"
                    }
                },
                "required": ["location"]
            }
        }
    ],
    messages=[
        {"role": "user", "content": "What's the weather in Tokyo?"}
    ]
)
```
Handling Tool Calls
When Claude decides to use a tool, the response contains a `tool_use` content block. Your code must execute the tool and return the result.
```python
if response.stop_reason == "tool_use":
    for content in response.content:
        if content.type == "tool_use":
            tool_name = content.name
            tool_input = content.input
            if tool_name == "get_weather":
                result = get_weather(**tool_input)
                # Send result back to Claude
                follow_up = client.messages.create(
                    model="claude-sonnet-4-6",
                    max_tokens=1024,
                    messages=[
                        {"role": "user", "content": "What's the weather in Tokyo?"},
                        {"role": "assistant", "content": response.content},
                        {"role": "user", "content": [
                            {"type": "tool_result", "tool_use_id": content.id, "content": result}
                        ]}
                    ]
                )
```
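The snippet above handles a single round trip; real agents loop until Claude stops requesting tools. A minimal loop sketch (`run_tool` and the stub `get_weather` here are illustrative helpers, not part of the SDK):

```python
def get_weather(location: str) -> str:
    """Stand-in for the get_weather tool defined earlier."""
    return f"The weather in {location} is sunny, 72°F."

def run_tool(name, tool_input):
    """Dispatch a tool call to the matching local function."""
    if name == "get_weather":
        return get_weather(**tool_input)
    raise ValueError(f"Unknown tool: {name}")

def agent_loop(client, messages, tools, model="claude-sonnet-4-6"):
    """Call the API repeatedly, answering tool calls, until Claude finishes."""
    while True:
        response = client.messages.create(
            model=model, max_tokens=1024, tools=tools, messages=messages
        )
        if response.stop_reason != "tool_use":
            return response  # final answer; no more tool calls
        # Echo the assistant turn, then answer every tool call it made.
        # Answering all blocks in one user turn also covers parallel tool use.
        messages.append({"role": "assistant", "content": response.content})
        results = [
            {"type": "tool_result", "tool_use_id": block.id,
             "content": run_tool(block.name, block.input)}
            for block in response.content if block.type == "tool_use"
        ]
        messages.append({"role": "user", "content": results})
```

Cap the number of iterations in production so a confused model cannot loop forever.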
Parallel Tool Use
Claude can call multiple tools simultaneously, reducing round trips. This is ideal for gathering independent data points.
```python
tools = [
    {"name": "search_flights", ...},
    {"name": "search_hotels", ...},
    {"name": "get_currency_rate", ...}
]
# Claude may call all three in one response
```
Managed Agents: Deploy Without the Boilerplate
If you don't want to manage conversation state and tool loops yourself, use Claude Managed Agents. This fully managed infrastructure handles state, persistence, and event history.
```python
# Create a managed agent
agent = client.agents.create(
    model="claude-sonnet-4-6",
    name="CustomerSupportBot",
    instructions="You are a helpful customer support agent for an e-commerce store.",
    tools=[
        {"name": "lookup_order", "description": "Look up order by ID", ...},
        {"name": "refund_order", "description": "Process a refund", ...}
    ]
)

# Start a session
session = client.agents.sessions.create(agent_id=agent.id)

# Send a message
response = client.agents.sessions.message(
    session_id=session.id,
    content="I need a refund for order #12345"
)
```
When to use managed agents:
- You want to focus on business logic, not infrastructure.
- You need persistent sessions with history.
- You're building a chatbot or virtual assistant.
Advanced Features
Vision and Image Processing
Claude can analyze images. Pass image data as base64 or URL.
```python
import base64

with open("chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this chart."},
                {"type": "image", "source": {"type": "base64", "media_type": "image/png", "data": image_data}}
            ]
        }
    ]
)
```
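Alongside base64, the API also accepts a URL image source, which keeps encoded bytes out of the request body. A sketch of the content block (the URL below is a placeholder; the image must be publicly reachable):

```python
# URL-source variant of the image content block.
image_block = {
    "type": "image",
    "source": {"type": "url", "url": "https://example.com/chart.png"},
}
# Use it in place of the base64 image block in the messages above.
```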
Structured Outputs
For programmatic consumption, request structured JSON output.
```python
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system="Always respond with valid JSON.",
    messages=[
        {"role": "user", "content": "List 3 famous scientists and their discoveries as JSON."}
    ]
)
```
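On the consuming side you still need to parse and validate the reply. A defensive sketch: models occasionally wrap JSON in a markdown code fence, so strip one if present before parsing.

```python
import json

def parse_json_reply(text):
    """Parse a model reply that should contain a JSON value."""
    cleaned = text.strip()
    if cleaned.startswith("```"):
        # Drop a ```json ... ``` fence if present
        cleaned = cleaned.strip("`")
        cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else cleaned
    return json.loads(cleaned)

# Usage: data = parse_json_reply(response.content[0].text)
```

If parsing fails, a common pattern is to send the error back to Claude and ask for a corrected response.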
Prompt Caching
Reduce costs and latency by caching system prompts or large context blocks.
```python
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "You are a helpful assistant...",
            "cache_control": {"type": "ephemeral"}
        }
    ],
    messages=[...]
)
```
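Cache effectiveness shows up in the response's `usage` object. As a rough sketch of the economics: cache writes are typically billed at a premium over base input tokens and cache reads at a steep discount. The multipliers below are illustrative defaults, not official rates; check current pricing.

```python
def billed_input_units(input_tokens, cache_write_tokens, cache_read_tokens,
                       write_mult=1.25, read_mult=0.10):
    """Weight token counts by illustrative cache pricing multipliers."""
    return (input_tokens
            + cache_write_tokens * write_mult
            + cache_read_tokens * read_mult)

# A 1,000-token system prompt cached on the first call, read on later calls:
# billed_input_units(100, 1000, 0)   # first call pays the write premium
# billed_input_units(100, 0, 1000)   # subsequent calls pay the read discount
```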
Best Practices for Production
- Handle stop reasons: Always check `response.stop_reason`. It can be `"end_turn"`, `"tool_use"`, `"max_tokens"`, or `"stop_sequence"`.
- Implement retries: Use exponential backoff for rate limits.
- Monitor token usage: Track input and output tokens to control costs.
- Use evaluation tools: Test your prompts with Claude's Evaluation Tool before deploying.
- Strengthen guardrails: Add system prompts to reduce hallucinations and mitigate jailbreaks.
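For the retry advice above, here is a minimal wrapper with exponential backoff and full jitter. It is a sketch: the delay schedule is an assumption to tune for your workload, and the exception class in the usage comment is the SDK's rate-limit error.

```python
import random
import time

def backoff_delay(attempt, base=1.0, cap=30.0):
    """Full-jitter backoff: uniform in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))

def with_retries(call, max_attempts=5, retry_on=(Exception,)):
    """Invoke `call`, retrying the listed exceptions with backoff."""
    for attempt in range(max_attempts):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error
            time.sleep(backoff_delay(attempt))

# Usage (sketch):
# with_retries(lambda: client.messages.create(...),
#              retry_on=(anthropic.RateLimitError,))
```

Jitter spreads retries from concurrent clients so they do not hammer the API in lockstep after a shared rate-limit event.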
Key Takeaways
- Start with the Messages API for full control over conversation state and tool loops. Use managed agents when you want to skip infrastructure.
- Enable extended thinking for complex reasoning tasks, but disable it for simple queries to save tokens.
- Use tools to give Claude real-world capabilities—weather lookups, database queries, code execution. Handle tool calls in your code and return results.
- Stream responses for a better user experience in real-time applications.
- Optimize with prompt caching and structured outputs to reduce costs and improve reliability in production.