Building with Claude: A Practical Guide to the Messages API, Tools, and Managed Agents
This guide walks you through building with Claude—from your first API call to advanced features like tool use, extended thinking, and managed agents. You'll learn practical code examples and best practices for production deployment.
Introduction
Claude is more than a chatbot. With the Claude API, you can build intelligent applications that reason, use tools, process images, and even run code. Whether you're creating a customer support agent, a code assistant, or a data analysis pipeline, Claude provides the infrastructure to go from idea to production quickly.
This guide covers the essential building blocks: the Messages API, tool use, extended thinking, and managed agents. You'll get practical code examples and actionable advice for each feature.
Getting Started: Your First API Call
Before diving into advanced features, you need an API key and a working client. Official SDKs are available for multiple programming languages; Python and TypeScript are the most popular.
Python Quickstart
```python
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)
print(message.content[0].text)
```
TypeScript Quickstart
```typescript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

async function main() {
  const message = await client.messages.create({
    model: 'claude-opus-4-7',
    max_tokens: 1024,
    messages: [{ role: 'user', content: 'Hello, Claude' }],
  });
  console.log(message.content[0].text);
}

main();
```
Key parameters:
- `model`: Choose from `claude-opus-4-7` (most capable), `claude-sonnet-4-6` (best balance), or `claude-haiku-4-5` (fastest).
- `max_tokens`: Controls response length.
- `messages`: An array of conversation turns.
Core API Features
Messages API
The Messages API is the primary way to interact with Claude. You control every turn, manage conversation state, and handle tool calls yourself. This gives you maximum flexibility.
Example with conversation history:

```python
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=2048,
    messages=[
        {"role": "user", "content": "What is the capital of France?"},
        {"role": "assistant", "content": "The capital of France is Paris."},
        {"role": "user", "content": "Tell me more about its history."}
    ]
)
```
Extended Thinking
For complex reasoning tasks, enable extended thinking. Claude can "think" step-by-step before responding, improving accuracy on math, logic, and analysis.
```python
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    thinking={
        "type": "enabled",
        "budget_tokens": 2048  # Reserve tokens for thinking
    },
    messages=[
        {"role": "user", "content": "Solve this equation step by step: 3x + 7 = 22"}
    ]
)
```
Best practice: Use extended thinking when you need deep reasoning. For simple queries, disable it to save tokens and reduce latency.
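When thinking is enabled, the response's `content` list interleaves thinking blocks with the final text blocks. A small helper (a sketch, assuming blocks expose `type`, `thinking`, and `text` attributes as in the Python SDK) separates the reasoning trace from the answer you show users:

```python
def split_thinking(content_blocks):
    """Separate thinking blocks from final-answer text blocks."""
    thinking_parts, answer_parts = [], []
    for block in content_blocks:
        if block.type == "thinking":
            thinking_parts.append(block.thinking)
        elif block.type == "text":
            answer_parts.append(block.text)
    return "".join(thinking_parts), "".join(answer_parts)

# Usage with the response above:
# reasoning, answer = split_thinking(response.content)
```

Logging the reasoning trace separately is useful for debugging prompts without exposing intermediate steps to end users.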
Streaming
For real-time applications, stream responses token by token. This provides a better user experience by showing progress.
```python
stream = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a short poem about AI."}],
    stream=True
)
for chunk in stream:
    if chunk.type == "content_block_delta":
        print(chunk.delta.text, end="", flush=True)
```
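If you also need the full reply after streaming (for logging or storage), you can accumulate the deltas yourself. A minimal sketch, assuming events shaped like the `content_block_delta` chunks above:

```python
def collect_text(chunks):
    """Concatenate the text deltas from a stream of message events."""
    parts = []
    for chunk in chunks:
        if chunk.type == "content_block_delta":
            parts.append(chunk.delta.text)
    return "".join(parts)

# Usage: full_text = collect_text(stream)
```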
Tool Use: Giving Claude Superpowers
Tools allow Claude to interact with external systems—databases, APIs, file systems, or even execute code. This is how you build autonomous agents.
Defining a Tool
```python
def get_weather(location: str) -> str:
    """Get current weather for a location."""
    # In production, call a real weather API
    return f"The weather in {location} is sunny, 72°F."

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    tools=[
        {
            "name": "get_weather",
            "description": "Get current weather for a location",
            "input_schema": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City and state, e.g., San Francisco, CA"
                    }
                },
                "required": ["location"]
            }
        }
    ],
    messages=[
        {"role": "user", "content": "What's the weather in Tokyo?"}
    ]
)
```
Handling Tool Calls
When Claude decides to use a tool, the response contains a `tool_use` content block. Your code must execute the tool and return the result.
```python
if response.stop_reason == "tool_use":
    for content in response.content:
        if content.type == "tool_use":
            tool_name = content.name
            tool_input = content.input
            if tool_name == "get_weather":
                result = get_weather(**tool_input)
                # Send result back to Claude
                follow_up = client.messages.create(
                    model="claude-sonnet-4-6",
                    max_tokens=1024,
                    messages=[
                        {"role": "user", "content": "What's the weather in Tokyo?"},
                        {"role": "assistant", "content": response.content},
                        {"role": "user", "content": [
                            {"type": "tool_result", "tool_use_id": content.id, "content": result}
                        ]}
                    ]
                )
```
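The snippet above handles a single round trip; real agents loop until Claude stops requesting tools. A minimal loop sketch (`run_tool` and the stub `get_weather` here are illustrative helpers, not part of the SDK):

```python
def get_weather(location: str) -> str:
    """Stand-in for the get_weather tool defined earlier."""
    return f"The weather in {location} is sunny, 72°F."

def run_tool(name, tool_input):
    """Dispatch a tool call to the matching local function."""
    if name == "get_weather":
        return get_weather(**tool_input)
    raise ValueError(f"Unknown tool: {name}")

def agent_loop(client, messages, tools, model="claude-sonnet-4-6"):
    """Call the API repeatedly, answering tool calls, until Claude finishes."""
    while True:
        response = client.messages.create(
            model=model, max_tokens=1024, tools=tools, messages=messages
        )
        if response.stop_reason != "tool_use":
            return response  # final answer; no more tool calls
        # Echo the assistant turn, then answer every tool call it made.
        # Answering all blocks in one user turn also covers parallel tool use.
        messages.append({"role": "assistant", "content": response.content})
        results = [
            {"type": "tool_result", "tool_use_id": block.id,
             "content": run_tool(block.name, block.input)}
            for block in response.content if block.type == "tool_use"
        ]
        messages.append({"role": "user", "content": results})
```

Cap the number of iterations in production so a confused model cannot loop forever.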
Parallel Tool Use
Claude can call multiple tools simultaneously, reducing round trips. This is ideal for gathering independent data points.
```python
tools = [
    {"name": "search_flights", ...},
    {"name": "search_hotels", ...},
    {"name": "get_currency_rate", ...}
]
# Claude may call all three in one response
```
Managed Agents: Deploy Without the Boilerplate
If you don't want to manage conversation state and tool loops yourself, use Claude Managed Agents. This fully managed infrastructure handles state, persistence, and event history.
```python
# Create a managed agent
agent = client.agents.create(
    model="claude-sonnet-4-6",
    name="CustomerSupportBot",
    instructions="You are a helpful customer support agent for an e-commerce store.",
    tools=[
        {"name": "lookup_order", "description": "Look up order by ID", ...},
        {"name": "refund_order", "description": "Process a refund", ...}
    ]
)

# Start a session
session = client.agents.sessions.create(agent_id=agent.id)

# Send a message
response = client.agents.sessions.message(
    session_id=session.id,
    content="I need a refund for order #12345"
)
```
When to use managed agents:
- You want to focus on business logic, not infrastructure.
- You need persistent sessions with history.
- You're building a chatbot or virtual assistant.
Advanced Features
Vision and Image Processing
Claude can analyze images. Pass image data as base64 or URL.
```python
import base64

with open("chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this chart."},
                {"type": "image", "source": {"type": "base64", "media_type": "image/png", "data": image_data}}
            ]
        }
    ]
)
```
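Alongside base64, the API also accepts a URL image source, which keeps encoded bytes out of the request body. A sketch of the content block (the URL below is a placeholder; the image must be publicly reachable):

```python
# URL-source variant of the image content block.
image_block = {
    "type": "image",
    "source": {"type": "url", "url": "https://example.com/chart.png"},
}
# Use it in place of the base64 image block in the messages above.
```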
Structured Outputs
For programmatic consumption, request structured JSON output.
```python
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system="Always respond with valid JSON.",
    messages=[
        {"role": "user", "content": "List 3 famous scientists and their discoveries as JSON."}
    ]
)
```
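On the consuming side you still need to parse and validate the reply. A defensive sketch: models occasionally wrap JSON in a markdown code fence, so strip one if present before parsing.

```python
import json

def parse_json_reply(text):
    """Parse a model reply that should contain a JSON value."""
    cleaned = text.strip()
    if cleaned.startswith("```"):
        # Drop a ```json ... ``` fence if present
        cleaned = cleaned.strip("`")
        cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else cleaned
    return json.loads(cleaned)

# Usage: data = parse_json_reply(response.content[0].text)
```

If parsing fails, a common pattern is to send the error back to Claude and ask for a corrected response.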
Prompt Caching
Reduce costs and latency by caching system prompts or large context blocks.
```python
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "You are a helpful assistant...",
            "cache_control": {"type": "ephemeral"}
        }
    ],
    messages=[...]
)
```
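Cache effectiveness shows up in the response's `usage` object. As a rough sketch of the economics: cache writes are typically billed at a premium over base input tokens and cache reads at a steep discount. The multipliers below are illustrative defaults, not official rates; check current pricing.

```python
def billed_input_units(input_tokens, cache_write_tokens, cache_read_tokens,
                       write_mult=1.25, read_mult=0.10):
    """Weight token counts by illustrative cache pricing multipliers."""
    return (input_tokens
            + cache_write_tokens * write_mult
            + cache_read_tokens * read_mult)

# A 1,000-token system prompt cached on the first call, read on later calls:
# billed_input_units(100, 1000, 0)   # first call pays the write premium
# billed_input_units(100, 0, 1000)   # subsequent calls pay the read discount
```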
Best Practices for Production
- Handle stop reasons: Always check `response.stop_reason`. It can be `"end_turn"`, `"tool_use"`, `"max_tokens"`, or `"stop_sequence"`.
- Implement retries: Use exponential backoff for rate limits.
- Monitor token usage: Track input and output tokens to control costs.
- Use evaluation tools: Test your prompts with Claude's Evaluation Tool before deploying.
- Strengthen guardrails: Add system prompts to reduce hallucinations and mitigate jailbreaks.
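For the retry advice above, here is a minimal wrapper with exponential backoff and full jitter. It is a sketch: the delay schedule is an assumption to tune for your workload, and the exception class in the usage comment is the SDK's rate-limit error.

```python
import random
import time

def backoff_delay(attempt, base=1.0, cap=30.0):
    """Full-jitter backoff: uniform in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))

def with_retries(call, max_attempts=5, retry_on=(Exception,)):
    """Invoke `call`, retrying the listed exceptions with backoff."""
    for attempt in range(max_attempts):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error
            time.sleep(backoff_delay(attempt))

# Usage (sketch):
# with_retries(lambda: client.messages.create(...),
#              retry_on=(anthropic.RateLimitError,))
```

Jitter spreads retries from concurrent clients so they do not hammer the API in lockstep after a shared rate-limit event.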
Key Takeaways
- Start with the Messages API for full control over conversation state and tool loops. Use managed agents when you want to skip infrastructure.
- Enable extended thinking for complex reasoning tasks, but disable it for simple queries to save tokens.
- Use tools to give Claude real-world capabilities—weather lookups, database queries, code execution. Handle tool calls in your code and return results.
- Stream responses for a better user experience in real-time applications.
- Optimize with prompt caching and structured outputs to reduce costs and improve reliability in production.