Claude Guide · 2026-04-24

Mastering Claude API: A Practical Guide to Building with Anthropic’s AI

Learn how to build with the Claude API using practical examples, from setup to advanced features like tool use, streaming, and structured outputs. A hands-on guide for developers.

Quick Answer

This guide walks you through building real-world applications with the Claude API, covering authentication, message handling, tool integration, streaming, and structured outputs with code examples in Python and TypeScript.

Claude API · tool use · streaming · structured outputs · prompt engineering

Introduction

The Claude API is your gateway to integrating Anthropic’s most advanced AI models into your own applications. Whether you're building a chatbot, a content generator, or an automated research assistant, the API gives you direct access to Claude’s reasoning, tool use, and structured output capabilities.

This guide is written for developers who want to move beyond the quickstart and build production-ready features. You’ll learn how to authenticate, send messages, handle streaming, use tools, and enforce structured outputs—all with practical code examples.

Getting Started with the Claude API

Authentication

First, you need an API key from the Anthropic Console. Store it securely as an environment variable.

export ANTHROPIC_API_KEY="sk-ant-..."

Basic Message Request

Here’s the simplest way to send a message using the Messages API in Python:

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain quantum computing in one sentence."}
    ]
)

print(response.content[0].text)

In TypeScript:

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

async function main() {
  const response = await client.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 1024,
    messages: [
      { role: 'user', content: 'Explain quantum computing in one sentence.' }
    ]
  });
  console.log(response.content[0].text);
}

main();

Handling Stop Reasons

Every API response includes a stop_reason field. Understanding it helps you control conversation flow:

  • end_turn: Claude finished naturally.
  • max_tokens: Output was truncated—increase max_tokens or continue the conversation.
  • stop_sequence: Claude hit a custom stop sequence you defined.
  • tool_use: Claude wants to call a tool—handle it in your code.
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=50,
    messages=[{"role": "user", "content": "Write a 1000-word essay on AI."}]
)

if response.stop_reason == "max_tokens":
    print("Response was truncated. Consider increasing max_tokens.")
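If truncation does happen, one simple recovery pattern is to feed the partial output back as an assistant turn and ask Claude to continue. A sketch (the second API call is omitted, and the partial text here is a stand-in for the real truncated response):

```python
# Stand-in for a truncated first response; in real code this would be
# response.content[0].text from the call that hit max_tokens.
partial_text = "Artificial intelligence is transforming..."

# Build the follow-up conversation: the truncated output becomes an
# assistant turn, and a new user turn asks Claude to pick up the thread.
messages = [
    {"role": "user", "content": "Write a 1000-word essay on AI."},
    {"role": "assistant", "content": partial_text},
    {"role": "user", "content": "Continue exactly where you left off."},
]

# Pass `messages` to a second client.messages.create(...) call,
# this time with a larger max_tokens.
```

You will usually want to stitch the two outputs together afterwards, since the continuation starts mid-thought.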

Streaming Responses

For real-time user experiences, stream responses token by token:

with client.messages.stream(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Tell me a short story."}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

Streaming also supports tool calls and content blocks. Use stream.on('text', ...) in TypeScript for similar behavior.

Using Tools with Claude

Tools let Claude interact with external systems. Here’s how to define a weather lookup tool:

tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City name"}
            },
            "required": ["location"]
        }
    }
]

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}]
)

# Check if Claude wants to use a tool
if response.stop_reason == "tool_use":
    tool_call = response.content[-1]
    print(f"Tool requested: {tool_call.name}")
    print(f"Arguments: {tool_call.input}")

You then execute the tool, return the result, and let Claude continue.
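Closing the loop looks like this. A sketch: get_weather is a stand-in for your real implementation, the tool_use_id value is a placeholder (in practice it comes from the tool_use block's .id field), and the final client.messages.create call is left as a comment:

```python
# Stand-in for your real weather lookup; in practice this calls an API.
def get_weather(location: str) -> str:
    return f"22°C and sunny in {location}"

# Suppose the assistant's tool_use block carried these fields.
tool_use_id = "toolu_01A"            # placeholder for tool_call.id
tool_input = {"location": "Tokyo"}   # placeholder for tool_call.input

result = get_weather(**tool_input)

# Return the result as a tool_result block in a new user turn, then call
# client.messages.create(...) again with the full message history so
# Claude can produce its final answer.
follow_up = {
    "role": "user",
    "content": [
        {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": result,
        }
    ],
}
```

The tool_use_id ties your result back to the specific call Claude made, which matters when it requests several tools in one turn.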

Structured Outputs

To get JSON responses that match a schema, use structured outputs. This is perfect for data extraction or API integration:

from pydantic import BaseModel

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

Note that the base Anthropic SDK does not accept a response_model argument; this Pydantic pattern comes from the instructor library, which wraps the client:

import anthropic
import instructor

client = instructor.from_anthropic(anthropic.Anthropic())

event = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Schedule a team meeting for next Friday with Alice and Bob."}
    ],
    response_model=CalendarEvent
)

print(event.name)  # e.g. "Team Meeting"
print(event.date)  # e.g. "2025-05-23"

Without Pydantic, you can get schema-conforming JSON from the raw API by defining a tool whose input_schema is your target schema and forcing Claude to call it with tool_choice:

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=[{
        "name": "record_event",
        "description": "Record a calendar event",
        "input_schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "date": {"type": "string"},
                "participants": {"type": "array", "items": {"type": "string"}}
            },
            "required": ["name", "date", "participants"]
        }
    }],
    tool_choice={"type": "tool", "name": "record_event"},
    messages=[...]
)

event = response.content[-1].input  # dict conforming to the schema
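Whichever mechanism produces the JSON, validate it before trusting it. A minimal stdlib-only sketch (the field names match the calendar-event schema above; a library like Pydantic does this more thoroughly):

```python
import json

# Expected top-level fields and their Python types.
REQUIRED = {"name": str, "date": str, "participants": list}

def validate_event(raw: str) -> dict:
    """Parse model output and check it matches the calendar-event schema."""
    data = json.loads(raw)
    for field, typ in REQUIRED.items():
        if not isinstance(data.get(field), typ):
            raise ValueError(f"missing or mistyped field: {field}")
    return data

event = validate_event('{"name": "Team Meeting", "date": "2025-05-23", '
                       '"participants": ["Alice", "Bob"]}')
print(event["participants"])  # ['Alice', 'Bob']
```

On a ValueError (or a json.JSONDecodeError), log the raw output and retry the request rather than letting malformed data flow downstream.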

Batch Processing

For high-volume tasks, use the Message Batches API. Each request carries its own params, the batch processes asynchronously, and you fetch results once processing has ended:

batch = client.messages.batches.create(
    requests=[
        {
            "custom_id": "req1",
            "params": {
                "model": "claude-sonnet-4-20250514",
                "max_tokens": 1024,
                "messages": [{"role": "user", "content": "Summarize this article."}]
            }
        },
        {
            "custom_id": "req2",
            "params": {
                "model": "claude-sonnet-4-20250514",
                "max_tokens": 1024,
                "messages": [{"role": "user", "content": "Translate to French."}]
            }
        }
    ]
)

# Once the batch's processing_status is "ended", stream the results:
for entry in client.messages.batches.results(batch.id):
    if entry.result.type == "succeeded":
        print(f"{entry.custom_id}: {entry.result.message.content[0].text}")

Context Management and Prompt Caching

Claude’s context window can hold large amounts of text. Use prompt caching to reuse expensive system prompts across multiple requests:

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "You are a helpful assistant with deep knowledge of Python.",
            "cache_control": {"type": "ephemeral"}
        }
    ],
    messages=[{"role": "user", "content": "Explain decorators."}]
)

Cached prompts reduce latency and cost for repeated system instructions. Note that a prompt segment must meet a minimum length (1,024 tokens on most models) to be cached, so in practice caching pays off for long system prompts, large documents, and tool definitions rather than one-liners like the example above.

Best Practices

  • Set appropriate max_tokens: Avoid truncation by estimating output length.
  • Use streaming for chat UIs: Improves perceived responsiveness.
  • Validate structured outputs: Always parse and validate JSON responses.
  • Handle tool calls gracefully: Return meaningful error messages if a tool fails.
  • Monitor token usage: Use the usage field in responses to track costs.
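As a sketch of the last point, here is simple cost accounting from usage counts. The per-million-token prices are placeholders, not Anthropic's actual rates; check the pricing page before using anything like this in billing logic:

```python
# Placeholder per-million-token prices (USD) -- substitute real rates.
PRICE_PER_MTOK = {"input": 3.00, "output": 15.00}

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its usage counts."""
    return (
        input_tokens / 1_000_000 * PRICE_PER_MTOK["input"]
        + output_tokens / 1_000_000 * PRICE_PER_MTOK["output"]
    )

# In real code these counts come from response.usage.input_tokens and
# response.usage.output_tokens.
print(f"${request_cost(1200, 350):.6f}")  # → $0.008850
```

Summing this per request (and per custom_id for batches) gives you a running cost tracker with no extra API calls.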

Key Takeaways

  • The Claude API is simple to authenticate and use, with both Python and TypeScript SDKs.
  • Streaming, tool use, and structured outputs enable real-time, interactive, and reliable applications.
  • Batch processing and prompt caching help optimize for scale and cost.
  • Always handle stop reasons and tool calls in your application logic for robust behavior.
  • Start with the Messages API, then layer on advanced features as your use case grows.