Guide | Beginner | Agents | 2026-05-13

How to Master the Claude API: A Practical Guide for Developers

Learn how to integrate and optimize the Claude API with Python and TypeScript. Covers setup, streaming, tool use, and best practices for production apps.

Quick Answer

This guide walks you through setting up the Claude API, making your first request, enabling streaming, using tools (function calling), and following best practices for reliability and cost efficiency.

Claude API, Python, TypeScript, streaming, tool use

Introduction

The Claude API is the gateway to integrating Anthropic's powerful language models into your own applications. Whether you're building a chatbot, a content generator, a code assistant, or an agentic workflow, the API gives you direct programmatic access to Claude's capabilities.

This guide is written for developers who already have a basic understanding of APIs and want to move from theory to practice. You'll learn how to authenticate, send messages, handle streaming responses, use tools (function calling), and follow best practices for production deployments.

By the end, you'll have a solid foundation for building reliable, cost-effective applications powered by Claude.

Prerequisites

An Anthropic API key (get one at console.anthropic.com)
Python 3.8+ or Node.js 18+ installed
Basic familiarity with REST APIs and JSON

Step 1: Setting Up Your Environment

Python

Install the official Anthropic Python SDK:

pip install anthropic

Set your API key as an environment variable (recommended):

export ANTHROPIC_API_KEY="sk-ant-..."

TypeScript / Node.js

Install the SDK:

npm install @anthropic-ai/sdk

Set the environment variable similarly:

export ANTHROPIC_API_KEY="sk-ant-..."
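A minimal sketch, assuming you want the application to fail fast at startup when the key is missing rather than at the first request. The helper name load_api_key is hypothetical and not part of the SDK; it only reads the environment variable set above.

```python
import os


def load_api_key(env_var: str = "ANTHROPIC_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Hypothetical policy: refuse to start without a key.
        raise RuntimeError(f"{env_var} is not set; export it before starting the app.")
    return key
```

You could then pass the returned value as api_key when constructing the client, as in the examples in Step 2.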

Step 2: Your First API Call

Let's make a simple request to Claude.

Python Example

import anthropic
import os
client = anthropic.Anthropic(
    api_key=os.environ.get("ANTHROPIC_API_KEY")
)
message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain the concept of recursion in one sentence."}
    ]
)
print(message.content[0].text)

TypeScript Example

import Anthropic from '@anthropic-ai/sdk';
const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});
async function main() {
  const message = await anthropic.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 1024,
    messages: [
      { role: 'user', content: 'Explain the concept of recursion in one sentence.' }
    ],
  });
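  // content is a list of typed blocks; narrow to a text block before reading .text
  const first = message.content[0];
  if (first.type === 'text') {
    console.log(first.text);
  }
}
main().catch(console.error);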

What's happening?

We create a client with our API key.
We call messages.create() with the model name, token limit, and a conversation history.
The response contains the assistant's reply in content[0].text.
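Reading content[0].text works when the first block is text, but a response can contain several blocks of different types. The sketch below shows one way to collect all text blocks. It uses plain dictionaries as stand-ins for the response content, so the sample data and the extract_text helper are assumptions for illustration; with the SDK you would read the same fields as attributes (block.type, block.text).

```python
from typing import Any, Dict, List


def extract_text(content: List[Dict[str, Any]]) -> str:
    """Concatenate the text of every text block, skipping other block types."""
    return "".join(
        block.get("text", "") for block in content if block.get("type") == "text"
    )


# Hypothetical response content, written as plain dicts for illustration.
sample_content = [
    {"type": "text", "text": "Recursion is a function "},
    {"type": "text", "text": "calling itself."},
    {"type": "tool_use", "id": "toolu_example", "name": "noop", "input": {}},
]
print(extract_text(sample_content))  # prints: Recursion is a function calling itself.
```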

Step 3: Handling Streaming Responses

For real-time applications (chat UIs, live assistants), streaming is essential. It reduces perceived latency and improves user experience.

Python Streaming

with client.messages.stream(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Write a short poem about APIs."}
    ]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

TypeScript Streaming

const stream = await anthropic.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Write a short poem about APIs.' }],
  stream: true,
});
for await (const chunk of stream) {
  if (chunk.type === 'content_block_delta' && chunk.delta.type === 'text_delta') {
    process.stdout.write(chunk.delta.text);
  }
}

Pro tip: Always use streaming for user-facing applications. It makes your app feel faster and more responsive.
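Streaming UIs usually also need the complete text once the stream ends, for example to store it in the conversation history. The sketch below filters text deltas and joins them. It runs on a hand-written list of events shaped like the ones in the TypeScript example, not on a live stream; the helper names and the sample events are assumptions for illustration.

```python
from typing import Any, Dict, Iterable, Iterator


def iter_text_deltas(events: Iterable[Dict[str, Any]]) -> Iterator[str]:
    """Yield only the text fragments from a sequence of stream events."""
    for event in events:
        if event.get("type") != "content_block_delta":
            continue  # ignore start/stop and other bookkeeping events
        delta = event.get("delta", {})
        if delta.get("type") == "text_delta":
            yield delta.get("text", "")


def collect_text(events: Iterable[Dict[str, Any]]) -> str:
    """Join all text fragments into the full response text."""
    return "".join(iter_text_deltas(events))


# Hypothetical events standing in for a real stream.
fake_events = [
    {"type": "message_start"},
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "Hello"}},
    {"type": "content_block_delta", "delta": {"type": "input_json_delta", "partial_json": "{}"}},
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": ", world"}},
    {"type": "message_stop"},
]
print(collect_text(fake_events))  # prints: Hello, world
```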

Step 4: Using Tools (Function Calling)

Claude can request calls to external functions or APIs, which your code then executes on its behalf. This is the foundation of building agents.

Define a tool that gets the current weather:

def get_weather(location: str) -> str:
    # In production, call a real weather API
    return f"The weather in {location} is sunny, 72°F."
# Define the tools once so both requests share the same list
tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City name, e.g., Tokyo"
                }
            },
            "required": ["location"]
        }
    }
]
messages = [{"role": "user", "content": "What's the weather in Tokyo?"}]
message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=messages,
    tools=tools
)
# Check if Claude wants to use a tool
if message.stop_reason == "tool_use":
    # Find the tool_use block by type instead of assuming its position
    tool_use = next(block for block in message.content if block.type == "tool_use")
    if tool_use.name == "get_weather":
        result = get_weather(tool_use.input["location"])
        # Send the result back to Claude, including the assistant turn that requested it
        follow_up = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=messages + [
                {"role": "assistant", "content": message.content},
                {"role": "user", "content": [
                    {
                        "type": "tool_result",
                        "tool_use_id": tool_use.id,
                        "content": result
                    }
                ]}
            ],
            tools=tools  # same tools as before
        )
        print(follow_up.content[0].text)

Key points:

Tools are defined with a name, description, and JSON schema for inputs.
Claude can decide to call a tool; you execute it and return the result.
This pattern enables agents that can query databases, call APIs, or perform calculations.
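As the number of tools grows, a chain of if statements on the tool name becomes hard to maintain. The sketch below shows one possible dispatch table that maps tool names to Python functions and builds the tool_result block. The TOOL_HANDLERS registry and run_tool helper are hypothetical names, and tool_use blocks are represented as plain dictionaries for illustration.

```python
from typing import Any, Callable, Dict


def get_weather(location: str) -> str:
    # Stand-in implementation, mirroring the example above.
    return f"The weather in {location} is sunny, 72°F."


# Hypothetical registry: tool name -> function that implements it.
TOOL_HANDLERS: Dict[str, Callable[..., str]] = {"get_weather": get_weather}


def run_tool(tool_use: Dict[str, Any]) -> Dict[str, Any]:
    """Execute one tool_use block and build the matching tool_result block."""
    result: Dict[str, Any] = {"type": "tool_result", "tool_use_id": tool_use["id"]}
    handler = TOOL_HANDLERS.get(tool_use["name"])
    if handler is None:
        result.update(content=f"Unknown tool: {tool_use['name']}", is_error=True)
        return result
    try:
        result["content"] = handler(**tool_use["input"])
    except Exception as exc:
        # Report the failure to the model instead of crashing the loop.
        result.update(content=f"Tool failed: {exc}", is_error=True)
    return result
```

Returning an error result rather than raising lets the model see what went wrong and decide how to proceed, which is one reasonable design choice among several.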

Step 5: Best Practices for Production

1. Handle Errors Gracefully

Always wrap API calls in try/except blocks and handle rate limits (429) and authentication errors (401).

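import time
from anthropic import APIStatusError, AuthenticationError, RateLimitError
try:
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Explain the concept of recursion in one sentence."}],
    )
except RateLimitError:
    print("Rate limited. Waiting before the next attempt...")
    time.sleep(2)
except AuthenticationError:
    print("Authentication failed. Check ANTHROPIC_API_KEY.")
except APIStatusError as e:
    print(f"API error {e.status_code}: {e.message}")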
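Rate limits and authentication errors call for different reactions: waiting can fix the first, but not the second. The sketch below shows one way to encode that decision by status code. The set of retryable codes is an assumption chosen for illustration, and is_retryable is a hypothetical helper, not part of the SDK.

```python
# Assumed policy: retry timeouts, conflicts, rate limits, and server-side errors.
RETRYABLE_STATUS = {408, 409, 429}


def is_retryable(status_code: int) -> bool:
    """Return True when waiting and retrying could plausibly succeed."""
    return status_code in RETRYABLE_STATUS or status_code >= 500
```

With this policy, a 429 or a 5xx response is retried, while a 400 or 401 is surfaced immediately so the request or credentials can be fixed.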

2. Manage Token Usage

Set max_tokens appropriately. It caps the number of tokens Claude generates, not the size of your input. For short answers, use 256–512. For long-form content, use 2048–4096. Monitor usage via the Anthropic console.
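To reason about cost per request, you can combine the input and output token counts reported with each response with your model's per-token prices. The sketch below is only the arithmetic; the prices in the example are placeholders, not real pricing, and estimate_cost_usd is a hypothetical helper.

```python
def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
) -> float:
    """Estimate request cost from token counts and prices per million tokens."""
    total = input_tokens * input_price_per_mtok + output_tokens * output_price_per_mtok
    return total / 1_000_000


# Placeholder prices for illustration only; check current pricing for your model.
example_cost = estimate_cost_usd(
    input_tokens=12_000,
    output_tokens=800,
    input_price_per_mtok=3.0,
    output_price_per_mtok=15.0,
)
```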

3. Use System Prompts

System prompts set the behavior and tone of Claude. Always include one for consistent results.

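message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,  # required on every request
    system="You are a helpful coding assistant. Keep answers concise and provide code examples.",
    messages=[
        {"role": "user", "content": "Explain the concept of recursion in one sentence."}
    ]
)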

4. Implement Retry Logic

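Network issues happen. Implement exponential backoff for transient failures. The official SDKs already retry some failed requests a limited number of times by default (configurable through the client's max_retries option), so custom logic is mainly useful when you need different behavior.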

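import time
from anthropic import APIConnectionError, RateLimitError
def call_with_retry(client, **kwargs):
    for attempt in range(3):
        try:
            return client.messages.create(**kwargs)
        # The SDK raises APIConnectionError for network failures, not the builtin ConnectionError
        except (RateLimitError, APIConnectionError):
            # Re-raise after the final attempt; otherwise wait 1s, then 2s
            if attempt == 2:
                raise
            time.sleep(2 ** attempt)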
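When many clients back off on the same schedule, their retries can arrive in bursts. A common variation adds random jitter and caps the delay. The sketch below is a generic version that works with any callable; the function name, parameters, and defaults are assumptions for illustration, and the sleep function is injectable so the logic can be tested without waiting.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on the given exceptions with capped, jittered backoff."""
    for attempt in range(max_attempts):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Exponential ceiling (1s, 2s, 4s, ...) capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Sleep a random duration between 0 and the ceiling.
            sleep(random.uniform(0, ceiling))
    raise ValueError("max_attempts must be at least 1")
```

For example, you might wrap a request as retry_with_backoff(lambda: client.messages.create(**kwargs), (RateLimitError, APIConnectionError)), assuming those exception classes are imported from the SDK.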

5. Keep Conversations Manageable

Each request resends the whole conversation, so long histories increase token usage and cost. Summarize or truncate older messages when the conversation exceeds a threshold (e.g., 100k tokens).
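Truncation can be sketched as keeping the newest messages that fit within a budget. The example below assumes string message content and uses a rough estimate of about four characters per token, which is a heuristic and not a real tokenizer. The helper names are hypothetical. It also drops a leading assistant message so that the trimmed history starts with a user turn.

```python
from typing import Dict, List


def estimate_tokens(text: str) -> int:
    # Crude heuristic (about 4 characters per token); an assumption, not a tokenizer.
    return max(1, len(text) // 4)


def trim_history(messages: List[Dict[str, str]], max_tokens: int) -> List[Dict[str, str]]:
    """Keep the most recent messages whose estimated size fits within max_tokens."""
    kept: List[Dict[str, str]] = []
    total = 0
    for message in reversed(messages):
        cost = estimate_tokens(message["content"])
        # Always keep the newest message, even if it alone exceeds the budget.
        if kept and total + cost > max_tokens:
            break
        kept.append(message)
        total += cost
    kept.reverse()
    # Start the trimmed history on a user turn when possible.
    while len(kept) > 1 and kept[0]["role"] != "user":
        kept.pop(0)
    return kept
```

Summarizing the dropped messages into a single short message is an alternative when older context still matters.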

Conclusion

The Claude API is straightforward to use but offers powerful features like streaming, tool use, and system prompts. By following the patterns in this guide, you can build responsive, intelligent applications that leverage Claude's full potential.

Remember to always monitor your usage, handle errors gracefully, and iterate based on real-world feedback.

Key Takeaways

Start simple: Authenticate with your API key and make your first message call before adding complexity.
Stream for UX: Always use streaming for user-facing applications to reduce latency.
Leverage tools: Function calling enables Claude to interact with external systems, making it an agent.
Handle errors: Implement retry logic and catch rate limits for production reliability.
Optimize tokens: Set appropriate max_tokens and manage conversation length to control costs.