Claude Guide · 2026-04-27

Getting Started with the Claude API: A Practical Guide for Developers

Learn how to build with Claude using the Messages API. This guide covers setup, API calls, model selection, and key features like extended thinking and tool use.

Quick Answer

This guide walks you through setting up the Claude API, making your first API call with Python, understanding the Messages API structure, choosing the right model, and exploring advanced features like extended thinking and tool use.

Tags: Claude API, Messages API, Python SDK, developer guide, AI integration

Introduction

Claude is Anthropic's family of large language models designed for a wide range of tasks—from text generation and code completion to vision processing and complex reasoning. Whether you're building a chatbot, an AI-powered coding assistant, or an enterprise workflow automation tool, the Claude API gives you direct programmatic access to these capabilities.

This guide is your practical starting point. You'll learn how to set up your environment, make your first API call, understand the core Messages API structure, choose the right model for your use case, and explore advanced features like extended thinking, tool use, and structured outputs.

Prerequisites

Before you begin, make sure you have:

  • An Anthropic account and an API key from the Developer Console
  • Python 3.8+ installed on your machine
  • Basic familiarity with REST APIs and JSON

Step 1: Make Your First API Call

Let's start by installing the official Anthropic Python SDK and sending your first message to Claude.

Install the SDK

pip install anthropic

Set Your API Key

Set your API key as an environment variable for security:

export ANTHROPIC_API_KEY="your-api-key-here"

Send Your First Message

Create a file named first_call.py:

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1000,
    messages=[
        {"role": "user", "content": "Hello, Claude! What can you help me with?"}
    ]
)

print(message.content[0].text)

Run it:

python first_call.py

You should see Claude's friendly response printed to your terminal. That's it—you've made your first API call!

Step 2: Understand the Messages API

The Messages API is the primary way to interact with Claude. It uses a simple request/response structure that supports multi-turn conversations, system prompts, and various content types.

Request Structure

Here's a more complete example showing the key parameters:

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    system="You are a helpful assistant that speaks like a pirate.",
    messages=[
        {"role": "user", "content": "What's the best way to cross the ocean?"},
        {"role": "assistant", "content": "Arr, a ship be the finest way!"},
        {"role": "user", "content": "What if I'm afraid of water?"}
    ]
)

print(response.content[0].text)

Understanding the Response

The response object contains:

  • id: Unique identifier for the message
  • model: The model that generated the response
  • role: Always "assistant"
  • content: An array of content blocks (text, tool_use, etc.)
  • stop_reason: Why the model stopped generating (e.g., "end_turn", "max_tokens", "tool_use")
  • usage: Token counts for input and output
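When you only need the plain text, it helps to flatten the content array. A small helper, sketched here with dicts standing in for SDK content blocks (the real SDK returns objects with `.type` and `.text` attributes rather than dict keys):

```python
def extract_text(content_blocks):
    """Concatenate the text of all text-type content blocks."""
    return "".join(b["text"] for b in content_blocks if b["type"] == "text")

# Dict stand-ins for a mixed content array:
blocks = [
    {"type": "text", "text": "Hello! "},
    {"type": "tool_use", "name": "get_weather", "input": {}},
    {"type": "text", "text": "How can I help?"},
]
print(extract_text(blocks))  # Hello! How can I help?
```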

Handling Stop Reasons

Stop reasons tell you why Claude stopped generating. Common values include:

  • "end_turn": The model naturally finished its response
  • "max_tokens": The response hit the max_tokens limit
  • "tool_use": The model wants to call a tool
  • "stop_sequence": A custom stop sequence was encountered
You can inspect the stop reason to decide next steps in your application logic:
if response.stop_reason == "max_tokens":
    print("Response was truncated. Consider increasing max_tokens.")
elif response.stop_reason == "tool_use":
    print("Claude wants to use a tool. Handle the tool call.")

Step 3: Choose the Right Model

Claude offers several models optimized for different use cases:

Model | Best For | Key Strength
Claude Opus 4.7 | Complex reasoning, agentic coding | Highest capability, step-change over Opus 4.6
Claude Sonnet 4.6 | Coding, agents, enterprise workflows | Frontier intelligence at scale
Claude Haiku 4.5 | High-speed, near-frontier tasks | Fastest model with strong intelligence

Recommendation: Start with Claude Sonnet 4.6 for most applications. It offers an excellent balance of capability, speed, and cost. Use Opus for tasks requiring deep reasoning, and Haiku for latency-sensitive or high-volume use cases.
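That decision can be captured as a tiny routing helper. The tier names below are informal labels, not model IDs; look up the current ID strings in the docs before wiring this into real calls:

```python
def pick_tier(needs_deep_reasoning: bool = False,
              latency_sensitive: bool = False) -> str:
    """Map rough requirements onto a model tier from the table above."""
    if needs_deep_reasoning:
        return "opus"    # deepest reasoning, highest cost
    if latency_sensitive:
        return "haiku"   # fastest, good for high volume
    return "sonnet"      # balanced default for most applications

print(pick_tier(latency_sensitive=True))  # haiku
```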

Step 4: Explore Key Features

Once you're comfortable with basic API calls, explore these powerful features:

Extended Thinking

For complex reasoning tasks, enable extended thinking to let Claude "think" before responding:

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=4096,
    thinking={"type": "enabled", "budget_tokens": 2048},
    messages=[
        {"role": "user", "content": "Solve this math problem step by step: 15 * 24 + 37"}
    ]
)

The thinking content is returned in separate content blocks from the visible response:

for block in response.content:
    if block.type == "thinking":
        print(f"Thinking: {block.thinking}")
    elif block.type == "text":
        print(f"Response: {block.text}")

Structured Outputs

Get Claude to return structured JSON data:

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Extract the name, age, and city from: 'John is 28 and lives in Boston.'"}
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "person_info",
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "integer"},
                    "city": {"type": "string"}
                },
                "required": ["name", "age", "city"]
            }
        }
    }
)

print(response.content[0].text)
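The structured result arrives as a JSON string inside the text block, so parse it before use. A sketch with a sample string standing in for the live response:

```python
import json

# Stand-in for response.content[0].text from the call above.
raw = '{"name": "John", "age": 28, "city": "Boston"}'

person = json.loads(raw)
assert all(key in person for key in ("name", "age", "city"))
print(person["name"], person["age"], person["city"])  # John 28 Boston
```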

Tool Use (Function Calling)

Claude can call external tools and APIs. Here's a minimal example:

def get_weather(location: str) -> str:
    # In production, call a real weather API
    return f"The weather in {location} is sunny, 72°F."

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=[
        {
            "name": "get_weather",
            "description": "Get the current weather for a location",
            "input_schema": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "City name"}
                },
                "required": ["location"]
            }
        }
    ],
    messages=[
        {"role": "user", "content": "What's the weather in San Francisco?"}
    ]
)

Check whether Claude wants to use a tool:

for block in response.content:
    if block.type == "tool_use":
        tool_name = block.name
        tool_input = block.input
        print(f"Claude wants to call {tool_name} with {tool_input}")
        # Execute the tool and send the result back
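To complete the loop, you run the tool yourself and send its output back as a tool_result block in a user turn, then call messages.create again with the extended conversation. A minimal sketch of packaging that follow-up message (the tool_use_id value here is a hypothetical placeholder; in real code it comes from the tool_use block Claude returned):

```python
def build_tool_result_message(tool_use_id: str, result: str) -> dict:
    """Package a tool's output as the user-turn message Claude expects next."""
    return {
        "role": "user",
        "content": [
            {
                "type": "tool_result",
                "tool_use_id": tool_use_id,
                "content": result,
            }
        ],
    }

# Example with a placeholder id and a canned tool result:
followup = build_tool_result_message(
    "toolu_123", "The weather in San Francisco is sunny, 72°F."
)
print(followup["content"][0]["type"])  # tool_result
```

Append this message (after the assistant turn containing the tool_use block) to your messages list and make another API call; Claude will then answer using the tool's output.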

Vision (Image Processing)

Claude can analyze images:

import base64

with open("chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this chart in detail."},
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data
                    }
                }
            ]
        }
    ]
)

print(response.content[0].text)

Best Practices for Production

  • Handle errors gracefully: Always wrap API calls in try/except blocks and handle rate limits (429 errors) with exponential backoff.
  • Use prompt caching: For repeated system prompts or large context, enable prompt caching to reduce costs and latency.
  • Stream responses: For real-time applications, use streaming to show tokens as they're generated.
  • Monitor token usage: Track input and output tokens to manage costs effectively.
  • Test with evaluations: Use the Anthropic Console's Evaluation Tool to test your prompts systematically.
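The first bullet can be sketched as a small retry wrapper. This version defines a stand-in exception so it runs standalone; in real code you would catch the SDK's rate-limit error (anthropic.RateLimitError) around your messages.create call:

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in for the SDK's rate-limit (429) error."""

def with_backoff(call, max_retries=5, base_delay=0.01):
    """Retry `call` on rate limits with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error to the caller
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            time.sleep(delay)

# Demo: a flaky call that fails twice, then succeeds.
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimitError("429")
    return "ok"

print(with_backoff(flaky))  # ok
```

In production you would pass `lambda: client.messages.create(...)` as `call` and use a larger `base_delay` (for example, one second).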

Next Steps

Now that you have a solid foundation, here's what to explore next:

  • Claude Cookbook: Interactive Jupyter notebooks covering PDFs, embeddings, and more
  • API Reference: Full documentation for all endpoints and parameters
  • Prompt Engineering Guide: Best practices for crafting effective prompts
  • Managed Agents: For long-running, asynchronous tasks without managing infrastructure

Key Takeaways

  • The Claude API is accessed via the Messages API, which supports multi-turn conversations, system prompts, and multiple content types.
  • Start with Claude Sonnet 4.6 for most use cases—it offers the best balance of capability, speed, and cost.
  • Key features like extended thinking, structured outputs, tool use, and vision can dramatically expand what you can build.
  • Always handle stop reasons and errors in production code to build robust applications.
  • Use the Anthropic Developer Console and Cookbook to prototype, test, and learn interactively.