Claude Guide · 2026-04-22

Mastering the Messages API: A Practical Guide to Building with Claude

Learn how to use the Claude Messages API for basic requests, multi-turn conversations, prefill techniques, and vision. Includes code examples in Python and TypeScript.

Quick Answer

This guide covers the core patterns of the Claude Messages API: making basic requests, building multi-turn conversations, using prefill to shape responses, and sending images for vision tasks. You'll get practical code examples and best practices for each pattern.

Messages API · Claude API · Multi-turn conversations · Prefill · Vision

Introduction

The Claude Messages API is the primary way to interact with Claude programmatically. Whether you're building a chatbot, a content generator, or a vision-enabled application, understanding the Messages API is essential. This guide walks you through the most common and powerful patterns: basic requests, multi-turn conversations, prefill techniques, and vision capabilities.

Basic Request and Response

At its simplest, the Messages API lets you send a user message and receive Claude's response. Here's a minimal example in Python:

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)
print(message)

And the equivalent in TypeScript:

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const message = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Hello, Claude' }
  ]
});
console.log(message);

The response includes the model's reply, usage statistics, and a stop_reason that tells you why the response ended (e.g., "end_turn" for a natural completion).
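To make that concrete, here is a minimal sketch of pulling the reply text and stop reason out of a response. The field names (`content`, `stop_reason`, `usage`) follow the Messages API response shape; the helper itself, `summarize_response`, is a hypothetical name for illustration, and the response is shown as a plain dict rather than an SDK object:

```python
# Hypothetical helper: extract the reply text and stop_reason from a
# Messages API response represented as a plain dict.
def summarize_response(response: dict) -> tuple[str, str]:
    # content is a list of blocks; join all text blocks into one string
    text = "".join(
        block["text"] for block in response["content"]
        if block["type"] == "text"
    )
    return text, response["stop_reason"]

# Example response in the documented shape
response = {
    "content": [{"type": "text", "text": "Hello! How can I help you today?"}],
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 10, "output_tokens": 9},
}
text, reason = summarize_response(response)
print(reason)  # end_turn
```

The same pattern works on SDK objects via attribute access (e.g. `message.content[0].text`).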

Multi-Turn Conversations

The Messages API is stateless—you must send the full conversation history with every request. This gives you complete control over context. To continue a conversation, simply append new turns to the messages array:

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"},
        {"role": "assistant", "content": "Hello!"},
        {"role": "user", "content": "Can you describe LLMs to me?"}
    ]
)
print(message)

You can also include synthetic assistant messages—responses that didn't actually come from Claude. This is useful for:

  • Injecting system-like instructions in the conversation flow
  • Correcting or guiding Claude's behavior
  • Simulating a conversation history for testing
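One way to manage this statelessness is a small history helper. The function below is a hypothetical sketch, not part of the anthropic SDK; it simply builds up the `messages` list you re-send on every request, and works the same whether an assistant turn came from Claude or is synthetic:

```python
# Hypothetical helper (not part of the anthropic SDK): maintain the
# conversation history that must be re-sent with every request.
def add_turn(history: list, role: str, content: str) -> list:
    if role not in ("user", "assistant"):
        raise ValueError(f"unknown role: {role}")
    # Return a new list so the caller can keep earlier snapshots if needed
    return history + [{"role": role, "content": content}]

history = []
history = add_turn(history, "user", "Hello, Claude")
history = add_turn(history, "assistant", "Hello!")  # may be synthetic
history = add_turn(history, "user", "Can you describe LLMs to me?")

# Pass the accumulated list as messages=history in client.messages.create(...)
print(len(history))  # 3
```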

Prefill: Putting Words in Claude's Mouth

Prefill lets you start Claude's response by providing the beginning of its answer. This is powerful for:

  • Forcing a specific format (e.g., JSON, multiple choice)
  • Guiding the tone or structure of the response
  • Reducing token usage for predictable outputs

Here's an example that forces Claude to answer a multiple-choice question with a single letter:

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1,
    messages=[
        {
            "role": "user",
            "content": "What is the Latin name for ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"
        },
        {
            "role": "assistant",
            "content": "The answer is ("
        }
    ]
)
print(message)

By setting max_tokens=1 and prefilling with "The answer is (", Claude will only generate the next token—likely "C".
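Prefill is equally handy for forcing JSON. One detail to remember: the response contains only the newly generated tokens, not the prefill, so you must prepend the prefill yourself before parsing. A minimal sketch, assuming the assistant turn was prefilled with `{` and the hypothetical completion string below is what the model returned:

```python
import json

def parse_prefilled_json(prefill: str, completion: str) -> dict:
    # The API response contains only the newly generated tokens,
    # so re-attach the prefill before parsing.
    return json.loads(prefill + completion)

# Suppose we prefilled the assistant turn with "{" and the model
# completed the rest of the object:
completion = '"answer": "C", "confidence": "high"}'
result = parse_prefilled_json("{", completion)
print(result["answer"])  # C
```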

Note: Prefill is not supported on Claude Mythos Preview, Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6. For those models, use structured outputs or system prompt instructions instead.
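For those models, a system prompt can push toward the same constrained output. Here is a minimal sketch of building the request parameters; the instruction wording and the `constrained_request` helper are illustrative, not a required phrasing:

```python
# Build request kwargs that use a system prompt instead of prefill.
# The instruction text here is illustrative, not a required phrasing.
def constrained_request(question: str) -> dict:
    return {
        "model": "claude-sonnet-4-6",
        "max_tokens": 5,
        "system": "Answer with a single letter: A, B, or C. No other text.",
        "messages": [{"role": "user", "content": question}],
    }

params = constrained_request(
    "What is the Latin name for ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"
)
# Pass as client.messages.create(**params)
print(sorted(params))
```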

Vision: Sending Images to Claude

Claude can analyze images sent via the Messages API. You can provide images as base64-encoded data or as a URL. Here's how to send a base64-encoded image:

import anthropic
import base64

client = anthropic.Anthropic()

with open("chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data
                    }
                },
                {
                    "type": "text",
                    "text": "Describe this chart in detail."
                }
            ]
        }
    ]
)
print(message.content[0].text)

You can also use a URL:

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "url",
                        "url": "https://example.com/chart.png"
                    }
                },
                {
                    "type": "text",
                    "text": "What trends do you see in this data?"
                }
            ]
        }
    ]
)

Supported media types include image/jpeg, image/png, image/gif, and image/webp. For best results, use images under 20MB and with reasonable dimensions.
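Following those limits, a pre-flight check before encoding an image might look like this. The media types and the 20MB cap come from the guidance above; the helper `image_media_type` is a hypothetical sketch that maps a file extension to the `media_type` field the API expects:

```python
# Media types supported by the Messages API, keyed by file extension.
SUPPORTED_MEDIA_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}
MAX_IMAGE_BYTES = 20 * 1024 * 1024  # stay under 20MB

def image_media_type(filename: str, data: bytes) -> str:
    # Map the file extension to a supported media_type and check size.
    ext = "." + filename.rsplit(".", 1)[-1].lower()
    if ext not in SUPPORTED_MEDIA_TYPES:
        raise ValueError(f"unsupported image type: {ext}")
    if len(data) > MAX_IMAGE_BYTES:
        raise ValueError("image exceeds 20MB limit")
    return SUPPORTED_MEDIA_TYPES[ext]

print(image_media_type("chart.PNG", b"\x89PNG..."))  # image/png
```

The returned string can be dropped straight into the `media_type` field of the image source block shown earlier.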

Best Practices

  • Always set max_tokens to control response length and cost.
  • Use stop_reason in the response to handle different scenarios (e.g., "max_tokens" means the response was cut off).
  • Keep conversation history concise to stay within context windows and reduce latency.
  • Use prefill sparingly—it's powerful but can lead to unexpected behavior if not carefully tested.
  • For vision tasks, provide clear text instructions alongside images for best results.
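As a concrete example of the stop_reason practice, a dispatcher might look like the sketch below. The reasons shown are a common subset; real responses can include others (for example, tool_use), and the return labels are just illustrative:

```python
def handle_stop_reason(response: dict) -> str:
    # Decide what to do based on why the response ended.
    reason = response["stop_reason"]
    if reason == "end_turn":
        return "complete"   # natural completion, safe to use as-is
    if reason == "max_tokens":
        return "truncated"  # cut off; consider raising max_tokens
    if reason == "stop_sequence":
        return "stopped"    # hit a custom stop sequence
    return "other"          # e.g. tool_use, handled elsewhere

print(handle_stop_reason({"stop_reason": "max_tokens"}))  # truncated
```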

Key Takeaways

  • The Messages API is stateless; always send the full conversation history.
  • Prefill lets you shape Claude's response by providing the beginning of its answer.
  • Vision support allows you to send images as base64 or URL for analysis.
  • Synthetic assistant messages give you fine-grained control over conversation flow.
  • Always handle stop_reason in your application logic to manage incomplete responses.