Claude Guide
2026-05-05

Mastering the Messages API: A Practical Guide to Building with Claude

Learn how to use Claude's Messages API for basic requests, multi-turn conversations, prefill techniques, and vision. Includes code examples in Python and TypeScript.

Quick Answer

This guide covers the core patterns of Claude's Messages API: making basic requests, building multi-turn conversations, using prefill to shape responses, and handling images with vision. You'll get practical code examples and best practices for each pattern.

Messages API · Claude API · Multi-turn conversations · Prefill · Vision

Introduction

Claude's Messages API is the primary way to interact with the model programmatically. Whether you're building a chatbot, a content generator, or a vision-enabled application, understanding the Messages API is essential. This guide walks you through the most common patterns—from simple requests to advanced techniques like prefill and vision—with practical code examples in Python and TypeScript.

Basic Request and Response

At its simplest, the Messages API takes a list of messages and returns a response. Here's how to send a basic greeting to Claude:

Python

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)
print(message)

TypeScript

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const message = await client.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 1024,
  messages: [
    { role: "user", content: "Hello, Claude" }
  ]
});
console.log(message);

Response Structure

The response includes:

  • id: Unique message identifier
  • type: Object type (always "message")
  • role: Always "assistant"
  • content: Array of content blocks (typically text)
  • model: The model used
  • stop_reason: Why generation stopped (e.g., "end_turn", "max_tokens")
  • stop_sequence: The custom stop sequence that was matched, or null
  • usage: Token counts for input and output

Example output:
{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [{"type": "text", "text": "Hello!"}],
  "model": "claude-opus-4-7",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {"input_tokens": 12, "output_tokens": 6}
}
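Since content is an array, a response can contain more than one block. As a minimal sketch, here is one way you might pull the text out of a response shaped like the example above (a real SDK call returns an object whose blocks expose .type and .text attributes rather than a plain dict):

```python
# Extract the concatenated text from a Messages API response dict.
# `response` mirrors the example output above.
def extract_text(response: dict) -> str:
    """Join the text of all text-type content blocks."""
    return "".join(
        block["text"]
        for block in response["content"]
        if block["type"] == "text"
    )

response = {
    "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
    "type": "message",
    "role": "assistant",
    "content": [{"type": "text", "text": "Hello!"}],
    "model": "claude-opus-4-7",
    "stop_reason": "end_turn",
    "stop_sequence": None,
    "usage": {"input_tokens": 12, "output_tokens": 6},
}
print(extract_text(response))  # Hello!
```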

Multi-Turn Conversations

The Messages API is stateless—you must send the full conversation history with each request. This gives you complete control over context but requires you to manage the conversation state on your end.

Building a Conversation

To continue a conversation, append new messages to the history:

import anthropic

client = anthropic.Anthropic()

# Start the conversation
messages = [
    {"role": "user", "content": "Hello, Claude"},
    {"role": "assistant", "content": "Hello!"},
    {"role": "user", "content": "Can you describe LLMs to me?"}
]

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=messages
)
print(message.content[0].text)

Synthetic Assistant Messages

You can inject synthetic assistant messages into the history. This is useful for:

  • Guiding the conversation: Pre-filling context that Claude didn't generate
  • Role-playing scenarios: Setting up a character or persona
  • Correcting mistakes: Providing the correct answer for Claude to build upon
messages = [
    {"role": "user", "content": "What's the capital of France?"},
    # Synthetic assistant message to set context
    {"role": "assistant", "content": "The capital of France is Paris."},
    {"role": "user", "content": "What's its population?"}
]

Prefill: Putting Words in Claude's Mouth

Prefill allows you to start Claude's response by providing the beginning of its answer. This is powerful for:

  • Constraining output format (e.g., JSON, multiple choice)
  • Setting tone or style
  • Reducing token usage by guiding the response early

Multiple Choice Example

Here's how to get a single-letter answer from Claude:

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1,
    messages=[
        {
            "role": "user",
            "content": "What is Latin for Ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"
        },
        # Prefill: start the assistant's turn to constrain the reply
        {"role": "assistant", "content": "The answer is ("}
    ]
)
print(message.content[0].text)  # Output: "C"

Important Prefill Limitations

Prefill is not supported on the following models:

  • Claude Mythos Preview
  • Claude Opus 4.7
  • Claude Opus 4.6
  • Claude Sonnet 4.6
Requests using prefill with these models will return a 400 error. For these models, use structured outputs or system prompt instructions instead.
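For the models above, the same multiple-choice constraint can be expressed through a system prompt instead. A minimal sketch of the request payload (the model name is reused from earlier examples and the system text is illustrative, not prescriptive):

```python
# System-prompt alternative for models that reject prefill.
# The instruction constrains the answer format instead of seeding the reply.
request = {
    "model": "claude-opus-4-7",
    "max_tokens": 5,
    "system": "Answer multiple-choice questions with the single letter only, e.g. C.",
    "messages": [
        {
            "role": "user",
            "content": "What is Latin for Ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae",
        }
    ],
}

# The same dict can be splatted into the SDK call:
# message = client.messages.create(**request)
```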

Vision: Working with Images

Claude can process images sent via the Messages API. This enables use cases like image analysis, document OCR, and visual question answering.

Sending an Image

Images can be sent as base64-encoded data or via URL:

import anthropic
import base64

client = anthropic.Anthropic()

# Read and encode image
with open("chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data
                    }
                },
                {"type": "text", "text": "Describe this chart in detail."}
            ]
        }
    ]
)
print(message.content[0].text)
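The example above covers local files; for hosted images, a URL source block can replace the base64 one. A sketch of the message structure (the URL is a placeholder):

```python
# Content blocks for a URL-sourced image: the "source" block uses
# {"type": "url"} instead of {"type": "base64"}, so no local encoding is needed.
url_image_message = {
    "role": "user",
    "content": [
        {
            "type": "image",
            "source": {
                "type": "url",
                "url": "https://example.com/chart.png",
            },
        },
        {"type": "text", "text": "Describe this chart in detail."},
    ],
}
```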

Supported Image Types

  • JPEG
  • PNG
  • GIF
  • WebP
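Each format maps to a media_type string in the image source block. A small helper like the hypothetical one below keeps those strings out of your call sites:

```python
import pathlib

# Map the supported image formats to their media_type strings.
MEDIA_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}

def media_type_for(path: str) -> str:
    """Return the media_type for a supported image file, or raise."""
    suffix = pathlib.Path(path).suffix.lower()
    try:
        return MEDIA_TYPES[suffix]
    except KeyError:
        raise ValueError(f"Unsupported image type: {suffix}")

print(media_type_for("chart.png"))  # image/png
```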

Best Practices for Vision

  • Use high-resolution images when details matter
  • Combine with text prompts for precise instructions
  • Keep images under 20MB for optimal performance
  • Use multiple images in a single message for comparison tasks
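The size guideline is easy to enforce at encode time. A minimal sketch of a guard around the base64 step (the function name and limit constant are illustrative):

```python
import base64

MAX_IMAGE_BYTES = 20 * 1024 * 1024  # the 20MB guideline above

def encode_image_bytes(data: bytes) -> str:
    """Base64-encode raw image bytes, enforcing the size guideline."""
    if len(data) > MAX_IMAGE_BYTES:
        raise ValueError(f"Image is {len(data)} bytes; keep it under 20MB")
    return base64.b64encode(data).decode("utf-8")

# with open("chart.png", "rb") as f:
#     image_data = encode_image_bytes(f.read())
```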

Handling Stop Reasons

The stop_reason field in the response tells you why Claude stopped generating. Common values:

stop_reason      Meaning
end_turn         Claude finished naturally
max_tokens       Response hit the token limit
stop_sequence    A stop sequence was encountered
tool_use         Claude wants to use a tool

Example: Handling max_tokens

If you get max_tokens, you can continue the conversation by sending the partial response back:

if message.stop_reason == "max_tokens":
    # Append Claude's partial response to history
    messages.append({"role": "assistant", "content": message.content[0].text})
    messages.append({"role": "user", "content": "Please continue."})
    
    # Send continuation request
    continuation = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=1024,
        messages=messages
    )

Streaming Responses

For real-time applications, use streaming to receive tokens as they're generated:

import anthropic

client = anthropic.Anthropic()

with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Tell me a story"}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
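If you also need the full text after streaming it, you can accumulate chunks as you print. A minimal sketch, where the iterable stands in for stream.text_stream:

```python
from typing import Iterable

def collect_stream(text_stream: Iterable[str]) -> str:
    """Print chunks as they arrive and return the full response text."""
    chunks = []
    for text in text_stream:
        print(text, end="", flush=True)
        chunks.append(text)
    return "".join(chunks)

# full_text = collect_stream(stream.text_stream)
```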

Error Handling

Always handle potential errors gracefully:

import time

import anthropic
from anthropic import APIError, APIConnectionError, RateLimitError

client = anthropic.Anthropic()

try:
    message = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello"}]
    )
except RateLimitError:
    print("Rate limit exceeded. Retrying...")
    time.sleep(5)
except APIConnectionError:
    print("Network error. Check your connection.")
except APIError as e:
    print(f"API error: {e}")
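For rate limits in particular, a fixed sleep is often replaced with exponential backoff. A self-contained sketch (a generic RuntimeError stands in for anthropic.RateLimitError, and `call` would wrap client.messages.create(...)):

```python
import time

def with_retries(call, max_attempts=3, base_delay=1.0):
    """Retry a zero-argument callable with exponential backoff.

    Catches RuntimeError as a stand-in for the SDK's RateLimitError;
    delays double on each failed attempt (base_delay, 2x, 4x, ...).
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except RuntimeError:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)

# result = with_retries(lambda: client.messages.create(...))
```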

Key Takeaways

  • The Messages API is stateless—you must send the full conversation history with each request. Manage conversation state on your end.
  • Use prefill to constrain outputs for multiple choice, JSON, or structured responses, but check model compatibility first.
  • Vision support enables image analysis—send base64-encoded images alongside text prompts for powerful multimodal applications.
  • Handle stop reasons appropriately—max_tokens means you can continue the conversation; end_turn means Claude finished naturally.
  • Stream responses for real-time UX—use the streaming API to show tokens as they're generated, improving perceived performance.