Guide | Beginner | API | 2026-05-12

Mastering the Messages API: Build Multi-Turn Conversations with Claude

Learn how to use Claude's Messages API for basic requests, multi-turn conversations, prefill techniques, and vision capabilities with practical code examples.

Quick Answer

This guide teaches you how to use Claude's Messages API to build conversational applications, including sending basic requests, managing multi-turn dialogues, pre-filling responses, and processing images.

Tags: Messages API, Claude API, multi-turn conversations, prefill, vision

Introduction

The Messages API is the primary way to interact with Claude programmatically. Whether you're building a chatbot, a document analysis tool, or a creative writing assistant, understanding how to structure your API calls is essential. This guide covers the core patterns you'll use daily: basic requests, multi-turn conversations, prefill techniques, and vision capabilities.

Basic Request and Response

At its simplest, a Messages API call requires three things:

model: The Claude model you want to use (e.g., claude-opus-4-7)
max_tokens: The maximum number of tokens in Claude's response
messages: An array of message objects, each with a role and content

Here's a minimal example in Python:

import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)
print(message)

The response includes:

id: Unique message identifier
role: Always "assistant"
content: Array of content blocks (typically text)
model: The model used
stop_reason: Why generation stopped ("end_turn", "max_tokens", etc.)
usage: Token counts for input and output

Example output:

{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [{"type": "text", "text": "Hello!"}],
  "model": "claude-opus-4-7",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {"input_tokens": 12, "output_tokens": 6}
}
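
Because content is a list of blocks rather than a single string, it can help to join the text blocks explicitly. The following is a minimal sketch that works on the parsed JSON body shown above. The helper name extract_text is an illustrative assumption, not part of the SDK.

```python
def extract_text(response: dict) -> str:
    """Join the text of every "text" block in a Messages API response body."""
    return "".join(
        block["text"]
        for block in response.get("content", [])
        if block.get("type") == "text"
    )


# The example response from above, as a plain dictionary.
example = {
    "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
    "type": "message",
    "role": "assistant",
    "content": [{"type": "text", "text": "Hello!"}],
    "model": "claude-opus-4-7",
    "stop_reason": "end_turn",
    "stop_sequence": None,
    "usage": {"input_tokens": 12, "output_tokens": 6},
}

print(extract_text(example))  # prints: Hello!
```

With the Python SDK the same idea applies to message.content, whose blocks are read through attributes (as in message.content[0].text) rather than dictionary keys.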

Multi-Turn Conversations

The Messages API is stateless—you must send the full conversation history with each request. This gives you complete control over context but requires careful management.

Building a Conversation

To continue a conversation, append both Claude's previous response and the user's new message to the messages array:

import anthropic
client = anthropic.Anthropic()
# First turn
messages = [
    {"role": "user", "content": "What is the capital of France?"}
]
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=messages
)
# Add Claude's response to history
messages.append({"role": "assistant", "content": response.content[0].text})
# Add user's follow-up
messages.append({"role": "user", "content": "What about Italy?"})
# Second turn
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=messages
)
print(response.content[0].text)
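
The append-and-resend pattern above can be wrapped in a small helper. This is a sketch under stated assumptions: Conversation and the send callable are hypothetical names, and send stands in for whatever function actually calls client.messages.create and returns the reply text. A fake send is used here so the history handling can be seen without a network call.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


class Conversation:
    """Keeps the full history and resends it on every turn (the API is stateless)."""

    def __init__(self, send: Callable[[List[Message]], str]) -> None:
        self.send = send  # hypothetical hook: takes the history, returns reply text
        self.messages: List[Message] = []

    def ask(self, text: str) -> str:
        self.messages.append({"role": "user", "content": text})
        # Pass a copy so the callee cannot mutate our history.
        reply = self.send(list(self.messages))
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def fake_send(messages: List[Message]) -> str:
    """Stand-in for a real API call; reports how many messages it received."""
    return f"received {len(messages)} message(s)"


chat = Conversation(fake_send)
print(chat.ask("What is the capital of France?"))  # received 1 message(s)
print(chat.ask("What about Italy?"))               # received 3 message(s)
```

The second call receives three messages because the first question and its reply are resent along with the follow-up. In real use, send would pass the list to client.messages.create and return response.content[0].text.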

Synthetic Assistant Messages

You can inject pre-written assistant messages into the history. This is useful for:

Setting up a scenario or persona
Providing example responses (few-shot prompting)
Correcting or editing Claude's past responses

messages = [
    {"role": "user", "content": "Explain quantum computing in simple terms."},
    {"role": "assistant", "content": "Quantum computing uses qubits that can be 0 and 1 simultaneously, unlike classical bits."},
    {"role": "user", "content": "Give me an analogy."}
]
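
For few-shot prompting, the same structure can be generated from a list of example pairs. A minimal sketch follows. The function name build_few_shot and the sentiment examples are illustrative assumptions, not taken from the API.

```python
from typing import Dict, List, Tuple


def build_few_shot(examples: List[Tuple[str, str]], question: str) -> List[Dict[str, str]]:
    """Turn (user, assistant) example pairs plus a final question into a messages list."""
    messages: List[Dict[str, str]] = []
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        # Synthetic assistant turn: written by us, not generated by Claude.
        messages.append({"role": "assistant", "content": assistant_text})
    # The history ends on a user turn so that Claude answers the real question.
    messages.append({"role": "user", "content": question})
    return messages


messages = build_few_shot(
    examples=[
        ("Classify: 'I love this!'", "positive"),
        ("Classify: 'This is awful.'", "negative"),
    ],
    question="Classify: 'Not bad at all.'",
)
for m in messages:
    print(m["role"], "->", m["content"])
```

The resulting list alternates user and assistant roles and can be passed directly as the messages argument.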

Prefill: Putting Words in Claude's Mouth

Prefill allows you to start Claude's response by providing the beginning of its answer. This is powerful for:

Enforcing a specific format (e.g., JSON, multiple choice)
Guiding the tone or direction
Reducing token usage by constraining output

Example: Multiple Choice

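import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1,  # one token is enough for a single option letter
    messages=[
        {"role": "user", "content": "What is the best programming language for beginners?\n(A) Python\n(B) Java\n(C) C++\n(D) Rust"},
        # Prefill: Claude continues from this text, so the next token is the letter
        {"role": "assistant", "content": "The best answer is ("}
    ]
)
# The response holds only the continuation, not the prefill itself
print(message.content[0].text)  # e.g. A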

Important: Prefill is not supported on Claude Mythos Preview, Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6. For these models, use structured outputs or system prompt instructions instead.
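
On models that do support prefill, one detail is worth making explicit: the response contains only the text generated after the prefill, so the prefill has to be prepended again before parsing. A minimal sketch, assuming a prefill of "{" to steer the output toward JSON. The helper name join_prefill is hypothetical, and the continuation string is a made-up stand-in for message.content[0].text, not real model output.

```python
import json


def join_prefill(prefill: str, continuation: str) -> dict:
    """Re-attach the prefill to the model's continuation and parse the result as JSON."""
    return json.loads(prefill + continuation)


# The final assistant message in the request would be {"role": "assistant", "content": prefill}.
prefill = "{"
# Hypothetical continuation, standing in for message.content[0].text.
continuation = '"language": "Python", "reason": "readable syntax"}'

data = join_prefill(prefill, continuation)
print(data["language"])  # prints: Python
```

If the model's output is cut off by max_tokens, the joined string may not be valid JSON, so json.loads can raise json.JSONDecodeError and callers may want to handle that case.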

Vision: Working with Images

Claude can process images alongside text. You can supply images in three ways:

base64: Inline base64-encoded image data
url: Publicly accessible image URL
file: Reference to a file uploaded via the Files API

Supported media types: image/jpeg, image/png, image/gif, image/webp

Example with Base64

import anthropic
import base64
client = anthropic.Anthropic()
with open("photo.jpg", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/jpeg",
                        "data": image_data
                    }
                }
            ]
        }
    ]
)
print(message.content[0].text)
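
The image block above can be built by a small helper that also rejects files outside the supported media types. This is a sketch under stated assumptions: image_block and MEDIA_TYPES are hypothetical names, and the media type is inferred from the file extension only, without inspecting the file contents. The demo writes placeholder bytes to a temporary file purely to show the resulting structure.

```python
import base64
import tempfile
from pathlib import Path

# Media types listed as supported above, keyed by file extension.
MEDIA_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def image_block(path: str) -> dict:
    """Build a base64 image content block, rejecting unsupported extensions."""
    file = Path(path)
    media_type = MEDIA_TYPES.get(file.suffix.lower())
    if media_type is None:
        raise ValueError(f"Unsupported image type: {file.suffix or '(none)'}")
    data = base64.b64encode(file.read_bytes()).decode("utf-8")
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": data},
    }


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "photo.PNG"
    sample.write_bytes(b"not a real image")  # placeholder bytes for the demo only
    block = image_block(str(sample))

print(block["source"]["media_type"])  # prints: image/png
```

The returned dictionary can be placed in the content list of a user message alongside a text block, as in the example above.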

Example with URL

import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image"},
                {
                    "type": "image",
                    "source": {
                        "type": "url",
                        "url": "https://example.com/photo.jpg"
                    }
                }
            ]
        }
    ]
)
print(message.content[0].text)

Best Practices

Manage token usage: Monitor usage.input_tokens and usage.output_tokens to control costs. Use max_tokens to limit response length.
Handle stop reasons: Check stop_reason in responses. "end_turn" means Claude finished naturally; "max_tokens" means the response was cut off.
Use streaming for long responses: For real-time applications, enable streaming to get partial results as Claude generates them.
Cache frequent prefixes: Use prompt caching for system prompts or long conversation histories to reduce latency and cost.
Validate image sizes: Large images consume more tokens. Resize or compress images before sending to optimize performance.
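
The stop_reason and usage checks can be sketched as two small functions over parsed response bodies. The names check_response and total_tokens are hypothetical, and the two sample responses are invented values used only to illustrate the arithmetic.

```python
from typing import Dict, List


def check_response(response: dict) -> Dict[str, object]:
    """Summarize completeness and token usage for one response body."""
    usage = response.get("usage", {})
    return {
        # "max_tokens" means the output hit the limit and was cut off.
        "truncated": response.get("stop_reason") == "max_tokens",
        "input_tokens": usage.get("input_tokens", 0),
        "output_tokens": usage.get("output_tokens", 0),
    }


def total_tokens(responses: List[dict]) -> int:
    """Sum input and output tokens across several responses."""
    total = 0
    for response in responses:
        summary = check_response(response)
        total += summary["input_tokens"] + summary["output_tokens"]
    return total


# Invented sample responses: one complete, one cut off at the limit.
finished = {"stop_reason": "end_turn", "usage": {"input_tokens": 12, "output_tokens": 6}}
cut_off = {"stop_reason": "max_tokens", "usage": {"input_tokens": 40, "output_tokens": 1024}}

print(check_response(cut_off)["truncated"])  # prints: True
print(total_tokens([finished, cut_off]))     # prints: 1082
```

When truncated is true, an application might retry with a larger max_tokens or ask Claude to continue, depending on the use case.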

Key Takeaways

The Messages API is stateless—always send the full conversation history with each request.
Prefill lets you control the beginning of Claude's response, useful for formatting and guidance.
Claude supports vision with images in base64, URL, or file reference formats.
Synthetic assistant messages allow you to inject example responses or correct past interactions.
Always check stop_reason and usage fields to monitor response completeness and costs.