Guide | Beginner | API | 2026-05-15

Mastering the Messages API: A Practical Guide to Building Conversations with Claude

Learn how to use Claude's Messages API for basic requests, multi-turn conversations, prefill techniques, and vision capabilities with practical code examples.

Quick Answer

This guide shows how to use Claude's Messages API for basic requests, multi-turn conversations, prefill techniques that shape responses, and vision capabilities for image analysis, all with practical Python code examples.

Tags: Messages API, Conversation Design, Claude API, Prefill, Vision

Claude's Messages API is the primary way to interact with the model programmatically. Whether you're building a chatbot, a content generation tool, or a complex agent system, understanding how to structure messages is essential. This guide walks you through everything from basic requests to advanced techniques like prefill and vision.

Understanding the Messages API vs. Claude Managed Agents

Before diving in, it's important to know that Anthropic offers two approaches for building with Claude:

Messages API: Direct model prompting access—you control every aspect of the conversation loop. Best for custom agent loops and fine-grained control.
Claude Managed Agents: A pre-built, configurable agent harness that runs in managed infrastructure. Best for long-running tasks and asynchronous work.

This guide focuses on the Messages API, which gives you maximum flexibility.

Making Your First API Request

The simplest interaction with Claude involves sending a single user message and receiving a response. Here's how it looks in Python:

import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)
print(message)

The response includes several important fields:

{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [{"type": "text", "text": "Hello!"}],
  "model": "claude-opus-4-7",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 12,
    "output_tokens": 6
  }
}

Key Response Fields

id: Unique identifier for the message
content: Array of content blocks (text, tool_use, etc.)
stop_reason: Why Claude stopped generating (end_turn, max_tokens, stop_sequence, or tool_use)
usage: Token counts for billing and monitoring
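
The fields above can be read off a response with a few lines of code. The sketch below works on a plain dictionary shaped like the JSON shown earlier. `summarize_response` is a hypothetical helper name, not part of the SDK; the SDK's response objects expose the same fields as attributes (for example `message.stop_reason`) rather than dictionary keys.

```python
from typing import Any


def summarize_response(response: dict[str, Any]) -> dict[str, Any]:
    """Pull the commonly used fields out of a Messages API response dict."""
    # Join only the text blocks; other block types (e.g. tool_use) are skipped.
    text = "".join(
        block["text"] for block in response["content"] if block["type"] == "text"
    )
    usage = response["usage"]
    return {
        "id": response["id"],
        "text": text,
        "stop_reason": response["stop_reason"],
        "total_tokens": usage["input_tokens"] + usage["output_tokens"],
    }


# Sample values copied from the response shown in this guide.
sample = {
    "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
    "type": "message",
    "role": "assistant",
    "content": [{"type": "text", "text": "Hello!"}],
    "model": "claude-opus-4-7",
    "stop_reason": "end_turn",
    "stop_sequence": None,
    "usage": {"input_tokens": 12, "output_tokens": 6},
}

summary = summarize_response(sample)
print(summary["text"], summary["total_tokens"])
```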

Building Multi-Turn Conversations

The Messages API is stateless—you must send the full conversation history with every request. This gives you complete control over context but requires careful management.

Basic Multi-Turn Example

import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"},
        {"role": "assistant", "content": "Hello!"},
        {"role": "user", "content": "Can you describe LLMs to me?"}
    ]
)
print(message.content[0].text)
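
Because the API is stateless, it often helps to wrap the history in a small object that resends it on every turn. The following is a minimal sketch, not an SDK feature: `Conversation` and `fake_send` are hypothetical names, and `fake_send` stands in for the real API call so the example runs offline.

```python
from typing import Callable

Message = dict[str, str]


class Conversation:
    """Keeps the full message history and resends it on every turn."""

    def __init__(self, send: Callable[[list[Message]], str]) -> None:
        self._send = send
        self.messages: list[Message] = []

    def ask(self, user_text: str) -> str:
        self.messages.append({"role": "user", "content": user_text})
        # Pass a copy so the sender cannot mutate the stored history.
        reply = self._send(list(self.messages))
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def fake_send(history: list[Message]) -> str:
    """Stand-in for the API call; reports how many messages it received."""
    return f"received {len(history)} message(s)"


chat = Conversation(fake_send)
first_reply = chat.ask("Hello, Claude")
second_reply = chat.ask("Can you describe LLMs to me?")
print(second_reply)
```

In real use, `send` would wrap `client.messages.create(...)` and return `message.content[0].text`. The second call receives three messages, which mirrors the example above.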

Managing Conversation State

Since you control the history, you can:

Trim old messages to stay within context windows
Set a system prompt through the top-level system parameter (it is not a message role)
Add synthetic assistant messages to guide behavior
Persist conversations across sessions by storing message arrays

Pro Tip: Earlier assistant turns don't need to have been generated by Claude. You can write synthetic assistant messages yourself to supply context or correct behavior, with no extra API call needed.
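
Persisting a conversation amounts to serializing the message array. Here is one possible sketch using JSON files; `save_history`, `load_history`, and the file name are illustrative assumptions, and a production system might use a database instead. The stored history also includes a hand-written (synthetic) assistant turn.

```python
import json
import tempfile
from pathlib import Path


def save_history(path: Path, messages: list[dict]) -> None:
    """Write the message array to disk as JSON."""
    path.write_text(json.dumps(messages, indent=2), encoding="utf-8")


def load_history(path: Path) -> list[dict]:
    """Read a saved message array, or start fresh if none exists."""
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


history = [
    {"role": "user", "content": "Hello, Claude"},
    # Synthetic assistant turn: written by the application, not by the model.
    {"role": "assistant", "content": "Hello! I will answer in one sentence."},
]

with tempfile.TemporaryDirectory() as tmp:
    history_file = Path(tmp) / "conversation.json"
    save_history(history_file, history)
    restored = load_history(history_file)
    missing = load_history(Path(tmp) / "no-such-file.json")
```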

Putting Words in Claude's Mouth: The Prefill Technique

Prefilling allows you to start Claude's response for it. This is incredibly useful for:

Constraining output format (e.g., getting a single letter answer)
Guiding response structure (e.g., starting with "The answer is")
Reducing token usage for predictable responses

Example: Multiple Choice Answer

import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1,
    messages=[
        {
            "role": "user",
            "content": "What is Latin for ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"
        },
        {
            "role": "assistant",
            "content": "The answer is ("
        }
    ]
)
print(message.content[0].text)  # Outputs: "C"

Important Prefill Limitations

Not supported on: Claude Mythos Preview, Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6
These models return a 400 error for prefill requests
Alternative: Use structured outputs or system prompt instructions instead
Check the migration guide for patterns
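
One way to cope with these limitations is to choose between prefill and a system prompt instruction when building the request. The sketch below is an assumption-laden illustration: `build_request` is a hypothetical helper, the set of model IDs contains only `claude-opus-4-7` (the ID used elsewhere in this guide) and would need the other listed models added, and the wording of the fallback instruction is only one option.

```python
# Assumption: IDs of models that reject prefill. Only one ID is listed here;
# add the IDs of the other models named above as needed.
NO_PREFILL_MODELS = {"claude-opus-4-7"}


def build_request(question: str, model: str, prefill: str) -> dict:
    """Build request arguments, using prefill only where the model allows it."""
    messages = [{"role": "user", "content": question}]
    request = {"model": model, "max_tokens": 16, "messages": messages}
    if model in NO_PREFILL_MODELS:
        # Fallback: ask for the format through the system prompt instead.
        request["system"] = f"Begin your reply with exactly: {prefill}"
    else:
        # Prefill: the final assistant message is continued by the model.
        messages.append({"role": "assistant", "content": prefill})
    return request


question = "What is Latin for ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"
with_prefill = build_request(question, "claude-sonnet-4-5", "The answer is (")
without_prefill = build_request(question, "claude-opus-4-7", "The answer is (")
```

The resulting dictionary could be passed as `client.messages.create(**request)`.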

When to Use Prefill vs. System Prompts

Technique	Best For
Prefill	Short, constrained outputs (single tokens, specific formats)
System Prompt	General behavior guidance, long instructions
Structured Outputs	JSON schemas, complex structured data

Working with Vision Capabilities

Claude can analyze images sent through the Messages API. This opens up use cases like:

Document analysis
Image description
UI/UX review
Visual data extraction

Sending an Image

import anthropic
import base64
client = anthropic.Anthropic()
# Read and encode image
with open("chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")
message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data
                    }
                },
                {
                    "type": "text",
                    "text": "Describe this chart in detail"
                }
            ]
        }
    ]
)
print(message.content[0].text)

Supported Image Formats

JPEG, PNG, GIF, WebP
Maximum size: 5 MB per image via the API at the time of writing (confirm against the current vision documentation)
Best results with clear, high-contrast images
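
Since the media type must match the file format, a small helper can derive it from the file extension and reject unsupported formats early. This is a sketch under assumptions: `image_block` and `SUPPORTED_MEDIA_TYPES` are hypothetical names, and the extension-based check does not inspect the actual file contents.

```python
import base64
from pathlib import PurePath

# Extension-to-media-type map for the four formats listed above.
SUPPORTED_MEDIA_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def image_block(filename: str, raw_bytes: bytes) -> dict:
    """Build a base64 image content block, validating the format by extension."""
    suffix = PurePath(filename).suffix.lower()
    if suffix not in SUPPORTED_MEDIA_TYPES:
        raise ValueError(f"Unsupported image format: {suffix or filename}")
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": SUPPORTED_MEDIA_TYPES[suffix],
            "data": base64.b64encode(raw_bytes).decode("utf-8"),
        },
    }


# Placeholder bytes stand in for real image data read from disk.
block = image_block("chart.PNG", b"abc")
print(block["source"]["media_type"])
```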

Handling Stop Reasons

Understanding why Claude stopped generating helps you build robust applications:

Stop Reason	Meaning	Action
`end_turn`	Claude finished naturally	Continue or end conversation
`max_tokens`	Hit token limit	Increase `max_tokens` or continue
`stop_sequence`	Found a stop sequence	Process the response
`tool_use`	Claude wants to use a tool	Execute tool and return result

if message.stop_reason == "max_tokens":
    # Output was cut off. `messages` is the list that was sent with the request:
    # append the partial reply, then ask Claude to continue in a new request
    messages.append({"role": "assistant", "content": message.content[0].text})
    messages.append({"role": "user", "content": "Please continue"})
elif message.stop_reason == "tool_use":
    # Handle tool calls
    pass
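
The `max_tokens` branch can also be written as a pure function that leaves the original history untouched. The sketch below operates on plain dictionaries shaped like the API response. `follow_up_messages` is a hypothetical helper, and the "Please continue" wording is taken from the snippet above.

```python
from typing import Optional


def follow_up_messages(messages: list[dict], response: dict) -> Optional[list[dict]]:
    """Return the history for a continuation request, or None if none is needed."""
    if response["stop_reason"] != "max_tokens":
        return None
    # Join all text blocks in case the partial reply spans several of them.
    partial = "".join(
        block["text"] for block in response["content"] if block["type"] == "text"
    )
    # Build a new list so the caller's history is not mutated.
    return messages + [
        {"role": "assistant", "content": partial},
        {"role": "user", "content": "Please continue"},
    ]


sent = [{"role": "user", "content": "Tell me a story"}]
truncated = {
    "stop_reason": "max_tokens",
    "content": [{"type": "text", "text": "Once upon a"}],
}
extended = follow_up_messages(sent, truncated)
finished = follow_up_messages(sent, {"stop_reason": "end_turn", "content": []})
```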

Best Practices for Production

1. Manage Context Windows

Claude's context window is large but finite. Implement strategies like:

Sliding window (keep last N messages)
Summarization of old conversations
Prompt caching for frequently used context
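
The first strategy, a sliding window, can be sketched as a small pure function. `sliding_window` is a hypothetical helper. It assumes the conversation should begin with a user message, so it drops any leading assistant turn left over after trimming.

```python
def sliding_window(messages: list[dict], max_messages: int) -> list[dict]:
    """Keep at most the last `max_messages` messages, starting on a user turn."""
    if max_messages < 1:
        raise ValueError("max_messages must be at least 1")
    recent = messages[-max_messages:]
    # Assumption: the first message should use the user role, so skip
    # any assistant turns that ended up at the front of the window.
    while recent and recent[0]["role"] != "user":
        recent = recent[1:]
    return recent


history = [
    {"role": "user", "content": "Question 1"},
    {"role": "assistant", "content": "Answer 1"},
    {"role": "user", "content": "Question 2"},
    {"role": "assistant", "content": "Answer 2"},
    {"role": "user", "content": "Question 3"},
    {"role": "assistant", "content": "Answer 3"},
]

window_of_three = sliding_window(history, 3)
window_of_four = sliding_window(history, 4)
```

With a window of three, the leading assistant message is dropped, so only the last question and answer remain. Counting messages is a rough proxy; a token-based budget would be more precise.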

2. Handle Errors Gracefully

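try:
    message = client.messages.create(...)
except anthropic.RateLimitError as e:
    # Most specific first: RateLimitError is a subclass of APIError
    print(f"Rate limited: {e}")
    # Implement exponential backoff
except anthropic.APIConnectionError as e:
    # Also a subclass of APIError, so it must come before the base class
    print(f"Connection error: {e}")
    # Queue for retry
except anthropic.APIError as e:
    # Base class: catches any remaining API error
    print(f"API error: {e}")
    # Implement retry logic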
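
Exponential backoff itself can be sketched independently of the SDK. In the example below, `call_with_backoff` is a hypothetical helper, the delay values are illustrative, and `ConnectionError` stands in for whichever exception types you choose to retry (for instance `anthropic.RateLimitError`). The `sleep` parameter is injectable so the example runs without waiting.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    func: Callable[[], T],
    retry_on: tuple[type[Exception], ...],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, retrying on the given exceptions with doubling delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the last error.
            # Delay doubles each attempt; jitter spreads out simultaneous retries.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 10))
    raise AssertionError("unreachable")


attempts = {"count": 0}
recorded_delays: list[float] = []


def flaky() -> str:
    """Fails twice, then succeeds; stands in for an API call."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = call_with_backoff(flaky, (ConnectionError,), sleep=recorded_delays.append)
```

The SDK client also accepts a `max_retries` option for its built-in retries, which may be enough for simple cases.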

3. Monitor Token Usage

Track usage.input_tokens and usage.output_tokens for cost management and optimization.
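
A running total across requests is usually enough to get started. The following sketch accumulates usage from dictionaries shaped like the `usage` field shown earlier. `UsageTracker` is a hypothetical class, and the prices passed to `estimated_cost` are placeholders, not real rates.

```python
from dataclasses import dataclass


@dataclass
class UsageTracker:
    """Accumulates token counts across requests."""

    input_tokens: int = 0
    output_tokens: int = 0
    requests: int = 0

    def record(self, usage: dict) -> None:
        self.input_tokens += usage["input_tokens"]
        self.output_tokens += usage["output_tokens"]
        self.requests += 1

    def estimated_cost(self, input_price: float, output_price: float) -> float:
        """Cost estimate, with prices given per million tokens."""
        total = self.input_tokens * input_price + self.output_tokens * output_price
        return total / 1_000_000


tracker = UsageTracker()
tracker.record({"input_tokens": 12, "output_tokens": 6})
tracker.record({"input_tokens": 12, "output_tokens": 6})

# Placeholder prices (per million tokens); substitute your model's actual rates.
cost = tracker.estimated_cost(input_price=1.0, output_price=2.0)
```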

4. Use Streaming for Real-Time UX

For chat applications, enable streaming to show responses as they're generated:

with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Tell me a story"}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
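
Chat applications typically need the complete reply as well, for example to append it to the conversation history. A minimal sketch of that pattern follows. `render_stream` is a hypothetical helper that accepts any iterable of text chunks (such as `stream.text_stream` above), and a plain list stands in for the live stream here.

```python
from typing import Callable, Iterable


def render_stream(chunks: Iterable[str], on_text: Callable[[str], None]) -> str:
    """Forward each chunk to `on_text` as it arrives and return the full text."""
    parts: list[str] = []
    for chunk in chunks:
        on_text(chunk)  # e.g. update the UI or print incrementally
        parts.append(chunk)
    return "".join(parts)


# A list of strings stands in for the streamed chunks.
seen: list[str] = []
full_text = render_stream(["Once ", "upon ", "a time"], seen.append)
print(full_text)
```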

Key Takeaways

The Messages API is stateless—you must send full conversation history with every request, giving you complete control over context management
The prefill technique lets you start Claude's response, enabling constrained outputs and format control (but check model compatibility)
Multi-turn conversations require careful history management; you can inject synthetic assistant messages to guide behavior
Vision capabilities allow image analysis by sending base64-encoded images alongside text prompts
Always handle stop reasons (end_turn, max_tokens, stop_sequence, tool_use) to build robust, production-ready applications

For complete API specifications, refer to the Messages API reference.