Mastering the Messages API: A Practical Guide to Building Conversational AI with Claude
Learn how to use Claude's Messages API for single-turn queries, multi-turn conversations, prefill techniques, and vision tasks with practical code examples in Python and TypeScript.
Introduction
The Messages API is the core interface for interacting with Claude programmatically. Whether you're building a chatbot, a document analyzer, or a vision-enabled assistant, understanding how to structure requests and handle responses is essential. This guide walks you through the most common patterns—from a simple "Hello, Claude" to multi-turn conversations, prefill techniques, and image analysis.
Basic Request and Response
At its simplest, the Messages API accepts a list of messages and returns Claude's response. Here's a minimal example in Python:
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)
print(message)
The response includes the model's reply, metadata, and token usage:
{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [{"type": "text", "text": "Hello!"}],
  "model": "claude-opus-4-7",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 12,
    "output_tokens": 6
  }
}
Key fields to watch:
- stop_reason: Indicates why Claude stopped. "end_turn" means the response is complete.
- usage: Track input and output tokens for cost monitoring.
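As a sketch, both fields can be folded into one small helper. The per-million-token prices below are placeholders for illustration, not real rates for any model:

```python
def summarize_response(resp: dict, price_in: float = 3.0, price_out: float = 15.0):
    """Return (is_complete, estimated_cost_usd) for a Messages API response.

    price_in / price_out are illustrative per-million-token prices,
    not actual pricing for any Claude model.
    """
    usage = resp["usage"]
    cost = (usage["input_tokens"] * price_in +
            usage["output_tokens"] * price_out) / 1_000_000
    return resp["stop_reason"] == "end_turn", cost
```

Feeding it the sample response above would report a complete turn and a sub-cent cost estimate.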
Multi-Turn Conversations
The Messages API is stateless—you must send the full conversation history with every request. This gives you full control over context but requires you to manage state on your end.
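One way to manage that state is a thin wrapper that accumulates turns and hands the full list to every request. This is a minimal sketch; the class and method names are illustrative, not part of the SDK:

```python
class Conversation:
    """Client-side history for the stateless Messages API."""

    def __init__(self):
        self.messages = []

    def user(self, text):
        """Record a user turn; pass the returned list to messages.create()."""
        self.messages.append({"role": "user", "content": text})
        return self.messages

    def assistant(self, text):
        """Record Claude's reply so the next request carries full context."""
        self.messages.append({"role": "assistant", "content": text})
```

Each call to `client.messages.create(...)` would then receive `convo.user(...)`, so Claude always sees the whole exchange.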
Building a Conversation
import anthropic

client = anthropic.Anthropic()

# First turn
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)

# Second turn: include previous exchange
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"},
        {"role": "assistant", "content": "Hello!"},
        {"role": "user", "content": "Can you describe LLMs to me?"}
    ]
)
print(message.content[0].text)
Synthetic Assistant Messages
You can inject pre-written assistant messages into the history. This is useful for:
- Setting context: Providing a backstory or persona.
- Guiding behavior: Showing Claude how you want it to respond.
- Simulating conversations: Testing dialogue flows.
messages = [
    {"role": "user", "content": "You are a helpful tutor. Explain quantum computing."},
    {"role": "assistant", "content": "I'd be happy to explain quantum computing! Let's start with the basics."},
    {"role": "user", "content": "What is a qubit?"}
]
Pro tip: Always include the full history. Omitting turns can confuse Claude and lead to inconsistent responses.
Prefill: Putting Words in Claude's Mouth
Prefilling lets you start Claude's response by providing the beginning of its answer. This is powerful for:
- Constraining output format (e.g., JSON, multiple choice)
- Setting tone or style
- Reducing token waste on boilerplate
Example: Multiple Choice Answer
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1,
    messages=[
        {
            "role": "user",
            "content": "What is Latin for Ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"
        },
        {
            "role": "assistant",
            "content": "The answer is ("
        }
    ]
)
print(message.content[0].text)  # Output: "C"
By setting max_tokens=1 and prefilling "The answer is (", Claude only generates the letter—perfect for structured outputs.
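One subtlety worth noting: the response contains only Claude's continuation, not the prefilled text, so when the prefill is part of the payload (a JSON object, say) you must rejoin the two halves yourself. A sketch with illustrative helper names:

```python
import json

def json_prefill(question):
    """Build a message list whose last turn prefills '{' so the model
    continues a JSON object rather than starting with prose."""
    return [
        {"role": "user", "content": question},
        {"role": "assistant", "content": "{"},
    ]

def assemble_json(prefill, completion):
    """The API response omits the prefill, so glue it back on before parsing."""
    return json.loads(prefill + completion)
```

Here `completion` would be `message.content[0].text` from the API response.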
Important Limitations
Prefilling is not supported on these models:
- Claude Mythos Preview
- Claude Opus 4.7
- Claude Opus 4.6
- Claude Sonnet 4.6
Migration from Prefill
If you're moving away from prefill, here's how to achieve similar results:
Option 1: System Prompt

client.messages.create(
    model="claude-opus-4-7",
    system="Always respond in JSON format with keys: 'answer', 'explanation'.",
    messages=[{"role": "user", "content": "What is the capital of France?"}]
)
Option 2: Structured Outputs
Use the structured outputs feature (available in the API) to define a schema for Claude's response.
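With the system-prompt route (Option 1), and no prefill to anchor the output, the model may occasionally wrap the JSON in prose or code fences. A tolerant parser (an illustrative sketch, not part of the SDK) can absorb that:

```python
import json

def parse_json_reply(text):
    """Extract the outermost JSON object from a reply that may include
    surrounding prose or code fences."""
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in reply")
    return json.loads(text[start:end + 1])
```

Structured outputs make this defensive parsing unnecessary, since the schema is enforced by the API itself.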
Vision: Working with Images
Claude can analyze images sent via the Messages API. This enables use cases like:
- Document analysis: Extracting text from screenshots or PDFs.
- Visual QA: Answering questions about diagrams or photos.
- Content moderation: Identifying objects or text in images.
Sending an Image
import anthropic
import base64

client = anthropic.Anthropic()

with open("chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data
                    }
                },
                {
                    "type": "text",
                    "text": "What does this chart show?"
                }
            ]
        }
    ]
)
print(message.content[0].text)
Supported media types: image/png, image/jpeg, image/webp, image/gif.
TypeScript Example
import Anthropic from '@anthropic-ai/sdk';
import fs from 'fs';

const client = new Anthropic();

const imageBuffer = fs.readFileSync('chart.png');
const base64Image = imageBuffer.toString('base64');

const message = await client.messages.create({
  model: 'claude-sonnet-4-5',
  max_tokens: 1024,
  messages: [
    {
      role: 'user',
      content: [
        {
          type: 'image',
          source: {
            type: 'base64',
            media_type: 'image/png',
            data: base64Image
          }
        },
        {
          type: 'text',
          text: 'Describe this image in detail.'
        }
      ]
    }
  ]
});
console.log(message.content[0].text);
Best Practices
- Manage token usage: Track usage.input_tokens and usage.output_tokens to stay within limits and control costs.
- Handle stop reasons: Check stop_reason to determine if Claude finished naturally (end_turn) or was cut off (max_tokens).
- Use system prompts for instructions: For models that don't support prefill, leverage the system parameter for high-level guidance.
- Keep conversations focused: Include only relevant history to avoid exceeding context windows.
- Test with different models: Each Claude model has unique strengths—experiment to find the best fit.
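The "keep conversations focused" advice can be sketched as a crude history trimmer. It counts characters rather than real tokens, so treat it as an approximation rather than a context-window guarantee:

```python
def trim_history(messages, max_chars=8000):
    """Keep the most recent turns whose combined content length fits the budget.

    Character count is a rough stand-in for token count; the newest
    message is always kept even if it alone exceeds the budget.
    """
    kept, total = [], 0
    for msg in reversed(messages):
        total += len(str(msg["content"]))
        if kept and total > max_chars:
            break
        kept.append(msg)
    return list(reversed(kept))
```

A production version would count tokens properly and take care to keep role alternation intact after trimming.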
Key Takeaways
- Stateless design: Always send the full conversation history; manage state on your end.
- Prefill for precision: Use prefill to constrain output format, but check model compatibility first.
- Vision is powerful: Send images as base64-encoded data for visual analysis tasks.
- Monitor usage: Track token counts and stop reasons to optimize performance and cost.
- Migrate when needed: For models that don't support prefill, use system prompts or structured outputs.