Guide | Beginner | API | 2026-05-13

Mastering the Messages API: A Practical Guide to Building with Claude

Learn how to use Claude's Messages API for basic requests, multi-turn conversations, prefill techniques, and vision capabilities with practical code examples.

Quick Answer

This guide covers the core patterns of Claude's Messages API, including sending basic requests, building multi-turn conversations, using prefill to shape responses, and handling images with vision. You'll get practical code examples in Python and TypeScript.

Messages API, Claude API, Multi-turn Conversations, Prefill, Vision

Introduction

Claude's Messages API is the primary way to interact with the model programmatically. Whether you're building a chatbot, a content generator, or a complex agent, understanding the Messages API is essential. This guide walks you through the most common patterns—from a simple request to multi-turn conversations, prefill techniques, and vision capabilities.

Anthropic offers two paths for building with Claude: the Messages API for direct model access and fine-grained control, and Claude Managed Agents for pre-built, configurable agent harnesses. This guide focuses on the Messages API, which is ideal for custom agent loops and applications that need precise control over the conversation flow.

Basic Request and Response

At its simplest, the Messages API accepts a list of messages (each with a role and content) and returns a response from Claude. Here's a minimal example in Python and TypeScript.

Python Example

import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)
print(message)

TypeScript Example

import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
async function main() {
  const message = await client.messages.create({
    model: 'claude-opus-4-7',
    max_tokens: 1024,
    messages: [
      { role: 'user', content: 'Hello, Claude' }
    ]
  });
  console.log(message);
}
main();

Understanding the Response

The API returns a structured JSON object. Here's a typical response:

{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello!"
    }
  ],
  "model": "claude-opus-4-7",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 12,
    "output_tokens": 6
  }
}

Key fields to note:

content: An array of content blocks. Currently, the main type is text, but you'll also see tool_use when using tools.
stop_reason: Indicates why the response ended. Common values are "end_turn" (Claude finished naturally), "max_tokens" (hit the token limit), or "tool_use" (Claude wants to call a tool).
usage: Token counts for billing and monitoring.
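
As a small illustration of reading these fields, the sketch below parses the sample response shown above as plain JSON and pulls out the text, the stop reason, and the token counts. The helper name summarize_response is hypothetical and not part of the SDK; with the Python SDK you would read the same fields as attributes (for example message.stop_reason).

```python
import json

# The sample response from above, held as a JSON string for illustration.
SAMPLE = """
{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [{"type": "text", "text": "Hello!"}],
  "model": "claude-opus-4-7",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {"input_tokens": 12, "output_tokens": 6}
}
"""


def summarize_response(response: dict) -> dict:
    """Hypothetical helper: collect the fields most callers need."""
    # Join every text block; skip other block types such as tool_use.
    text = "".join(
        block["text"] for block in response["content"] if block["type"] == "text"
    )
    usage = response["usage"]
    return {
        "text": text,
        "truncated": response["stop_reason"] == "max_tokens",
        "wants_tool": response["stop_reason"] == "tool_use",
        "total_tokens": usage["input_tokens"] + usage["output_tokens"],
    }


summary = summarize_response(json.loads(SAMPLE))
print(summary)
```

Checking the truncated flag before using the text is one way to avoid silently acting on a cut-off reply.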

Building Multi-Turn Conversations

The Messages API is stateless—you must send the full conversation history with every request. This gives you complete control over the context but requires you to manage the message list on your end.

Example: Two-Turn Conversation

import anthropic
client = anthropic.Anthropic()
# First turn
messages = [
    {"role": "user", "content": "What is the capital of France?"}
]
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=messages
)
# Append assistant response to history
messages.append({"role": "assistant", "content": response.content[0].text})
# Second turn
messages.append({"role": "user", "content": "What is its population?"})
response2 = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=messages
)
print(response2.content[0].text)
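
The append-and-resend pattern above can be wrapped in a small helper. The following is a minimal sketch, not an SDK feature: the Conversation class and the fake_send stand-in are hypothetical names, and the network call is replaced by a stub so the example runs offline.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


class Conversation:
    """Hypothetical helper that keeps the history and resends it each turn."""

    def __init__(self, send: Callable[[List[Message]], str]) -> None:
        # `send` takes the full history and returns the assistant's reply text.
        self._send = send
        self.messages: List[Message] = []

    def ask(self, user_text: str) -> str:
        self.messages.append({"role": "user", "content": user_text})
        reply = self._send(list(self.messages))  # pass a copy of the history
        self.messages.append({"role": "assistant", "content": reply})
        return reply


# Stand-in for a real API call so the sketch runs offline. With the SDK,
# `send` could instead wrap client.messages.create(...) and return
# response.content[0].text.
def fake_send(history: List[Message]) -> str:
    return f"(reply to message {len(history)})"


chat = Conversation(fake_send)
chat.ask("What is the capital of France?")
chat.ask("What is its population?")
print(len(chat.messages))
```

Injecting the send function keeps the history logic separate from the transport, which also makes it easy to test without an API key.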

Synthetic Assistant Messages

You can also inject synthetic assistant messages—messages that weren't actually generated by Claude. This is useful for:

Providing context: "You previously said X."
Guiding the conversation: "Now, let's move on to topic Y."
Simulating a multi-step process: "Step 1 is complete. Here's the result."

messages = [
    {"role": "user", "content": "What's the weather in Tokyo?"},
    {"role": "assistant", "content": "I don't have real-time data, but I can help you find a weather API."},
    {"role": "user", "content": "Okay, show me how to call it."}
]

Note: Earlier turns don't need to originate from Claude. You can use synthetic messages to simulate a conversation history or provide structured context.
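
One common use of synthetic turns is to supply worked examples before the real question. The sketch below assumes a hypothetical helper, few_shot_history, that turns example pairs into an alternating history; the translation examples are made up for illustration.

```python
from typing import Dict, List, Tuple


def few_shot_history(
    examples: List[Tuple[str, str]], question: str
) -> List[Dict[str, str]]:
    """Hypothetical helper: turn (user, assistant) example pairs into history."""
    messages: List[Dict[str, str]] = []
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        # Synthetic turn: written by us, not generated by Claude.
        messages.append({"role": "assistant", "content": assistant_text})
    # The real question goes last so the next reply answers it.
    messages.append({"role": "user", "content": question})
    return messages


history = few_shot_history(
    [("Translate 'cat' to French.", "chat"), ("Translate 'dog' to French.", "chien")],
    "Translate 'ant' to French.",
)
print(history)
```

The resulting list can be passed as the messages argument in the same way as a history built from real responses.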

Putting Words in Claude's Mouth (Prefill)

Prefill allows you to start Claude's response by providing the beginning of its answer. This is powerful for:

Constraining output format: Force Claude to start with a specific word or phrase.
Multiple choice questions: Get a single-letter answer by setting max_tokens to 1.
Structured outputs: Begin a JSON object or XML block.

Example: Multiple Choice

import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1,
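    messages=[
        {"role": "user", "content": "What is the capital of France? (A) London (B) Paris (C) Berlin (D) Madrid"},
        # Prefill stops just before the letter, so the single generated token is the answer
        {"role": "assistant", "content": "The answer is ("}
    ]
)
print(message.content[0].text)  # Expected output: B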

Important Limitations

Not supported on: Claude Mythos Preview, Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6. Requests with prefill on these models return a 400 error.
Alternative: Use structured outputs or system prompt instructions for these models. See the migration guide for patterns.
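
One way to handle both cases in the same code path is to decide per model whether to send a prefill or a system instruction. The sketch below is an assumption-laden illustration: build_request is a hypothetical helper, and the set of model IDs contains only the one ID that appears in this guide, so the IDs of the other listed models would need to be added.

```python
from typing import Any, Dict

# Models that reject prefill. Only "claude-opus-4-7" appears as an ID in this
# guide; add the real IDs of the other listed models before relying on this.
NO_PREFILL_MODELS = {"claude-opus-4-7"}


def build_request(
    model: str,
    prompt: str,
    prefill: str,
    fallback_system: str,
    max_tokens: int = 1024,
) -> Dict[str, Any]:
    """Hypothetical helper: use prefill where allowed, else a system prompt."""
    request: Dict[str, Any] = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    if model in NO_PREFILL_MODELS:
        # Prefill would be rejected, so describe the format in `system`.
        request["system"] = fallback_system
    else:
        request["messages"].append({"role": "assistant", "content": prefill})
    return request


old = build_request(
    "claude-sonnet-4-5", "List three colors.", "[", "Reply with a JSON array only."
)
new = build_request(
    "claude-opus-4-7", "List three colors.", "[", "Reply with a JSON array only."
)
print(len(old["messages"]), "system" in new)
```

The returned dictionary is shaped so that it could be passed as client.messages.create(**request).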

Prefill for JSON Output

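import anthropic
client = anthropic.Anthropic()
# The prefill opens the JSON object; Claude continues from the open quote
messages = [
    {"role": "user", "content": "Extract the name and age from: 'John is 30 years old.'"},
    {"role": "assistant", "content": "{\"name\": \""}
]
response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=50,
    messages=messages
)
# The response holds only the continuation, e.g.: John", "age": 30}
print(response.content[0].text)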
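
Because the response contains only the text that follows the prefill, the two pieces have to be rejoined before parsing. A minimal sketch, using a hypothetical helper name and an illustrative completion string rather than real API output:

```python
import json


def parse_prefilled_json(prefill: str, completion: str) -> dict:
    """Hypothetical helper: rejoin the prefill and the completion, then parse."""
    # The response contains only the continuation, so the prefill is prepended.
    return json.loads(prefill + completion)


prefill = '{"name": "'
completion = 'John", "age": 30}'  # illustrative completion, not real API output
record = parse_prefilled_json(prefill, completion)
print(record)
```

If the reply was cut off by max_tokens, json.loads raises json.JSONDecodeError, so it is worth catching that error or checking stop_reason first.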

Vision: Working with Images

Claude can process images in requests using the Messages API. You can supply images via:

base64: Base64-encoded image data.
url: A publicly accessible URL.
file: An image uploaded through the Files API.

Supported media types: image/jpeg, image/png, image/gif, image/webp.

Example: Image from URL

import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "url",
                        "url": "https://example.com/ant.jpg"
                    }
                },
                {
                    "type": "text",
                    "text": "What is in this image?"
                }
            ]
        }
    ]
)
print(message.content[0].text)

Example: Image from Base64

import base64
import anthropic
client = anthropic.Anthropic()
with open("ant.jpg", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/jpeg",
                        "data": image_data
                    }
                },
                {
                    "type": "text",
                    "text": "Describe this image in detail."
                }
            ]
        }
    ]
)
print(message.content[0].text)
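
The base64 example above hard-codes image/jpeg. When file types vary, the media type can be derived from the file extension. The sketch below is one possible approach: image_block_from_path is a hypothetical helper, the extension map covers only the media types listed in this guide, and the demo file holds placeholder bytes rather than a real image.

```python
import base64
import os
import tempfile
from pathlib import Path

# Media types listed in this guide, keyed by file extension.
MEDIA_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def image_block_from_path(path: str) -> dict:
    """Hypothetical helper: build a base64 image content block from a file."""
    suffix = Path(path).suffix.lower()
    if suffix not in MEDIA_TYPES:
        raise ValueError(f"Unsupported image extension: {suffix!r}")
    data = base64.b64encode(Path(path).read_bytes()).decode("utf-8")
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": MEDIA_TYPES[suffix], "data": data},
    }


# Demo with a throwaway file; the bytes are a placeholder, not a real PNG.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.PNG")
    Path(demo_path).write_bytes(b"placeholder")
    block = image_block_from_path(demo_path)

print(block["source"]["media_type"])
```

This check looks only at the extension, not the file contents, so a mislabeled file would still pass through.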

Vision Response

Claude returns a text description of the image. For example:

{
  "content": [
    {
      "type": "text",
      "text": "This image shows an ant, specifically a close-up view of an ant. The ant is shown in detail, with its distinct head, antennae, and legs clearly visible."
    }
  ]
}

Best Practices

Manage context windows carefully: Because the API is stateless, every request carries the whole history, so input tokens grow with each turn. Trim or summarize older messages before the history approaches the model's context limit.
Use max_tokens wisely: For constrained outputs (like multiple choice), set max_tokens low. For open-ended responses, set it higher.
Handle stop_reason: Always check stop_reason in the response. If it's "max_tokens", the response was truncated. If it's "tool_use", you need to execute a tool and continue the conversation.
Prefill for structure: Use prefill to enforce output formats, but remember it's not supported on all models. Fall back to system prompts or structured outputs for newer models.
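Image optimization: Image token usage depends on pixel dimensions rather than file size, so downscale large images before sending them. Compressed formats (JPEG/WebP) shrink the upload and reduce latency but do not by themselves lower the token count.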
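
To make the context-management advice concrete, here is a rough sketch of trimming older turns to fit a budget. Everything in it is an assumption for illustration: trim_history and estimate_tokens are hypothetical helpers, the four-characters-per-token ratio is a crude guess rather than an official figure, and the sketch assumes text-only messages and that a history should open with a user turn. For accurate counts, the SDK's token counting method (client.messages.count_tokens) is a better source.

```python
from typing import Dict, List

CHARS_PER_TOKEN = 4  # rough assumption, not an official figure


def estimate_tokens(message: Dict[str, str]) -> int:
    """Crude size estimate for a text-only message."""
    return max(1, len(message["content"]) // CHARS_PER_TOKEN)


def trim_history(messages: List[Dict[str, str]], budget: int) -> List[Dict[str, str]]:
    """Hypothetical helper: drop the oldest turns until the estimate fits."""
    kept = list(messages)  # work on a copy so the caller's list is untouched
    while len(kept) > 1 and sum(estimate_tokens(m) for m in kept) > budget:
        kept.pop(0)
        # Keep the history starting on a user turn.
        while len(kept) > 1 and kept[0]["role"] != "user":
            kept.pop(0)
    return kept


history = [
    {"role": "user", "content": "a" * 400},
    {"role": "assistant", "content": "b" * 400},
    {"role": "user", "content": "c" * 40},
]
trimmed = trim_history(history, budget=50)
print([m["role"] for m in trimmed])
```

Dropping turns loses information, so summarizing the removed messages into a single synthetic turn is an alternative when earlier context still matters.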

Key Takeaways

The Messages API is stateless—you must send the full conversation history with every request. Manage the message list on your end.
Prefill lets you shape Claude's response by providing the beginning of its answer. Use it for multiple choice, JSON output, or any constrained format—but check model compatibility.
Vision support is built-in via base64, url, or file sources. Claude can analyze images alongside text in a single request.
Synthetic assistant messages allow you to inject context or guide the conversation without requiring actual Claude responses.
Always check stop_reason to understand why Claude stopped and handle truncation or tool calls appropriately.