Claude Guide · 2026-04-22

Mastering the Messages API: A Practical Guide to Building with Claude

Learn how to use the Claude Messages API for basic requests, multi-turn conversations, prefill techniques, and vision. Includes code examples in Python and TypeScript.

Quick Answer

This guide covers the core patterns of the Claude Messages API: making basic requests, building multi-turn conversations, using prefill to shape responses, and sending images for vision tasks. You'll get practical code examples and best practices for each pattern.

Messages API · Claude API · Multi-turn conversations · Prefill · Vision

Introduction

The Claude Messages API is the primary way to interact with Claude programmatically. Whether you're building a chatbot, a content generator, or a vision-enabled application, understanding the Messages API is essential. This guide walks you through the most common and powerful patterns: basic requests, multi-turn conversations, prefill techniques, and vision capabilities.

Basic Request and Response

At its simplest, the Messages API lets you send a user message and receive Claude's response. Here's a minimal example in Python:

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)
print(message)

And the equivalent in TypeScript:

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const message = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Hello, Claude' }
  ]
});
console.log(message);

The response includes the model's reply, usage statistics, and a stop_reason that tells you why the response ended (e.g., "end_turn" for a natural completion).
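To make that concrete, here is a minimal sketch of pulling the reply text and stop reason out of a response. The field names (`content`, `stop_reason`, `usage`) follow the Messages API response shape; the helper itself, `summarize_response`, is a hypothetical name for illustration, and the response is shown as a plain dict rather than an SDK object:

```python
# Hypothetical helper: extract the reply text and stop_reason from a
# Messages API response represented as a plain dict.
def summarize_response(response: dict) -> tuple[str, str]:
    # content is a list of blocks; join all text blocks into one string
    text = "".join(
        block["text"] for block in response["content"]
        if block["type"] == "text"
    )
    return text, response["stop_reason"]

# Example response in the documented shape
response = {
    "content": [{"type": "text", "text": "Hello! How can I help you today?"}],
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 10, "output_tokens": 9},
}
text, reason = summarize_response(response)
print(reason)  # end_turn
```

The same pattern works on SDK objects via attribute access (e.g. `message.content[0].text`).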

Multi-Turn Conversations

The Messages API is stateless—you must send the full conversation history with every request. This gives you complete control over context. To continue a conversation, simply append new turns to the messages array:

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"},
        {"role": "assistant", "content": "Hello!"},
        {"role": "user", "content": "Can you describe LLMs to me?"}
    ]
)
print(message)

You can also include synthetic assistant messages—responses that didn't actually come from Claude. This is useful for:

  • Injecting system-like instructions in the conversation flow
  • Correcting or guiding Claude's behavior
  • Simulating a conversation history for testing
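One way to manage this statelessness is a small history helper. The function below is a hypothetical sketch, not part of the anthropic SDK; it simply builds up the `messages` list you re-send on every request, and works the same whether an assistant turn came from Claude or is synthetic:

```python
# Hypothetical helper (not part of the anthropic SDK): maintain the
# conversation history that must be re-sent with every request.
def add_turn(history: list, role: str, content: str) -> list:
    if role not in ("user", "assistant"):
        raise ValueError(f"unknown role: {role}")
    # Return a new list so the caller can keep earlier snapshots if needed
    return history + [{"role": role, "content": content}]

history = []
history = add_turn(history, "user", "Hello, Claude")
history = add_turn(history, "assistant", "Hello!")  # may be synthetic
history = add_turn(history, "user", "Can you describe LLMs to me?")

# Pass the accumulated list as messages=history in client.messages.create(...)
print(len(history))  # 3
```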

Prefill: Putting Words in Claude's Mouth

Prefill lets you start Claude's response by providing the beginning of its answer. This is powerful for:

  • Forcing a specific format (e.g., JSON, multiple choice)
  • Guiding the tone or structure of the response
  • Reducing token usage for predictable outputs

Here's an example that forces Claude to answer a multiple-choice question with a single letter:

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1,
    messages=[
        {
            "role": "user",
            "content": "What is the Latin name for ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"
        },
        {
            "role": "assistant",
            "content": "The answer is ("
        }
    ]
)
print(message)

By setting max_tokens=1 and prefilling with "The answer is (", Claude will only generate the next token—likely "C".
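Prefill is equally handy for forcing JSON. One detail to remember: the response contains only the newly generated tokens, not the prefill, so you must prepend the prefill yourself before parsing. A minimal sketch, assuming the assistant turn was prefilled with `{` and the hypothetical completion string below is what the model returned:

```python
import json

def parse_prefilled_json(prefill: str, completion: str) -> dict:
    # The API response contains only the newly generated tokens,
    # so re-attach the prefill before parsing.
    return json.loads(prefill + completion)

# Suppose we prefilled the assistant turn with "{" and the model
# completed the rest of the object:
completion = '"answer": "C", "confidence": "high"}'
result = parse_prefilled_json("{", completion)
print(result["answer"])  # C
```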

Note: Prefill is not supported on Claude Mythos Preview, Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6. For those models, use structured outputs or system prompt instructions instead.
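For those models, a system prompt can push toward the same constrained output. Here is a minimal sketch of building the request parameters; the instruction wording and the `constrained_request` helper are illustrative, not a required phrasing:

```python
# Build request kwargs that use a system prompt instead of prefill.
# The instruction text here is illustrative, not a required phrasing.
def constrained_request(question: str) -> dict:
    return {
        "model": "claude-sonnet-4-6",
        "max_tokens": 5,
        "system": "Answer with a single letter: A, B, or C. No other text.",
        "messages": [{"role": "user", "content": question}],
    }

params = constrained_request(
    "What is the Latin name for ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"
)
# Pass as client.messages.create(**params)
print(sorted(params))
```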

Vision: Sending Images to Claude

Claude can analyze images sent via the Messages API. You can provide images as base64-encoded data or as a URL. Here's how to send a base64-encoded image:

import anthropic
import base64

client = anthropic.Anthropic()

with open("chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data
                    }
                },
                {
                    "type": "text",
                    "text": "Describe this chart in detail."
                }
            ]
        }
    ]
)
print(message.content[0].text)

You can also use a URL:

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "url",
                        "url": "https://example.com/chart.png"
                    }
                },
                {
                    "type": "text",
                    "text": "What trends do you see in this data?"
                }
            ]
        }
    ]
)

Supported media types include image/jpeg, image/png, image/gif, and image/webp. For best results, use images under 20MB and with reasonable dimensions.
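Following those limits, a pre-flight check before encoding an image might look like this. The media types and the 20MB cap come from the guidance above; the helper `image_media_type` is a hypothetical sketch that maps a file extension to the `media_type` field the API expects:

```python
# Media types supported by the Messages API, keyed by file extension.
SUPPORTED_MEDIA_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}
MAX_IMAGE_BYTES = 20 * 1024 * 1024  # stay under 20MB

def image_media_type(filename: str, data: bytes) -> str:
    # Map the file extension to a supported media_type and check size.
    ext = "." + filename.rsplit(".", 1)[-1].lower()
    if ext not in SUPPORTED_MEDIA_TYPES:
        raise ValueError(f"unsupported image type: {ext}")
    if len(data) > MAX_IMAGE_BYTES:
        raise ValueError("image exceeds 20MB limit")
    return SUPPORTED_MEDIA_TYPES[ext]

print(image_media_type("chart.PNG", b"\x89PNG..."))  # image/png
```

The returned string can be dropped straight into the `media_type` field of the image source block shown earlier.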

Best Practices

  • Always set max_tokens to control response length and cost.
  • Use stop_reason in the response to handle different scenarios (e.g., "max_tokens" means the response was cut off).
  • Keep conversation history concise to stay within context windows and reduce latency.
  • Use prefill sparingly—it's powerful but can lead to unexpected behavior if not carefully tested.
  • For vision tasks, provide clear text instructions alongside images for best results.
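As a concrete example of the stop_reason practice, a dispatcher might look like the sketch below. The reasons shown are a common subset; real responses can include others (for example, tool_use), and the return labels are just illustrative:

```python
def handle_stop_reason(response: dict) -> str:
    # Decide what to do based on why the response ended.
    reason = response["stop_reason"]
    if reason == "end_turn":
        return "complete"   # natural completion, safe to use as-is
    if reason == "max_tokens":
        return "truncated"  # cut off; consider raising max_tokens
    if reason == "stop_sequence":
        return "stopped"    # hit a custom stop sequence
    return "other"          # e.g. tool_use, handled elsewhere

print(handle_stop_reason({"stop_reason": "max_tokens"}))  # truncated
```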

Key Takeaways

  • The Messages API is stateless; always send the full conversation history.
  • Prefill lets you shape Claude's response by providing the beginning of its answer.
  • Vision support allows you to send images as base64 or URL for analysis.
  • Synthetic assistant messages give you fine-grained control over conversation flow.
  • Always handle stop_reason in your application logic to manage incomplete responses.