Claude Guide
2026-05-05

Mastering the Messages API: A Practical Guide to Building with Claude

Learn how to use Claude's Messages API for basic requests, multi-turn conversations, prefill techniques, and vision. Includes code examples in Python and TypeScript.

Quick Answer

This guide covers the core patterns of Claude's Messages API: making basic requests, building multi-turn conversations, using prefill to shape responses, and handling images with vision. You'll get practical code examples and best practices for each pattern.

Messages API · Claude API · Multi-turn conversations · Prefill · Vision

Introduction

Claude's Messages API is the primary way to interact with the model programmatically. Whether you're building a chatbot, a content generator, or a vision-enabled application, understanding the Messages API is essential. This guide walks you through the most common patterns—from simple requests to advanced techniques like prefill and vision—with practical code examples in Python and TypeScript.

Basic Request and Response

At its simplest, the Messages API takes a list of messages and returns a response. Here's how to send a basic greeting to Claude:

Python

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)
print(message)

TypeScript

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const message = await client.messages.create({
  model: "claude-opus-4-7",
  max_tokens: 1024,
  messages: [
    { role: "user", content: "Hello, Claude" }
  ]
});
console.log(message);

Response Structure

The response includes:

  • id: Unique message identifier
  • type: Object type (always "message")
  • role: Always "assistant"
  • content: Array of content blocks (typically text)
  • model: The model used
  • stop_reason: Why generation stopped (e.g., "end_turn", "max_tokens")
  • stop_sequence: The custom stop sequence that was matched, or null
  • usage: Token counts for input and output

Example output:
{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [{"type": "text", "text": "Hello!"}],
  "model": "claude-opus-4-7",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {"input_tokens": 12, "output_tokens": 6}
}
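Since content is an array, a response can contain more than one block. As a minimal sketch, here is one way you might pull the text out of a response shaped like the example above (a real SDK call returns an object whose blocks expose .type and .text attributes rather than a plain dict):

```python
# Extract the concatenated text from a Messages API response dict.
# `response` mirrors the example output above.
def extract_text(response: dict) -> str:
    """Join the text of all text-type content blocks."""
    return "".join(
        block["text"]
        for block in response["content"]
        if block["type"] == "text"
    )

response = {
    "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
    "type": "message",
    "role": "assistant",
    "content": [{"type": "text", "text": "Hello!"}],
    "model": "claude-opus-4-7",
    "stop_reason": "end_turn",
    "stop_sequence": None,
    "usage": {"input_tokens": 12, "output_tokens": 6},
}
print(extract_text(response))  # Hello!
```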

Multi-Turn Conversations

The Messages API is stateless—you must send the full conversation history with each request. This gives you complete control over context but requires you to manage the conversation state on your end.

Building a Conversation

To continue a conversation, append new messages to the history:

import anthropic

client = anthropic.Anthropic()

# Start the conversation
messages = [
    {"role": "user", "content": "Hello, Claude"},
    {"role": "assistant", "content": "Hello!"},
    {"role": "user", "content": "Can you describe LLMs to me?"}
]

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=messages
)
print(message.content[0].text)

Synthetic Assistant Messages

You can inject synthetic assistant messages into the history. This is useful for:

  • Guiding the conversation: Pre-filling context that Claude didn't generate
  • Role-playing scenarios: Setting up a character or persona
  • Correcting mistakes: Providing the correct answer for Claude to build upon
messages = [
    {"role": "user", "content": "What's the capital of France?"},
    # Synthetic assistant message to set context
    {"role": "assistant", "content": "The capital of France is Paris."},
    {"role": "user", "content": "What's its population?"}
]

Prefill: Putting Words in Claude's Mouth

Prefill allows you to start Claude's response by providing the beginning of its answer. This is powerful for:

  • Constraining output format (e.g., JSON, multiple choice)
  • Setting tone or style
  • Reducing token usage by guiding the response early

Multiple Choice Example

Here's how to get a single-letter answer from Claude:

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1,
    messages=[
        {
            "role": "user",
            "content": "What is Latin for Ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"
        },
        # Prefill: start the assistant's turn to constrain the reply
        {"role": "assistant", "content": "The answer is ("}
    ]
)
print(message.content[0].text)  # Output: "C"

Important Prefill Limitations

Prefill is not supported on the following models:

  • Claude Mythos Preview
  • Claude Opus 4.7
  • Claude Opus 4.6
  • Claude Sonnet 4.6
Requests using prefill with these models will return a 400 error. For these models, use structured outputs or system prompt instructions instead.
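For the models above, the same multiple-choice constraint can be expressed through a system prompt instead. A minimal sketch of the request payload (the model name is reused from earlier examples and the system text is illustrative, not prescriptive):

```python
# System-prompt alternative for models that reject prefill.
# The instruction constrains the answer format instead of seeding the reply.
request = {
    "model": "claude-opus-4-7",
    "max_tokens": 5,
    "system": "Answer multiple-choice questions with the single letter only, e.g. C.",
    "messages": [
        {
            "role": "user",
            "content": "What is Latin for Ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae",
        }
    ],
}

# The same dict can be splatted into the SDK call:
# message = client.messages.create(**request)
```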

Vision: Working with Images

Claude can process images sent via the Messages API. This enables use cases like image analysis, document OCR, and visual question answering.

Sending an Image

Images can be sent as base64-encoded data or via URL:

import anthropic
import base64

client = anthropic.Anthropic()

# Read and encode image
with open("chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data
                    }
                },
                {"type": "text", "text": "Describe this chart in detail."}
            ]
        }
    ]
)
print(message.content[0].text)
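The example above covers local files; for hosted images, a URL source block can replace the base64 one. A sketch of the message structure (the URL is a placeholder):

```python
# Content blocks for a URL-sourced image: the "source" block uses
# {"type": "url"} instead of {"type": "base64"}, so no local encoding is needed.
url_image_message = {
    "role": "user",
    "content": [
        {
            "type": "image",
            "source": {
                "type": "url",
                "url": "https://example.com/chart.png",
            },
        },
        {"type": "text", "text": "Describe this chart in detail."},
    ],
}
```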

Supported Image Types

  • JPEG
  • PNG
  • GIF
  • WebP
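Each format maps to a media_type string in the image source block. A small helper like the hypothetical one below keeps those strings out of your call sites:

```python
import pathlib

# Map the supported image formats to their media_type strings.
MEDIA_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}

def media_type_for(path: str) -> str:
    """Return the media_type for a supported image file, or raise."""
    suffix = pathlib.Path(path).suffix.lower()
    try:
        return MEDIA_TYPES[suffix]
    except KeyError:
        raise ValueError(f"Unsupported image type: {suffix}")

print(media_type_for("chart.png"))  # image/png
```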

Best Practices for Vision

  • Use high-resolution images when details matter
  • Combine with text prompts for precise instructions
  • Keep images under 20MB for optimal performance
  • Use multiple images in a single message for comparison tasks
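The size guideline is easy to enforce at encode time. A minimal sketch of a guard around the base64 step (the function name and limit constant are illustrative):

```python
import base64

MAX_IMAGE_BYTES = 20 * 1024 * 1024  # the 20MB guideline above

def encode_image_bytes(data: bytes) -> str:
    """Base64-encode raw image bytes, enforcing the size guideline."""
    if len(data) > MAX_IMAGE_BYTES:
        raise ValueError(f"Image is {len(data)} bytes; keep it under 20MB")
    return base64.b64encode(data).decode("utf-8")

# with open("chart.png", "rb") as f:
#     image_data = encode_image_bytes(f.read())
```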

Handling Stop Reasons

The stop_reason field in the response tells you why Claude stopped generating. Common values:

stop_reason      Meaning
end_turn         Claude finished naturally
max_tokens       Response hit the token limit
stop_sequence    A stop sequence was encountered
tool_use         Claude wants to use a tool

Example: Handling max_tokens

If you get max_tokens, you can continue the conversation by sending the partial response back:

if message.stop_reason == "max_tokens":
    # Append Claude's partial response to history
    messages.append({"role": "assistant", "content": message.content[0].text})
    messages.append({"role": "user", "content": "Please continue."})
    
    # Send continuation request
    continuation = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=1024,
        messages=messages
    )

Streaming Responses

For real-time applications, use streaming to receive tokens as they're generated:

import anthropic

client = anthropic.Anthropic()

with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Tell me a story"}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
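If you also need the full text after streaming it, you can accumulate chunks as you print. A minimal sketch, where the iterable stands in for stream.text_stream:

```python
from typing import Iterable

def collect_stream(text_stream: Iterable[str]) -> str:
    """Print chunks as they arrive and return the full response text."""
    chunks = []
    for text in text_stream:
        print(text, end="", flush=True)
        chunks.append(text)
    return "".join(chunks)

# full_text = collect_stream(stream.text_stream)
```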

Error Handling

Always handle potential errors gracefully:

import time

import anthropic
from anthropic import APIError, APIConnectionError, RateLimitError

client = anthropic.Anthropic()

try:
    message = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello"}]
    )
except RateLimitError:
    print("Rate limit exceeded. Retrying...")
    time.sleep(5)
except APIConnectionError:
    print("Network error. Check your connection.")
except APIError as e:
    print(f"API error: {e}")
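For rate limits in particular, a fixed sleep is often replaced with exponential backoff. A self-contained sketch (a generic RuntimeError stands in for anthropic.RateLimitError, and `call` would wrap client.messages.create(...)):

```python
import time

def with_retries(call, max_attempts=3, base_delay=1.0):
    """Retry a zero-argument callable with exponential backoff.

    Catches RuntimeError as a stand-in for the SDK's RateLimitError;
    delays double on each failed attempt (base_delay, 2x, 4x, ...).
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except RuntimeError:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)

# result = with_retries(lambda: client.messages.create(...))
```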

Key Takeaways

  • The Messages API is stateless—you must send the full conversation history with each request. Manage conversation state on your end.
  • Use prefill to constrain outputs for multiple choice, JSON, or structured responses, but check model compatibility first.
  • Vision support enables image analysis—send base64-encoded images alongside text prompts for powerful multimodal applications.
  • Handle stop reasons appropriately—max_tokens means you can continue the conversation; end_turn means Claude finished naturally.
  • Stream responses for real-time UX—use the streaming API to show tokens as they're generated, improving perceived performance.