Claude Guide
2026-05-06

Mastering the Messages API: A Practical Guide to Building Conversational AI with Claude

Learn how to use the Claude Messages API for single-turn queries, multi-turn conversations, prefill techniques, and vision. Includes code examples in Python and TypeScript.

Quick Answer

This guide covers the Claude Messages API: how to send basic requests, build multi-turn conversations, prefill Claude's responses, and use vision capabilities. You'll get practical Python and TypeScript code examples for each pattern.

Tags: Messages API, Claude API, Conversational AI, Prefill, Vision

If you're building applications with Claude, the Messages API is your primary interface. Whether you're creating a simple chatbot, a multi-turn assistant, or a vision-powered tool, understanding the Messages API patterns is essential.

This guide walks you through the most common and powerful patterns for working with the Messages API, including basic requests, multi-turn conversations, prefill techniques, and vision capabilities. By the end, you'll be able to build robust conversational applications with Claude.

Understanding the Messages API vs. Managed Agents

Anthropic offers two ways to build with Claude:

  • Messages API: Direct model prompting access. Best for custom agent loops and fine-grained control.
  • Claude Managed Agents: Pre-built, configurable agent harness that runs in managed infrastructure. Best for long-running tasks and asynchronous work.
This guide focuses on the Messages API, which gives you full control over every request and response.

Basic Request and Response

Let's start with the simplest pattern: sending a single message to Claude and getting a response.

Python Example

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)

print(message)

TypeScript Example

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const message = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Hello, Claude' }
  ]
});

console.log(message);

Response Structure

{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello!"
    }
  ],
  "model": "claude-opus-4-7",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 12,
    "output_tokens": 6
  }
}

Key fields to note:

  • content: An array of content blocks (usually text).
  • stop_reason: Why the response ended (end_turn, max_tokens, stop_sequence, etc.).
  • usage: Token counts for billing and monitoring.
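These fields can be read programmatically. A minimal sketch, modeling the response as a plain dict with the JSON shape shown above (the SDK returns an object with the same fields as attributes):

```python
def extract_text(response: dict) -> str:
    """Concatenate all text blocks from a Messages API response."""
    return "".join(
        block["text"] for block in response["content"] if block["type"] == "text"
    )

response = {
    "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
    "type": "message",
    "role": "assistant",
    "content": [{"type": "text", "text": "Hello!"}],
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 12, "output_tokens": 6},
}

print(extract_text(response))              # Hello!
print(response["usage"]["output_tokens"])  # 6
```

Iterating over `content` rather than assuming a single block matters once tool use or multiple text blocks enter the picture.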

Building Multi-Turn Conversations

The Messages API is stateless — you must send the full conversation history with every request. This gives you complete control over context.

Python Example

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"},
        {"role": "assistant", "content": "Hello!"},
        {"role": "user", "content": "Can you describe LLMs to me?"},
    ]
)

print(message.content[0].text)

TypeScript Example

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const message = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Hello, Claude' },
    { role: 'assistant', content: 'Hello!' },
    { role: 'user', content: 'Can you describe LLMs to me?' }
  ]
});

console.log(message.content[0].text);

Best Practices for Multi-Turn Conversations

  • Maintain full history: Always include all previous messages in the messages array.
  • Synthetic assistant messages: You can inject pre-written assistant responses to guide the conversation.
  • Token management: Be mindful of context window limits. Longer histories consume more tokens.
  • Role alternation: Messages must alternate between user and assistant roles.
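Because the API is stateless, most applications keep a history list and append to it each turn. A minimal sketch of such a history manager, which also enforces the role-alternation rule (no API call is made here; `self.messages` is what you would pass as `messages=...`):

```python
class Conversation:
    """Accumulates the full message history the stateless Messages API requires."""

    def __init__(self):
        self.messages = []

    def add(self, role: str, content: str):
        # Enforce user/assistant alternation, which the API requires.
        if self.messages and self.messages[-1]["role"] == role:
            raise ValueError(f"consecutive {role!r} messages are not allowed")
        self.messages.append({"role": role, "content": content})

convo = Conversation()
convo.add("user", "Hello, Claude")
convo.add("assistant", "Hello!")  # a synthetic assistant turn
convo.add("user", "Can you describe LLMs to me?")
# convo.messages now holds the full history for the next request
```

Raising early on a broken alternation is cheaper than letting the API reject the request at runtime.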

Prefilling Claude's Response

Prefilling lets you start Claude's response for it. This is powerful for:

  • Forcing structured outputs (e.g., JSON, multiple choice)
  • Guiding tone or style
  • Reducing token usage for predictable responses

Python Example: Multiple Choice

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1,
    messages=[
        {
            "role": "user",
            "content": "What is Latin for ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae",
        },
        {"role": "assistant", "content": "The answer is ("},
    ]
)

print(message.content[0].text) # Output: "C"

TypeScript Example

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const message = await client.messages.create({
  model: 'claude-sonnet-4-5',
  max_tokens: 1,
  messages: [
    {
      role: 'user',
      content: 'What is Latin for ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae'
    },
    { role: 'assistant', content: 'The answer is (' }
  ]
});

console.log(message.content[0].text); // Output: "C"
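The same trick works for JSON: prefill the opening of the object, then join the prefill with Claude's completion on your side before parsing. A client-side sketch of that reassembly (the `completion` string here is a stand-in for the text Claude would return, not an actual API result):

```python
import json

# The request would end with a trailing assistant message:
#   messages=[..., {"role": "assistant", "content": prefill}]
prefill = '{"answer": "'

def assemble(prefill: str, completion: str) -> dict:
    """Join the prefill with Claude's completion and parse the full JSON."""
    return json.loads(prefill + completion)

# Suppose Claude completed the turn with:
completion = 'C", "confidence": "high"}'
result = assemble(prefill, completion)
print(result["answer"])  # C
```

Remember that the response contains only the completion, so forgetting to re-attach the prefill is a common source of JSON parse errors.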

Important Limitations

Prefilling is not supported on these models:

  • Claude Mythos Preview
  • Claude Opus 4.7
  • Claude Opus 4.6
  • Claude Sonnet 4.6
Requests using prefill with these models will return a 400 error. For these models, use structured outputs or system prompt instructions instead.

Vision Capabilities

Claude can analyze images sent through the Messages API. This opens up use cases like:

  • Document analysis
  • Image description
  • Visual Q&A

Python Example

import anthropic
import base64

client = anthropic.Anthropic()

# Read and encode image
with open("chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data,
                    },
                },
                {"type": "text", "text": "Describe this chart in detail."},
            ],
        }
    ]
)

print(message.content[0].text)

TypeScript Example

import Anthropic from '@anthropic-ai/sdk';
import * as fs from 'fs';

const client = new Anthropic();

// Read and encode image
const imageBuffer = fs.readFileSync('chart.png');
const imageBase64 = imageBuffer.toString('base64');

const message = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 1024,
  messages: [
    {
      role: 'user',
      content: [
        {
          type: 'image',
          source: {
            type: 'base64',
            media_type: 'image/png',
            data: imageBase64
          }
        },
        { type: 'text', text: 'Describe this chart in detail.' }
      ]
    }
  ]
});

console.log(message.content[0].text);

Supported Image Formats

Claude supports common image formats including PNG, JPEG, GIF, and WebP. For best results, use clear, high-resolution images.
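The `media_type` field in the request must match the image's actual format. One way to derive it, sketched here with Python's standard `mimetypes` module and restricted to the formats listed above:

```python
import mimetypes

SUPPORTED = {"image/png", "image/jpeg", "image/gif", "image/webp"}

def media_type_for(path: str) -> str:
    """Guess the media_type for an image file, restricted to supported formats."""
    guessed, _ = mimetypes.guess_type(path)
    if guessed not in SUPPORTED:
        raise ValueError(f"unsupported image format for {path!r}")
    return guessed

print(media_type_for("chart.png"))  # image/png
print(media_type_for("photo.jpg"))  # image/jpeg
```

Guessing from the extension is a convenience, not a guarantee; if your images come from untrusted sources, sniff the file header instead.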

Handling Stop Reasons

The stop_reason field tells you why Claude stopped generating. Common values:

| stop_reason   | Meaning                         |
|---------------|---------------------------------|
| end_turn      | Claude finished naturally       |
| max_tokens    | Response hit the token limit    |
| stop_sequence | A stop sequence was encountered |
| tool_use      | Claude wants to use a tool      |

For max_tokens, you may want to continue the conversation by sending the partial response back as a new assistant message.
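In practice this becomes a small dispatch in your response loop. A minimal sketch (the action names are illustrative, not part of the API):

```python
def next_action(stop_reason: str) -> str:
    """Map a stop_reason to the application's next step."""
    if stop_reason in ("end_turn", "stop_sequence"):
        return "done"
    if stop_reason == "max_tokens":
        # Consider re-sending with the partial text as an assistant message.
        return "continue"
    if stop_reason == "tool_use":
        return "run_tool"
    raise ValueError(f"unexpected stop_reason: {stop_reason!r}")

print(next_action("max_tokens"))  # continue
```

Raising on unknown values is deliberate: new stop reasons may be added over time, and failing loudly beats silently dropping a response.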

Key Takeaways

  • Stateless design: Always send the full conversation history. The API does not maintain state between requests.
  • Prefill for control: Use prefill to guide Claude's responses, but check model compatibility first.
  • Vision is powerful: Send images alongside text for document analysis, visual Q&A, and more.
  • Monitor stop reasons: Handle max_tokens and tool_use appropriately in your application logic.
  • Token awareness: Track usage fields to manage costs and context limits effectively.