Mastering the Messages API: Build Conversational AI with Claude
Learn how to use Claude's Messages API for multi-turn conversations, response prefilling, and vision tasks. Includes Python and TypeScript code examples.
This guide teaches you how to use Claude's Messages API to build conversational AI applications. You'll learn basic requests, multi-turn conversations, prefilling responses, and vision capabilities with practical code examples.
Introduction
Claude's Messages API is the primary way to interact with Claude programmatically. Whether you're building a chatbot, a content generator, or a vision-enabled assistant, the Messages API gives you direct access to Claude's powerful language and reasoning capabilities.
This guide covers the essential patterns you'll need to work with the Messages API effectively: basic requests, multi-turn conversations, prefilling responses, and vision capabilities. By the end, you'll be able to build sophisticated conversational applications with Claude.
Basic Request and Response
Let's start with the simplest interaction: sending a single message and getting a response. Here's how it looks in Python and TypeScript:
Python
import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)
print(message.content[0].text)
TypeScript
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const message = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Hello, Claude' }
  ]
});
console.log(message.content[0].text);
The response includes the model's reply, along with metadata like the stop_reason and token usage. The stop_reason tells you why Claude stopped generating—commonly "end_turn" (natural completion) or "max_tokens" (hit the token limit).
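To see those fields directly, here is a short sketch that continues from the Python example above; stop_reason and usage (with input_tokens and output_tokens) are the metadata fields exposed on the SDK's response object:

# Continuing from the Python example above
print(message.stop_reason)  # e.g. "end_turn" or "max_tokens"
print(message.usage.input_tokens, message.usage.output_tokens)  # token accounting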
Multi-Turn Conversations
The Messages API is stateless—it doesn't remember previous interactions. To maintain a conversation, you must send the full history with each request. This gives you complete control over the context.
Here's how to build a two-turn conversation:
Python
import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"},
        {"role": "assistant", "content": "Hello! How can I help you today?"},
        {"role": "user", "content": "Can you describe LLMs to me?"}
    ]
)
print(message.content[0].text)
TypeScript
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const message = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Hello, Claude' },
    { role: 'assistant', content: 'Hello! How can I help you today?' },
    { role: 'user', content: 'Can you describe LLMs to me?' }
  ]
});
console.log(message.content[0].text);
Notice that you include the assistant's previous response as part of the input. This pattern allows you to build long-running conversations by appending each new turn to the message array.
Pro tip: You can also inject synthetic assistant messages—they don't have to come from Claude. This is useful for guiding the conversation or providing context.
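As a concrete sketch, here is one way to drive a multi-turn conversation in Python by appending every turn to a running history; the loop and the example prompts are illustrative, not part of the SDK:

import anthropic

client = anthropic.Anthropic()
history = []

for user_input in ["Hello, Claude", "Can you describe LLMs to me?"]:
    # Add the new user turn to the running history
    history.append({"role": "user", "content": user_input})

    message = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=1024,
        messages=history,
    )

    # Add Claude's reply so the next request carries the full context
    reply = message.content[0].text
    history.append({"role": "assistant", "content": reply})
    print(reply)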
Putting Words in Claude's Mouth (Prefilling)
Prefilling lets you start Claude's response for it. You include a partial assistant message at the end of the input, and Claude continues from there. This is powerful for:
- Constraining responses (e.g., multiple choice answers)
- Setting the tone or format
- Guiding Claude toward a specific structure
Python
import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1,
    messages=[
        {
            "role": "user",
            "content": "What is latin for Ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"
        },
        {
            "role": "assistant",
            "content": "The answer is ("
        }
    ]
)
print(message.content[0].text) # Outputs: "C"
TypeScript
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const message = await client.messages.create({
  model: 'claude-sonnet-4-5',
  max_tokens: 1,
  messages: [
    {
      role: 'user',
      content: 'What is latin for Ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae'
    },
    {
      role: 'assistant',
      content: 'The answer is ('
    }
  ]
});
console.log(message.content[0].text); // Outputs: "C"
Setting max_tokens=1 limits Claude to generating a single token, in this case the letter "C". The prefilled text "The answer is (" sets up the context so Claude completes it naturally.
Important: Prefilling is not supported on Claude Mythos Preview, Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6. For these models, use structured outputs or system prompt instructions instead.
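For those models, a rough equivalent of the multiple-choice example is to move the constraint into the system parameter; the instruction wording below is just one possibility, not a prescribed pattern:

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-sonnet-4-5",  # swap in the model you're targeting
    max_tokens=5,
    # The system prompt carries the formatting constraint instead of a prefill
    system="Answer multiple-choice questions with the single letter only, for example: C",
    messages=[
        {
            "role": "user",
            "content": "What is latin for Ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"
        }
    ]
)
print(message.content[0].text)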
Vision Capabilities
The Messages API also supports image inputs. You can send images as base64-encoded data or via URLs. Here's an example:
Python
import anthropic
import base64
client = anthropic.Anthropic()
# Read and encode image
with open("chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data
                    }
                },
                {
                    "type": "text",
                    "text": "Describe this chart in detail."
                }
            ]
        }
    ]
)
print(message.content[0].text)
TypeScript
import Anthropic from '@anthropic-ai/sdk';
import * as fs from 'fs';
const client = new Anthropic();
// Read and encode image
const imageBuffer = fs.readFileSync('chart.png');
const base64Image = imageBuffer.toString('base64');
const message = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 1024,
  messages: [
    {
      role: 'user',
      content: [
        {
          type: 'image',
          source: {
            type: 'base64',
            media_type: 'image/png',
            data: base64Image
          }
        },
        {
          type: 'text',
          text: 'Describe this chart in detail.'
        }
      ]
    }
  ]
});
});
console.log(message.content[0].text);
You can also use image URLs:
{
  "type": "image",
  "source": {
    "type": "url",
    "url": "https://example.com/chart.png"
  }
}
Supported media types include image/jpeg, image/png, image/gif, and image/webp.
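If you load images from disk, a small helper that maps file extensions to these media types can keep the request-building code tidy; this function is purely illustrative and not part of the SDK:

import base64
from pathlib import Path

# Map file extensions to the supported media types listed above
MEDIA_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}

def image_block(path: str) -> dict:
    """Build a base64 image content block for the Messages API."""
    p = Path(path)
    data = base64.b64encode(p.read_bytes()).decode("utf-8")
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": MEDIA_TYPES[p.suffix.lower()],
            "data": data,
        },
    }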
Handling Stop Reasons
Every response includes a stop_reason field. Understanding these helps you handle different scenarios:
- end_turn: Claude finished naturally. The response is complete.
- max_tokens: Claude hit the token limit. The response may be truncated. You can continue by sending the partial response back with a follow-up request.
- stop_sequence: Claude encountered a custom stop sequence you defined.
- tool_use: Claude wants to call a tool (if you've enabled tools).
When you receive stop_reason: "max_tokens", you can append the partial response and ask Claude to continue:
# After getting a truncated response
messages.append({"role": "assistant", "content": partial_response})
messages.append({"role": "user", "content": "Please continue."})
Best Practices
- Manage context length: Since you send the full history, be mindful of token limits. For long conversations, consider summarizing earlier turns or using prompt caching.
- Use system prompts for instructions: For general behavior guidelines, use the system parameter instead of repeating instructions in every user message.
- Handle errors gracefully: The API may return errors for invalid requests (e.g., unsupported model for prefilling). Always check the response status.
- Stream for real-time applications: Use streaming to get tokens as they're generated, improving perceived responsiveness (see the sketch after this list).
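As a sketch of those last two points, the Python SDK's streaming helper prints text as it arrives, with general behavior guidelines carried in the system parameter rather than repeated in every user message; the system text here is only an example:

import anthropic

client = anthropic.Anthropic()

with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=1024,
    system="You are a concise assistant. Answer in plain language.",
    messages=[{"role": "user", "content": "Can you describe LLMs to me?"}],
) as stream:
    # Print text as it's generated for a responsive UI
    for text in stream.text_stream:
        print(text, end="", flush=True)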
Key Takeaways
- The Messages API is stateless—you must send the full conversation history with each request to maintain context.
- Prefilling lets you start Claude's response, which is useful for constraining outputs or guiding format.
- Vision capabilities allow you to send images as base64 or URLs for analysis.
- Monitor stop_reason to handle truncated responses or tool calls appropriately.
- Always check model compatibility for advanced features like prefilling.