Guide | Beginner | API | 2026-05-12

Mastering the Messages API: Build Multi-Turn Conversations and Vision-Powered Apps with Claude

Learn how to use the Claude Messages API for basic requests, multi-turn conversations, prefill techniques, and vision capabilities. Practical code examples included.

Quick Answer

This guide teaches you how to use Claude's Messages API to build stateless multi-turn conversations, prefill responses for structured outputs, and send images for vision analysis, with practical Python and TypeScript examples.

Tags: Messages API, Claude API, Multi-turn Conversations, Vision, Prefill

Claude's Messages API is the backbone of programmatic interaction with Anthropic's powerful language models. Whether you're building a chatbot, a document analysis tool, or a vision-enabled application, understanding how to work with messages is essential.

This guide covers the core patterns you'll use daily: basic requests, multi-turn conversations, prefill techniques, and vision capabilities. By the end, you'll be able to build sophisticated, stateful applications on top of Claude's stateless API.

Understanding the Messages API vs. Claude Managed Agents

Anthropic offers two primary ways to build with Claude:

Feature	Messages API	Claude Managed Agents
What it is	Direct model prompting access	Pre-built, configurable agent harness
Best for	Custom agent loops and fine-grained control	Long-running tasks and asynchronous work
Learn more	Messages API docs	Managed Agents docs

For most developers building custom applications, the Messages API is the right choice. It gives you full control over every message sent and received.

Making Your First API Request

Let's start with the simplest possible interaction: sending a single message and getting a response.

Python Example

import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)
print(message)

TypeScript Example

import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const message = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Hello, Claude' }
  ]
});
console.log(message);

Understanding the Response

The API returns a structured response object:

{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello!"
    }
  ],
  "model": "claude-opus-4-7",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 12,
    "output_tokens": 6
  }
}

Key fields to note:

content: An array of content blocks (text, tool_use, etc.)
stop_reason: Why the model stopped (common values are "end_turn", "max_tokens", "stop_sequence", and "tool_use")
usage: Token counts for billing and monitoring
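
The fields above can be read defensively instead of indexing content[0] directly. The sketch below works on a plain dict shaped like the JSON response shown earlier; `summarize_response` is a hypothetical helper name chosen for illustration, not an SDK function.

```python
from typing import Any


def summarize_response(response: dict[str, Any]) -> dict[str, Any]:
    """Collect the text, stop reason, and token total from a response dict.

    Assumes `response` has the shape of the JSON example above.
    """
    # Join only the text blocks; other block types (e.g. tool_use) have no "text".
    text = "".join(
        block.get("text", "")
        for block in response.get("content", [])
        if block.get("type") == "text"
    )
    usage = response.get("usage", {})
    return {
        "text": text,
        "stop_reason": response.get("stop_reason"),
        "total_tokens": usage.get("input_tokens", 0) + usage.get("output_tokens", 0),
    }


sample = {
    "content": [{"type": "text", "text": "Hello!"}],
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 12, "output_tokens": 6},
}
summary = summarize_response(sample)
print(summary)
```

With the Python SDK the response is an object rather than a dict, so the same values would be read as attributes such as `message.content`, `message.stop_reason`, and `message.usage.input_tokens`.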

Building Multi-Turn Conversations

The Messages API is stateless — Claude doesn't remember previous interactions. You must send the full conversation history with every request.

The Conversation Pattern

To build a multi-turn conversation, you append each new message to the messages array:

import anthropic
client = anthropic.Anthropic()
# Start the conversation
messages = [
    {"role": "user", "content": "What is the capital of France?"}
]
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=messages
)
# Append Claude's response
messages.append({"role": "assistant", "content": response.content[0].text})
# Ask a follow-up
messages.append({"role": "user", "content": "What is its population?"})
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=messages
)
print(response.content[0].text)
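
The append-and-resend pattern can be wrapped in a small class so the bookkeeping lives in one place. This is a minimal sketch: `Conversation`, `ask`, and `fake_send` are hypothetical names, and the API call is replaced by an injected callable so the history handling can be seen without a network request.

```python
from typing import Callable

Message = dict[str, str]


class Conversation:
    """Keeps the full message history and resends it on every turn.

    `send` is any callable that takes the message list and returns the
    assistant's reply text.
    """

    def __init__(self, send: Callable[[list[Message]], str]) -> None:
        self._send = send
        self.messages: list[Message] = []

    def ask(self, user_text: str) -> str:
        self.messages.append({"role": "user", "content": user_text})
        # Pass a copy so the callable cannot mutate the stored history.
        reply = self._send(list(self.messages))
        self.messages.append({"role": "assistant", "content": reply})
        return reply


# Stand-in for the real API call: reports how many messages it was sent.
def fake_send(messages: list[Message]) -> str:
    return f"(received {len(messages)} messages)"


chat = Conversation(fake_send)
print(chat.ask("What is the capital of France?"))  # (received 1 messages)
print(chat.ask("What is its population?"))         # (received 3 messages)
```

In a real application, `send` would presumably wrap `client.messages.create(...)` and return the text of the first content block, as in the example above. The second call receiving three messages is the point: every request carries the whole history.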

Synthetic Assistant Messages

You don't have to use only real Claude responses. You can inject synthetic assistant messages — pre-written text that appears to come from Claude. This is useful for:

Guiding the conversation: Inserting context or corrections
Role-playing scenarios: Setting up a character's backstory
Data augmentation: Creating training datasets

messages = [
    {"role": "user", "content": "Tell me about ancient Rome."},
    # Synthetic assistant message to steer the response
    {"role": "assistant", "content": "Let me focus specifically on Roman engineering achievements."},
    {"role": "user", "content": "What were their most impressive structures?"}
]
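
One way to use synthetic assistant messages for steering is to supply a few worked examples before the real question. The sketch below assumes that use; `build_few_shot_messages` and the classification task are illustrative and do not come from the API.

```python
def build_few_shot_messages(
    examples: list[tuple[str, str]], question: str
) -> list[dict[str, str]]:
    """Turn (user_text, assistant_text) pairs into a messages list ending with `question`."""
    messages: list[dict[str, str]] = []
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        # Synthetic: written by us, not generated by the model.
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": question})
    return messages


messages = build_few_shot_messages(
    examples=[
        ("Classify: 'I love this product'", "positive"),
        ("Classify: 'It broke after a day'", "negative"),
    ],
    question="Classify: 'Shipping was fast and the fit is perfect'",
)
for m in messages:
    print(m["role"], "->", m["content"])
```

The resulting list can be passed as the messages argument of a request. Keeping the synthetic turns short and consistent makes it less likely that they pull the conversation off course.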

Prefilling Claude's Response (Putting Words in Claude's Mouth)

You can prefill part of Claude's response by including an assistant message at the end of your input. Claude continues from where the prefill ends, so the returned text contains only the continuation, not the prefill itself.

Use Case: Multiple Choice Questions

import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1,  # Only need one token for the answer
    messages=[
        {
            "role": "user",
            "content": "What is 1 + 2? (A) 1 (B) 2 (C) 3 (D) 4"
        },
        {
            "role": "assistant",
            # Prefill starts here; it must not end with trailing whitespace
            "content": "The correct answer is ("
        }
    ]
)
print(response.content[0].text)  # Likely output: C

Important Limitations

Prefilling is not supported on Claude Mythos Preview, Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6. Requests using prefill with these models return a 400 error.

For these models, use structured outputs or system prompt instructions instead.
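
The fallback described above can be sketched as a request builder that prefills only when the model allows it. Several things here are assumptions: `build_request` is a hypothetical helper, only "claude-opus-4-7" appears as a model ID in this guide, the other two prefixes are guesses at the naming pattern, and no ID is included for Claude Mythos Preview because none is given here.

```python
# Assumption: ID prefixes for the models listed above. Verify these against
# the current models list before relying on them.
NO_PREFILL_PREFIXES = ("claude-opus-4-7", "claude-opus-4-6", "claude-sonnet-4-6")


def build_request(model: str, question: str, prefill: str) -> dict:
    """Build keyword arguments for a request, using prefill only where supported."""
    messages = [{"role": "user", "content": question}]
    request = {"model": model, "max_tokens": 1024, "messages": messages}
    if model.startswith(NO_PREFILL_PREFIXES):
        # No prefill available: describe the desired opening in the system prompt.
        request["system"] = f"Begin your reply with exactly: {prefill.strip()}"
    else:
        # Strip trailing whitespace, which a final assistant message must not have.
        messages.append({"role": "assistant", "content": prefill.rstrip()})
    return request


with_prefill = build_request("claude-sonnet-4-5", "Pick one: (A) or (B)?", "The answer is ( ")
without_prefill = build_request("claude-opus-4-7", "Pick one: (A) or (B)?", "The answer is (")
print(with_prefill["messages"][-1])
print(without_prefill["system"])
```

The returned dict could then be unpacked into the call, for example `client.messages.create(**request)`. A system prompt instruction is a softer constraint than prefill, so the output should still be validated.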

Vision: Sending Images to Claude

Claude can analyze images alongside text. You can supply images using three source types:

1. Base64 Encoding

import anthropic
import base64
client = anthropic.Anthropic()
with open("photo.jpg", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/jpeg",
                        "data": image_data
                    }
                },
                {
                    "type": "text",
                    "text": "What is in this image?"
                }
            ]
        }
    ]
)
print(response.content[0].text)

2. URL Reference

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "url",
                        "url": "https://example.com/photo.jpg"
                    }
                },
                {
                    "type": "text",
                    "text": "Describe this image in detail."
                }
            ]
        }
    ]
)

3. File Reference (via Files API)

If you've uploaded an image through the Files API, you can reference it by file ID:

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "file",
                        "file_id": "file_01ABCDEFGHIJKLMNOPQRSTUV"
                    }
                },
                {
                    "type": "text",
                    "text": "What can you tell me about this image?"
                }
            ]
        }
    ]
)

Supported Media Types

Format	MIME Type
JPEG	`image/jpeg`
PNG	`image/png`
GIF	`image/gif`
WebP	`image/webp`
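
The table above can be turned into a small guard that builds a base64 image block and rejects unsupported formats before a request is sent. This is a sketch under a stated assumption: the media type is inferred from the file extension only, and `image_block` is a hypothetical helper name.

```python
import base64
from pathlib import PurePath

# Extension -> MIME type, taken from the table above.
SUPPORTED_MEDIA_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def image_block(filename: str, data: bytes) -> dict:
    """Build a base64 image content block, rejecting unsupported formats."""
    suffix = PurePath(filename).suffix.lower()
    if suffix not in SUPPORTED_MEDIA_TYPES:
        raise ValueError(f"Unsupported image type: {suffix or filename!r}")
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": SUPPORTED_MEDIA_TYPES[suffix],
            "data": base64.b64encode(data).decode("utf-8"),
        },
    }


# Placeholder bytes for illustration; a real call would read the file contents.
block = image_block("photo.PNG", b"\x89PNG fake bytes for illustration")
print(block["source"]["media_type"])  # image/png
```

The returned dict has the same shape as the image block in the base64 example, so it can be placed in a user message's content list next to a text block.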

Handling Stop Reasons

Understanding why Claude stopped generating helps you build robust applications:

`stop_reason`	Meaning	Action
`"end_turn"`	Claude finished naturally	Return the response
`"max_tokens"`	Hit the token limit	Increase `max_tokens` or request a continuation
`"stop_sequence"`	Found a stop sequence	Process the response
`"tool_use"`	Claude wants to call a tool	Execute the tool and continue

if response.stop_reason == "max_tokens":
    print("Response was cut off. Consider increasing max_tokens.")
elif response.stop_reason == "tool_use":
    print("Claude requested a tool call. Handle it before continuing.")
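
The table's actions can also be expressed as a function that decides what to send next. The sketch below is one possible arrangement, not a prescribed pattern: `plan_next_step` and its action labels are hypothetical, tool execution is left to the caller, and the continuation prompt wording is an assumption.

```python
from typing import Optional


def plan_next_step(
    stop_reason: Optional[str], reply_text: str, messages: list[dict]
) -> tuple[str, list[dict]]:
    """Return (action, messages_for_next_request) for a given stop_reason."""
    if stop_reason in ("end_turn", "stop_sequence"):
        return "done", messages
    if stop_reason == "max_tokens":
        # Keep the truncated reply in the history and ask for the rest.
        follow_up = messages + [
            {"role": "assistant", "content": reply_text},
            {"role": "user", "content": "Please continue from where you stopped."},
        ]
        return "continue", follow_up
    if stop_reason == "tool_use":
        # Tool execution is application-specific and left to the caller.
        return "run_tool", messages
    # Unknown or future values: surface them instead of guessing.
    return "unhandled", messages


history = [{"role": "user", "content": "Write a long story."}]
action, next_messages = plan_next_step("max_tokens", "Once upon a", history)
print(action, len(next_messages))  # continue 3
```

Returning a new list rather than mutating `history` keeps the original conversation intact if the caller decides not to continue.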

Best Practices

Always send the full history: the API is stateless, so don't rely on server-side memory.
Use synthetic messages sparingly: they're powerful for steering but can confuse Claude if overused.
Monitor token usage: the usage field helps you track costs and optimize prompts.
Handle max_tokens gracefully: detect truncated responses, then raise the limit or ask Claude to continue.
Manage long conversations: prompt caching lowers the cost of resending a long history, and compaction shortens the history itself.
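
As a rough illustration of keeping long histories in check, the sketch below keeps only the most recent messages. It is a naive stand-in that discards older context rather than summarizing it, so it is not the compaction feature itself. `trim_history` is a hypothetical helper, and it assumes you want the trimmed history to open with a user turn.

```python
def trim_history(messages: list[dict], max_messages: int) -> list[dict]:
    """Keep at most `max_messages` recent messages, starting on a user turn."""
    if max_messages < 1:
        raise ValueError("max_messages must be at least 1")
    trimmed = messages[-max_messages:]
    # Drop leading assistant messages so the history opens with a user turn.
    while trimmed and trimmed[0]["role"] != "user":
        trimmed = trimmed[1:]
    return trimmed


history = [
    {"role": "user", "content": "q1"},
    {"role": "assistant", "content": "a1"},
    {"role": "user", "content": "q2"},
    {"role": "assistant", "content": "a2"},
    {"role": "user", "content": "q3"},
]
recent = trim_history(history, max_messages=4)
print([m["content"] for m in recent])  # ['q2', 'a2', 'q3']
```

Because dropped turns are gone for good, this only suits conversations where early context stops mattering. Summarizing older turns preserves more information at the cost of an extra request.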

Key Takeaways

The Messages API is stateless — You must send the full conversation history with every request to maintain context.
Multi-turn conversations are built by appending user and assistant messages to the messages array.
Prefilling lets you shape Claude's response by providing the beginning of an assistant message, but it's not supported on all models.
Vision capabilities allow you to send images via base64, URL, or file reference, supporting JPEG, PNG, GIF, and WebP formats.
Always check stop_reason to understand why Claude stopped and handle edge cases like token limits or tool calls.