BeClaude
Guide · Beginner · API · 2026-05-12

Mastering the Messages API: Build Multi-Turn Conversations and Vision-Powered Apps with Claude

Learn how to use the Claude Messages API for basic requests, multi-turn conversations, prefill techniques, and vision capabilities. Practical code examples included.

Quick Answer

This guide teaches you how to use Claude's Messages API to build stateless multi-turn conversations, prefill responses for structured outputs, and send images for vision analysis, with practical Python and TypeScript examples.

Tags: Messages API, Claude API, Multi-turn Conversations, Vision, Prefill

Claude's Messages API is the backbone of programmatic interaction with Anthropic's powerful language models. Whether you're building a chatbot, a document analysis tool, or a vision-enabled application, understanding how to work with messages is essential.

This guide covers the core patterns you'll use daily: basic requests, multi-turn conversations, prefill techniques, and vision capabilities. By the end, you'll be able to build sophisticated, stateful applications on top of Claude's stateless API.

Understanding the Messages API vs. Claude Managed Agents

Anthropic offers two primary ways to build with Claude:

| Feature | Messages API | Claude Managed Agents |
|---|---|---|
| What it is | Direct model prompting access | Pre-built, configurable agent harness |
| Best for | Custom agent loops and fine-grained control | Long-running tasks and asynchronous work |
| Learn more | Messages API docs | Managed Agents docs |

For most developers building custom applications, the Messages API is the right choice. It gives you full control over every message sent and received.

Making Your First API Request

Let's start with the simplest possible interaction: sending a single message and getting a response.

Python Example

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ],
)

print(message)

TypeScript Example

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const message = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Hello, Claude' }
  ]
});

console.log(message);

Understanding the Response

The API returns a structured response object:

{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello!"
    }
  ],
  "model": "claude-opus-4-7",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 12,
    "output_tokens": 6
  }
}

Key fields to note:

  • content: An array of content blocks (text, tool_use, etc.)
  • stop_reason: Why the model stopped. Common values are "end_turn", "max_tokens", "stop_sequence", and "tool_use"; the API can also return others, such as "pause_turn" or "refusal"
  • usage: Token counts for billing and monitoring
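
The fields above can be read with a couple of small helpers. The following is a minimal sketch that works on the raw JSON shape shown earlier rather than on the SDK's typed response object; the function names `extract_text` and `total_tokens` are illustrative, not part of any library.

```python
from typing import Any


def extract_text(response: dict[str, Any]) -> str:
    """Join the text of every "text" block, skipping other block types."""
    return "".join(
        block["text"]
        for block in response.get("content", [])
        if block.get("type") == "text"
    )


def total_tokens(response: dict[str, Any]) -> int:
    """Add the input and output token counts from the usage field."""
    usage = response.get("usage", {})
    return usage.get("input_tokens", 0) + usage.get("output_tokens", 0)


# The sample response from this section, trimmed to the fields used here.
sample = {
    "content": [{"type": "text", "text": "Hello!"}],
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 12, "output_tokens": 6},
}

print(extract_text(sample))
print(total_tokens(sample))
```

Filtering on the block type matters because content is an array: a response that includes a tool_use block has no "text" key on that block.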

Building Multi-Turn Conversations

The Messages API is stateless — Claude doesn't remember previous interactions. You must send the full conversation history with every request.

The Conversation Pattern

To build a multi-turn conversation, you append each new message to the messages array:

import anthropic

client = anthropic.Anthropic()

# Start the conversation
messages = [
    {"role": "user", "content": "What is the capital of France?"}
]

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=messages,
)

# Append Claude's response
messages.append({"role": "assistant", "content": response.content[0].text})

# Ask a follow-up
messages.append({"role": "user", "content": "What is its population?"})

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=messages,
)

print(response.content[0].text)
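
The append-and-resend pattern can be wrapped in a small class so that callers never manage the list by hand. This is a sketch under assumptions: `Conversation`, `SendFn`, and `fake_send` are hypothetical names, and the real API call is replaced by a stand-in function so the example runs without network access.

```python
from typing import Callable

Message = dict[str, str]
# Hypothetical signature: takes the full history, returns the assistant's text.
SendFn = Callable[[list[Message]], str]


class Conversation:
    """Keeps the running history and resends all of it on every turn."""

    def __init__(self, send: SendFn) -> None:
        self._send = send
        self.messages: list[Message] = []

    def ask(self, user_text: str) -> str:
        self.messages.append({"role": "user", "content": user_text})
        # Pass a copy so the callable cannot mutate the stored history.
        reply = self._send(list(self.messages))
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def fake_send(history: list[Message]) -> str:
    """Stand-in for a real API call; reports how much history it received."""
    return f"(reply after seeing {len(history)} message(s))"


chat = Conversation(fake_send)
chat.ask("What is the capital of France?")
print(chat.ask("What is its population?"))
```

In a real application, the `send` callable would wrap `client.messages.create(...)` and return `response.content[0].text`, as in the example above.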

Synthetic Assistant Messages

You don't have to use only real Claude responses. You can inject synthetic assistant messages — pre-written text that appears to come from Claude. This is useful for:

  • Guiding the conversation: Inserting context or corrections
  • Role-playing scenarios: Setting up a character's backstory
  • Data augmentation: Creating training datasets

messages = [
    {"role": "user", "content": "Tell me about ancient Rome."},
    # Synthetic assistant message to steer the response
    {"role": "assistant", "content": "Let me focus specifically on Roman engineering achievements."},
    {"role": "user", "content": "What were their most impressive structures?"}
]
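
One common use of synthetic turns is few-shot prompting, where hand-written question and answer pairs precede the real question. The sketch below assumes a hypothetical helper, `few_shot_messages`, and invented example text; it only builds the messages list and does not call the API.

```python
def few_shot_messages(
    examples: list[tuple[str, str]], question: str
) -> list[dict[str, str]]:
    """Turn (user, assistant) example pairs plus a final question into a history."""
    messages: list[dict[str, str]] = []
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        # Synthetic turn: written by us, not generated by Claude.
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": question})
    return messages


history = few_shot_messages(
    [
        ("Name one Roman road.", "Via Appia."),
        ("Name one Roman aqueduct.", "Aqua Claudia."),
    ],
    "Name one Roman amphitheatre.",
)

for message in history:
    print(f'{message["role"]}: {message["content"]}')
```

The short, uniform answers in the synthetic turns are what steer the format of the real reply.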

Prefilling Claude's Response (Putting Words in Claude's Mouth)

You can prefill part of Claude's response by including an assistant message at the end of your input. This shapes the beginning of Claude's output.

Use Case: Multiple Choice Questions

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1,  # Only need one token for the answer
    messages=[
        {
            "role": "user",
            "content": "What is the correct answer? A) 1 B) 2 C) 3 D) 4",
        },
        {
            "role": "assistant",
            # Prefill starts here; it must not end with trailing whitespace
            "content": "The correct answer is",
        },
    ],
)

print(response.content[0].text)  # Prints the single token that continues the prefill
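
The response contains only the continuation, not the prefilled text, so the two have to be joined before use. A minimal sketch of that step for JSON output follows, assuming a prefill of `{`; the helper name and the continuation string are made up for illustration, with the string standing in for `response.content[0].text`.

```python
import json
from typing import Any


def parse_prefilled_json(prefill: str, continuation: str) -> Any:
    """Rebuild the full reply from prefill + continuation, then parse it."""
    return json.loads(prefill + continuation)


# Hypothetical continuation, standing in for response.content[0].text
# after a request whose final assistant message was "{".
continuation = '"city": "Paris", "country": "France"}'

print(parse_prefilled_json("{", continuation))
```

Parsing the continuation alone would fail here, because the opening brace lives in the prefill.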

Important Limitations

Prefilling is not supported on Claude Mythos Preview, Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6. Requests using prefill with these models return a 400 error.

For these models, use structured outputs or system prompt instructions instead.
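
One way to cope with this split is to choose between a prefill and a system prompt instruction when the request is built. The sketch below is an assumption-heavy illustration: `build_request` and `NO_PREFILL_MODELS` are hypothetical names, and the set contains only the one model ID that appears in this guide. IDs for the other models listed above are not guessed here and would need to be added from the official model list.

```python
from typing import Any

# Assumption: only the ID used elsewhere in this guide is listed.
NO_PREFILL_MODELS = {"claude-opus-4-7"}


def build_request(model: str, user_text: str, prefill: str) -> dict[str, Any]:
    """Build keyword arguments for a request, using prefill only where allowed."""
    messages = [{"role": "user", "content": user_text}]
    request: dict[str, Any] = {
        "model": model,
        "max_tokens": 1024,
        "messages": messages,
    }
    if model in NO_PREFILL_MODELS:
        # Fall back to a system prompt instruction instead of a prefill.
        request["system"] = f"Begin your reply with exactly: {prefill}"
    else:
        # Strip trailing whitespace from the final assistant turn.
        messages.append({"role": "assistant", "content": prefill.rstrip()})
    return request


print(build_request("claude-opus-4-7", "List three colors as JSON.", "["))
print(build_request("claude-sonnet-4-5", "List three colors as JSON.", "["))
```

The resulting dictionary could be passed as `client.messages.create(**request)`. The two paths differ in one respect: with the system prompt fallback the reply should include the prefix itself, whereas with a prefill it contains only the continuation.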

Vision: Sending Images to Claude

Claude can analyze images alongside text. You can supply images using three source types:

1. Base64 Encoding

import anthropic
import base64

client = anthropic.Anthropic()

# Read the image and encode it as a base64 string
with open("photo.jpg", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/jpeg",
                        "data": image_data,
                    },
                },
                {"type": "text", "text": "What is in this image?"},
            ],
        }
    ],
)

print(response.content[0].text)

2. URL Reference

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "url",
                        "url": "https://example.com/photo.jpg"
                    }
                },
                {
                    "type": "text",
                    "text": "Describe this image in detail."
                }
            ]
        }
    ]
)

3. File Reference (via Files API)

If you've uploaded an image through the Files API, you can reference it by file ID:

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "file",
                        "file_id": "file_01ABCDEFGHIJKLMNOPQRSTUV"
                    }
                },
                {
                    "type": "text",
                    "text": "What can you tell me about this image?"
                }
            ]
        }
    ]
)
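
The three requests above differ only in the source object, so the rest of the message can be built by one shared function. This is a sketch with hypothetical helper names (`base64_source`, `url_source`, `file_source`, `image_message`); the dictionary shapes are copied from the examples in this section.

```python
import base64
from typing import Any


def base64_source(data: bytes, media_type: str) -> dict[str, str]:
    """Source object for raw image bytes."""
    encoded = base64.b64encode(data).decode("utf-8")
    return {"type": "base64", "media_type": media_type, "data": encoded}


def url_source(url: str) -> dict[str, str]:
    """Source object for a publicly reachable image URL."""
    return {"type": "url", "url": url}


def file_source(file_id: str) -> dict[str, str]:
    """Source object for an image already uploaded through the Files API."""
    return {"type": "file", "file_id": file_id}


def image_message(source: dict[str, str], prompt: str) -> dict[str, Any]:
    """User message with one image block followed by one text block."""
    return {
        "role": "user",
        "content": [
            {"type": "image", "source": source},
            {"type": "text", "text": prompt},
        ],
    }


message = image_message(url_source("https://example.com/photo.jpg"), "Describe this image.")
print(message)
```

The returned dictionary can be used as an element of the `messages` list in any of the calls shown above.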

Supported Media Types

| Format | MIME Type |
|---|---|
| JPEG | image/jpeg |
| PNG | image/png |
| GIF | image/gif |
| WebP | image/webp |
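
When images come from user uploads, it can help to check the file type before sending a request. The sketch below maps file extensions to the MIME types in the table; `media_type_for` is a hypothetical helper, and checking the extension alone is an assumption that does not verify the actual file contents.

```python
from pathlib import Path

# Extensions mapped to the MIME types listed in the table above.
SUPPORTED_MEDIA_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def media_type_for(filename: str) -> str:
    """Return the MIME type for a supported image, or raise ValueError."""
    suffix = Path(filename).suffix.lower()
    if suffix not in SUPPORTED_MEDIA_TYPES:
        raise ValueError(f"Unsupported image extension for {filename!r}")
    return SUPPORTED_MEDIA_TYPES[suffix]


print(media_type_for("photo.JPG"))
print(media_type_for("diagram.webp"))
```

The returned string is the value to place in the `media_type` field of a base64 source.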

Handling Stop Reasons

Understanding why Claude stopped generating helps you build robust applications:

| stop_reason | Meaning | Action |
|---|---|---|
| "end_turn" | Claude finished naturally | Return the response |
| "max_tokens" | Hit the token limit | Increase max_tokens or truncate |
| "stop_sequence" | Found a stop sequence | Process the response |
| "tool_use" | Claude wants to call a tool | Execute the tool and continue |

if response.stop_reason == "max_tokens":
    print("Response was cut off. Consider increasing max_tokens.")
elif response.stop_reason == "tool_use":
    print("Claude requested a tool call. Handle it before continuing.")
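
The table can also be expressed as a lookup, which keeps the handling in one place and gives unknown values an explicit default. This is a sketch: the action labels and the `next_action` name are invented for illustration, and a real application would call handler functions instead of returning strings.

```python
from typing import Optional

# Action labels are hypothetical; they mirror the table above.
ACTIONS = {
    "end_turn": "return_response",
    "stop_sequence": "process_response",
    "max_tokens": "retry_with_more_tokens",
    "tool_use": "execute_tool",
}


def next_action(stop_reason: Optional[str]) -> str:
    """Map a stop_reason to an action, with a safe default for unknown values."""
    return ACTIONS.get(stop_reason, "inspect_manually")


for reason in ("end_turn", "max_tokens", "tool_use", "something_new"):
    print(reason, "->", next_action(reason))
```

The default branch matters because the set of stop reasons can grow over time, and silently treating a new value as a normal completion could hide truncated or incomplete output.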

Best Practices

  • Always send the full history: The API is stateless, so don't rely on server-side memory.
  • Use synthetic messages wisely: They're powerful for steering but can confuse Claude if overused.
  • Monitor token usage: The usage field helps you track costs and optimize prompts.
  • Handle max_tokens gracefully: Detect truncated responses, then retry with a higher max_tokens or ask Claude to continue.
  • Manage long conversations: Use prompt caching to reduce the cost of resending a long history, and compaction to shorten it.
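
As a simple illustration of keeping a long history bounded, the sketch below keeps only the most recent messages. This is plain truncation, a cruder technique than the caching or compaction mentioned above, and it discards earlier context outright. `trim_history` is a hypothetical helper; it drops any leading assistant turn so the trimmed history still starts with a user message.

```python
def trim_history(
    messages: list[dict[str, str]], max_messages: int
) -> list[dict[str, str]]:
    """Keep at most max_messages recent messages, starting on a user turn."""
    if max_messages < 1:
        raise ValueError("max_messages must be at least 1")
    trimmed = messages[-max_messages:]
    # Drop leading assistant turns so the first message has the user role.
    while trimmed and trimmed[0]["role"] != "user":
        trimmed = trimmed[1:]
    return trimmed


history = [
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "Paris."},
    {"role": "user", "content": "What is its population?"},
    {"role": "assistant", "content": "About two million in the city proper."},
    {"role": "user", "content": "And the metro area?"},
]

for message in trim_history(history, 4):
    print(f'{message["role"]}: {message["content"]}')
```

With a limit of 4, the window would begin on an assistant turn, so that turn is dropped and three messages remain.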

Key Takeaways

  • The Messages API is stateless — You must send the full conversation history with every request to maintain context.
  • Multi-turn conversations are built by appending user and assistant messages to the messages array.
  • Prefilling lets you shape Claude's response by providing the beginning of an assistant message, but it's not supported on all models.
  • Vision capabilities allow you to send images via base64, URL, or file reference, supporting JPEG, PNG, GIF, and WebP formats.
  • Always check stop_reason to understand why Claude stopped and handle edge cases like token limits or tool calls.