Claude Guide
2026-05-02

Mastering the Messages API: A Practical Guide to Building Conversational AI with Claude

Learn how to use Claude's Messages API for single-turn queries, multi-turn conversations, prefill techniques, and vision. Includes code examples and best practices.

Quick Answer

This guide teaches you how to use Claude's Messages API to build conversational AI applications, covering basic requests, multi-turn conversations, prefill techniques, and vision capabilities with practical code examples.

Tags: Messages API, Claude API, Conversational AI, Prefill, Vision


Claude's Messages API is the foundation for building custom AI applications. Whether you're creating a chatbot, a content generator, or a vision-enabled assistant, understanding how to work with messages is essential. This guide walks you through the core patterns, from simple requests to advanced techniques like prefill and vision.

Understanding the Messages API vs. Managed Agents

Anthropic offers two primary ways to build with Claude:

  • Messages API: Direct model prompting access. Best for custom agent loops and fine-grained control.
  • Claude Managed Agents: Pre-built, configurable agent harness that runs in managed infrastructure. Best for long-running tasks and asynchronous work.

This guide focuses on the Messages API, which gives you full control over every aspect of the conversation.

Basic Request and Response

Let's start with the simplest interaction: sending a single message to Claude and receiving a response.

Python Example

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ],
)

print(message)

Response Structure

{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello!"
    }
  ],
  "model": "claude-opus-4-7",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 12,
    "output_tokens": 6
  }
}

Key fields to understand:

  • content: An array of content blocks (text, images, tool use, etc.)
  • stop_reason: Why the response ended ("end_turn", "max_tokens", "stop_sequence", or "tool_use")
  • usage: Token counts for billing and optimization
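These fields can be read directly from the response. The sketch below uses the raw JSON shape from above as a plain dict; the Python SDK exposes the same fields as attributes (e.g. message.stop_reason), so the access pattern carries over:

```python
# The response shown above, as a plain dict (the SDK returns an object
# with the same fields as attributes).
response = {
    "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
    "type": "message",
    "role": "assistant",
    "content": [{"type": "text", "text": "Hello!"}],
    "model": "claude-opus-4-7",
    "stop_reason": "end_turn",
    "stop_sequence": None,
    "usage": {"input_tokens": 12, "output_tokens": 6},
}

# Pull the text out of the first text content block.
text = next(b["text"] for b in response["content"] if b["type"] == "text")

# Total tokens billed for this exchange.
total_tokens = response["usage"]["input_tokens"] + response["usage"]["output_tokens"]

print(text)                      # Hello!
print(response["stop_reason"])   # end_turn
print(total_tokens)              # 18
```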

Building Multi-Turn Conversations

The Messages API is stateless—you must send the full conversation history with every request. This gives you complete control over context.

Python Example: Multi-Turn

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"},
        {"role": "assistant", "content": "Hello!"},
        {"role": "user", "content": "Can you describe LLMs to me?"},
    ],
)

print(message.content[0].text)

Important Notes

  • Conversational history is explicit: You must include all previous messages in each request.
  • Synthetic assistant messages: Earlier turns don't need to come from Claude. You can inject pre-written assistant responses to guide the conversation.
  • Role alternation: Messages must alternate between "user" and "assistant" roles. The last message must be from the user.
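A minimal sketch of managing that history client-side. The helper name add_turn is invented here for illustration (it is not part of the SDK); it enforces the alternation rule before you ever send a request:

```python
def add_turn(history, role, content):
    """Append a turn, enforcing the user/assistant alternation rule."""
    if not history and role != "user":
        raise ValueError("The first message must be from the user")
    if history and history[-1]["role"] == role:
        raise ValueError(f"Consecutive {role!r} messages are not allowed")
    history.append({"role": role, "content": content})
    return history

history = []
add_turn(history, "user", "Hello, Claude")
add_turn(history, "assistant", "Hello!")  # a synthetic assistant turn
add_turn(history, "user", "Can you describe LLMs to me?")

# history is now ready to pass as messages=history in client.messages.create(...)
print([m["role"] for m in history])  # ['user', 'assistant', 'user']
```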

Prefill Technique: Putting Words in Claude's Mouth

Prefill allows you to start Claude's response for it. This is powerful for:

  • Forcing structured outputs (e.g., JSON, multiple choice)
  • Guiding tone or style
  • Reducing output tokens for specific tasks

Example: Multiple Choice Answer

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1,
    messages=[
        {
            "role": "user",
            "content": "What is the Latin name for ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae",
        },
        {"role": "assistant", "content": "The answer is ("},
    ],
)

print(message.content[0].text) # Outputs: "C"
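One detail worth noting: the response contains only the continuation, not the prefill text itself. If you need the full sentence, concatenate the two yourself:

```python
prefill = "The answer is ("
continuation = "C"  # message.content[0].text from the request above

# The API never echoes the prefill back, so reassemble manually.
full_text = prefill + continuation
print(full_text)  # The answer is (C
```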

Prefill Limitations

Important: Prefill is not supported on Claude Mythos Preview, Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6. Requests using prefill with these models return a 400 error. Use structured outputs or system prompt instructions instead.
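To fail fast in client code, you can validate a request before sending it. The model ID strings below are assumptions inferred from the display names in this guide; treat the official model list as the source of truth:

```python
# Assumed model IDs for the models listed above as rejecting prefill;
# verify these against the official model list before relying on them.
NO_PREFILL_MODELS = {
    "claude-mythos-preview",
    "claude-opus-4-7",
    "claude-opus-4-6",
    "claude-sonnet-4-6",
}

def uses_prefill(messages):
    """A request uses prefill when its last message is from the assistant."""
    return bool(messages) and messages[-1]["role"] == "assistant"

def check_prefill_support(model, messages):
    """Raise locally instead of letting the API reject the request with a 400."""
    if uses_prefill(messages) and model in NO_PREFILL_MODELS:
        raise ValueError(f"Prefill is not supported on {model}")

messages = [
    {"role": "user", "content": "Pick A, B, or C."},
    {"role": "assistant", "content": "The answer is ("},
]
check_prefill_support("claude-sonnet-4-5", messages)  # passes silently
```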

Vision: Working with Images

Claude can analyze images sent via the Messages API. This enables use cases like:

  • Document analysis
  • Screenshot interpretation
  • Visual Q&A

Python Example: Image Analysis

import anthropic
import base64

client = anthropic.Anthropic()

# Read and encode the image
with open("chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data,
                    },
                },
                {"type": "text", "text": "Describe this chart in detail."},
            ],
        }
    ],
)

print(message.content[0].text)

Supported Image Formats

  • JPEG
  • PNG
  • GIF
  • WebP

Images are resized and compressed by the API to optimize processing. The API supports up to 20 images per request (or more with certain configurations).
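When loading images from disk, the media_type field has to match the file's actual format. A small helper, sketched here with the standard mimetypes module, can derive it from the filename:

```python
import mimetypes

# The four formats the API accepts.
SUPPORTED_MEDIA_TYPES = {"image/jpeg", "image/png", "image/gif", "image/webp"}

# Older Python versions don't know .webp, so register it explicitly.
mimetypes.add_type("image/webp", ".webp")

def image_media_type(path):
    """Guess the media_type string for an image block, or raise if unsupported."""
    guessed, _ = mimetypes.guess_type(path)
    if guessed not in SUPPORTED_MEDIA_TYPES:
        raise ValueError(f"Unsupported image format: {path} ({guessed})")
    return guessed

print(image_media_type("chart.png"))   # image/png
print(image_media_type("photo.jpeg"))  # image/jpeg
```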

Handling Stop Reasons

Understanding why Claude stopped generating helps you handle the response appropriately:

  • "end_turn": Claude finished naturally. Action: process the response.
  • "max_tokens": The response hit the token limit. Action: increase max_tokens or continue the conversation.
  • "stop_sequence": A custom stop sequence was hit. Action: handle based on your logic.
  • "tool_use": Claude wants to use a tool. Action: execute the tool and return the results.
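These cases can be collapsed into a simple dispatcher. The action names returned here are illustrative placeholders for your own handling code:

```python
def next_action(stop_reason):
    """Map a stop_reason to the handling described above."""
    actions = {
        "end_turn": "process_response",
        "max_tokens": "continue_or_increase_limit",
        "stop_sequence": "apply_custom_logic",
        "tool_use": "execute_tool_and_return_results",
    }
    try:
        return actions[stop_reason]
    except KeyError:
        # Surface unknown values loudly rather than mishandling them.
        raise ValueError(f"Unexpected stop_reason: {stop_reason!r}")

print(next_action("end_turn"))  # process_response
```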

Best Practices

  • Manage token usage: Track usage.input_tokens and usage.output_tokens to optimize costs.
  • Use system prompts: For persistent instructions, use the system parameter instead of repeating instructions in every user message.
  • Handle errors gracefully: Implement retry logic for rate limits and timeouts.
  • Stream responses: For long outputs, use streaming to improve user experience.
  • Validate prefill compatibility: Check model support before using prefill.
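As a sketch of the retry advice above, here is a generic exponential-backoff wrapper. It is SDK-agnostic: you pass in the exception types to retry. With the official Python SDK, anthropic.RateLimitError and anthropic.APITimeoutError are the likely candidates, but verify those names against your installed version; the SDK may also retry some failures automatically on its own.

```python
import random
import time

def with_retries(request_fn, max_attempts=5, base_delay=1.0,
                 retryable=(Exception,)):
    """Call request_fn, retrying retryable errors with exponential backoff.

    In real code, narrow `retryable` to the SDK's rate-limit and timeout
    exceptions rather than catching everything.
    """
    for attempt in range(max_attempts):
        try:
            return request_fn()
        except retryable:
            if attempt == max_attempts - 1:
                raise  # out of attempts; let the caller handle it
            # Exponential backoff with a little jitter to avoid thundering herds.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.25)
            time.sleep(delay)
```

Usage is as simple as wrapping the API call in a closure, e.g. `with_retries(lambda: client.messages.create(...))`.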

Key Takeaways

  • The Messages API is stateless—always send the full conversation history with each request.
  • Multi-turn conversations require alternating user and assistant messages, with the last message from the user.
  • Prefill lets you start Claude's response, but check model compatibility (not supported on Opus 4.7, Sonnet 4.6, etc.).
  • Vision capabilities allow Claude to analyze images sent as base64-encoded data.
  • Always check stop_reason to determine why the response ended and handle it appropriately.