BeClaude Guide · 2026-05-05

Mastering the Messages API: Build Conversational AI with Claude

Learn how to use Claude's Messages API for basic requests, multi-turn conversations, prefill techniques, and vision capabilities with practical code examples.

Quick Answer

This guide teaches you how to use Claude's Messages API to build conversational AI applications, covering basic requests, multi-turn conversations, prefill techniques, and vision capabilities with code examples in Python and TypeScript.

Tags: Messages API · Claude API · Conversational AI · Prefill · Vision


Claude's Messages API is the primary interface for integrating Claude into your applications. Whether you're building a simple chatbot, a multi-turn assistant, or a vision-powered tool, understanding the Messages API is essential. This guide walks you through the core patterns, from basic requests to advanced techniques like prefill and vision.

Understanding the Messages API

Anthropic offers two ways to build with Claude: the Messages API and Claude Managed Agents. The Messages API gives you direct model prompting access and fine-grained control over your agent loops. It's ideal for custom workflows where you need to manage conversation state, handle tool calls, or implement complex logic.

Note: The Messages API is eligible for Zero Data Retention (ZDR). When your organization has a ZDR arrangement, data sent through this feature is not stored after the API response is returned.

Basic Request and Response

Let's start with the simplest possible interaction: sending a single message to Claude and receiving a response.

Python Example

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)

print(message)

TypeScript Example

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const message = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Hello, Claude' }
  ]
});

console.log(message);

Understanding the Response

The API returns a structured response containing:

{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello!"
    }
  ],
  "model": "claude-opus-4-7",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 12,
    "output_tokens": 6
  }
}

Key fields to note:

  • id: Unique identifier for the message
  • content: Array of content blocks (text, tool_use, etc.)
  • stop_reason: Why the model stopped (end_turn, max_tokens, stop_sequence, etc.)
  • usage: Token counts for billing and monitoring
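For example, the reply text can be pulled out of the content array by filtering for text blocks. The sketch below works on a plain dict shaped like the response above (in the Python SDK, `message.model_dump()` produces such a dict); `extract_text` is an illustrative helper, not part of the SDK:

```python
def extract_text(response: dict) -> str:
    """Concatenate the text of every text block in a response's content array."""
    return "".join(
        block["text"]
        for block in response.get("content", [])
        if block.get("type") == "text"
    )

sample = {
    "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
    "role": "assistant",
    "content": [{"type": "text", "text": "Hello!"}],
    "stop_reason": "end_turn",
}
print(extract_text(sample))  # Hello!
```

Filtering by block type matters because `content` can also carry non-text blocks such as tool_use.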

Building Multi-Turn Conversations

The Messages API is stateless — you must send the full conversation history with every request. This gives you complete control over context but requires you to manage conversation state on your end.

Example: Two-Turn Conversation

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"},
        {"role": "assistant", "content": "Hello!"},
        {"role": "user", "content": "Can you describe LLMs to me?"}
    ]
)

print(message.content[0].text)

Key Points for Multi-Turn Conversations

  • Always include the full history: Each request must contain all previous messages in order.
  • Alternate roles: Messages must alternate between user and assistant roles, beginning with a user message.
  • Synthetic assistant messages: You can inject pre-written assistant responses to guide the conversation or simulate past interactions.
  • Manage context window: Be mindful of token limits — long conversations may require summarization or trimming.
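Putting these points together, a chat loop keeps an ever-growing history and trims it before each request. This is a sketch: `trim_history` and the 20-message window are illustrative choices, and the API call is stubbed out (in real code you would call `client.messages.create(...)` as in the examples above):

```python
def trim_history(messages: list[dict], max_messages: int = 20) -> list[dict]:
    """Keep only the most recent turns, preserving a user-first alternation."""
    trimmed = messages[-max_messages:]
    # If trimming left an assistant message first, drop it so the
    # history still starts with a user turn.
    while trimmed and trimmed[0]["role"] != "user":
        trimmed = trimmed[1:]
    return trimmed

history: list[dict] = []
for question in ["Hello, Claude", "Can you describe LLMs to me?"]:
    history.append({"role": "user", "content": question})
    # reply = client.messages.create(model="claude-opus-4-7", max_tokens=1024,
    #                                messages=trim_history(history))
    # Stubbed assistant reply; replace with the API response text in real code.
    reply_text = f"(assistant reply to: {question})"
    history.append({"role": "assistant", "content": reply_text})

print(len(history))  # 4
```

Simple truncation loses old context; for long sessions, summarizing older turns into a single synthetic message is a common alternative.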

Prefill: Putting Words in Claude's Mouth

Prefill allows you to start Claude's response by providing the beginning of its answer. This is powerful for:

  • Constraining responses to specific formats
  • Guiding Claude toward a particular style or tone
  • Implementing multiple-choice selection

Example: Multiple Choice with Prefill

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1,
    messages=[
        {
            "role": "user",
            "content": "What is the Latin name for ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"
        },
        {"role": "assistant", "content": "The answer is ("}
    ]
)

print(message.content[0].text) # Outputs: "C"

Prefill Limitations

Important: Prefill is not supported on Claude Mythos Preview, Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6. Requests using prefill with these models return a 400 error. Use structured outputs or system prompt instructions instead.

For models that don't support prefill, consider these alternatives:

  • Structured outputs: Define a JSON schema for the response
  • System prompt instructions: Use the system parameter to specify response format

Vision: Working with Images

Claude can analyze images sent through the Messages API. This enables use cases like image captioning, visual question answering, and document analysis.

Example: Sending an Image

import anthropic
import base64

client = anthropic.Anthropic()

# Read and encode the image as base64
with open("chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode()

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data
                    }
                },
                {"type": "text", "text": "Describe this chart in detail."}
            ]
        }
    ]
)

print(message.content[0].text)

Supported Image Formats

Claude supports common image formats including PNG, JPEG, GIF, and WebP. For best results, ensure images are clear and not excessively large.
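When loading local files, the `media_type` in the request must match the file's actual format. A small guard using only the standard library can catch mismatches before the request is sent; the accepted set below simply mirrors the formats listed above:

```python
import mimetypes

ACCEPTED = {"image/png", "image/jpeg", "image/gif", "image/webp"}

def image_media_type(path: str) -> str:
    """Guess a file's media type from its extension and verify it is supported."""
    media_type, _ = mimetypes.guess_type(path)
    if media_type not in ACCEPTED:
        raise ValueError(f"Unsupported image type for {path!r}: {media_type}")
    return media_type

print(image_media_type("chart.png"))  # image/png
```

Extension-based guessing is a heuristic; if files may be mislabeled, inspect the file header instead.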

Best Practices

1. Manage Token Usage

Monitor the usage field in responses to track costs and optimize your prompts. Use max_tokens to limit response length.
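A rough per-request cost estimate can be derived directly from the `usage` field. The rates below are placeholders purely for illustration, not real prices; substitute your model's published per-million-token pricing:

```python
def estimate_cost(usage: dict, input_rate: float, output_rate: float) -> float:
    """Estimate request cost in dollars from token counts and per-million-token rates."""
    return (usage["input_tokens"] * input_rate
            + usage["output_tokens"] * output_rate) / 1_000_000

# Hypothetical rates ($ per million tokens), for illustration only.
usage = {"input_tokens": 12, "output_tokens": 6}
print(round(estimate_cost(usage, input_rate=3.0, output_rate=15.0), 6))  # 0.000126
```

Logging this per request makes it easy to spot prompts whose input side dominates the bill.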

2. Handle Stop Reasons

Different stop_reason values indicate different conditions:

  • end_turn: Claude finished naturally
  • max_tokens: Response was cut off — consider increasing max_tokens or continuing the conversation
  • stop_sequence: A custom stop sequence was triggered
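A common pattern is to branch on `stop_reason` and, when a reply was truncated at `max_tokens`, re-request with a larger budget. A minimal dispatcher over plain response dicts (the action labels are illustrative):

```python
def classify_stop(response: dict) -> str:
    """Map a response's stop_reason to a simple action hint."""
    reason = response.get("stop_reason")
    if reason == "end_turn":
        return "done"
    if reason == "max_tokens":
        return "retry_with_larger_max_tokens"
    if reason == "stop_sequence":
        return f"stopped_on:{response.get('stop_sequence')}"
    return "inspect"

print(classify_stop({"stop_reason": "max_tokens"}))  # retry_with_larger_max_tokens
```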

3. Use System Prompts for Consistency

For production applications, use the system parameter to set Claude's behavior, tone, and constraints. This is more reliable than prefill for newer models.

message = client.messages.create(
    model="claude-opus-4-7",
    system="You are a helpful assistant that always responds in rhyming verse.",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Tell me about AI"}
    ]
)

4. Implement Error Handling

Always wrap API calls in try-catch blocks and handle common errors like rate limits, authentication failures, and invalid requests.
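With the Python SDK, common failures surface as typed exceptions such as `anthropic.RateLimitError` and `anthropic.AuthenticationError` (the SDK also performs some retries itself). The sketch below is a generic retry wrapper with exponential backoff; `call` stands in for any `client.messages.create(...)` invocation, and the exception types are passed in so the block stays SDK-agnostic:

```python
import time

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff: 1s, 2s, 4s, ... capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))

def call_with_retries(call, retryable: tuple, max_attempts: int = 5):
    """Invoke call(), retrying on the given exception types with backoff."""
    for attempt in range(max_attempts):
        try:
            return call()
        except retryable:
            if attempt == max_attempts - 1:
                raise
            time.sleep(backoff_delay(attempt))

# With the SDK you would pass, e.g.:
#   call_with_retries(lambda: client.messages.create(...),
#                     retryable=(anthropic.RateLimitError,))
print([backoff_delay(a) for a in range(6)])  # [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
```

Do not retry authentication or validation errors; only transient conditions like rate limits and connection failures warrant backoff.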

Key Takeaways

  • The Messages API is stateless — you must send the full conversation history with each request, giving you complete control over context.
  • Prefill is powerful but limited — use it to guide responses, but check model compatibility as newer models like Opus 4.7 don't support it.
  • Vision capabilities allow Claude to analyze images alongside text, enabling rich multimodal interactions.
  • Monitor token usage to optimize costs and manage context windows effectively.
  • System prompts are the preferred method for setting behavior in modern Claude models, especially when prefill isn't available.