Claude Guide · Beginner · 2026-05-06

Mastering the Messages API: A Practical Guide to Building Conversational AI with Claude

Learn how to use Claude's Messages API for basic requests, multi-turn conversations, prefill techniques, and vision capabilities with practical code examples in Python and TypeScript.

Quick Answer

This guide teaches you how to use Claude's Messages API to build conversational AI applications, covering basic requests, multi-turn conversations, prefill techniques, and vision capabilities with practical code examples.

Tags: Messages API, Claude API, Conversational AI, Prefill, Vision

Claude's Messages API is the primary interface for integrating Claude's powerful language capabilities into your applications. Whether you're building a chatbot, a content generation tool, or a complex agent system, understanding how to work with messages effectively is crucial.

This guide walks you through the essential patterns for using the Messages API, from basic requests to advanced techniques like prefill and vision. By the end, you'll have a solid foundation for building production-ready applications with Claude.

Understanding the Messages API vs. Managed Agents

Anthropic offers two primary ways to build with Claude:

  • Messages API: Direct model prompting access, ideal for custom agent loops and fine-grained control.
  • Claude Managed Agents: A pre-built, configurable agent harness that runs in managed infrastructure, best for long-running tasks and asynchronous work.

This guide focuses on the Messages API, which gives you full control over every aspect of the conversation.

Basic Request and Response

Let's start with the simplest possible interaction: sending a single message to Claude and receiving a response.

Python Example

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)

print(message)

TypeScript Example

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const message = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Hello, Claude' }
  ]
});

console.log(message);

Understanding the Response

The API returns a structured response object:

{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello!"
    }
  ],
  "model": "claude-opus-4-7",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 12,
    "output_tokens": 6
  }
}

Key fields to note:

  • content: An array of content blocks (text, image, tool_use, etc.)
  • stop_reason: Indicates why the response ended (end_turn, max_tokens, stop_sequence, or tool_use)
  • usage: Token counts for billing and monitoring
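
To make these fields concrete, here is a minimal sketch of pulling them out of a response. The `response` dict below mirrors the example payload above; with the Python SDK you would use attribute access instead (`message.content`, `message.stop_reason`, `message.usage`).

```python
# Example payload, matching the response shown above.
response = {
    "content": [{"type": "text", "text": "Hello!"}],
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 12, "output_tokens": 6},
}

# content is a list of blocks, not a single string, so concatenate
# every text block to recover the full reply.
text = "".join(
    block["text"] for block in response["content"] if block["type"] == "text"
)

# usage is handy for billing and monitoring.
total_tokens = response["usage"]["input_tokens"] + response["usage"]["output_tokens"]

print(text)                     # Hello!
print(response["stop_reason"])  # end_turn
print(total_tokens)             # 18
```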

Building Multi-Turn Conversations

The Messages API is stateless — you must send the full conversation history with every request. This gives you complete control over context but requires careful management.

Example: Two-Turn Conversation

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"},
        {"role": "assistant", "content": "Hello!"},
        {"role": "user", "content": "Can you describe LLMs to me?"}
    ]
)

print(message.content[0].text)

Important Pattern: Synthetic Assistant Messages

Earlier conversational turns don't need to originate from Claude. You can inject synthetic assistant messages to:

  • Set up context or examples
  • Guide Claude's behavior
  • Implement few-shot prompting

messages = [
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."},
    {"role": "user", "content": "What about Germany?"}
]

The Prefill Technique: Putting Words in Claude's Mouth

One of the most powerful features of the Messages API is prefilling — you can start Claude's response by providing the beginning of its answer. This is incredibly useful for:

  • Constraining output format
  • Guiding multiple-choice answers
  • Ensuring structured responses

Example: Multiple Choice with Prefill

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1,
    messages=[
        {
            "role": "user",
            "content": "What is Latin for Ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"
        },
        {"role": "assistant", "content": "The answer is ("}
    ]
)

print(message.content[0].text) # Output: "C"

Prefill Limitations

Important: Prefilling is not supported on Claude Mythos Preview, Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6. Requests using prefill with these models return a 400 error. Use structured outputs or system prompt instructions instead.

For models that don't support prefill, consider:

  • Using structured outputs (JSON mode)
  • Providing detailed system prompt instructions
  • Using tool use to enforce output structure
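
The tool-use alternative can be sketched as follows. The tool name and schema here are illustrative assumptions, not part of the API: by forcing Claude to "call" the tool via `tool_choice`, the structured answer arrives in the resulting tool_use block's input rather than as free text.

```python
# A hypothetical tool whose schema constrains the answer to A, B, or C.
answer_tool = {
    "name": "record_answer",  # illustrative name, not a built-in tool
    "description": "Record the chosen multiple-choice answer.",
    "input_schema": {
        "type": "object",
        "properties": {
            "choice": {"type": "string", "enum": ["A", "B", "C"]},
        },
        "required": ["choice"],
    },
}

def build_request(question: str) -> dict:
    """Build the kwargs for client.messages.create(**request)."""
    return {
        "model": "claude-sonnet-4-5",
        "max_tokens": 256,
        "tools": [answer_tool],
        # Force Claude to use this specific tool.
        "tool_choice": {"type": "tool", "name": "record_answer"},
        "messages": [{"role": "user", "content": question}],
    }

request = build_request(
    "What is Latin for Ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"
)
print(request["tool_choice"]["name"])  # record_answer
```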

Vision Capabilities: Working with Images

Claude can analyze images sent via the Messages API. This opens up use cases like:

  • Document analysis
  • Image description
  • Visual question answering

Sending an Image

import anthropic
import base64

client = anthropic.Anthropic()

# Read and encode the image
with open("chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode()

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data
                    }
                },
                {"type": "text", "text": "Describe this chart in detail."}
            ]
        }
    ]
)

print(message.content[0].text)

Supported Image Formats

Claude supports common image formats including PNG, JPEG, GIF, and WebP. For best results, ensure images are clear and not overly compressed.
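
Since the image block's media_type must match the file, a small helper (an assumption for illustration, not part of the SDK) can derive it from the file extension using the standard library:

```python
import mimetypes

# The formats listed above; anything else is rejected.
SUPPORTED = {"image/png", "image/jpeg", "image/gif", "image/webp"}

def image_media_type(path: str) -> str:
    """Guess the media_type string for an image block from a filename."""
    media_type, _ = mimetypes.guess_type(path)
    if media_type not in SUPPORTED:
        raise ValueError(f"Unsupported image format: {path}")
    return media_type

print(image_media_type("chart.png"))  # image/png
print(image_media_type("photo.jpg"))  # image/jpeg
```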

Handling Stop Reasons

Understanding why Claude stopped generating is essential for building robust applications:

  • end_turn: Claude finished naturally. No action needed.
  • max_tokens: The output hit the token limit. Increase max_tokens or continue the response.
  • stop_sequence: Claude hit a custom stop sequence. Handle as needed.
  • tool_use: Claude wants to use a tool. Process the tool call and continue.
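
The max_tokens case can be handled with a continuation loop. This is a hedged sketch, not an official pattern: it assumes an already-configured `client`, and it feeds each truncated partial answer back as an assistant turn so the next request continues from where the output stopped.

```python
def create_with_continuation(client, messages, model="claude-opus-4-7",
                             max_tokens=1024, max_rounds=5):
    """Keep requesting until Claude stops for a reason other than max_tokens."""
    parts = []
    history = list(messages)
    for _ in range(max_rounds):
        msg = client.messages.create(
            model=model, max_tokens=max_tokens, messages=history
        )
        # Collect all text blocks from this round.
        text = "".join(b.text for b in msg.content if b.type == "text")
        parts.append(text)
        if msg.stop_reason != "max_tokens":
            break
        # Output was cut off: append the partial answer as an assistant
        # turn and ask Claude to continue it.
        history.append({"role": "assistant", "content": text})
    return "".join(parts)
```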

Best Practices

1. Manage Context Window

Since the API is stateless, your conversation history grows with each turn. Be mindful of:

  • Token limits (context window size varies by model)
  • Cost (you pay for input tokens)
  • Latency (larger inputs take longer)
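
One simple way to keep history in check is to drop the oldest turns once a token budget is exceeded. This is a rough sketch under a crude assumption (roughly four characters per token); for accurate budgeting, use a real token counting method rather than this heuristic.

```python
def estimate_tokens(message: dict) -> int:
    """Crude heuristic: ~4 characters per token."""
    return max(1, len(str(message["content"])) // 4)

def trim_history(messages: list[dict], budget: int) -> list[dict]:
    """Keep the newest turns that fit within the token budget."""
    kept, used = [], 0
    # Walk from newest to oldest, always keeping at least one message.
    for msg in reversed(messages):
        cost = estimate_tokens(msg)
        if used + cost > budget and kept:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))
```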

2. Use System Messages for Instructions

For persistent instructions, use the system parameter instead of repeating instructions in every user message:

message = client.messages.create(
    model="claude-opus-4-7",
    system="You are a helpful assistant that always responds in JSON format.",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "List three programming languages."}
    ]
)

3. Implement Error Handling

Always handle API errors gracefully:

try:
    message = client.messages.create(...)
except anthropic.APIError as e:
    print(f"API error: {e}")
    # Implement retry logic or fallback
except anthropic.APIConnectionError as e:
    print(f"Connection error: {e}")
    # Retry with backoff
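
A retry-with-backoff helper might look like the following sketch. It is an illustrative pattern, not an official utility: `send` stands in for the `client.messages.create(...)` call, and in real code you would narrow the `except` clause to the transient error types shown above (such as `anthropic.APIConnectionError`).

```python
import random
import time

def with_backoff(send, retries=3, base_delay=1.0):
    """Call send(), retrying failures with exponential backoff plus jitter."""
    for attempt in range(retries + 1):
        try:
            return send()
        except Exception:  # narrow to transient error types in real code
            if attempt == retries:
                raise  # out of retries: surface the error
            # Double the delay each attempt; jitter avoids thundering herds.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            time.sleep(delay)
```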

Conclusion

The Messages API is the foundation for building any application with Claude. By mastering basic requests, multi-turn conversations, prefill techniques, and vision capabilities, you can create sophisticated AI-powered experiences.

Remember that the API is stateless — you control the conversation history. This gives you maximum flexibility but requires careful management of context and tokens.

Key Takeaways

  • The Messages API is stateless — you must send the full conversation history with every request, giving you complete control over context.
  • Prefill is powerful but model-specific — use it to guide responses, but check model compatibility (not supported on Opus 4.7, Sonnet 4.6, and others).
  • Vision capabilities enable multimodal applications — Claude can analyze images sent as base64-encoded data alongside text.
  • Handle stop reasons appropriately — end_turn means natural completion, while max_tokens may require continuation logic.
  • Use system messages for persistent instructions — this reduces token usage and keeps conversations cleaner than repeating instructions.