Guide | Beginner | API | 2026-05-17

Mastering the Messages API: Building Conversational AI with Claude

Learn how to use the Claude Messages API for basic requests, multi-turn conversations, prefill techniques, and vision capabilities with practical code examples.

Quick Answer

This guide covers how to use the Claude Messages API to build conversational AI applications, including basic requests, multi-turn conversations, prefill techniques, and vision capabilities, with Python code examples.

Messages API, Claude API, Conversational AI, Prefill, Vision

Introduction

The Claude Messages API is the primary interface for building conversational AI applications with Anthropic's Claude models. Whether you're creating a simple chatbot or a complex multi-turn assistant, understanding how to work with messages effectively is essential.

This guide covers the core patterns you'll use daily: basic requests, managing conversation history, pre-filling responses, and working with images. By the end, you'll have a solid foundation for building production-ready applications with Claude.

Basic Request and Response

At its simplest, the Messages API takes a list of messages and returns Claude's response. Here's the minimal example in Python:

import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)
print(message)

The response includes several important fields:

{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [{"type": "text", "text": "Hello!"}],
  "model": "claude-opus-4-7",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 12,
    "output_tokens": 6
  }
}

Key fields to note:

content: An array of content blocks (usually text, but can include tool use blocks)
stop_reason: Why Claude stopped generating (common values are "end_turn", "max_tokens", "stop_sequence", and "tool_use")
usage: Token counts for billing and context window management
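
Because content is a list of blocks rather than a single string, it can help to join the text blocks explicitly instead of assuming the first block is text. The sketch below works on the parsed JSON shape shown above; `extract_text` is a hypothetical helper name, not part of the SDK.

```python
def extract_text(response: dict) -> str:
    """Concatenate the text of every text block in a Messages API response dict."""
    blocks = response.get("content", [])
    # Skip non-text blocks (for example tool use blocks), which have no "text" to join.
    return "".join(b["text"] for b in blocks if b.get("type") == "text")


# Sample shaped like the JSON response shown above.
sample = {
    "content": [{"type": "text", "text": "Hello!"}],
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 12, "output_tokens": 6},
}
print(extract_text(sample))  # Hello!
```

With the Python SDK the same fields are read as attributes on the returned message object, as the other examples in this guide do with message.content[0].text.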

Multi-Turn Conversations

The Messages API is stateless — each request must include the full conversation history. This gives you complete control over context but requires you to manage the conversation state on your end.

Here's how to build a multi-turn conversation:

import anthropic
client = anthropic.Anthropic()
# First turn
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)
# Extract Claude's response
assistant_response = message.content[0].text
# Second turn: include the full history
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"},
        {"role": "assistant", "content": assistant_response},
        {"role": "user", "content": "Can you describe LLMs to me?"}
    ]
)
print(message.content[0].text)
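
Rebuilding the message list by hand gets tedious after a few turns. One way to manage the state is a small wrapper that appends each user and assistant turn for you. This is only a sketch: `Conversation` and its `send` parameter are hypothetical names, and `fake_send` stands in for a real API call so the example runs offline.

```python
from typing import Callable


class Conversation:
    """Keeps the message history that a stateless API needs on every request."""

    def __init__(self, send: Callable[[list[dict]], str]):
        # `send` is a hypothetical callable: it takes the full message list
        # and returns the assistant's reply text.
        self._send = send
        self.messages: list[dict] = []

    def ask(self, user_text: str) -> str:
        self.messages.append({"role": "user", "content": user_text})
        # Pass a copy so the callee cannot mutate the stored history.
        reply = self._send(list(self.messages))
        self.messages.append({"role": "assistant", "content": reply})
        return reply


# Stand-in for a real API call so the sketch runs without network access.
def fake_send(messages: list[dict]) -> str:
    return f"(reply to message {len(messages)})"


chat = Conversation(fake_send)
chat.ask("Hello, Claude")
chat.ask("Can you describe LLMs to me?")
print([m["role"] for m in chat.messages])
```

In real use, `send` would wrap client.messages.create and return message.content[0].text.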

Synthetic Assistant Messages

You can inject synthetic assistant messages into the history. This is useful for:

Providing few-shot examples
Guiding conversation flow
Implementing system-like behavior without the system prompt

messages = [
    {"role": "user", "content": "What's the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."},
    {"role": "user", "content": "What about Italy?"}
]
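
For few-shot prompting, the same alternating structure can be generated from a list of example pairs. The helper below is a minimal sketch; `few_shot_messages` is a hypothetical name introduced here for illustration.

```python
def few_shot_messages(examples: list[tuple[str, str]], question: str) -> list[dict]:
    """Turn (user, assistant) example pairs plus a final question into messages."""
    messages: list[dict] = []
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        # The assistant turn is synthetic: written by you, not generated by Claude.
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": question})
    return messages


messages = few_shot_messages(
    [("What's the capital of France?", "The capital of France is Paris.")],
    "What about Italy?",
)
print(messages)
```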

Prefill: Putting Words in Claude's Mouth

Prefilling allows you to start Claude's response for it. This is powerful for:

Enforcing response format (e.g., JSON, multiple choice)
Guiding the tone or structure
Reducing output tokens for constrained tasks

Basic Prefill Example

import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1,
    messages=[
        {
            "role": "user",
            "content": "What is Latin for ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"
        },
        {
            "role": "assistant",
            "content": "The answer is ("
        }
    ]
)
print(message.content[0].text)  # "C"

Setting max_tokens=1 and prefilling with "The answer is (" means Claude only needs to output the letter. This works well for multiple-choice classification tasks.
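
One detail to keep in mind: the response contains only the text that follows the prefill, not the prefill itself. When the prefill is part of the output you want, such as an opening brace for JSON, you need to join the two before parsing. The sketch below uses a hypothetical continuation string in place of a real response, and `join_prefill` is an illustrative helper name.

```python
import json


def join_prefill(prefill: str, continuation: str) -> str:
    """Rebuild the full assistant text from the prefill and the returned continuation."""
    return prefill + continuation


# Hypothetical continuation, as if the assistant turn had been prefilled with "{".
prefill = "{"
continuation = '"answer": "C", "name": "Formicidae"}'
data = json.loads(join_prefill(prefill, continuation))
print(data["answer"])  # C
```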

Prefill Limitations

Important: Prefill is not supported on Claude Mythos Preview, Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6. Requests using prefill with these models return a 400 error. Use structured outputs or system prompt instructions instead.

For models that don't support prefill, consider:

Structured outputs: Define a JSON schema for the response
System prompt instructions: Use clear formatting instructions in the system prompt
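
If the same code has to serve models with and without prefill support, one option is to choose the strategy while building the request. This is a sketch under stated assumptions: `build_request` and `NO_PREFILL` are hypothetical names, the set contains only a model ID already used in this guide, and the wording of the system instruction is illustrative.

```python
def build_request(model: str, user_text: str, prefill: str,
                  no_prefill_models: set[str]) -> dict:
    """Return keyword arguments for a Messages API call, using prefill only when allowed."""
    request = {
        "model": model,
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": user_text}],
    }
    if model in no_prefill_models:
        # Fall back to a system prompt instruction instead of a prefilled turn.
        request["system"] = f"Begin your reply with exactly: {prefill}"
    else:
        request["messages"].append({"role": "assistant", "content": prefill})
    return request


# Assumption: this set is built from a model ID used in this guide. Check the
# current model documentation before relying on it.
NO_PREFILL = {"claude-opus-4-7"}

print(build_request("claude-opus-4-7", "Pick A, B, or C.", "The answer is (", NO_PREFILL))
print(build_request("claude-sonnet-4-5", "Pick A, B, or C.", "The answer is (", NO_PREFILL))
```

The returned dict can be passed to the client as client.messages.create(**request).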

Working with Images (Vision)

Claude can analyze images sent via the Messages API. This enables use cases like document analysis, screenshot interpretation, and visual question answering.

Sending an Image

import anthropic
import base64
client = anthropic.Anthropic()
# Read and encode the image
with open("chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data
                    }
                },
                {
                    "type": "text",
                    "text": "Describe this chart in detail."
                }
            ]
        }
    ]
)
print(message.content[0].text)

Supported media types: image/jpeg, image/png, image/gif, image/webp.
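
Since only these four media types are accepted, it can be worth validating the file extension before encoding. The helper below is a sketch: `image_block` and `MEDIA_TYPES` are hypothetical names, and choosing the media type from the extension alone is a simplifying assumption (it does not inspect the file contents).

```python
import base64
from pathlib import PurePath

# Extension-to-media-type map covering the four supported types listed above.
MEDIA_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def image_block(filename: str, raw: bytes) -> dict:
    """Build a base64 image content block, rejecting unsupported file extensions."""
    suffix = PurePath(filename).suffix.lower()
    if suffix not in MEDIA_TYPES:
        raise ValueError(f"Unsupported image type: {filename}")
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": MEDIA_TYPES[suffix],
            "data": base64.b64encode(raw).decode("utf-8"),
        },
    }


# Offline demo with placeholder bytes; real code would pass the file's contents,
# e.g. open("chart.png", "rb").read().
block = image_block("chart.PNG", b"abc")
print(block["source"]["media_type"])  # image/png
```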

Image Size Limits

Claude processes images at different resolutions depending on size:

Images under 1,950 pixels on the longest side are processed at original resolution
Larger images are scaled down to fit within 1,950 pixels
Very large images (over 8,000 pixels) may be rejected

For optimal performance, resize images so the longest side is roughly 1,000 to 1,950 pixels before sending; anything larger is scaled down anyway.
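
The target dimensions for a resize can be worked out with simple proportional scaling. The function below is a sketch that takes the 1,950-pixel figure quoted above as its default; `fit_within` is a hypothetical name, and the resize itself is left to whichever imaging library you use.

```python
def fit_within(width: int, height: int, max_side: int = 1950) -> tuple[int, int]:
    """Scale (width, height) down proportionally so the longest side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough; leave untouched
    scale = max_side / longest
    # Round to whole pixels and never return a zero-sized dimension.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(3900, 1950))  # (1950, 975)
print(fit_within(800, 600))    # (800, 600)
```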

Handling Stop Reasons

Understanding stop_reason helps you build robust applications:

Stop Reason	Meaning	Action
`end_turn`	Claude finished naturally	Continue conversation
`max_tokens`	Output hit token limit	Increase `max_tokens` or request a continuation
`stop_sequence`	Custom stop sequence triggered	Handle as needed
`tool_use`	Claude wants to call a tool	Execute tool and continue

if message.stop_reason == "max_tokens":
    print("Response was truncated. Consider increasing max_tokens.")
elif message.stop_reason == "tool_use":
    print("Claude requested a tool call.")
    # Handle tool execution...
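
A lookup table with a default branch is one way to make sure no stop reason goes unhandled, including values your code was not written for. This is a minimal sketch; `ACTIONS` and `next_action` are hypothetical names and the action strings are placeholders for real handlers.

```python
# Suggested actions, summarizing the stop-reason table above.
ACTIONS = {
    "end_turn": "continue conversation",
    "max_tokens": "increase max_tokens",
    "stop_sequence": "handle as needed",
    "tool_use": "execute tool and continue",
}


def next_action(stop_reason: str) -> str:
    """Look up the suggested action, flagging any stop reason not in the table."""
    return ACTIONS.get(stop_reason, f"unhandled stop reason: {stop_reason}")


print(next_action("tool_use"))
print(next_action("something_new"))
```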

Best Practices

1. Manage Token Usage

Always check usage.input_tokens and usage.output_tokens to track costs. For long conversations, consider:

Summarizing older messages
Using prompt caching for repeated system instructions
Trimming history when approaching context limits
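
Trimming can be as simple as dropping the oldest turns until the history fits a budget. The sketch below uses character count as a crude stand-in for tokens, which is an assumption made to keep the example offline; `trim_history` is a hypothetical name, and it assumes every message's content is a plain string.

```python
def trim_history(messages: list[dict], max_chars: int) -> list[dict]:
    """Drop the oldest messages until the total content length fits max_chars.

    Character count is only a rough proxy for tokens (an assumption of this
    sketch); real budgets should be based on the counts reported in `usage`.
    """
    trimmed = list(messages)  # work on a copy; the caller's list is untouched
    while len(trimmed) > 1 and sum(len(m["content"]) for m in trimmed) > max_chars:
        trimmed.pop(0)
    # Drop a leading assistant turn so the history still opens with a user turn.
    while len(trimmed) > 1 and trimmed[0]["role"] != "user":
        trimmed.pop(0)
    return trimmed


history = [
    {"role": "user", "content": "a" * 50},
    {"role": "assistant", "content": "b" * 50},
    {"role": "user", "content": "c" * 50},
    {"role": "assistant", "content": "d" * 50},
    {"role": "user", "content": "e" * 50},
]
print([m["role"] for m in trim_history(history, 160)])
```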

2. Handle Errors Gracefully

try:
    message = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello"}]
    )
except anthropic.APIConnectionError as e:
    # Listed first because APIConnectionError is a subclass of APIError;
    # placed second, this branch would never run.
    print(f"Connection error: {e}")
    # Retry after delay
except anthropic.APIError as e:
    print(f"API error: {e}")
    # Implement retry logic or fallback
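
The "retry after delay" step is commonly implemented as exponential backoff. The wrapper below is a generic sketch rather than anything provided by the SDK: `with_retries` and its parameters are hypothetical names, and the demo uses a simulated failure so it runs offline.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(call: Callable[[], T], retry_on: tuple[type[BaseException], ...],
                 attempts: int = 3, base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Run `call`, retrying on the given exception types with exponential backoff."""
    for attempt in range(attempts):
        try:
            return call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts; surface the last error
            sleep(base_delay * 2 ** attempt)  # delay doubles after each failure
    raise RuntimeError("attempts must be at least 1")


# Offline demo: a stand-in call that fails twice, then succeeds.
calls = {"n": 0}


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated network failure")
    return "ok"


delays: list[float] = []
print(with_retries(flaky, (ConnectionError,), sleep=delays.append))  # ok
```

In real use, `call` would be a lambda around client.messages.create and `retry_on` would be (anthropic.APIConnectionError,).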

3. Use Streaming for Responsive UIs

For chat applications, use streaming to show tokens as they're generated:

with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Tell me a story"}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
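
When streaming inside a multi-turn chat, you usually still need the complete reply afterwards so it can be appended to the history. The sketch below forwards each chunk to a display callback while accumulating the full text; `collect_stream` is a hypothetical helper, and a plain list stands in for stream.text_stream so the example runs offline.

```python
from typing import Callable, Iterable


def collect_stream(chunks: Iterable[str], on_chunk: Callable[[str], None]) -> str:
    """Forward each text chunk to the UI callback and return the full reply."""
    parts: list[str] = []
    for chunk in chunks:
        on_chunk(chunk)
        parts.append(chunk)
    return "".join(parts)


# Offline demo with a plain list standing in for stream.text_stream.
full = collect_stream(["Once ", "upon ", "a time"],
                      lambda c: print(c, end="", flush=True))
print()
```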

Conclusion

The Messages API is the foundation for building with Claude. By mastering basic requests, multi-turn conversations, prefill, and vision capabilities, you can create sophisticated conversational AI applications. Remember that the API is stateless — you manage the conversation history — and always check stop reasons to handle different scenarios appropriately.

Key Takeaways

The Messages API is stateless — always send the full conversation history with each request
Prefill allows you to guide Claude's responses by starting its reply, but check model compatibility
Vision capabilities let Claude analyze images sent as base64-encoded data
Always check stop_reason to understand why Claude stopped generating and handle edge cases
Use streaming for real-time user interfaces and track token usage to manage costs