Mastering the Messages API: A Practical Guide to Building Conversational AI with Claude
Learn how to use Claude's Messages API for single-turn queries, multi-turn conversations, prefill techniques, and vision tasks with practical code examples in Python and TypeScript.
Introduction
The Messages API is the core interface for interacting with Claude programmatically. Whether you're building a chatbot, a document analyzer, or a vision-enabled assistant, understanding how to structure requests and handle responses is essential. This guide walks you through the most common patterns—from a simple "Hello, Claude" to multi-turn conversations, prefill techniques, and image analysis.
Basic Request and Response
At its simplest, the Messages API accepts a list of messages and returns Claude's response. Here's a minimal example in Python:
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)
print(message)
The response includes the model's reply, metadata, and token usage:
{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [{"type": "text", "text": "Hello!"}],
  "model": "claude-opus-4-7",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 12,
    "output_tokens": 6
  }
}
Key fields to watch:
- stop_reason: Indicates why Claude stopped. "end_turn" means the response is complete.
- usage: Track input and output tokens for cost monitoring.
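As a sketch, both fields can be folded into one small helper. The per-million-token prices below are placeholders for illustration, not real rates for any model:

```python
def summarize_response(resp: dict, price_in: float = 3.0, price_out: float = 15.0):
    """Return (is_complete, estimated_cost_usd) for a Messages API response.

    price_in / price_out are illustrative per-million-token prices,
    not actual pricing for any Claude model.
    """
    usage = resp["usage"]
    cost = (usage["input_tokens"] * price_in +
            usage["output_tokens"] * price_out) / 1_000_000
    return resp["stop_reason"] == "end_turn", cost
```

Feeding it the sample response above would report a complete turn and a sub-cent cost estimate.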
Multi-Turn Conversations
The Messages API is stateless—you must send the full conversation history with every request. This gives you full control over context but requires you to manage state on your end.
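One way to manage that state is a thin wrapper that accumulates turns and hands the full list to every request. This is a minimal sketch; the class and method names are illustrative, not part of the SDK:

```python
class Conversation:
    """Client-side history for the stateless Messages API."""

    def __init__(self):
        self.messages = []

    def user(self, text):
        """Record a user turn; pass the returned list to messages.create()."""
        self.messages.append({"role": "user", "content": text})
        return self.messages

    def assistant(self, text):
        """Record Claude's reply so the next request carries full context."""
        self.messages.append({"role": "assistant", "content": text})
```

Each call to `client.messages.create(...)` would then receive `convo.user(...)`, so Claude always sees the whole exchange.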
Building a Conversation
import anthropic

client = anthropic.Anthropic()

# First turn
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)

# Second turn: include previous exchange
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"},
        {"role": "assistant", "content": "Hello!"},
        {"role": "user", "content": "Can you describe LLMs to me?"}
    ]
)
print(message.content[0].text)
Synthetic Assistant Messages
You can inject pre-written assistant messages into the history. This is useful for:
- Setting context: Providing a backstory or persona.
- Guiding behavior: Showing Claude how you want it to respond.
- Simulating conversations: Testing dialogue flows.
messages = [
    {"role": "user", "content": "You are a helpful tutor. Explain quantum computing."},
    {"role": "assistant", "content": "I'd be happy to explain quantum computing! Let's start with the basics."},
    {"role": "user", "content": "What is a qubit?"}
]
Pro tip: Always include the full history. Omitting turns can confuse Claude and lead to inconsistent responses.
Prefill: Putting Words in Claude's Mouth
Prefilling lets you start Claude's response by providing the beginning of its answer. This is powerful for:
- Constraining output format (e.g., JSON, multiple choice)
- Setting tone or style
- Reducing token waste on boilerplate
Example: Multiple Choice Answer
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1,
    messages=[
        {
            "role": "user",
            "content": "What is Latin for Ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"
        },
        {
            "role": "assistant",
            "content": "The answer is ("
        }
    ]
)
print(message.content[0].text)  # Output: "C"
By setting max_tokens=1 and prefilling "The answer is (", Claude only generates the letter—perfect for structured outputs.
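One subtlety worth noting: the response contains only Claude's continuation, not the prefilled text, so when the prefill is part of the payload (a JSON object, say) you must rejoin the two halves yourself. A sketch with illustrative helper names:

```python
import json

def json_prefill(question):
    """Build a message list whose last turn prefills '{' so the model
    continues a JSON object rather than starting with prose."""
    return [
        {"role": "user", "content": question},
        {"role": "assistant", "content": "{"},
    ]

def assemble_json(prefill, completion):
    """The API response omits the prefill, so glue it back on before parsing."""
    return json.loads(prefill + completion)
```

Here `completion` would be `message.content[0].text` from the API response.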
Important Limitations
Prefilling is not supported on these models:
- Claude Mythos Preview
- Claude Opus 4.7
- Claude Opus 4.6
- Claude Sonnet 4.6
Migration from Prefill
If you're moving away from prefill, here's how to achieve similar results:
Option 1: System Prompt

client.messages.create(
    model="claude-opus-4-7",
    system="Always respond in JSON format with keys: 'answer', 'explanation'.",
    messages=[{"role": "user", "content": "What is the capital of France?"}]
)
Option 2: Structured Outputs
Use the structured outputs feature (available in the API) to define a schema for Claude's response.
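With the system-prompt route (Option 1), and no prefill to anchor the output, the model may occasionally wrap the JSON in prose or code fences. A tolerant parser (an illustrative sketch, not part of the SDK) can absorb that:

```python
import json

def parse_json_reply(text):
    """Extract the outermost JSON object from a reply that may include
    surrounding prose or code fences."""
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in reply")
    return json.loads(text[start:end + 1])
```

Structured outputs make this defensive parsing unnecessary, since the schema is enforced by the API itself.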
Vision: Working with Images
Claude can analyze images sent via the Messages API. This enables use cases like:
- Document analysis: Extracting text from screenshots or PDFs.
- Visual QA: Answering questions about diagrams or photos.
- Content moderation: Identifying objects or text in images.
Sending an Image
import anthropic
import base64

client = anthropic.Anthropic()

with open("chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data
                    }
                },
                {
                    "type": "text",
                    "text": "What does this chart show?"
                }
            ]
        }
    ]
)
print(message.content[0].text)
Supported media types: image/png, image/jpeg, image/webp, image/gif.
TypeScript Example
import Anthropic from '@anthropic-ai/sdk';
import fs from 'fs';

const client = new Anthropic();

const imageBuffer = fs.readFileSync('chart.png');
const base64Image = imageBuffer.toString('base64');

const message = await client.messages.create({
  model: 'claude-sonnet-4-5',
  max_tokens: 1024,
  messages: [
    {
      role: 'user',
      content: [
        {
          type: 'image',
          source: {
            type: 'base64',
            media_type: 'image/png',
            data: base64Image
          }
        },
        {
          type: 'text',
          text: 'Describe this image in detail.'
        }
      ]
    }
  ]
});
console.log(message.content[0].text);
Best Practices
- Manage token usage: Track usage.input_tokens and usage.output_tokens to stay within limits and control costs.
- Handle stop reasons: Check stop_reason to determine if Claude finished naturally (end_turn) or was cut off (max_tokens).
- Use system prompts for instructions: For models that don't support prefill, leverage the system parameter for high-level guidance.
- Keep conversations focused: Include only relevant history to avoid exceeding context windows.
- Test with different models: Each Claude model has unique strengths—experiment to find the best fit.
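The "keep conversations focused" advice can be sketched as a crude history trimmer. It counts characters rather than real tokens, so treat it as an approximation rather than a context-window guarantee:

```python
def trim_history(messages, max_chars=8000):
    """Keep the most recent turns whose combined content length fits the budget.

    Character count is a rough stand-in for token count; the newest
    message is always kept even if it alone exceeds the budget.
    """
    kept, total = [], 0
    for msg in reversed(messages):
        total += len(str(msg["content"]))
        if kept and total > max_chars:
            break
        kept.append(msg)
    return list(reversed(kept))
```

A production version would count tokens properly and take care to keep role alternation intact after trimming.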
Key Takeaways
- Stateless design: Always send the full conversation history; manage state on your end.
- Prefill for precision: Use prefill to constrain output format, but check model compatibility first.
- Vision is powerful: Send images as base64-encoded data for visual analysis tasks.
- Monitor usage: Track token counts and stop reasons to optimize performance and cost.
- Migrate when needed: For models that don't support prefill, use system prompts or structured outputs.