Mastering the Messages API: Build Conversational AI with Claude
This guide teaches you how to use Claude's Messages API to build conversational AI applications, covering basic requests, multi-turn conversations, prefill techniques, and vision capabilities with code examples in Python and TypeScript.
Claude's Messages API is the primary interface for integrating Claude into your applications. Whether you're building a simple chatbot, a multi-turn assistant, or a vision-powered tool, understanding the Messages API is essential. This guide walks you through the core patterns, from basic requests to advanced techniques like prefill and vision.
Understanding the Messages API
Anthropic offers two ways to build with Claude: the Messages API and Claude Managed Agents. The Messages API gives you direct model prompting access and fine-grained control over your agent loops. It's ideal for custom workflows where you need to manage conversation state, handle tool calls, or implement complex logic.
Note: The Messages API is eligible for Zero Data Retention (ZDR). When your organization has a ZDR arrangement, data sent through this feature is not stored after the API response is returned.
Basic Request and Response
Let's start with the simplest possible interaction: sending a single message to Claude and receiving a response.
Python Example
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)
print(message)
TypeScript Example
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const message = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Hello, Claude' }
  ]
});
console.log(message);
Understanding the Response
The API returns a structured response containing:
{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello!"
    }
  ],
  "model": "claude-opus-4-7",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 12,
    "output_tokens": 6
  }
}
Key fields to note:
- id: Unique identifier for the message
- content: Array of content blocks (text, tool_use, etc.)
- stop_reason: Why the model stopped (end_turn, max_tokens, stop_sequence, etc.)
- usage: Token counts for billing and monitoring
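As a quick illustration of working with these fields, here is a small sketch that parses the example response above with nothing but the standard library and pulls out the values you will use most often:

```python
import json

# The example response from above, as raw JSON
raw = """{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [{"type": "text", "text": "Hello!"}],
  "model": "claude-opus-4-7",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {"input_tokens": 12, "output_tokens": 6}
}"""

response = json.loads(raw)

# content is a list and may hold several blocks, so concatenate the text ones
text = "".join(b["text"] for b in response["content"] if b["type"] == "text")
total_tokens = response["usage"]["input_tokens"] + response["usage"]["output_tokens"]
```

When you use an SDK, these same fields are available as attributes (e.g. `message.stop_reason`, `message.usage`).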
Building Multi-Turn Conversations
The Messages API is stateless — you must send the full conversation history with every request. This gives you complete control over context but requires you to manage conversation state on your end.
Example: Two-Turn Conversation
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"},
        {"role": "assistant", "content": "Hello!"},
        {"role": "user", "content": "Can you describe LLMs to me?"}
    ]
)
print(message.content[0].text)
Key Points for Multi-Turn Conversations
- Always include the full history: Each request must contain all previous messages in order.
- Alternate roles: Messages must alternate between user and assistant roles.
- Synthetic assistant messages: You can inject pre-written assistant responses to guide the conversation or simulate past interactions.
- Manage context window: Be mindful of token limits — long conversations may require summarization or trimming.
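The points above can be sketched as a small client-side state manager. This is an illustrative sketch, not an SDK feature: `add_turn` enforces the alternation rule, and `trim_history` is a naive trimming strategy (a real application might summarize old turns instead):

```python
# Client-side conversation state, following the rules above.

def add_turn(history, role, content):
    # Enforce the alternating-role rule before appending
    if history and history[-1]["role"] == role:
        raise ValueError(f"Consecutive '{role}' messages are not allowed")
    history.append({"role": role, "content": content})
    return history

def trim_history(history, max_messages=20):
    # Keep the most recent messages, but make sure the window opens on a
    # user turn so the alternation rule still holds after trimming.
    trimmed = history[-max_messages:]
    while trimmed and trimmed[0]["role"] != "user":
        trimmed = trimmed[1:]
    return trimmed

history = []
add_turn(history, "user", "Hello, Claude")
add_turn(history, "assistant", "Hello!")
add_turn(history, "user", "Can you describe LLMs to me?")
```

On each request you would then pass `trim_history(history)` as the `messages` parameter and append Claude's reply to `history` afterwards.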
Prefill: Putting Words in Claude's Mouth
Prefill allows you to start Claude's response by providing the beginning of its answer. This is powerful for:
- Constraining responses to specific formats
- Guiding Claude toward a particular style or tone
- Implementing multiple-choice selection
Example: Multiple Choice with Prefill
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1,
    messages=[
        {
            "role": "user",
            "content": "What is Latin for Ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"
        },
        {
            "role": "assistant",
            "content": "The answer is ("
        }
    ]
)
print(message.content[0].text) # Outputs: "C"
Prefill Limitations
Important: Prefill is not supported on Claude Mythos Preview, Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6. Requests using prefill with these models return a 400 error. Use structured outputs or system prompt instructions instead.
For models that don't support prefill, consider these alternatives:
- Structured outputs: Define a JSON schema for the response
- System prompt instructions: Use the system parameter to specify response format
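As a sketch of the system-prompt alternative, the multiple-choice example above can be rewritten without any prefilled assistant turn (the instruction wording here is illustrative, not prescribed by the API):

```python
# The same multiple-choice question, constrained via `system` instead of a
# prefilled assistant message. Note there is no assistant turn at all.
request = {
    "model": "claude-sonnet-4-5",
    "max_tokens": 5,
    "system": "Answer multiple-choice questions with only the letter of the correct option.",
    "messages": [
        {
            "role": "user",
            "content": "What is Latin for Ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae",
        }
    ],
}
# message = client.messages.create(**request)
```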
Vision: Working with Images
Claude can analyze images sent through the Messages API. This enables use cases like image captioning, visual question answering, and document analysis.
Example: Sending an Image
import anthropic
import base64

client = anthropic.Anthropic()

# Read and encode the image
with open("chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode()

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data
                    }
                },
                {
                    "type": "text",
                    "text": "Describe this chart in detail."
                }
            ]
        }
    ]
)
print(message.content[0].text)
Supported Image Formats
Claude supports common image formats including PNG, JPEG, GIF, and WebP. For best results, ensure images are clear and not excessively large.
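If you send images of varying types, a small helper can pick the right media_type and reject unsupported files up front. `image_block` here is a hypothetical helper, not part of the SDK:

```python
import base64
from pathlib import Path

# Map file extensions to the media_type values listed above
MEDIA_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".gif": "image/gif",
    ".webp": "image/webp",
}

def image_block(path):
    """Build an image content block from a local file (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix not in MEDIA_TYPES:
        raise ValueError(f"Unsupported image format: {suffix}")
    data = base64.b64encode(Path(path).read_bytes()).decode()
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": MEDIA_TYPES[suffix], "data": data},
    }
```

The returned dict drops straight into a message's content array, alongside a text block, exactly as in the example above.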
Best Practices
1. Manage Token Usage
Monitor the usage field in responses to track costs and optimize your prompts. Use max_tokens to limit response length.
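For instance, a minimal accumulator over the usage field might look like this (a sketch; production code would likely persist these counts per user or per feature):

```python
class UsageTracker:
    """Accumulate token counts across requests (a minimal sketch)."""

    def __init__(self):
        self.input_tokens = 0
        self.output_tokens = 0

    def record(self, usage):
        # `usage` is the dict from a response,
        # e.g. {"input_tokens": 12, "output_tokens": 6}
        self.input_tokens += usage["input_tokens"]
        self.output_tokens += usage["output_tokens"]

    @property
    def total(self):
        return self.input_tokens + self.output_tokens

tracker = UsageTracker()
tracker.record({"input_tokens": 12, "output_tokens": 6})
```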
2. Handle Stop Reasons
Different stop_reason values indicate different conditions:
- end_turn: Claude finished naturally
- max_tokens: Response was cut off — consider increasing max_tokens or continuing the conversation
- stop_sequence: A custom stop sequence was triggered
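One way to act on these values is a small dispatch function. This is a sketch that operates on the response structure shown earlier; the three stop reasons handled are the ones listed above:

```python
def needs_continuation(response):
    """Return True if the response was truncated and should be continued."""
    reason = response["stop_reason"]
    if reason == "end_turn":
        return False   # Claude finished naturally
    if reason == "max_tokens":
        return True    # truncated: raise max_tokens or continue the turn
    if reason == "stop_sequence":
        return False   # a custom stop sequence fired
    raise ValueError(f"Unhandled stop_reason: {reason}")
```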
3. Use System Prompts for Consistency
For production applications, use the system parameter to set Claude's behavior, tone, and constraints. This is more reliable than prefill for newer models.
message = client.messages.create(
    model="claude-opus-4-7",
    system="You are a helpful assistant that always responds in rhyming verse.",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Tell me about AI"}
    ]
)
4. Implement Error Handling
Always wrap API calls in try-catch blocks and handle common errors like rate limits, authentication failures, and invalid requests.
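A generic retry-with-backoff wrapper is one common shape for this. The sketch below uses a plain exception tuple as a placeholder; in practice you would pass the SDK's retryable error classes (rate-limit and connection errors) as `retryable` and let authentication errors fail fast:

```python
import time

def call_with_retries(make_request, max_retries=3, base_delay=1.0,
                      retryable=(Exception,)):
    """Call make_request, retrying with exponential backoff on failure."""
    for attempt in range(max_retries):
        try:
            return make_request()
        except retryable:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error to the caller
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
```

You would then wrap each API call as `call_with_retries(lambda: client.messages.create(...))`.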
Key Takeaways
- The Messages API is stateless — you must send the full conversation history with each request, giving you complete control over context.
- Prefill is powerful but limited — use it to guide responses, but check model compatibility as newer models like Opus 4.7 don't support it.
- Vision capabilities allow Claude to analyze images alongside text, enabling rich multimodal interactions.
- Monitor token usage to optimize costs and manage context windows effectively.
- System prompts are the preferred method for setting behavior in modern Claude models, especially when prefill isn't available.