Mastering the Messages API: A Practical Guide to Building Conversational AI with Claude
This guide teaches you how to use Claude's Messages API to build conversational AI applications, covering basic requests, multi-turn conversations, prefill techniques, and vision capabilities with practical code examples.
Claude's Messages API is the foundation for building custom AI applications. Whether you're creating a chatbot, a content generator, or a vision-enabled assistant, understanding how to work with messages is essential. This guide walks you through the core patterns, from simple requests to advanced techniques like prefill and vision.
Understanding the Messages API vs. Managed Agents
Anthropic offers two primary ways to build with Claude:
- Messages API: Direct model prompting access. Best for custom agent loops and fine-grained control.
- Claude Managed Agents: Pre-built, configurable agent harness that runs in managed infrastructure. Best for long-running tasks and asynchronous work.
Basic Request and Response
Let's start with the simplest interaction: sending a single message to Claude and receiving a response.
Python Example
import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
model="claude-opus-4-7",
max_tokens=1024,
messages=[
{"role": "user", "content": "Hello, Claude"}
]
)
print(message)
Response Structure
{
"id": "msg_01XFDUDYJgAACzvnptvVoYEL",
"type": "message",
"role": "assistant",
"content": [
{
"type": "text",
"text": "Hello!"
}
],
"model": "claude-opus-4-7",
"stop_reason": "end_turn",
"stop_sequence": null,
"usage": {
"input_tokens": 12,
"output_tokens": 6
}
}
Key fields to understand:
- content: An array of content blocks (text, images, tool use, etc.)
- stop_reason: Why the response ended ("end_turn", "max_tokens", "stop_sequence", or "tool_use")
- usage: Token counts for billing and optimization
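As a quick sketch of working with these fields, here is the sample response above as a plain Python dict (SDK users get the same data as attributes on the returned message object, e.g. message.content and message.usage):

```python
# The sample response from above, as a plain dict
response = {
    "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
    "type": "message",
    "role": "assistant",
    "content": [{"type": "text", "text": "Hello!"}],
    "model": "claude-opus-4-7",
    "stop_reason": "end_turn",
    "stop_sequence": None,
    "usage": {"input_tokens": 12, "output_tokens": 6},
}

# content is a list of blocks, not a single string:
# concatenate all text blocks to get the full reply
text = "".join(b["text"] for b in response["content"] if b["type"] == "text")

# Token totals are useful for cost tracking
total_tokens = response["usage"]["input_tokens"] + response["usage"]["output_tokens"]

print(text)                      # Hello!
print(response["stop_reason"])   # end_turn
print(total_tokens)              # 18
```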
Building Multi-Turn Conversations
The Messages API is stateless—you must send the full conversation history with every request. This gives you complete control over context.
Python Example: Multi-Turn
import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
model="claude-opus-4-7",
max_tokens=1024,
messages=[
{"role": "user", "content": "Hello, Claude"},
{"role": "assistant", "content": "Hello!"},
{"role": "user", "content": "Can you describe LLMs to me?"}
]
)
print(message.content[0].text)
Important Notes
- Conversational history is explicit: You must include all previous messages in each request.
- Synthetic assistant messages: Earlier turns don't need to come from Claude. You can inject pre-written assistant responses to guide the conversation.
- Role alternation: Messages must alternate between "user" and "assistant" roles. The last message must be from the user.
Prefill Technique: Putting Words in Claude's Mouth
Prefill allows you to start Claude's response for it. This is powerful for:
- Forcing structured outputs (e.g., JSON, multiple choice)
- Guiding tone or style
- Reducing output tokens for specific tasks
Example: Multiple Choice Answer
import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
model="claude-sonnet-4-5",
max_tokens=1,
messages=[
{
"role": "user",
"content": "What is the Latin name for ants? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"
},
{
"role": "assistant",
"content": "The answer is ("
}
]
)
print(message.content[0].text) # Outputs: "C"
Prefill Limitations
Important: Prefill is not supported on Claude Mythos Preview, Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6. Requests using prefill with these models return a 400 error. Use structured outputs or system prompt instructions instead.
Vision: Working with Images
Claude can analyze images sent via the Messages API. This enables use cases like:
- Document analysis
- Screenshot interpretation
- Visual Q&A
Python Example: Image Analysis
import anthropic
import base64
client = anthropic.Anthropic()
# Read and encode the image
with open("chart.png", "rb") as f:
image_data = base64.b64encode(f.read()).decode("utf-8")
message = client.messages.create(
model="claude-opus-4-7",
max_tokens=1024,
messages=[
{
"role": "user",
"content": [
{
"type": "image",
"source": {
"type": "base64",
"media_type": "image/png",
"data": image_data
}
},
{
"type": "text",
"text": "Describe this chart in detail."
}
]
}
]
)
print(message.content[0].text)
Supported Image Formats
- JPEG
- PNG
- GIF
- WebP
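Each format maps to a media_type string in the image source block. A small helper (hypothetical, not part of the SDK) can derive the right media type from the file extension and build the content block in one step:

```python
import base64
import pathlib

# Media types accepted by the API, keyed by file extension
MEDIA_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}

def image_block(path: str) -> dict:
    """Build a base64 image content block for a local image file."""
    p = pathlib.Path(path)
    suffix = p.suffix.lower()
    if suffix not in MEDIA_TYPES:
        raise ValueError(f"Unsupported image format: {suffix}")
    data = base64.b64encode(p.read_bytes()).decode("utf-8")
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": MEDIA_TYPES[suffix],
            "data": data,
        },
    }
```

The returned dict can be dropped directly into a message's content list alongside a text block, as in the example above.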
Handling Stop Reasons
Understanding why Claude stopped generating helps you handle the response appropriately:
| stop_reason | Meaning | Action |
|---|---|---|
| "end_turn" | Claude finished naturally | Process the response |
| "max_tokens" | Hit the token limit | Increase max_tokens or continue the conversation |
| "stop_sequence" | Hit a custom stop sequence | Handle based on your logic |
| "tool_use" | Claude wants to use a tool | Execute the tool and return results |
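The table above can be turned into a small dispatcher. This is a sketch (the function and action names are illustrative); failing loudly on an unrecognized value is a good default, since new stop reasons may be added over time:

```python
def handle_stop_reason(stop_reason: str) -> str:
    """Map a stop_reason to the next action your application should take."""
    actions = {
        "end_turn": "process_response",
        "max_tokens": "increase_limit_or_continue",
        "stop_sequence": "handle_stop_sequence",
        "tool_use": "run_tool_and_return_results",
    }
    try:
        return actions[stop_reason]
    except KeyError:
        # Fail loudly rather than silently mishandling an unknown value
        raise ValueError(f"Unexpected stop_reason: {stop_reason!r}")

print(handle_stop_reason("end_turn"))  # process_response
```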
Best Practices
- Manage token usage: Track usage.input_tokens and usage.output_tokens to optimize costs.
- Use system prompts: For persistent instructions, use the system parameter instead of repeating instructions in every user message.
- Handle errors gracefully: Implement retry logic for rate limits and timeouts.
- Stream responses: For long outputs, use streaming to improve user experience.
- Validate prefill compatibility: Check model support before using prefill.
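The retry advice can be sketched as a generic exponential-backoff wrapper. The delays and exception types here are illustrative, and the Anthropic SDK also ships with built-in retries, so treat this as a pattern rather than a required implementation:

```python
import random
import time

def with_retries(fn, *, max_attempts=5, base_delay=1.0, retry_on=(Exception,)):
    """Call fn(), retrying on the given exceptions with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the original error
            # Exponential backoff with jitter: ~1s, 2s, 4s, ... plus noise
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            time.sleep(delay)

# Usage sketch (assuming an anthropic client; exception names may vary by SDK version):
# message = with_retries(
#     lambda: client.messages.create(model=..., max_tokens=..., messages=[...]),
#     retry_on=(anthropic.RateLimitError, anthropic.APITimeoutError),
# )
```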
Key Takeaways
- The Messages API is stateless—always send the full conversation history with each request.
- Multi-turn conversations require alternating user and assistant messages, with the last message from the user.
- Prefill lets you start Claude's response, but check model compatibility (not supported on Opus 4.7, Sonnet 4.6, etc.).
- Vision capabilities allow Claude to analyze images sent as base64-encoded data.
- Always check stop_reason to determine why the response ended and handle it appropriately.