Mastering the Messages API: Build Multi-Turn Conversations with Claude
Learn how to use Claude's Messages API for basic requests, multi-turn conversations, prefill techniques, and vision capabilities with practical code examples.
This guide teaches you how to use Claude's Messages API to build conversational applications, including stateless multi-turn chats, prefill techniques to shape responses, and vision capabilities for image analysis.
Claude's Messages API is the primary way to interact with Claude programmatically. Whether you're building a chatbot, a content generation tool, or an AI-powered assistant, understanding the Messages API is essential. This guide covers everything from basic requests to advanced techniques like prefill and vision.
Understanding the Messages API vs. Claude Managed Agents
Anthropic offers two paths for building with Claude:
- Messages API: Direct model access for custom agent loops and fine-grained control. You manage the conversation state and logic.
- Claude Managed Agents: A pre-built, configurable agent harness that runs in managed infrastructure. Best for long-running tasks and asynchronous work.
Making Your First API Call
Let's start with the simplest possible request: sending a single message to Claude and getting a response.
Basic Request (Python)
import anthropic
# The client reads the API key from the ANTHROPIC_API_KEY environment variable
client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)
print(message)
Response Structure
The API returns a structured JSON object:
{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello!"
    }
  ],
  "model": "claude-opus-4-7",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 12,
    "output_tokens": 6
  }
}
Key fields to understand:
- content: An array of content blocks (usually text, but can include tool use blocks).
- stop_reason: Why the response ended ("end_turn", "max_tokens", "stop_sequence", or "tool_use").
- usage: Token counts for billing and monitoring.
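To make these fields concrete, here is a minimal sketch that reads them from a response represented as a plain dictionary. It assumes you have already converted the SDK object to a dict; the helper names `extract_text` and `was_truncated` are hypothetical and not part of the SDK.

```python
from typing import Any


def extract_text(response: dict[str, Any]) -> str:
    """Join the text of every "text" block, skipping other block types."""
    return "".join(
        block["text"]
        for block in response.get("content", [])
        if block.get("type") == "text"
    )


def was_truncated(response: dict[str, Any]) -> bool:
    """True when generation stopped because max_tokens was reached."""
    return response.get("stop_reason") == "max_tokens"


# Sample payload trimmed from the response shown above.
sample = {
    "content": [{"type": "text", "text": "Hello!"}],
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 12, "output_tokens": 6},
}

print(extract_text(sample))
print(was_truncated(sample))
total_tokens = sample["usage"]["input_tokens"] + sample["usage"]["output_tokens"]
print(total_tokens)
```

Filtering on the block type rather than indexing `content[0]` keeps the helper safe when a response also contains non-text blocks.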
Building Multi-Turn Conversations
The Messages API is stateless—you must send the full conversation history with every request. This gives you complete control over context but requires you to manage state on your end.
Sending Conversation History
import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"},
        {"role": "assistant", "content": "Hello!"},
        {"role": "user", "content": "Can you describe LLMs to me?"}
    ]
)
print(message.content[0].text)
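Because the API is stateless, something on your side has to accumulate the history. The sketch below shows one possible shape for that: a small hypothetical `Conversation` class that resends the full history on every turn. The `send` callable is an assumption introduced so the example runs offline; in practice it would wrap a real API call.

```python
from typing import Callable

Message = dict[str, str]


class Conversation:
    """Keeps the history client-side and resends all of it on every turn."""

    def __init__(self, send: Callable[[list[Message]], str]) -> None:
        # Hypothetical callable: takes the full history, returns the reply text.
        self._send = send
        self.messages: list[Message] = []

    def ask(self, user_text: str) -> str:
        self.messages.append({"role": "user", "content": user_text})
        reply = self._send(list(self.messages))  # pass a copy of the full history
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def fake_send(history: list[Message]) -> str:
    """Stand-in for a real API call so the sketch runs without network access."""
    return f"reply to {len(history)} message(s)"


chat = Conversation(fake_send)
first = chat.ask("Hello, Claude")
second = chat.ask("Can you describe LLMs to me?")
print(first)
print(second)
```

To use it against the API, `send` could be a function that calls `client.messages.create(...)` with `messages=history` and returns `message.content[0].text`.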
Important Notes
- The conversation must alternate between user and assistant roles.
- You can include synthetic assistant messages; they don't need to have come from Claude. This is useful for providing examples or guiding behavior.
- Always start with a user message.
- The last message must be from the user role (unless you're using prefill).
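The rules in this list can be checked locally before a request is sent. The following is a rough sketch of such a check; `validate_messages` is a hypothetical helper that encodes only the rules as stated here, and the API's own validation may differ.

```python
def validate_messages(messages: list[dict], allow_prefill: bool = False) -> list[str]:
    """Return a list of rule violations; an empty list means the history looks valid."""
    if not messages:
        return ["history is empty"]
    problems = []
    if messages[0]["role"] != "user":
        problems.append("first message must have role 'user'")
    # Compare each message with its successor to detect repeated roles.
    for prev, cur in zip(messages, messages[1:]):
        if prev["role"] == cur["role"]:
            problems.append(f"two consecutive '{cur['role']}' messages")
    if messages[-1]["role"] != "user" and not allow_prefill:
        problems.append("last message must have role 'user' unless prefilling")
    return problems


good = [
    {"role": "user", "content": "Hello, Claude"},
    {"role": "assistant", "content": "Hello!"},
    {"role": "user", "content": "Can you describe LLMs to me?"},
]
bad = [
    {"role": "assistant", "content": "Hello!"},
    {"role": "user", "content": "Hi"},
    {"role": "user", "content": "Hi again"},
]
print(validate_messages(good))
print(validate_messages(bad))
```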
Prefill: Putting Words in Claude's Mouth
Prefill allows you to start Claude's response for it. This is useful for:
- Forcing a specific format (e.g., JSON, multiple choice)
- Guiding the tone or structure
- Reducing token usage by constraining the output
Example: Multiple Choice Answer
import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1,
    messages=[
        {
            "role": "user",
            "content": "What is Latin for ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"
        },
        {
            "role": "assistant",
            "content": "The answer is ("
        }
    ]
)
print(message.content[0].text)  # Outputs: "C"
Prefill Limitations
- Not supported on: Claude Mythos Preview, Claude Opus 4.7, Claude Opus 4.6, Claude Sonnet 4.6
- Using prefill with these models returns a 400 error
- Alternative: Use structured outputs or system prompt instructions instead
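One way to cope with these limitations is to decide per model whether to prefill or to fold the constraint into the prompt. The sketch below illustrates that idea under stated assumptions: `build_messages` is a hypothetical helper, and the model ID strings in `NO_PREFILL_MODELS` are illustrative guesses derived from the names listed above (Claude Mythos Preview is omitted because no ID appears in this guide).

```python
# Model IDs assumed to reject prefill. These strings are illustrative only;
# confirm the exact IDs available to you before relying on them.
NO_PREFILL_MODELS = {"claude-opus-4-7", "claude-opus-4-6", "claude-sonnet-4-6"}


def build_messages(model: str, question: str, prefill: str) -> list[dict]:
    """Use prefill where supported, otherwise move the constraint into the prompt."""
    if model in NO_PREFILL_MODELS:
        instruction = f"{question}\n\nBegin your reply with exactly: {prefill}"
        return [{"role": "user", "content": instruction}]
    return [
        {"role": "user", "content": question},
        {"role": "assistant", "content": prefill},
    ]


question = "What is Latin for ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"
with_prefill = build_messages("claude-sonnet-4-5", question, "The answer is (")
without_prefill = build_messages("claude-opus-4-7", question, "The answer is (")
print(with_prefill[-1]["role"])
print(len(without_prefill))
```

An instruction in the prompt is a weaker constraint than prefill, so treat the fallback branch as a starting point and prefer structured outputs where a strict format matters.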
When to Use Prefill vs. System Prompts
| Technique | Best For |
|---|---|
| Prefill | Short, constrained outputs (multiple choice, yes/no, single word) |
| System prompt | Longer instructions, tone setting, behavior guidelines |
| Structured outputs | JSON schemas, typed responses |
Vision Capabilities: Analyzing Images
The Messages API supports image inputs, enabling Claude to analyze and describe visual content.
Sending an Image
import anthropic
import base64
client = anthropic.Anthropic()
# Read and encode the image
with open("diagram.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data
                    }
                },
                {
                    "type": "text",
                    "text": "Describe this diagram in detail."
                }
            ]
        }
    ]
)
print(message.content[0].text)
Supported Image Formats
- JPEG, PNG, GIF, WebP
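- Maximum size: 5 MB per image when sent through the API; resize or compress larger files before sending
- Recommended resolution: keep the long edge at or below about 1568 pixels; larger images are downscaled before Claude processes them, and the exact threshold can vary by model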
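Since the media type has to match one of the supported formats, it can help to derive it from the file name instead of hard-coding it. The sketch below shows one way to build the image content block used in the request above; `image_block` and `MEDIA_TYPES` are hypothetical names, and the extension-based detection is a simplifying assumption (it does not inspect the file contents).

```python
import base64
import os

# Media types for the formats listed above, keyed by file extension.
MEDIA_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def image_block(filename: str, data: bytes) -> dict:
    """Build a base64 image content block, rejecting unsupported extensions."""
    ext = os.path.splitext(filename)[1].lower()
    if ext not in MEDIA_TYPES:
        raise ValueError(f"unsupported image format: {filename!r}")
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": MEDIA_TYPES[ext],
            "data": base64.b64encode(data).decode("utf-8"),
        },
    }


# Placeholder bytes stand in for a real file read with open(..., "rb").
block = image_block("diagram.PNG", b"\x89PNG placeholder bytes")
print(block["source"]["media_type"])
```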
Vision Use Cases
- Document analysis: Extract text from scanned documents
- UI/UX review: Analyze screenshots for design feedback
- Data visualization: Interpret charts and graphs
- Product photography: Generate alt text or descriptions
Handling Stop Reasons
Understanding why Claude stopped generating helps you build robust applications:
| stop_reason | Meaning | Action |
|---|---|---|
| "end_turn" | Claude finished naturally | Continue conversation or end |
| "max_tokens" | Hit the token limit | Increase max_tokens or continue |
| "stop_sequence" | Hit a custom stop sequence | Handle based on your logic |
| "tool_use" | Claude wants to use a tool | Execute the tool and return results |
Example: Handling max_tokens
# `messages` is the list that was sent in the request that produced `message`
if message.stop_reason == "max_tokens":
    # Continue the conversation to get more output
    messages.append({"role": "assistant", "content": message.content[0].text})
    messages.append({"role": "user", "content": "Please continue."})
    # Send the new request...
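The fragment above can be extended into a bounded loop that keeps requesting continuations until the reply stops for a reason other than "max_tokens". This is only a sketch: `generate_until_done` is a hypothetical helper, and the `send` callable (returning the reply text and its stop reason) is an assumption that lets the example run offline.

```python
from typing import Callable

Reply = tuple[str, str]  # (text, stop_reason)


def generate_until_done(
    send: Callable[[list[dict]], Reply],
    messages: list[dict],
    max_rounds: int = 3,
) -> str:
    """Keep asking for continuations while the reply stops on max_tokens."""
    history = list(messages)  # work on a copy so the caller's list is untouched
    parts = []
    for _ in range(max_rounds):
        text, stop_reason = send(history)
        parts.append(text)
        if stop_reason != "max_tokens":
            break
        history.append({"role": "assistant", "content": text})
        history.append({"role": "user", "content": "Please continue."})
    return "".join(parts)


# Offline stand-in that is truncated twice before finishing.
_scripted = iter([
    ("Once upon ", "max_tokens"),
    ("a time ", "max_tokens"),
    ("the end.", "end_turn"),
])


def fake_send(history: list[dict]) -> Reply:
    return next(_scripted)


story = generate_until_done(fake_send, [{"role": "user", "content": "Tell me a story"}])
print(story)
```

The `max_rounds` cap matters: without it, a reply that always hits the limit would loop indefinitely and keep consuming tokens.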
Best Practices for Production
1. Manage Context Window
- Keep conversation history within Claude's context window (varies by model)
- Use prompt caching for frequently repeated system instructions
- Summarize or truncate old messages when approaching limits
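As a rough illustration of truncation, the sketch below keeps only the most recent messages and then drops any leading assistant turn so the history still starts with a user message. `trim_history` is a hypothetical helper, and counting messages rather than tokens is a simplifying assumption; a production version would budget by tokens.

```python
def trim_history(messages: list[dict], max_messages: int) -> list[dict]:
    """Keep the most recent messages, then drop leading non-user turns."""
    recent = messages[-max_messages:] if max_messages > 0 else []
    # The history must start with a user message, so skip any leading assistant turn.
    while recent and recent[0]["role"] != "user":
        recent = recent[1:]
    return recent


history = [
    {"role": "user", "content": "u1"},
    {"role": "assistant", "content": "a1"},
    {"role": "user", "content": "u2"},
    {"role": "assistant", "content": "a2"},
    {"role": "user", "content": "u3"},
]
trimmed = trim_history(history, 4)
print([m["content"] for m in trimmed])
```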
2. Handle Errors Gracefully
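try:
    message = client.messages.create(...)
# Catch the specific subclasses first: anthropic.APIError is their base class,
# so listing it first would make the handlers below it unreachable
except anthropic.RateLimitError as e:
    print(f"Rate limited: {e}")
    # Wait and retry
except anthropic.APIConnectionError as e:
    print(f"Connection error: {e}")
    # Check network and retry
except anthropic.APIError as e:
    print(f"API error: {e}")
    # Catch-all for other API errors; retry with exponential backoff where it makes sense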
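The retry comments above can be turned into a small generic helper. The following is a sketch of exponential backoff with a little jitter; `with_retries` and its parameters are hypothetical, and the built-in `ConnectionError` stands in for the SDK exceptions so the example runs offline.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    call: Callable[[], T],
    retry_on: tuple[type[Exception], ...],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` on the given exceptions, doubling the delay after each failure."""
    for attempt in range(max_attempts):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 10))  # add a small jitter
    raise RuntimeError("max_attempts must be at least 1")


attempts = {"count": 0}


def flaky() -> str:
    """Fails twice, then succeeds, to exercise the retry path."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


delays: list[float] = []
# Record the delays instead of sleeping so the demo finishes instantly.
result = with_retries(flaky, retry_on=(ConnectionError,), sleep=delays.append)
print(result, attempts["count"])
```

With the SDK, `retry_on` would typically list the transient error types, such as `anthropic.RateLimitError` and `anthropic.APIConnectionError`, rather than every API error.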
3. Monitor Token Usage
Track usage.input_tokens and usage.output_tokens to:
- Estimate costs
- Detect unexpectedly long conversations
- Optimize prompts for efficiency
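A running total of the usage fields is enough for basic cost monitoring. The sketch below accumulates token counts across requests; `UsageTracker` is a hypothetical class, and the prices passed to it are placeholder values rather than real pricing.

```python
from dataclasses import dataclass


@dataclass
class UsageTracker:
    input_price_per_mtok: float   # placeholder price, USD per million input tokens
    output_price_per_mtok: float  # placeholder price, USD per million output tokens
    input_tokens: int = 0
    output_tokens: int = 0

    def record(self, usage: dict) -> None:
        """Add one response's usage block to the running totals."""
        self.input_tokens += usage["input_tokens"]
        self.output_tokens += usage["output_tokens"]

    @property
    def cost_usd(self) -> float:
        """Estimated cost of everything recorded so far."""
        return (
            self.input_tokens * self.input_price_per_mtok
            + self.output_tokens * self.output_price_per_mtok
        ) / 1_000_000


tracker = UsageTracker(input_price_per_mtok=1.0, output_price_per_mtok=2.0)
tracker.record({"input_tokens": 12, "output_tokens": 6})
tracker.record({"input_tokens": 30, "output_tokens": 100})
print(tracker.input_tokens, tracker.output_tokens)
```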
4. Use Streaming for Responsive UIs
For chat applications, use streaming to show Claude's response as it's generated:
with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Tell me a story"}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
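In a chat application you usually want both the incremental display and the complete text, so the reply can be appended to the conversation history afterwards. The sketch below shows that pattern with a simulated stream; `render_stream` and `fake_stream` are hypothetical names, and any iterable of text chunks can take the place of the generator.

```python
from typing import Callable, Iterable, Iterator, Optional


def render_stream(
    chunks: Iterable[str],
    write: Optional[Callable[[str], None]] = None,
) -> str:
    """Show each chunk as it arrives and return the assembled text."""
    if write is None:
        # Default: print without a newline and flush so output appears immediately.
        write = lambda chunk: print(chunk, end="", flush=True)
    parts = []
    for chunk in chunks:
        write(chunk)
        parts.append(chunk)
    return "".join(parts)


def fake_stream() -> Iterator[str]:
    """Offline stand-in for a stream of text chunks."""
    yield from ["Once ", "upon ", "a time."]


full_text = render_stream(fake_stream())
print()  # finish the line after streaming
```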
Conclusion
The Messages API is the foundation for building with Claude. By mastering basic requests, multi-turn conversations, prefill, and vision, you can create sophisticated applications that leverage Claude's full capabilities.
Remember these key points:
- The API is stateless—you manage conversation history
- Prefill is powerful but has model limitations
- Vision enables image analysis workflows
- Always handle stop reasons and errors in production
Key Takeaways
- Stateless by design: You must send the full conversation history with every request, giving you complete control over context.
- Prefill shapes responses: Use prefill to constrain outputs (e.g., multiple choice), but avoid it on newer models—use structured outputs instead.
- Vision is built-in: Send images as base64-encoded content blocks for document analysis, UI review, and more.
- Handle stop reasons: Different stop reasons (end_turn, max_tokens, tool_use) require different handling logic.
- Stream for UX: Use streaming for real-time applications to show Claude's response as it's generated.