Mastering the Messages API: Build Multi-Turn Conversations and Vision-Powered Apps with Claude
Learn how to use the Claude Messages API for basic requests, multi-turn conversations, prefill techniques, and vision capabilities. Practical code examples included.
This guide teaches you how to use Claude's Messages API to build stateless multi-turn conversations, prefill responses for structured outputs, and send images for vision analysis, with practical Python and TypeScript examples.
Claude's Messages API is the backbone of programmatic interaction with Anthropic's powerful language models. Whether you're building a chatbot, a document analysis tool, or a vision-enabled application, understanding how to work with messages is essential.
This guide covers the core patterns you'll use daily: basic requests, multi-turn conversations, prefill techniques, and vision capabilities. By the end, you'll be able to build sophisticated, stateful applications on top of Claude's stateless API.
Understanding the Messages API vs. Claude Managed Agents
Anthropic offers two primary ways to build with Claude:
| Feature | Messages API | Claude Managed Agents |
|---|---|---|
| What it is | Direct model prompting access | Pre-built, configurable agent harness |
| Best for | Custom agent loops and fine-grained control | Long-running tasks and asynchronous work |
| Learn more | Messages API docs | Managed Agents docs |
Making Your First API Request
Let's start with the simplest possible interaction: sending a single message and getting a response.
Python Example
import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)
print(message)
TypeScript Example
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const message = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Hello, Claude' }
  ]
});
console.log(message);
Understanding the Response
The API returns a structured response object:
{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello!"
    }
  ],
  "model": "claude-opus-4-7",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 12,
    "output_tokens": 6
  }
}
Key fields to note:
- content: An array of content blocks (text, tool_use, etc.)
- stop_reason: Why the model stopped ("end_turn", "max_tokens", "stop_sequence", or "tool_use")
- usage: Token counts for billing and monitoring
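Because content is a list of blocks rather than a single string, reading only the first block assumes that block is text. A minimal sketch of a more defensive reader, working on the JSON form of the response shown above; extract_text is a hypothetical helper name, not part of the SDK:

```python
from typing import Any


def extract_text(response: dict[str, Any]) -> str:
    """Join the text of every text block in a Messages API response dict.

    Non-text blocks (for example tool_use) are skipped.
    """
    blocks = response.get("content", [])
    return "".join(b.get("text", "") for b in blocks if b.get("type") == "text")


# Sample shaped like the response above (values are illustrative).
sample = {
    "content": [{"type": "text", "text": "Hello!"}],
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 12, "output_tokens": 6},
}

print(extract_text(sample))
```

The Python SDK returns objects rather than dicts, so the same idea there would use attribute access (block.type, block.text) instead of dictionary lookups.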
Building Multi-Turn Conversations
The Messages API is stateless — Claude doesn't remember previous interactions. You must send the full conversation history with every request.
The Conversation Pattern
To build a multi-turn conversation, you append each new message to the messages array:
import anthropic
client = anthropic.Anthropic()
# Start the conversation
messages = [
    {"role": "user", "content": "What is the capital of France?"}
]
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=messages
)
# Append Claude's response
messages.append({"role": "assistant", "content": response.content[0].text})
# Ask a follow-up
messages.append({"role": "user", "content": "What is its population?"})
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=messages
)
print(response.content[0].text)
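The append-and-resend pattern above can be wrapped in a small class so the bookkeeping lives in one place. This is a sketch under stated assumptions: Conversation and its send parameter are hypothetical names, and send stands in for whatever function actually calls the API, which lets the example run offline.

```python
from typing import Callable

Message = dict[str, str]


class Conversation:
    """Keeps the message history client-side and resends all of it each turn."""

    def __init__(self, send: Callable[[list[Message]], str]) -> None:
        # `send` is a hypothetical callable: it receives the full history and
        # returns the assistant's reply text (e.g. by calling the Messages API).
        self._send = send
        self.messages: list[Message] = []

    def ask(self, user_text: str) -> str:
        self.messages.append({"role": "user", "content": user_text})
        reply = self._send(list(self.messages))  # pass a copy of the history
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def fake_send(history: list[Message]) -> str:
    # Stand-in for the real API call so the sketch runs without a network.
    return f"(reply to message {len(history)})"


chat = Conversation(fake_send)
chat.ask("What is the capital of France?")
chat.ask("What is its population?")
print([m["role"] for m in chat.messages])
```

To use it against the real API, send could be a function that calls client.messages.create with the history as messages and returns response.content[0].text, as in the example above.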
Synthetic Assistant Messages
You don't have to use only real Claude responses. You can inject synthetic assistant messages — pre-written text that appears to come from Claude. This is useful for:
- Guiding the conversation: Inserting context or corrections
- Role-playing scenarios: Setting up a character's backstory
- Data augmentation: Creating training datasets
messages = [
    {"role": "user", "content": "Tell me about ancient Rome."},
    # Synthetic assistant message to steer the response
    {"role": "assistant", "content": "Let me focus specifically on Roman engineering achievements."},
    {"role": "user", "content": "What were their most impressive structures?"}
]
Prefilling Claude's Response (Putting Words in Claude's Mouth)
You can prefill part of Claude's response by including an assistant message at the end of your input. This shapes the beginning of Claude's output.
Use Case: Multiple Choice Questions
import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1,  # Only need one token for the answer letter
    messages=[
        {
            "role": "user",
            "content": "What is 1 + 2? (A) 1 (B) 2 (C) 3 (D) 4"
        },
        {
            "role": "assistant",
            # Prefill starts here; it must not end in whitespace,
            # or the API rejects the request
            "content": "The correct answer is ("
        }
    ]
)
print(response.content[0].text)  # Likely prints: C
Important Limitations
Prefilling is not supported on Claude Mythos Preview, Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6. Requests using prefill with these models return a 400 error.
For these models, use structured outputs or system prompt instructions instead.
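One way to keep a single code path across both groups of models is to decide per request whether to prefill or to move the same constraint into the system prompt. The sketch below only builds the request arguments and makes no API call. build_request is a hypothetical helper, and the model-ID prefixes are assumptions: only claude-opus-4-7 appears in this guide, and the other three are guesses by analogy that should be checked against the current model list.

```python
from typing import Any

# Assumed model-ID prefixes for the models named above (see lead-in).
NO_PREFILL_PREFIXES = (
    "claude-mythos-preview",
    "claude-opus-4-7",
    "claude-opus-4-6",
    "claude-sonnet-4-6",
)


def build_request(model: str, question: str, prefill: str) -> dict[str, Any]:
    """Return keyword arguments for client.messages.create.

    Uses prefill where the model accepts it; otherwise expresses the same
    constraint as a system prompt instruction.
    """
    request: dict[str, Any] = {
        "model": model,
        "max_tokens": 16,
        "messages": [{"role": "user", "content": question}],
    }
    if model.startswith(NO_PREFILL_PREFIXES):
        request["system"] = f'Begin your reply with exactly: "{prefill.strip()}"'
    else:
        # A final assistant message must not end in whitespace, so strip it.
        request["messages"].append(
            {"role": "assistant", "content": prefill.rstrip()}
        )
    return request


print(build_request("claude-sonnet-4-5", "What is 1 + 2? (A) 1 (B) 2 (C) 3", "The answer is ("))
```

The returned dict can be passed as client.messages.create(**request). A system instruction is a weaker guarantee than prefill, so output from the fallback path may still need validation.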
Vision: Sending Images to Claude
Claude can analyze images alongside text. You can supply images using three source types:
1. Base64 Encoding
import anthropic
import base64
client = anthropic.Anthropic()
# Read the image and encode it as a base64 string
with open("photo.jpg", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/jpeg",
                        "data": image_data
                    }
                },
                {
                    "type": "text",
                    "text": "What is in this image?"
                }
            ]
        }
    ]
)
print(response.content[0].text)
2. URL Reference
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "url",
                        "url": "https://example.com/photo.jpg"
                    }
                },
                {
                    "type": "text",
                    "text": "Describe this image in detail."
                }
            ]
        }
    ]
)
3. File Reference (via Files API)
If you've uploaded an image through the Files API, you can reference it by file ID:
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "file",
                        "file_id": "file_01ABCDEFGHIJKLMNOPQRSTUV"
                    }
                },
                {
                    "type": "text",
                    "text": "What can you tell me about this image?"
                }
            ]
        }
    ]
)
Supported Media Types
| Format | MIME Type |
|---|---|
| JPEG | image/jpeg |
| PNG | image/png |
| GIF | image/gif |
| WebP | image/webp |
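When images come from disk, the media_type has to match one of the formats in the table. A minimal sketch of a helper that picks the MIME type from the file extension and builds the base64 content block; image_block and MEDIA_TYPES are hypothetical names, and the check looks only at the extension, not at the file's actual contents:

```python
import base64
from pathlib import Path
from typing import Any

# Extension-to-MIME mapping taken from the table above.
MEDIA_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def image_block(path: str) -> dict[str, Any]:
    """Build a base64 image content block, rejecting unsupported formats."""
    file = Path(path)
    media_type = MEDIA_TYPES.get(file.suffix.lower())
    if media_type is None:
        raise ValueError(f"Unsupported image format: {file.suffix or path}")
    data = base64.b64encode(file.read_bytes()).decode("utf-8")
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": data},
    }
```

The returned dict can be placed directly in a user message's content list next to a text block, as in the base64 example earlier.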
Handling Stop Reasons
Understanding why Claude stopped generating helps you build robust applications:
| stop_reason | Meaning | Action |
|---|---|---|
| "end_turn" | Claude finished naturally | Return the response |
| "max_tokens" | Hit the token limit | Increase max_tokens or truncate |
| "stop_sequence" | Found a stop sequence | Process the response |
| "tool_use" | Claude wants to call a tool | Execute the tool and continue |
if response.stop_reason == "max_tokens":
    print("Response was cut off. Consider increasing max_tokens.")
elif response.stop_reason == "tool_use":
    print("Claude requested a tool call. Handle it before continuing.")
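The "increase max_tokens" action from the table can be automated with a bounded retry. The sketch below assumes a hypothetical send callable that takes a max_tokens value and returns the response as a dict; the doubling strategy and the 8192 ceiling are illustrative choices, not API requirements.

```python
from typing import Any, Callable


def create_until_complete(
    send: Callable[[int], dict[str, Any]],
    max_tokens: int = 1024,
    ceiling: int = 8192,
) -> dict[str, Any]:
    """Retry with a doubled max_tokens while the reply is truncated.

    Returns the last response, which may still be truncated if the
    ceiling is reached.
    """
    while True:
        response = send(max_tokens)
        if response["stop_reason"] != "max_tokens" or max_tokens >= ceiling:
            return response
        max_tokens = min(max_tokens * 2, ceiling)


def fake_send(max_tokens: int) -> dict[str, Any]:
    # Offline stand-in: pretends the full answer needs about 3000 tokens.
    reason = "max_tokens" if max_tokens < 3000 else "end_turn"
    return {"stop_reason": reason, "max_tokens_used": max_tokens}


print(create_until_complete(fake_send))
```

Each retry regenerates the answer from scratch and is billed again, so a cap on attempts matters in practice.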
Best Practices
- Always send the full history: The API is stateless; don't rely on server-side memory.
- Use synthetic messages wisely: They're powerful for steering but can confuse Claude if overused.
- Monitor token usage: The usage field helps you track costs and optimize prompts.
- Handle max_tokens gracefully: Detect truncated responses, then retry with a higher limit or ask Claude to continue.
- Manage long conversations: Use prompt caching to cut the cost of resending history, or compaction to shorten very long histories.
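As an illustration of the token-monitoring advice, the usage field from each response can be summed into running totals. This is a sketch over responses in dict form; UsageTotals is a hypothetical name and the numbers are made up.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class UsageTotals:
    """Running token totals across several responses."""

    input_tokens: int = 0
    output_tokens: int = 0

    def add(self, response: dict[str, Any]) -> None:
        # Missing fields count as zero rather than raising.
        usage = response.get("usage", {})
        self.input_tokens += usage.get("input_tokens", 0)
        self.output_tokens += usage.get("output_tokens", 0)


totals = UsageTotals()
totals.add({"usage": {"input_tokens": 12, "output_tokens": 6}})
totals.add({"usage": {"input_tokens": 30, "output_tokens": 10}})
print(totals)
```

Because the full history is resent on every turn, input_tokens tends to grow with each request in a conversation, which is usually the first thing these totals make visible.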
Key Takeaways
- The Messages API is stateless: you must send the full conversation history with every request to maintain context.
- Multi-turn conversations are built by appending user and assistant messages to the messages array.
- Prefilling lets you shape Claude's response by providing the beginning of an assistant message, but it's not supported on all models.
- Vision capabilities allow you to send images via base64, URL, or file reference, supporting JPEG, PNG, GIF, and WebP formats.
- Always check stop_reason to understand why Claude stopped and handle edge cases like token limits or tool calls.