Mastering the Messages API: Building Conversational AI with Claude
This guide teaches you how to use Claude's Messages API to build conversational AI applications, covering basic requests, multi-turn conversations, prefill techniques, and vision capabilities with code examples in Python and TypeScript.
Introduction
Claude's Messages API is the primary way to interact with Claude programmatically. Whether you're building a chatbot, a content generation tool, or a complex agent system, you will work with it directly. This guide walks through making your first request, handling multi-turn conversations, and using techniques such as prefill and vision.
Anthropic offers two ways to build with Claude: the Messages API for direct model access and fine-grained control, and Claude Managed Agents for pre-built, configurable agent harnesses. This guide focuses on the Messages API, which is ideal for custom agent loops and applications requiring precise control over the conversation flow.
Basic Request and Response
Let's start with the simplest possible interaction: sending a single message to Claude and getting a response.
Python Example
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)
print(message)
TypeScript Example
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const message = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Hello, Claude' }
  ]
});
console.log(message);
Understanding the Response
The API returns a structured response object containing:
{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello!"
    }
  ],
  "model": "claude-opus-4-7",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 12,
    "output_tokens": 6
  }
}
Key fields to note:
- content: An array of content blocks (text, tool use, etc.)
- stop_reason: Why the response ended (end_turn, max_tokens, stop_sequence, or tool_use)
- usage: Token counts for billing and monitoring
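To make these fields concrete, here is a minimal sketch that pulls the text, stop reason, and token total out of a response shaped like the JSON shown earlier. It operates on a plain dict (for example, parsed JSON) rather than an SDK object, and the helper name summarize_response is hypothetical, not part of the SDK.

```python
from typing import Any


def summarize_response(response: dict[str, Any]) -> dict[str, Any]:
    """Collect the text, stop reason, and total tokens from a response dict."""
    # Only "text" blocks carry a "text" field; other block types are skipped.
    text = "".join(
        block["text"] for block in response["content"] if block["type"] == "text"
    )
    usage = response["usage"]
    return {
        "text": text,
        "stop_reason": response["stop_reason"],
        "total_tokens": usage["input_tokens"] + usage["output_tokens"],
    }


# Sample data copied from the response shown earlier in this guide.
sample = {
    "content": [{"type": "text", "text": "Hello!"}],
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 12, "output_tokens": 6},
}
print(summarize_response(sample))
```

Joining every text block, instead of reading only the first one, keeps the helper safe when a response mixes text with other block types.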
Multi-Turn Conversations
The Messages API is stateless—you must send the full conversation history with every request. This gives you complete control over context but requires you to manage state on your end.
Building a Conversation
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"},
        {"role": "assistant", "content": "Hello!"},
        {"role": "user", "content": "Can you describe LLMs to me?"}
    ]
)
print(message.content[0].text)
Important Patterns
- Full history required: Always include all previous messages in the messages array
- Synthetic assistant messages: You can insert pre-written assistant responses (e.g., for few-shot examples or guided conversations)
- Alternating roles: Messages alternate between user and assistant roles, starting with user
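One way to manage that state is a small wrapper that stores the history and resends all of it on each turn. The sketch below is illustrative only: the Conversation class and the send callable are hypothetical names, and fake_send stands in for a real API call so the example runs offline.

```python
from typing import Callable

Message = dict[str, str]


class Conversation:
    """Keeps the full message history and resends it on every turn."""

    def __init__(self, send: Callable[[list[Message]], str]) -> None:
        # `send` is a hypothetical callable that would wrap
        # client.messages.create and return the assistant's reply text.
        self._send = send
        self.messages: list[Message] = []

    def ask(self, user_text: str) -> str:
        self.messages.append({"role": "user", "content": user_text})
        # Pass a copy so the callee cannot mutate the stored history.
        reply = self._send(list(self.messages))
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def fake_send(history: list[Message]) -> str:
    # Stand-in for a real API call: reports how many messages it received.
    return f"received {len(history)} message(s)"


chat = Conversation(fake_send)
print(chat.ask("Hello, Claude"))
print(chat.ask("Can you describe LLMs to me?"))
```

In a real application, send would call client.messages.create with the history as its messages argument and return message.content[0].text. The second call receives three messages, which shows the whole history travelling with every request.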
Putting Words in Claude's Mouth (Prefill)
Prefill allows you to start Claude's response for it. This is useful for:
- Guiding Claude toward a specific format
- Forcing multiple-choice answers
- Providing a response template
Prefill Example
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1,
    messages=[
        {
            "role": "user",
            "content": "What is Latin for Ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"
        },
        {
            "role": "assistant",
            "content": "The answer is ("
        }
    ]
)
print(message.content[0].text)  # Outputs: "C"
Prefill Limitations
Important: Prefill is not supported on Claude Mythos Preview, Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6. Requests using prefill with these models return a 400 error. Use structured outputs or system prompt instructions instead.
For models that don't support prefill, consider:
- Structured outputs: Define a JSON schema for Claude to follow
- System prompt instructions: Use the system parameter to guide response format
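As a rough illustration of the system prompt route, the sketch below builds the keyword arguments for client.messages.create with the format constraint moved into system, then validates the reply in code. The helper names build_choice_request and parse_choice, the system prompt wording, and the max_tokens value are all assumptions made for this example.

```python
from typing import Any

CHOICES = ("A", "B", "C")


def build_choice_request(question: str, model: str) -> dict[str, Any]:
    """Build keyword arguments for client.messages.create without prefill."""
    return {
        "model": model,
        "max_tokens": 5,
        # The system prompt carries the format constraint that the prefill
        # "The answer is (" would otherwise impose.
        "system": "Answer with a single letter: A, B, or C. No other text.",
        "messages": [{"role": "user", "content": question}],
    }


def parse_choice(reply_text: str) -> str:
    """Return the chosen letter, or raise if the reply is off-format."""
    letter = reply_text.strip().strip("().").upper()
    if letter not in CHOICES:
        raise ValueError(f"Unexpected reply: {reply_text!r}")
    return letter


params = build_choice_request(
    "What is Latin for Ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae",
    model="claude-opus-4-7",
)
print(params["system"])
print(parse_choice(" (C) "))
```

Because a system prompt guides the format rather than guaranteeing it, the parsing step rejects anything that is not one of the expected letters instead of trusting the reply.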
Vision Capabilities
Claude can process images alongside text. This enables use cases like:
- Image analysis and description
- Document processing (receipts, forms, etc.)
- Visual question answering
Vision Request Example
import anthropic
import base64

client = anthropic.Anthropic()

# Read and encode image
with open("receipt.jpg", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/jpeg",
                        "data": image_data
                    }
                },
                {
                    "type": "text",
                    "text": "What's the total amount on this receipt?"
                }
            ]
        }
    ]
)
print(message.content[0].text)
Supported Image Formats
- JPEG, PNG, GIF, WebP
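- Maximum size: 5 MB per image when sent through the API; confirm against the current vision documentation, since limits differ by platform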
- Claude automatically resizes images to fit context window limits
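Since the media_type must match the actual file format, it can help to derive it from the filename and reject unsupported formats before sending anything. The following is a minimal sketch; image_block and MEDIA_TYPES are hypothetical names, and the mapping assumes file extensions are trustworthy, which may not hold for user uploads.

```python
import base64
from pathlib import PurePath
from typing import Any

# Media types for the four formats listed above.
MEDIA_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def image_block(filename: str, data: bytes) -> dict[str, Any]:
    """Build a base64 image content block, rejecting unsupported formats."""
    suffix = PurePath(filename).suffix.lower()
    if suffix not in MEDIA_TYPES:
        raise ValueError(f"Unsupported image format: {filename!r}")
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": MEDIA_TYPES[suffix],
            "data": base64.b64encode(data).decode("utf-8"),
        },
    }


# The three bytes below are just the start of a JPEG header, used as dummy data.
block = image_block("receipt.JPG", b"\xff\xd8\xff")
print(block["source"]["media_type"])
```

The returned dict has the same shape as the image entry in the vision request above, so it can be placed directly in a user message's content list next to a text block.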
Handling Stop Reasons
Understanding why Claude stopped generating is crucial for building robust applications:
| Stop Reason | Meaning | Action |
|---|---|---|
| end_turn | Claude finished naturally | Continue or end conversation |
| max_tokens | Hit token limit | Increase max_tokens or continue |
| stop_sequence | Found a stop sequence | Handle as needed |
| tool_use | Claude wants to use a tool | Execute tool and return result |
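The table can be turned into a small dispatch function so that every response passes through one decision point. This is a sketch under assumptions: the action labels it returns are invented for illustration, and it covers only the four stop reasons listed in this guide.

```python
def next_action(stop_reason: str) -> str:
    """Map a stop_reason to the follow-up suggested in the table."""
    actions = {
        "end_turn": "done",
        "max_tokens": "retry_with_higher_limit",
        "stop_sequence": "handle_stop_sequence",
        "tool_use": "run_tools",
    }
    # Fail loudly on values this sketch does not know about, rather than
    # silently treating them as a finished turn.
    if stop_reason not in actions:
        raise ValueError(f"Unhandled stop_reason: {stop_reason!r}")
    return actions[stop_reason]


for reason in ("end_turn", "max_tokens", "tool_use"):
    print(reason, "->", next_action(reason))
```

Raising on unknown values is a deliberate choice here: if the API reports a stop reason the application was not written for, an explicit error is easier to diagnose than a truncated or half-finished answer.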
Example: Handling Tool Use
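if message.stop_reason == "tool_use":
    # Echo Claude's turn back once, then answer every tool_use block in it
    messages.append({"role": "assistant", "content": message.content})
    tool_results = []
    for block in message.content:
        if block.type == "tool_use":
            # execute_tool is your own dispatcher, not part of the SDK
            result = execute_tool(block.name, block.input)
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": result,
            })
    messages.append({"role": "user", "content": tool_results})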
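The snippet above handles a single response; an agent loop repeats the exchange until Claude stops asking for tools. Here is a minimal offline sketch of such a loop. It uses plain dicts instead of SDK objects, and run_tool_loop, fake_create, and fake_execute are hypothetical stand-ins for your own API wrapper and tool dispatcher.

```python
from typing import Any, Callable

Response = dict[str, Any]


def run_tool_loop(
    create: Callable[[list[dict[str, Any]]], Response],
    execute_tool: Callable[[str, dict[str, Any]], str],
    messages: list[dict[str, Any]],
    max_rounds: int = 5,
) -> Response:
    """Call `create` until the model stops requesting tools."""
    for _ in range(max_rounds):
        response = create(messages)
        if response["stop_reason"] != "tool_use":
            return response
        # Echo the assistant turn, then answer each tool_use block by id.
        messages.append({"role": "assistant", "content": response["content"]})
        results = [
            {
                "type": "tool_result",
                "tool_use_id": block["id"],
                "content": execute_tool(block["name"], block["input"]),
            }
            for block in response["content"]
            if block["type"] == "tool_use"
        ]
        messages.append({"role": "user", "content": results})
    raise RuntimeError("Tool loop did not finish within max_rounds")


def fake_create(history: list[dict[str, Any]]) -> Response:
    # Stand-in for the API: ask for a tool once, then finish.
    if len(history) == 1:
        tool_call = {
            "type": "tool_use",
            "id": "toolu_1",
            "name": "add",
            "input": {"a": 2, "b": 3},
        }
        return {"stop_reason": "tool_use", "content": [tool_call]}
    text = {"type": "text", "text": "The sum is 5."}
    return {"stop_reason": "end_turn", "content": [text]}


def fake_execute(name: str, args: dict[str, Any]) -> str:
    # Stand-in tool dispatcher that only knows how to add two numbers.
    return str(args["a"] + args["b"])


history = [{"role": "user", "content": "What is 2 + 3?"}]
final = run_tool_loop(fake_create, fake_execute, history)
print(final["content"][0]["text"])
```

The max_rounds cap is an added safeguard, not something the API requires: it keeps a misbehaving tool or prompt from looping indefinitely.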
Best Practices
- Manage context windows: Keep conversations within Claude's context limit. Use techniques like summarization or sliding windows for long conversations.
- Use system prompts: For consistent behavior, use the system parameter to set Claude's persona and constraints.
- Monitor token usage: Track usage.input_tokens and usage.output_tokens to control costs and optimize prompts.
- Handle errors gracefully: Implement retry logic for transient failures and validate responses before using them.
- Stream responses: For better user experience, use streaming to show responses as they're generated.
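Two of these practices lend themselves to short helpers: a sliding window over the history and retries with exponential backoff. The sketch below is illustrative only. trim_history, call_with_retries, and TransientError are hypothetical names; a real application would catch the SDK's own retryable exceptions and would likely budget by tokens, not by message count.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")
Message = dict[str, str]


def trim_history(messages: list[Message], max_messages: int) -> list[Message]:
    """Keep the most recent messages, starting on a user turn."""
    recent = messages[-max_messages:] if max_messages > 0 else []
    # Drop leading assistant turns so the window still starts with "user".
    while recent and recent[0]["role"] != "user":
        recent = recent[1:]
    return recent


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure (rate limit, timeout)."""


def call_with_retries(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run fn, retrying transient failures with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except TransientError:
            if attempt == attempts - 1:
                raise
            # Waits base_delay, then 2x, then 4x, and so on.
            sleep(base_delay * 2**attempt)
    raise ValueError("attempts must be at least 1")


history = [
    {"role": "user", "content": "q1"},
    {"role": "assistant", "content": "a1"},
    {"role": "user", "content": "q2"},
    {"role": "assistant", "content": "a2"},
    {"role": "user", "content": "q3"},
]
print([m["content"] for m in trim_history(history, 4)])

calls = {"n": 0}


def flaky() -> str:
    # Fails twice, then succeeds, to exercise the retry path.
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("try again")
    return "ok"


print(call_with_retries(flaky, sleep=lambda _: None))
```

The sleep parameter is injectable so the backoff can be tested without real delays; in production the default time.sleep applies.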
Conclusion
The Messages API is the foundation for building with Claude. By mastering basic requests, multi-turn conversations, prefill techniques, and vision capabilities, you can create powerful conversational AI applications. Remember that the API is stateless—you manage the conversation history—and always check stop reasons to handle different response scenarios.
Key Takeaways
- The Messages API is stateless: You must send the full conversation history with every request; manage state on your end.
- Prefill guides responses: Use assistant messages to start Claude's response, but check model compatibility as newer models may not support it.
- Vision enables multimodal use cases: Claude can analyze images alongside text for document processing, visual QA, and more.
- Stop reasons dictate next steps: Always check stop_reason to determine whether to continue the conversation, increase tokens, or handle tool calls.
- Streaming improves UX: For real-time applications, use streaming to show responses as they're generated rather than waiting for the complete response.