Mastering the Messages API: Build Conversational AI with Claude
This guide teaches you how to use Claude's Messages API to build conversational AI applications, covering basic requests, multi-turn conversations, prefill techniques, and vision capabilities with Python and TypeScript examples.
Introduction
The Messages API is the primary way to interact with Claude programmatically. Whether you're building a chatbot, a content generator, or a complex agent system, understanding how to work with messages is essential. This guide covers everything from basic requests to advanced techniques like prefill and vision, with practical code examples you can use immediately.
Understanding the Messages API vs. Claude Managed Agents
Anthropic offers two approaches for building with Claude:
- Messages API: Direct model prompting access. Best for custom agent loops and fine-grained control.
- Claude Managed Agents: Pre-built, configurable agent harness that runs in managed infrastructure. Best for long-running tasks and asynchronous work.
Making Your First API Request
Let's start with the simplest possible request: sending a single message to Claude and getting a response.
Python Example
import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)
print(message)
TypeScript Example
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const message = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Hello, Claude' }
  ]
});
console.log(message);
Understanding the Response
The API returns a structured response:
{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello!"
    }
  ],
  "model": "claude-opus-4-7",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 12,
    "output_tokens": 6
  }
}
Key fields to note:
- content: An array of content blocks (text, images, etc.)
- stop_reason: Why the model stopped generating (e.g., "end_turn", "max_tokens")
- usage: Token counts for billing and optimization
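As a minimal sketch of reading these fields, the helper below works on the response as a plain dictionary shaped like the JSON above. The name `summarize_response` is hypothetical and not part of the SDK; with the SDK's response object you would read the same fields as attributes instead.

```python
from typing import Any


def summarize_response(response: dict[str, Any]) -> dict[str, Any]:
    """Collect the text, stop reason, and token total from a response dict.

    Hypothetical helper: assumes the response has the same shape as the
    JSON example above.
    """
    # Join the text of every "text" block and skip other block types.
    text = "".join(
        block["text"] for block in response["content"] if block["type"] == "text"
    )
    usage = response["usage"]
    return {
        "text": text,
        "stop_reason": response["stop_reason"],
        "total_tokens": usage["input_tokens"] + usage["output_tokens"],
    }


example = {
    "content": [{"type": "text", "text": "Hello!"}],
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 12, "output_tokens": 6},
}
print(summarize_response(example))
```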
Building Multi-Turn Conversations
The Messages API is stateless — you must send the full conversation history with each request. This gives you complete control over context.
Python Example
import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"},
        {"role": "assistant", "content": "Hello!"},
        {"role": "user", "content": "Can you describe LLMs to me?"}
    ]
)
print(message.content[0].text)
TypeScript Example
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const message = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Hello, Claude' },
    { role: 'assistant', content: 'Hello!' },
    { role: 'user', content: 'Can you describe LLMs to me?' }
  ]
});
console.log(message.content[0].text);
Important Notes
- Earlier turns don't need to originate from Claude — you can use synthetic assistant messages for context
- The conversation history grows with each turn, so manage token limits carefully
- Use prompt caching for long conversations to reduce costs
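Because the API is stateless, the application has to own the history. One possible way to do that is sketched below. `Conversation`, `send_fn`, and `fake_send` are hypothetical names introduced for this illustration; `fake_send` stands in for the real API call so the sketch runs offline.

```python
from typing import Callable

Message = dict[str, str]


class Conversation:
    """Keeps the running history that must be resent on every request.

    `send_fn` takes the full message list and returns the assistant's text.
    In a real application it would wrap client.messages.create(...).
    """

    def __init__(self, send_fn: Callable[[list[Message]], str]) -> None:
        self.send_fn = send_fn
        self.messages: list[Message] = []

    def ask(self, user_text: str) -> str:
        # Append the new user turn, then send the whole history.
        self.messages.append({"role": "user", "content": user_text})
        reply = self.send_fn(list(self.messages))
        # Store the assistant turn so the next request includes it.
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def fake_send(messages: list[Message]) -> str:
    """Stand-in for the API call so the sketch runs without a network."""
    return f"(reply to message {len(messages)})"


chat = Conversation(fake_send)
chat.ask("Hello, Claude")
chat.ask("Can you describe LLMs to me?")
print(len(chat.messages))
```

Each call to `ask` sends every earlier turn again, which is why the history (and the input token count) grows with the conversation.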
Prefill: Putting Words in Claude's Mouth
Prefill allows you to start Claude's response, guiding the model toward a specific output format or direction. This is powerful for structured outputs.
Example: Multiple Choice Answer
import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1,
    messages=[
        {
            "role": "user",
            "content": "What is Latin for Ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"
        },
        {
            "role": "assistant",
            "content": "The answer is ("
        }
    ]
)
print(message.content[0].text) # Output: "C"
When to Use Prefill
- Structured outputs: Force JSON or specific formats
- Multiple choice: Get concise answers
- Chain-of-thought: Start reasoning patterns
- Format control: Ensure consistent response structure
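The multiple-choice example prints only "C", which shows that the response holds the continuation and not the prefilled text. For a JSON prefill this means the prefill has to be joined back on before parsing. The sketch below illustrates that step on models that support prefill; `build_prefilled_messages` and `parse_prefilled_json` are hypothetical helpers, and the `continuation` string is an invented stand-in for `message.content[0].text`.

```python
import json
from typing import Any


def build_prefilled_messages(prompt: str, prefill: str) -> list[dict[str, str]]:
    """Return a message list whose last turn is a partial assistant reply."""
    return [
        {"role": "user", "content": prompt},
        {"role": "assistant", "content": prefill},
    ]


def parse_prefilled_json(prefill: str, continuation: str) -> Any:
    """Rejoin the prefill with the model's continuation and parse it as JSON."""
    return json.loads(prefill + continuation)


messages = build_prefilled_messages(
    "List two colors as a JSON object with the key 'colors'.", "{"
)
# Invented continuation, standing in for message.content[0].text.
continuation = '"colors": ["red", "blue"]}'
print(parse_prefilled_json("{", continuation))
```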
Limitations
Note: Prefill is not supported on Claude Mythos Preview, Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6. Requests using prefill with these models return a 400 error. Use structured outputs or system prompt instructions instead.
Working with Vision Capabilities
The Messages API supports image inputs, enabling visual understanding.
Python Example
import anthropic
import base64
client = anthropic.Anthropic()
# Read and base64-encode the image
with open("chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data
                    }
                },
                {
                    "type": "text",
                    "text": "What does this chart show?"
                }
            ]
        }
    ]
)
print(message.content[0].text)
Supported Image Formats
- PNG
- JPEG
- WEBP
- GIF (first frame only)
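The `media_type` in the image block has to match the file's format. A small sketch of one way to pick it from the file extension and reject anything outside the list above follows. `MEDIA_TYPES` and `image_block` are hypothetical names for this illustration, and checking the extension alone is a simplifying assumption (it does not inspect the file contents).

```python
import base64
from pathlib import Path

# Media types for the formats listed above, keyed by file extension.
MEDIA_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".webp": "image/webp",
    ".gif": "image/gif",
}


def image_block(filename: str, data: bytes) -> dict:
    """Build a base64 image content block, rejecting unsupported extensions."""
    suffix = Path(filename).suffix.lower()
    if suffix not in MEDIA_TYPES:
        raise ValueError(f"Unsupported image format: {suffix or filename}")
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": MEDIA_TYPES[suffix],
            "data": base64.b64encode(data).decode("utf-8"),
        },
    }


# Placeholder bytes; a real call would pass the contents of the image file.
block = image_block("chart.PNG", b"\x89PNG fake bytes")
print(block["source"]["media_type"])
```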
Handling Stop Reasons
Understanding why Claude stopped generating helps you handle different scenarios:
| Stop Reason | Meaning | Action |
|---|---|---|
| end_turn | Claude finished naturally | Continue conversation |
| max_tokens | Token limit reached | Increase max_tokens or split response |
| stop_sequence | Custom stop sequence triggered | Handle as designed |
| tool_use | Claude wants to use a tool | Execute tool and continue |
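The table can be turned into a simple dispatch step in application code. The sketch below is one way to do it; `next_action` and the action labels it returns are hypothetical names chosen for this illustration, not values defined by the API.

```python
def next_action(stop_reason: str) -> str:
    """Map a stop_reason to the action suggested in the table above."""
    actions = {
        "end_turn": "continue_conversation",
        "max_tokens": "raise_limit_or_split",
        "stop_sequence": "handle_stop_sequence",
        "tool_use": "run_tool_and_continue",
    }
    # Fall back to a visible marker for values this sketch does not know.
    return actions.get(stop_reason, "unknown_stop_reason")


for reason in ("end_turn", "max_tokens", "tool_use"):
    print(reason, "->", next_action(reason))
```

Returning an explicit fallback instead of raising keeps the loop running if a stop reason appears that this sketch does not cover.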
Best Practices
1. Manage Token Usage
- Monitor usage.input_tokens and usage.output_tokens in responses
- Use prompt caching for repeated context
- Set appropriate max_tokens to control costs
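Monitoring usage can be as simple as keeping running totals across requests. The sketch below assumes each response's usage is available as a dictionary with `input_tokens` and `output_tokens`, as in the response JSON shown earlier. `UsageTracker` and its budget check are hypothetical, and the budget value is arbitrary.

```python
from dataclasses import dataclass


@dataclass
class UsageTracker:
    """Running token totals across requests (hypothetical helper)."""

    input_tokens: int = 0
    output_tokens: int = 0

    def record(self, usage: dict[str, int]) -> None:
        # `usage` mirrors the "usage" object in the response JSON.
        self.input_tokens += usage["input_tokens"]
        self.output_tokens += usage["output_tokens"]

    def over_budget(self, max_total: int) -> bool:
        # True once the combined total exceeds the caller's budget.
        return self.input_tokens + self.output_tokens > max_total


tracker = UsageTracker()
tracker.record({"input_tokens": 12, "output_tokens": 6})
tracker.record({"input_tokens": 30, "output_tokens": 40})
print(tracker.input_tokens, tracker.output_tokens, tracker.over_budget(80))
```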
2. Handle Errors Gracefully
import anthropic
client = anthropic.Anthropic()
try:
    message = client.messages.create(
        model="claude-opus-4-7",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello"}]
    )
except anthropic.APIError as e:
    print(f"API Error: {e}")
    # Implement retry logic
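The retry logic mentioned in the comment could look like the following sketch, which uses exponential backoff. `with_retries` and its parameters are hypothetical, and the demonstration uses a fake function and a recorded `sleep` so it runs offline. Which exception classes are worth retrying is left to the caller, and the SDK may already retry some failures on its own, so check its documentation before layering this on top.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    call: Callable[[], T],
    retry_on: tuple[type[BaseException], ...],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call`, retrying on `retry_on` with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error.
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("max_attempts must be at least 1")


# Offline demonstration with a function that fails twice, then succeeds.
attempts = {"count": 0}
delays: list[float] = []


def flaky() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = with_retries(flaky, retry_on=(ConnectionError,), sleep=delays.append)
print(result, delays)
```

In an application, `call` would be a small function or lambda wrapping `client.messages.create(...)`, and `retry_on` would list only errors that are likely to be transient.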
3. Use System Prompts for Instructions
For models that don't support prefill, use system prompts:
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    system="You are a helpful assistant that always responds in JSON format.",
    messages=[
        {"role": "user", "content": "List three programming languages"}
    ]
)
4. Streaming for Better UX
For real-time applications, use streaming:
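import anthropic
client = anthropic.Anthropic()
stream = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)
for chunk in stream:
    # Only text deltas carry a .text attribute
    if chunk.type == "content_block_delta" and chunk.delta.type == "text_delta":
        print(chunk.delta.text, end="", flush=True)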
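Applications often need the complete text as well as the live output, for example to append the assistant turn to the conversation history. The sketch below collects text from a sequence of stream events. `collect_text` is a hypothetical helper, and the fake events only imitate the `type` and `delta.text` attributes used in the loop above so the example runs offline.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_text(events: Iterable) -> str:
    """Concatenate text from content_block_delta events, ignoring the rest."""
    parts = []
    for event in events:
        if event.type != "content_block_delta":
            continue
        # Some deltas may carry no text, so read the attribute defensively.
        text = getattr(event.delta, "text", None)
        if text:
            parts.append(text)
    return "".join(parts)


# Fake events standing in for a real stream (assumed shape, for offline use).
fake_stream = [
    SimpleNamespace(type="message_start"),
    SimpleNamespace(type="content_block_delta", delta=SimpleNamespace(text="Once ")),
    SimpleNamespace(type="content_block_delta", delta=SimpleNamespace(text="upon a time")),
    SimpleNamespace(type="message_stop"),
]
print(collect_text(fake_stream))
```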
Conclusion
The Messages API is your gateway to building powerful conversational AI applications with Claude. By mastering basic requests, multi-turn conversations, prefill techniques, and vision capabilities, you can create sophisticated interactions tailored to your specific use case.
Remember that the API is stateless — you control the conversation history. Use this to your advantage by carefully managing context, leveraging prefill for structured outputs, and handling stop reasons appropriately.
Key Takeaways
- The Messages API is stateless — always send the full conversation history with each request
- Prefill lets you guide Claude's response by starting it yourself, but check model compatibility
- Vision capabilities allow Claude to analyze images alongside text in a single request
- Monitor stop_reason to handle different completion scenarios (end_turn, max_tokens, tool_use)
- Use streaming for real-time applications and prompt caching for long conversations to optimize performance and costs