Mastering the Messages API: A Practical Guide to Building with Claude
This guide teaches you how to use Claude's Messages API for basic requests, multi-turn conversations, prefill techniques, and vision capabilities, with practical code examples in Python and TypeScript.
Claude's Messages API is the primary way to interact with Claude programmatically. Whether you're building a chatbot, a content generator, or a complex agent system, you need to understand how to structure requests and handle responses. This guide walks through the core patterns for working with the Messages API, from basic calls to prefill and vision.
Understanding the Messages API vs. Managed Agents
Anthropic offers two main ways to build with Claude:
- Messages API: Direct model prompting access. Best for custom agent loops and fine-grained control over every request and response.
- Claude Managed Agents: A pre-built, configurable agent harness that runs in managed infrastructure. Best for long-running tasks and asynchronous work.
Making Your First API Call
Let's start with the simplest possible request: sending a single message to Claude and getting a response.
Basic Request (Python)
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)
print(message)
Basic Request (TypeScript)
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const message = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Hello, Claude' }
  ]
});
console.log(message);
Understanding the Response
The API returns a structured JSON response:
{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello!"
    }
  ],
  "model": "claude-opus-4-7",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 12,
    "output_tokens": 6
  }
}
Key fields to note:
- content: An array of content blocks (usually text, but can include tool use blocks, images, etc.)
- stop_reason: Why the response ended (e.g., "end_turn", "max_tokens", "stop_sequence", or "tool_use")
- usage: Token counts for billing and context management
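As a minimal sketch of reading these fields, the helpers below work on the response as a plain dictionary shaped like the JSON above. The names `extract_text` and `total_tokens` are illustrative and are not part of the SDK. With the Python SDK you would read the same fields as attributes, for example `message.content[0].text`.

```python
from typing import Any


def extract_text(response: dict[str, Any]) -> str:
    """Join the text of every text block, skipping other block types."""
    return "".join(
        block["text"]
        for block in response.get("content", [])
        if block.get("type") == "text"
    )


def total_tokens(response: dict[str, Any]) -> int:
    """Sum the input and output token counts reported in `usage`."""
    usage = response.get("usage", {})
    return usage.get("input_tokens", 0) + usage.get("output_tokens", 0)


# Sample response trimmed from the JSON shown above.
sample = {
    "content": [{"type": "text", "text": "Hello!"}],
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 12, "output_tokens": 6},
}

print(extract_text(sample))  # Hello!
print(total_tokens(sample))  # 18
```

Filtering on the block type rather than indexing `content[0]` keeps the helper safe when the first block is not text.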
Building Multi-Turn Conversations
The Messages API is stateless — you must send the full conversation history with every request. This gives you complete control over the context but requires you to manage the conversation state on your end.
Multi-Turn Example (Python)
import anthropic

client = anthropic.Anthropic()

# Start with a simple greeting
messages = [
    {"role": "user", "content": "Hello, Claude"},
    {"role": "assistant", "content": "Hello!"},
    {"role": "user", "content": "Can you describe LLMs to me?"}
]

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=messages
)
print(message.content[0].text)
Multi-Turn Example (TypeScript)
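import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

// Annotate the array so each role is typed as 'user' | 'assistant', not string
const messages: Anthropic.MessageParam[] = [
  { role: 'user', content: 'Hello, Claude' },
  { role: 'assistant', content: 'Hello!' },
  { role: 'user', content: 'Can you describe LLMs to me?' }
];

const message = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 1024,
  messages: messages
});

// Content blocks are a union type, so narrow to a text block before reading .text
const first = message.content[0];
if (first.type === 'text') {
  console.log(first.text);
}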
Important Notes on Conversation History
- Synthetic assistant messages: Earlier assistant turns don't need to originate from Claude. You can pre-populate the conversation with messages Claude never generated, which is useful for providing examples or setting up a scenario.
- Token limits: Remember that every message in the history counts toward your context window. Be mindful of how much history you include.
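Because the API is stateless, a small wrapper that owns the history can make multi-turn code easier to follow. The sketch below is one possible shape, not an SDK feature: `Conversation`, `ask`, and the injected `send` callable are hypothetical names, and `fake_send` stands in for the real API call so the example runs offline.

```python
from typing import Callable

Message = dict[str, str]


class Conversation:
    """Keeps the full message history and resends it on every turn."""

    def __init__(self, send: Callable[[list[Message]], str]) -> None:
        # `send` is a hypothetical callable: it takes the history and
        # returns the assistant's reply text.
        self._send = send
        self.messages: list[Message] = []

    def ask(self, user_text: str) -> str:
        """Append the user turn, send everything, and record the reply."""
        self.messages.append({"role": "user", "content": user_text})
        reply = self._send(list(self.messages))  # pass a copy of the history
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def fake_send(history: list[Message]) -> str:
    """Stand-in for the API call so the sketch runs without network access."""
    return f"(reply after {len(history)} messages)"


chat = Conversation(fake_send)
chat.ask("Hello, Claude")
chat.ask("Can you describe LLMs to me?")
print(len(chat.messages))  # 4
```

In a real application, `send` would wrap `client.messages.create(...)` and return `message.content[0].text`.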
Prefill: Putting Words in Claude's Mouth
Prefill allows you to start Claude's response for it. This is useful for:
- Forcing structured outputs (e.g., JSON, multiple choice answers)
- Guiding the tone or direction of the response
- Reducing token usage by constraining the output
Prefill Example (Python)
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1,  # a single token is enough for the answer letter
    messages=[
        {
            "role": "user",
            "content": "What is Latin for Ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"
        },
        {
            # Prefill: Claude continues from this partial assistant turn
            "role": "assistant",
            "content": "The answer is ("
        }
    ]
)
print(message.content[0].text)  # Expected output: C
Prefill Example (TypeScript)
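import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const message = await client.messages.create({
  model: 'claude-sonnet-4-5',
  max_tokens: 1, // a single token is enough for the answer letter
  messages: [
    {
      role: 'user',
      content: 'What is Latin for Ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae'
    },
    {
      // Prefill: Claude continues from this partial assistant turn
      role: 'assistant',
      content: 'The answer is ('
    }
  ]
});

// Narrow to a text block before reading .text
const first = message.content[0];
if (first.type === 'text') {
  console.log(first.text); // Expected output: C
}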
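Prefill can also steer Claude toward JSON output. The sketch below assumes the response contains only the continuation and not the prefill itself, so the two must be rejoined before parsing. The helper names are hypothetical, `continuation` is a hand-written stand-in for `message.content[0].text`, and the trailing-whitespace guard reflects an assumption that such a prefill would be rejected by the API.

```python
import json


def build_prefill_messages(prompt: str, prefill: str) -> list[dict[str, str]]:
    """Return a user turn followed by a partial assistant turn (the prefill)."""
    if prefill != prefill.rstrip():
        # Assumption: a prefill ending in whitespace is rejected, so fail early.
        raise ValueError("prefill must not end with whitespace")
    return [
        {"role": "user", "content": prompt},
        {"role": "assistant", "content": prefill},
    ]


def parse_prefilled_json(prefill: str, continuation: str) -> dict:
    """Rejoin the prefill with the model's continuation and parse the result."""
    return json.loads(prefill + continuation)


messages = build_prefill_messages(
    "Return the ant's taxonomic family as JSON with a single key 'family'.",
    "{",
)

# Hand-written stand-in for message.content[0].text from a real call.
continuation = '"family": "Formicidae"}'
print(parse_prefilled_json("{", continuation))
```

A real continuation may not be valid JSON, so production code would typically catch `json.JSONDecodeError` around the parse.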
Prefill Limitations
- Not supported on: Claude Mythos Preview, Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6
- Error handling: Requests using prefill with these models return a 400 error
- Alternative: Use structured outputs or system prompt instructions instead
Working with Vision: Sending Images
The Messages API supports image inputs, enabling Claude to analyze and describe visual content.
Vision Example (Python)
import anthropic
import base64

client = anthropic.Anthropic()

# Read and encode the image
with open("path/to/image.jpg", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/jpeg",
                        "data": image_data
                    }
                },
                {
                    "type": "text",
                    "text": "Describe this image in detail."
                }
            ]
        }
    ]
)
print(message.content[0].text)
Vision Example (TypeScript)
import Anthropic from '@anthropic-ai/sdk';
import * as fs from 'fs';

const client = new Anthropic();

// Read and encode the image
const imageBuffer = fs.readFileSync('path/to/image.jpg');
const imageBase64 = imageBuffer.toString('base64');

const message = await client.messages.create({
  model: 'claude-opus-4-7',
  max_tokens: 1024,
  messages: [
    {
      role: 'user',
      content: [
        {
          type: 'image',
          source: {
            type: 'base64',
            media_type: 'image/jpeg',
            data: imageBase64
          }
        },
        {
          type: 'text',
          text: 'Describe this image in detail.'
        }
      ]
    }
  ]
});

// Narrow to a text block before reading .text
const first = message.content[0];
if (first.type === 'text') {
  console.log(first.text);
}
Supported Image Formats
- JPEG
- PNG
- GIF
- WebP
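Since the `media_type` must match the image you send, it can help to derive it from the file rather than hard-coding it. The sketch below picks the media type from the file extension for the four formats listed above. `MEDIA_TYPES` and `image_block` are hypothetical names, and the extension check trusts the file name without inspecting the bytes.

```python
import base64
import tempfile
from pathlib import Path

# Extension-to-media-type map for the four formats listed above.
MEDIA_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def image_block(path: str) -> dict:
    """Build a base64 image content block, rejecting unsupported extensions."""
    file = Path(path)
    media_type = MEDIA_TYPES.get(file.suffix.lower())
    if media_type is None:
        raise ValueError(f"unsupported image format: {file.suffix or '(none)'}")
    data = base64.b64encode(file.read_bytes()).decode("utf-8")
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": data},
    }


# Demo with a throwaway file; the bytes are placeholders, not a real image.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.PNG"
    demo.write_bytes(b"placeholder")
    block = image_block(str(demo))

print(block["source"]["media_type"])  # image/png
```

The returned dictionary can be placed in a user message's `content` list next to a text block, as in the vision examples above.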
Handling Stop Reasons
Understanding why Claude stopped generating is crucial for building robust applications. The stop_reason field can be one of:
| Stop Reason | Meaning |
|---|---|
| "end_turn" | Claude finished its response naturally |
| "max_tokens" | The response was cut off due to the max_tokens limit |
| "stop_sequence" | Claude encountered a custom stop sequence you defined |
| "tool_use" | Claude wants to use a tool (relevant for agent implementations) |
Example: Handling Stop Reasons
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a long essay about AI"}]
)

if message.stop_reason == "max_tokens":
    print("Response was truncated. Consider increasing max_tokens.")
    # You could continue the conversation here
elif message.stop_reason == "end_turn":
    print("Response completed successfully.")
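When all four stop reasons matter, a lookup table can be easier to extend than a chain of `if` statements. The sketch below maps each value to a coarse action label. The function name and the labels are hypothetical and only illustrate the dispatch pattern.

```python
def next_step(stop_reason: str) -> str:
    """Map a stop_reason to a coarse action label (labels are hypothetical)."""
    actions = {
        "end_turn": "done",
        "max_tokens": "retry_with_higher_limit",
        "stop_sequence": "done",
        "tool_use": "run_tool",
    }
    # Unknown values fall through to a safe default rather than raising.
    return actions.get(stop_reason, "inspect_manually")


for reason in ("end_turn", "max_tokens", "tool_use", "something_new"):
    print(reason, "->", next_step(reason))
```

The default branch means a stop reason this code does not know about is flagged for review instead of crashing the application.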
Best Practices for the Messages API
1. Manage Context Window Efficiently
- Trim old messages when conversations get long
- Use system prompts for instructions that apply to the entire conversation
- Be mindful of token usage — every message in history counts toward your context limit
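One simple way to trim a long conversation is to keep only the most recent messages. The sketch below is a rough heuristic under stated assumptions: it counts messages rather than tokens, and it assumes the kept history should open with a user turn. `trim_history` and `max_messages` are hypothetical names.

```python
def trim_history(messages: list[dict], max_messages: int) -> list[dict]:
    """Keep the most recent messages, starting from a user turn."""
    if max_messages < 1:
        raise ValueError("max_messages must be at least 1")
    trimmed = messages[-max_messages:]
    # Drop leading assistant turns so the kept history opens with a user message.
    while trimmed and trimmed[0]["role"] != "user":
        trimmed = trimmed[1:]
    return trimmed


history = [
    {"role": "user", "content": "Hello, Claude"},
    {"role": "assistant", "content": "Hello!"},
    {"role": "user", "content": "Can you describe LLMs to me?"},
    {"role": "assistant", "content": "(earlier reply)"},
    {"role": "user", "content": "Summarize that in one line."},
]

print([m["role"] for m in trim_history(history, 4)])
# ['user', 'assistant', 'user']
```

A token-based budget would be more precise. The `usage` field in each response gives real counts you could use to decide when to trim.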
2. Handle Errors Gracefully
- Implement retry logic for transient errors
- Validate input before sending (e.g., image size, message format)
- Monitor stop_reason to detect truncation or unexpected behavior
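Retry logic for transient errors is often written as exponential backoff. The sketch below is a generic wrapper, not an SDK feature: `with_retries` and its parameters are hypothetical, and the `flaky` function simulates a call that fails twice before succeeding. The SDK may already retry some errors on its own, so check its documentation before layering this on top.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    call: Callable[[], T],
    retryable: tuple[type[Exception], ...],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on `retryable` errors with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except retryable:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")


attempts = {"count": 0}


def flaky() -> str:
    """Simulated call that fails twice, then succeeds."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TimeoutError("simulated transient failure")
    return "ok"


delays: list[float] = []  # record delays instead of actually sleeping
print(with_retries(flaky, (TimeoutError,), sleep=delays.append))  # ok
print(delays)  # [0.5, 1.0]
```

Passing `sleep` as a parameter keeps the wrapper testable. In production you would leave it at the default `time.sleep` and list only the error types you consider transient.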
3. Optimize for Latency
- Use streaming for real-time applications (see the Streaming Messages guide)
- Keep max_tokens as low as possible for your use case
- Consider prompt caching for repeated system prompts or large context
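As a rough sketch of the prompt caching suggestion, the helper below assembles request parameters with the system prompt marked as cacheable. It assumes the `system` parameter accepts a list of text blocks and that a `cache_control` entry of type `ephemeral` marks a cache breakpoint. Confirm both, along with any minimum prompt length, against the prompt caching documentation. `build_request` is a hypothetical name.

```python
def build_request(system_text: str, messages: list[dict], max_tokens: int = 256) -> dict:
    """Assemble request parameters with a cacheable system prompt."""
    return {
        "model": "claude-opus-4-7",  # model name taken from the examples above
        "max_tokens": max_tokens,    # kept low, per the latency advice above
        "system": [
            {
                "type": "text",
                "text": system_text,
                # Assumption: marks this block as a cache breakpoint.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": messages,
    }


params = build_request(
    "You are a concise assistant.",
    [{"role": "user", "content": "Hello, Claude"}],
)
print(sorted(params))
```

The resulting dictionary could then be passed along as `client.messages.create(**params)`.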
4. Security Considerations
- The Messages API is eligible for Zero Data Retention (ZDR) — when your organization has a ZDR arrangement, data sent through this feature is not stored after the API response is returned
- Never send sensitive data unless you have appropriate agreements in place
Conclusion
The Messages API is the foundation for building with Claude programmatically. By mastering basic requests, multi-turn conversations, prefill techniques, and vision capabilities, you can build powerful applications that leverage Claude's full potential.
Start with simple requests, then gradually add complexity as you become comfortable with the API patterns. Remember that the API is stateless — you control the conversation history, which gives you maximum flexibility but also requires careful management of context.
Key Takeaways
- The Messages API is stateless — you must send the full conversation history with every request, giving you complete control over context
- Prefill lets you guide Claude's responses by starting its reply, but it's not supported on all models (check the documentation for your specific model)
- Vision capabilities allow you to send images alongside text, enabling multimodal interactions
- Monitor stop_reason to handle truncation, tool use, and other edge cases in your application
- Optimize context management by trimming old messages and using system prompts to stay within token limits