Mastering the Claude Messages API: From Basic Requests to Advanced Patterns
This guide teaches you how to work with the Claude Messages API, including making basic requests, building multi-turn conversations, using prefill to shape responses, and sending images for vision tasks.
Introduction
The Claude Messages API is the primary way to interact with Claude programmatically. Whether you're building a chatbot, a content generation tool, or a complex agent system, understanding how to structure your API calls is essential. This guide walks you through the core patterns for working with the Messages API, from simple requests to advanced techniques like prefill and vision.
Basic Request and Response
At its simplest, the Messages API accepts a list of messages and returns a response. Here's a minimal example using Python:
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)
print(message)
The response includes:
- id: A unique identifier for the message
- role: Always "assistant" for responses
- content: An array of content blocks (usually text)
- model: The model used
- stop_reason: Why the generation stopped (e.g., "end_turn", "max_tokens")
- usage: Token counts for input and output
{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [{"type": "text", "text": "Hello!"}],
  "model": "claude-opus-4-7",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {"input_tokens": 12, "output_tokens": 6}
}
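Because content is an array of blocks, it can be convenient to join the text blocks rather than index into the array directly. The following is a minimal sketch that works on a plain dictionary shaped like the JSON response shown in this section; extract_text is a hypothetical helper name, not part of the SDK.

```python
import json

def extract_text(response: dict) -> str:
    """Hypothetical helper: join the text of every text block in the response."""
    return "".join(
        block["text"]
        for block in response.get("content", [])
        if block.get("type") == "text"
    )

# A response body with the same shape as the example in this section.
raw = """
{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [{"type": "text", "text": "Hello!"}],
  "model": "claude-opus-4-7",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {"input_tokens": 12, "output_tokens": 6}
}
"""
response = json.loads(raw)

text = extract_text(response)
total_tokens = response["usage"]["input_tokens"] + response["usage"]["output_tokens"]

print(text)                     # Hello!
print(response["stop_reason"])  # end_turn
print(total_tokens)             # 18
```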
Multi-Turn Conversations
The Messages API is stateless — you must send the full conversation history with every request. This gives you complete control over context and allows you to build dynamic conversations over time.
Building a Conversation
To continue a conversation, simply append new messages to the history:
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"},
        {"role": "assistant", "content": "Hello!"},
        {"role": "user", "content": "Can you describe LLMs to me?"}
    ]
)
print(message.content[0].text)
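Since the API keeps no state, the application has to own the history. One way to do that, sketched here under assumptions, is a small wrapper that appends each user turn and each reply to a list. The Conversation class and the send callable are hypothetical names; send stands in for whatever function actually calls the API and returns the reply text.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]

class Conversation:
    """Keeps the full history, because the API itself remembers nothing."""

    def __init__(self, send: Callable[[List[Message]], str]):
        # `send` is an assumption: it receives the whole message list
        # and returns the assistant's reply as a string.
        self.send = send
        self.messages: List[Message] = []

    def ask(self, user_text: str) -> str:
        self.messages.append({"role": "user", "content": user_text})
        reply = self.send(list(self.messages))  # send a copy of the full history
        self.messages.append({"role": "assistant", "content": reply})
        return reply

# Stand-in for a real API call so the sketch runs offline.
def fake_send(messages: List[Message]) -> str:
    return f"(reply to turn {len(messages)})"

chat = Conversation(fake_send)
chat.ask("Hello, Claude")
chat.ask("Can you describe LLMs to me?")

print([m["role"] for m in chat.messages])
```

In a real application, send could wrap client.messages.create(...) and return message.content[0].text.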
Synthetic Assistant Messages
The assistant messages in the history don't have to be real Claude responses. You can inject synthetic assistant messages that you wrote yourself to guide the conversation or simulate context. For example, you might pre-populate a conversation with an earlier exchange before asking a follow-up question:
messages = [
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."},
    {"role": "user", "content": "What is its population?"}
]
This is useful for:
- Providing context from previous sessions
- Simulating a specific assistant persona
- Building few-shot examples into the conversation
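The few-shot case can be sketched as a small function that turns example pairs into alternating user and synthetic assistant turns, then appends the real question. The function name few_shot_messages and the sample classification prompts are illustrative assumptions.

```python
from typing import Dict, List, Tuple

def few_shot_messages(
    examples: List[Tuple[str, str]], question: str
) -> List[Dict[str, str]]:
    """Build a message list where each example becomes a user/assistant pair."""
    messages: List[Dict[str, str]] = []
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        # Synthetic assistant turn: written by us, not generated by Claude.
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": question})
    return messages

examples = [
    ("Classify: 'I love this phone.'", "positive"),
    ("Classify: 'The battery died in an hour.'", "negative"),
]
messages = few_shot_messages(examples, "Classify: 'Works as described.'")

for m in messages:
    print(m["role"], "->", m["content"])
```

The resulting list can be passed as the messages argument of a request.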
Prefill: Putting Words in Claude's Mouth
Prefill allows you to start Claude's response by providing the beginning of its answer. This is powerful for:
- Constraining responses to specific formats
- Guiding the model toward a particular structure
- Getting single-word or single-token answers
Basic Prefill Example
Here's how to use prefill to get a multiple-choice answer:
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1,
    messages=[
        {
            "role": "user",
            "content": "What is latin for Ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"
        },
        {
            "role": "assistant",
            "content": "The answer is ("
        }
    ]
)
print(message.content[0].text)  # Output: "C"
Important Prefill Limitations
- Not supported on: Claude Mythos Preview, Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6
- These models return a 400 error if you attempt prefill
- Alternative: Use structured outputs or system prompt instructions instead
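One way to avoid the 400 error is to detect a trailing assistant turn before sending and move it into a system prompt instruction. The sketch below is an illustration under assumptions: prepare_request is a hypothetical helper, the instruction wording is only an example, and NO_PREFILL_MODELS contains just the one model ID that appears in this guide, so you would need to add the IDs of the other listed models yourself.

```python
from typing import Dict, List, Set

# Assumption: only the model ID used elsewhere in this guide is listed here.
# Add the IDs of the other models that reject prefill.
NO_PREFILL_MODELS: Set[str] = {"claude-opus-4-7"}

def prepare_request(
    model: str,
    messages: List[Dict[str, str]],
    no_prefill: Set[str] = NO_PREFILL_MODELS,
) -> dict:
    """Return request kwargs, replacing prefill with a system instruction if needed."""
    request = {"model": model, "messages": list(messages)}
    ends_with_prefill = bool(messages) and messages[-1]["role"] == "assistant"
    if model in no_prefill and ends_with_prefill:
        prefill = messages[-1]["content"]
        request["messages"] = list(messages[:-1])  # drop the prefill turn
        # Illustrative wording; a system instruction is a softer constraint than prefill.
        request["system"] = f"Begin your reply with exactly this text: {prefill}"
    return request

msgs = [
    {"role": "user", "content": "What is latin for Ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"},
    {"role": "assistant", "content": "The answer is ("},
]
print(prepare_request("claude-opus-4-7", msgs).keys())
print(prepare_request("claude-sonnet-4-5", msgs).keys())
```

Note that on the fallback path the reply is expected to contain the requested opening text itself, whereas a prefilled reply contains only the continuation.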
When to Use Prefill
- Classification tasks: Force Claude to output a specific label
- JSON extraction: Start with {" so the response begins as a JSON object
- Format control: Begin a list or table structure
- Single-token answers: Combine with max_tokens=1 for constrained responses
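With JSON prefill, the reply contains only the continuation of the prefilled text, so the prefill has to be prepended again before parsing. The sketch below shows that step; parse_prefilled_json is a hypothetical helper, and the completion string is a stand-in for message.content[0].text.

```python
import json

def parse_prefilled_json(prefill: str, completion: str) -> dict:
    """Rejoin the prefill with the model's continuation, then parse the result."""
    return json.loads(prefill + completion)

prefill = '{"'
# Stand-in for the text returned by the API after the prefill.
completion = 'name": "Ada", "language": "Python"}'

data = parse_prefilled_json(prefill, completion)
print(data["name"])      # Ada
print(data["language"])  # Python
```

json.loads raises json.JSONDecodeError if the combined text is not valid JSON, for example when the reply was cut off by max_tokens, so callers may want to catch that.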
Vision: Sending Images to Claude
Claude can analyze images sent via the Messages API. This is useful for:
- Document analysis
- Image description
- Visual question answering
Image Request Format
Images are sent as content blocks with a source object:
import anthropic
import base64

client = anthropic.Anthropic()

# Read and encode the image
with open("chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data
                    }
                },
                {
                    "type": "text",
                    "text": "Describe this chart in detail."
                }
            ]
        }
    ]
)
print(message.content[0].text)
Supported Media Types
- image/jpeg
- image/png
- image/gif (first frame only)
- image/webp
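The media_type value has to match one of the types listed above. A minimal sketch of picking it from a file extension follows; the extension mapping and the media_type_for helper are assumptions made for illustration, and a file's extension is not proof of its actual contents.

```python
from pathlib import Path

# Assumed mapping from common file extensions to the media types listed above.
SUPPORTED_MEDIA_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}

def media_type_for(filename: str) -> str:
    """Return the media type for a filename, or raise if it is not supported."""
    suffix = Path(filename).suffix.lower()
    if suffix not in SUPPORTED_MEDIA_TYPES:
        raise ValueError(f"Unsupported image type: {filename!r}")
    return SUPPORTED_MEDIA_TYPES[suffix]

print(media_type_for("chart.png"))   # image/png
print(media_type_for("photo.JPEG"))  # image/jpeg
```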
Tips for Vision Requests
- Combine with text: Always include a text prompt alongside the image for best results
- Image size: Larger images consume more tokens; resize if needed
- Multiple images: You can send multiple images in a single message
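To send several images with one prompt, the content array can hold one image block per image followed by a text block. The sketch below builds such an array from raw bytes; build_vision_content is a hypothetical helper, and the byte strings are placeholders rather than real image data.

```python
import base64
from typing import Dict, List, Tuple

def build_vision_content(images: List[Tuple[str, bytes]], prompt: str) -> List[Dict]:
    """Build a content array: one base64 image block per image, then the text prompt."""
    content: List[Dict] = []
    for media_type, raw in images:
        content.append({
            "type": "image",
            "source": {
                "type": "base64",
                "media_type": media_type,
                "data": base64.b64encode(raw).decode("utf-8"),
            },
        })
    content.append({"type": "text", "text": prompt})
    return content

# Placeholder bytes; in practice these would come from reading image files.
content = build_vision_content(
    [("image/png", b"fake-png-bytes"), ("image/jpeg", b"fake-jpeg-bytes")],
    "Compare these two charts.",
)
print([block["type"] for block in content])  # ['image', 'image', 'text']
```

The returned list can be used as the content value of a user message.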
Handling Stop Reasons
Every response includes a stop_reason field that tells you why generation stopped:
| Stop Reason | Meaning |
|---|---|
| end_turn | Claude finished naturally |
| max_tokens | Hit the token limit; response may be truncated |
| stop_sequence | A custom stop sequence was encountered |
| tool_use | Claude wants to use a tool (for agent workflows) |
If the stop reason is max_tokens, you should continue the conversation by sending the partial response back and asking Claude to continue.
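The handling described here can be sketched as two small functions: one that maps a stop reason to a next step, and one that builds the follow-up message list for a truncated reply. The action labels, the function names, and the wording of the follow-up prompt are all illustrative assumptions.

```python
from typing import Dict, List

def next_action(stop_reason: str) -> str:
    """Map a stop reason to a hypothetical action label for the calling code."""
    actions = {
        "end_turn": "done",
        "stop_sequence": "done",
        "max_tokens": "continue",
        "tool_use": "run_tool",
    }
    return actions.get(stop_reason, "unknown")

def continuation_messages(
    history: List[Dict[str, str]],
    partial_text: str,
    nudge: str = "Please continue from where you left off.",
) -> List[Dict[str, str]]:
    """Append the truncated reply and a follow-up request to the history."""
    return history + [
        {"role": "assistant", "content": partial_text},
        {"role": "user", "content": nudge},
    ]

history = [{"role": "user", "content": "Can you describe LLMs to me?"}]
if next_action("max_tokens") == "continue":
    follow_up = continuation_messages(history, "Large language models are")
    print([m["role"] for m in follow_up])  # ['user', 'assistant', 'user']
```

After the second request returns, the application would join the partial text and the new reply itself.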
Best Practices
1. Manage Token Usage
- Monitor usage.input_tokens and usage.output_tokens to control costs
- Use max_tokens to limit response length
- Consider prompt caching for repeated system prompts
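A simple way to monitor usage is to accumulate the token counts from each response. The sketch below assumes usage data is available as a dictionary with input_tokens and output_tokens keys, as in the JSON response shown earlier; UsageTracker is a hypothetical class, and it deliberately tracks counts only, not prices.

```python
from dataclasses import dataclass

@dataclass
class UsageTracker:
    """Running totals of tokens across several responses."""
    input_tokens: int = 0
    output_tokens: int = 0
    requests: int = 0

    def record(self, usage: dict) -> None:
        # Missing keys count as zero so a partial usage dict does not raise.
        self.input_tokens += usage.get("input_tokens", 0)
        self.output_tokens += usage.get("output_tokens", 0)
        self.requests += 1

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens

tracker = UsageTracker()
tracker.record({"input_tokens": 12, "output_tokens": 6})
tracker.record({"input_tokens": 30, "output_tokens": 100})

print(tracker.requests)      # 2
print(tracker.total_tokens)  # 148
```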
2. Handle Errors Gracefully
- Implement retry logic with exponential backoff
- Check for 400 errors (invalid requests) and 429 errors (rate limits)
- Validate your message structure before sending
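Retry with exponential backoff can be sketched generically, independent of any SDK. In the example below, RetryableError is a hypothetical stand-in for a rate-limit style failure, and the sleep function is injectable so the demonstration runs without actually waiting.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

class RetryableError(Exception):
    """Hypothetical stand-in for a transient failure such as a 429 rate limit."""

def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on RetryableError with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RetryableError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error
            # Delay doubles each attempt; jitter avoids synchronized retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    raise ValueError("max_attempts must be at least 1")

# Demo: fail twice, then succeed. Delays are recorded instead of slept.
calls = {"n": 0}

def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise RetryableError("rate limited")
    return "ok"

delays = []
result = call_with_backoff(flaky, sleep=delays.append)
print(result, len(delays))
```

Only transient failures belong in the retry path. An invalid request that returns a 400 will fail the same way on every attempt, so it should be fixed rather than retried.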
3. Optimize for Your Use Case
- Chatbots: Use multi-turn patterns with full history
- Classification: Use prefill with max_tokens=1
- Content generation: Use system prompts and longer max_tokens
- Vision tasks: Combine images with clear text instructions
4. Security Considerations
- Never expose API keys in client-side code
- Validate and sanitize user input before sending to the API
- Be aware of data retention policies (zero data retention, or ZDR, is available for eligible organizations)
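Keeping the key out of client-side code usually means reading it from the server's environment at startup. The sketch below shows one way to do that and fail early if the key is missing; load_api_key is a hypothetical helper, and ANTHROPIC_API_KEY is the environment variable the Python SDK reads by default when anthropic.Anthropic() is called without arguments.

```python
import os

def load_api_key(env_var: str = "ANTHROPIC_API_KEY") -> str:
    """Read the API key from the environment; never hard-code it in source files."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set in the server environment")
    return key
```

Calling this once at startup surfaces a missing key immediately, rather than on the first request.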
Conclusion
The Claude Messages API is flexible and powerful. By mastering basic requests, multi-turn conversations, prefill, and vision, you can build sophisticated applications that leverage Claude's capabilities. Remember that the API is stateless — you control the context by managing the conversation history yourself.
Key Takeaways
- The Messages API is stateless — always send the full conversation history with each request
- Use prefill to guide Claude's responses, but check model compatibility (not supported on Claude Mythos Preview, Claude Opus 4.7, Claude Opus 4.6, or Claude Sonnet 4.6)
- Vision capabilities allow you to send images alongside text prompts for analysis
- Monitor stop_reason to handle truncated responses and tool use scenarios
- Synthetic assistant messages give you full control over conversation context and few-shot examples