Mastering the Messages API: A Practical Guide to Building with Claude
Learn how to use Claude's Messages API for single-turn queries, multi-turn conversations, prefill techniques, and vision tasks with practical code examples.
This guide teaches you how to use Claude's Messages API to send requests, manage multi-turn conversations, prefill responses, and handle images, with code examples in Python.
Introduction
Claude's Messages API is the primary way to interact with the model programmatically. Whether you're building a chatbot, a content generator, or a tool-using agent, understanding the Messages API is essential. This guide covers the core patterns you'll use every day: basic requests, multi-turn conversations, prefill techniques, and vision capabilities.
Note: Anthropic offers two ways to build with Claude: the Messages API (direct model access, best for custom agent loops) and Claude Managed Agents (pre-built harness for long-running tasks). This guide focuses on the Messages API.
Basic Request and Response
At its simplest, a Messages API call sends a list of messages and returns Claude's response. Here's a minimal example in Python:
import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)
print(message)
Understanding the Response
The response is a JSON object with several key fields:
{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Hello!"
    }
  ],
  "model": "claude-opus-4-7",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 12,
    "output_tokens": 6
  }
}
- content: An array of content blocks (usually text, but can include tool use calls).
- stop_reason: Why Claude stopped generating. Common values: "end_turn" (normal completion), "max_tokens" (hit token limit), "tool_use" (Claude wants to call a tool).
- usage: Token counts for billing and context management.
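As a rough sketch of how these fields might be read in practice, the helpers below pull the text out of a response and add up its token usage. They operate on the plain JSON form shown above, loaded as a Python dict. The names `extract_text` and `summarize` are invented for this illustration and are not part of the SDK.

```python
from typing import Any


def extract_text(response: dict[str, Any]) -> str:
    """Concatenate the text of every "text" block, skipping other block types."""
    return "".join(
        block.get("text", "")
        for block in response.get("content", [])
        if block.get("type") == "text"
    )


def summarize(response: dict[str, Any]) -> dict[str, Any]:
    """Collect the fields application code usually inspects first."""
    usage = response.get("usage", {})
    return {
        "text": extract_text(response),
        "stop_reason": response.get("stop_reason"),
        "total_tokens": usage.get("input_tokens", 0) + usage.get("output_tokens", 0),
    }


# The sample response from this guide, trimmed to the fields used here.
sample = {
    "content": [{"type": "text", "text": "Hello!"}],
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 12, "output_tokens": 6},
}
summary = summarize(sample)
```

With the Python SDK the response is an object rather than a dict, so you would read attributes such as message.content[0].text and message.stop_reason, as the other examples in this guide do.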
Multi-Turn Conversations
The Messages API is stateless — you must send the full conversation history with every request. This gives you complete control over context but means you need to manage state on your end.
Building a Conversation
To continue a conversation, append new messages to the messages array:
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"},
        {"role": "assistant", "content": "Hello!"},
        {"role": "user", "content": "Can you describe LLMs to me?"}
    ]
)
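Because the API is stateless, most applications wrap the history in a small helper. The following is a minimal sketch of one way to do that, not a prescribed pattern. The `Conversation` class and the `send` callable are hypothetical; `fake_send` stands in for a real API call so the example runs offline.

```python
from typing import Callable

Message = dict[str, str]


class Conversation:
    """Keeps the running history and resends all of it on every turn."""

    def __init__(self, send: Callable[[list[Message]], str]) -> None:
        self._send = send
        self.messages: list[Message] = []

    def ask(self, user_text: str) -> str:
        self.messages.append({"role": "user", "content": user_text})
        # Pass a copy of the full history; the API keeps no state between calls.
        reply = self._send(list(self.messages))
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def fake_send(messages: list[Message]) -> str:
    """Offline stand-in for a real request (hypothetical)."""
    return f"(reply to message {len(messages)})"


chat = Conversation(fake_send)
chat.ask("Hello, Claude")
chat.ask("Can you describe LLMs to me?")
```

In a real application, `send` would call client.messages.create with the model, max_tokens, and the messages list, then return message.content[0].text.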
Synthetic Assistant Messages
You can also inject synthetic assistant messages — responses that didn't actually come from Claude. This is useful for:
- Providing examples of the response format you want
- Correcting context (e.g., "Actually, I already know X")
- Simulating multi-step workflows
messages = [
    {"role": "user", "content": "Summarize this article: ..."},
    # Pre-seed a good summary format
    {"role": "assistant", "content": "Here is a summary in bullet points:\n- "}
]
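To provide format examples, you can also write complete user and assistant pairs by hand and place the real question last. The sketch below assumes a hypothetical helper named `few_shot_messages`. Because the resulting list ends with a user message, it does not rely on prefill.

```python
def few_shot_messages(
    examples: list[tuple[str, str]], query: str
) -> list[dict[str, str]]:
    """Turn (user_text, ideal_reply) pairs plus a final query into a messages list."""
    messages: list[dict[str, str]] = []
    for user_text, ideal_reply in examples:
        messages.append({"role": "user", "content": user_text})
        # Hand-written reply: Claude never produced this text.
        messages.append({"role": "assistant", "content": ideal_reply})
    messages.append({"role": "user", "content": query})
    return messages


messages = few_shot_messages(
    examples=[
        ("Summarize: The cat sat on the mat.", "- A cat sat on a mat."),
        ("Summarize: Rain fell all day in Oslo.", "- It rained all day in Oslo."),
    ],
    query="Summarize this article: ...",
)
```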
Prefill: Putting Words in Claude's Mouth
Prefill allows you to start Claude's response by providing the beginning of its output. This is a powerful technique for:
- Constraining output format (e.g., JSON, multiple choice)
- Guiding tone or style
- Reducing token usage by steering the response early
Example: Multiple Choice Answer
message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1,
    messages=[
        {
            "role": "user",
            "content": "What is Latin for ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"
        },
        {
            "role": "assistant",
            "content": "The answer is ("
        }
    ]
)
print(message.content[0].text) # Output: "C"
By setting max_tokens=1 and prefilling with "The answer is (", Claude only needs to output the letter. This is efficient and predictable.
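As the output above shows, the response contains only the continuation, not the prefill itself. If you need the full sentence or want to validate the letter, you have to do that yourself. Here is a small sketch with hypothetical helper names (`full_answer`, `parse_choice`):

```python
from typing import Optional


def full_answer(prefill: str, continuation: str) -> str:
    """Rebuild the complete assistant text from the prefill and the model output."""
    return prefill + continuation


def parse_choice(continuation: str, allowed: str = "ABC") -> Optional[str]:
    """Return the chosen letter, or None if the output is not a valid option."""
    letter = continuation.strip()[:1].upper()
    # The explicit emptiness check matters: "" is a substring of every string.
    return letter if letter and letter in allowed else None


answer = full_answer("The answer is (", "C")
choice = parse_choice("C")
```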
Important Limitations
- Prefill is not supported on Claude Mythos Preview, Claude Opus 4.7, Claude Opus 4.6, and Claude Sonnet 4.6. Requests that include a prefilled assistant message on these models return a 400 error.
- For those models, use structured outputs or system prompt instructions instead.
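One way to handle both cases in the same code path is to build the request differently depending on the model. This is only a sketch under stated assumptions: `NO_PREFILL_MODELS` lists just the one model ID that appears in this guide, the system prompt wording and the larger max_tokens value are illustrative choices, and `build_choice_request` is a hypothetical helper.

```python
from typing import Any

# Assumption: only the model ID used elsewhere in this guide is listed.
# Add the API IDs of the other models named above before relying on this.
NO_PREFILL_MODELS = {"claude-opus-4-7"}


def build_choice_request(model: str, question: str) -> dict[str, Any]:
    """Build keyword arguments for a multiple-choice request."""
    request: dict[str, Any] = {
        "model": model,
        "max_tokens": 1,
        "messages": [{"role": "user", "content": question}],
    }
    if model in NO_PREFILL_MODELS:
        # No prefill available: steer the format with a system prompt instead.
        request["system"] = "Reply with only the letter of the correct option."
        request["max_tokens"] = 16  # illustrative headroom, not a documented value
    else:
        request["messages"].append({"role": "assistant", "content": "The answer is ("})
    return request


question = "What is Latin for ant? (A) Apoidea, (B) Rhopalocera, (C) Formicidae"
with_prefill = build_choice_request("claude-sonnet-4-5", question)
without_prefill = build_choice_request("claude-opus-4-7", question)
```

The resulting dict can be passed along as client.messages.create(**request).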
Vision: Working with Images
Claude can analyze images sent through the Messages API. Images are sent as base64-encoded data or via URL.
Sending an Image
import anthropic
import base64
client = anthropic.Anthropic()
# Read and encode image
with open("chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data
                    }
                },
                {
                    "type": "text",
                    "text": "Describe this chart in detail."
                }
            ]
        }
    ]
)
print(message.content[0].text)
Supported Image Types
- image/jpeg
- image/png
- image/gif (first frame only)
- image/webp
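A small helper can catch an unsupported format before the request is sent. The sketch below guesses the media type from the file extension, which is an assumption that the extension matches the actual file contents. The function names are hypothetical. The URL variant assumes the image source takes the form {"type": "url", "url": ...}.

```python
import base64
from pathlib import Path
from typing import Any

# Extension-to-media-type map covering the supported types listed above.
MEDIA_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def image_block(data: bytes, filename: str) -> dict[str, Any]:
    """Build a base64 image content block, rejecting unsupported extensions."""
    suffix = Path(filename).suffix.lower()
    if suffix not in MEDIA_TYPES:
        raise ValueError(f"Unsupported image type: {suffix or '(no extension)'}")
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": MEDIA_TYPES[suffix],
            "data": base64.b64encode(data).decode("utf-8"),
        },
    }


def image_block_from_url(url: str) -> dict[str, Any]:
    """Build an image content block that points at a URL instead of inline data."""
    return {"type": "image", "source": {"type": "url", "url": url}}


block = image_block(b"abc", "chart.PNG")  # placeholder bytes, not a real image
```

For a real file you might call image_block(Path("chart.png").read_bytes(), "chart.png") and place the result in the content list next to a text block.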
Tips for Vision
- Keep images under 20MB for optimal performance.
- Combine with text prompts for best results (e.g., "What's the trend in this graph?")
- Use high-resolution images when fine details matter (Claude supports up to 8K resolution).
Handling Stop Reasons
Understanding stop_reason helps you build robust applications:
| Stop Reason | Meaning | Action |
|---|---|---|
| end_turn | Claude finished naturally | Continue or end conversation |
| max_tokens | Hit token limit | Increase max_tokens or continue |
| tool_use | Claude wants to call a tool | Execute the tool and return result |
| stop_sequence | Found a custom stop sequence | Handle as needed |
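The table maps naturally onto a small dispatch function. The sketch below is one possible shape. The action names are placeholders invented for this example, and unknown values fall through to a default so that stop reasons not listed here do not crash the application.

```python
# Placeholder action names; substitute whatever your application uses.
ACTIONS = {
    "end_turn": "finish",
    "max_tokens": "continue_generation",
    "tool_use": "run_tool",
    "stop_sequence": "handle_stop_sequence",
}


def next_action(stop_reason: str) -> str:
    """Map a stop_reason from the table above to an application-level action."""
    return ACTIONS.get(stop_reason, "unknown")


planned = [next_action(reason) for reason in ("end_turn", "tool_use", "something_else")]
```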
Example: Handling max_tokens
messages = [{"role": "user", "content": "Write a long story"}]
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=100,
    messages=messages
)
if response.stop_reason == "max_tokens":
    # Keep the partial output in the history, then ask Claude to continue
    messages.append({"role": "assistant", "content": response.content[0].text})
    messages.append({"role": "user", "content": "Continue"})
    # Send again...
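The "send again" step can be turned into a bounded loop. The following sketch assumes a hypothetical `send` callable that takes the messages list and returns a response in the dict form shown earlier. `make_fake_send` simulates three truncated replies so the loop can run without network access.

```python
from typing import Any, Callable

Response = dict[str, Any]


def generate_until_done(
    send: Callable[[list[dict[str, str]]], Response],
    prompt: str,
    max_rounds: int = 5,
) -> str:
    """Keep asking Claude to continue while the reply is cut off by max_tokens."""
    messages = [{"role": "user", "content": prompt}]
    parts: list[str] = []
    for _ in range(max_rounds):
        response = send(messages)
        text = response["content"][0]["text"]
        parts.append(text)
        if response["stop_reason"] != "max_tokens":
            break
        # Record the partial reply, then ask for more in a new user turn.
        messages.append({"role": "assistant", "content": text})
        messages.append({"role": "user", "content": "Continue"})
    return "".join(parts)


def make_fake_send(chunks: list[str]) -> Callable[[list[dict[str, str]]], Response]:
    """Offline stand-in: returns one chunk per call, truncated until the last."""
    remaining = list(chunks)

    def send(messages: list[dict[str, str]]) -> Response:
        text = remaining.pop(0)
        stop = "max_tokens" if remaining else "end_turn"
        return {"content": [{"type": "text", "text": text}], "stop_reason": stop}

    return send


story = generate_until_done(
    make_fake_send(["Once upon ", "a time, ", "the end."]), "Write a long story"
)
```

The max_rounds cap is a design choice to stop a runaway loop from consuming tokens indefinitely. Simple concatenation may also need adjusting if the model restarts a sentence after "Continue".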
Best Practices
- Manage context windows: Keep conversation history within Claude's context limit (varies by model). Prompt caching can lower the cost and latency of resending a long history, but the cached history still counts toward the context limit.
- Use system prompts: For persistent instructions, use the system parameter instead of repeating in every user message.
- Handle errors gracefully: The API may return errors for invalid requests, rate limits, or server issues. Implement retry logic with exponential backoff.
- Monitor token usage: Track usage.input_tokens and usage.output_tokens to optimize costs and avoid surprises.
- Stream for responsiveness: For long responses, use streaming to show output incrementally.
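For the retry advice above, here is a generic sketch of exponential backoff with a little jitter. `TransientError` is a hypothetical stand-in for whatever rate-limit or server exceptions your client raises, and the delay values are illustrative defaults rather than recommended settings.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for rate-limit or server errors."""


def with_retries(
    call: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on TransientError with delays of 1x, 2x, 4x, ... base_delay."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * (2 ** attempt)
            # Add up to 10% jitter so many clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 10))
    raise RuntimeError("unreachable")


# Demo: fail twice, then succeed; record delays instead of actually sleeping.
attempts = {"count": 0}


def flaky() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransientError("try again")
    return "ok"


delays: list[float] = []
result = with_retries(flaky, sleep=delays.append)
```

The Python SDK also has its own retry setting, so check what it already does before layering a wrapper like this on top.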
Conclusion
The Messages API is the foundation for all Claude integrations. By mastering basic requests, multi-turn conversations, prefill, and vision, you can build sophisticated applications that leverage Claude's full capabilities. Start with simple patterns and gradually add complexity as your use case demands.
Key Takeaways
- The Messages API is stateless — always send the full conversation history with each request.
- Prefill lets you guide Claude's response by providing the beginning of its output, but is not supported on all models.
- Vision capabilities allow Claude to analyze images sent as base64 or URLs.
- Always check stop_reason to determine the next action in your application logic.
- Use synthetic assistant messages to provide examples or correct context without real Claude responses.