How to Build a Custom Claude Integration Using the Partners API
A practical guide to integrating Claude via the Anthropic Partners API: authentication, sending messages, streaming responses, and best practices for production deployments, with code examples in Python and TypeScript.
Claude AI’s power extends far beyond the chat interface. With the Anthropic Partners API, you can embed Claude’s reasoning and generation capabilities directly into your own applications, services, and workflows. Whether you’re building a customer support bot, a content generation tool, or an internal analytics assistant, the Partners API gives you the same underlying model access that powers Claude.ai.
This guide walks you through everything you need to know to get started with the Partners API: authentication, sending your first message, handling streaming responses, and preparing your integration for production.
What Is the Partners API?
The Partners API is Anthropic’s programmatic interface for accessing Claude models. It allows approved partners and developers to:
- Send text prompts and receive completions
- Stream responses token by token for real-time UX
- Configure model parameters (temperature, max tokens, etc.)
- Manage conversation context with system prompts and multi-turn messages
Prerequisites
Before you begin, ensure you have:
- An Anthropic account with API access (sign up at console.anthropic.com)
- An API key generated from the console
- Basic familiarity with Python (3.8+) or TypeScript/Node.js (18+)
- curl or a REST client for quick testing
Step 1: Authentication
All API requests require an x-api-key header containing your secret key. Never expose your API key in client-side code – always keep it server-side.
Python Setup
import os
from anthropic import Anthropic
client = Anthropic(
    api_key=os.environ.get("ANTHROPIC_API_KEY")
)
TypeScript/Node.js Setup
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic({
  apiKey: process.env['ANTHROPIC_API_KEY'],
});
Security tip: Store your API key in environment variables or a secrets manager. Never hardcode it.
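One way to apply this tip is to fail fast at startup when the key is missing, instead of discovering the problem on the first request. The sketch below assumes the key lives in the ANTHROPIC_API_KEY environment variable used above; load_api_key is a hypothetical helper name, not part of the Anthropic SDK.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key from the environment, or raise if it is missing.

    Hypothetical helper for illustration; not part of the Anthropic SDK.
    `env` can be injected for testing and defaults to os.environ.
    """
    source = os.environ if env is None else env
    key = source.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError("ANTHROPIC_API_KEY is not set")
    return key
```

The returned value can then be passed as api_key when constructing the client, so a misconfigured deployment stops with a clear error message.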
Step 2: Send Your First Message
The core endpoint is POST /v1/messages. You send a list of messages (with roles user or assistant) and optionally a system prompt.
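For reference, the shape of the underlying HTTP request can be sketched with the standard library alone. The example only builds the request and does not send it. build_messages_request is a hypothetical helper, and the anthropic-version value is an assumption based on the version string commonly shown in the public API documentation, so verify it against the current docs.

```python
import json
import urllib.request

API_URL = "https://api.anthropic.com/v1/messages"


def build_messages_request(api_key, model, messages, max_tokens=1024, system=None):
    """Build (but do not send) a POST /v1/messages request.

    Hypothetical helper for illustration; the SDK does this for you.
    """
    body = {"model": model, "max_tokens": max_tokens, "messages": messages}
    if system is not None:
        # The system prompt is a top-level field, not a message with its own role.
        body["system"] = system
    headers = {
        "x-api-key": api_key,
        "anthropic-version": "2023-06-01",  # assumed version string; check the docs
        "content-type": "application/json",
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_messages_request(
    api_key="sk-placeholder",  # placeholder, not a real key
    model="claude-sonnet-4-20250514",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    system="You are a helpful assistant that speaks like a pirate.",
)
```

Passing the resulting object to urllib.request.urlopen would send it; in practice the SDK examples that follow are the more convenient route.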
Python Example
message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    system="You are a helpful assistant that speaks like a pirate.",
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
print(message.content[0].text)
TypeScript Example
const message = await client.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  system: 'You are a helpful assistant that speaks like a pirate.',
  messages: [
    { role: 'user', content: 'What is the capital of France?' }
  ],
});
// content is a list of typed blocks; narrow to a text block before reading .text
const first = message.content[0];
if (first?.type === 'text') {
  console.log(first.text);
}
Response:
Arr, the capital o' France be Paris, me hearty!
Step 3: Streaming Responses for Real-Time UX
For chat-like experiences, streaming is essential. Instead of waiting for the full response, you receive the text incrementally as it is generated.
Python Streaming
stream = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    stream=True,
    messages=[
        {"role": "user", "content": "Write a haiku about APIs."}
    ]
)
for event in stream:
    # Only text deltas carry a .text attribute; other delta types are skipped.
    if event.type == "content_block_delta" and event.delta.type == "text_delta":
        print(event.delta.text, end="", flush=True)
TypeScript Streaming
const stream = await client.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  stream: true,
  messages: [
    { role: 'user', content: 'Write a haiku about APIs.' }
  ],
});
for await (const event of stream) {
  // Narrow to text deltas; other delta types have no .text field.
  if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
    process.stdout.write(event.delta.text);
  }
}
Streaming dramatically improves perceived responsiveness in your application.
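Most applications also need the complete text once the stream ends, for example to store it in the conversation history. A minimal sketch of that accumulation follows. The events are hand-built stand-ins that mimic the event shape used in the streaming examples above, so no API call is made, and collect_text is a hypothetical helper name.

```python
from types import SimpleNamespace


def collect_text(events):
    """Concatenate the text deltas from a sequence of stream events."""
    parts = []
    for event in events:
        if event.type != "content_block_delta":
            continue  # ignore start/stop and other bookkeeping events
        delta = event.delta
        if getattr(delta, "type", None) == "text_delta":
            parts.append(delta.text)
    return "".join(parts)


# Hand-built stand-ins for stream events, for illustration only.
fake_events = [
    SimpleNamespace(type="message_start"),
    SimpleNamespace(
        type="content_block_delta",
        delta=SimpleNamespace(type="text_delta", text="Hello, "),
    ),
    SimpleNamespace(
        type="content_block_delta",
        delta=SimpleNamespace(type="text_delta", text="world"),
    ),
    SimpleNamespace(type="message_stop"),
]
full_text = collect_text(fake_events)
```

With a real stream, the same function could be given the stream object directly, at the cost of not printing anything until the stream is finished.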
Step 4: Multi-Turn Conversations
To maintain context across multiple exchanges, include the entire message history in each request.
conversation = [
    {"role": "user", "content": "What is the speed of light?"},
    {"role": "assistant", "content": "The speed of light in a vacuum is approximately 299,792,458 meters per second."},
    {"role": "user", "content": "How long does it take to reach Mars?"}
]
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=conversation
)
Note: The API does not store conversation state. You must manage history on your end.
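Because the history lives on your side, it helps to wrap it in a small container that appends each turn in order. The sketch below is one possible shape; the Conversation class is hypothetical, and the assistant text is hardcoded where a real integration would use the text from the API response.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Conversation:
    """Client-side store for the message history sent with every request."""

    messages: List[Dict[str, str]] = field(default_factory=list)

    def add_user(self, text: str) -> None:
        self.messages.append({"role": "user", "content": text})

    def add_assistant(self, text: str) -> None:
        self.messages.append({"role": "assistant", "content": text})


chat = Conversation()
chat.add_user("What is the speed of light?")
# In a real integration this text would come from the API response.
chat.add_assistant("Approximately 299,792,458 meters per second in a vacuum.")
chat.add_user("How long does it take to reach Mars?")
```

chat.messages can then be passed as the messages argument, and the assistant reply appended before the next user turn.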
Step 5: Production Best Practices
1. Handle Errors Gracefully
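import anthropic

try:
    response = client.messages.create(...)
except anthropic.RateLimitError as e:
    print(f"Rate limited: {e}")
except anthropic.APIConnectionError as e:
    print(f"Connection error: {e}")
except anthropic.APIError as e:
    # APIError is the base class, so it must come last;
    # listed first, it would shadow the more specific handlers above.
    print(f"API error: {e}")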
2. Implement Retry Logic
Use exponential backoff for transient failures:
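import anthropic
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential

# Retry only transient failures; errors such as invalid requests should fail fast.
TRANSIENT_ERRORS = (
    anthropic.RateLimitError,
    anthropic.APIConnectionError,
    anthropic.InternalServerError,
)

@retry(
    retry=retry_if_exception_type(TRANSIENT_ERRORS),
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=2, max=10),
)
def call_claude(messages):
    return client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=messages
    )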
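To make the backoff idea concrete without a third-party dependency, here is a standard-library sketch. It is not how tenacity is implemented; retry_with_backoff and its parameters are hypothetical names, and the doubling schedule with optional full jitter is one common choice among several.

```python
import random
import time


def retry_with_backoff(fn, attempts=3, base_delay=1.0, max_delay=10.0,
                       retry_on=(Exception,), sleep=time.sleep, jitter=True):
    """Call fn(), retrying on the given exceptions with exponential backoff.

    The delay doubles after each failure and is capped at max_delay.
    With jitter enabled, the actual wait is a random value up to that delay.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            if jitter:
                delay = random.uniform(0, delay)
            sleep(delay)
```

The sleep parameter is injectable so the schedule can be checked in tests without actually waiting. In real use, retry_on would be narrowed to transient error types rather than every exception.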
3. Monitor Token Usage
Track input_tokens and output_tokens from the response to manage costs:
response = client.messages.create(...)
print(f"Input tokens: {response.usage.input_tokens}")
print(f"Output tokens: {response.usage.output_tokens}")
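Per-request counts become more useful when accumulated and turned into a rough cost figure. The sketch below assumes per-million-token pricing; the UsageTracker class is hypothetical and the prices are made-up placeholders, not actual rates, so substitute the published rates for your model.

```python
from dataclasses import dataclass


@dataclass
class UsageTracker:
    """Accumulate token counts and estimate spend."""

    # Placeholder prices in USD per million tokens; NOT real rates.
    input_price_per_mtok: float
    output_price_per_mtok: float
    input_tokens: int = 0
    output_tokens: int = 0

    def record(self, input_tokens: int, output_tokens: int) -> None:
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens

    @property
    def estimated_cost(self) -> float:
        total = (self.input_tokens * self.input_price_per_mtok
                 + self.output_tokens * self.output_price_per_mtok)
        return total / 1_000_000


tracker = UsageTracker(input_price_per_mtok=2.0, output_price_per_mtok=10.0)
# With a real response: tracker.record(response.usage.input_tokens, response.usage.output_tokens)
tracker.record(input_tokens=1_000, output_tokens=500)
```

With these placeholder prices, 1,000 input tokens and 500 output tokens come to (1,000 × 2 + 500 × 10) / 1,000,000 = 0.007 USD.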
4. Use System Prompts for Consistent Behavior
System prompts set the tone and rules for the entire conversation. They are generally more reliable than injecting instructions into user messages, because they apply to every turn rather than to a single message.
system_prompt = """
You are a customer support agent for Acme Corp.
- Always be polite and professional.
- If you don't know an answer, say so and offer to escalate.
- Never share internal company data.
"""
Common Pitfalls to Avoid
| Pitfall | Solution |
|---|---|
| Exposing API keys in client code | Always use server-side proxies |
| Not handling rate limits | Implement retry with backoff |
| Sending overly long histories | Trim or summarize old messages |
| Ignoring token limits | Set max_tokens appropriately |
| Forgetting to set stream=True for chat | Use streaming for real-time apps |
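For the long-history pitfall, the simplest form of trimming keeps only the most recent messages. The sketch below is one such approach; trim_history is a hypothetical helper, and it assumes the trimmed history should still open with a user turn. Summarizing older messages instead of dropping them is an alternative that is not shown here.

```python
def trim_history(messages, max_messages=10):
    """Keep at most max_messages recent messages, starting from a user turn."""
    if max_messages <= 0:
        return []
    trimmed = list(messages[-max_messages:])
    # Drop leading assistant turns so the history still opens with a user message.
    while trimmed and trimmed[0]["role"] != "user":
        trimmed.pop(0)
    return trimmed


history = [
    {"role": "user", "content": "u1"},
    {"role": "assistant", "content": "a1"},
    {"role": "user", "content": "u2"},
    {"role": "assistant", "content": "a2"},
    {"role": "user", "content": "u3"},
]
recent = trim_history(history, max_messages=4)
```

Here the last four messages begin with an assistant turn, so that turn is dropped and recent holds u2, a2, and u3. Counting messages is a crude proxy for length; a token-based budget would be more precise.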
Next Steps
Once your basic integration is working, explore:
- Tool use (function calling) – let Claude interact with your APIs
- Vision – send images for Claude to analyze
- Batch processing – handle large volumes of requests efficiently
- Custom model fine-tuning (if available for your tier)
Key Takeaways
- The Partners API provides programmatic access to Claude models via a simple REST interface.
- Always authenticate with an API key stored server-side in environment variables.
- Use streaming for real-time user experiences and multi-turn conversations for context retention.
- Implement error handling, retry logic, and token monitoring for production readiness.
- System prompts are a reliable way to keep Claude’s behavior consistent across an entire session.