How to Build a Custom Partner Integration with the Claude API
A practical guide to integrating Claude AI into your own platform or service using the Anthropic API, covering authentication, messaging, streaming, and best practices.
This guide walks you through building a custom partner integration with Claude, from API key setup and authentication to sending messages, handling streaming responses, and following Anthropic's partner best practices.
Introduction
As Claude AI continues to reshape how businesses interact with language models, many organizations are looking to build their own custom integrations—becoming what Anthropic calls "Partners." Whether you're embedding Claude into a SaaS product, building an internal assistant, or creating a new customer-facing chatbot, understanding the official API integration patterns is essential.
This guide covers the practical steps to build a robust partner integration using the Claude API. You'll learn how to authenticate, send messages, handle streaming, and follow best practices that Anthropic recommends for partners.
Prerequisites
Before you begin, make sure you have:
- An Anthropic Console account
- An API key (generated in the console under API Keys)
- Basic familiarity with Python or TypeScript
- A development environment with curl, Python 3.8+, or Node.js 18+ (the TypeScript examples rely on the built-in fetch)
Step 1: Authentication and API Key Management
Every API call to Claude requires an x-api-key header, plus an anthropic-version header that pins the API version. Treat your API key like a password: never hardcode it in source code or expose it in client-side applications.
Best Practices for API Keys
- Store keys in environment variables or a secrets manager
- Rotate keys regularly
- Use separate keys for development and production
- Implement rate limiting on your side to avoid hitting Anthropic's limits
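One way to apply the first practice in the list above is to read the key from the environment and fail fast when it is missing. This is a minimal sketch, not an official pattern: `load_api_key` and `mask_key` are hypothetical helper names, and only the `ANTHROPIC_API_KEY` variable name comes from this guide.

```python
import os


def load_api_key(env_var: str = "ANTHROPIC_API_KEY") -> str:
    """Return the API key from the environment, or raise if it is unset or blank.

    Hypothetical helper for illustration; not part of any Anthropic SDK.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; export it or load it from your secrets manager."
        )
    return key


def mask_key(key: str, visible: int = 4) -> str:
    """Return a log-safe form of the key showing only the last `visible` characters."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

Calling `load_api_key()` once at startup surfaces a missing key immediately rather than as a 401 on the first request, and `mask_key` keeps the full value out of logs.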
Example: Setting Up Authentication
Python (using requests):
import os
import requests

# Read the key from the environment so it never appears in source control
API_KEY = os.environ.get("ANTHROPIC_API_KEY")
BASE_URL = "https://api.anthropic.com/v1"

headers = {
    "x-api-key": API_KEY,
    "anthropic-version": "2023-06-01",
    "content-type": "application/json"
}
TypeScript (using fetch):
const API_KEY = process.env.ANTHROPIC_API_KEY;
const BASE_URL = "https://api.anthropic.com/v1";

const headers = {
  "x-api-key": API_KEY!,
  "anthropic-version": "2023-06-01",
  "content-type": "application/json"
};
Step 2: Sending Your First Message
The core endpoint for generating text is POST /v1/messages. You send a list of messages (with roles user or assistant) and receive a completion.
Basic Request
Python:
def send_message(user_message: str) -> dict:
    payload = {
        "model": "claude-3-5-sonnet-20241022",
        "max_tokens": 1024,
        "messages": [
            {"role": "user", "content": user_message}
        ]
    }
    response = requests.post(
        f"{BASE_URL}/messages",
        headers=headers,
        json=payload
    )
    response.raise_for_status()  # raise on 4xx/5xx responses
    return response.json()

result = send_message("Explain quantum computing in simple terms.")
# The reply text lives in the first content block
print(result["content"][0]["text"])
TypeScript:
async function sendMessage(userMessage: string) {
  const payload = {
    model: "claude-3-5-sonnet-20241022",
    max_tokens: 1024,
    messages: [
      { role: "user", content: userMessage }
    ]
  };
  const response = await fetch(`${BASE_URL}/messages`, {
    method: "POST",
    headers,
    body: JSON.stringify(payload)
  });
  if (!response.ok) throw new Error(`API error: ${response.status}`);
  return response.json();
}

sendMessage("Explain quantum computing in simple terms.")
  .then(data => console.log(data.content[0].text));
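Because every request carries the full message list, a multi-turn conversation is built by appending each reply to the history before sending the next request. The following is a small sketch of that bookkeeping; `add_turn` and `build_payload` are hypothetical helper names introduced here, and no network call is made.

```python
from typing import Dict, List

Message = Dict[str, str]


def add_turn(history: List[Message], role: str, content: str) -> List[Message]:
    """Return a new history list with one more message appended (hypothetical helper)."""
    if role not in ("user", "assistant"):
        raise ValueError(f"role must be 'user' or 'assistant', got {role!r}")
    return history + [{"role": role, "content": content}]


def build_payload(history: List[Message], model: str, max_tokens: int = 1024) -> dict:
    """Assemble a request body for POST /v1/messages from the full history."""
    return {"model": model, "max_tokens": max_tokens, "messages": list(history)}


# Example: two turns of context followed by a new user question.
history = add_turn([], "user", "What is an API?")
history = add_turn(history, "assistant", "An API is a contract between programs.")
history = add_turn(history, "user", "Can you give an example?")
payload = build_payload(history, model="claude-3-5-sonnet-20241022")
```

Returning a new list rather than mutating the old one makes it easier to keep per-user histories separate in a multi-tenant service, at the cost of a copy per turn.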
Step 3: Handling Streaming Responses
For a better user experience, especially in chat interfaces, use streaming. Claude supports server-sent events (SSE) that let you display tokens as they're generated.
Streaming in Python
import json

def stream_message(user_message: str) -> None:
    payload = {
        "model": "claude-3-5-sonnet-20241022",
        "max_tokens": 1024,
        "stream": True,
        "messages": [
            {"role": "user", "content": user_message}
        ]
    }
    with requests.post(
        f"{BASE_URL}/messages",
        headers=headers,
        json=payload,
        stream=True
    ) as response:
        response.raise_for_status()  # fail early instead of parsing an error body
        for line in response.iter_lines():
            if line:
                decoded = line.decode("utf-8")
                # SSE payload lines start with "data: "; "event:" lines are skipped
                if decoded.startswith("data: "):
                    data = json.loads(decoded[6:])
                    if data["type"] == "content_block_delta":
                        # Non-text deltas have no "text" field, so default to ""
                        print(data["delta"].get("text", ""), end="", flush=True)

stream_message("Write a short poem about AI.")
Streaming in TypeScript
async function streamMessage(userMessage: string) {
  const payload = {
    model: "claude-3-5-sonnet-20241022",
    max_tokens: 1024,
    stream: true,
    messages: [
      { role: "user", content: userMessage }
    ]
  };
  const response = await fetch(`${BASE_URL}/messages`, {
    method: "POST",
    headers,
    body: JSON.stringify(payload)
  });
  if (!response.ok) throw new Error(`API error: ${response.status}`);

  const reader = response.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    // Keep the last (possibly incomplete) line in the buffer for the next chunk
    const lines = buffer.split("\n");
    buffer = lines.pop() || "";
    for (const line of lines) {
      if (line.startsWith("data: ")) {
        const data = JSON.parse(line.slice(6));
        if (data.type === "content_block_delta") {
          // Non-text deltas have no text field, so fall back to ""
          process.stdout.write(data.delta.text ?? "");
        }
      }
    }
  }
}

streamMessage("Write a short poem about AI.");
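The event-parsing step in the streaming examples above can be separated from the network call, which makes it testable without an API key. The sketch below assumes the lines have already been decoded to strings; `iter_text_deltas` is a hypothetical name, and the sample events are hand-written for illustration rather than captured from a real response.

```python
import json
from typing import Iterable, Iterator


def iter_text_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from decoded SSE lines (hypothetical helper)."""
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip "event:" lines and blank separators
        try:
            event = json.loads(line[len("data: "):])
        except json.JSONDecodeError:
            continue  # ignore payloads that are not valid JSON
        if not isinstance(event, dict) or event.get("type") != "content_block_delta":
            continue
        text = event.get("delta", {}).get("text")
        if text is not None:
            yield text  # only deltas that actually carry text are emitted


# Hand-written sample events, for illustration only.
sample_lines = [
    "event: message_start",
    'data: {"type": "message_start"}',
    "",
    'data: {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "Hel"}}',
    'data: {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "lo"}}',
    'data: {"type": "content_block_delta", "delta": {"type": "input_json_delta", "partial_json": "{"}}',
    "data: not json",
    'data: {"type": "message_stop"}',
]
assembled = "".join(iter_text_deltas(sample_lines))
```

In a real integration the input would be something like the decoded output of `response.iter_lines()`, and each yielded fragment could be forwarded to the end user as it arrives.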
Step 4: Adding System Prompts and Context
For partner integrations, you often need to control Claude's behavior. Use the system parameter to set instructions that apply to the whole conversation. Each request is independent, so include the system prompt on every call.
def send_with_system(system_prompt: str, user_message: str) -> str:
    payload = {
        "model": "claude-3-5-sonnet-20241022",
        "max_tokens": 1024,
        "system": system_prompt,
        "messages": [
            {"role": "user", "content": user_message}
        ]
    }
    response = requests.post(
        f"{BASE_URL}/messages",
        headers=headers,
        json=payload
    )
    response.raise_for_status()  # surface API errors instead of a KeyError below
    return response.json()["content"][0]["text"]

# Example: Customer support bot
system = "You are a helpful customer support agent for Acme Corp. " \
         "Be polite, concise, and only answer based on the provided knowledge base."
reply = send_with_system(system, "How do I reset my password?")
print(reply)
Step 5: Error Handling and Retries
Production integrations must handle errors gracefully. Common HTTP status codes include:
- 400: Bad request (check your payload)
- 401: Unauthorized (invalid API key)
- 429: Rate limited (implement exponential backoff)
- 500: Server error (retry after a delay)
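The status codes listed above split into two groups: those where a retry can help and those where the request itself must change. A minimal sketch of that decision follows; `is_retryable` and `STATUS_ACTIONS` are hypothetical names, and treating every 5xx code like the 500 case is an assumption made here, not something this guide states.

```python
# Suggested actions for the status codes listed in this guide.
STATUS_ACTIONS = {
    400: "fix the request payload; retrying will not help",
    401: "check the API key; retrying will not help",
    429: "back off and retry",
    500: "retry after a delay",
}


def is_retryable(status_code: int) -> bool:
    """Return True for rate limiting (429) and server-side errors (assumed: any 5xx)."""
    return status_code == 429 or 500 <= status_code < 600


def describe(status_code: int) -> str:
    """Return a short suggested action for a status code (hypothetical helper)."""
    default = "retry after a delay" if is_retryable(status_code) else "do not retry"
    return STATUS_ACTIONS.get(status_code, default)
```

Checking `is_retryable` before sleeping avoids spending retry attempts on 400 and 401 responses, which will fail the same way every time.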
Retry Logic Example
import time
from requests.exceptions import RequestException

def send_with_retry(user_message: str, max_retries: int = 3) -> dict:
    payload = {
        "model": "claude-3-5-sonnet-20241022",
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": user_message}]
    }
    for attempt in range(max_retries):
        try:
            response = requests.post(
                f"{BASE_URL}/messages",
                headers=headers,
                json=payload
            )
            if response.status_code == 429:
                wait = 2 ** attempt  # exponential backoff: 1s, 2s, 4s, ...
                print(f"Rate limited. Waiting {wait}s...")
                time.sleep(wait)
                continue
            response.raise_for_status()
            return response.json()
        except RequestException as e:
            if attempt == max_retries - 1:
                raise
            print(f"Attempt {attempt + 1} failed: {e}. Retrying...")
            time.sleep(1)
    # Reached only when every attempt was rate limited; raise rather than return None
    raise RuntimeError(f"Still rate limited after {max_retries} attempts")
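The retry loop above waits exactly 2 ** attempt seconds, so many clients that are rate limited at the same moment will also retry at the same moment. A common refinement is to randomize each wait ("full jitter") and cap it. The sketch below only computes the delays; `backoff_delays` is a hypothetical helper, and the default `base` and `cap` values are illustrative assumptions rather than recommended settings.

```python
import random
from typing import List, Optional


def backoff_delays(max_retries: int,
                   base: float = 1.0,
                   cap: float = 30.0,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Return one randomized wait in seconds per retry attempt.

    Each wait is drawn uniformly from [0, min(cap, base * 2**attempt)].
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(max_retries):
        ceiling = min(cap, base * (2 ** attempt))
        delays.append(rng.uniform(0, ceiling))
    return delays


# Passing a seeded generator makes the schedule reproducible in tests.
schedule = backoff_delays(5, base=1.0, cap=8.0, rng=random.Random(0))
```

Each value in `schedule` could replace the fixed `wait` in the loop. The cap keeps late attempts from sleeping for minutes when `max_retries` is large.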
Step 6: Partner-Specific Best Practices
Anthropic encourages partners to follow these guidelines:
1. Respect Rate Limits
Check your plan's rate limits in the Anthropic Console. Implement client-side throttling to avoid hitting limits.
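Client-side throttling can be as simple as counting recent calls in a rolling window. The following is a sketch under stated assumptions: `SlidingWindowLimiter` is a hypothetical class, it is not thread-safe, and the limit you configure should come from your own plan's numbers in the Console.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` within any rolling `period` seconds (hypothetical)."""

    def __init__(self, max_calls: int, period: float,
                 clock: Callable[[], float] = time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock  # injectable so tests can control time
        self._calls: Deque[float] = deque()

    def try_acquire(self) -> bool:
        """Record a call and return True if under the limit; otherwise return False."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            self._calls.append(now)
            return True
        return False
```

A caller would check `try_acquire()` before each request and queue or delay the request when it returns False, so the limit is enforced before Anthropic has to respond with a 429.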
2. Cache Common Responses
For frequently asked questions, cache Claude's responses to reduce API costs and latency. Use a short TTL (time-to-live) to keep responses fresh.
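A short-TTL cache of this kind can be sketched with a dictionary keyed on the request contents. `TTLCache` and `make_key` are hypothetical names; this in-memory version assumes a single process and would need a shared store (and an eviction policy) in a real deployment.

```python
import hashlib
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory cache whose entries expire after `ttl_seconds` (hypothetical)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so tests can control time
        self._store: Dict[str, Tuple[float, str]] = {}

    @staticmethod
    def make_key(model: str, system: Optional[str], user_message: str) -> str:
        """Hash everything that influences the answer into one cache key."""
        raw = "\x1f".join([model, system or "", user_message])
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()

    def get(self, key: str) -> Optional[str]:
        """Return the cached value, or None if it is missing or expired."""
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._store[key]  # expired: remove and report a miss
            return None
        return value

    def put(self, key: str, value: str) -> None:
        """Store a value that expires `ttl` seconds from now."""
        self._store[key] = (self.clock() + self.ttl, value)
```

Including the model and system prompt in the key matters: the same user question under a different system prompt should not return a stale answer written for another context.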
3. Monitor Token Usage
Track both input and output tokens. Use the usage field in API responses to bill customers or optimize prompts.
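Token tracking can be done by summing the usage field across responses. In the sketch below, `UsageTotals` is a hypothetical class, the `input_tokens` and `output_tokens` keys are read from each response's usage object, and the per-million-token prices passed to `estimated_cost` are placeholders that must be replaced with the current rates for your model.

```python
from dataclasses import dataclass


@dataclass
class UsageTotals:
    """Running totals of tokens across API responses (hypothetical helper)."""
    input_tokens: int = 0
    output_tokens: int = 0

    def add(self, response: dict) -> None:
        """Add the token counts from one parsed API response."""
        usage = response.get("usage", {})
        self.input_tokens += usage.get("input_tokens", 0)
        self.output_tokens += usage.get("output_tokens", 0)

    def estimated_cost(self, input_price_per_mtok: float,
                       output_price_per_mtok: float) -> float:
        """Estimate cost from caller-supplied prices per million tokens."""
        return (self.input_tokens * input_price_per_mtok
                + self.output_tokens * output_price_per_mtok) / 1_000_000


# Hand-written responses for illustration; real ones come from the API.
totals = UsageTotals()
totals.add({"usage": {"input_tokens": 1000, "output_tokens": 500}})
totals.add({"usage": {"input_tokens": 2000, "output_tokens": 1500}})
```

Keeping one `UsageTotals` instance per customer is one simple way to attribute consumption for billing.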
4. Implement Content Moderation
Use Claude's safety features or a separate moderation layer to filter harmful outputs before displaying to end users.
5. Provide Clear Attribution
When displaying Claude-generated content, clearly indicate it was generated by AI. Anthropic's brand guidelines recommend phrasing like "Generated with Claude by Anthropic."
Full Integration Example
Here's a minimal but complete Python class that wraps the Claude API for a partner integration:
import json
import os
from typing import Dict, List, Optional

import requests

class ClaudePartnerClient:
    def __init__(self, api_key: Optional[str] = None):
        # Fall back to the environment so keys stay out of source code
        self.api_key = api_key or os.environ["ANTHROPIC_API_KEY"]
        self.base_url = "https://api.anthropic.com/v1"
        self.headers = {
            "x-api-key": self.api_key,
            "anthropic-version": "2023-06-01",
            "content-type": "application/json"
        }

    def chat(self,
             messages: List[Dict[str, str]],
             system: Optional[str] = None,
             max_tokens: int = 1024,
             stream: bool = False) -> Dict:
        payload = {
            "model": "claude-3-5-sonnet-20241022",
            "max_tokens": max_tokens,
            "messages": messages,
            "stream": stream
        }
        if system:
            payload["system"] = system
        response = requests.post(
            f"{self.base_url}/messages",
            headers=self.headers,
            json=payload,
            stream=stream
        )
        response.raise_for_status()
        if stream:
            return self._handle_stream(response)
        return response.json()

    def _handle_stream(self, response) -> Dict:
        # Print chunks as they arrive and return them in the non-streaming shape
        full_text = ""
        for line in response.iter_lines():
            if line:
                decoded = line.decode("utf-8")
                if decoded.startswith("data: "):
                    data = json.loads(decoded[6:])
                    if data["type"] == "content_block_delta":
                        # Non-text deltas have no "text" field, so default to ""
                        chunk = data["delta"].get("text", "")
                        full_text += chunk
                        print(chunk, end="", flush=True)
        print()
        return {"content": [{"text": full_text}]}

# Usage
client = ClaudePartnerClient()
response = client.chat(
    messages=[{"role": "user", "content": "Hello, Claude!"}],
    system="You are a helpful assistant."
)
print(response["content"][0]["text"])
Conclusion
Building a partner integration with Claude is straightforward when you follow the official API patterns. Start with authentication, master the messages endpoint, add streaming for real-time experiences, and layer in error handling and best practices. As Anthropic continues to evolve the API, keep an eye on the changelog for new features like tool use, vision, and expanded model availability.
Key Takeaways
- Authenticate securely using API keys stored in environment variables, never in client-side code.
- Use the /v1/messages endpoint for all text generation, and enable streaming for better user experience.
- Implement retry logic with exponential backoff to handle rate limits and transient errors gracefully.
- Follow Anthropic's partner best practices: cache responses, monitor token usage, and provide clear AI attribution.
- Build a reusable client class to encapsulate API logic, making your integration maintainable and testable.