How to Build a Custom Partner Integration with the Claude API
A practical guide to building a production-ready partner integration with the Claude API, covering API key setup, authentication, message streaming, error handling, and rate-limit management, with Python and TypeScript examples.
Building a partner integration with the Claude API lets you embed Claude's capabilities into your own platform, product, or service. Whether you're creating a customer support chatbot, a content generation tool, or an AI-assisted workflow, this guide walks you through the essential steps to build a robust, production-ready integration.
Understanding the Claude API Partner Model
Anthropic's partner ecosystem enables third-party developers to integrate Claude into their applications. As a partner, you get direct API access to Claude models (including Claude 3.5 Sonnet and Claude 3 Opus) with dedicated support and documentation. The integration process involves:
- Obtaining API credentials
- Setting up authentication
- Making API calls with proper request formatting
- Handling responses and errors gracefully
- Managing rate limits and scaling
Prerequisites
Before you begin, ensure you have:
- A registered account on the Anthropic Console
- An API key (created in the Console under API Keys)
- Basic familiarity with REST APIs and JSON
- Python 3.8+ or Node.js 16+ installed locally
Step 1: Obtaining and Securing Your API Key
Your API key is the gateway to Claude. Treat it like a password—never expose it in client-side code or commit it to version control.
Best Practices for API Key Management
- Environment variables: Store your key in .env files or your deployment platform's secrets manager.
- Restricted keys: Create separate keys for development, staging, and production.
- Rotation: Rotate keys periodically and immediately if compromised.
# .env file (never commit this)
ANTHROPIC_API_KEY=sk-ant-api03-your-key-here
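One way to enforce these practices at startup is to fail fast when the key is missing and to mask it whenever it might reach a log. The sketch below is illustrative only: `load_api_key` and `redact` are hypothetical helper names, not part of any Anthropic library.

```python
import os


def load_api_key(env_var: str = "ANTHROPIC_API_KEY") -> str:
    """Return the API key from the environment, or raise if it is absent.

    Hypothetical helper: failing at startup is easier to debug than a 401 later.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set; refusing to start without an API key")
    return key


def redact(key: str, visible: int = 4) -> str:
    """Mask a key for log output, keeping only the last few characters."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

Logging `redact(load_api_key())` rather than the raw value lets you confirm which key a deployment is using without exposing it.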
Step 2: Making Your First API Call
Claude's Messages API is the primary endpoint for sending prompts and receiving responses. Here's a minimal Python example:
import os

import requests
from dotenv import load_dotenv

# Load ANTHROPIC_API_KEY from the local .env file into the environment
load_dotenv()

API_KEY = os.getenv("ANTHROPIC_API_KEY")
API_URL = "https://api.anthropic.com/v1/messages"

headers = {
    "x-api-key": API_KEY,
    "anthropic-version": "2023-06-01",
    "content-type": "application/json",
}

data = {
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 1024,
    "messages": [
        {"role": "user", "content": "Hello, Claude!"}
    ],
}

response = requests.post(API_URL, headers=headers, json=data, timeout=60)
# Surface 4xx/5xx errors here instead of failing later on a missing "content" key
response.raise_for_status()
print(response.json()["content"][0]["text"])
TypeScript Equivalent
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

async function main() {
  const response = await client.messages.create({
    model: 'claude-3-5-sonnet-20241022',
    max_tokens: 1024,
    messages: [{ role: 'user', content: 'Hello, Claude!' }],
  });

  // Content blocks are a union type; narrow to a text block before reading .text
  const block = response.content[0];
  if (block.type === 'text') {
    console.log(block.text);
  }
}

main().catch(console.error);
Step 3: Implementing Streaming for Better UX
For partner integrations, streaming reduces perceived latency: users see text as it is generated instead of waiting for the full response. Here's how to implement streaming:
import json
import os

import requests


def stream_claude_response(prompt):
    """Yield text fragments from a streamed Messages API response."""
    API_KEY = os.getenv("ANTHROPIC_API_KEY")
    headers = {
        "x-api-key": API_KEY,
        "anthropic-version": "2023-06-01",
        "content-type": "application/json",
        "accept": "text/event-stream",
    }
    data = {
        "model": "claude-3-5-sonnet-20241022",
        "max_tokens": 2048,
        "stream": True,
        "messages": [{"role": "user", "content": prompt}],
    }
    with requests.post(
        "https://api.anthropic.com/v1/messages",
        headers=headers,
        json=data,
        stream=True,
        timeout=60,
    ) as response:
        # Fail early on 4xx/5xx rather than trying to parse an error body as events
        response.raise_for_status()
        for line in response.iter_lines():
            if not line:
                continue
            decoded = line.decode('utf-8')
            if not decoded.startswith('data: '):
                continue  # skip "event: ..." lines
            event_data = json.loads(decoded[6:])
            if event_data['type'] == 'content_block_delta':
                delta = event_data['delta']
                # Only text deltas carry a "text" field; other delta types do not
                if delta.get('type') == 'text_delta':
                    yield delta['text']
Usage
for token in stream_claude_response("Write a short poem about AI"):
    print(token, end='', flush=True)
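The event-parsing step can be separated from the network call so it can be exercised offline. The following is a minimal sketch under that assumption: `iter_text_deltas` is a hypothetical helper name, and the sample lines are hand-written to imitate the event shapes the stream emits rather than captured from a real response.

```python
import json
from typing import Iterable, Iterator


def iter_text_deltas(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from raw server-sent-event lines."""
    for raw in lines:
        if not raw:
            continue  # blank lines separate events
        decoded = raw.decode("utf-8")
        if not decoded.startswith("data: "):
            continue  # skip "event: ..." lines
        event = json.loads(decoded[len("data: "):])
        if event.get("type") == "error":
            # Errors can arrive mid-stream, after a 200 status was already sent
            raise RuntimeError(f"stream error: {event.get('error')}")
        if event.get("type") != "content_block_delta":
            continue
        delta = event.get("delta", {})
        if delta.get("type") == "text_delta":
            yield delta["text"]


# Hand-written sample input (an assumption for illustration, not a captured stream)
sample = [
    b"event: content_block_delta",
    b'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hel"}}',
    b"",
    b'data: {"type": "ping"}',
    b'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "lo"}}',
    b'data: {"type": "message_stop"}',
]
print("".join(iter_text_deltas(sample)))  # prints "Hello"
```

Because the helper accepts any iterable of byte lines, the output of `response.iter_lines()` can be passed to it directly.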
Step 4: Handling Errors and Rate Limits
Production integrations must handle API errors gracefully. The Claude API returns standard HTTP status codes; the most common ones are:
| Status Code | Meaning | Handling Strategy |
|---|---|---|
| 200 | Success | Parse response |
| 400 | Bad Request | Validate input |
| 401 | Unauthorized | Check API key |
| 429 | Rate Limited | Implement backoff |
| 500 | Server Error | Retry with delay |
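The table can be encoded as a small lookup so that the rest of the integration branches on an action rather than on raw status codes. This is a sketch only: `Action` and `action_for_status` are hypothetical names, and treating every 5xx code (not just 500) as retryable is an assumption layered on top of the table.

```python
from enum import Enum


class Action(Enum):
    PARSE = "parse response"
    VALIDATE_INPUT = "validate input"
    CHECK_KEY = "check API key"
    RETRY = "retry with backoff"


def action_for_status(status_code: int) -> Action:
    """Map an HTTP status code to the handling strategy from the table."""
    if 200 <= status_code < 300:
        return Action.PARSE
    if status_code == 401:
        return Action.CHECK_KEY
    if status_code == 429 or status_code >= 500:
        # Rate limits and server-side failures are transient, so retrying can help
        return Action.RETRY
    # Remaining 4xx codes indicate a problem with the request itself
    return Action.VALIDATE_INPUT
```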
Implementing Exponential Backoff
import random
import time

import requests

# Assumes API_URL and headers are defined as in Step 2


def call_claude_with_retry(data, max_retries=3):
    for attempt in range(max_retries):
        try:
            response = requests.post(API_URL, headers=headers, json=data, timeout=60)
            response.raise_for_status()
            return response.json()
        except requests.exceptions.HTTPError:
            status = response.status_code
            # Only rate limits (429) and server errors (5xx) are worth retrying
            if status != 429 and status < 500:
                raise
            if attempt == max_retries - 1:
                break  # no point sleeping after the final attempt
            # Exponential backoff with jitter: ~1s, ~2s, ~4s, ...
            wait_time = (2 ** attempt) + random.uniform(0, 1)
            reason = "Rate limited" if status == 429 else "Server error"
            print(f"{reason}. Retrying in {wait_time:.2f}s")
            time.sleep(wait_time)
    raise Exception("Max retries exceeded")
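Two refinements are often layered on top of plain exponential backoff: capping the delay, and honoring a server-provided `retry-after` header when one is present. The sketch below isolates the delay calculation so it can be tested without sleeping. `backoff_delay` is a hypothetical helper, the 30-second cap is an arbitrary choice, and it assumes the header value is a number of seconds (it falls back to the computed delay otherwise).

```python
import random
from typing import Callable, Mapping, Optional


def backoff_delay(
    attempt: int,
    headers: Optional[Mapping[str, str]] = None,
    base: float = 1.0,
    cap: float = 30.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Return how many seconds to wait before retry number `attempt` (0-based)."""
    if headers:
        retry_after = headers.get("retry-after")
        if retry_after is not None:
            try:
                # Prefer the server's hint, clamped to the range [0, cap]
                return min(cap, max(0.0, float(retry_after)))
            except ValueError:
                pass  # not a plain number of seconds; use the computed delay
    # Exponential growth plus up to one second of jitter, clamped to the cap
    return min(cap, base * (2 ** attempt) + rng())
```

With `requests`, `response.headers` can be passed as `headers` directly, since its lookup is case-insensitive; a plain `dict` would need the lowercase key.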
Step 5: Building a Complete Integration Pattern
Here's an integration class that combines the patterns from the previous steps: key handling, streaming, and retries with backoff.
import json
import os
import random
import time
from typing import Generator, Optional

import requests


class ClaudePartnerIntegration:
    def __init__(self, api_key: Optional[str] = None):
        self.api_key = api_key or os.getenv("ANTHROPIC_API_KEY")
        if not self.api_key:
            raise ValueError("Set ANTHROPIC_API_KEY or pass api_key explicitly")
        self.base_url = "https://api.anthropic.com/v1/messages"
        self.headers = {
            "x-api-key": self.api_key,
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        }

    def send_message(
        self,
        prompt: str,
        model: str = "claude-3-5-sonnet-20241022",
        max_tokens: int = 1024,
        stream: bool = False,
    ) -> Generator[str, None, None]:
        """Yield the reply: fragment by fragment when streaming, otherwise as one string."""
        data = {
            "model": model,
            "max_tokens": max_tokens,
            "stream": stream,
            "messages": [{"role": "user", "content": prompt}],
        }
        if stream:
            yield from self._stream_response(data)
        else:
            yield self._get_response(data)

    def _get_response(self, data: dict) -> str:
        response = self._make_request(data)
        return response["content"][0]["text"]

    def _stream_response(self, data: dict) -> Generator[str, None, None]:
        headers = {**self.headers, "accept": "text/event-stream"}
        with requests.post(self.base_url, headers=headers, json=data, stream=True, timeout=60) as r:
            # Fail early on 4xx/5xx rather than parsing an error body as events
            r.raise_for_status()
            for line in r.iter_lines():
                if not line:
                    continue
                decoded = line.decode('utf-8')
                if not decoded.startswith('data: '):
                    continue  # skip "event: ..." lines
                event = json.loads(decoded[6:])
                if event['type'] == 'content_block_delta':
                    delta = event['delta']
                    # Only text deltas carry a "text" field
                    if delta.get('type') == 'text_delta':
                        yield delta['text']

    def _make_request(self, data: dict, max_retries: int = 3) -> dict:
        for attempt in range(max_retries):
            try:
                response = requests.post(self.base_url, headers=self.headers, json=data, timeout=60)
                response.raise_for_status()
                return response.json()
            except requests.exceptions.HTTPError:
                # Retry only rate limits (429) and server errors (5xx)
                if response.status_code != 429 and response.status_code < 500:
                    raise
                if attempt == max_retries - 1:
                    break  # no point sleeping after the final attempt
                wait = (2 ** attempt) + random.uniform(0, 1)
                time.sleep(wait)
        raise Exception("Max retries exceeded")
Usage
integration = ClaudePartnerIntegration()
for token in integration.send_message("Explain quantum computing simply", stream=True):
    print(token, end='', flush=True)
Step 6: Testing Your Integration
Always test your integration thoroughly before going live:
- Unit tests: Mock API responses to test your logic
- Integration tests: Use a test API key with limited quota
- Load tests: Simulate concurrent users to verify rate limit handling
- Edge cases: Test empty responses, long prompts, and special characters
# Example test using pytest
from unittest.mock import patch

import pytest
import requests


def test_send_message_success():
    integration = ClaudePartnerIntegration(api_key="test-key")
    with patch('requests.post') as mock_post:
        mock_post.return_value.status_code = 200
        mock_post.return_value.json.return_value = {
            "content": [{"type": "text", "text": "Hello!"}]
        }
        result = list(integration.send_message("Hi"))
    assert result == ["Hello!"]


def test_rate_limit_retry():
    integration = ClaudePartnerIntegration(api_key="test-key")
    # Patch time.sleep so the backoff does not slow the test down
    with patch('requests.post') as mock_post, patch('time.sleep'):
        mock_post.return_value.status_code = 429
        # A mocked raise_for_status() does nothing unless told to raise
        mock_post.return_value.raise_for_status.side_effect = requests.exceptions.HTTPError()
        with pytest.raises(Exception, match="Max retries exceeded"):
            list(integration.send_message("Hi"))
Best Practices for Partner Integrations
- Cache responses for identical prompts to reduce API costs and latency.
- Monitor usage with Anthropic's Console dashboards to track token consumption.
- Implement user authentication if your integration serves multiple end users.
- Use system prompts to set Claude's behavior and tone for your specific use case.
- Log errors with context (prompt, model, timestamp) for debugging.
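Two of these practices, system prompts and response caching, can be sketched together. In the Messages API the system prompt is a top-level `system` field in the request body rather than a message with a special role. Everything else below is an assumption for illustration: `build_payload` and `ResponseCache` are hypothetical names, the cache is an unbounded in-memory dict with no expiry, and `fetch` stands in for whatever function actually calls the API.

```python
import hashlib
import json
from typing import Callable, Dict, Optional


def build_payload(
    prompt: str,
    system: Optional[str] = None,
    model: str = "claude-3-5-sonnet-20241022",
    max_tokens: int = 1024,
) -> dict:
    """Build a Messages API request body, adding a system prompt when given."""
    payload = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    if system:
        payload["system"] = system  # top-level field, not part of "messages"
    return payload


class ResponseCache:
    """Memoize replies keyed on the full request payload (hypothetical helper)."""

    def __init__(self, fetch: Callable[[dict], str]):
        self._fetch = fetch
        self._store: Dict[str, str] = {}

    @staticmethod
    def key_for(payload: dict) -> str:
        # Canonical JSON so that equal payloads always hash to the same key
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def get(self, payload: dict) -> str:
        key = self.key_for(payload)
        if key not in self._store:
            self._store[key] = self._fetch(payload)  # only hit the API on a miss
        return self._store[key]
```

Keying on the whole payload means a change to the model, system prompt, or token limit produces a separate cache entry, which avoids serving a reply generated under different settings.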
Key Takeaways
- Secure your API key using environment variables and never expose it client-side.
- Implement streaming for real-time token delivery and better user experience.
- Handle rate limits with exponential backoff to ensure reliable service.
- Build a reusable integration class that encapsulates authentication, retry logic, and streaming.
- Test thoroughly with unit, integration, and load tests before production deployment.