Getting Started with the Claude API: A Practical Guide for Developers
Learn how to integrate Claude AI into your applications using the Anthropic API. Covers authentication, message requests, streaming, and best practices for production use.
This guide walks you through setting up the Claude API, making your first request with Python or TypeScript, handling streaming responses, and following best practices for reliable, cost-effective integration.
Introduction
Claude, built by Anthropic, is a powerful large language model that you can integrate into your own applications, tools, and workflows via the Anthropic API. Whether you're building a chatbot, a content generator, a code assistant, or an automated analysis pipeline, the Claude API gives you programmatic access to Claude's intelligence.
This guide is designed for developers who want to move from reading documentation to writing real code. You'll learn how to authenticate, send your first message, handle streaming responses, and follow best practices for production deployments.
Prerequisites
Before you start, make sure you have:
- An Anthropic account and an API key (obtainable from the Anthropic Console)
- Basic familiarity with Python 3.8+ or Node.js 18+
- A code editor and terminal
Step 1: Authentication
Every API request must carry your API key in an x-api-key header; the official SDKs set this header for you once they have the key. Keep the key out of source code by storing it in an environment variable.
Setting your API key
export ANTHROPIC_API_KEY="sk-ant-..."
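To make the role of the header concrete, here is a minimal sketch that reads the key from the environment and builds, but does not send, a raw HTTP request. The helper names load_api_key and build_http_request are hypothetical. The endpoint URL and the anthropic-version value are assumptions based on the public Messages API reference and should be checked against the current documentation. In normal use the SDK does all of this for you.

```python
import json
import os
import urllib.request

API_URL = "https://api.anthropic.com/v1/messages"  # assumed Messages API endpoint


def load_api_key(env=os.environ):
    """Return the API key from the environment, failing fast if it is missing."""
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError("ANTHROPIC_API_KEY is not set")
    return key


def build_http_request(api_key, payload):
    """Build (without sending) a POST request that carries the x-api-key header."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",  # assumed version string
            "content-type": "application/json",
        },
        method="POST",
    )
```

Failing fast on a missing key tends to give a clearer error than letting the first request fail with an authentication error.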
Installing the SDK
Anthropic provides official SDKs for Python and TypeScript.
Python:
pip install anthropic
TypeScript/Node.js:
npm install @anthropic-ai/sdk
Step 2: Your First API Call
Let's send a simple message to Claude and get a response.
Python Example
import os
from anthropic import Anthropic

client = Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))

message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain the concept of recursion in one sentence."}
    ]
)

print(message.content[0].text)
TypeScript Example
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

async function main() {
  const message = await client.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 1024,
    messages: [
      { role: 'user', content: 'Explain the concept of recursion in one sentence.' }
    ],
  });

  // content is a list of typed blocks; narrow to a text block before reading .text
  const first = message.content[0];
  if (first.type === 'text') {
    console.log(first.text);
  }
}

main();
Expected output:
Recursion is a programming technique where a function calls itself to solve a problem by breaking it down into smaller, identical subproblems.
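The examples above read only the first content block. A response's content is a list of typed blocks, so a slightly more defensive approach is to join every text block. The sketch below shows one way to do that. extract_text is a hypothetical helper, and the SimpleNamespace objects are stand-ins for an SDK response so the snippet runs offline.

```python
from types import SimpleNamespace


def extract_text(message):
    """Join the text of every text block in message.content, skipping other block types."""
    return "".join(
        block.text for block in message.content if getattr(block, "type", None) == "text"
    )


# Stand-in for an SDK response object (an assumption for illustration only).
fake_message = SimpleNamespace(
    content=[
        SimpleNamespace(type="text", text="Recursion is "),
        SimpleNamespace(type="tool_use", name="lookup"),
        SimpleNamespace(type="text", text="self-reference."),
    ]
)
print(extract_text(fake_message))  # prints "Recursion is self-reference."
```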
Step 3: Understanding the Request Structure
The messages.create endpoint accepts several key parameters:
- model: The Claude model version (e.g., claude-sonnet-4-20250514, claude-3-5-haiku-latest)
- max_tokens: Maximum number of tokens in the response
- messages: An array of message objects, each with a role (user or assistant) and content
- system (optional): A system prompt to set Claude's behavior
- temperature (optional): Controls randomness (0.0 to 1.0, default 1.0)
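The parameters above can be assembled and sanity-checked before a request is sent. The following is a minimal sketch, not part of the SDK: build_payload is a hypothetical helper, and its checks mirror only the constraints stated in this guide (roles of user or assistant, temperature between 0.0 and 1.0).

```python
def build_payload(model, max_tokens, messages, system=None, temperature=None):
    """Assemble keyword arguments for messages.create, validating the basics."""
    if max_tokens < 1:
        raise ValueError("max_tokens must be a positive integer")
    if not messages:
        raise ValueError("messages must contain at least one message")
    for msg in messages:
        if msg.get("role") not in ("user", "assistant"):
            raise ValueError(f"unsupported role: {msg.get('role')!r}")
    if temperature is not None and not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0.0 and 1.0")

    payload = {"model": model, "max_tokens": max_tokens, "messages": list(messages)}
    # Optional parameters are omitted entirely rather than sent as None.
    if system is not None:
        payload["system"] = system
    if temperature is not None:
        payload["temperature"] = temperature
    return payload
```

The result can be passed to the SDK as client.messages.create(**build_payload(...)).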
Multi-turn conversations
To continue a conversation, include the full history:
messages = [
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."},
    {"role": "user", "content": "What is its population?"}
]
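Because the API is stateless, your code has to carry the history between calls. One possible way to do that is a small wrapper like the sketch below. Conversation is a hypothetical class, not an SDK feature; it assumes the client exposes messages.create as in the examples in this guide and that the first content block of each reply is text.

```python
class Conversation:
    """Accumulates alternating user/assistant turns for multi-turn requests."""

    def __init__(self, client, model, max_tokens=1024):
        self.client = client
        self.model = model
        self.max_tokens = max_tokens
        self.messages = []

    def ask(self, user_text):
        """Send the full history plus the new user turn, then record the reply."""
        history = self.messages + [{"role": "user", "content": user_text}]
        response = self.client.messages.create(
            model=self.model,
            max_tokens=self.max_tokens,
            messages=history,
        )
        reply = response.content[0].text
        # Commit both turns only after the call succeeds, so a failed
        # request does not leave a dangling user message in the history.
        self.messages = history + [{"role": "assistant", "content": reply}]
        return reply
```

With a real client this would be used as chat = Conversation(client, "claude-sonnet-4-20250514"), followed by repeated chat.ask(...) calls. Note that the history grows with every turn, and so does the input token count.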
Step 4: Streaming Responses
For real-time applications, use streaming to receive tokens as they are generated.
Python Streaming
with client.messages.stream(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a short poem about APIs."}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
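Streaming code often needs the complete text as well, for logging or for storing the assistant turn. A minimal sketch of that pattern follows. collect_stream is a hypothetical helper that accepts any iterable of text chunks, such as stream.text_stream in the example above, and the write parameter is an assumption added so the output target can be swapped.

```python
import sys


def collect_stream(text_chunks, write=sys.stdout.write):
    """Echo each chunk as it arrives and return the full concatenated text."""
    parts = []
    for chunk in text_chunks:
        write(chunk)
        parts.append(chunk)
    return "".join(parts)
```

Inside the with block this would be called as full_text = collect_stream(stream.text_stream). When writing to a buffered terminal you may also want to flush after each chunk.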
TypeScript Streaming
const stream = await client.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Write a short poem about APIs.' }],
  stream: true,
});

for await (const chunk of stream) {
  // Only text deltas carry a .text field; other event and delta types are skipped
  if (chunk.type === 'content_block_delta' && chunk.delta.type === 'text_delta') {
    process.stdout.write(chunk.delta.text);
  }
}
Step 5: Error Handling
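Catch the SDK's exception types so that rate limits, network failures, and other API errors can be handled separately. List the handlers from most specific to most general: APIError is the base class of the other two, so if it comes first the later handlers never run.
import anthropic

try:
    message = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello"}]
    )
except anthropic.RateLimitError as e:
    print(f"Rate limited: {e}")
except anthropic.APIConnectionError as e:
    print(f"Connection Error: {e}")
except anthropic.APIError as e:
    # Base class: catches any remaining API error
    print(f"API Error: {e}")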
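Beyond printing the error, production code usually needs to decide whether a failure is worth retrying. The sketch below shows one possible rule: retry on HTTP 429 and on 5xx responses, and on connection-level errors. is_retryable is a hypothetical helper, and the rule itself is an assumption rather than official guidance. It relies on the exception exposing a status_code attribute, as the SDK's status errors do. To cover the SDK's own connection errors you would pass connection_errors=(anthropic.APIConnectionError,).

```python
def is_retryable(exc, connection_errors=(ConnectionError, TimeoutError)):
    """Return True if the error looks transient and worth retrying."""
    status = getattr(exc, "status_code", None)
    if status is None:
        # No HTTP status available: retry only known connection-level failures.
        return isinstance(exc, connection_errors)
    # 429 is a rate limit; 5xx covers server-side and overload errors.
    return status == 429 or status >= 500
```

Client errors such as 400 or 401 are treated as permanent, since repeating the same request is unlikely to succeed.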
Best Practices for Production
1. Use System Prompts
Set the tone and constraints upfront.
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    system="You are a helpful assistant that always responds in rhyming couplets.",
    messages=[{"role": "user", "content": "Tell me about the weather."}]
)
2. Implement Retry Logic
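Use exponential backoff for transient failures. This example uses the third-party tenacity package (pip install tenacity) and retries only errors that are likely to be transient, so that bad requests and authentication failures surface immediately.
import anthropic
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential

TRANSIENT_ERRORS = (
    anthropic.RateLimitError,
    anthropic.APIConnectionError,
    anthropic.InternalServerError,
)

@retry(
    retry=retry_if_exception_type(TRANSIENT_ERRORS),
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=2, max=10),
)
def call_claude(prompt):
    return client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}]
    )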
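If you would rather avoid an extra dependency, the same idea can be sketched with the standard library alone. retry_with_backoff below is a hypothetical helper, not an SDK or tenacity function. It doubles the delay after each failure up to a cap, and its defaults are illustrative assumptions. It omits random jitter, which production code would normally add to avoid synchronized retries.

```python
import time


def retry_with_backoff(fn, attempts=3, base_delay=2.0, max_delay=10.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Call fn(), retrying on the given exceptions with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles each time (2s, 4s, 8s, ...) up to max_delay.
            sleep(min(max_delay, base_delay * 2 ** attempt))
```

A call would look like retry_with_backoff(lambda: call_api(prompt), retry_on=(SomeTransientError,)), where call_api and SomeTransientError stand for your own request function and the exception types you consider transient.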
3. Manage Token Usage
Track token consumption to control costs.
message = client.messages.create(...)
print(f"Input tokens: {message.usage.input_tokens}")
print(f"Output tokens: {message.usage.output_tokens}")
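To turn those per-response counts into a running total, something like the following sketch can help. UsageTracker is a hypothetical class. It takes per-million-token prices as arguments because actual pricing varies by model and changes over time; any numbers you pass in are your own assumptions, not values from this guide.

```python
from dataclasses import dataclass


@dataclass
class UsageTracker:
    """Running token totals, with a cost estimate from caller-supplied prices."""
    input_price_per_mtok: float   # price per million input tokens (you supply this)
    output_price_per_mtok: float  # price per million output tokens (you supply this)
    input_tokens: int = 0
    output_tokens: int = 0

    def record(self, usage):
        """Add one response's usage (an object with input_tokens and output_tokens)."""
        self.input_tokens += usage.input_tokens
        self.output_tokens += usage.output_tokens

    def estimated_cost(self):
        """Return the estimated spend in the same currency as the prices."""
        return (self.input_tokens * self.input_price_per_mtok
                + self.output_tokens * self.output_price_per_mtok) / 1_000_000
```

After each request you would call tracker.record(message.usage) and periodically log tracker.estimated_cost().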
4. Use Timeouts
Prevent hanging requests.
client = Anthropic(
    api_key=os.environ.get("ANTHROPIC_API_KEY"),
    timeout=30.0,  # seconds
    max_retries=2
)
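A client-wide timeout does not suit every call; a quick classification request may deserve a tighter limit than a long generation. The sketch below applies a per-request timeout using the Python SDK's with_options method, which returns a copy of the client with the overridden setting. create_with_timeout is a hypothetical wrapper, and the behavior of with_options is stated here as understood from the SDK's README, so verify it against the version you have installed.

```python
def create_with_timeout(client, timeout_seconds, **request):
    """Send one request with its own timeout, leaving the client default untouched."""
    return client.with_options(timeout=timeout_seconds).messages.create(**request)
```

For example, create_with_timeout(client, 5.0, model=..., max_tokens=..., messages=...) would use a five-second limit for that call only.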
Conclusion
Integrating Claude into your application is straightforward with the Anthropic API. You now know how to authenticate, send messages, handle streaming, and follow production best practices. The API is flexible enough for simple chatbots and complex multi-step agents alike.
Key Takeaways
- Authentication is simple: Use an API key stored in an environment variable, and install the anthropic Python or @anthropic-ai/sdk Node.js package.
- Messages are structured: Send an array of {role, content} objects. Include conversation history for multi-turn interactions.
- Streaming improves UX: Use client.messages.stream in Python or stream: true in TypeScript to receive tokens in real time, reducing perceived latency.
- Always handle errors: Implement retry logic with exponential backoff and catch specific API exceptions.
- Monitor token usage: Track input_tokens and output_tokens to optimize costs and stay within rate limits.