Getting Started with Claude API: From First Call to Advanced Features
A practical guide to building with the Claude API: making your first call, understanding the Messages API structure, choosing the right model, and exploring advanced features like extended thinking, vision, tool use, and structured outputs.
Claude, developed by Anthropic, offers developers two primary ways to integrate its powerful language models into applications: the Messages API for direct, fine-grained control, and Claude Managed Agents for pre-built, configurable agent harnesses. This guide focuses on the Messages API—the most flexible approach for custom agent loops, complex workflows, and production applications.
Whether you're building a coding assistant, a document analysis tool, or an autonomous agent, this guide will take you from your first API call to leveraging Claude's most advanced capabilities.
Prerequisites
Before you begin, ensure you have:
- An Anthropic account and an API key (get one from the Anthropic Console)
- Python 3.8+ or Node.js 18+ installed
- Basic familiarity with REST APIs and JSON
Step 1: Make Your First API Call
Let's start by setting up your environment and sending your first message to Claude.
Install the SDK
Python:

```bash
pip install anthropic
```

TypeScript/JavaScript:

```bash
npm install @anthropic-ai/sdk
```
Send Your First Message
Here's a minimal example that sends a prompt and prints Claude's response.
Python:

```python
import anthropic

client = anthropic.Anthropic(
    api_key="your-api-key-here"
)

message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude! What can you help me with today?"}
    ]
)

print(message.content[0].text)
```
TypeScript:

```typescript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  apiKey: 'your-api-key-here',
});

async function main() {
  const message = await client.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 1024,
    messages: [
      { role: 'user', content: 'Hello, Claude! What can you help me with today?' }
    ],
  });
  console.log(message.content[0].text);
}

main();
```
Tip: Always set max_tokens to control response length and avoid unexpected costs.
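In production, avoid hardcoding API keys. Both SDKs read the ANTHROPIC_API_KEY environment variable automatically, so you can omit the key from your code entirely:

```python
import anthropic

# Reads ANTHROPIC_API_KEY from the environment
client = anthropic.Anthropic()
```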
Step 2: Understand the Messages API
The Messages API is the core interface for interacting with Claude. Let's break down its structure.
Request Structure
A typical request includes:
- `model`: The Claude model identifier (e.g., `claude-sonnet-4-20250514`)
- `max_tokens`: The maximum number of tokens to generate in the response
- `messages`: An array of message objects, each with a `role` and `content`
- `system` (optional): A system prompt to set context or behavior
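Putting those fields together, a minimal request with a system prompt looks like this:

```python
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=512,
    system="You are a concise assistant.",
    messages=[
        {"role": "user", "content": "Explain the Messages API in one sentence."}
    ]
)
print(response.content[0].text)
```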
Multi-Turn Conversations
To maintain a conversation, include the full message history:
```python
messages = [
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."},
    {"role": "user", "content": "What is its population?"}
]

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=256,
    messages=messages
)
```
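The API is stateless: each request must carry the full history. A common pattern is to append Claude's reply to the list before adding the next user turn:

```python
# Carry the conversation forward by appending the assistant's reply
messages.append({"role": "assistant", "content": response.content[0].text})
messages.append({"role": "user", "content": "What language is spoken there?"})
```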
Stop Reasons
Every response includes a stop_reason field that tells you why Claude stopped generating:
"end_turn": Claude naturally finished its response"max_tokens": The response hit the token limit"stop_sequence": Claude encountered a custom stop sequence"tool_use": Claude wants to call a tool (more on this later)
Step 3: Choose the Right Model
Claude offers several models optimized for different use cases:
| Model | Best For | Key Strength |
|---|---|---|
| Claude Opus 4.7 | Complex reasoning, agentic coding | Deepest reasoning; a step-change over Opus 4.6 |
| Claude Sonnet 4.6 | Coding, agents, enterprise workflows | Frontier intelligence at scale |
| Claude Haiku 4.5 | Real-time applications, simple tasks | Fastest model with near-frontier intelligence |
Step 4: Explore Key Features
Extended Thinking
Enable Claude to reason step-by-step before responding, improving accuracy on complex tasks:
```python
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=2048,
    thinking={"type": "enabled", "budget_tokens": 1024},
    messages=[
        {"role": "user", "content": "Solve this math problem: 27 * 45 + 13"}
    ]
)
```
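With thinking enabled, the response content includes thinking blocks alongside the final answer, so iterate over blocks by type rather than assuming `content[0]` is text:

```python
for block in response.content:
    if block.type == "thinking":
        print("Reasoning:", block.thinking)
    elif block.type == "text":
        print("Answer:", block.text)
```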
Structured Outputs
Get responses in a specific JSON format for easier parsing. Structured outputs is a newer capability and its parameter shape has varied across SDK versions, so treat the snippet below as a sketch and confirm the details against the current API reference:
```python
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Extract the name, age, and city from: John is 28 and lives in New York."}
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "person_info",
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "integer"},
                    "city": {"type": "string"}
                },
                "required": ["name", "age", "city"]
            }
        }
    }
)
```
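Assuming the call succeeds and your SDK version supports this parameter, the text block contains JSON conforming to the schema and can be parsed directly:

```python
import json

# The text block holds JSON matching the requested schema
data = json.loads(response.content[0].text)
print(data["name"], data["age"], data["city"])
```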
Vision (Image Processing)
Claude can analyze images and extract information:
```python
import base64

with open("chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What does this chart show?"},
                {"type": "image", "source": {"type": "base64", "media_type": "image/png", "data": image_data}}
            ]
        }
    ]
)
```
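For images that are already hosted publicly, recent API versions also accept a URL source, which skips the base64 step; the URL below is a placeholder:

```python
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What does this chart show?"},
            # Hypothetical URL; replace with your hosted image
            {"type": "image", "source": {"type": "url", "url": "https://example.com/chart.png"}}
        ]
    }]
)
```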
Tool Use (Function Calling)
Claude can call external tools to fetch data, perform actions, or compute results:
```python
tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"}
            },
            "required": ["city"]
        }
    }
]

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools
)

# Check if Claude wants to use a tool
if response.stop_reason == "tool_use":
    tool_call = response.content[-1]
    print(f"Tool to call: {tool_call.name}")
    print(f"Arguments: {tool_call.input}")
```
Streaming Responses
For real-time applications, stream responses token by token:
```python
stream = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a short poem about AI."}],
    stream=True
)

for event in stream:
    if event.type == "content_block_delta":
        print(event.delta.text, end="", flush=True)
```
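The Python SDK also provides a higher-level streaming helper that handles the event types for you:

```python
# Context-manager helper: yields plain text deltas
with client.messages.stream(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a short poem about AI."}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
```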
Best Practices
1. Use System Prompts Effectively
Set the tone, constraints, and persona using the system parameter:
```python
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    system="You are a helpful coding assistant. Always provide code examples and explain your reasoning.",
    messages=[{"role": "user", "content": "How do I sort a list in Python?"}]
)
```
2. Implement Prompt Caching
Reduce costs and latency for repeated system prompts or large contexts:
```python
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "You are a legal document analyzer...",
            "cache_control": {"type": "ephemeral"}
        }
    ],
    messages=[...]
)
```
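You can confirm caching is working by inspecting the usage fields on the response: a cache write appears on the first call, and cache reads on subsequent calls with the same prefix:

```python
print(response.usage.cache_creation_input_tokens)  # tokens written to the cache
print(response.usage.cache_read_input_tokens)      # tokens served from the cache
```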
3. Handle Errors Gracefully
Always wrap API calls in try-except blocks. Catch the specific exception types before the `APIError` base class, otherwise the specific handlers will never run:

```python
try:
    response = client.messages.create(...)
except anthropic.RateLimitError as e:
    print(f"Rate limited: {e}")
except anthropic.APIConnectionError as e:
    print(f"Connection error: {e}")
except anthropic.APIError as e:
    print(f"API error: {e}")
```
Next Steps
Now that you have a solid foundation, explore these advanced topics:
- Batch Processing: Send multiple requests asynchronously for high throughput
- Prompt Caching: Optimize costs for repeated contexts
- Tool Combinations: Chain multiple tools for complex agent workflows
- Managed Agents: Use Anthropic's pre-built agent harness for long-running tasks
Key Takeaways
- Start with the Messages API for maximum flexibility and control over your Claude integrations.
- Choose the right model based on your task complexity: Haiku for speed, Sonnet for balance, Opus for deep reasoning.
- Leverage advanced features like extended thinking, structured outputs, and tool use to build powerful applications.
- Always set `max_tokens` and handle errors to ensure robust, cost-effective production deployments.
- Use streaming and prompt caching to optimize latency and cost for real-time and high-volume use cases.