Building a Text-to-SQL System with Claude: A Practical Guide
This guide walks you through building a Text-to-SQL system with Claude, from basic prompting to advanced techniques like chain-of-thought reasoning, RAG for large schemas, and self-improvement loops that let Claude fix its own SQL errors.
Imagine giving your non-technical team members the power to query your company's database just by asking a question in plain English. That's the promise of Text-to-SQL, and Claude makes it surprisingly achievable.
Text-to-SQL converts natural language queries into structured SQL statements. For organizations, this means:
- Accessibility: Non-technical users can explore data without learning SQL syntax
- Efficiency: Data analysts can prototype queries in seconds rather than minutes
- Integration: Chatbots and internal tools can offer database interactions through natural language
- Complexity: Claude handles joins, subqueries, and aggregations that would take humans much longer to write
Setup: Creating a Test Database
First, let's create a simple SQLite database with two tables: employees and departments. We'll use this throughout the guide.
```python
import sqlite3

conn = sqlite3.connect('company.db')
cursor = conn.cursor()

# Create the departments table
cursor.execute('''
CREATE TABLE departments (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    location TEXT
)
''')

# Create the employees table
cursor.execute('''
CREATE TABLE employees (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    department_id INTEGER,
    salary REAL,
    hire_date TEXT,
    FOREIGN KEY (department_id) REFERENCES departments(id)
)
''')

# Insert sample data
cursor.executemany('INSERT INTO departments VALUES (?, ?, ?)', [
    (1, 'Engineering', 'New York'),
    (2, 'Marketing', 'San Francisco'),
    (3, 'Sales', 'Chicago')
])
cursor.executemany('INSERT INTO employees VALUES (?, ?, ?, ?, ?)', [
    (1, 'Alice Johnson', 1, 95000, '2020-03-15'),
    (2, 'Bob Smith', 1, 85000, '2021-07-01'),
    (3, 'Charlie Brown', 2, 72000, '2019-11-20'),
    (4, 'Diana Prince', 3, 88000, '2022-01-10')
])
conn.commit()
```
Basic Text-to-SQL Prompting
A good prompt needs three things: clear instructions, the user's query, and the database schema. Here's a minimal working example:
```python
import anthropic

client = anthropic.Anthropic()

def generate_sql(user_query, schema):
    prompt = f"""You are a SQL expert. Convert the following natural language query into a SQL query for the given database schema.

Database Schema:
{schema}

User Query: {user_query}

Generate only the SQL query, nothing else."""
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=500,
        messages=[{"role": "user", "content": prompt}]
    )
    return response.content[0].text

# Example usage
schema = """
Table: departments (id INTEGER, name TEXT, location TEXT)
Table: employees (id INTEGER, name TEXT, department_id INTEGER, salary REAL, hire_date TEXT)
"""

query = "Show me all employees in Engineering who earn more than 90,000"
sql = generate_sql(query, schema)
print(sql)
# Example output:
# SELECT e.* FROM employees e
# JOIN departments d ON e.department_id = d.id
# WHERE d.name = 'Engineering' AND e.salary > 90000
```

Note that with only the schema in the prompt, Claude has to join on the department name rather than guess a numeric `department_id`.
Improving Accuracy with Few-Shot Examples
Basic prompts work, but adding examples (few-shot learning) dramatically improves consistency, especially for complex queries:
```python
def generate_prompt_with_examples(user_query, schema):
    examples = """
Example 1:
User: List all departments in San Francisco
SQL: SELECT * FROM departments WHERE location = 'San Francisco'

Example 2:
User: What is the average salary per department?
SQL: SELECT d.name, AVG(e.salary) AS avg_salary FROM departments d JOIN employees e ON d.id = e.department_id GROUP BY d.name

Example 3:
User: Who was hired most recently?
SQL: SELECT name, hire_date FROM employees ORDER BY hire_date DESC LIMIT 1
"""
    prompt = f"""You are a SQL expert. Convert natural language to SQL.

Database Schema:
{schema}

Here are some examples:
{examples}

Now convert this query:
User: {user_query}
SQL:"""
    return prompt
```
Pro tip: Including a few sample rows from your tables in the prompt can give Claude valuable context about data formats and values.
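One way to act on that tip is a small helper that appends a few rows from each table to the schema text. The function below is an illustrative sketch (the name `schema_with_samples` is ours, not part of any SDK), shown here against an in-memory SQLite table so it runs standalone:

```python
import sqlite3

def schema_with_samples(conn, tables, n_rows=3):
    """Build a schema description that includes a few sample rows per table."""
    parts = []
    for table in tables:
        cursor = conn.execute(f"SELECT * FROM {table} LIMIT {n_rows}")
        columns = [desc[0] for desc in cursor.description]
        parts.append(f"Table: {table} ({', '.join(columns)})")
        for row in cursor.fetchall():
            parts.append(f"  sample: {row}")
    return "\n".join(parts)

# Example with an in-memory database
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT, location TEXT)")
conn.execute("INSERT INTO departments VALUES (1, 'Engineering', 'New York')")
print(schema_with_samples(conn, ["departments"]))
```

The resulting string can be passed as the `schema` argument to any of the prompt builders in this guide.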
Chain-of-Thought Prompting for Complex Queries
For multi-step queries, chain-of-thought prompting helps Claude reason through the problem before generating SQL. Using XML tags makes the reasoning process explicit:
```python
def generate_sql_with_cot(user_query, schema):
    prompt = f"""You are a SQL expert. Convert the following query to SQL.

Database Schema:
{schema}

User Query: {user_query}

First, think through the problem step by step inside <reasoning> tags:
- Identify the tables needed
- Determine the columns to select
- Identify any joins, filters, or aggregations
- Consider the order of operations

Then write only the final SQL query inside <sql> tags."""
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1000,
        messages=[{"role": "user", "content": prompt}]
    )
    return response.content[0].text
```
This approach is particularly effective for queries involving multiple joins, subqueries, or conditional logic.
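Since the response now contains both the reasoning and the query, you will want to pull out just the SQL before executing it. A minimal extraction helper, assuming the model follows the tag format:

```python
import re

def extract_sql(response_text):
    """Return the contents of the first <sql>...</sql> block, or None if absent."""
    match = re.search(r"<sql>(.*?)</sql>", response_text, re.DOTALL)
    return match.group(1).strip() if match else None

example = "<reasoning>Join employees to departments.</reasoning>\n<sql>\nSELECT * FROM employees\n</sql>"
print(extract_sql(example))  # SELECT * FROM employees
```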
Handling Large Schemas with RAG
When your database has hundreds of tables, you can't include the entire schema in every prompt. Retrieval Augmented Generation (RAG) solves this by dynamically fetching only the relevant schema information.
Here's how to implement it using VoyageAI embeddings:
```python
import voyageai

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b)

class VectorDB:
    def __init__(self):
        self.client = voyageai.Client()
        self.embeddings = []
        self.metadata = []

    def add_table(self, table_name, columns, description):
        text = f"Table: {table_name}\nColumns: {columns}\nDescription: {description}"
        embedding = self.client.embed([text], model="voyage-2").embeddings[0]
        self.embeddings.append(embedding)
        self.metadata.append({"table": table_name, "columns": columns, "description": description})

    def query(self, user_query, top_k=3):
        query_embedding = self.client.embed([user_query], model="voyage-2").embeddings[0]
        # Rank tables by cosine similarity and return the top_k matches
        similarities = [cosine_similarity(query_embedding, emb) for emb in self.embeddings]
        top_indices = sorted(range(len(similarities)), key=lambda i: similarities[i], reverse=True)[:top_k]
        return [self.metadata[i] for i in top_indices]

def generate_prompt_with_rag(user_query, vector_db):
    relevant_tables = vector_db.query(user_query)
    schema_context = "\n\n".join(
        f"Table: {t['table']}\nColumns: {t['columns']}"
        for t in relevant_tables
    )
    prompt = f"""You are a SQL expert. Based on the following relevant schema, convert the user query to SQL.

Relevant Schema:
{schema_context}

User Query: {user_query}

SQL:"""
    return prompt
```
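To see the retrieval step in isolation, without calling an embedding API, here is a toy example with hand-made vectors. The numbers are invented purely to illustrate top-k selection by cosine similarity:

```python
def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b)

# Toy "embeddings" for three tables, plus a query vector about salaries.
# Real embeddings would come from a model like voyage-2 and have hundreds of dimensions.
table_embeddings = {
    "employees":   [0.9, 0.1, 0.0],
    "departments": [0.5, 0.5, 0.0],
    "invoices":    [0.0, 0.1, 0.9],
}
query_embedding = [0.8, 0.2, 0.0]

ranked = sorted(table_embeddings,
                key=lambda t: cosine_similarity(query_embedding, table_embeddings[t]),
                reverse=True)
print(ranked[:2])  # the two most relevant tables for this query
```

Here `employees` and `departments` rank highest, so only their schemas would go into the prompt.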
Self-Improvement Loop: Let Claude Fix Its Own Errors
One of Claude's most powerful capabilities is self-correction. You can build a loop where Claude executes its generated SQL, catches errors, and refines the query:
```python
def text_to_sql_with_self_improvement(user_query, schema, conn, max_iterations=3):
    prompt = f"Convert this query to SQL:\nSchema: {schema}\nQuery: {user_query}\nGenerate only the SQL query, nothing else."
    for i in range(max_iterations):
        response = client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=500,
            messages=[{"role": "user", "content": prompt}]
        )
        sql = response.content[0].text.strip()
        try:
            cursor = conn.cursor()
            cursor.execute(sql)
            results = cursor.fetchall()
            return sql, results  # Success!
        except Exception as e:
            error_msg = str(e)
            prompt = f"""The previous SQL query failed with error: {error_msg}

Original query: {user_query}
Previous SQL: {sql}

Please fix the SQL query and try again. Generate only the SQL query, nothing else."""
    return None, "Failed after max iterations"
```
This approach handles:
- Syntax errors (missing commas, wrong keywords)
- Schema mismatches (wrong column names)
- Logic errors (incorrect joins)
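Because this loop executes model-generated SQL directly against your database, it is worth adding a guard step before `cursor.execute`. The sketch below (a helper of our own, not part of any library) strips the markdown fences Claude sometimes wraps around code and rejects anything other than a single SELECT statement; adapt the policy to your environment:

```python
def clean_and_check_sql(raw):
    """Strip markdown fences and allow only a single SELECT statement."""
    sql = raw.strip()
    if sql.startswith("```"):
        # Drop the opening fence (possibly "```sql") and the closing fence
        lines = [line for line in sql.splitlines() if not line.startswith("```")]
        sql = "\n".join(lines).strip()
    sql = sql.rstrip(";")
    if not sql.lower().startswith("select"):
        raise ValueError("Only SELECT statements are allowed")
    if ";" in sql:
        raise ValueError("Multiple statements are not allowed")
    return sql

print(clean_and_check_sql("```sql\nSELECT * FROM employees;\n```"))
```

For stronger isolation, you can also open the SQLite connection in read-only mode (`sqlite3.connect("file:company.db?mode=ro", uri=True)`).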
Evaluating Your Text-to-SQL System
To ensure quality, build an evaluation pipeline:
- Create a test set: 20-50 natural language queries with expected SQL outputs
- Measure accuracy: Compare generated SQL to expected SQL (exact match or execution match)
- Track error types: Schema errors, logic errors, syntax errors
- Iterate: Use failures to improve your prompts or add more examples
```python
def evaluate(test_cases, schema, conn):
    correct = 0
    for case in test_cases:
        generated_sql, _ = text_to_sql_with_self_improvement(case["query"], schema, conn)
        if generated_sql and execute_and_compare(generated_sql, case["expected_sql"], conn):
            correct += 1
    return correct / len(test_cases)
```
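The evaluation code above assumes an `execute_and_compare` helper. One reasonable implementation is execution match: run both queries and compare their result sets, ignoring row order. A sketch:

```python
import sqlite3

def execute_and_compare(generated_sql, expected_sql, conn):
    """Execution match: True if both queries return the same rows (order-insensitive)."""
    try:
        generated_rows = conn.execute(generated_sql).fetchall()
        expected_rows = conn.execute(expected_sql).fetchall()
    except sqlite3.Error:
        return False
    return sorted(generated_rows) == sorted(expected_rows)

# Example against an in-memory table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,)])
print(execute_and_compare("SELECT x FROM t", "SELECT x FROM t ORDER BY x DESC", conn))  # True
```

Execution match is more forgiving than exact string match: two syntactically different queries that return the same rows both count as correct, which is usually what you want.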
Key Takeaways
- Start simple, then layer: Begin with basic prompting, then add few-shot examples, chain-of-thought, and finally RAG as your schema grows
- Self-improvement is a game-changer: Letting Claude execute and fix its own SQL dramatically reduces error rates without manual intervention
- RAG makes large schemas manageable: Dynamically retrieving relevant schema information keeps prompts focused and accurate even with hundreds of tables
- Always include schema context: Claude needs to understand your database structure to generate correct SQL—never omit it
- Evaluate continuously: Build a test set early and measure performance after each improvement to ensure you're actually making progress