Guide2026-05-06

Building a Robust Text-to-SQL System with Claude: A Practical Guide

Learn how to build a production-ready Text-to-SQL system using Claude. Covers prompting, chain-of-thought, RAG for large schemas, and self-improvement loops.

Quick Answer

This guide walks you through building a Text-to-SQL system with Claude, from basic prompting to advanced techniques like chain-of-thought reasoning, RAG for large schemas, and self-improvement loops that let Claude fix its own SQL errors.

Text-to-SQLClaude APIRAGPrompt EngineeringSQL Generation

Building a Robust Text-to-SQL System with Claude: A Practical Guide

Text-to-SQL is one of the most powerful applications of large language models in the enterprise. It bridges the gap between natural language and structured data, allowing anyone in your organization to query databases without knowing SQL syntax. Claude excels at this task thanks to its strong reasoning capabilities and ability to understand complex context.

In this guide, you'll learn how to build a production-ready Text-to-SQL system using Claude. We'll start with a basic prompt and progressively add sophistication: few-shot examples, chain-of-thought reasoning, Retrieval Augmented Generation (RAG) for large schemas, and a self-improvement loop that lets Claude fix its own mistakes.

Why Text-to-SQL Matters

Before diving into the code, let's understand why this capability is so valuable:

Democratize data access: Non-technical team members can ask questions like "What were our top-selling products last quarter?" without waiting for a data analyst.
Boost analyst productivity: Data professionals can prototype complex queries in seconds instead of minutes.
Enable conversational interfaces: Build chatbots and virtual assistants that can answer data-driven questions in real time.
Handle complexity with ease: Claude can generate multi-join queries, nested subqueries, and aggregations that would take humans significant time to write correctly.

Setting Up Your Environment

First, let's create a test SQLite database with two tables: employees and departments. This will be our sandbox for the entire guide.

import sqlite3
Create an in-memory database
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()
Create departments table
cursor.execute('''
CREATE TABLE departments (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    location TEXT
)
''')
Create employees table
cursor.execute('''
CREATE TABLE employees (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    department_id INTEGER,
    salary REAL,
    hire_date DATE,
    FOREIGN KEY (department_id) REFERENCES departments(id)
)
''')
Insert sample data
cursor.executemany('INSERT INTO departments VALUES (?, ?, ?)', [
    (1, 'Engineering', 'New York'),
    (2, 'Sales', 'San Francisco'),
    (3, 'Marketing', 'Chicago')
])
cursor.executemany('INSERT INTO employees VALUES (?, ?, ?, ?, ?)', [
    (1, 'Alice', 1, 95000, '2020-03-15'),
    (2, 'Bob', 2, 72000, '2019-07-22'),
    (3, 'Charlie', 1, 88000, '2021-01-10'),
    (4, 'Diana', 3, 65000, '2022-11-01'),
    (5, 'Eve', 2, 78000, '2020-06-30')
])
conn.commit()

Step 1: The Basic Text-to-SQL Prompt

A good prompt needs three things: clear instructions, the user's query, and the database schema. Here's a minimal implementation:

import anthropic
client = anthropic.Anthropic()
def get_schema():
    cursor.execute("SELECT sql FROM sqlite_master WHERE type='table'")
    return '\n'.join(row[0] for row in cursor.fetchall())
def generate_sql(user_query):
    schema = get_schema()
    prompt = f"""You are a SQL expert. Given the following database schema, convert the user's natural language query into a valid SQL query.
Database schema:
{schema}
User query: {user_query}
Generate only the SQL query, no explanation."""
    
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=500,
        messages=[{"role": "user", "content": prompt}]
    )
    return response.content[0].text
Test it
query = "Show me all employees in the Engineering department"
sql = generate_sql(query)
print(sql)
Output: SELECT * FROM employees WHERE department_id = 1;

This works for simple queries, but it has limitations. Let's improve it.

Step 2: Few-Shot Prompting for Better Accuracy

Adding examples (few-shot learning) helps Claude understand the expected output format and reasoning pattern:

def generate_prompt_with_examples(user_query):
    schema = get_schema()
    examples = """
Examples:
User: List all departments
SQL: SELECT * FROM departments;
User: What is the average salary of employees?
SQL: SELECT AVG(salary) FROM employees;
User: Show me employees hired after 2020 along with their department names
SQL: SELECT e.name, e.hire_date, d.name FROM employees e JOIN departments d ON e.department_id = d.id WHERE e.hire_date > '2020-12-31';
"""
    
    prompt = f"""You are a SQL expert. Convert natural language to SQL.
Database schema:
{schema}
{examples}
User: {user_query}
SQL:"""
    return prompt

Pro tip: Include a few rows of real data alongside the schema. This gives Claude context about data types and values, which improves accuracy for queries involving specific values or ranges.

Step 3: Chain-of-Thought Prompting for Complex Queries

For multi-step queries, chain-of-thought prompting helps Claude reason through the problem before generating SQL. Using XML tags makes the reasoning process explicit:

def generate_sql_cot(user_query):
    schema = get_schema()
    prompt = f"""You are a SQL expert. Convert the user's query to SQL.
Database schema:
{schema}
User query: {user_query}
<reasoning>
Break down the query into steps:
Identify the tables needed
Determine the columns to select
Identify any joins, filters, or aggregations
Construct the final SQL
</reasoning>
<sql>
Generate only the SQL query here.
</sql>"""
    
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1000,
        messages=[{"role": "user", "content": prompt}]
    )
    return response.content[0].text

This approach significantly improves accuracy on complex queries involving multiple joins, subqueries, or window functions.

Step 4: RAG for Large Database Schemas

When your database has hundreds of tables, you can't fit the entire schema in a prompt. Retrieval Augmented Generation (RAG) solves this by dynamically fetching only the relevant schema information.

First, set up a vector database to store table and column descriptions:

import voyageai
vo = voyageai.Client()
class SchemaVectorDB:
    def __init__(self):
        self.vectors = []
        self.metadata = []
    
    def add_table(self, table_name, columns, description=""):
        text = f"Table: {table_name}. Columns: {', '.join(columns)}. {description}"
        embedding = vo.embed([text], model="voyage-2").embeddings[0]
        self.vectors.append(embedding)
        self.metadata.append({"table": table_name, "columns": columns})
    
    def query(self, user_query, top_k=3):
        query_embedding = vo.embed([user_query], model="voyage-2").embeddings[0]
        # Compute cosine similarity and return top_k results
        # (simplified for brevity - use your preferred vector DB)
        return self.metadata[:top_k]
Initialize with our schema
schema_db = SchemaVectorDB()
schema_db.add_table("employees", ["id", "name", "department_id", "salary", "hire_date"])
schema_db.add_table("departments", ["id", "name", "location"])

Now modify your prompt to use only the relevant schema:

def generate_sql_with_rag(user_query):
    relevant_schema = schema_db.query(user_query)
    schema_text = "\n".join(
        [f"Table {m['table']}: {', '.join(m['columns'])}" for m in relevant_schema]
    )
    
    prompt = f"""Using only the following relevant tables, convert the query to SQL.
Relevant schema:
{schema_text}
User query: {user_query}
SQL:"""
    
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=500,
        messages=[{"role": "user", "content": prompt}]
    )
    return response.content[0].text

Step 5: Self-Improvement Loop

One of Claude's most powerful capabilities is the ability to execute SQL, analyze results, and fix its own mistakes. Here's a self-improvement loop:

def execute_and_improve(user_query, max_iterations=3):
    for i in range(max_iterations):
        # Generate SQL
        sql = generate_sql(user_query)
        print(f"Attempt {i+1}: Generated SQL: {sql}")
        
        try:
            cursor.execute(sql)
            results = cursor.fetchall()
            
            # Check if results make sense
            validation_prompt = f"""The user asked: "{user_query}"
The generated SQL was: {sql}
The results were: {results[:5]}
Does this SQL correctly answer the user's question? If not, explain what's wrong."""
            
            validation = client.messages.create(
                model="claude-3-5-sonnet-20241022",
                max_tokens=300,
                messages=[{"role": "user", "content": validation_prompt}]
            )
            
            if "correct" in validation.content[0].text.lower():
                return sql, results
            else:
                # Feed the error back into the next iteration
                user_query = f"{user_query} (Previous attempt failed: {validation.content[0].text})"
                
        except Exception as e:
            # SQL execution error - let Claude fix it
            user_query = f"{user_query} (Previous SQL: {sql} produced error: {str(e)})"
    
    return None, "Failed after max iterations"

This loop handles:

Syntax errors: Claude can fix typos or missing commas
Logical errors: If the results don't match the query intent, Claude adjusts the SQL
Schema mismatches: If a table or column doesn't exist, Claude finds the correct one

Key Takeaways

Start simple, iterate fast: Begin with a basic prompt and schema, then layer on techniques like few-shot examples and chain-of-thought as needed.
RAG is essential for scale: For databases with dozens or hundreds of tables, use vector search to retrieve only the relevant schema information for each query.
Chain-of-thought improves complex queries: Breaking down the reasoning process helps Claude handle multi-join and aggregation queries more accurately.
Self-improvement loops catch errors: Let Claude execute its own SQL, analyze results, and fix mistakes automatically. This dramatically improves reliability in production.
Include sample data in prompts: A few rows of real data alongside the schema helps Claude understand data types, formats, and value distributions.

Next Steps

Now that you have a working Text-to-SQL system, consider these enhancements:

Add query validation: Check for dangerous operations (DROP, DELETE without WHERE) before execution
Implement caching: Store common query patterns to reduce API calls
Build a feedback loop: Let users rate query quality and use that data to fine-tune your prompts
Connect to real databases: Replace SQLite with PostgreSQL or MySQL for production use

Claude's ability to understand natural language and generate accurate SQL makes it an ideal foundation for building data access tools that anyone in your organization can use.