BeClaude
GuideBeginnerBest Practices2026-05-12

Building a Robust Text-to-SQL System with Claude: A Practical Guide

Learn to build a production-ready Text-to-SQL system using Claude. Covers prompting, RAG for complex schemas, chain-of-thought reasoning, and self-improvement loops.

Quick Answer

This guide walks you through building a Text-to-SQL system with Claude, covering basic prompting, few-shot learning, chain-of-thought reasoning, RAG for large schemas, and a self-improvement loop that lets Claude fix its own SQL errors.

Text-to-SQLPrompt EngineeringRAGDatabaseClaude API

Building a Robust Text-to-SQL System with Claude: A Practical Guide

Text-to-SQL is one of the most impactful applications of large language models in enterprise settings. By converting natural language questions into structured SQL queries, Claude can democratize data access across your organization—letting non-technical stakeholders query databases without writing a single SELECT statement.

In this guide, you'll learn how to build a production-ready Text-to-SQL system using Claude. We'll start with basic prompting, then progressively layer in advanced techniques: few-shot learning, chain-of-thought reasoning, Retrieval Augmented Generation (RAG) for complex schemas, and a self-improvement loop that lets Claude fix its own mistakes.

Why Text-to-SQL Matters

Before diving into the code, let's understand why this capability is so valuable:

  • Accessibility: Business analysts, product managers, and executives can query databases directly without SQL knowledge.
  • Efficiency: Data teams can prototype queries 10x faster using natural language.
  • Integration: Chatbots and internal tools can offer natural language database interfaces.
  • Complexity: Claude handles multi-table joins, subqueries, and aggregations with ease.

Prerequisites

To follow along, you'll need:

  • An Anthropic API key (sign up at console.anthropic.com)
  • Python 3.8+ installed
  • Basic familiarity with SQL and Python
Install the required packages:
pip install anthropic sqlite3 voyageai

Setting Up a Test Database

Let's create a simple SQLite database with two tables: employees and departments. This will serve as our playground throughout the guide.

import sqlite3

Create in-memory database

conn = sqlite3.connect(':memory:') cursor = conn.cursor()

Create departments table

cursor.execute(''' CREATE TABLE departments ( id INTEGER PRIMARY KEY, name TEXT NOT NULL, location TEXT ) ''')

Create employees table

cursor.execute(''' CREATE TABLE employees ( id INTEGER PRIMARY KEY, name TEXT NOT NULL, department_id INTEGER, salary REAL, hire_date DATE, FOREIGN KEY (department_id) REFERENCES departments(id) ) ''')

Insert sample data

cursor.executemany('INSERT INTO departments VALUES (?, ?, ?)', [ (1, 'Engineering', 'New York'), (2, 'Marketing', 'San Francisco'), (3, 'Sales', 'Chicago') ])

cursor.executemany('INSERT INTO employees VALUES (?, ?, ?, ?, ?)', [ (1, 'Alice Johnson', 1, 95000, '2020-03-15'), (2, 'Bob Smith', 1, 85000, '2021-06-01'), (3, 'Charlie Brown', 2, 72000, '2019-11-20'), (4, 'Diana Prince', 3, 88000, '2022-01-10'), (5, 'Eve Davis', 2, 65000, '2023-04-05') ])

conn.commit()

Step 1: Basic Text-to-SQL Prompt

A good Text-to-SQL prompt needs three things: clear instructions, the user's question, and the database schema. Let's start simple.

import anthropic

client = anthropic.Anthropic(api_key="your-api-key")

def generate_sql_basic(user_query, schema): prompt = f"""You are a SQL expert. Convert the following natural language query into a SQL query for the given database schema.

Database Schema: {schema}

User Query: {user_query}

Generate only the SQL query, nothing else.""" response = client.messages.create( model="claude-3-sonnet-20241022", max_tokens=500, messages=[{"role": "user", "content": prompt}] ) return response.content[0].text

Test it

schema = """ Table: departments (id, name, location) Table: employees (id, name, department_id, salary, hire_date) """

query = "Show me all employees in the Engineering department" sql = generate_sql_basic(query, schema) print(sql)

Output: SELECT * FROM employees WHERE department_id = 1;

Step 2: Improving with Few-Shot Examples

Basic prompting works, but including examples (few-shot learning) dramatically improves accuracy, especially for edge cases.

def generate_sql_fewshot(user_query, schema):
    examples = """
Example 1:
User: List all departments in San Francisco
SQL: SELECT * FROM departments WHERE location = 'San Francisco';

Example 2: User: What is the average salary by department? SQL: SELECT d.name, AVG(e.salary) as avg_salary FROM employees e JOIN departments d ON e.department_id = d.id GROUP BY d.name;

Example 3: User: Find employees hired after 2021 with salary above 80000 SQL: SELECT * FROM employees WHERE hire_date > '2021-01-01' AND salary > 80000; """ prompt = f"""You are a SQL expert. Convert natural language to SQL.

Schema: {schema}

Here are some examples: {examples}

User Query: {user_query}

SQL:""" response = client.messages.create( model="claude-3-sonnet-20241022", max_tokens=500, messages=[{"role": "user", "content": prompt}] ) return response.content[0].text

Pro tip: Including a few rows of actual data in your prompt can further improve accuracy by giving Claude concrete context about values and formats.

Step 3: Chain-of-Thought Prompting

For complex queries involving multiple joins or aggregations, chain-of-thought (CoT) prompting helps Claude reason step-by-step before generating SQL.

def generate_sql_cot(user_query, schema):
    prompt = f"""You are a SQL expert. Convert the user's query to SQL by reasoning step-by-step.

Schema: {schema}

User Query: {user_query}

Let's think through this step by step: 1) First, identify the tables needed 2) Determine the columns to select 3) Identify any JOIN conditions 4) Apply WHERE filters 5) Handle GROUP BY, ORDER BY, or aggregations

Now, write the SQL query: <sql> """ response = client.messages.create( model="claude-3-sonnet-20241022", max_tokens=800, messages=[{"role": "user", "content": prompt}] ) return response.content[0].text

Using XML tags like <sql> helps Claude structure its output cleanly, making parsing easier in production.

Step 4: RAG for Complex Database Schemas

Real-world databases often have hundreds of tables. Including the entire schema in every prompt is wasteful and can confuse Claude. Instead, use Retrieval Augmented Generation (RAG) to dynamically fetch only the relevant schema information.

First, set up a vector database to store table and column descriptions:

import voyageai

vo = voyageai.Client(api_key="your-voyage-api-key")

class SchemaVectorDB: def __init__(self): self.embeddings = [] self.metadata = [] def add_table(self, table_name, columns, description=""): text = f"Table: {table_name}\nColumns: {', '.join(columns)}\nDescription: {description}" embedding = vo.embed([text], model="voyage-2").embeddings[0] self.embeddings.append(embedding) self.metadata.append({"table": table_name, "columns": columns, "text": text}) def query(self, user_query, top_k=3): query_embedding = vo.embed([user_query], model="voyage-2").embeddings[0] # Simple cosine similarity (use a proper vector DB in production) similarities = [ sum(a*b for a,b in zip(query_embedding, emb)) for emb in self.embeddings ] top_indices = sorted(range(len(similarities)), key=lambda i: similarities[i], reverse=True)[:top_k] return [self.metadata[i]["text"] for i in top_indices]

Populate the vector DB

schema_db = SchemaVectorDB() schema_db.add_table("departments", ["id", "name", "location"], "Company departments") schema_db.add_table("employees", ["id", "name", "department_id", "salary", "hire_date"], "Employee records")

Now, use RAG to build dynamic prompts:

def generate_sql_rag(user_query):
    relevant_schemas = schema_db.query(user_query)
    schema_text = "\n\n".join(relevant_schemas)
    
    prompt = f"""You are a SQL expert. Based on the relevant schema below, convert the user query to SQL.

Relevant Schema: {schema_text}

User Query: {user_query}

SQL:""" response = client.messages.create( model="claude-3-sonnet-20241022", max_tokens=500, messages=[{"role": "user", "content": prompt}] ) return response.content[0].text

This approach scales to databases with hundreds of tables while keeping prompts lean and focused.

Step 5: Self-Improvement Loop

Even with great prompting, Claude might generate SQL that fails due to syntax errors or logic issues. A self-improvement loop lets Claude execute the SQL, catch errors, and fix them automatically.

def generate_sql_with_self_improvement(user_query, schema, max_attempts=3):
    for attempt in range(max_attempts):
        # Generate SQL
        sql = generate_sql_cot(user_query, schema)
        
        # Try to execute
        try:
            cursor.execute(sql)
            results = cursor.fetchall()
            
            # Optional: Validate results make sense
            validation_prompt = f"""The SQL query: {sql}
Returned these results: {results[:5]}...
Does this answer the user's question: {user_query}?
Answer YES or NO and explain why."""
            
            validation = client.messages.create(
                model="claude-3-sonnet-20241022",
                max_tokens=200,
                messages=[{"role": "user", "content": validation_prompt}]
            )
            
            if "YES" in validation.content[0].text:
                return sql, results
            else:
                # Feed the validation back for improvement
                user_query += f"\n\nPrevious attempt returned: {results}. Please fix."
                
        except Exception as e:
            error_msg = str(e)
            # Feed the error back to Claude
            user_query += f"\n\nPrevious SQL failed with error: {error_msg}. Please fix."
    
    return None, "Failed after max attempts"

This loop handles:

  • Syntax errors: Invalid SQL syntax gets caught and fixed
  • Logic errors: Wrong joins or filters get corrected
  • Edge cases: Unexpected data patterns are handled gracefully

Evaluation: Measuring Success

To ensure your Text-to-SQL system is reliable, implement these evaluation metrics:

  • Execution Accuracy: Does the SQL run without errors?
  • Result Correctness: Do the results match expected outputs?
  • Schema Relevance: Did RAG retrieve the right tables?
  • Latency: How fast is end-to-end generation?
Create a test suite with known query-result pairs and run it after every prompt change.
test_cases = [
    {
        "query": "List all employees in Engineering",
        "expected_sql": "SELECT * FROM employees WHERE department_id = 1",
        "expected_rows": 2
    },
    {
        "query": "Average salary by department",
        "expected_sql_contains": ["AVG", "GROUP BY"],
        "expected_rows": 3
    }
]

Next Steps

Your Text-to-SQL system is now production-ready. Here's how to extend it:

  • Add database-specific dialects: Modify prompts for PostgreSQL, MySQL, or BigQuery
  • Implement query caching: Store frequent query patterns to reduce API calls
  • Add security guardrails: Prevent destructive SQL (DROP, DELETE, UPDATE) for read-only users
  • Build a UI: Create a simple chat interface using Streamlit or Gradio

Key Takeaways

  • Start simple, then layer complexity: Begin with basic prompting, then add few-shot examples, chain-of-thought, RAG, and self-improvement loops as needed.
  • RAG is essential for real-world databases: Don't dump your entire schema into every prompt—retrieve only what's relevant.
  • Self-improvement loops dramatically boost reliability: Let Claude catch and fix its own SQL errors automatically.
  • Always validate outputs: Execute generated SQL in a safe environment and verify results before presenting them to users.
  • Invest in evaluation: A robust test suite is the foundation of a trustworthy Text-to-SQL system.