2026-04-25

Building a Text-to-SQL System with Claude: A Practical Guide

Learn how to build a robust Text-to-SQL system using Claude. Covers prompting, chain-of-thought, RAG for large schemas, and self-improvement loops with code examples.

Quick Answer

This guide walks you through building a Text-to-SQL system with Claude, from basic prompting to advanced techniques like chain-of-thought reasoning, RAG for large schemas, and self-improvement loops that let Claude fix its own SQL errors.

Text-to-SQL · Claude API · RAG · Prompt Engineering · SQL Generation


Imagine giving your non-technical team members the power to query your company's database just by asking a question in plain English. That's the promise of Text-to-SQL, and Claude makes it surprisingly achievable.

Text-to-SQL converts natural language queries into structured SQL statements. For organizations, this means:

  • Accessibility: Non-technical users can explore data without learning SQL syntax
  • Efficiency: Data analysts can prototype queries in seconds rather than minutes
  • Integration: Chatbots and internal tools can offer database interactions through natural language
  • Complexity handling: Claude can write joins, subqueries, and aggregations that would take humans much longer to produce
This guide will show you how to build a production-ready Text-to-SQL system using Claude, covering everything from basic prompting to advanced techniques like RAG and self-improvement loops.

Setup: Creating a Test Database

First, let's create a simple SQLite database with two tables: employees and departments. We'll use this throughout the guide.

import sqlite3

conn = sqlite3.connect('company.db')
cursor = conn.cursor()

# Create departments table
cursor.execute('''
CREATE TABLE departments (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    location TEXT
)
''')

# Create employees table
cursor.execute('''
CREATE TABLE employees (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    department_id INTEGER,
    salary REAL,
    hire_date TEXT,
    FOREIGN KEY (department_id) REFERENCES departments(id)
)
''')

# Insert sample data
cursor.executemany('INSERT INTO departments VALUES (?, ?, ?)', [
    (1, 'Engineering', 'New York'),
    (2, 'Marketing', 'San Francisco'),
    (3, 'Sales', 'Chicago')
])

cursor.executemany('INSERT INTO employees VALUES (?, ?, ?, ?, ?)', [
    (1, 'Alice Johnson', 1, 95000, '2020-03-15'),
    (2, 'Bob Smith', 1, 85000, '2021-07-01'),
    (3, 'Charlie Brown', 2, 72000, '2019-11-20'),
    (4, 'Diana Prince', 3, 88000, '2022-01-10')
])

conn.commit()

Basic Text-to-SQL Prompting

A good prompt needs three things: clear instructions, the user's query, and the database schema. Here's a minimal working example:

import anthropic

client = anthropic.Anthropic()

def generate_sql(user_query, schema):
    prompt = f"""You are a SQL expert. Convert the following natural language query into a SQL query for the given database schema.

Database Schema:
{schema}

User Query: {user_query}

Generate only the SQL query, nothing else."""
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=500,
        messages=[{"role": "user", "content": prompt}]
    )
    return response.content[0].text

# Example usage
schema = """
Table: departments (id INTEGER, name TEXT, location TEXT)
Table: employees (id INTEGER, name TEXT, department_id INTEGER, salary REAL, hire_date TEXT)
"""

query = "Show me all employees in Engineering who earn more than 90,000"
sql = generate_sql(query, schema)
print(sql)

Output: SELECT e.* FROM employees e JOIN departments d ON e.department_id = d.id WHERE d.name = 'Engineering' AND e.salary > 90000
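In practice, models sometimes wrap the query in markdown fences even when told not to. A small post-processing helper makes the pipeline robust to that (a sketch; `extract_sql` is our own name, not part of the Anthropic SDK):

```python
def extract_sql(raw: str) -> str:
    """Strip markdown code fences that the model sometimes adds."""
    text = raw.strip()
    if text.startswith("```"):
        lines = text.splitlines()
        # Drop the opening fence (and its optional language tag)
        lines = lines[1:]
        # Drop the closing fence if present
        if lines and lines[-1].strip() == "```":
            lines = lines[:-1]
        text = "\n".join(lines)
    return text.strip()

print(extract_sql("```sql\nSELECT * FROM employees\n```"))  # → SELECT * FROM employees
```

Run every model response through a helper like this before handing the string to your database driver.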

Improving Accuracy with Few-Shot Examples

Basic prompts work, but adding examples (few-shot learning) dramatically improves consistency, especially for complex queries:

def generate_prompt_with_examples(user_query, schema):
    examples = """
Example 1:
User: List all departments in San Francisco
SQL: SELECT * FROM departments WHERE location = 'San Francisco'

Example 2:
User: What is the average salary per department?
SQL: SELECT d.name, AVG(e.salary) AS avg_salary FROM departments d JOIN employees e ON d.id = e.department_id GROUP BY d.name

Example 3:
User: Who was hired most recently?
SQL: SELECT name, hire_date FROM employees ORDER BY hire_date DESC LIMIT 1
"""
    prompt = f"""You are a SQL expert. Convert natural language to SQL.

Database Schema:
{schema}

Here are some examples:
{examples}

Now convert this query:
User: {user_query}
SQL:"""
    return prompt

Pro tip: Including a few sample rows from your tables in the prompt can give Claude valuable context about data formats and values.
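One way to implement that tip is to pull a few rows straight from SQLite and format them for the prompt (a sketch; `sample_rows` is a hypothetical helper, not part of any library):

```python
import sqlite3

def sample_rows(conn, table, limit=3):
    """Fetch a few rows and format them as a small text table for the prompt."""
    cursor = conn.cursor()
    # NOTE: table comes from our own schema metadata, never from user input
    cursor.execute(f"SELECT * FROM {table} LIMIT {limit}")
    cols = [d[0] for d in cursor.description]
    lines = [" | ".join(cols)]
    lines += [" | ".join(str(v) for v in row) for row in cursor.fetchall()]
    return f"Sample rows from {table}:\n" + "\n".join(lines)
```

Append the returned string to the schema section of your prompt so Claude sees real value formats (date strings, salary magnitudes, location spellings).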

Chain-of-Thought Prompting for Complex Queries

For multi-step queries, chain-of-thought prompting helps Claude reason through the problem before generating SQL. Using XML tags makes the reasoning process explicit:

def generate_sql_with_cot(user_query, schema):
    prompt = f"""You are a SQL expert. Convert the following query to SQL.

Database Schema: {schema}

User Query: {user_query}

<reasoning> Break down the query step by step:

  • Identify the tables needed
  • Determine the columns to select
  • Identify any joins, filters, or aggregations
  • Consider the order of operations
</reasoning>

<sql>
Write only the final SQL query here
</sql>"""
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1000,
        messages=[{"role": "user", "content": prompt}]
    )
    return response.content[0].text

This approach is particularly effective for queries involving multiple joins, subqueries, or conditional logic.
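Because the response now contains reasoning as well as SQL, you need to extract just the <sql> block before executing anything. A minimal parser (our own helper, assuming the tag format shown above) might look like:

```python
import re

def extract_tagged_sql(response_text):
    """Pull the final query out of the <sql>...</sql> block, ignoring the reasoning."""
    match = re.search(r"<sql>(.*?)</sql>", response_text, re.DOTALL)
    return match.group(1).strip() if match else response_text.strip()
```

Falling back to the whole response when the tag is missing keeps the pipeline from crashing on an occasional malformed reply.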

Handling Large Schemas with RAG

When your database has hundreds of tables, you can't include the entire schema in every prompt. Retrieval Augmented Generation (RAG) solves this by dynamically fetching only the relevant schema information.

Here's how to implement it using VoyageAI embeddings:

import voyageai

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(x * x for x in b) ** 0.5
    return dot / (norm_a * norm_b)

class VectorDB:
    def __init__(self):
        self.client = voyageai.Client()
        self.embeddings = []
        self.metadata = []

    def add_table(self, table_name, columns, description):
        text = f"Table: {table_name}\nColumns: {columns}\nDescription: {description}"
        embedding = self.client.embed([text], model="voyage-2").embeddings[0]
        self.embeddings.append(embedding)
        self.metadata.append({"table": table_name, "columns": columns, "description": description})

    def query(self, user_query, top_k=3):
        query_embedding = self.client.embed([user_query], model="voyage-2").embeddings[0]
        # Calculate cosine similarity and return the top_k most relevant tables
        similarities = [cosine_similarity(query_embedding, emb) for emb in self.embeddings]
        top_indices = sorted(range(len(similarities)), key=lambda i: similarities[i], reverse=True)[:top_k]
        return [self.metadata[i] for i in top_indices]

def generate_prompt_with_rag(user_query, vector_db):
    relevant_tables = vector_db.query(user_query)
    schema_context = "\n\n".join([
        f"Table: {t['table']}\nColumns: {t['columns']}"
        for t in relevant_tables
    ])
    prompt = f"""You are a SQL expert. Based on the following relevant schema, convert the user query to SQL.

Relevant Schema:
{schema_context}

User Query: {user_query}

SQL:"""
    return prompt

Self-Improvement Loop: Let Claude Fix Its Own Errors

One of Claude's most powerful capabilities is self-correction. You can build a loop where Claude executes its generated SQL, catches errors, and refines the query:

def text_to_sql_with_self_improvement(user_query, schema, conn, max_iterations=3):
    prompt = f"Convert this query to SQL:\nSchema: {schema}\nQuery: {user_query}"
    
    for i in range(max_iterations):
        response = client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=500,
            messages=[{"role": "user", "content": prompt}]
        )
        sql = response.content[0].text
        
        try:
            cursor = conn.cursor()
            cursor.execute(sql)
            results = cursor.fetchall()
            return sql, results  # Success!
        except Exception as e:
            error_msg = str(e)
            prompt = f"""The previous SQL query failed with error: {error_msg}

Original query: {user_query}
Previous SQL: {sql}

Please fix the SQL query and try again."""

    return None, "Failed after max iterations"

This approach handles:

  • Syntax errors (missing commas, wrong keywords)
  • Schema mismatches (wrong column names)
  • Logic errors (incorrect joins)
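To see the loop's control flow without an API key, here is the same retry pattern with a stand-in generator in place of the Claude call (purely illustrative; `retry_sql` is our own name):

```python
import sqlite3

def retry_sql(generate, conn, max_iterations=3):
    """Same control flow as above: execute, catch the error, feed it back, retry."""
    feedback = None
    for _ in range(max_iterations):
        sql = generate(feedback)
        try:
            cursor = conn.cursor()
            cursor.execute(sql)
            return sql, cursor.fetchall()
        except Exception as e:
            feedback = str(e)  # becomes context for the next attempt
    return None, "Failed after max iterations"

# Stand-in generator: the first attempt uses a wrong column name, the second corrects it
attempts = iter(["SELECT wrong_col FROM t", "SELECT x FROM t"])
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.execute("INSERT INTO t VALUES (42)")
sql, rows = retry_sql(lambda feedback: next(attempts), conn)
print(sql, rows)  # → SELECT x FROM t [(42,)]
```

In the real system, `generate` would call Claude with the accumulated error feedback in the prompt, exactly as `text_to_sql_with_self_improvement` does.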

Evaluating Your Text-to-SQL System

To ensure quality, build an evaluation pipeline:

  • Create a test set: 20-50 natural language queries with expected SQL outputs
  • Measure accuracy: Compare generated SQL to expected SQL (exact match or execution match)
  • Track error types: Schema errors, logic errors, syntax errors
  • Iterate: Use failures to improve your prompts or add more examples
def evaluate(test_cases, schema, conn):
    correct = 0
    for case in test_cases:
        generated_sql, _ = text_to_sql_with_self_improvement(case["query"], schema, conn)
        if generated_sql and execute_and_compare(generated_sql, case["expected_sql"], conn):
            correct += 1
    return correct / len(test_cases)
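The snippet above assumes an `execute_and_compare` helper. One minimal sketch uses execution match, comparing result sets so that differently written but semantically equivalent queries still count as correct:

```python
import sqlite3

def execute_and_compare(generated_sql, expected_sql, conn):
    """Execution match: run both queries and compare sorted result sets."""
    cursor = conn.cursor()
    try:
        cursor.execute(generated_sql)
        generated_rows = sorted(cursor.fetchall())
        cursor.execute(expected_sql)
        expected_rows = sorted(cursor.fetchall())
    except Exception:
        return False  # a query that doesn't run can't match
    return generated_rows == expected_rows
```

Sorting makes the comparison order-insensitive, which is appropriate unless the test query itself specifies an ORDER BY that must be preserved.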

Key Takeaways

  • Start simple, then layer: Begin with basic prompting, then add few-shot examples, chain-of-thought, and finally RAG as your schema grows
  • Self-improvement is a game-changer: Letting Claude execute and fix its own SQL dramatically reduces error rates without manual intervention
  • RAG makes large schemas manageable: Dynamically retrieving relevant schema information keeps prompts focused and accurate even with hundreds of tables
  • Always include schema context: Claude needs to understand your database structure to generate correct SQL—never omit it
  • Evaluate continuously: Build a test set early and measure performance after each improvement to ensure you're actually making progress