Mastering Document Summarization with Claude: From Basic Prompts to Advanced RAG
Learn how to summarize long documents with Claude AI, including prompt engineering, metadata extraction, handling token limits, and evaluating summary quality using ROUGE scores and Promptfoo.
Summarization is one of the most powerful and practical applications of large language models. Whether you're a legal professional drowning in contracts, a researcher sifting through papers, or a product manager analyzing customer feedback, the ability to condense lengthy documents into clear, actionable summaries can transform your workflow.
This guide walks you through the full spectrum of summarization techniques using Claude — from simple one-shot prompts to advanced Retrieval-Augmented Generation (RAG) approaches. We'll focus on practical, actionable methods you can implement today.
Why Summarization Is Hard (And Why Claude Excels)
Evaluating summary quality is notoriously subjective. Unlike a math problem with a single correct answer, a "good" summary depends on your audience, use case, and desired level of detail. Traditional metrics like ROUGE scores measure word overlap but miss nuance, coherence, and factual accuracy.
Claude's strengths — long context windows, nuanced understanding, and instruction-following — make it particularly well-suited for summarization. But even with a powerful model, your approach matters.
Getting Started: Setup and Data Preparation
First, install the required packages:
pip install anthropic pypdf pandas matplotlib scikit-learn numpy rouge-score nltk seaborn
Promptfoo is a Node.js CLI, so install it separately:
npm install -g promptfoo
You'll also need a Claude API key. Set it as an environment variable:
export ANTHROPIC_API_KEY="your-api-key-here"
Preparing Your Document
For this guide, we'll use a publicly available Sublease Agreement from the SEC website. You can also use any PDF or raw text. Here's how to extract text from a PDF:
import pypdf

def extract_text_from_pdf(pdf_path):
    with open(pdf_path, 'rb') as file:
        reader = pypdf.PdfReader(file)
        text = ""
        for page in reader.pages:
            # extract_text() can return None for image-only pages
            text += page.extract_text() or ""
    return text

# Load your document
text = extract_text_from_pdf("sublease_agreement.pdf")
If you're working with plain text, just assign it directly:
text = "Your document text here..."
Basic Summarization: Your First Claude Summary
Let's start simple. Here's a basic summarization function using the Claude API:
import anthropic

client = anthropic.Anthropic()

def summarize_text(text, max_tokens=500):
    response = client.messages.create(
        model="claude-3-opus-20240229",
        max_tokens=max_tokens,
        messages=[
            {
                "role": "user",
                "content": f"Please summarize the following document:\n\n{text}"
            }
        ]
    )
    return response.content[0].text

summary = summarize_text(text)
print(summary)
This works, but it's basic. Notice we're using the user role with a simple instruction. As we progress, we'll refine this approach significantly.
Multi-Shot Summarization: Providing Examples
One powerful technique is to provide Claude with examples of good summaries. This is known as few-shot (or multi-shot) prompting:
def summarize_with_examples(text):
    response = client.messages.create(
        model="claude-3-opus-20240229",
        max_tokens=500,
        messages=[
            {
                "role": "user",
                "content": """Here is an example of a good summary:

Document: "The quick brown fox jumps over the lazy dog. The dog was sleeping under a tree. The fox was looking for food."
Summary: "A fox searching for food jumps over a sleeping dog under a tree."

Now summarize this document:"""
            },
            {
                "role": "assistant",
                "content": "Understood. I will follow that style."
            },
            {
                "role": "user",
                "content": text
            }
        ]
    )
    return response.content[0].text
The key insight here is using the assistant role to acknowledge the instruction before receiving the actual document. This helps Claude "get in the right mindset" before processing your content.
Advanced Techniques: Guided and Domain-Specific Summarization
Guided Summarization
Instead of a generic "summarize this," guide Claude with specific instructions:
def guided_summarize(text):
    response = client.messages.create(
        model="claude-3-opus-20240229",
        max_tokens=800,
        messages=[
            {
                "role": "user",
                "content": f"""Please summarize the following legal document. Focus on:
- Key parties involved
- Effective dates and duration
- Financial terms (rent, deposits, fees)
- Termination conditions
- Notable obligations or restrictions

Format your summary as bullet points under each heading.

Document:
{text}"""
            }
        ]
    )
    return response.content[0].text
Domain-Specific Guided Summarization
For legal documents specifically, you can create a structured extraction:
def legal_document_summary(text):
    response = client.messages.create(
        model="claude-3-opus-20240229",
        max_tokens=1000,
        messages=[
            {
                "role": "user",
                "content": f"""You are a legal document analyst. Extract the following metadata and create a structured summary:

METADATA:
- Document Type:
- Parties:
- Date Signed:
- Effective Date:
- Term/Duration:

SUMMARY SECTIONS:
- Executive Summary (3-5 sentences)
- Key Financial Terms
- Rights and Obligations
- Termination Clauses
- Risk Factors

Document:
{text}"""
            }
        ]
    )
    return response.content[0].text
Meta-Summarization: Handling Long Documents
When documents exceed Claude's context window (or your budget), use a chunk-and-summarize approach:
def chunk_text(text, chunk_size=4000):
    """Split text into chunks of roughly chunk_size words (not tokens)."""
    words = text.split()
    chunks = []
    for i in range(0, len(words), chunk_size):
        chunk = ' '.join(words[i:i+chunk_size])
        chunks.append(chunk)
    return chunks

def meta_summarize(text):
    chunks = chunk_text(text)
    chunk_summaries = []
    for chunk in chunks:
        summary = summarize_text(chunk, max_tokens=300)
        chunk_summaries.append(summary)
    # Now summarize the summaries
    combined = "\n\n".join(chunk_summaries)
    final_summary = summarize_text(combined, max_tokens=500)
    return final_summary
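Before running a chunked pass over a large document, it helps to estimate the cost up front: one summarization call per chunk, plus one final call over the combined summaries. A quick sketch (the function name is illustrative, using the 4,000-word chunk size from above):

```python
import math

def estimate_api_calls(word_count, chunk_size=4000):
    """One call per chunk, plus one final call over the combined summaries."""
    num_chunks = math.ceil(word_count / chunk_size)
    # A document that fits in a single chunk needs no second pass
    return num_chunks if num_chunks == 1 else num_chunks + 1

# A 20,000-word document with 4,000-word chunks: 5 chunk calls + 1 final
print(estimate_api_calls(20000))  # → 6
```

Multiplying by your expected tokens per call gives a rough budget for the whole run.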
Summary Indexed Documents: An Advanced RAG Approach
For truly large document collections, combine summarization with RAG. The idea is to create a summary index — a searchable database of document summaries:
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def create_summary_index(documents):
    """Create a searchable index of document summaries."""
    summaries = []
    for doc in documents:
        summary = summarize_text(doc, max_tokens=200)
        summaries.append(summary)
    # Create TF-IDF vectors for retrieval
    vectorizer = TfidfVectorizer()
    tfidf_matrix = vectorizer.fit_transform(summaries)
    return summaries, vectorizer, tfidf_matrix

def query_summary_index(query, summaries, vectorizer, tfidf_matrix, top_k=3):
    """Retrieve most relevant summaries for a query."""
    query_vec = vectorizer.transform([query])
    similarities = cosine_similarity(query_vec, tfidf_matrix)[0]
    top_indices = np.argsort(similarities)[-top_k:][::-1]
    results = []
    for idx in top_indices:
        results.append({
            "summary": summaries[idx],
            "relevance": similarities[idx]
        })
    return results
Best Practices for Summarization RAG
- Chunk strategically: Align chunks with document structure (paragraphs, sections)
- Preserve metadata: Include document title, date, and source in each chunk
- Use overlapping chunks: 10-20% overlap prevents information loss at boundaries
- Cache summaries: Store generated summaries to avoid redundant API calls
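The overlapping-chunks point above can be sketched as a word-based chunker with a configurable overlap (the function name and 15% default are illustrative choices, not from a particular library):

```python
def chunk_with_overlap(text, chunk_size=4000, overlap_ratio=0.15):
    """Split text into word-based chunks where each chunk repeats the
    last ~15% of the previous one, so facts that span a chunk boundary
    appear intact in at least one chunk."""
    words = text.split()
    overlap = int(chunk_size * overlap_ratio)
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(' '.join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks

# Example: 10 words, chunks of 4 with 25% overlap (step of 3)
demo = chunk_with_overlap("a b c d e f g h i j", chunk_size=4, overlap_ratio=0.25)
print(demo)  # → ['a b c d', 'd e f g', 'g h i j']
```

Caching, meanwhile, can be as simple as keying stored summaries by a hash of the chunk text, so re-running a pipeline only pays for chunks that changed.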
Evaluating Summary Quality
Automated evaluation helps you iterate quickly. Here's how to use ROUGE scores:
from rouge_score import rouge_scorer

def evaluate_summary(reference, generated):
    scorer = rouge_scorer.RougeScorer(['rouge1', 'rouge2', 'rougeL'], use_stemmer=True)
    scores = scorer.score(reference, generated)
    print("ROUGE-1:", scores['rouge1'].fmeasure)
    print("ROUGE-2:", scores['rouge2'].fmeasure)
    print("ROUGE-L:", scores['rougeL'].fmeasure)
    return scores
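For intuition about what ROUGE-1 actually measures, here is a rough pure-Python approximation: unigram-overlap F1, without the stemming and tokenization rules the rouge-score package applies. It is a sanity-check sketch, not a replacement for the library:

```python
from collections import Counter

def rouge1_f1(reference, generated):
    """Approximate ROUGE-1 F1: harmonic mean of unigram precision
    and recall between reference and generated text (no stemming)."""
    ref_counts = Counter(reference.lower().split())
    gen_counts = Counter(generated.lower().split())
    overlap = sum((ref_counts & gen_counts).values())  # clipped counts
    if overlap == 0:
        return 0.0
    precision = overlap / sum(gen_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

print(rouge1_f1("the cat sat on the mat", "the cat sat"))  # ≈ 0.667
```

Note that a summary can score well on this metric while paraphrasing badly, which is exactly why word-overlap metrics should be paired with human review.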
For more nuanced evaluation, use Promptfoo to create custom evaluation suites:
# promptfooconfig.yaml
prompts:
  - "Summarize this document: {{document}}"
providers:
  - anthropic:messages:claude-3-opus-20240229
tests:
  - vars:
      document: "Your test document here"
    assert:
      - type: contains-all
        value: ["key term 1", "key term 2"]
      - type: javascript
        value: output.length <= 500
Iterative Improvement: A Practical Workflow
- Baseline: Start with a basic prompt, generate summaries
- Evaluate: Use ROUGE scores and manual spot-checks
- Identify gaps: Where does the summary miss key information?
- Refine prompts: Add specific instructions for missed areas
- Test edge cases: Try with different document types and lengths
- Automate: Create a test suite with Promptfoo for regression testing
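Steps 2 and 6 can start small: before reaching for a full Promptfoo suite, a plain assertion helper (hypothetical names, mirroring Promptfoo's contains-all check) catches regressions in a notebook or CI script:

```python
def spot_check(summary, required_terms, max_chars=2000):
    """Return a list of failures: missing key terms or an over-length summary."""
    failures = []
    lowered = summary.lower()
    for term in required_terms:
        if term.lower() not in lowered:
            failures.append(f"missing term: {term}")
    if len(summary) > max_chars:
        failures.append(f"summary too long: {len(summary)} > {max_chars} chars")
    return failures

issues = spot_check(
    "Tenant agrees to pay $2,000 monthly rent; lease runs 12 months.",
    required_terms=["rent", "lease", "deposit"],
)
print(issues)  # → ['missing term: deposit']
```

An empty return list means the summary passed; anything else tells you exactly which prompt refinement to try next.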
Conclusion and Best Practices
- Be specific in your prompts: Generic "summarize this" yields generic results. Specify format, length, and focus areas.
- Use the assistant role: Acknowledge instructions before providing content to improve output quality.
- Chunk strategically: For long documents, use overlapping chunks and meta-summarization.
- Evaluate systematically: Combine automated metrics (ROUGE) with human review.
- Iterate: Summarization is rarely perfect on the first try. Build a feedback loop.
- Consider your audience: A summary for a legal expert differs from one for a general reader.
Key Takeaways
- Start with guided prompts: Specify exactly what information you need extracted rather than asking for a generic summary
- Use multi-shot prompting with assistant role: Providing examples and using the assistant role improves output consistency
- Chunk and meta-summarize for long documents: Break documents into overlapping chunks, summarize each, then summarize the summaries
- Combine summarization with RAG: Create summary indexes for large document collections to enable fast, relevant retrieval
- Evaluate with multiple methods: Use ROUGE scores for automated checks and Promptfoo for custom evaluation suites