promptfoo-evals
Teaches AI coding agents to create promptfoo eval suites with deterministic assertions, provider configs, and best practices.
Overview
<p align="center"> <a href="https://npmjs.com/package/promptfoo"><img src="https://img.shields.io/npm/v/promptfoo" alt="npm"></a> <a href="https://npmjs.com/package/promptfoo"><img src="https://img.shields.io/npm/dm/promptfoo" alt="npm"></a> <a href="https://github.com/promptfoo/promptfoo/actions/workflows/main.yml"><img src="https://img.shields.io/github/actions/workflow/status/promptfoo/promptfoo/main.yml" alt="GitHub Workflow Status"></a> <a href="https://github.com/promptfoo/promptfoo/blob/main/LICENSE"><img src="https://img.shields.io/github/license/promptfoo/promptfoo" alt="MIT license"></a> <a href="https://discord.gg/promptfoo"><img src="https://img.shields.io/discord/1146610656779440188?logo=discord&label=promptfoo" alt="Discord"></a> </p>
<p align="center"> <code>promptfoo</code> is a CLI and library for evaluating and red-teaming LLM apps. Stop the trial-and-error approach - start shipping secure, reliable AI apps. </p>
<p align="center"> <a href="https://www.promptfoo.dev">Website</a> · <a href="https://www.promptfoo.dev/docs/getting-started/">Getting Started</a> · <a href="https://www.promptfoo.dev/docs/red-team/">Red Teaming</a> · <a href="https://www.promptfoo.dev/docs/">Documentation</a> · <a href="https://discord.gg/promptfoo">Discord</a> </p>
Promptfoo is now part of OpenAI. Promptfoo remains open source and MIT licensed. Read the company update.
Quick Start
Requires Node.js ^20.20.0 or >=22.22.0 for npm and npx usage.
npm install -g promptfoo
promptfoo init --example getting-started
Also available via brew install promptfoo and pip install promptfoo. You can also use npx promptfoo@latest to run any command without installing.
Most LLM providers require an API key. Set yours as an environment variable:
export OPENAI_API_KEY=sk-abc123
Once you're in the example directory, run an eval and view results:
cd getting-started
promptfoo eval
promptfoo view
See Getting Started (evals) or Red Teaming (vulnerability scanning) for more.
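For readers who want to see what an eval suite looks like before opening the example, here is a minimal sketch that writes a small `promptfooconfig.yaml` with deterministic assertions. The directory name `my-first-eval`, the prompt text, the model id, and the test values are all assumptions chosen for illustration; they are not taken from the `getting-started` example.

```shell
#!/usr/bin/env sh
set -eu

# Hypothetical directory name, used only for this illustration.
EVAL_DIR="${EVAL_DIR:-my-first-eval}"
mkdir -p "$EVAL_DIR"

# Write a small config: one prompt, one provider, two test cases.
# The prompt, model id, and expected values are illustrative assumptions.
cat > "$EVAL_DIR/promptfooconfig.yaml" <<'EOF'
description: Minimal translation eval (illustrative)
prompts:
  - "Translate the following text to French: {{text}}"
providers:
  - openai:gpt-4o-mini
tests:
  - vars:
      text: Hello, world
    assert:
      # Deterministic, case-insensitive substring check
      - type: icontains
        value: bonjour
  - vars:
      text: Thank you
    assert:
      - type: icontains
        value: merci
EOF

echo "Wrote $EVAL_DIR/promptfooconfig.yaml"

# To run it (requires OPENAI_API_KEY to be set):
#   cd "$EVAL_DIR" && promptfoo eval && promptfoo view
```

The script only writes the file; the eval commands are left as comments so nothing calls a provider until you choose to run them.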
What can you do with Promptfoo?
- Test your prompts and models with automated evaluations
- Secure your LLM apps with red teaming and vulnerability scanning
- Compare models side-by-side (OpenAI, Anthropic, Azure, Bedrock, Ollama, and more)
- Automate checks in CI/CD
- Review pull requests for LLM-related security and compliance issues with code scanning
- Share results with your team
Here's what it looks like in action:
<img src="site/static/img/[email protected]" alt="prompt evaluation matrix - web viewer" width="700">
It works on the command line too:
<img src="https://www.promptfoo.dev/img/docs/self-grading.gif" alt="promptfoo command line" width="700">
It can also generate security vulnerability reports:
<img src="https://www.promptfoo.dev/img/[email protected]" alt="gen ai red team" width="700">
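As a rough illustration of the "Automate checks in CI/CD" item listed earlier, the sketch below wraps a command in a small gate function that fails the job when the command exits non-zero. `ci_gate` is a hypothetical helper name, not part of promptfoo, and the sketch assumes that `promptfoo eval` returns a non-zero exit code when test cases fail.

```shell
#!/usr/bin/env sh

# Run any command as a CI gate: succeed only if it exits 0.
# `ci_gate` is a hypothetical helper, not a promptfoo command.
ci_gate() {
  if "$@"; then
    echo "eval gate: passed"
    return 0
  else
    status=$?
    echo "eval gate: failed (exit code $status)" >&2
    return 1
  fi
}

# Example invocation in a CI job (commented out so the sketch has no side effects):
#   ci_gate npx promptfoo@latest eval -c promptfooconfig.yaml
```

Passing the command as arguments keeps the gate independent of how promptfoo is installed (global install, `npx`, or another runner).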
Why Promptfoo?
- Developer-first: Fast, with features like live reload and caching
- Private: LLM evals run 100% locally - your prompts never leave your machine
- Flexible: Works with any LLM API or programming language
- Battle-tested: Powers LLM apps serving 10M+ users in production
- Data-driven: Make decisions based on metrics, not gut feel
- Open source: MIT licensed, with an active community
Learn More
- Getting Started
- Full Documentation
- Red Teaming Guide
- CLI Usage
- Node.js Package
- Supported Models
- Code Scanning Guide
Contributing
We welcome contributions! Check out our contributing guide to get started.
Join our Discord community for help and discussion.
<a href="https://github.com/promptfoo/promptfoo/graphs/contributors"> <img src="https://contrib.rocks/image?repo=promptfoo/promptfoo" /> </a>
Install & Usage
mkdir -p .claude/skills && curl -o .claude/skills/promptfoo-evals.md https://raw.githubusercontent.com/promptfoo/promptfoo/main/SKILL.md/promptfoo-evals
Frequently Asked Questions
What is promptfoo-evals?
Teaches AI coding agents to create promptfoo eval suites with deterministic assertions, provider configs, and best practices
How to install promptfoo-evals?
To install promptfoo-evals, create the .claude/skills directory in your project, then run the curl command to download the skill file. Once installed, invoke it in Claude Code with /promptfoo-evals.
What is promptfoo-evals best for?
promptfoo-evals is a community skill categorized under General. It is tagged with: agent, eval, testing, llm, promptfoo. Created by promptfoo.