touchstone
A lightweight, automated multi-model process for rapid agent-skill improvement with adversarial cross-model review.
Summary
Touchstone is a lightweight, automated multi-model process that rapidly improves agent skills through adversarial cross-model review.
- It enables developers to iteratively refine their AI agents by having different models critique and enhance each other's outputs, leading to more robust and reliable skills.
Install & Usage
mkdir -p .claude/agents
Add the configuration to .claude/agents/touchstone.md
@touchstone
Use Cases
Usage Examples
/touchstone improve my code-review agent by having Claude review the reviews from GPT-4
Run touchstone on my API skill to cross-validate responses against expected schemas
Use touchstone to iteratively refine my data extraction prompts with adversarial feedback
Frequently Asked Questions
What is touchstone?
Touchstone is a lightweight, automated multi-model process that rapidly improves agent skills through adversarial cross-model review. It enables developers to iteratively refine their AI agents by having different models critique and enhance each other's outputs, leading to more robust and reliable skills.
How do I install touchstone?
To install touchstone, create the agents directory (mkdir -p .claude/agents), then add the configuration to .claude/agents/touchstone.md. Finally, invoke it with @touchstone in Claude Code.
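The install steps can be sketched as a short shell script. This is only a sketch: the variable names agents_dir and config_file are illustrative, and the agent configuration itself is not reproduced here, so the script creates the directory and then reports whether the configuration file is in place rather than writing it.

```shell
#!/bin/sh
# Sketch of the install steps; variable names are illustrative assumptions.
set -eu

agents_dir=".claude/agents"
config_file="$agents_dir/touchstone.md"

# Step 1: create the agents directory (no error if it already exists).
mkdir -p "$agents_dir"

# Step 2 is manual: the touchstone configuration is not included in this
# sketch, so only check whether a non-empty file is already present.
if [ -s "$config_file" ]; then
    echo "found $config_file"
else
    echo "missing $config_file: add the touchstone configuration there"
fi
```

Run it from the project root so that .claude/agents is created next to the code the agent will work on.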
What is touchstone best for?
touchstone is an agent categorized under General. It is designed for: code-review, api, agent. Created by devdacian.
What can I use touchstone for?
touchstone is useful for:
- Improve a code review agent by having it generate reviews that are then critiqued by another model for accuracy and completeness.
- Enhance an API integration skill by cross-validating its responses against expected outputs using a different model.
- Automatically refine a data extraction agent by pitting it against a model that checks for missing or incorrect data.
- Iteratively optimize a prompt engineering skill by having models suggest improvements to each other's prompts.
- Strengthen a security analysis agent by adversarial testing where one model tries to bypass checks and another evaluates the defenses.
- Rapidly prototype and improve a multi-step reasoning agent by cross-model validation of intermediate steps.