jailbreak-fable
High-fidelity Claude Fable 5 (Mythos) environment emulation and automated multi-agent jailbreak (Pack Hunt) research laboratory.
Summary
This skill emulates the Claude Fable 5 (Mythos) environment for high-fidelity testing and research. It enables automated multi-agent jailbreak simulations (Pack Hunt) to probe safety boundaries and improve model robustness, making it valuable for security researchers and alignment engineers.
Install & Usage
mkdir -p .claude/skills
Add the configuration to .claude/skills/jailbreak-fable.md
/jailbreak-fable
Use Cases
Usage Examples
/jailbreak-fable run pack-hunt --agents 5 --rounds 10
Run a Pack Hunt simulation with 5 adversarial agents over 10 rounds, targeting a base Claude model.
/jailbreak-fable analyze-logs --session last --output report.json
Security Audits
Frequently Asked Questions
What is jailbreak-fable?
This skill emulates the Claude Fable 5 (Mythos) environment for high-fidelity testing and research. It enables automated multi-agent jailbreak simulations (Pack Hunt) to probe safety boundaries and improve model robustness, making it valuable for security researchers and alignment engineers.
How to install jailbreak-fable?
To install jailbreak-fable, create the skills directory (mkdir -p .claude/skills), then add the config to .claude/skills/jailbreak-fable.md. Finally, run /jailbreak-fable in Claude Code.
What is jailbreak-fable best for?
jailbreak-fable is a community skill categorized under General. It is designed for agents. Created by keirsalterego.
What can I use jailbreak-fable for?
jailbreak-fable is useful for: simulating multi-agent jailbreak scenarios to test Claude's resistance to coordinated attacks; automating red-teaming workflows by running Pack Hunt experiments with configurable agent personas; evaluating the effectiveness of safety filters under adversarial prompt engineering; generating detailed logs of jailbreak attempts for post-hoc analysis and model improvement; benchmarking different Claude versions against a standardized set of jailbreak techniques; and training new safety classifiers using synthetic data produced by the Mythos environment.