Research · 2026-05-12
The Art of the Jailbreak: Formulating Jailbreak Attacks for LLM Security Beyond Binary Scoring
Source: Arxiv CS.AI
arXiv:2605.09225v1 (cross-listed)

Abstract: Jailbreak attacks -- adversarial prompts that bypass LLM alignment through purely linguistic manipulation -- pose a growing operational security threat, yet the field lacks large-scale, reproducible infrastructure for generating, categorizing, and...