Research · 2026-05-14
Quantifying LLM Safety Degradation Under Repeated Attacks Using Survival Analysis
Source: Arxiv CS.AI
arXiv:2605.12869v1 (announce type: cross)

Abstract: Large language models (LLMs) are increasingly deployed in a wide range of applications, yet remain vulnerable to adversarial jailbreak attacks that circumvent their safety guardrails. Existing evaluation frameworks typically report binary...
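The survival-analysis framing the title describes can be illustrated with a Kaplan-Meier estimator over "attack turns survived before a jailbreak succeeds". The sketch below is not from the paper; the data, function name, and censoring convention are all illustrative assumptions, implemented in plain Python.

```python
# Hedged sketch: Kaplan-Meier survival estimate for repeated jailbreak
# attacks. Each trial records how many attack attempts a model withstood.
# All data below is synthetic, not from the paper.

def kaplan_meier(samples):
    """samples: list of (time, event) pairs, where event=1 means a
    jailbreak succeeded at `time` attempts and event=0 means the trial
    was censored (model never broken within the observation window).
    Returns {event_time: S(t)}, the estimated probability of surviving
    past t attempts."""
    event_times = sorted({t for t, e in samples if e == 1})
    survival = {}
    s = 1.0
    for t in event_times:
        at_risk = sum(1 for ti, _ in samples if ti >= t)   # still unbroken just before t
        events = sum(1 for ti, e in samples if ti == t and e == 1)
        s *= 1.0 - events / at_risk                        # KM product-limit update
        survival[t] = s
    return survival

# Synthetic trials: (attempts until jailbreak, 1) or (attempts observed, 0 = censored)
trials = [(1, 1), (2, 1), (2, 1), (3, 0), (4, 1), (5, 0), (5, 0)]
surv = kaplan_meier(trials)
```

Unlike a binary pass/fail report, the resulting curve shows how safety degrades as attacks accumulate, and censored trials (models never broken) contribute information without biasing the estimate.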
Tags: arxiv, papers, safety