Research · 2026-04-24
Logic Jailbreak: Efficiently Unlocking LLM Safety Restrictions Through Formal Logical Expression
Source: Arxiv CS.AI
arXiv:2505.13527v3. Abstract: Despite substantial advancements in aligning large language models (LLMs) with human values, current safety mechanisms remain susceptible to jailbreak attacks. We hypothesize that this vulnerability stems from distributional discrepancies...
Tags: arxiv, papers, safety