BeClaude
Research · 2026-04-24

Logic Jailbreak: Efficiently Unlocking LLM Safety Restrictions Through Formal Logical Expression

Source: arXiv cs.AI

arXiv:2505.13527v3 · Announce Type: replace-cross

Abstract: Despite substantial advancements in aligning large language models (LLMs) with human values, current safety mechanisms remain susceptible to jailbreak attacks. We hypothesize that this vulnerability stems from distributional discrepancies...

Tags: arxiv, papers, safety