BeClaude
Research 2026-04-24

SafeRedirect: Defeating Internal Safety Collapse via Task-Completion Redirection in Frontier LLMs

Source: arXiv cs.AI

arXiv:2604.20930v1 (announce type: cross)

Abstract: Internal Safety Collapse (ISC) is a failure mode in which frontier LLMs, when executing legitimate professional tasks whose correct completion structurally requires harmful content, spontaneously generate that content with safety failure rates...

arxiv papers safety