BeClaude
Research 2026-05-07

Architectural Observability Collapse in Transformers

Source: arXiv cs.AI

arXiv:2604.24801v2

Abstract: Activation monitoring can catch confident errors in autoregressive transformers only if training preserved an internal decision-quality signal that output confidence does not expose. Monitorability is an architectural property before it is a...
