BeClaude
Research 2026-05-08

Structural Sensitivity in Compressed Transformers: Relative Error Propagation and Layer Removal

Source: Arxiv CS.AI

arXiv:2603.20991v2 Announce Type: replace-cross Abstract: Compressing transformer weights makes large language models cheaper to deploy, but each layer's compression introduces an error. These errors accumulate as the signal passes through later layers, and how they accumulate is not well...
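The accumulation effect the abstract describes can be illustrated with a toy experiment (my own sketch, not the paper's code): run the same input through a stack of random layers twice, once with exact weights and once with coarsely quantized weights, and track the relative error of the activations at each depth. All names here (`quantize`, the layer sizes) are hypothetical choices for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)
depth, dim = 8, 64

def quantize(w, levels=16):
    # Uniform quantization, a stand-in for any per-layer compression scheme.
    lo, hi = w.min(), w.max()
    step = (hi - lo) / (levels - 1)
    return lo + np.round((w - lo) / step) * step

# Random linear layers, scaled so activations stay roughly unit-norm.
weights = [rng.standard_normal((dim, dim)) / np.sqrt(dim) for _ in range(depth)]
x_exact = x_comp = rng.standard_normal(dim)

errors = []
for w in weights:
    x_exact = np.tanh(w @ x_exact)            # uncompressed forward pass
    x_comp = np.tanh(quantize(w) @ x_comp)    # compressed forward pass
    errors.append(np.linalg.norm(x_comp - x_exact) / np.linalg.norm(x_exact))

print([round(e, 4) for e in errors])
```

Each layer both injects its own quantization error and transforms the error already present in its input, which is why the relative error at depth k generally reflects all k compressed layers, not just the last one.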

arxivpapers