Research — 2026-05-11
Understanding Performance Collapse in Layer-Pruned Large Language Models via Decision Representation Transitions
Source: arXiv cs.AI
arXiv:2605.07271v1 Announce Type: cross Abstract: Layer pruning efficiently reduces the computational cost of Large Language Models (LLMs) but often triggers a sudden performance collapse. Existing representation-based analyses struggle to explain this mechanism. We propose studying pruning through decision...