BeClaude Research
2026-05-14

Beyond Perplexity: A Geometric and Spectral Study of Low-Rank Pre-Training

Source: Arxiv CS.AI

arXiv:2605.13652v1 Announce Type: cross

Abstract: Pre-training large language models is dominated by the memory cost of storing full-rank weights, gradients, and optimizer states. Low-rank pre-training has emerged to address this, and the space of methods has grown rapidly. A central question...
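As a rough illustration of why low-rank pre-training saves memory (a minimal sketch, not from the paper; the layer size and rank below are illustrative assumptions): a weight matrix W can be factored as W ≈ A @ B, replacing d_out × d_in parameters with d_out × r + r × d_in.

```python
# Hypothetical layer dimensions and rank -- illustrative, not from the paper.
d_out, d_in, r = 4096, 4096, 128

# Full-rank weight matrix: one parameter per entry of W.
full_params = d_out * d_in

# Low-rank factors: A is (d_out x r), B is (r x d_in).
low_rank_params = d_out * r + r * d_in

print(full_params)                      # 16777216
print(low_rank_params)                  # 1048576
print(full_params / low_rank_params)    # 16.0x fewer parameters
```

Since gradients and optimizer states (e.g. Adam's two moment buffers) scale with the parameter count, the same ratio applies to their memory footprint as well.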

arxivpapers