Research2026-05-08
Revealing Modular Gradient Noise Imbalance in LLMs: Calibrating Adam via Signal-to-Noise Ratio
Source: Arxiv CS.AI
arXiv:2605.05794v1 Announce Type: cross Abstract: The impressive performance of large language models (LLMs) arises from their massive scale and heterogeneous module composition. However, this structural heterogeneity introduces additional optimization challenges. While adaptive optimizers such as...
arxivpapers