BeClaude
Research | 2026-04-20

StoSignSGD: Unbiased Structural Stochasticity Fixes SignSGD for Training Large Language Models

Source: arXiv cs.AI

arXiv:2604.15416v1 (cross-listed)

Abstract: Sign-based optimization algorithms, such as SignSGD, have garnered significant attention for their remarkable performance in distributed learning and training large foundation models. Despite their empirical superiority, SignSGD is known to diverge...
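For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of the standard SignSGD update, in which each parameter moves by a fixed step against the sign of its stochastic gradient. It does not show the paper's StoSignSGD method, which the truncated abstract does not describe. The helper names (`sign`, `signsgd_step`, `noisy_quadratic_grad`), the toy quadratic objective, and the step size are all assumptions chosen for illustration.

```python
import random


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def signsgd_step(params, grads, lr):
    """One plain SignSGD update: step each coordinate by lr against the gradient sign."""
    return [p - lr * sign(g) for p, g in zip(params, grads)]


def noisy_quadratic_grad(params, rng, noise_std=0.1):
    """Hypothetical toy problem: gradient of 0.5 * ||x||^2 plus Gaussian noise."""
    return [p + rng.gauss(0.0, noise_std) for p in params]


rng = random.Random(0)
x = [1.0, -2.0, 0.5]
for _ in range(100):
    x = signsgd_step(x, noisy_quadratic_grad(x, rng), lr=0.01)
```

Only the sign of each gradient coordinate is used, so the update costs one bit per coordinate to communicate, which is one reason sign-based methods are attractive in distributed training.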

arxiv, papers