Research · 2026-05-12
SimReg: Achieving Higher Performance in the Pretraining via Embedding Similarity Regularization
Source: arXiv cs.AI
arXiv:2605.08809v1 (announce type: cross). Abstract: Pretraining large language models (LLMs) with next-token prediction has led to remarkable advances, yet the context-dependent nature of token embeddings in such models results in high intra-class variance and high inter-class similarity, thus hindering...
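The abstract is truncated, so the exact form of SimReg's regularizer is not given here. As a minimal illustrative sketch only (not the paper's method), the two failure modes it names, high intra-class variance and high inter-class similarity, can be expressed as a cosine-similarity penalty that rewards same-class embedding agreement and penalizes cross-class agreement; all function and variable names below are hypothetical:

```python
import numpy as np

def simreg_loss(embeddings: np.ndarray, labels: np.ndarray) -> float:
    """Hypothetical similarity regularizer: lower is better.

    Encourages high cosine similarity within a class (low intra-class
    variance) and low similarity across classes (low inter-class
    similarity). Assumes at least one same-class and one cross-class pair.
    """
    # L2-normalize rows so the Gram matrix holds cosine similarities.
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = e @ e.T

    same = labels[:, None] == labels[None, :]
    off_diag = ~np.eye(len(labels), dtype=bool)  # drop self-similarity

    intra = sim[same & off_diag]   # same-class pairs: push toward 1
    inter = sim[~same & off_diag]  # cross-class pairs: push toward 0

    # Loss falls as intra-class similarity rises and inter-class falls.
    return float(inter.mean() - intra.mean())
```

With two tight, well-separated clusters the loss is negative; permuting the labels so classes straddle the clusters drives it positive, matching the intuition above.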