BeClaude
Research 2026-05-12

MIDUS: Memory-Infused Depth Up-Scaling

Source: arXiv cs.AI

arXiv:2512.13751v2. Abstract: Expanding pre-trained language models offers a practical way to increase capacity without training larger models from scratch. Depth Up-Scaling (DUS) does so by duplicating Transformer blocks and inserting them into a pre-trained backbone....
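
To make the block-duplication idea concrete, here is a minimal sketch of generic Depth Up-Scaling on a toy layer stack. It is not the MIDUS method, which the truncated abstract does not describe. The function name `depth_up_scale`, the choice of which span to copy, the insertion position right after the originals, and the dict stand-ins for Transformer blocks are all assumptions made for illustration.

```python
import copy


def depth_up_scale(blocks, start, end):
    """Return a deeper stack in which blocks[start:end] are deep-copied
    and the copies are inserted directly after the original span.

    Hypothetical helper: the span and the insertion point are
    illustrative assumptions, not taken from the paper.
    """
    if not 0 <= start <= end <= len(blocks):
        raise ValueError("span must lie within the block list")
    # Deep copies so the new blocks can later be trained independently.
    duplicates = [copy.deepcopy(b) for b in blocks[start:end]]
    return blocks[:end] + duplicates + blocks[end:]


# Toy stand-ins for pre-trained Transformer blocks (assumption).
backbone = [{"name": f"block{i}", "w": [float(i)]} for i in range(4)]

# Duplicate the middle two blocks: 4 layers become 6.
scaled = depth_up_scale(backbone, 1, 3)
names = [b["name"] for b in scaled]
```

In this sketch the copies start with the same weights as the blocks they duplicate, so the expanded model begins from pre-trained parameters rather than a random initialization; in a real framework the list would typically be a container of layer modules instead of dicts.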

arxiv, papers