Research 2026-05-14
Understanding and Accelerating the Training of Masked Diffusion Language Models
Source: arXiv cs.AI
arXiv:2605.13026v1
Announce Type: cross
Abstract: Masked diffusion models (MDMs) have emerged as a promising alternative to autoregressive models (ARMs) for language modeling. However, MDMs are known to learn substantially more slowly than ARMs, which may become problematic when scaling MDMs to...
arxiv papers image-generation