Research 2026-04-20
DepCap: Adaptive Block-Wise Parallel Decoding for Efficient Diffusion LM Inference
Source: arXiv cs.AI
arXiv:2604.15750v1. Announce Type: cross.
Abstract: Diffusion language models (DLMs) have emerged as a promising alternative to autoregressive language generation due to their potential for parallel decoding and global refinement of the entire sequence. To unlock this potential, DLM inference must...
arxiv, papers, image-generation