BeClaude
Research, 2026-05-12

TAD: Temporal-Aware Trajectory Self-Distillation for Fast and Accurate Diffusion LLM

Source: Arxiv CS.AI

arXiv:2605.09536v1 | Announce Type: cross

Abstract: Diffusion large language models (dLLMs) offer a promising paradigm for parallel text generation, but in practice they face an accuracy-parallelism trade-off, where increasing tokens per forward (TPF) often degrades generation quality. Existing...
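The TPF trade-off the abstract describes can be illustrated with a toy confidence-based parallel-unmasking loop. This is only a hedged sketch of the general dLLM decoding pattern, not the paper's TAD method: the model call is faked with random confidences, and `parallel_decode` and its parameters are hypothetical names for illustration.

```python
import random

def parallel_decode(seq_len, tpf, seed=0):
    """Toy parallel unmasking: each forward pass assigns a confidence to
    every still-masked position; the `tpf` most confident positions are
    committed. Returns (number of forward passes, lowest committed
    confidence). Higher tpf -> fewer forwards, but lower-confidence
    tokens get committed -- the accuracy-parallelism trade-off."""
    rng = random.Random(seed)          # stand-in for a real dLLM forward
    masked = set(range(seq_len))
    steps, worst = 0, 1.0
    while masked:
        steps += 1
        # Hypothetical model output: one confidence per masked position.
        conf = {i: rng.random() for i in masked}
        picks = sorted(conf, key=conf.get, reverse=True)[:tpf]
        worst = min(worst, min(conf[i] for i in picks))
        masked -= set(picks)
    return steps, worst

for tpf in (1, 4, 16):
    steps, worst = parallel_decode(seq_len=32, tpf=tpf)
    print(f"TPF={tpf}: {steps} forwards, worst committed confidence {worst:.3f}")
```

With TPF=1 the loop always commits the single most confident token (32 forwards for a 32-token sequence); with TPF=16 it finishes in 2 forwards but must accept progressively less confident positions within each batch.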

Tags: arxiv, papers, image-generation