BeClaude
Research · 2026-04-30

Turning the TIDE: Cross-Architecture Distillation for Diffusion Large Language Models

Source: arXiv cs.AI

arXiv:2604.26951v1 (Announce Type: cross)

Abstract: Diffusion large language models (dLLMs) offer parallel decoding and bidirectional context, but state-of-the-art dLLMs require billions of parameters for competitive performance. While existing distillation methods for dLLMs reduce inference steps...

Tags: arxiv, papers, image-generation