Research 2026-05-14
Orthrus: Memory-Efficient Parallel Token Generation via Dual-View Diffusion
Source: Arxiv CS.AI
arXiv:2605.12825v1 Announce Type: cross
Abstract: We introduce Orthrus, a simple and efficient dual-architecture framework that unifies the exact generation fidelity of autoregressive Large Language Models (LLMs) with the high-speed parallel token generation of diffusion models. The sequential...
arxiv papers image-generation