Research 2026-05-14

Orthrus: Memory-Efficient Parallel Token Generation via Dual-View Diffusion

arXiv:2605.12825v1 (Announce Type: cross)

Abstract: We introduce Orthrus, a simple and efficient dual-architecture framework that unifies the exact generation fidelity of autoregressive Large Language Models (LLMs) with the high-speed parallel token generation of diffusion models. The sequential...

Read Original Article on Arxiv CS.AI

arxiv, papers, image-generation