Research | 2026-04-23
UCCL-Zip: Lossless Compression Supercharged GPU Communication
Source: arXiv cs.AI
arXiv:2604.17172v2 (announce type: replace-cross)
Abstract: The rapid growth of large language models (LLMs) has made GPU communication a critical bottleneck. While prior work reduces communication volume via quantization or lossy compression, these approaches introduce numerical errors that can...