Research
2026-05-14
High-Rate Quantized Matrix Multiplication I
Source: arXiv cs.AI
arXiv:2601.17187v2 (announce type: replace-cross)
Abstract: This paper investigates the problem of quantized matrix multiplication (MatMul), which has become crucial for the efficient deployment of large language models (LLMs). We consider a Generic MatMul setting, where both matrices must be...
Tags: arxiv, papers