Research · 2026-04-22
LBLLM: Lightweight Binarization of Large Language Models via Three-Stage Distillation
Source: Arxiv CS.AI
arXiv:2604.19167v1 (announce type: cross). Abstract: Deploying large language models (LLMs) in resource-constrained environments is hindered by heavy computational and memory requirements. We present LBLLM, a lightweight binarization framework that achieves effective W(1+1)A4 quantization through a...
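For context on what "binarization" means for model weights, here is a minimal sketch of generic 1-bit weight quantization in the BinaryNet/XNOR-Net style (scaled sign of the weights). This is an illustrative assumption only, not the paper's W(1+1)A4 scheme, whose details are not given in this truncated abstract; the function name `binarize_weights` is hypothetical.

```python
import numpy as np

def binarize_weights(w: np.ndarray) -> tuple[np.ndarray, float]:
    """Generic 1-bit weight binarization (scaled-sign style).

    Illustrative only: this is NOT the LBLLM W(1+1)A4 method,
    which is not described in the truncated abstract above.
    """
    alpha = float(np.abs(w).mean())       # per-tensor scaling factor
    w_bin = np.where(w >= 0, 1.0, -1.0)   # sign(W) in {-1, +1}
    return w_bin, alpha

# Toy example: binarize a 2x2 weight matrix and reconstruct it.
w = np.array([[0.4, -0.2], [-0.7, 0.1]])
w_bin, alpha = binarize_weights(w)
w_hat = alpha * w_bin                     # dequantized approximation of w
```

Each weight is stored as a single sign bit plus one shared scale per tensor, which is what makes 1-bit schemes so memory-efficient; the activation bit-width (the "A4" in W(1+1)A4) is handled separately at inference time.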