Research 2026-05-05
Make Your LVLM KV Cache More Lightweight
Source: Arxiv CS.AI
arXiv:2605.00789v1 Announce Type: cross

Abstract: The Key-Value (KV) cache has become a de facto component of modern Large Vision-Language Models (LVLMs) for inference. While it enhances decoding efficiency in Large Language Models (LLMs), its direct adoption in LVLMs introduces substantial GPU memory...
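The memory overhead the abstract alludes to can be made concrete with a back-of-envelope calculation. The sketch below is not from the paper; the model dimensions and vision-token count are illustrative assumptions, chosen to resemble a 7B-class decoder. It shows why prepending a few thousand vision tokens inflates the KV cache several-fold.

```python
def kv_cache_bytes(num_layers: int, num_heads: int, head_dim: int,
                   seq_len: int, batch: int = 1, bytes_per_elem: int = 2) -> int:
    """Estimate KV cache size: K and V each store a
    [batch, num_heads, seq_len, head_dim] tensor per layer."""
    return 2 * num_layers * batch * num_heads * seq_len * head_dim * bytes_per_elem

# Hypothetical 7B-class config: 32 layers, 32 heads, head dim 128, fp16 (2 bytes).
text_only = kv_cache_bytes(32, 32, 128, seq_len=512)
# Same prompt with 2048 vision tokens prepended (an assumed count).
with_vision = kv_cache_bytes(32, 32, 128, seq_len=512 + 2048)

print(f"text-only:   {text_only / 2**20:.0f} MiB")   # → 256 MiB
print(f"with vision: {with_vision / 2**20:.0f} MiB")  # → 1280 MiB
```

Under these assumed numbers, the vision tokens alone grow the cache by 5x, which is the kind of GPU-memory pressure that motivates making the LVLM KV cache more lightweight.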