Research · 2026-05-11
Semantic Integrity Matters: Benchmarking and Preserving High-Density Reasoning in KV Cache Compression
Source: arXiv cs.AI
arXiv:2502.01941v3 | Announce Type: replace-cross
Abstract: While Key-Value (KV) cache compression is essential for efficient LLM inference, current evaluations disproportionately focus on sparse retrieval tasks, potentially masking the degradation of High-Density Reasoning, where Chain-of-Thought...
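The abstract refers to KV cache compression without detailing a mechanism. As a generic illustration only (a simple heavy-hitter-style eviction heuristic, not the method studied or proposed in this paper), a toy sketch of compressing a cache under a token budget might look like:

```python
from typing import List, Tuple

def compress_kv_cache(
    cache: List[Tuple[str, float]],  # (token, cumulative attention score) pairs
    budget: int,
    recent_window: int = 4,
) -> List[Tuple[str, float]]:
    """Toy KV-cache eviction: always keep the most recent tokens, then
    fill the remaining budget with the highest-scoring older tokens.
    A generic heavy-hitter heuristic for illustration, NOT the paper's method."""
    if len(cache) <= budget:
        return cache  # already within budget, nothing to evict
    recent = cache[-recent_window:]
    older = cache[:-recent_window]
    keep_n = max(budget - len(recent), 0)
    # Select the top-scoring older entries, then restore their original order
    # so positional structure of the kept tokens is preserved.
    kept = sorted(older, key=lambda kv: kv[1], reverse=True)[:keep_n]
    kept_ids = {id(kv) for kv in kept}
    return [kv for kv in older if id(kv) in kept_ids] + recent
```

Evaluations of such schemes on sparse retrieval tasks can look fine even when long reasoning chains, which depend on many intermediate tokens, degrade; that is the gap the paper's benchmark targets.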
Tags: arxiv, papers, reasoning, benchmark