Research · 2026-05-11
Semantic Integrity Matters: Benchmarking and Preserving High-Density Reasoning in KV Cache Compression
Source: arXiv cs.AI
arXiv:2502.01941v3 | Announce Type: replace-cross
Abstract: While Key-Value (KV) cache compression is essential for efficient LLM inference, current evaluations disproportionately focus on sparse retrieval tasks, potentially masking the degradation of High-Density Reasoning, where Chain-of-Thought...
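The abstract refers to KV cache compression without detailing a mechanism. As a generic illustration only (a simple heavy-hitter-style eviction heuristic, not the method studied or proposed in this paper), a toy sketch of compressing a cache under a token budget might look like:

```python
from typing import List, Tuple

def compress_kv_cache(
    cache: List[Tuple[str, float]],  # (token, cumulative attention score) pairs
    budget: int,
    recent_window: int = 4,
) -> List[Tuple[str, float]]:
    """Toy KV-cache eviction: always keep the most recent tokens, then
    fill the remaining budget with the highest-scoring older tokens.
    A generic heavy-hitter heuristic for illustration, NOT the paper's method."""
    if len(cache) <= budget:
        return cache  # already within budget, nothing to evict
    recent = cache[-recent_window:]
    older = cache[:-recent_window]
    keep_n = max(budget - len(recent), 0)
    # Select the top-scoring older entries, then restore their original order
    # so positional structure of the kept tokens is preserved.
    kept = sorted(older, key=lambda kv: kv[1], reverse=True)[:keep_n]
    kept_ids = {id(kv) for kv in kept}
    return [kv for kv in older if id(kv) in kept_ids] + recent
```

Evaluations of such schemes on sparse retrieval tasks can look fine even when long reasoning chains, which depend on many intermediate tokens, degrade; that is the gap the paper's benchmark targets.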
Tags: arxiv, papers, reasoning, benchmark