Research · 2026-05-07
QKVShare: Quantized KV-Cache Handoff for Multi-Agent On-Device LLMs
Source: arXiv cs.AI
arXiv:2605.03884v1 Abstract: Multi-agent LLM systems on edge devices need to hand off latent context efficiently, but the practical choices today are expensive re-prefill or full-precision KV transfer. We study QKVShare, a framework for quantized KV-cache handoff between agents...
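The abstract contrasts re-prefill with full-precision KV transfer, with quantized handoff as the middle ground. As a rough illustration of the idea (not the paper's actual method, which is not described here), a minimal sketch of symmetric absmax int8 quantization applied to a KV-cache tensor's values, with all names hypothetical:

```python
# Illustrative sketch only: symmetric absmax int8 quantization, the kind of
# scheme a quantized KV-cache handoff might use. The sender quantizes each
# KV tensor to int8 plus a float scale; the receiver dequantizes on arrival.
# Function names and the per-tensor scaling choice are assumptions.

def quantize(vals, num_bits=8):
    """Quantize a flat list of floats to signed ints with one absmax scale."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    scale = max(abs(v) for v in vals) / qmax or 1.0  # avoid zero scale
    q = [max(-qmax, min(qmax, round(v / scale))) for v in vals]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from quantized ints and their scale."""
    return [x * scale for x in q]

# Example handoff payload: int8 values + one float scale instead of
# full-precision floats, roughly a 4x size reduction vs. float32.
kv_slice = [0.5, -1.27, 0.03]
q, s = quantize(kv_slice)
recon = dequantize(q, s)
```

A real system would apply finer-grained (e.g. per-channel or per-head) scales to limit quantization error, at the cost of shipping more scale metadata with the cache.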
Tags: arxiv, papers, agents