Research | 2026-05-11

When Stored Evidence Stops Being Usable: Scale-Conditioned Evaluation of Agent Memory

arXiv:2605.07313v1

Abstract: Memory-agent evaluations report fixed-snapshot accuracy or retrieval quality, but these scores do not show whether evidence remains usable as irrelevant sessions (sessions not annotated as task-relevant evidence for the query) accumulate. We present a...

Read the original article on arXiv cs.AI

arxiv, papers, agents