Research | 2026-05-14
VideoSEAL: Mitigating Evidence Misalignment in Agentic Long Video Understanding by Decoupling Answer Authority
Source: arXiv cs.AI
arXiv:2605.12571v1
Announce Type: cross
Abstract: Long video question answering requires locating sparse, time-scattered visual evidence within highly redundant content. Although current MLLMs perform well on short videos, long videos introduce long-horizon search and verification, which often...
arxiv, papers, agents