Research | 2026-04-17

DocSeeker: Structured Visual Reasoning with Evidence Grounding for Long Document Understanding

arXiv:2604.12812v2. Abstract: Existing Multimodal Large Language Models (MLLMs) suffer from significant performance degradation on the long document understanding task as document length increases. This stems from two fundamental challenges: 1) a low Signal-to-Noise Ratio...

Read the original article on arXiv cs.AI

arxiv, papers, reasoning