Research 2026-05-05
Persistent Visual Memory: Sustaining Perception for Deep Generation in LVLMs
Source: arXiv cs.AI
arXiv:2605.00814v1 (announce type: cross)
Abstract: While autoregressive Large Vision-Language Models (LVLMs) demonstrate remarkable proficiency in multimodal tasks, they face a "Visual Signal Dilution" phenomenon, where the accumulation of textual history expands the attention partition function,...
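The dilution effect named in the abstract can be illustrated with a toy calculation. The attention partition function is the softmax denominator, so every additional text token adds a term to it, and the share of attention left for a fixed set of visual tokens shrinks. The sketch below is not the paper's method: the function name `visual_attention_share`, the uniform-logit assumption, and the count of 576 visual tokens are all hypothetical choices made for illustration.

```python
import math


def visual_attention_share(n_visual, n_text, visual_logit=0.0, text_logit=0.0):
    """Fraction of softmax attention mass that lands on visual tokens.

    Toy assumption: every visual token shares one logit and every text
    token shares another, so the partition function is a two-term sum.
    """
    visual_mass = n_visual * math.exp(visual_logit)
    text_mass = n_text * math.exp(text_logit)
    partition = visual_mass + text_mass  # softmax denominator
    return visual_mass / partition


# Hypothetical setup: 576 visual tokens, with the text history growing.
N_VISUAL = 576
text_lengths = [0, 64, 576, 4096]
shares = [visual_attention_share(N_VISUAL, n) for n in text_lengths]

for n_text, share in zip(text_lengths, shares):
    print(f"text tokens = {n_text:5d}  visual share = {share:.3f}")
```

Under these assumptions the visual share falls monotonically as the text history grows, and it reaches exactly one half when the text tokens match the visual tokens in number and logit.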