Research · 2026-05-05

Persistent Visual Memory: Sustaining Perception for Deep Generation in LVLMs

arXiv:2605.00814v1 (announce type: cross). Abstract: While autoregressive Large Vision-Language Models (LVLMs) demonstrate remarkable proficiency in multimodal tasks, they face a "Visual Signal Dilution" phenomenon, where the accumulation of textual history expands the attention partition function,...
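The dilution effect named in the abstract can be illustrated with a toy softmax calculation. This is only a sketch under simplifying assumptions, not the paper's method: every token of a kind is given the same attention logit, and the function name `visual_attention_mass` and its parameters are hypothetical. It shows that with a fixed number of visual tokens, the share of attention they receive shrinks as the text history grows, because the softmax denominator (the partition function) grows with every added text token.

```python
import math


def visual_attention_mass(n_visual, n_text, visual_logit=0.0, text_logit=0.0):
    """Share of one query's softmax attention that lands on visual tokens.

    Toy assumption: all visual tokens share one logit and all text tokens
    share another, so the softmax reduces to a ratio of weighted counts.
    """
    visual = n_visual * math.exp(visual_logit)
    text = n_text * math.exp(text_logit)
    partition = visual + text  # softmax denominator; grows with text length
    return visual / partition


# 64 visual tokens, increasing amounts of generated text.
text_lengths = (0, 64, 512, 4096)
masses = [visual_attention_mass(n_visual=64, n_text=n) for n in text_lengths]

for n, m in zip(text_lengths, masses):
    print(f"text tokens: {n:5d}  visual attention share: {m:.4f}")
```

With equal logits, the share is simply `n_visual / (n_visual + n_text)`, so it halves once the text history is as long as the visual prefix and keeps falling from there. Real models have non-uniform logits, so actual attention patterns will differ from this toy picture.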

Read Original Article on Arxiv CS.AI
