BeClaude
Research | 2026-04-28

Can Multimodal Large Language Models Truly Understand Small Objects?

Source: arXiv cs.AI

arXiv:2604.22884v1 Announce Type: cross

Abstract: Multimodal Large Language Models (MLLMs) have shown promising potential in diverse understanding tasks, such as image and video analysis and math and physics olympiads. However, they remain unexplored for Small Object Understanding (SOU) tasks....

Tags: arxiv, papers, multimodal