Research 2026-05-14
Senses Wide Shut: A Representation-Action Gap in Omnimodal LLMs
Source: arXiv cs.AI
arXiv:2605.13737v1
Announce Type: new
Abstract: When an omnimodal large language model accepts a question whose textual premise contradicts what it actually sees or hears, does the failure lie in perception or in action? Recent omnimodal models are positioned as perception-grounded agents that...
arxiv, papers