Research 2026-05-14

Senses Wide Shut: A Representation-Action Gap in Omnimodal LLMs

arXiv:2605.13737v1 (announce type: new)

Abstract: When an omnimodal large language model accepts a question whose textual premise contradicts what it actually sees or hears, does the failure lie in perception or in action? Recent omnimodal models are positioned as perception-grounded agents that...

Read the original article on arXiv (cs.AI)
