Research | 2026-04-23
Same Content, Different Answers: Cross-Modal Inconsistency in MLLMs
Source: arXiv cs.AI
arXiv:2512.08923v2 (announce type: replace). Abstract: We introduce two new benchmarks, REST and REST+ (Render-Equivalence Stress Tests), to enable systematic evaluation of cross-modal inconsistency in multimodal large language models (MLLMs). MLLMs are trained to represent vision and language in the...