Research · 2026-04-23

Same Content, Different Answers: Cross-Modal Inconsistency in MLLMs

arXiv:2512.08923v2 (announce type: replace)

Abstract: We introduce two new benchmarks, REST and REST+ (Render-Equivalence Stress Tests), to enable systematic evaluation of cross-modal inconsistency in multimodal large language models (MLLMs). MLLMs are trained to represent vision and language in the...

Read the original article on arXiv cs.AI
