BeClaude
Research | 2026-04-28

Reasoning Dynamics and the Limits of Monitoring Modality Reliance in Vision-Language Models

Source: arXiv cs.AI

arXiv:2604.14888v2 (announce type: replace-cross)

Abstract: Recent advances in vision language models (VLMs) offer reasoning capabilities, yet how these unfold and integrate visual and textual information remains unclear. We analyze reasoning dynamics in 18 VLMs covering instruction-tuned and...

arxiv, papers, reasoning, vision