BeClaude
Research · 2026-05-08

Causal Probing for Internal Visual Representations in Multimodal Large Language Models

Source: arXiv cs.AI

arXiv:2605.05593v1 (announce type: new)

Abstract: Despite the remarkable success of Multimodal Large Language Models (MLLMs) across diverse tasks, the internal mechanisms governing how they encode and ground distinct visual concepts remain poorly understood. To bridge this gap, we propose a causal...

arxiv · papers · multimodal