Research · 2026-05-06

GEASS: Training-Free Caption Steering for Hallucination Mitigation in Vision-Language Models

Source: Arxiv CS.AI

arXiv:2605.01733v1 · Announce Type: cross

Abstract: Vision-Language Models (VLMs) excel at grounded reasoning but remain prone to object hallucination. Recent work treats self-generated captions as a uniformly positive resource, yet we find that naively embedding one can degrade rather than...

Tags: arxiv, papers, vision