Research 2026-05-12
Grounding the Score: Explicit Visual Premise Verification for Reliable Vision-Language Process Reward Models
Source: arXiv cs.AI
arXiv:2603.16253v2 Announce Type: replace-cross
Abstract: Vision-language process reward models (VL-PRMs) are increasingly used to score intermediate reasoning steps and rerank candidates under test-time scaling. However, they often function as black-box judges: a low step score may reflect a...
Tags: arxiv, papers, vision