Research 2026-05-12
Visual-ERM: Reward Modeling for Visual Equivalence
Source: arXiv cs.AI
arXiv:2603.13224v2 (Announce Type: replace-cross)
Abstract: Vision-to-code tasks require models to reconstruct structured visual inputs, such as charts, tables, and SVGs, into executable or structured representations with high visual fidelity. While recent Large Vision-Language Models (LVLMs) achieve...
arxiv, papers