BeClaude
Research · 2026-04-20

vla-eval: A Unified Evaluation Harness for Vision-Language-Action Models

Source: arXiv cs.AI

arXiv:2603.13966v2. Announce Type: replace.

Abstract: Vision-Language-Action (VLA) models are increasingly evaluated across multiple simulation benchmarks, yet adding each benchmark to an evaluation pipeline requires resolving incompatible dependencies, matching underspecified evaluation protocols,...

arxiv, papers, vision