Research 2026-04-20
vla-eval: A Unified Evaluation Harness for Vision-Language-Action Models
Source: arXiv cs.AI
arXiv:2603.13966v2
Announce Type: replace
Abstract: Vision-Language-Action (VLA) models are increasingly evaluated across multiple simulation benchmarks, yet adding each benchmark to an evaluation pipeline requires resolving incompatible dependencies, matching underspecified evaluation protocols,...
arxiv, papers, vision