BeClaude
Research | 2026-04-30

Value-Guided Iterative Refinement and the DIQ-H Benchmark for Evaluating VLM Robustness

Source: arXiv cs.AI

arXiv:2512.03992v2. Abstract: Vision-Language Models (VLMs) are essential for embodied AI and safety-critical applications, such as robotics and autonomous systems. However, existing benchmarks primarily focus on static or curated visual inputs, neglecting the challenges...

Tags: arxiv, papers, benchmark