Research | 2026-05-12
AU-Harness: An Open-Source Toolkit for Holistic Evaluation of Audio LLMs
Source: arXiv cs.AI
arXiv:2509.08031v3. Abstract: Large Audio Language Models (LALMs) are rapidly advancing, but evaluating them remains challenging due to inefficient and non-standardized toolkits that limit fair comparison and systematic assessment. Existing evaluation frameworks exhibit...
arxiv, papers