Research | 2026-05-06

Submodular Benchmark Selection

arXiv:2605.02209v1 | Announce Type: new

Abstract: Evaluating large language models across many benchmarks is expensive, yet many benchmarks are highly correlated. We formalize the selection of a small, informative subset as submodular maximization under a multivariate Gaussian model. Entropy...
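The abstract is cut off here, so the paper's actual objective and algorithm are not visible. As a rough illustration of the general idea only, the sketch below applies the standard greedy heuristic to Gaussian entropy, which is a submodular set function. It assumes benchmark scores are described by a known covariance matrix; the function name `greedy_entropy_selection`, the toy covariance values, and the choice of greedy selection are all assumptions for illustration, not taken from the paper.

```python
import math


def greedy_entropy_selection(cov, k):
    """Greedily pick up to k benchmarks that maximize joint Gaussian entropy.

    Hypothetical helper, not the paper's method. `cov` is a symmetric
    positive semidefinite covariance matrix given as a list of lists.
    Adding benchmark j to the selected set raises the entropy by
    0.5 * log(2*pi*e * var_j), where var_j is the variance of j
    conditioned on the benchmarks already selected. Maximizing the gain
    is therefore the same as picking the largest conditional variance.
    """
    n = len(cov)
    resid = [row[:] for row in cov]  # covariance conditioned on the selection
    selected, gains = [], []
    for _ in range(k):
        candidates = [j for j in range(n) if j not in selected]
        if not candidates:
            break
        best = max(candidates, key=lambda j: resid[j][j])
        var = resid[best][best]
        if var <= 1e-12:
            break  # remaining benchmarks are (numerically) fully determined
        selected.append(best)
        gains.append(0.5 * math.log(2 * math.pi * math.e * var))
        # Schur complement update: condition every entry on `best`.
        col = [resid[i][best] for i in range(n)]
        for i in range(n):
            for j in range(n):
                resid[i][j] -= col[i] * col[j] / var
    return selected, gains


# Toy covariance (made-up numbers): benchmarks 0 and 1 are strongly
# correlated, benchmark 2 is independent of both.
toy_cov = [
    [1.10, 0.95, 0.00],
    [0.95, 1.00, 0.00],
    [0.00, 0.00, 0.90],
]
selected, gains = greedy_entropy_selection(toy_cov, 2)
print(selected)  # [0, 2]
```

In this toy case the second pick skips benchmark 1, because once benchmark 0 is known its conditional variance falls to roughly 0.18, while the independent benchmark 2 keeps its full variance of 0.90.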

Read Original Article on Arxiv CS.AI

arxiv | papers | benchmark