Research 2026-04-28
ScoringBench: A Benchmark for Evaluating Tabular Foundation Models with Proper Scoring Rules
Source: Arxiv CS.AI
arXiv:2603.29928v2. Abstract: Tabular foundation models such as TabPFN and TabICL already produce full predictive distributions, yet prevailing regression benchmarks evaluate them almost exclusively via point-estimate metrics (RMSE, $R^2$). This discards precisely the...
arxiv papers benchmark