Research 2026-05-06
DataClaw: A Process-Oriented Agent Benchmark for Exploratory Real-World Data Analysis
Source: arXiv cs.AI
arXiv:2605.02503v1. Abstract: Evaluating autonomous data analysis agents requires testing their ability to perform exploratory analysis in underexplored data environments. However, many existing benchmarks emphasize final answer accuracy in prior-guided data settings and provide...
arxiv, papers, agents, benchmark