Research, 2026-04-17
RiskWebWorld: A Realistic Interactive Benchmark for GUI Agents in E-commerce Risk Management
Source: arXiv cs.AI
arXiv:2604.13531v1 (announce type: new)
Abstract: Graphical User Interface (GUI) agents show strong capabilities for automating web tasks, but existing interactive benchmarks primarily target benign, predictable consumer environments. Their effectiveness in high-stakes, investigative domains such as...
arxiv, papers, agents, benchmark