BeClaude
Research · 2026-04-28

Auditing Sabotage Bench: A Benchmark for Detecting and Fixing Research Sabotage in ML Codebases

Source: Arxiv CS.AI

arXiv:2604.16286v2 (announce type: replace)

Abstract: As AI systems are increasingly used to conduct research autonomously, misaligned systems could introduce subtle flaws that produce misleading results while evading detection. We introduce Auditing Sabotage Bench, a benchmark for evaluating the...

Tags: arxiv, papers, benchmark