BeClaude
Research 2026-05-12

A Benchmark for Evaluating Outcome-Driven Constraint Violations in Autonomous AI Agents

Source: arXiv cs.AI

arXiv:2512.20798v5 (announce type: replace)

Abstract: As autonomous AI agents are increasingly deployed in high-stakes environments, ensuring their safety and alignment with human values is becoming a practical deployment concern. Current benchmarks for AI agents primarily evaluate refusal of...

arxiv, papers, agents, benchmark