Policy | 2026-05-08

Beyond Accuracy: Policy Invariance as a Reliability Test for LLM Safety Judges

arXiv:2605.06161v1

Abstract: LLM-as-a-Judge pipelines have become the de facto evaluator for agent safety, yet existing benchmarks treat their verdicts as ground-truth proxies without checking whether the verdicts depend on the agent's behavior or merely on how the evaluation...

Read the original article on arXiv cs.AI

arxiv, papers, safety