Policy 2026-05-08
Beyond Accuracy: Policy Invariance as a Reliability Test for LLM Safety Judges
Source: arXiv cs.AI
arXiv:2605.06161v1
Announce Type: new
Abstract: LLM-as-a-Judge pipelines have become the de facto evaluator for agent safety, yet existing benchmarks treat their verdicts as ground-truth proxies without checking whether the verdicts depend on the agent's behavior or merely on how the evaluation...
arxiv papers safety