Research2026-04-22

Owner-Harm: A Missing Threat Model for AI Agent Safety

arXiv:2604.18658v1 Announce Type: cross Abstract: Existing AI agent safety benchmarks focus on generic criminal harm (cybercrime, harassment, weapon synthesis), leaving a systematic blind spot for a distinct and commercially consequential threat category: agents harming their own deployers....

Read Original Article on Arxiv CS.AI

arxivpapersagentssafety