Research 2026-04-22
Owner-Harm: A Missing Threat Model for AI Agent Safety
Source: arXiv cs.AI
arXiv:2604.18658v1 (cross-listed). Abstract: Existing AI agent safety benchmarks focus on generic criminal harm (cybercrime, harassment, weapon synthesis), leaving a systematic blind spot for a distinct and commercially consequential threat category: agents harming their own deployers....
arxiv papers agents safety