Research 2026-05-08
Automated alignment is harder than you think
Source: arXiv cs.AI
arXiv:2605.06390v1 Announce Type: new Abstract: A leading proposal for aligning artificial superintelligence (ASI) is to use AI agents to automate an increasing fraction of alignment research as capabilities improve. We argue that, even when research agents are not scheming to deliberately sabotage...
arxivpapers