Research 2026-05-12
Towards Shutdownable Agents: Generalizing Stochastic Choice in RL Agents and LLMs
Source: arXiv cs.AI
arXiv:2604.17502v2. Announce type: replace. Abstract: Misaligned artificial agents might resist shutdown. One proposed solution is to train agents to lack preferences between different-length trajectories. The Discounted Reward for Same-Length Trajectories (DReST) reward function does this by...
arxiv, papers, agents