Research 2026-05-12

Towards Shutdownable Agents: Generalizing Stochastic Choice in RL Agents and LLMs

arXiv:2604.17502v2. Abstract: Misaligned artificial agents might resist shutdown. One proposed solution is to train agents to lack preferences between different-length trajectories. The Discounted Reward for Same-Length Trajectories (DReST) reward function does this by...

Read the original article on arXiv cs.AI

arxiv, papers, agents