Research | 2026-04-24
Dynamical Priors as a Training Objective in Reinforcement Learning
Source: arXiv cs.AI
arXiv:2604.21464v1 | Announce Type: cross
Abstract: Standard reinforcement learning (RL) optimizes policies for reward but imposes few constraints on how decisions evolve over time. As a result, policies may achieve high performance while exhibiting temporally incoherent behavior such as abrupt...
arxiv, papers, rl