Research 2026-05-08
Frictional Q-Learning
Source: arXiv cs.AI
arXiv:2509.19771v4 Announce Type: replace-cross
Abstract: Off-policy reinforcement learning suffers from extrapolation errors when a learned policy selects actions that are weakly supported in the replay buffer. In this study, we address this issue by drawing an analogy to static friction. From...
arxiv, papers