BeClaude
Research 2026-05-08

Frictional Q-Learning

Source: Arxiv CS.AI

arXiv:2509.19771v4

Abstract: Off-policy reinforcement learning suffers from extrapolation errors when a learned policy selects actions that are weakly supported in the replay buffer. In this study, we address this issue by drawing an analogy to static friction. From...

arxiv, papers