Policy 2026-04-22
Low-Rank Adaptation for Critic Learning in Off-Policy Reinforcement Learning
Source: arXiv cs.AI
arXiv:2604.18978v1
Announce Type: cross
Abstract: Scaling critic capacity is a promising direction for enhancing off-policy reinforcement learning (RL). However, larger critics are prone to overfitting and unstable in replay-buffer-based bootstrap training. This paper leverages Low-Rank Adaptation...
arxiv, papers, rl