BeClaude
Policy 2026-04-22

Low-Rank Adaptation for Critic Learning in Off-Policy Reinforcement Learning

Source: arXiv cs.AI

arXiv:2604.18978v1 (announce type: cross)

Abstract: Scaling critic capacity is a promising direction for enhancing off-policy reinforcement learning (RL). However, larger critics are prone to overfitting and are unstable in replay-buffer-based bootstrap training. This paper leverages Low-Rank Adaptation...

arxiv, papers, rl