BeClaude
Research 2026-05-12

Beyond Penalization: Diffusion-based Out-of-Distribution Detection and Selective Regularization in Offline Reinforcement Learning

Source: arXiv cs.AI

arXiv:2605.08202v1 (announce type: cross)

Abstract: Offline reinforcement learning (RL) faces a critical challenge of overestimating the value of out-of-distribution (OOD) actions. Existing methods mitigate this issue by penalizing unseen samples, yet they fail to accurately identify OOD actions and...

arxiv, papers, image-generation, rl