Research2026-05-12
Beyond Penalization: Diffusion-based Out-of-Distribution Detection and Selective Regularization in Offline Reinforcement Learning
Source: Arxiv CS.AI
arXiv:2605.08202v1 Announce Type: cross Abstract: Offline reinforcement learning (RL) faces a critical challenge of overestimating the value of out-of-distribution (OOD) actions. Existing methods mitigate this issue by penalizing unseen samples, yet they fail to accurately identify OOD actions and...
arxivpapersimage-generationrl