Research 2026-05-12
Robust Probabilistic Shielding for Safe Offline Reinforcement Learning
Source: arXiv cs.AI
arXiv:2605.10293v1 (Announce Type: cross)
Abstract: In offline reinforcement learning (RL), we learn policies from fixed datasets without environment interaction. The major challenges are to provide guarantees on the (1) performance and (2) safety of the resulting policy. A technique called safe...
arxiv papers rl