Research · 2026-05-12

Robust Probabilistic Shielding for Safe Offline Reinforcement Learning

arXiv:2605.10293v1 (announce type: cross)

Abstract: In offline reinforcement learning (RL), we learn policies from fixed datasets without environment interaction. The major challenges are to provide guarantees on the (1) performance and (2) safety of the resulting policy. A technique called safe...

Read the original article on arXiv cs.AI

arxiv, papers, rl