BeClaude
Research 2026-05-12

Robust Probabilistic Shielding for Safe Offline Reinforcement Learning

Source: arXiv cs.AI

arXiv:2605.10293v1 (announce type: cross)

Abstract: In offline reinforcement learning (RL), we learn policies from fixed datasets without environment interaction. The major challenges are to provide guarantees on (1) the performance and (2) the safety of the resulting policy. A technique called safe...
