BeClaude
Research 2026-05-06

Model-Based Proactive Cost Generation for Learning Safe Policies Offline with Limited Violation Data

Source: arXiv cs.AI

arXiv:2605.01356v1 (Announce Type: cross)

Abstract: Learning constraint-satisfying policies from offline data without risky online interaction is crucial for safety-critical decision making. Conventional methods typically learn cost value functions from abundant unsafe samples to define safety...

arxiv, papers