Research2026-04-30

Lyapunov-Guided Self-Alignment: Test-Time Adaptation for Offline Safe Reinforcement Learning

arXiv:2604.26516v1 Announce Type: cross Abstract: Offline reinforcement learning (RL) agents often fail when deployed, as the gap between training datasets and real environments leads to unsafe behavior. To address this, we present SAS (Self-Alignment for Safety), a transformer-based framework that...

Read Original Article on Arxiv CS.AI

arxivpapersrl