BeClaude
Policy · 2026-04-20

Optimistic Policy Learning under Pessimistic Adversaries with Regret and Violation Guarantees

Source: arXiv cs.AI

arXiv:2604.14243v2 Announce Type: replace-cross Abstract: Real-world decision-making systems operate in environments where state transitions depend not only on the agent's actions but also on exogenous factors outside its control: competing agents, environmental disturbances, or strategic...

arxiv, papers