Policy | 2026-04-20
Optimistic Policy Learning under Pessimistic Adversaries with Regret and Violation Guarantees
Source: arXiv cs.AI
arXiv:2604.14243v2 Announce Type: replace-cross
Abstract: Real-world decision-making systems operate in environments where state transitions depend not only on the agent's actions but also on exogenous factors outside its control: competing agents, environmental disturbances, or strategic...