Research | 2026-06-29

Algorithms for Deciding the Safety of States in Fully Observable Non-deterministic Problems: Technical Report

Originally published by arXiv cs.AI

arXiv:2603.15282v2. Abstract: Learned action policies are increasingly popular in sequential decision-making, but suffer from a lack of safety guarantees. Recent work introduced a pipeline for testing the safety of such policies under initial-state and action-outcome...

A Formal Safety Check for Non-Deterministic AI Policies

Version 2 of arXiv:2603.15282 tackles a basic gap in AI deployment: how to formally verify that a learned policy stays safe even when the environment behaves unpredictably. The authors present algorithms for deciding the safety of states in fully observable non-deterministic (FOND) planning problems, a class of problems in which an action can have several possible outcomes but the agent always observes the current state in full.

This work builds on a previously introduced pipeline that tests policy safety under initial-state and action-outcome uncertainty. The key advance here is moving from empirical testing to formal decision procedures. Instead of running many simulations and hoping that no failure shows up, the algorithms decide whether a given state is safe, meaning that no sequence of non-deterministic outcomes can lead from it to a failure state.
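To make that definition concrete, here is a minimal sketch of a safety check for a single state under a fixed policy. It is not one of the paper's algorithms, which this summary does not describe; it is a plain breadth-first search over every possible outcome. The toy model and the names `TRANSITIONS`, `FAILURE_STATES`, and `is_safe` are hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical toy FOND model (an assumption for illustration):
# TRANSITIONS[state][action] is the set of possible successor states.
# No probabilities are attached to the outcomes.
TRANSITIONS = {
    "start": {"go": {"road", "ice"}},
    "road": {"go": {"goal"}},
    "ice": {"go": {"road", "crash"}, "brake": {"road"}},
    "goal": {},
    "crash": {},
}
FAILURE_STATES = {"crash"}


def is_safe(state, policy, transitions, failure_states):
    """Return True if no outcome sequence under `policy` reaches a failure state."""
    seen = {state}
    queue = deque([state])
    while queue:
        current = queue.popleft()
        if current in failure_states:
            return False
        action = policy.get(current)
        if action is None:  # the policy picks no action here, so the run stops
            continue
        # Follow every possible outcome, not just a sampled one.
        for successor in transitions[current][action]:
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return True


risky_policy = {"start": "go", "road": "go", "ice": "go"}
careful_policy = {"start": "go", "road": "go", "ice": "brake"}

print(is_safe("start", risky_policy, TRANSITIONS, FAILURE_STATES))    # False
print(is_safe("start", careful_policy, TRANSITIONS, FAILURE_STATES))  # True
```

Because the search enumerates every outcome of each chosen action, a `True` answer covers all outcome sequences in the model, which a finite batch of simulated runs cannot promise.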

Why This Matters

Most reinforcement learning and imitation learning policies today are validated through extensive testing, not formal proof. Testing works well in deterministic or near-deterministic settings, but it can miss rare edge cases, and a missed case may be catastrophic. The authors address this vulnerability by treating action outcomes as non-deterministic, which requires no transition probabilities, unlike the probabilistic models common in MDP-based verification.

For safety-critical applications like autonomous driving, robotic surgery, or industrial control, the difference between "tested on 10,000 scenarios" and "formally proven safe under all possible non-deterministic outcomes" is the difference between acceptable risk and genuine assurance. This work provides a path to the latter.

Implications for AI Practitioners

First, practitioners building safety-critical systems should pay close attention to the FOND formalism. It offers a middle ground between full probabilistic modeling (which requires accurate transition probabilities) and worst-case analysis (which can be too conservative). The non-deterministic model captures genuine uncertainty without requiring precise probability estimates.
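The point about probabilities can be illustrated with a small sketch. It assumes a probabilistic model stored as nested dictionaries and a hypothetical helper `to_nondeterministic` that keeps only the outcomes with nonzero probability; neither comes from the paper. Two very different probability estimates collapse to the same non-deterministic model, so a safety verdict computed on that model does not depend on which estimate is right.

```python
def to_nondeterministic(prob_transitions):
    """Drop probabilities, keeping each action's set of possible successors."""
    return {
        state: {
            action: {succ for succ, prob in dist.items() if prob > 0}
            for action, dist in actions.items()
        }
        for state, actions in prob_transitions.items()
    }


# Two hypothetical estimates of the same action's outcome distribution.
estimate_a = {"ice": {"go": {"road": 0.999, "crash": 0.001}}}
estimate_b = {"ice": {"go": {"road": 0.90, "crash": 0.10}}}

# Both estimates yield the same outcome sets: "go" on ice may lead to
# either "road" or "crash".
same_model = to_nondeterministic(estimate_a) == to_nondeterministic(estimate_b)
print(same_model)  # True
```

The trade-off is that the non-deterministic view treats an outcome with probability 0.001 exactly like one with probability 0.10, so it cannot express how likely a failure is, only whether it is possible.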

Second, the algorithmic focus on state-level safety rather than policy-level safety is a pragmatic choice. It allows verification to scale by decomposing the problem: check each reachable state individually, rather than attempting to verify the entire policy at once. This aligns well with how modern AI systems are built—modular, with clear state representations.
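One way to obtain per-state verdicts cheaply, sketched here under the same fixed-policy reading of safety and not taken from the paper, is to label all unsafe states in a single backward pass from the failure states. Each state-level query then becomes a set lookup. The model and the function name `unsafe_states` are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical toy model: TRANSITIONS[state][action] -> set of successors.
TRANSITIONS = {
    "start": {"go": {"road", "ice"}},
    "road": {"go": {"goal"}},
    "ice": {"go": {"road", "crash"}, "brake": {"road"}},
    "goal": {},
    "crash": {},
}
FAILURE_STATES = {"crash"}


def unsafe_states(policy, transitions, failure_states):
    """Return every state from which `policy` may reach a failure state."""
    # Reverse the edges of the policy-induced graph: successor -> predecessors.
    predecessors = defaultdict(set)
    for state, action in policy.items():
        for successor in transitions[state][action]:
            predecessors[successor].add(state)

    # Propagate "unsafe" backward from the failure states.
    unsafe = set(failure_states)
    queue = deque(failure_states)
    while queue:
        current = queue.popleft()
        for pred in predecessors[current]:
            if pred not in unsafe:
                unsafe.add(pred)
                queue.append(pred)
    return unsafe


risky_policy = {"start": "go", "road": "go", "ice": "go"}
unsafe = unsafe_states(risky_policy, TRANSITIONS, FAILURE_STATES)

print(sorted(unsafe))        # ['crash', 'ice', 'start']
print("road" not in unsafe)  # True: "road" is safe under this policy
```

The pass visits each edge of the policy-induced graph at most once, so its cost grows linearly with the size of that graph. That only helps when the graph can be enumerated explicitly, which is an assumption of this sketch.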

Third, the work implicitly argues for a shift in how we evaluate learned policies. The current norm of reporting average return or success rate obscures safety failures that occur in rare but critical states. Under that shift, formal safety verification would become a standard part of the deployment pipeline, much like unit testing in software engineering.

Key Takeaways

The paper provides algorithms for formally verifying safety in non-deterministic environments, moving beyond statistical testing to provable guarantees.
FOND planning offers a practical formalism for safety verification that avoids requiring precise probability estimates while still capturing genuine uncertainty.
State-level safety checking enables modular verification that can scale to complex systems, aligning with modern AI architecture patterns.
For safety-critical applications, this work points toward a future where formal safety proofs become a standard requirement before deployment, not an afterthought.
