Cost-Optimal Decision Diagrams for Stochastic Boolean Function Evaluation
arXiv:2606.24672v1 Announce Type: new Abstract: In many decision-making scenarios, acquiring information incurs different costs. We consider the problem of constructing a deterministic evaluation strategy that minimizes the expected cost of evaluating a propositional formula under variable costs...
What Happened
A new arXiv preprint (2606.24672v1) introduces a formal framework for constructing cost-optimal decision diagrams to evaluate stochastic Boolean functions. The core problem is deceptively simple: when testing a logical formula like "(A AND B) OR C", each variable (A, B, C) may have a different cost to observe (e.g., running a medical test or querying a database). The goal is to design a deterministic evaluation strategy—a decision tree or diagram—that minimizes the expected total cost, given that some variables may be stochastic (their values are uncertain but have known probabilities).
The authors propose using decision diagrams as a compact representation of evaluation strategies, then solving for the optimal ordering and branching structure. This goes beyond classic "short-circuit evaluation" (e.g., stopping early when A is false in "A AND B") by explicitly modeling variable costs and probabilities, and by allowing the strategy to adapt dynamically based on partial results.
Why It Matters
This work addresses a fundamental gap in practical decision-making under uncertainty. Many real-world systems—from diagnostic algorithms to database query optimizers—already use heuristic cost-based evaluation, but they rarely guarantee optimality. The key insight is that the optimal strategy is not always intuitive: sometimes it is cheaper to test an expensive variable first if it has a high probability of terminating the evaluation early.
For AI practitioners, the implications are concrete:
- Resource-constrained AI systems: Any agent that must pay per query (e.g., API calls, sensor readings, human-in-the-loop checks) can benefit from this framework. A chatbot that decides whether to call a weather API or a calendar API first, based on the user's question, is solving exactly this problem.
- Explainability and determinism: Unlike reinforcement learning approaches that learn evaluation policies through trial and error, this method produces a deterministic, provably optimal strategy. This is critical in regulated domains (healthcare, finance) where auditability matters.
- Scalability: Decision diagrams scale better than exhaustive decision trees for many Boolean functions, making the approach feasible for formulas with dozens of variables.
Implications for AI Practitioners
The most immediate application is in active learning and Bayesian optimization, where the cost of acquiring labels or features varies. For example, in medical diagnosis, a blood test might cost $100 while a symptom check costs $0—the optimal strategy might test symptoms first even if they are less informative, simply because they are cheap.
Additionally, this work connects to probabilistic programming and causal inference. When evaluating a complex logical condition over uncertain variables (e.g., "if (test1 positive OR test2 positive) AND risk_factor"), the optimal evaluation order can save significant compute or money.
However, practitioners should note the limitations: the framework assumes known probabilities and costs, which may not hold in dynamic environments. The paper also focuses on Boolean functions, so extending to continuous or multi-valued variables remains future work.
Key Takeaways
- A new formal method constructs provably cost-optimal deterministic evaluation strategies for Boolean functions with variable observation costs and stochastic outcomes.
- The approach uses decision diagrams to compactly represent strategies, enabling optimality guarantees that heuristic methods lack.
- AI practitioners can apply this to any system where information acquisition has heterogeneous costs—diagnostics, query optimization, and active learning.
- The framework requires known probabilities and costs, limiting its use in fully dynamic or unknown environments without additional estimation techniques.