Article

Exact Stochastic Constraint Optimisation with Applications in Network Analysis

Abstract

We present an extensive study of methods for exactly solving stochastic constraint (optimisation) problems (SCPs) in network analysis. These problems are prevalent in science, governance and industry. The first method we study is generic and decomposes stochastic constraints into a multitude of smaller local constraints that are solved using a constraint programming (CP) or mixed-integer programming (MIP) solver. However, many SCPs are formulated on probability distributions with a monotonic property, meaning that adding a positive decision to a partial solution to the problem cannot cause a decrease in solution quality. The second method is specifically designed for solving global stochastic constraints on monotonic probability distributions (SCMDs) in CP. Both methods use knowledge compilation to obtain a decision diagram encoding of the relevant probability distributions, where we focus on ordered binary decision diagrams (OBDDs). We discuss theoretical advantages and disadvantages of these methods and evaluate them experimentally. We observed that global approaches to solving SCMDs outperform decomposition approaches from CP, and perform complementarily to MIP-based decomposition approaches, while scaling much more favourably with instance size. Both methods have many alternative design choices, as both knowledge compilation and constraint solvers are used in a single pipeline. To identify which configurations work best, we apply programming by optimisation. Specifically, we show how an automated algorithm configurator can be used to find optimised configurations of our pipeline. After configuration, our global SCMD solving pipeline outperforms its closest competitor (a MIP-based decomposition pipeline) on all test sets we considered by up to two orders of magnitude in terms of PAR10 scores.
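Both pipelines hinge on one operation: once a probability distribution has been compiled into an OBDD, the probability that a candidate strategy satisfies a stochastic constraint can be read off by a single weighted traversal. The sketch below illustrates only that traversal; it is not the authors' implementation, and the `Node` encoding, variable names and example numbers are assumptions made for illustration.

```python
class Node:
    """Toy OBDD node: a variable with low (var=0) and high (var=1) children."""
    def __init__(self, var, low, high):
        self.var, self.low, self.high = var, low, high

TRUE, FALSE = "T", "F"  # terminal sinks

def success_probability(node, strategy, prob):
    """Probability that the OBDD-encoded formula holds under a strategy.

    strategy: dict decision_var -> 0/1; prob: dict stochastic_var -> P(var=1).
    A real implementation memoizes per node, making this linear in OBDD size.
    """
    if node is TRUE:
        return 1.0
    if node is FALSE:
        return 0.0
    if node.var in strategy:  # decision variable: follow the chosen branch
        child = node.high if strategy[node.var] else node.low
        return success_probability(child, strategy, prob)
    p = prob[node.var]        # stochastic variable: weight both branches
    return (p * success_probability(node.high, strategy, prob)
            + (1 - p) * success_probability(node.low, strategy, prob))

# Formula (d AND s) with decision d and stochastic s, P(s=1) = 0.7 (assumed):
bdd = Node("d", FALSE, Node("s", FALSE, TRUE))
print(success_probability(bdd, {"d": 1}, {"s": 0.7}))  # 0.7
```

The monotonic property exploited in the abstract means that flipping any decision variable from 0 to 1 can only increase this value, which is what allows a global propagator to compute sound bounds without enumerating strategies.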

... In the context of constraint programming, optimization under probability, confidence or statistical constraints is becoming standard (Pachet et al. 2015; Pesant 2015; Perez and Régin 2017; Perez, Rappazzo, and Gomes 2018; Hooker 2022; Latour et al. 2022). Stochastic constraint programming itself was defined at the beginning of this century (Walsh 2002). ...
Article
In robust optimization, finding a solution that merely respects the constraints is not enough. Usually, the uncertainty and unknown parameters of the model are represented by random variables. In such conditions, a good solution is one that is robust to the most likely assignments of these random variables. Recently, the Confidence constraint was introduced by Mercier-Aubin et al. to enforce this type of robustness in constraint programming. Unfortunately, it is restricted to a conjunction of binary inequalities. In this paper, we generalize the Confidence constraint to any constraint and propose an implementation based on Multi-valued Decision Diagrams (MDDs). The Confidence constraint is defined over a vector of random variables. For a given constraint C and a given threshold, the Confidence constraint ensures that the probability that C is satisfied by a sample of the random variables is greater than the threshold. We propose to use MDDs to represent the constraints on the random variables. MDDs are an efficient tool for representing combinatorial constraints, thanks to their exponential compression power. Here, both random and decision variables are stored in the MDD, and propagation rules are proposed for removing values of decision variables that cannot lead to robust solutions. Furthermore, for several constraints, we show that decision variables can be omitted from the MDD because lighter filtering algorithms are sufficient. This yields an exponential reduction in MDD size. Experimental results on a chemical deliveries problem in factories, where chemical consumption is uncertain, show the efficiency of the proposed approach.
... Such problems have become increasingly common over the years in many practical applications, such as dynamic production/inventory lot-sizing, power transmission grid reliability, signaling-regulatory pathway inference, and staffing optimization of emergency department healthcare. SOPSC belong to the class of NP-hard problems, and even near-optimal designs are usually difficult to obtain in a reasonable time [2,3]. ...
Article
Full-text available
Simulation optimization problems with stochastic constraints are optimization problems with deterministic cost functions subject to stochastic constraints. Solving such problems with traditional optimization approaches is time-consuming when the search space is large. In this work, an approach integrating beluga whale optimization and ordinal optimization is presented to resolve the considered problem in a relatively short time frame. The proposed approach is composed of three levels: emulator, diversification, and intensification. First, polynomial chaos expansion is used as an emulator to evaluate a design. Second, an improved beluga whale optimization is proposed to seek N candidates from the whole search space. Finally, advanced optimal computational effort allocation is adopted to determine a superior design from the N candidates. The proposed approach is applied to seeking the optimal number of service providers that minimizes staffing costs while delivering a specific level of care in emergency department healthcare. A practical example of an emergency department with six cases is used to verify the proposed approach. The CPU time is under one minute for all six cases, which demonstrates that the proposed approach can meet the requirements of real-time application. In addition, the proposed approach is compared to five heuristic methods. Empirical tests indicate the efficiency and robustness of the proposed approach.
Article
Technology helps producers collect enormous amounts of customer-product interaction (CPI) data. From the collected CPI data, the importance of customers and products can be measured. This study focuses on finding the best number of products for a business's campaign selection, based on customers' liking and disliking of products and their purchase frequency. Campaign selection is an essential process in marketing, and when the total campaign budget is fixed, identifying the optimal products for campaigning is challenging. This paper aims to identify the most important existing customers and products, and to identify the optimal product combinations for campaign selection. We propose two algorithms that identify highly valuable customers and products and solve the complex optimization problem of finding the product combinations that maximize the business's profits. We compare our proposed algorithms on real and synthetic datasets and present the results.
Conference Paper
Full-text available
Complex multi-stage decision making problems often involve uncertainty, for example, regarding demand or processing times. Stochastic constraint programming was proposed as a way to formulate and solve such decision problems, involving arbitrary constraints over both decision and random variables. What stochastic constraint programming still lacks is support for the use of factorized probabilistic models that are popular in the graphical model community. We show how a state-of-the-art probabilistic inference engine can be integrated into standard constraint solvers. The resulting approach searches over the And-Or search tree directly, and we investigate tight bounds on the expected utility objective. This significantly improves search efficiency and outperforms scenario-based methods that ground out the possible worlds.
Article
Full-text available
We present the first general purpose framework for marginal maximum a posteriori estimation of probabilistic program variables. By using a series of code transformations, the evidence of any probabilistic program, and therefore of any graphical model, can be optimized with respect to an arbitrary subset of its sampled variables. To carry out this optimization, we develop the first Bayesian optimization package to directly exploit the source code of its target, leading to innovations in problem-independent hyperpriors, unbounded optimization, and implicit constraint satisfaction; delivering significant performance improvements over prominent existing packages. We present applications of our method to a number of tasks including engineering design and parameter optimization.
Article
Full-text available
Sparsification reduces the size of networks while preserving structural and statistical properties of interest. Various sparsifying algorithms have been proposed in different contexts. We contribute the first systematic conceptual and experimental comparison of edge sparsification methods on a diverse set of network properties. It is shown that they can be understood as methods for rating edges by importance and then filtering globally by these scores. In addition, we propose a new sparsification method (Local Degree) which preserves edges leading to local hub nodes. All methods are evaluated on a set of 100 Facebook social networks with respect to network properties including diameter, connected components, community structure, and multiple node centrality measures. Experiments with our implementations of the sparsification methods (using the open-source network analysis tool suite NetworKit) show that many network properties can be preserved down to about 20% of the original set of edges. Furthermore, the experimental results allow us to differentiate the behavior of different methods and show which method is suitable with respect to which property. Our Local Degree method is fast enough for large-scale networks and performs well across a wider range of properties than previously proposed methods.
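The Local Degree idea lends itself to a compact sketch: for each node, keep edges to its highest-degree neighbours, so that edges leading to local hubs survive. This is a plausible reading of the method as described above, not the NetworKit implementation; the exponent parameter `alpha` and the keep-if-either-endpoint-retains rule are assumptions.

```python
import math

def local_degree_sparsify(adj, alpha=0.5):
    """Keep, for each node v, edges to its ceil(deg(v)**alpha) highest-degree
    neighbours; an edge survives if either endpoint retains it.

    adj: dict node -> set of neighbours (undirected graph).
    """
    deg = {v: len(ns) for v, ns in adj.items()}
    kept = set()
    for v, ns in adj.items():
        k = math.ceil(deg[v] ** alpha)
        top = sorted(ns, key=lambda u: deg[u], reverse=True)[:k]
        kept.update(frozenset((v, u)) for u in top)
    return kept

# Nodes 1 and 3 act as hubs in this toy graph; edges into them are favoured.
g = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2, 4}, 4: {1, 3}}
print(sorted(tuple(sorted(e)) for e in local_degree_sparsify(g)))
```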
Conference Paper
Full-text available
Combinatorial optimisation problems often contain uncertainty that has to be taken into account to produce realistic solutions. However, existing modelling systems either do not support uncertainty, or do not support combinatorial features, such as integer variables and non-linear constraints. This paper presents an extension of the MiniZinc modelling language that supports uncertainty. Stochastic MiniZinc enables modellers to express combinatorial stochastic problems at a high level of abstraction, independent of the stochastic solving approach. These models are translated automatically into different solver-level representations. Stochastic MiniZinc provides the first solving technology agnostic approach to stochastic modelling we are aware of.
Article
Full-text available
Probabilistic logic programs are logic programs in which some of the facts are annotated with probabilities. This paper investigates how classical inference and learning tasks known from the graphical model community can be tackled for probabilistic logic programs. Several of these tasks, such as computing marginals given evidence and learning from (partial) interpretations, had not previously been addressed for probabilistic logic programs. The first contribution of this paper is a suite of efficient algorithms for various inference tasks. It is based on a conversion of the program and the queries and evidence to a weighted Boolean formula. This allows us to reduce the inference tasks to well-studied tasks such as weighted model counting, which can be solved using state-of-the-art methods known from the graphical model and knowledge compilation literature. The second contribution is an algorithm for parameter estimation in the learning from interpretations setting. The algorithm employs Expectation Maximization and is built on top of the developed inference algorithms. The proposed approach is experimentally evaluated. The results show that the inference algorithms improve upon the state of the art in probabilistic logic programming and that it is indeed possible to learn the parameters of a probabilistic logic program from interpretations.
Article
Full-text available
Diffusion is a fundamental graph process, underpinning such phenomena as epidemic disease contagion and the spread of innovation by word-of-mouth. We address the algorithmic problem of finding a set of k initial seed nodes in a network so that the expected size of the resulting cascade is maximized, under the standard independent cascade model of network diffusion. Runtime is a primary consideration for this problem due to the massive size of the relevant input networks. We provide a fast algorithm for the influence maximization problem, obtaining the near-optimal approximation factor of (1 - 1/e - epsilon), for any epsilon > 0, in time O((m+n)log(n) / epsilon^3). Our algorithm is runtime-optimal (up to a logarithmic factor) and substantially improves upon the previously best-known algorithms which run in time Omega(mnk POLY(1/epsilon)). Furthermore, our algorithm can be modified to allow early termination: if it is terminated after O(beta(m+n)log(n)) steps for some beta < 1 (which can depend on n), then it returns a solution with approximation factor O(beta). Finally, we show that this runtime is optimal (up to logarithmic factors) for any beta and fixed seed size k.
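For contrast with the near-linear-time algorithm above, the classical baseline it improves on is Monte-Carlo greedy selection, sketched here under the independent cascade model. The edge probabilities, simulation count and toy graph are illustrative assumptions, and this naive version has exactly the Omega(mnk POLY(1/epsilon)) flavour of cost the paper avoids.

```python
import random

def simulate_cascade(adj_p, seeds, rng):
    """One independent-cascade run; adj_p: node -> list of (neighbour, p)."""
    active, frontier = set(seeds), list(seeds)
    while frontier:
        nxt = []
        for v in frontier:
            for u, p in adj_p.get(v, []):
                if u not in active and rng.random() < p:
                    active.add(u)
                    nxt.append(u)
        frontier = nxt
    return len(active)

def greedy_influence_max(adj_p, k, sims=200, seed=0):
    """Greedy seed selection: repeatedly add the node with the best
    estimated marginal gain in expected cascade size."""
    rng = random.Random(seed)
    seeds = set()
    for _ in range(k):
        def gain(v):
            return sum(simulate_cascade(adj_p, seeds | {v}, rng)
                       for _ in range(sims)) / sims
        best = max((v for v in adj_p if v not in seeds), key=gain)
        seeds.add(best)
    return seeds

toy = {1: [(2, 0.5), (3, 0.5)], 2: [(4, 0.5)], 3: [(4, 0.5)], 4: []}
print(greedy_influence_max(toy, k=1))  # {1}: it reaches the most nodes
```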
Article
Full-text available
This paper studies the resolution of (augmented) weighted matching problems within a constraint programming (CP) framework. The first contribution of the paper is a set of techniques that substantially improves the performance of branch-and-bound algorithms based on constraint propagation, and the second contribution is the introduction of weighted matching as a global constraint (WeightedMatching) that can be propagated using specialized incremental algorithms from Operations Research. We first compare programming techniques that use constraint propagation with specialized algorithms from Operations Research, such as the Busaker and Gowen flow algorithm or the Hungarian method. Although CP is shown not to be competitive with specialized polynomial algorithms for pure matching problems, the situation is different as soon as the problems are modified with additional constraints. Using the previously mentioned set of techniques, a simpler branch-and-bound algorithm based on constraint propagation can outperform a complex specialized algorithm. These techniques have been applied with success to the Traveling Salesman Problem [5], which can be seen as an augmented matching problem. We also show that an incremental version of the Hungarian method can be used to propagate a WeightedMatching constraint. This is an extension to the weighted case of the work of Régin [19], which we show to bring significant improvements on a timetabling example.
Article
Full-text available
In this work we present Cutting Plane Inference (CPI), a Maximum A Posteriori (MAP) inference method for Statistical Relational Learning. Framed in terms of Markov Logic and inspired by the Cutting Plane Method, it can be seen as a meta algorithm that instantiates small parts of a large and complex Markov Network and then solves these using a conventional MAP method. We evaluate CPI on two tasks, Semantic Role Labelling and Joint Entity Resolution, while plugging in two different MAP inference methods: the current method of choice for MAP inference in Markov Logic, MaxWalkSAT, and Integer Linear Programming. We observe that when used with CPI both methods are significantly faster than when used alone. In addition, CPI improves the accuracy of MaxWalkSAT and maintains the exactness of Integer Linear Programming.
Article
Full-text available
Real-life management decisions are usually made in uncertain environments, and decision support systems that ignore this uncertainty are unlikely to provide realistic guidance. We show that previous approaches fail to provide appropriate support for reasoning about reliability under uncertainty. We propose a new framework that addresses this issue by allowing logical dependencies between constraints. Reliability is then defined in terms of key constraints called “events”, which are related to other constraints via these dependencies. We illustrate our approach on three problems, contrast it with existing frameworks, and discuss future developments.
Conference Paper
Full-text available
We introduce in this paper two new, complete propositional languages and study their properties in terms of (1) their support for polytime operations and (2) their ability to represent Boolean functions compactly. The new languages are based on a structured version of decomposability, a property that underlies a number of tractable languages. The key characteristic of structured decomposability is its support for a polytime conjoin operation, which is known to be intractable for unstructured decomposability. We show that any CNF can be compiled into formulas in the new languages, whose size is only exponential in the treewidth of the CNF. Our study also reveals that one of the languages we identify is as powerful as OBDDs in terms of answering key inference queries, yet is more succinct than OBDDs.
Conference Paper
Full-text available
State-of-the-art solvers for mixed integer programming (MIP) problems are highly parameterized, and finding parameter settings that achieve high performance for specific types of MIP instances is challenging. We study the application of an automated algorithm configuration procedure to different MIP solvers, instance types and optimization objectives. We show that this fully automated process yields substantial improvements to the performance of three MIP solvers: CPLEX, GUROBI, and LPSOLVE. Although our method can be used "out of the box" without any domain knowledge specific to MIP, we show that it outperforms the CPLEX special-purpose automated tuning tool.
Conference Paper
Full-text available
The semiring-based formalism for modelling soft constraints was introduced in 1995 by Ugo Montanari and the authors of this paper. The idea was to make constraint programming more flexible and widely applicable. We also wanted to define the extension via a general formalism, so that all its instances could inherit its properties and be easily compared. Since then, much work has been done to study, extend, and apply this formalism. This paper gives a brief summary of some of these research activities.
Conference Paper
Full-text available
Online social networks have become extremely popular; numerous sites allow users to interact and share content using social links. Users of these networks often establish hundreds to even thousands of social links with other users. Recently, researchers have suggested examining the activity network, a network based on the actual interaction between users rather than mere friendship, to distinguish between strong and weak links. While initial studies have led to insights on how an activity network is structurally different from the social network itself, a natural and important aspect of the activity network has been disregarded: the fact that over time social links can grow stronger or weaker. In this paper, we study the evolution of activity between users in the Facebook social network to capture this notion. We find that links in the activity network tend to come and go rapidly over time, and the strength of ties exhibits a general decreasing trend of activity as the social network link ages. For example, only 30% of Facebook user pairs interact consistently from one month to the next. Interestingly, we also find that even though the links of the activity network change rapidly over time, many graph-theoretic properties of the activity network remain unchanged.
Conference Paper
Full-text available
The problem of finding the best explanation for a set of observations is studied within various disciplines of artificial intelligence. For a probabilistic network, finding the best explanation amounts to finding a value assignment to all the variables in the network that has highest posterior probability given the available observations. This problem is known as the MPA, or maximum probability assignment, problem. In this paper, we establish the computational complexity of the MPA problem and of various closely related problems, building upon a transformation from the 3-SATISFIABILITY problem; earlier hardness results used a transformation from the VERTEX COVER problem [3], which A. Abdelbar and S. Hedetniemi extended to approximation of the MPA problem [4]. Among other results, we show that the threshold version of the MPA problem, in which an assignment with probability at least a given threshold is to be found, is NP-hard, while its fixed-parameter variant, where the threshold is a fixed rational number, is solvable in linear time.
Conference Paper
Full-text available
We present a new algorithm for computing upper bounds for an optimization version of the E-MAJSAT problem called functional E-MAJSAT. The algorithm utilizes the compilation language d-DNNF, which underlies several state-of-the-art algorithms for solving related problems. This bound computation can be used in a branch-and-bound solver for solving functional E-MAJSAT. We then present a technique for pruning values from the branch-and-bound search tree based on the information available after each bound computation. We evaluated the proposed techniques in a MAP solver and a probabilistic conformant planner. In both cases, our experiments showed that the new techniques improved the efficiency of state-of-the-art solvers by orders of magnitude.
Article
Modern society is increasingly reliant on the functionality of infrastructure facilities and utility services. Consequently, there has been a surge of interest in the problem of quantifying system reliability, which is known to be #P-complete. Reliability also contributes to the resilience of systems, enabling them to effectively bounce back after contingencies. Despite diverse progress, most techniques to estimate system reliability and resilience remain computationally expensive. In this paper, we investigate how recent advances in hashing-based approaches to counting can be exploited to improve computational techniques for system reliability. The primary contribution of this paper is a novel framework, RelNet, that reduces the problem of computing reliability for a given network to counting the number of satisfying assignments of a Σ₁¹ formula, which is amenable to recent hashing-based techniques developed for counting satisfying assignments of SAT formulas. We then apply RelNet to ten real-world power-transmission grids across different cities in the U.S. and are able to obtain, to the best of our knowledge, the first theoretically sound a priori estimates of reliability between several pairs of nodes of interest. Such estimates will help manage uncertainty and support rational decision making for community resilience.
Article
Maintaining landscape connectivity is increasingly important in wildlife conservation, especially for species experiencing the effects of habitat loss and fragmentation. We propose a novel approach to dynamically optimize landscape connectivity. Our approach is based on a mixed integer program formulation, embedding a spatial capture-recapture model that estimates the density, space usage, and landscape connectivity for a given species. Our method takes into account the fact that local animal density and connectivity change dynamically and non-linearly with different habitat protection plans. In order to scale up our encoding, we propose a sampling scheme via random partitioning of the search space using parity functions. We show that our method scales to real-world size problems and dramatically outperforms the solution quality of an expectation maximization approach and a sample average approximation approach.
Article
Stochastic programming is concerned with decision making under uncertainty, seeking an optimal policy with respect to a set of possible future scenarios. This paper looks at multistage decision problems where the uncertainty is revealed over time. First, decisions are made with respect to all possible future scenarios. Second, after observing the random variables, a set of scenario-specific decisions is taken. Our goal is to develop algorithms that can be used as a back-end solver for high-level modeling languages. In this paper we propose a scenario decomposition method to solve multistage stochastic combinatorial decision problems recursively. Our approach is applicable to general problem structures, utilizes standard solving technology and is highly parallelizable. We provide experimental results to show how it efficiently solves benchmarks with hundreds of scenarios.
Conference Paper
A number of data mining problems on probabilistic networks can be modeled as Stochastic Constraint Optimization and Satisfaction Problems, i.e., problems that involve objectives or constraints with a stochastic component. Earlier methods for solving these problems used Ordered Binary Decision Diagrams (OBDDs) to represent constraints on probability distributions, which were decomposed into sets of smaller constraints and solved by Constraint Programming (CP) or Mixed Integer Programming (MIP) solvers. For the specific case of monotonic distributions, we propose an alternative method: a new propagator for a global OBDD-based constraint. We show that this propagator is (sub-)linear in the size of the OBDD, and maintains domain consistency. We experimentally evaluate the effectiveness of this global constraint in comparison to existing decomposition-based approaches, and show how this propagator can be used in combination with another data mining specific constraint present in CP systems. As test cases we use problems from the data mining literature.
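The heart of such a propagator can be pictured as a bound computation over the same toy `Node`/`TRUE`/`FALSE` encoding used in the earlier OBDD sketch: monotonicity means the best reachable probability is obtained by steering every undecided decision variable toward its high branch, so one memoized pass yields an upper bound usable for pruning. This is an illustrative reconstruction, not the published (sub-)linear, domain-consistent algorithm.

```python
def upper_bound(node, domains, prob, memo=None):
    """Optimistic probability bound for a monotonic OBDD constraint.

    domains: dict decision_var -> set of still-possible values
    ({0}, {1} or {0, 1}); prob: dict stochastic_var -> P(var=1).
    """
    memo = {} if memo is None else memo
    if node is TRUE:
        return 1.0
    if node is FALSE:
        return 0.0
    if id(node) in memo:
        return memo[id(node)]
    if node.var in domains:              # decision variable: best open branch
        res = max(upper_bound(node.high if v else node.low, domains, prob, memo)
                  for v in domains[node.var])
    else:                                # stochastic variable: weighted sum
        p = prob[node.var]
        res = (p * upper_bound(node.high, domains, prob, memo)
               + (1 - p) * upper_bound(node.low, domains, prob, memo))
    memo[id(node)] = res
    return res
```

Filtering then follows the usual CP pattern: if fixing a decision variable to some value drops this bound below the constraint's threshold, that value can be removed from the variable's domain.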
Conference Paper
Multi-Valued Decision Diagrams (MDDs) are instrumental in modeling combinatorial problems with Constraint Programming. In this paper, we propose a related data structure called sMDD (semi-MDD), where the central layer of the diagram is non-deterministic. We show that it is easy and efficient to transform any table (set of tuples) into an sMDD. We also introduce a new filtering algorithm, called Compact-MDD, which is based on bitwise operations and can be applied to both MDDs and sMDDs. Our experimental results show the practical interest of our approach, both in terms of compression and filtering speed.
Conference Paper
We propose to combine two successful techniques of Artificial Intelligence: sampling and Multi-valued Decision Diagrams (MDDs). Sampling, and notably Markov sampling, is often used to generate data resembling a corpus. However, this generation usually has to respect additional constraints, for instance to avoid plagiarism or to respect rules of the application domain. We propose to represent the corpus dependencies and these side constraints by an MDD, and to develop algorithms for sampling the solutions of an MDD while respecting given probabilities or a Markov chain. In that way, we obtain a generic method that avoids the development of ad-hoc algorithms for each application, as is currently the case. In addition, we introduce new constraints for controlling the probabilities of the solutions that are sampled. We evaluate our method on a real-life application, the geomodeling of a petroleum reservoir, and on the generation of French alexandrines. The results show the advantage and efficiency of our approach.
Conference Paper
Constraint Programming is becoming competitive for solving certain data-mining problems, largely due to the development of global constraints. We introduce the CoverSize constraint for itemset mining problems, a global constraint for counting and constraining the number of transactions covered by the itemset decision variables. We show the relation of this constraint to the well-known table constraint, and our filtering algorithm internally uses the reversible sparse bitset data structure recently proposed for filtering the table constraint. Furthermore, we expose the size of the cover as a variable, which opens up new modelling perspectives compared to an existing global constraint for (closed) frequent itemset mining. For example, one can constrain minimum frequency or compare the frequency of an itemset in different datasets, as is done in discriminative itemset mining. We demonstrate experimentally on the frequent, closed and discriminative itemset mining problems that the CoverSize constraint with reversible sparse bitsets allows us to outperform other CP approaches.
Conference Paper
We show that a number of problems in Artificial Intelligence can be seen as Stochastic Constraint Optimization Problems (SCOPs): problems that have both a stochastic and a constraint optimization component. We argue that these problems can be modeled in a new language, SC-ProbLog, that combines a generic Probabilistic Logic Programming (PLP) language, ProbLog, with stochastic constraint optimization. We propose a toolchain for effectively solving these SC-ProbLog programs, which consists of two stages. In the first stage, decision diagrams are compiled for the underlying distributions. These diagrams are converted into models that are solved using Mixed Integer Programming or Constraint Programming solvers in the second stage. We show that, to yield linear constraints, decision diagrams need to be compiled in a specific form. We introduce a new method for compiling small Sentential Decision Diagrams in this form. We evaluate the effectiveness of several variations of this toolchain on test cases in viral marketing and bioinformatics.
Article
Recent work on weighted model counting has been very successfully applied to the problem of probabilistic inference in Bayesian networks. The probability distribution is encoded into a Boolean normal form and compiled to a target language, in order to represent local structure expressed among conditional probabilities more efficiently. We show that further improvements are possible, by exploiting the knowledge that is lost during the encoding phase and incorporating it into a compiler inspired by Satisfiability Modulo Theories. Constraints among variables are used as a background theory, which allows us to optimize the Shannon decomposition. We propose a new language, called Weighted Positive Binary Decision Diagrams, that reduces the cost of probabilistic inference by using this decomposition variant to induce an arithmetic circuit of reduced size.
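For reference, the classical Shannon decomposition referred to above, and the probabilistic reading it induces when the decomposition variable is independent of its cofactors (notation assumed):

```latex
f = (x \wedge f|_{x=1}) \vee (\neg x \wedge f|_{x=0}),
\qquad
\Pr(f) = p_x \,\Pr(f|_{x=1}) + (1 - p_x)\,\Pr(f|_{x=0})
```

The second identity is what turns a decision diagram into an arithmetic circuit: each internal node becomes a single multiply-add over the probabilities of its children.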
Article
Introduced by Darwiche (2011), sentential decision diagrams (SDDs) are essentially as tractable as ordered binary decision diagrams (OBDDs), but tend to be more succinct in practice. This makes SDDs a prominent representation language, with many applications in artificial intelligence and knowledge compilation. We prove that SDDs are more succinct than OBDDs also in theory, by constructing a family of boolean functions where each member has polynomial SDD size but exponential OBDD size. This exponential separation improves a quasipolynomial separation recently established by Razgon (2013), and settles an open problem in knowledge compilation.
Article
The Sentential Decision Diagram (SDD) is a recently proposed representation of Boolean functions, containing Ordered Binary Decision Diagrams (OBDDs) as a distinguished subclass. While OBDDs are characterized by total variable orders, SDDs are characterized more generally by vtrees. As both OBDDs and SDDs have canonical representations, searching for OBDDs and SDDs of minimal size simplifies to searching for variable orders and vtrees, respectively. For OBDDs, there are effective heuristics for dynamic reordering, based on locally swapping variables. In this paper, we propose an analogous approach for SDDs which navigates the space of vtrees via two operations: one based on tree rotations and a second based on swapping children in a vtree. We propose a particular heuristic for dynamically searching the space of vtrees, showing that it can find SDDs that are an order-of-magnitude more succinct than OBDDs found by dynamic reordering.
Article
A data structure is presented for representing Boolean functions and an associated set of manipulation algorithms. Functions are represented by directed, acyclic graphs in a manner similar to the representations introduced by C. Y. Lee (1959) and S. B. Akers (1978), but with further restrictions on the ordering of decision variables in the graph. Although, in the worst case, a function requires a graph where the number of vertices grows exponentially with the number of arguments, many of the functions encountered in typical applications have a more reasonable representation. The algorithms have time complexity proportional to the sizes of the graphs being operated on, and hence are quite efficient as long as the graphs do not grow too large. Experimental results are presented from applying these algorithms to problems in logic design verification that demonstrate the practicality of the approach.
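The two reduction rules that make these graphs canonical, merging isomorphic subgraphs and skipping nodes whose branches agree, are commonly implemented with a unique table, as in this minimal sketch. The `mk` interface and integer node ids are assumptions, and the manipulation (`apply`-style) algorithms the paper describes are omitted.

```python
class BDD:
    """Reduced OBDD skeleton: hash-consed nodes over terminals 0 and 1."""
    def __init__(self):
        self.unique = {}                 # (var, low, high) -> node id
        self.nodes = {0: None, 1: None}  # terminals

    def mk(self, var, low, high):
        """Create or reuse a node, enforcing both reduction rules."""
        if low == high:                  # redundant test: skip the node
            return low
        key = (var, low, high)
        if key not in self.unique:       # share isomorphic subgraphs
            nid = len(self.nodes)
            self.nodes[nid] = key
            self.unique[key] = nid
        return self.unique[key]

b = BDD()
x = b.mk("x", 0, 1)
assert b.mk("y", 0, x) == b.mk("y", 0, x)  # structural sharing
assert b.mk("z", x, x) == x                # redundant node eliminated
```

Hash-consing is what keeps equality checks constant-time and graph sizes proportional to the number of distinct subfunctions, which underlies the efficiency results reported above.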
Article
We present a new approach for inference in Bayesian networks, which is mainly based on partial differentiation. According to this approach, one compiles a Bayesian network into a multivariate polynomial and then computes the partial derivatives of this polynomial with respect to each variable. We show that once such derivatives are made available, one can compute in constant-time answers to a large class of probabilistic queries, which are central to classical inference, parameter estimation, model validation and sensitivity analysis. We present a number of complexity results relating to the compilation of such polynomials and to the computation of their partial derivatives. We argue that the combined simplicity, comprehensiveness and computational complexity of the presented framework is unique among existing frameworks for inference in Bayesian networks.
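In the notation of that framework (evidence indicators λ and network parameters θ, with the sum ranging over complete variable instantiations; notation assumed here), the compiled polynomial and the derivative-to-query identity take the form:

```latex
f = \sum_{\mathbf{x}} \prod_{x,\mathbf{u} \sim \mathbf{x}} \lambda_{x}\,\theta_{x \mid \mathbf{u}},
\qquad
\frac{\partial f}{\partial \lambda_{x}}(\mathbf{e}) = \Pr(x, \mathbf{e} \setminus X)
```

Evaluating f at evidence e gives Pr(e), and one differentiation pass yields all such derivatives at once, which is why, once they are available, the queries described in the abstract cost constant time each.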
Article
MAP is the problem of finding a most probable instantiation of a set of variables in a Bayesian network given some evidence. Unlike computing posterior probabilities, or MPE (a special case of MAP), the time and space complexity of structural solutions for MAP are not only exponential in the network treewidth, but in a larger parameter known as the "constrained" treewidth. In practice, this means that computing MAP can be orders of magnitude more expensive than computing posterior probabilities or MPE. This paper introduces a new, simple upper bound on the probability of a MAP solution, which admits a tradeoff between the bound quality and the time needed to compute it. The bound is shown to be generally much tighter than those of other methods of comparable complexity. We use this proposed upper bound to develop a branch-and-bound search algorithm for solving MAP exactly. Experimental results demonstrate that the search algorithm is able to solve many problems that are far beyond the reach of any structure-based method for MAP. For example, we show that the proposed algorithm can compute MAP exactly and efficiently for some networks whose constrained treewidth is more than 40.
Article
A new conceptual and analytical vehicle for problems of temporal planning under uncertainty, involving determination of optimal (sequential) stochastic decision rules is defined and illustrated by means of a typical industrial example. The paper presents a method of attack which splits the problem into two non-linear (or linear) programming parts, (i) determining optimal probability distributions, (ii) approximating the optimal distributions as closely as possible by decision rules of prescribed form.
Article
A modelling language for Integer Programming (IP) based on the Predicate Calculus is described. This is particularly suitable for building models with logical conditions. Using this language a model is specified in terms of predicates. This is then converted automatically by a series of transformation rules into a normal form from which an IP model can be created. There is also some discussion of alternative IP formulations which can be incorporated into the system as options. Further practical considerations are discussed briefly concerning implementation language and incorporation into practical Mathematical Programming Systems.
Article
We propose an original, complete and efficient approach to the allocation and scheduling of Conditional Task Graphs (CTGs). In CTGs, nodes represent activities; some of them are branches and are labeled with a condition, and arcs rooted in branch nodes are labeled with condition outcomes and a corresponding probability. A task is executed at run time if the condition outcomes that label the arcs on the path to the task hold at schedule execution time; this can be captured off-line by adopting a stochastic model. Tasks require either unary or cumulative resources for their execution, and some tasks can be executed on alternative resources. The solution to the problem is a single assignment of a resource and a start time to each task, so that the allocation and schedule are feasible in each scenario and the expected value of a given objective function is optimized. For this problem we need to extend traditional constraint-based scheduling techniques in two directions: (i) compute the probability of sets of scenarios in polynomial time, in order to get the expected value of the objective function; (ii) define conditional constraints that ensure feasibility in all scenarios. We show the application of this framework on problems with objective functions depending either on the allocation of resources to tasks or on the scheduling part. Also, we present the conditional extension of the timetable global constraint. Experimental results show the effectiveness of the approach on a set of benchmarks taken from the field of embedded system design. Comparing our solver with a scenario-based solver proposed in the literature, we show the advantages of our approach both in terms of execution time and solution quality.
Article
Many AI problems, when formalized, reduce to evaluating the probability that a propositional expression is true. In this paper we show that this problem is computationally intractable even in surprisingly restricted cases and even if we settle for an approximation to this probability. We consider various methods used in approximate reasoning, such as computing degree of belief and Bayesian belief networks, as well as reasoning techniques such as constraint satisfaction and knowledge compilation that use approximation to avoid computational difficulties, and reduce them to model-counting problems over a propositional domain. We prove that counting satisfying assignments of propositional languages is intractable even for Horn and monotone formulae, and even when the size of clauses and the number of occurrences of the variables are extremely limited. This should be contrasted with the case of deductive reasoning, where Horn theories and theories with binary clauses are distinguished by the existence of linear-time satisfiability algorithms. What is even more surprising is that, as we show, even approximating the number of satisfying assignments (i.e., "approximating" approximate reasoning) is intractable for most of these restricted theories. We also identify some restricted classes of propositional formulae for which efficient algorithms for counting satisfying assignments can be given.
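To make concrete what is being proven hard here, this is the counting problem in its most naive form, a Θ(2^n) enumeration that the intractability results above say cannot be brought down to polynomial time even for monotone or Horn inputs. The DIMACS-style literal encoding is an assumption made for the sketch.

```python
from itertools import product

def count_models(clauses, n_vars):
    """Brute-force #SAT: count assignments satisfying a CNF.

    clauses: list of clauses, each a list of non-zero ints where
    i means variable i is true and -i means variable i is false.
    """
    return sum(
        all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses)
        for bits in product([False, True], repeat=n_vars)
    )

# Monotone 2-CNF (x1 or x2) and (x2 or x3): 5 of the 8 assignments satisfy it.
print(count_models([[1, 2], [2, 3]], 3))  # 5
```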
Article
We present a new characterization of PSPACE in terms of problems in a classical area of optimization, decision-making under uncertainty. These problems are modeled by certain games played against a disinterested opponent who makes moves at random. We show several natural problems of this sort to be PSPACE-complete.
Article
A recent and effective approach to probabilistic inference calls for reducing the problem to one of weighted model counting (WMC) on a propositional knowledge base. Specifically, the approach calls for encoding the probabilistic model, typically a Bayesian network, as a propositional knowledge base in conjunctive normal form (CNF), with weights associated to each model according to the network parameters. Given this CNF, computing the probability of some evidence becomes a matter of summing the weights of all CNF models consistent with the evidence. A number of variations on this approach have appeared in the literature recently, that vary across three orthogonal dimensions. The first dimension concerns the specific encoding used to convert a Bayesian network into a CNF. The second dimension relates to whether weighted model counting is performed using a search algorithm on the CNF, or by compiling the CNF into a structure that renders WMC a polytime operation in the size of the compiled structure. The third dimension deals with the specific properties of network parameters (local structure) which are captured in the CNF encoding. In this paper, we discuss recent work in this area across the above three dimensions, and demonstrate empirically its practical importance in significantly expanding the reach of exact probabilistic inference. We restrict our discussion to exact inference and model counting, even though other proposals have been extended for approximate inference and approximate model counting.
Conference Paper
The MAP (maximum a posteriori hypothesis) problem in Bayesian networks is to find the most likely states of a set of variables given partial evidence on the complement of that set. Standard structure-based inference methods for finding exact solutions to MAP, such as variable elimination and join-tree algorithms, have complexities that are exponential in the constrained treewidth of the network. A more recent algorithm, proposed by Park and Darwiche, is exponential only in the treewidth and has been shown to handle networks whose constrained treewidth is quite high. In this paper we present a new algorithm for exact MAP that is not necessarily limited in scalability even by the treewidth. This is achieved by leveraging recent advances in compilation of Bayesian networks into arithmetic circuits, which can circumvent treewidth-imposed limits by exploiting the local structure present in the network. Specifically, we implement a branch-and-bound search where the bounds are computed using linear-time operations on the compiled arithmetic circuit. On networks with local structure, we observe orders-of-magnitude improvements over the algorithm of Park and Darwiche. In particular, we are able to efficiently solve many problems where the latter algorithm runs out of memory because of high treewidth.
Conference Paper
Over the past decade, general satisfiability testing algorithms have proven to be surprisingly effective at solving a wide variety of constraint satisfaction problems, such as planning and scheduling (Kautz and Selman 2003). Solving such NP-complete tasks by "compilation to SAT" has turned out to be an approach that is of both practical and theoretical interest. Recently, Sang et al. (2004) have shown that state-of-the-art SAT algorithms can be efficiently extended to the harder task of counting the number of models (satisfying assignments) of a formula, by employing a technique called component caching. This paper begins to investigate the question of whether "compilation to model counting" could be a practical technique for solving real-world #P-complete problems, in particular Bayesian inference. We describe an efficient translation from Bayesian networks to weighted model counting, extend the best model-counting algorithms to weighted model counting, develop an efficient method for computing all marginals in a single counting pass, and evaluate the approach on computationally challenging reasoning problems.
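The reduction bottoms out in weighted model counting: attach a weight to each literal and sum the weight products of satisfying assignments. The enumerative sketch below (same clause encoding as the counting sketch earlier; the weights and one-variable example are assumptions) shows the quantity being computed; the paper's contribution is computing it efficiently via component caching rather than by enumeration.

```python
from itertools import product

def weighted_model_count(clauses, weights, n_vars):
    """Enumerative WMC: sum over satisfying assignments of the product
    of per-literal weights; weights[i] = (w_false, w_true) for var i+1."""
    total = 0.0
    for bits in product([False, True], repeat=n_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            w = 1.0
            for i, b in enumerate(bits):
                w *= weights[i][1 if b else 0]
            total += w
    return total

# Toy network with a single node A, P(A=true) = 0.3, encoded as the
# formula (a) over indicator a with weights (0.7, 0.3): WMC = Pr(A=true).
print(weighted_model_count([[1]], [(0.7, 0.3)], 1))  # 0.3
```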
Conference Paper
In this paper we look at how to sparsify a graph, i.e., how to reduce the edge set while keeping the nodes intact, so as to enable faster graph clustering without sacrificing quality. The main idea behind our approach is to preferentially retain the edges that are likely to be part of the same cluster. We propose to rank edges using a simple similarity-based heuristic that we efficiently compute by comparing the minhash signatures of the nodes incident to the edge. For each node, we select the top few edges to be retained in the sparsified graph. Extensive empirical results on several real networks, using four state-of-the-art graph clustering and community discovery algorithms, reveal that our proposed approach realizes excellent speedups (often in the range 10-50), with little or no deterioration in the quality of the resulting clusters. In fact, for at least two of the four clustering algorithms, our sparsification consistently enables higher clustering accuracies.
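A minimal version of that heuristic can be written directly from the description: estimate the Jaccard similarity of the two endpoints' neighbourhoods by comparing minhash signatures, then rank edges by the estimate. Signature length, salting via Python's built-in `hash`, and global rather than per-node top-k ranking are simplifications assumed here.

```python
import random

def minhash_signatures(adj, num_hashes=64, seed=0):
    """MinHash signature of each node's closed neighbourhood."""
    salts = [random.Random(seed + i).random() for i in range(num_hashes)]
    return {v: [min(hash((s, u)) for u in (ns | {v})) for s in salts]
            for v, ns in adj.items()}

def rank_edges(adj, num_hashes=64):
    """Sort edges by estimated neighbourhood Jaccard similarity (fraction
    of agreeing signature positions); likely intra-cluster edges come
    first and are the ones a sparsifier would retain."""
    sig = minhash_signatures(adj, num_hashes)
    edges = {frozenset((v, u)) for v, ns in adj.items() for u in ns}
    def sim(e):
        a, b = tuple(e)
        return sum(x == y for x, y in zip(sig[a], sig[b])) / num_hashes
    return sorted(edges, key=sim, reverse=True)

# The triangle 1-2-3 should outrank the pendant edge 3-4.
g = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
print([tuple(sorted(e)) for e in rank_edges(g)])
```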
Conference Paper
We present a new algorithm for conformant probabilistic planning, which for a given horizon produces a plan that maximizes the probability of success under quantified uncertainty about the initial state and action effects, and absence of sensory information. Recent work has studied systematic search in the space of all candidate plans as a feasible approach to conformant probabilistic planning, but the algorithms proposed require caching of intermediate computations in such a way that memory is often exhausted quickly except for small planning horizons. On the other hand, planning problems in typical formulations generally have treewidths that do not grow with the horizon, as connections between variables are local to the neighborhood of each time step. These existing planners, however, are unable to directly benefit from the bounded treewidth owing to a constraint on the variable ordering which is necessary for correct computation of the optimal plan. We show that lifting such constraint allows one to obtain a compact compilation of the planning problem, from which an upper bound can be efficiently computed on the value of any partial plan generated during search. Coupled with several optimizations, this results in a depth-first branch-and-bound algorithm which on the tested domains runs an order of magnitude faster than its predecessors, and at the same time is able to solve problems for significantly larger horizons thanks to its minimal memory requirements.
Conference Paper
We introduce ProbLog, a probabilistic extension of Prolog. A ProbLog program defines a distribution over logic programs by specifying for each clause the probability that it belongs to a randomly sampled program, and these probabilities are mutually independent. The semantics of ProbLog is then defined by the success probability of a query, which corresponds to the probability that the query succeeds in a randomly sampled program. The key contribution of this paper is the introduction of an effective solver for computing success probabilities. It essentially combines SLD-resolution with methods for computing the probability of Boolean formulae. Our implementation further employs an approximation algorithm that combines iterative deepening with binary decision diagrams. We report on experiments in the context of discovering links in real biological networks, a demonstration of the practical usefulness of the approach.
Conference Paper
We identify a new representation of propositional knowledge bases, the Sentential Decision Diagram (SDD), which is interesting for a number of reasons. First, it is canonical in the presence of additional properties that resemble reduction rules of OBDDs. Second, SDDs can be combined using any Boolean operator in polytime. Third, CNFs with n variables and treewidth w have canonical SDDs of size O(n·2^w), which is tighter than the bound on OBDDs based on pathwidth. Finally, every OBDD is an SDD. Hence, working with the latter does not preclude the former.
Conference Paper
Finite-domain constraint solvers based on Binary Decision Diagrams (BDDs) are a powerful technique for solving constraint problems over finite set and integer variables represented as Boolean formulæ. Boolean Satisfiability (SAT) solvers are another form of constraint solver that operate on constraints on Boolean variables expressed in clausal form. Modern SAT solvers have highly optimized propagation mechanisms and also incorporate efficient conflict-clause learning algorithms and effective search heuristics based on variable activity, but these techniques have not been widely used in finite-domain solvers. In this paper we show how to construct a hybrid BDD and SAT solver which inherits the advantages of both solvers simultaneously. The hybrid solver makes use of an efficient algorithm for capturing the inferences of a finite-domain constraint solver in clausal form, allowing us to automatically and transparently construct a SAT model of a finite-domain constraint problem. Finally, we present experimental results demonstrating that the hybrid solver can outperform both SAT and finite-domain solvers by a substantial margin.
Article
Binary Decision Diagram (BDD) based set bounds propagation is a powerful approach to solving set-constraint satisfaction problems. However, prior BDD-based techniques incur the significant overhead of constructing and manipulating graphs during search. We present a set-constraint solver which combines BDD-based set-bounds propagators with the learning abilities of a modern SAT solver. Together with a number of improvements beyond the basic algorithm, this solver is highly competitive with existing propagation-based set constraint solvers.
Article
We present a new algorithm for probabilistic planning with no observability. Our algorithm, called Probabilistic-FF, extends the heuristic forward-search machinery of Conformant-FF to problems with probabilistic uncertainty about both the initial state and action effects. Specifically, Probabilistic-FF combines Conformant-FF's techniques with a powerful machinery for weighted model counting in (weighted) CNFs, serving to elegantly define both the search space and the heuristic function. Our evaluation of Probabilistic-FF shows its fine scalability in a range of probabilistic domains, constituting a several-orders-of-magnitude improvement over previous results in this area. We use a problematic case to point out the main open issue to be addressed by further research.
Article
To model combinatorial decision problems involving uncertainty and probability, we introduce stochastic constraint programming. Stochastic constraint programs contain both decision variables (which we can set) and stochastic variables (which follow a probability distribution). They combine the best features of traditional constraint satisfaction, stochastic integer programming, and stochastic satisfiability. We give a semantics for stochastic constraint programs, and present a complete forward checking algorithm. Finally, we discuss a number of extensions of stochastic constraint programming to relax various assumptions, like the independence between stochastic variables, and compare stochastic constraint programming with other approaches to decision making under uncertainty, such as Markov decision problems and influence diagrams.
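In the one-stage setting this framework describes, a stochastic constraint program can be summarised by the following schematic formulation, with symbols chosen here for illustration: decision variables x over domains D, stochastic variables y drawn from a distribution P, an optional utility u, and chance-constraint thresholds θ_j.

```latex
\max_{\mathbf{x} \in D} \ \mathbb{E}_{\mathbf{y} \sim P}\!\left[u(\mathbf{x}, \mathbf{y})\right]
\quad \text{subject to} \quad
\Pr_{\mathbf{y} \sim P}\!\left[c_j(\mathbf{x}, \mathbf{y})\right] \ge \theta_j,
\qquad j = 1, \dots, m
```

Multi-stage programs interleave further decision and observation stages; the complete forward checking algorithm mentioned in the abstract searches over the decision variables while reasoning over the stochastic ones.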