DTProbLog: A Decision-Theoretic Probabilistic Prolog
Guy Van den Broeck and Ingo Thon and Martijn van Otterlo and Luc De Raedt
Department of Computer Science
Katholieke Universiteit Leuven
Celestijnenlaan 200A, B-3001 Heverlee, Belgium
{guy.vandenbroeck, ingo.thon, martijn.vanotterlo, luc.deraedt}@cs.kuleuven.be
Abstract
We introduce DTProbLog, a decision-theoretic extension of Prolog and its probabilistic variant ProbLog. DTProbLog is a simple but expressive probabilistic programming language that allows the modeling of a wide variety of domains, such as viral marketing. In DTProbLog, the utility of a strategy (a particular choice of actions) is defined as the expected reward for its execution in the presence of probabilistic effects. The key contribution of this paper is the introduction of exact, as well as approximate, solvers to compute the optimal strategy for a DTProbLog program and the decision problem it represents, by making use of binary and algebraic decision diagrams. We also report on experimental results that show the effectiveness and the practical usefulness of the approach.
1. Introduction
Artificial intelligence is often viewed as the study of how to act rationally (Russell and Norvig 2003). The problem of acting rationally has been formalized within decision theory using the notion of a decision problem. In this type of problem, one has to choose actions from a set of alternatives, given a utility function. The goal is to select the strategy (set or sequence of actions) that maximizes the utility function. While the field of decision theory has devoted a lot of effort to deal with various forms of knowledge and uncertainty, there are so far only a few approaches that are able to cope with both uncertainty and rich logical or relational representations (see (Poole 1997; Nath and Domingos 2009; Chen and Muggleton 2009)). This is surprising, given the popularity of such representations in the field of statistical relational learning (Getoor and Taskar 2007; De Raedt et al. 2008).
To alleviate this situation, we introduce a novel framework combining ProbLog (De Raedt, Kimmig, and Toivonen 2007; Kimmig et al. 2008), a simple probabilistic Prolog, with elements of decision theory. The resulting probabilistic programming language DTProbLog (Decision-Theoretic ProbLog) is able to elegantly represent decision problems in complex relational and uncertain environments. A DTProbLog program consists of a set of definite clauses (as in Prolog), a set of probabilistic facts (as in ProbLog),
and, in addition, a set of decision facts, specifying which decisions are to be made, and a set of utility attributes, specifying the rewards that can be obtained. Further key contributions of this paper include the introduction of an exact algorithm for computing the optimal strategy as well as a scalable approximation algorithm that can tackle large decision problems. These algorithms adapt the BDD-based inference mechanism of ProbLog. While DTProbLog's representation and spirit are related to those of e.g. (Poole 1997) for ICL, (Chen and Muggleton 2009) for SLPs, and (Nath and Domingos 2009) for MLNs, its inference mechanism is distinct in that it employs state-of-the-art techniques using decision diagrams for computing the optimal strategy exactly; cf. the related work section for a more detailed comparison.

The paper is organized as follows: in Section 2, we introduce DTProbLog and its semantics; Section 3 discusses inference and Section 4 how to find the optimal strategy for a DTProbLog program; Section 5 reports on some experiments and Section 6 describes related work; finally, we conclude in Section 7. We assume familiarity with standard concepts from logic programming (see e.g. Flach (1994)).
2. Decision-Theoretic ProbLog
ProbLog (De Raedt, Kimmig, and Toivonen 2007; Kimmig et al. 2008) is a recent probabilistic extension of Prolog. A ProbLog theory T consists of a set of labeled facts F and a set of definite clauses BK that express the background knowledge. The facts p_i :: f_i in F are annotated with a probability p_i stating that f_iθ is true with probability p_i for all substitutions θ grounding f_i. These random variables are assumed to be mutually independent. A ProbLog theory describes a probability distribution over Prolog programs L = F_L ∪ BK, where F_L ⊆ F_Θ and F_Θ denotes the set of all possible ground instances of facts in F. (Throughout the paper, we shall assume that F_Θ is finite for notational convenience, but see (Sato 1995) for the infinite case.)

    P(L | T) = ∏_{f_i ∈ F_L} p_i · ∏_{f_i ∈ F_Θ \ F_L} (1 − p_i)

The success probability of a query q is then

    P(q | T) = Σ_L P(q | L) · P(L | T)    (1)

where P(q | L) = 1 if there exists a θ such that L ⊨ qθ, and P(q | L) = 0 otherwise.

Observe also that ProbLog defines a probability distribution P_w over possible worlds, that is, Herbrand interpretations. Indeed, each atomic choice, or each F_L ⊆ F_Θ, can be extended into a possible world by computing the least Herbrand model of L = F_L ∪ BK. This possible world is assigned the probability P_w = P(L | T).
In addition to the background knowledge BK and the probabilistic facts F with their probabilities, a DTProbLog program consists of a set of decision facts D and utility attributes U, which we will now define.
Decisions and Strategies
Decision variables are represented by facts, and so, by analogy with the set of probabilistic facts F, we introduce D, the set of decision facts of the form ? :: d. Each such d is an atom and the label ? indicates that d is a decision fact. Note that decision facts can be non-ground.

A strategy σ is then a function D → [0, 1], mapping a decision fact to the probability that the agent assigns to it. Observe that there is one probability assigned to each decision fact. All instances (these are the groundings) of the same decision fact are assigned the same probability, realizing parameter tying at the level of decisions. For a set of decision facts D we denote with σ(D) the set of probabilistic facts obtained by labeling each decision fact (? :: d) ∈ D as σ(d) :: d. Taking into account the decision facts D and the strategy σ, the success probability can be defined as

    P(q | F ∪ σ(D), BK),

which, after the mapping σ(D), is a standard ProbLog query. Abusing notation, we will use σ(DT) to denote the ProbLog program σ(D) ∪ F with the background knowledge BK.
Example 1. As a running example we will use the following problem of dressing for unpredictable weather:

    D  = { ? :: umbrella,  ? :: raincoat }
    F  = { 0.3 :: rainy,   0.5 :: windy }
    BK = broken_umbrella :- umbrella, rainy, windy.
         dry :- rainy, umbrella, not(broken_umbrella).
         dry :- rainy, raincoat.
         dry :- not(rainy).

There are two decisions to be made: whether to bring an umbrella and whether to wear a raincoat. Furthermore, rainy and windy are probabilistic facts. The background knowledge describes when one gets wet and when one breaks the umbrella due to heavy wind. The probability of dry for the strategy {umbrella ↦ 1, raincoat ↦ 1} is 1.0.
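As a quick sanity check on these definitions, the success probability of dry can be worked out by enumerating the worlds over the probabilistic facts rainy and windy once the decisions are fixed. For the strategy {umbrella ↦ 1, raincoat ↦ 0} (whose value reappears in Example 3 below):

    P(dry) = P(¬rainy) + P(rainy, ¬windy) + 0 · P(rainy, windy)
           = 0.7 + 0.3 · 0.5 + 0 = 0.85,

since the umbrella only breaks when it is both rainy and windy. For the strategy {umbrella ↦ 1, raincoat ↦ 1}, dry holds in every world (the raincoat covers all rainy cases), hence P(dry) = 1.0 as stated above.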
Rewards and Expected Utility
The set U consists of utility attributes of the form u_i → r_i, where u_i is a literal and r_i a reward for achieving u_i. The semantics is that whenever the query u_i succeeds, this yields a reward of r_i. Thus, utility attributes play a role analogous to queries in Pro(b)log. The reward is given only once, regardless of how many substitutions it succeeds for.

For a Prolog program L, defining a possible world through its least Herbrand model, we define the utility of u_i to be the reward due to u_i. The utility attributes have to be additive, such that

    Util(L) = Σ_{(u_i → r_i) ∈ U} r_i · P(u_i | L).

Because a ProbLog theory T defines a probability distribution over Prolog programs, this gives a total expected utility of

    Util(T) = Σ_{(u_i → r_i) ∈ U} r_i · P(u_i | T)

where P(u_i | T) is defined in Equation 1.
Example 2. We extend Example 1 with utilities:

    umbrella → −2
    raincoat → −20
    dry → 60
    broken_umbrella → −40

Bringing an umbrella, breaking the umbrella or wearing a raincoat incurs a cost. Staying dry gives a reward.
Given these definitions, the semantics of a DTProbLog theory is defined in terms of the utility of a strategy σ. The expected utility of a single utility attribute a_i = (u_i → r_i) is

    Util(a_i | σ, DT) = r_i · P(u_i | σ(DT))    (2)

and the total utility is

    Util(σ, DT) = Σ_{a_i ∈ U} Util(a_i | σ, DT) = Util(σ(DT)).    (3)

Thus, the total utility is the sum of the utilities of each utility attribute, and the expected utility of a single attribute is proportional to its success probability.
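As a small worked instance of Equations (2) and (3) on the running example, consider the strategy that brings the umbrella and wears the raincoat. Then P(dry) = 1.0, P(broken_umbrella) = P(rainy, windy) = 0.3 · 0.5 = 0.15, and the attributes umbrella and raincoat succeed with probability 1, so

    Util(σ, DT) = 60 · 1.0 + (−2) · 1 + (−20) · 1 + (−40) · 0.15 = 60 − 2 − 20 − 6 = 32,

which is also one of the terminal values of the final decision diagram shown in Figure 3.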
3. Inference
The first problem that we will tackle is how to perform inference in DTProbLog, that is, how to compute the utility Util(σ, DT) of a particular strategy σ in a DTProbLog program DT. This is realized by first computing the success probability of all utility literals u_i occurring in U using the standard ProbLog inference mechanism. The overall utility Util(σ, DT) can then be computed using Eq. 3. Let us therefore sketch how ProbLog answers the queries P(q | σ(DT)). This is realized in two steps; cf. (De Raedt, Kimmig, and Toivonen 2007; Kimmig et al. 2008).

First, all different proofs for the query q are found using SLD-resolution in the ProbLog program σ(DT). The probabilistic facts and decisions that are used in these proofs are gathered in a DNF formula. Each proof relies on the conjunction of probabilistic facts that needs to be true to prove q and, hence, the DNF formula represents the disjunction of these conditions. This reduces the problem of computing the probability of the query q to that of computing the probability of the DNF formula. However, because the conjunctions in the different proofs are not mutually exclusive, one cannot compute the probability of the DNF formula as the sum of the probabilities of the different conjunctions, as this would lead to values larger than one. Therefore, the second step
Algorithm 1 Calculating the probability of a BDD

    function PROB(BDD-node n)
        if n is the 1-terminal then return 1
        if n is the 0-terminal then return 0
        let h and l be the high and low children of n
        return p_n · PROB(h) + (1 − p_n) · PROB(l)
solves this disjoint-sum problem by constructing a Binary Decision Diagram (BDD) (Bryant 1986) that represents the DNF formula. A BDD is an efficient graphical representation for boolean formulas. Example BDDs are shown in Figure 1. The BDD can be seen as a decision tree. To determine whether a boolean variable assignment satisfies the formula represented by the BDD, one starts at the root of the BDD and, depending on the value of the top proposition, takes the dashed/low/false or solid/high/true branch. The procedure is called recursively on the resulting node until a terminal is reached. Because in a BDD each variable occurs only once on a path, it can be used to compute the probability of the DNF formula with Algorithm 1.
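The recursion of Algorithm 1 is small enough to sketch directly in Prolog. The sketch below is only illustrative: it assumes the BDD is encoded as nested node(Var, High, Low) terms with terminals one and zero, and that a hypothetical predicate fact_probability/2 looks up the probability attached to a probabilistic fact (under a fixed strategy, a decision d is simply treated as a fact with probability σ(d)).

    % prob(+Node, -P): probability of the boolean formula represented by a BDD,
    % computed bottom-up as in Algorithm 1.
    prob(one, 1.0).
    prob(zero, 0.0).
    prob(node(Var, High, Low), P) :-
        fact_probability(Var, PVar),   % probability label of the fact at this node
        prob(High, PH),                % probability given Var is true
        prob(Low, PL),                 % probability given Var is false
        P is PVar * PH + (1 - PVar) * PL.

For instance, the BDD for broken_umbrella (the single conjunction rainy ∧ windy ∧ umbrella) could be encoded as node(rainy, node(windy, node(umbrella, one, zero), zero), zero); with σ(umbrella) = 1, prob/2 returns 0.3 · 0.5 = 0.15, as used in Example 3 below.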
Example 3. Using Algorithm 1 and Figure 1a, it is easy to see that for the strategy of bringing an umbrella and not wearing a raincoat, the success probability of staying dry is P(dry | σ(DT)) = 0.7 + 0.3 · 0.5 = 0.85 and that Util(dry | σ, DT) = 60 · 0.85 = 51. Using the BDD for broken_umbrella, we can calculate Util(σ, DT) = 51 + (0.15 · (−40)) + (−2) = 43.
[Figure 1: BDDs for (a) dry and (b) broken_umbrella, over the variables raincoat, rainy, umbrella and windy, with 0- and 1-terminals.]
4. Solving Decision Problems
When faced with a decision problem, one is interested in computing the optimal strategy, that is, according to the maximum expected utility principle, finding σ*:

    σ* = argmax_σ Util(σ, DT).

This strategy is the solution to the decision problem. We will first introduce an exact algorithm for finding deterministic solutions and then outline two ways to approximate the optimal strategy.
Exact Algorithm
Our exact algorithm makes use of Algebraic Decision Diagrams (ADDs) (Bahar et al. 1997) to efficiently represent the utility function Util(σ, DT). ADDs generalize BDDs such that leaves may take on any value and can be used to represent any function from booleans to the reals, {0, 1}^n → R. Operations on ADDs relevant for this paper are the scalar multiplication c · g of an ADD g with the constant c, the addition f ⊕ g of two ADDs, and the if-then-else test ITE(b, f, g). Using these primitive operations we construct:

1. BDD_u(DT), representing DT ⊨ u as a function of the probabilistic and decision facts in DT.
2. ADD_u(σ), representing P(u | σ, DT) as a function of σ.
3. ADD^{util}_u(σ), representing Util(u | σ, DT) as a function of σ.
4. ADD^{util}_{tot}(σ), representing Util(σ, DT) as a function of σ.

Algorithm 2 Finding the exact solution for DT

    function EXACTSOLUTION(Theory DT)
        ADD^{util}_{tot}(σ) ← a 0-terminal
        for each (u → r) ∈ U do
            BDD_u(DT) ← BINARYDD(u)
            ADD_u(σ) ← PROBABILITYDD(BDD_u(DT))
            ADD^{util}_u(σ) ← r · ADD_u(σ)
            ADD^{util}_{tot}(σ) ← ADD^{util}_{tot}(σ) ⊕ ADD^{util}_u(σ)
        let t_max be the terminal node of ADD^{util}_{tot}(σ) with the highest utility
        let p be a path from t_max to the root of ADD^{util}_{tot}(σ)
        return the boolean decisions made on p

    function PROBABILITYDD(BDD-node n)
        if n is the 1-terminal then return a 1-terminal
        if n is the 0-terminal then return a 0-terminal
        let h and l be the high and low children of n
        ADD_h ← PROBABILITYDD(h)
        ADD_l ← PROBABILITYDD(l)
        if n represents a decision d then
            return ITE(d, ADD_h, ADD_l)
        if n represents a fact with probability p_n then
            return (p_n · ADD_h) ⊕ ((1 − p_n) · ADD_l)
These four diagrams map to the steps in the for-loop of Algorithm 2. The first step builds the BDD for the query u_i as described in Section 3. The difference is that the nodes representing decisions get marked as such, instead of getting a 0/1 probability assigned. In the second step, this BDD is transformed into an ADD using the PROBABILITYDD function of Algorithm 2, an adaptation of Algorithm 1. The resulting ADD contains as internal nodes only decision nodes, and the probabilities are propagated into the leaves. The third step scales ADD_u(σ) by the reward for u as in Equation 2.
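To make the second step concrete, the following Prolog sketch mimics PROBABILITYDD under simplifying assumptions: the BDD is given as node(Var, High, Low) terms with terminals one and zero, decision_fact/1 and fact_probability/2 are hypothetical lookup predicates, and the result is kept as an unreduced symbolic term (ite/3 for decision nodes, wsum/3 for weighted sums) rather than as a canonical, shared ADD built with a package such as simpleCUDD. Evaluating that term for a concrete strategy yields P(u | σ(DT)).

    % probability_dd(+BddNode, -Sym): turn a BDD over probabilistic facts and
    % decisions into a symbolic "ADD": decisions remain as ite/3 tests, while
    % probabilistic facts are summed out into weighted sums (wsum/3).
    probability_dd(one, 1.0).
    probability_dd(zero, 0.0).
    probability_dd(node(Var, High, Low), Sym) :-
        probability_dd(High, SymH),
        probability_dd(Low, SymL),
        (   decision_fact(Var)               % decision node: keep the test
        ->  Sym = ite(Var, SymH, SymL)
        ;   fact_probability(Var, P),        % probabilistic fact: sum it out
            Sym = wsum(P, SymH, SymL)
        ).

    % eval_sym(+Sym, +Sigma, -V): evaluate the symbolic diagram for a concrete
    % strategy Sigma, a list of Decision-Value pairs with Value in {0,1}.
    eval_sym(V, _, V) :-
        number(V), !.
    eval_sym(ite(D, H, L), Sigma, V) :-
        memberchk(D-B, Sigma),
        ( B =:= 1 -> eval_sym(H, Sigma, V) ; eval_sym(L, Sigma, V) ).
    eval_sym(wsum(P, H, L), Sigma, V) :-
        eval_sym(H, Sigma, VH),
        eval_sym(L, Sigma, VL),
        V is P * VH + (1 - P) * VL.

In the actual implementation, the same recursion is carried out with the ADD operations ITE, ⊕ and scalar multiplication, so that isomorphic subdiagrams are shared and reduced.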
Example 4. Figure 2 shows ADD_dry(σ) and ADD_{broken_umbrella}(σ), constructed using Algorithm 2 from the BDDs in Figure 1. The former confirms that P(dry | σ(DT)) = 0.85 for the strategy given in Example 3. The transformation to ADD^{util}_u(σ) is done by replacing the terminals by their dashed alternatives.
[Figure 2: (a) ADD_dry(σ) and (b) ADD_{broken_umbrella}(σ), with internal nodes for the decisions raincoat and umbrella and probability terminals 1, 0.85, 0.7 and 0, 0.15 respectively. The alternative, dashed terminals (60, 51, 42 and 0, −6) belong to ADD^{util}_u(σ).]

Finally, in the fourth step, this ADD is added to the global sum ADD^{util}_{tot}(σ) according to Equation 3, modeling the expected utility of the entire DTProbLog theory. From the final ADD, the globally optimal strategy σ* is extracted by following a path from the leaf with the highest value to the root of the ADD. Because ADDs provide a compressed representation and efficient operations, the exact solution algorithm is able to solve more problems in an exact manner than could be done by naively enumerating all possible strategies.
Example 5. Figure 3 shows ADD^{util}_{tot}(σ). It confirms that the expected utility of the strategy from Example 3 is 43. It turns out that this is the optimal strategy. For wearing a raincoat, the increased probability of staying dry does not outweigh the cost. For bringing an umbrella, it does.

[Figure 3: ADD^{util}_{tot}(σ) for Util(σ, DT), with internal nodes for the decisions raincoat and umbrella and terminals 43, 32, 42 and 40.]
Sound Pruning
Algorithm 2 can be further improved by avoiding unnecessary computations. The algorithm not only finds the best strategy, but also represents the utility of all possible strategies. Since we are not interested in those values, they can be removed from the ADDs when they become irrelevant for finding the optimal value. The idea is to keep track of the maximal utility that can be achieved by the utility attributes not yet added to the ADD. While adding further ADDs, all nodes of the intermediate ADD that cannot yield a value higher than the current maximum can be pruned. For this, we define the maximal impact of a utility attribute to be

    Im(u_i) = max(ADD^{util}_{u_i}(σ)) − min(ADD^{util}_{u_i}(σ)),

where max and min are the maximal and minimal terminals. Before adding ADD^{util}_{u_i}(σ) to the intermediate ADD^{util}_{tot}(σ), we merge all terminals from ADD^{util}_{tot}(σ) with a value below

    max(ADD^{util}_{tot}(σ)) − Σ_{j ≥ i} Im(u_j)

by setting their value to minus infinity. These values are so low that even in the best case, they will never yield the maximal value in the final ADD. By sorting the utility attributes by decreasing values of Im(u), even more nodes are removed from the ADD. This improvement still guarantees that an optimal solution is found.
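As a small worked illustration on the running example (the impacts of the umbrella and raincoat attributes are not shown in the figures, but follow directly from the definitions): Im(dry) = 60 − 42 = 18, Im(broken_umbrella) = 0 − (−6) = 6, Im(umbrella) = 0 − (−2) = 2 and Im(raincoat) = 0 − (−20) = 20. Suppose ADD^{util}_{dry}(σ) has been added first; the pruning threshold before adding the remaining attributes is then

    max(ADD^{util}_{tot}(σ)) − (6 + 2 + 20) = 60 − 28 = 32,

so none of the current terminals (42, 51, 60) can be merged in this tiny example. In larger programs, however, many partial strategies fall below the bound and are pruned.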
In the following sections, we describe two further improvements which do not have this guarantee, but are much faster. The two improvements can be used together or independently.
Local Search
Solving a DTProbLog program is essentially a function optimization problem for Util(σ, DT) and can be formalized as a search problem in the strategy space. We apply a standard greedy hill-climbing algorithm that searches for a locally optimal strategy. This way, we avoid the construction of the ADDs in steps 2-4 of Algorithm 2. The search starts with a random strategy and iterates repeatedly over the decisions. It tries to flip a decision, forming σ'. If Util(σ', DT) improves on the previous utility, σ' is kept as the current best strategy. The utility value can be computed using Equations (3) and (2). To efficiently calculate Util(σ', DT), we use the BDDs generated by the BINARYDD function of Algorithm 2. During the search, the BDDs can be kept fixed. Only the probability values for those BDDs that are affected by the changed decision have to be updated.
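The following Prolog sketch illustrates the hill-climbing loop under simplifying assumptions: strategies are lists of Decision-Value pairs with values in {0, 1}, and a hypothetical predicate utility/2 evaluates Util(σ, DT) by re-scoring the cached BDDs of the utility attributes (for instance with the eval_sym/3 helper sketched earlier).

    % local_search(+Sigma0, -Sigma): greedy hill climbing over deterministic
    % strategies; stops when no single flip improves the utility.
    local_search(Sigma0, Sigma) :-
        utility(Sigma0, U0),
        (   select(D-V0, Sigma0, Rest),       % pick one decision to flip
            V is 1 - V0,
            Sigma1 = [D-V|Rest],
            utility(Sigma1, U1),
            U1 > U0                           % first improving flip found
        ->  local_search(Sigma1, Sigma)       % continue from the better strategy
        ;   Sigma = Sigma0                    % local optimum reached
        ).

In the experiments of Section 5, the search is started from the all-zero strategy rather than from a random one, for reproducibility.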
Approximative Utility Evaluation
The second optimization is concerned with the first step of Algorithm 2, which finds all proofs for the utility attributes. In large decision problems this quickly becomes intractable. A number of approximative inference methods exist for ProbLog (Kimmig et al. 2008), among which is the k-best approximation. The idea behind it is that, while there are many proofs, only a few contribute significantly to the total probability. It incorporates only those k proofs where the product of the random variables that make up the proof is highest, computing a lower bound on the success probability of the query. The required proofs are found using a branch-and-bound algorithm. Similarly, we use the k-best proofs for the utility attributes to build the BDDs and ADDs in the strategy solution algorithms. This reduces the runtime and the complexity of the diagrams. For sufficiently high values of k, the solution strategy found will be optimal.
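A simplified sketch of the k-best selection is given below. It only scores proofs that have already been enumerated, whereas the actual ProbLog implementation interleaves proof search and selection using branch and bound; each proof is assumed to be represented as the list of probabilities of the probabilistic facts it uses.

    % k_best(+K, +Proofs, -Best): keep the K proofs with the highest probability
    % (the product of the probabilities of the facts used in the proof).
    % Best is a list of Probability-Proof pairs, most probable first.
    k_best(K, Proofs, Best) :-
        findall(P-Proof,
                ( member(Proof, Proofs), proof_probability(Proof, P) ),
                Scored),
        msort(Scored, Ascending),             % sort by probability, ascending
        reverse(Ascending, Descending),
        take(K, Descending, Best).

    proof_probability([], 1.0).
    proof_probability([P|Ps], Prob) :-
        proof_probability(Ps, Rest),
        Prob is P * Rest.

    % take(+K, +List, -Prefix): first K elements (or fewer if the list is shorter).
    take(0, _, []) :- !.
    take(_, [], []) :- !.
    take(K, [X|Xs], [X|Ys]) :-
        K > 0,
        K1 is K - 1,
        take(K1, Xs, Ys).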
5. Experiments
The experiments were set up to answer the following questions: (Q1) Does the exact algorithm perform better than naively calculating the utility for all possible strategies? (Q2) How does local search compare to the exact algorithm in terms of runtime and solution quality? (Q3) What is the trade-off between runtime and solution quality for different values of k, using the k-best proofs approximation? (Q4) Do the algorithms scale to large, real-world problems?
To answer these questions we tested the algorithms on the viral marketing problem, a prime example of relational non-sequential decision making. The viral marketing problem was formulated by Domingos and Richardson (2001) and used in experiments with Markov Logic (Nath and Domingos 2009).

[Figure 4: Runtime of solving viral marketing in random graphs of increasing size. The values are averaged over three runs on four different graphs of the same size. Methods differ in the search algorithm and the number of proofs used.]

Given a social network structure consisting of trusts(a,b) relations, the decisions are whether or not to market to individuals in the network. A reward is given for people buying the product and marketing to an individual has a cost. People that are marketed to, or that trust someone who bought the product, may buy the product. In DTProbLog, this problem can be modeled as:
    ? :: market(P) :- person(P).
    0.4 :: viral(P,Q).
    0.3 :: from_marketing(P).
    market(P) → −2 :- person(P).
    buys(P) → 5 :- person(P).
    buys(P) :- market(P), from_marketing(P).
    buys(P) :- trusts(P,Q), buys(Q), viral(P,Q).
The example shows the use of syntactic sugar in the form of templates for decisions and utility attributes. It is allowed to make decision facts and utility attributes conditional on a body of literals. For every substitution for which the body succeeds, a corresponding decision fact or utility attribute is constructed. We impose the restriction that these bodies can only depend on deterministic facts. For instance, in the example, there is one market decision for each person.
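To illustrate the expansion, consider a hypothetical two-person network (purely for illustration) with the deterministic facts person(a), person(b) and trusts(b,a). The decision and utility templates above then expand to the ground facts and attributes

    ? :: market(a).      ? :: market(b).
    market(a) → −2.      market(b) → −2.
    buys(a) → 5.         buys(b) → 5.

so that there is indeed one market decision (and one marketing cost and one potential buying reward) per person, while the non-ground probabilistic facts viral(P,Q) and from_marketing(P) are handled by the ProbLog semantics itself.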
We tested the algorithms on the viral marketing problem using a set of synthetic power-law random graphs, known to resemble social networks (Barabasi and Bonabeau 2003). The algorithms were implemented in YAP 6 (http://www.dcc.fc.up.pt/~vsc/Yap/) for the Prolog part and simpleCUDD 2.0.0 (http://www.cs.kuleuven.be/~theo/tools/simplecudd.html) for the decision diagrams. The average number of edges or trust relations per person was chosen to be 2. Figure 4 shows the runtime for different solution algorithms on graphs with increasing node count. For reproducibility, we start the local search algorithm from a zero-vector for σ and flip decisions in a fixed order.
This allows us to answer the first three questions. (Q1) While the exact algorithm is fast for small problems and guaranteed to find the optimal strategy, it becomes infeasible on networks with more than 30 nodes. The 10-node problem is the largest one solvable by the naive approach and takes over an hour to compute. Solving the 30-node graph in a naive manner would require over a billion inference steps, which is intractable. The exact algorithm clearly outperforms the naive approach. (Q2) Local search solves up to 55-node problems when it takes all proofs into account and was able to find the globally optimal solution for those problems where the exact algorithm found a solution. This is not necessarily the case for other decision problems with more deterministic dependencies. (Q3) Beyond the 55-node point, the BDDs had to be approximated by the k-best proofs. For higher values of k, search becomes slower but is more likely to find a better strategy. For k larger than 20, the utility was within 2% of the best found policy for all problem sizes.
To answer (Q4), we experimented on a real-world dataset of trust relations extracted from the Epinions social network website, http://www.epinions.com/ (Richardson and Domingos 2002). The network contains 75888 people that each trust 6 other people on average. Local search using the 17-best proofs finds a locally optimal strategy for this problem in 16 hours.
6. Related Work
Several AI subfields are related to DTProbLog, either because they focus on the same problem setting or because they use compact data structures for decision problems. Closely related is the independent choice logic (ICL) (Poole 1997), which shares its distribution semantics (Sato 1995) with ProbLog, and which can represent the same kind of decision problems as DTProbLog. Just as DTProbLog extends an existing language, ProbLog, two related systems have recently been extended towards utilities. Chen and Muggleton (2009) extend stochastic logic programs (SLPs) towards decision-theoretic logic programs (DTLPs). The DTLP approach is close to the syntax and semantics of DTProbLog, although some restrictions are put on the use of decisions in probabilistic clauses. Chen and Muggleton also devise a parameter learning algorithm derived from SLPs. Nath and Domingos (2009) introduce Markov logic decision networks (MLDNs), based on Markov logic networks. Many differences between MLNs and ProbLog exist, and these are carried over to DTProbLog; yet we are able to test on the same problems, as described earlier. Whereas DTProbLog's inference and search can be done both exactly and approximately, MLDN's methods only compute approximate solutions. Some other formalisms can also model decision problems, e.g. IBAL (Pfeffer 2001) and relational decision networks (Hsu and Joehanes 2004). Unlike DTProbLog, DTLPs, ICL and relational decision networks currently provide no implementation based on efficient data structures tailored towards decision problems and hence no results are reported on a large problem such as the Epinions dataset. IBAL has difficulties representing situations in which properties of different objects are mutually dependent, as in the viral marketing example.
DTProbLog is also related to various works on Markov decision processes. In contrast to DTProbLog, these are concerned with sequential decision problems. Nevertheless, they are related in the kind of techniques they employ. For instance, for factored Markov decision processes (FMDPs), SPUDD (Hoey et al. 1999) also uses ADDs to represent utility functions, though it cannot represent relational decision problems and is not a programming language. On the other hand, there also exist first-order (or relational) Markov decision processes (FOMDPs), see (van Otterlo 2009). Techniques for FOMDPs have often been developed by upgrading corresponding algorithms for FMDPs to the relational case, including the development of compact first-order decision diagrams by Wang, Joshi, and Khardon (2008) and Sanner and Boutilier (2009). While first-order decision diagrams are very attractive, they are not yet as well understood and well developed as their propositional counterparts. A unique feature of DTProbLog (and the underlying ProbLog system) is that it solves relational decision problems by making use of efficient propositional techniques and data structures.

Perhaps the most important question for further research is whether and how DTProbLog and its inference algorithms can be adapted for use in sequential decision problems and FOMDPs. DTProbLog is, in principle, expressive enough to model such problems because it is a programming language, allowing the use of structured terms such as lists or natural numbers to represent sequences and time. However, while DTProbLog can essentially represent such problems, a computational investigation of the effectiveness of its algorithms for this type of problem still needs to be performed and may actually motivate further modifications or extensions of DTProbLog's engine, such as tabling (currently under development for ProbLog) or the use of first-order decision diagrams.
7. Conclusions
A new decision-theoretic probabilistic logic programming language, called DTProbLog, has been introduced. It is a simple but elegant extension of the probabilistic Prolog ProbLog. Several algorithms for performing inference and computing the optimal strategy for a DTProbLog program have been introduced. These include an exact algorithm to compute the optimal strategy using binary and algebraic decision diagrams, as well as two approximation algorithms. The resulting algorithms have been evaluated in experiments and shown to work on a real-life application.
Acknowledgments This work was supported in part by
the Research Foundation-Flanders (FWO-Vlaanderen) and
the GOA project 2008/08 Probabilistic Logic Learning.
References
Bahar, R.; Frohm, E.; Gaona, C.; Hachtel, G.; Macii, E.;
Pardo, A.; and Somenzi, F. 1997. Algebraic Decision Di-
agrams and Their Applications. Formal Methods in System
Design 10:171–206.
Barabasi, A., and Bonabeau, E. 2003. Scale-free networks.
Scientific American 288(5):50–59.
Bryant, R. 1986. Graph-based algorithms for boolean
function manipulation. IEEE Transactions on computers
35(8):677–691.
Chen, J., and Muggleton, S. 2009. Decision-Theoretic Logic
Programs. In Proceedings of ILP.
De Raedt, L.; Frasconi, P.; Kersting, K.; and Muggleton, S.,
eds. 2008. Probabilistic Inductive Logic Programming -
Theory and Applications, volume 4911 of Lecture Notes in
Computer Science. Springer.
De Raedt, L.; Kimmig, A.; and Toivonen, H. 2007.
ProbLog: A probabilistic Prolog and its application in link
discovery. In Proceedings of IJCAI, 2462–2467.
Domingos, P., and Richardson, M. 2001. Mining the net-
work value of customers. In Proceedings of KDD, 57–66.
Flach, P. 1994. Simply Logical: Intelligent Reasoning by
Example.
Getoor, L., and Taskar, B. 2007. Introduction to statistical
relational learning. MIT Press.
Hoey, J.; St-Aubin, R.; Hu, A.; and Boutilier, C. 1999.
SPUDD: Stochastic planning using decision diagrams. Pro-
ceedings of UAI 279–288.
Hsu, W., and Joehanes, R. 2004. Relational Decision Net-
works. In Proceedings of the ICML Workshop on Statistical
Relational Learning.
Kimmig, A.; Santos Costa, V.; Rocha, R.; Demoen, B.; and
De Raedt, L. 2008. On the efficient execution of ProbLog
programs. In Proceedings of ICLP.
Nath, A., and Domingos, P. 2009. A Language for Rela-
tional Decision Theory. In Proceedings of SRL.
Pfeffer, A. 2001. IBAL: A probabilistic rational program-
ming language. In Proceedings of IJCAI, volume 17, 733–
740.
Poole, D. 1997. The independent choice logic for mod-
elling multiple agents under uncertainty. Artificial Intelli-
gence 94(1-2):7–56.
Richardson, M., and Domingos, P. 2002. Mining
knowledge-sharing sites for viral marketing. In Proceedings
of KDD, 61.
Russell, S., and Norvig, P. 2003. Artificial intelligence: A
modern approach. Prentice Hall.
Sanner, S., and Boutilier, C. 2009. Practical solution tech-
niques for first-order MDPs. Artificial Intelligence 173(5-
6):748–788.
Sato, T. 1995. A statistical learning method for logic pro-
grams with distribution semantics. In Proceedings of ICLP,
715–729.
van Otterlo, M. 2009. The logic of adaptive behavior. IOS
Press, Amsterdam.
Wang, C.; Joshi, S.; and Khardon, R. 2008. First order
decision diagrams for relational MDPs. Journal of Artificial
Intelligence Research 31(1):431–472.