7th International Symposium on Imprecise Probability: Theories and Applications, Innsbruck, Austria, 2011
A Fully Polynomial Time Approximation Scheme for Updating Credal
Networks of Bounded Treewidth and Number of Variable States
Denis D. Mauá
IDSIA, Switzerland
denis@idsia.ch
Cassio P. de Campos
IDSIA, Switzerland
cassio@idsia.ch
Marco Zaffalon
IDSIA, Switzerland
zaffalon@idsia.ch
Abstract
Credal networks lift the precise probability assumption of
Bayesian networks, enabling a richer representation of un-
certainty in the form of closed convex sets of probabil-
ity measures. The increase in expressiveness comes at the
expense of higher computational costs. In this paper we
present a new algorithm which is an extension of the well-
known variable elimination algorithm for computing pos-
terior inferences in extensively specified credal networks.
The algorithm is empirically shown to outperform a
state-of-the-art algorithm. We then provide the first
fully polynomial time approximation scheme for inference
in credal networks with bounded treewidth and number of
states per variable.
Keywords. Probabilistic graphical models, credal net-
works, approximation scheme, valuation algebra.
1 Introduction
Credal networks [11] are generalizations of Bayesian net-
works that allow for a richer representation of uncer-
tainty in the form of set-valued probabilities—in contrast
to the sharp numeric values required by their Bayesian
counterpart. They are models of imprecise probability
as advocated by Walley [18]. In a nutshell, credal net-
works rely on a directed acyclic graph (DAG) to encode a
compact and computationally efficient representation of a
closed convex set of joint probability mass functions over
a set of variables, much in the same way that Bayesian
networks do for single joint probability mass functions.
Namely, credal networks respect the local Markov condi-
tion that each variable (uniquely represented by a node in
the DAG) is (strongly) independent of its non-descendant
non-parents conditional on its parents. Strong indepen-
dence is justified by a sensitivity analysis interpretation,
where we assume that there exists a single probability
mass function representing our knowledge which we can-
not know precisely for lack of resources; epistemic irrele-
vance, on the other hand, is arguably more consistent with
a behavioral interpretation of inherent imprecision [18]. In
the following, we assume credal networks to operate under
strong independence.
In order to enable efficient computation, additional constraints need to be imposed on the set-valued specifications of the local probabilities. The two most common choices are extensively specified sets, in which local models are given as sets of probability potentials, and separately specified sets, in which local models are specified as collections containing one set of probability mass functions for each configuration of the parents. Separately specified networks can be mapped to extensively specified ones and vice versa [2].
There is also another subtlety when computing with such
local models, which concerns the way they are represented
in a computer. The sets of local (conditional) probabil-
ity mass functions can be encoded either as sets of points
(e.g., the sets of vertices of a convex polytope), or as sets
of (linear) inequalities. Although these two encodings can
represent any finitely-generated closed convex set, mov-
ing from an inequality-based encoding to a vertex-based
encoding can dramatically increase the length of the rep-
resentation of the local models. For example, a simple
8-dimensional polytope specified by 729 inequalities has
between 5 thousand and 12 billion vertices [4].
Inference with credal networks has been theoretically and
empirically shown to be a difficult problem. For example,
computing exact marginals in credal networks is known
to be NP-hard even for polytree-shaped networks, a par-
ticular case that can be computed in polynomial time in
Bayesian networks [7]. Despite the hardness of the prob-
lem, several algorithms are known to perform reasonably
well under certain conditions. Most notably, the 2U al-
gorithm [12], which computes exact posterior bounds in
polytree-shaped credal networks with binary variables,
continues to be the only known polynomial time algo-
rithm available, and its generalizations to arbitrary net-
works (e.g., the GL2U [3]), which perform approximate
inference, are among the fastest algorithms. A notable example, against which we compare our results in this paper, is the algorithm of de Campos and Cozman [8], which finds exact posterior bounds in general networks by converting the problem into a mixed integer program, which can be solved exactly for small networks, or relaxed to provide approximate results in large networks. Other approaches mix branch-and-bound methods for exact inference and local searches for approximate results [6, 9, 16]. Table 1 contrasts some of the available algorithms. To date, no algorithm is known to provide approximations within given bounds in polynomial time. Recently, de Cooman et al. [10] developed a polynomial time algorithm for tree-shaped credal networks, but it operates under epistemic irrelevance.

algorithm    complexity    topology    inference        representation
2U [12]      polynomial    polytree    exact            inequality
GL2U [3]     polynomial    all         approximate      vertex
A/R+ [16]    exponential   polytree    approximate      inequality
IP [8]       exponential   all         exact/approx.    inequality
ML [6]       exponential   all         exact/approx.    inequality
HC [9]       exponential   all         exact/approx.    vertex

Table 1: Comparison of some existing algorithms for inference in credal networks.
In this paper, we present a new algorithm for computing
exact posterior bounds in extensively specified credal net-
works encoded by vertices, as well as a fully polynomial
time approximation scheme (FPTAS) for networks with
bounded treewidth and number of states per variable. We
begin by stating the basic elements of our formalism (Sec-
tion 2), followed by a formal definition of inference in ex-
tensively specified credal networks (Section 3). Then we
present a modified variable elimination algorithm for ex-
act inference, which has worst-case complexity exponen-
tial in both the treewidth of the graph and the size of local
sets (Section 4). We address this issue by devising an FP-
TAS (Section 5). Experiments showing the performance
of the algorithms are presented and discussed in Section 6.
Finally, Section 7 contains our concluding thoughts.
Due to the limited space, we only present proofs for the
most important results.
2 An Algebra of Ordered Potentials
In this section, we introduce the main ingredients of the
message passing algorithms that we present later as well
as the basic results needed to guarantee the correctness and
efficiency of computations.
From an algebraic viewpoint, the primitive entities of our formalism are the so-called labeled valuations $(\phi, x)$, which encode information about a (local) domain through a valuation $\phi$ and a set of variables $x$. Here we adopt the equivalent notation $\phi_x$ to denote the pair $(\phi, x)$. More concretely, valuations can take forms as straightforward as bounded real-valued functions (Section 2.2), or represent more complicated objects such as sets of pairs of probability potentials (Section 2.3).

The set of all variables we consider relevant to a problem, denoted by $U$, is the largest set of variables that can be considered for a (labeled) valuation in our setting, which we assume to be bounded. We write variables with capital letters (e.g., $X_1, \dots, X_n \in U$) and sets of variables in lower case (e.g., $x = \{X_1, \dots, X_n\}$). Any variable $X$ is assumed to be associated with a finite set of values $\Omega_X$ called its frame. The elements of $\Omega_X$ are called states. If $x$ is a set of variables, the domain $\Omega_x$ is given by the Cartesian product of the frames of variables in $x$, $\Omega_x \triangleq \times_{X \in x} \Omega_X$. Any element of $\Omega_x$ is called a configuration. If $x$ is a configuration in $\Omega_x$, the notation $x^{\downarrow y}$ denotes the projection of $x$ onto $y \subseteq x$, with $x^{\downarrow \emptyset} \triangleq \lambda$, where $\lambda$ denotes the null element that does not appear in any frame.
The set of all valuations $(\phi, x)$ over a subset $x \subseteq U$ is denoted by $\Phi_x$. The set of all valuations is denoted by $\Phi \triangleq \bigcup_{x \subseteq U} \Phi_x$. The algebra comes with two basic operations of combination and marginalization. Intuitively, combination represents aggregation of two pieces of information. If $\phi_x$ and $\phi_y$ are two arbitrary valuations, then $\phi_x \times \phi_y$ is a valuation $\phi_{x \cup y}$ with domain $\Omega_{x \cup y}$. Marginalization, on the other hand, acts by coarsening information. If $\phi_x$ is a valuation then the marginal $\phi_x^{\downarrow y}$ is a valuation with domain $\Omega_y$. Sometimes, it is convenient to define the elimination operation, which is in a one-to-one correspondence to marginalization. Formally, if $\phi_x$ is a valuation then $\phi_x^{-y} \triangleq \phi_x^{\downarrow x \setminus y}$ is the result of the elimination of variables in $y$. When clear from the context, we write $Y$ to denote a singleton $y = \{Y\}$, for example $\phi_x^{-Y} = \phi_x^{\downarrow x \setminus \{Y\}}$. A system $(\Phi, U, \times, \downarrow)$ closed under combination and marginalization is said to be a valuation algebra if it satisfies the following three axioms [15, 17].

(A1) Combination is commutative and associative.
(A2) For $y \subseteq x \subseteq z$, $(\phi_z^{\downarrow x})^{\downarrow y} = \phi_z^{\downarrow y}$.
(A3) If $x \subseteq z \subseteq x \cup y$ then $(\phi_x \times \phi_y)^{\downarrow z} = \phi_x \times \phi_y^{\downarrow z \cap y}$.

The purpose of a valuation algebra is the computation of marginals of the form $(\times_i \phi_{u_i})^{\downarrow y}$, where the joint valuation $\times_i \phi_{u_i}$ is computationally too expensive to be obtained explicitly. The complexity of the operations of combination and marginalization is given by the size of the valuations involved, which is in general a function of the cardinality of the domain. Hence, as a rule of thumb, the larger the domain of a valuation the more expensive are the operations involving it. The axioms of valuation algebras provide the necessary framework for breaking down the computation of costly marginals into a sequence of computations of marginals over smaller domains. The pseudo-code in Algorithm 1 exhibits the variable elimination procedure (also known as fusion algorithm), which more efficiently computes marginals of factorized valuations.
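Algorithm 1: Variable Elimination
input : A finite set of valuations $\Psi$, a set of target variables $y \subset U \triangleq \bigcup_{\phi_u \in \Psi} u$, and an ordering $o = (X_1, \dots, X_n)$ of the variables in $U \setminus y$
output: The marginal $(\times_{\phi \in \Psi} \phi)^{\downarrow y}$
for $i \leftarrow 1$ to $n$ do
    Set $B_i \leftarrow \{\phi_u \in \Psi : X_i \in u\}$;
    Compute $\Psi_i \triangleq (\times_{\phi \in B_i} \phi)^{-X_i}$;
    Set $\Psi \leftarrow (\Psi \setminus B_i) \cup \{\Psi_i\}$;
end
return $\Gamma \triangleq \times_{\phi \in \Psi} \phi$;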
Instead of computing a valuation $\times_{\phi \in \Psi} \phi$ over a large domain $\Omega_U$ and then marginalizing to $y$, the algorithm computes marginals $(\times_{\phi \in B_i} \phi)^{-X_i}$ over possibly much smaller domains. The overall complexity of the algorithm is given by the size of the largest valuation $\Psi_i$ generated in the loop. If this size is bounded then (A1)–(A3) are sufficient to show that the algorithm efficiently outputs the desired marginal [15].
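The loop of Algorithm 1 can be sketched in Python for the special case of probability potentials. This is a minimal illustration and not the authors' implementation: the dictionary-based representation, the `FRAMES` table, the helper names and all numbers are assumptions made for the example.

```python
from functools import reduce
from itertools import product

FRAMES = {"A": (0, 1), "B": (0, 1), "C": (0, 1)}  # hypothetical binary frames

def combine(p, q):
    """Combine two potentials, each given as (scope tuple, table dict)."""
    (sx, tx), (sy, ty) = p, q
    scope = sx + tuple(v for v in sy if v not in sx)
    table = {}
    for z in product(*(FRAMES[v] for v in scope)):
        conf = dict(zip(scope, z))
        table[z] = tx[tuple(conf[v] for v in sx)] * ty[tuple(conf[v] for v in sy)]
    return scope, table

def eliminate(p, var):
    """Sum a single variable out of a potential."""
    sx, tx = p
    scope = tuple(v for v in sx if v != var)
    table = {}
    for x, value in tx.items():
        y = tuple(s for v, s in zip(sx, x) if v != var)
        table[y] = table.get(y, 0.0) + value
    return scope, table

def variable_elimination(valuations, ordering):
    """For each X_i in the ordering, combine its bucket B_i and eliminate X_i."""
    psi = list(valuations)
    for var in ordering:
        bucket = [p for p in psi if var in p[0]]
        if not bucket:
            continue  # no valuation mentions this variable
        rest = [p for p in psi if var not in p[0]]
        psi = rest + [eliminate(reduce(combine, bucket), var)]
    return reduce(combine, psi)

# Hypothetical chain A -> B -> C; table keys follow the order of the scope.
p_a = (("A",), {(0,): 0.1, (1,): 0.9})
p_ba = (("B", "A"), {(0, 0): 0.2, (1, 0): 0.8, (0, 1): 0.3, (1, 1): 0.7})
p_cb = (("C", "B"), {(0, 0): 0.6, (1, 0): 0.4, (0, 1): 0.5, (1, 1): 0.5})
scope_c, marginal_c = variable_elimination([p_a, p_ba, p_cb], ["A", "B"])
```

With the ordering (A, B), no intermediate potential involves more than two variables, whereas the explicit joint would range over all three.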
Some optimization tasks like the credal network infer-
ences we aim at here admit a partial ordering over the
valuations. Let ≤ denote a partial order over Φ (i.e., a
reflexive, antisymmetric and transitive relation). An or-
dered valuation algebra [13] is a system (Φ,U,×,↓,≤),
where (Φ,U,×,↓) is a valuation algebra and ≤ is mono-
tonic with respect to × and ↓:
(A4) If $\phi_x \le \psi_x$ and $\phi_y \le \psi_y$ then $(\phi_y \times \phi_x) \le (\psi_y \times \psi_x)$ and $\phi_x^{\downarrow y} \le \psi_x^{\downarrow y}$.
Given a finite set of ordered valuations $\Psi \subseteq \Phi$, we say that $\phi \in \Psi$ is maximal if for all $\psi \in \Psi$ such that $\phi \le \psi$ it holds that $\psi \le \phi$. The operation $\max(\Psi)$ returns the set of maximal valuations of a set $\Psi$. Given any relation $R$ on $\Psi$, a subset $\Psi' \subseteq \Psi$ is called an $R$-covering of $\Psi$ if for every $\phi \in \Psi$ there is $\psi \in \Psi'$ such that $\phi R \psi$. For example, the set $\max(\Psi)$ is a $\le$-covering for $\Psi$.
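The definitions of maximal element and R-covering translate directly into code. The sketch below is only illustrative: the function names are ours, and the valuations are stand-in points in the plane ordered componentwise rather than the valuations used in the paper.

```python
def maximal(valuations, leq):
    """Keep phi if every psi with phi <= psi also satisfies psi <= phi."""
    return [phi for phi in valuations
            if all(leq(psi, phi) for psi in valuations if leq(phi, psi))]

def is_covering(subset, valuations, rel):
    """Check that every phi in valuations has some psi in subset with phi R psi."""
    return all(any(rel(phi, psi) for psi in subset) for phi in valuations)

def leq(a, b):
    """Componentwise order on tuples (a hypothetical partial order for the example)."""
    return all(x <= y for x, y in zip(a, b))

points = [(1, 4), (2, 2), (3, 1), (1, 1), (2, 1)]
front = maximal(points, leq)  # (1, 1) and (2, 1) are dominated by (2, 2)
covers = is_covering(front, points, leq)
```

The quadratic scan is the simplest correct way to compute the maximal set; faster algorithms exist for low-dimensional points but are not needed to illustrate the definition.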
2.1 Set-Valuations
The algorithms we develop use the more complex entities of sets of valuations, called set-valuations. These entities can nevertheless be cast in the algebra of valuations, and manipulated by the variable elimination algorithm to produce sets of marginal valuations.
Let $2^{\Phi_x}$ denote the power set of $\Phi_x$, that is, the set of all subsets of it. Thus, $2^{\Phi}$ denotes the set of all subsets of valuations in $\Phi$. If $\Psi_x \in 2^{\Phi_x}$ and $\Psi_y \in 2^{\Phi_y}$, we define their set-combination $\otimes$ as the set-valuation resulting from element-wise combination of their elements, $\Psi_x \otimes \Psi_y \triangleq \{\phi_x \times \phi_y : \phi_x \in \Psi_x, \phi_y \in \Psi_y\}$. Likewise, we define the set-marginalization operation $\Downarrow$ on $2^{\Phi}$ as the element-wise marginalization of the valuations in a set, $\Psi_x^{\Downarrow y} \triangleq \{\phi_x^{\downarrow y} : \phi_x \in \Psi_x\}$.

Proposition 1. The system $(2^{\Phi}, U, \otimes, \Downarrow)$ of set-valuations with set-combination and set-marginalization is a valuation algebra.
The exact variable elimination algorithm we develop in Section 4 obtains its (relative) efficiency by propagating only maximal valuations. Let $\max(2^{\Phi}) \triangleq \{\max(\Psi) : \Psi \in 2^{\Phi}\}$ denote the set of all sets of maximal valuations in $2^{\Phi}$. We define the max-combination $\oplus$ and max-marginalization $\downdownarrows$ as $\Psi_x \oplus \Psi_y \triangleq \max(\Psi_x \otimes \Psi_y)$ and $\Psi_x^{\downdownarrows y} \triangleq \max(\Psi_x^{\Downarrow y})$.

Proposition 2. The system $(\max(2^{\Phi}), U, \oplus, \downdownarrows)$ of maximal set-valuations with max-combination and max-marginalization is also a valuation algebra.
If $(\Phi_1, U, \times_1, \downarrow_1)$ and $(\Phi_2, U, \times_2, \downarrow_2)$ are two valuation algebras, we say that a mapping $h : \Phi_1 \to \Phi_2$ is a homomorphism if for any $\phi_x, \phi_y \in \Phi_1$ we have that $h(\phi_x) \times_2 h(\phi_y) = h(\phi_x \times_1 \phi_y)$ and $h(\phi_x)^{\downarrow_2 y} = h(\phi_x^{\downarrow_1 y})$. Thus, if we are interested in computing $h(\phi_1^{\downarrow_1 y})$ for some valuation $\phi_1 \in \Phi_1$ that we know factorizes as $\phi_1 = \psi_1 \times_1 \cdots \times_1 \psi_m$, we can equivalently obtain $(h(\psi_1) \times_2 \cdots \times_2 h(\psi_m))^{\downarrow_2 y}$, which might be computationally more convenient. The following result relates the algebras of set-valuations and maximal set-valuations.

Proposition 3. $\max$ is a homomorphism from $(2^{\Phi}, U, \otimes, \Downarrow)$ to $(\max(2^{\Phi}), U, \oplus, \downdownarrows)$.

Since the set of maximal elements of a set is in the worst case as large as the set itself, but often much smaller, the homomorphism $\max$ allows us to conveniently obtain a set of maximal marginals $\max([\bigotimes_i \Psi_{x_i}]^{\Downarrow y})$ by computing the equivalent $[\bigoplus_i \max(\Psi_{x_i})]^{\downdownarrows y}$. Recall that $\otimes$ is defined as element-wise combination of valuations in the Cartesian product, and assume that the set-valuations $\Psi_{x_i}$ cannot be factorized as combinations of other set-valuations. Hence, the set $\bigotimes_i \Psi_{x_i}$ is exponentially large in the size of each $\Psi_{x_i}$ and often intractable. On the other hand, the combination of maximal set-valuations $\bigoplus_i \max(\Psi_{x_i})$ can mitigate the exponential explosion if the number of maximal points is kept bounded after each pairwise combination. For instance, if each of the $n$ local maximal sets $\max(\Psi_{x_i})$ is half as large as its original set $\Psi_{x_i}$, then computing $\max([\bigotimes_i \max(\Psi_{x_i})]^{\Downarrow y})$ involves a factor of $O(2^n)$ fewer computations than $\max([\bigotimes_i \Psi_{x_i}]^{\Downarrow y})$. The speed-up strongly depends on the number of non-maximal elements that are discarded after each max-combination.
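The effect of the homomorphism can be checked on a toy instance. The sketch below assumes a deliberately simple ordered algebra that is not the one used in the paper: valuations are tuples of nonnegative integers, combination is componentwise multiplication, and the order is componentwise; marginalization is left out. All names and values are hypothetical.

```python
from itertools import product

def leq(a, b):
    """Componentwise order on tuples."""
    return all(x <= y for x, y in zip(a, b))

def maximal(vals):
    """Set of elements that are not strictly dominated by another element."""
    vals = set(vals)
    return {a for a in vals if not any(leq(a, b) and a != b for b in vals)}

def times(a, b):
    """Componentwise product; monotone for nonnegative entries."""
    return tuple(x * y for x, y in zip(a, b))

def set_combine(psi_x, psi_y):
    """Set-combination: element-wise combination over the Cartesian product."""
    return {times(a, b) for a, b in product(psi_x, psi_y)}

def max_combine(psi_x, psi_y):
    """Max-combination: keep only the maximal elements of the set-combination."""
    return maximal(set_combine(psi_x, psi_y))

psi_a = {(1, 4), (2, 2), (1, 1)}
psi_b = {(3, 1), (1, 3), (1, 1)}
direct = maximal(set_combine(psi_a, psi_b))              # 9 products, then prune
pruned = max_combine(maximal(psi_a), maximal(psi_b))     # prune first: 4 products
```

Both routes return the same maximal set here, but pruning first combines two elements per set instead of three, which is where the savings accumulate over many combinations.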
In the rest of this section we introduce the concrete valua-
tion algebras our framework relies on.
2.2 Probability Potentials
Probability potentials are perhaps the most common example of valuation algebras. They generalize (conditional) probability mass functions. If $x \subseteq U$ is a nonempty set of variables, we define a potential $p_x$ as a mapping from $\Omega_x$ to the set of nonnegative reals. A potential $p_\emptyset$ over the empty set is defined as a nonnegative real number. The size of a potential $p_x$ is the cardinality of its domain. The following operations are defined over potentials. Combination of potentials is done by element-wise multiplication: for $z \in \Omega_{x \cup y}$,

$$(p_x \times p_y)(z) \triangleq p_x(z^{\downarrow x})\, p_y(z^{\downarrow y}). \tag{1}$$

Marginalization is defined as the sum of compatible elements. For $y \in \Omega_y$,

$$p_x^{\downarrow y}(y) \triangleq \sum_{x \in \Omega_x : x^{\downarrow y} = y} p_x(x). \tag{2}$$

Note that if $y = \emptyset$, the marginal $p_x^{\downarrow y}$ is a (nonnegative real) number.
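Equations (1) and (2) can be illustrated with a small dictionary-based sketch. The representation (a scope tuple plus a table keyed by configurations), the `FRAMES` table and the numbers are assumptions made for this example only.

```python
from itertools import product

FRAMES = {"A": (0, 1), "B": (0, 1)}  # hypothetical frames

def combine(p, q):
    """Eq. (1): (p x q)(z) = p(z projected on x) * q(z projected on y)."""
    (sx, tx), (sy, ty) = p, q
    scope = sx + tuple(v for v in sy if v not in sx)
    table = {}
    for z in product(*(FRAMES[v] for v in scope)):
        conf = dict(zip(scope, z))
        table[z] = tx[tuple(conf[v] for v in sx)] * ty[tuple(conf[v] for v in sy)]
    return scope, table

def marginalize(p, keep):
    """Eq. (2): sum the entries whose projection onto `keep` agrees."""
    sx, tx = p
    scope = tuple(v for v in sx if v in keep)
    table = {}
    for x, value in tx.items():
        y = tuple(s for v, s in zip(sx, x) if v in keep)
        table[y] = table.get(y, 0.0) + value
    return scope, table

# Table keys follow the order of the scope, e.g. (b, a) for scope ("B", "A").
p_a = (("A",), {(0,): 0.1, (1,): 0.9})
p_ba = (("B", "A"), {(0, 0): 0.2, (1, 0): 0.8, (0, 1): 0.3, (1, 1): 0.7})
joint = combine(p_a, p_ba)
marg_b = marginalize(joint, {"B"})
total = marginalize(joint, set())  # empty scope: a single number stored under key ()
```

Marginalizing to the empty set returns a table with the single key `()`, which matches the convention that a potential over the empty set is a nonnegative real number.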
Partial ordering is given by weak Pareto dominance. Given two potentials $p_x$ and $q_x$ over $\Omega_x$, we define $p_x \ge q_x$ if $p_x(x) \ge q_x(x)$ for all $x \in \Omega_x$. Note that if $p_x$ and $q_x$ have equal sum (i.e., $\sum_{x \in \Omega_x} p_x(x) = \sum_{x \in \Omega_x} q_x(x)$) then $p_x \not\ge q_x$ and $q_x \not\ge p_x$ (unless $p_x = q_x$). This is the case, for example, of potentials representing (conditional) probability mass functions. Therefore, the identity $P_x = \max(P_x)$ holds for any set $P_x$ of (conditional) probability mass functions. Let $P$ denote the set of all probability potentials.

Proposition 4. The system $(P, U, \times, \downarrow, \le)$ is an ordered valuation algebra.
Given a real number $\alpha > 1$, we define an equivalence relation $\equiv_\alpha$ over potentials such that any two potentials $p_x$ and $q_x$ are $\alpha$-equivalent (i.e., $p_x \equiv_\alpha q_x$) if for all $x \in \Omega_x$ either $p_x(x) = q_x(x) = 0$ or $p_x(x)$ and $q_x(x)$ are both positive and $\lfloor \log_\alpha p_x(x) \rfloor = \lfloor \log_\alpha q_x(x) \rfloor$.
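The α-equivalence test amounts to comparing, entry by entry, which power-of-α bucket each value falls into. The sketch below represents a potential as a plain tuple of values over a fixed enumeration of its domain; that representation, the function names and the numbers are assumptions for illustration.

```python
import math

def alpha_key(p, alpha):
    """Map each entry to None (zero) or to floor(log_alpha(entry))."""
    # Floating-point logarithms can be off by one ulp at exact powers of alpha;
    # the example values below deliberately avoid such boundary cases.
    return tuple(None if v == 0 else math.floor(math.log(v, alpha)) for v in p)

def alpha_equivalent(p, q, alpha):
    """Two potentials are alpha-equivalent iff their bucket keys coincide."""
    return alpha_key(p, alpha) == alpha_key(q, alpha)

p = (0.30, 0.0, 0.70)
q = (0.26, 0.0, 0.90)   # same buckets as p for alpha = 2
r = (0.20, 0.0, 0.70)   # first entry falls into a lower bucket
s = (0.30, 0.1, 0.70)   # positive where p is zero
```

Because equivalent entries share a bucket $[\alpha^k, \alpha^{k+1})$, they differ by a factor smaller than α, which is what makes one representative per class a usable approximation.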
2.3 Pairs of Potentials
The algorithms we develop in Sections 4 and 5 rely on a more abstract structure over pairs of potentials. Let $\phi_x = (p^\ell_x, p^r_x)$ denote a pair of probability potentials over $x$. The potentials $p^\ell_x$ and $p^r_x$ are referred to as the left and right potentials of $\phi_x$, respectively. For any two pairs of potentials $\phi_x$ and $\psi_x$, we define $\phi_x = (p^\ell_x, p^r_x) \ge (q^\ell_x, q^r_x) = \psi_x$ if $p^\ell_x \le q^\ell_x$ and $p^r_x \ge q^r_x$. The partial order defined in this way reflects the nature of computations with credal networks. We seek a solution that partly dominates all other potentials (according to right potentials) and is partly dominated by them (according to left potentials). It is in part this dichotomy in the objective that makes posterior inferences in credal networks much harder than their Bayesian counterpart.

If $\phi_x = (p^\ell_x, p^r_x)$ and $\phi_y = (p^\ell_y, p^r_y)$ are two pairs of potentials, we define their combination as the pair of left and right combinations of potentials, that is, $\phi_x \times \phi_y \triangleq (p^\ell_x \times p^\ell_y, p^r_x \times p^r_y)$. Similarly, the marginalization of a pair $\phi_x = (p^\ell_x, p^r_x)$ is performed on both potentials, $\phi_x^{\downarrow y} \triangleq ((p^\ell_x)^{\downarrow y}, (p^r_x)^{\downarrow y})$. Let $\Phi$ be the set of all pairs of potentials.

Proposition 5. The system $(\Phi, U, \times, \downarrow, \le)$ is an ordered valuation algebra.
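The order on pairs reverses direction on the left potential and keeps it on the right one. A minimal sketch follows, with potentials written as tuples of values over a common domain; the helper names and the numbers are hypothetical.

```python
def pot_leq(p, q):
    """Weak Pareto order on potentials given as equal-length tuples."""
    return all(a <= b for a, b in zip(p, q))

def pair_geq(phi, psi):
    """phi >= psi iff left(phi) <= left(psi) and right(phi) >= right(psi)."""
    return pot_leq(phi[0], psi[0]) and pot_leq(psi[1], phi[1])

def maximal_pairs(pairs):
    """Keep phi if every psi with psi >= phi also satisfies phi >= psi."""
    return [phi for phi in pairs
            if all(pair_geq(phi, psi) for psi in pairs if pair_geq(psi, phi))]

# Hypothetical pairs (left, right) over a domain with a single configuration.
a = ((0.20,), (0.50,))
b = ((0.30,), (0.40,))   # dominated by a: larger left, smaller right
c = ((0.10,), (0.30,))   # incomparable with a: smaller left but smaller right
d = ((0.25,), (0.20,))   # dominated by a
front = maximal_pairs([a, b, c, d])
```

Only `a` and `c` survive: neither has both a smaller left potential and a larger right potential than the other.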
Let $2^{\Phi}$ and $\max(2^{\Phi})$ denote, respectively, the set of all sets of pairs of potentials and the set of all sets of maximal pairs of potentials. It follows from Propositions 1 and 2 that the systems $(2^{\Phi}, U, \otimes, \Downarrow)$ and $(\max(2^{\Phi}), U, \oplus, \downdownarrows)$ are valuation algebras. Moreover, $\max$ is a homomorphism from $2^{\Phi}$ to $\max(2^{\Phi})$. Thus, given a collection of finite sets of pairs $\Psi_{x_1}, \dots, \Psi_{x_n}$, we can obtain the set $\max(\Psi_y) \triangleq \max((\bigotimes_i \Psi_{x_i})^{\Downarrow y})$ of maximal marginal valuations potentially more efficiently by performing computations in the algebra of sets of maximal pairs, that is, by computing $\max((\bigoplus_i \max(\Psi_{x_i}))^{\downdownarrows y})$. Bentley et al. [5] showed that sets with $n$ uniformly distributed pairs of potentials over a domain $\Omega_y$ have, on average, $O((\log n)^{2|\Omega_y| - 1})$ maximal elements. Unfortunately, the uniformity assumption does not hold in the computations we perform, and we expect the average number of maximal elements to be higher than this. To our knowledge, no bounds or expectations have yet been obtained on the size of maximal sets obtained from propagated valuations such as those generated by variable elimination. Note that, as with sets of probability potentials, if $\Psi$ contains only valuations whose left or right potentials specify a probability mass function, then $\Psi = \max(\Psi)$.
We can have an upper bound on the cardinality of sets by relaxing the partial order to allow approximate Pareto dominance. Given a real number $\alpha > 1$, we define a relation $\le_\alpha$ such that $\phi \le_\alpha \psi$ denotes that by mistakenly assuming $\phi \le \psi$ we introduce an error no greater than $\alpha$ in each coordinate. More formally, we define $\phi \le_\alpha \psi$ if $(\alpha^{-1}, \alpha) \times \psi \ge \phi$. Note that $\le_\alpha$ is neither transitive nor antisymmetric, and that we may have $\phi \le_\alpha \psi$ for $\phi \not\le \psi$. The $\alpha$-equivalence relation over potentials can easily be extended to pairs. Two pairs $(p^\ell_x, p^r_x)$ and $(p^\ell_y, p^r_y)$ are $\alpha$-equivalent if $p^\ell_x \equiv_\alpha p^\ell_y$ and $p^r_x \equiv_\alpha p^r_y$. It is not difficult to see that $\phi \equiv_\alpha \psi$ implies both $\phi \le_\alpha \psi$ and $\psi \le_\alpha \phi$.

A $\le_\alpha$-covering for a set of pairs of potentials $\Psi_x$ provides an approximate version of $\Psi_x$, one in which for each $\phi_x \in \Psi_x$ we are guaranteed to have a pair $\psi_x$ in the covering such that the left and right potentials of $\psi_x$ and $\phi_x$ differ in each coordinate by a factor no greater than $\alpha$. We can easily obtain a $\le_\alpha$-covering of $\Psi_x$ of bounded cardinality from its quotient set $\Psi_x / \alpha$, that is, by discarding one of any two $\alpha$-equivalent pairs in $\Psi_x$. The approximation algorithm we develop in Section 5 strongly relies on the following results.

Lemma 6. If $k_1, \dots, k_m$ are positive integers and $\Psi_{x_1}, \Psi'_{x_1}, \dots, \Psi_{x_m}, \Psi'_{x_m}$ are set-valuations such that for $i = 1, \dots, m$, $\Psi'_{x_i}$ is a $\le_{\alpha^{k_i}}$-covering for $\Psi_{x_i}$, then $\Psi'_{x_1} \otimes \cdots \otimes \Psi'_{x_m}$ is a $\le_\beta$-covering for $\Psi_{x_1} \otimes \cdots \otimes \Psi_{x_m}$, where $\beta = \alpha^{\sum_{i=1}^{m} k_i}$.

Proof. We work by induction on $j = 1, \dots, m$. For $j = 1$, it follows directly that $\Psi'_{x_1}$ is a $\le_{\alpha^{k_1}}$-covering for $\Psi_{x_1}$. Assume the result holds for $1 \le j \le m - 1$, and consider any pair $\phi = \phi' \times \phi''$ in $\Psi_{x_1} \otimes \cdots \otimes \Psi_{x_{j+1}}$, where $\phi' \in \Psi_{x_1} \otimes \cdots \otimes \Psi_{x_j}$ and $\phi'' \in \Psi_{x_{j+1}}$. There is $\psi = \psi' \times \psi''$ in $\Psi'_{x_1} \otimes \cdots \otimes \Psi'_{x_{j+1}}$, where $\psi' \in \Psi'_{x_1} \otimes \cdots \otimes \Psi'_{x_j}$ and $\psi'' \in \Psi'_{x_{j+1}}$, such that $(\alpha^{-\sum_{i=1}^{j} k_i}, \alpha^{\sum_{i=1}^{j} k_i}) \times \psi' \ge \phi'$ (by assumption) and $(\alpha^{-k_{j+1}}, \alpha^{k_{j+1}}) \times \psi'' \ge \phi''$. It follows from (A4) that $(\alpha^{-\sum_{i=1}^{j+1} k_i}, \alpha^{\sum_{i=1}^{j+1} k_i}) \times \psi \ge \phi$. □

Let $\Psi_{x_1}, \dots, \Psi_{x_m}$ denote sets of pairs of potentials which take values on the interval $[0, 1]$, and let $b$ be the number of bits required to encode these sets.

Proposition 7. The number of elements in $(\Psi_{x_1} \otimes \cdots \otimes \Psi_{x_m})^{\Downarrow y} / \alpha$ is $O((b m \alpha / (\alpha - 1))^{2|\Omega_y|})$.

The latter result is in fact an adaptation of Papadimitriou and Yannakakis' result on the boundedness of $\epsilon$-approximate Pareto curves in multi-objective optimization problems [1, Theorem 1].
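The construction of a covering from the quotient set can be sketched as follows: group the pairs by their α-equivalence key and keep one representative per group. This is an illustrative sketch under assumed representations (pairs of tuples over a common domain, hypothetical numbers), not the authors' code.

```python
import math

def alpha_key(pair, alpha):
    """Bucket key of a pair: floor(log_alpha) of each positive entry, None for zeros."""
    def key(p):
        return tuple(None if v == 0 else math.floor(math.log(v, alpha)) for v in p)
    left, right = pair
    return key(left), key(right)

def quotient_cover(pairs, alpha):
    """Keep the first pair seen in each alpha-equivalence class."""
    reps = {}
    for pair in pairs:
        reps.setdefault(alpha_key(pair, alpha), pair)
    return list(reps.values())

def leq_alpha(phi, psi, alpha):
    """phi <=_alpha psi iff (1/alpha, alpha) x psi >= phi in the order on pairs."""
    (pl, pr), (ql, qr) = phi, psi
    return (all(q / alpha <= p for p, q in zip(pl, ql))
            and all(alpha * q >= p for p, q in zip(pr, qr)))

alpha = 2.0
pairs = [((0.30,), (0.70,)), ((0.26,), (0.90,)),
         ((0.10,), (0.20,)), ((0.12,), (0.24,))]
cover = quotient_cover(pairs, alpha)
is_cover = all(any(leq_alpha(phi, psi, alpha) for psi in cover) for phi in pairs)
```

The number of representatives is bounded by the number of distinct keys, which is the quantity Proposition 7 bounds; which member of a class is kept does not affect the covering guarantee.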
3 Credal Networks
In this section we review the basic concepts and computa-
tional challenges of extensively specified credal networks.
Let $G = (U, E)$ be a DAG, and $X$ a node in $U$. We write $\mathrm{pa}(X) \triangleq \{Y \in U : (Y, X) \in E\}$ to denote the parents of $X$, $\mathrm{ch}(X) \triangleq \{Y \in U : (X, Y) \in E\}$ to denote the children of $X$ in $U$, and $\mathrm{fa}(X) \triangleq \{X\} \cup \mathrm{pa}(X)$ to denote the family of $X$. We call $Y$ a descendant of $X$ if there is a directed path from $X$ to $Y$ in $G$.
An extensive credal set $K_x$ is a set of probability potentials $p_x$ over domain $\Omega_x$. Given an extensive credal set $K_x$, we write $H(K_x)$ to denote its convex hull (i.e., the set obtained by all convex combinations of elements in $K_x$), and $\mathrm{ext}[H(K_x)]$ to denote its extreme points (i.e., the elements of $H(K_x)$ that cannot be written as a convex combination of other elements). The convex hull of a set and the set of its extreme points are themselves extensive credal sets.

[Figure 1: Example of extensively specified credal network. Nodes A, B and C, with local credal sets $K_{\{A\}}$, $K_{\{B,A\}}$ and $K_{\{C,A\}}$.]
An extensively specified credal network is a pair $(G, K)$, where $K$ is a collection of finitely-generated closed convex extensive credal sets $K_{\mathrm{fa}(X)}$, one for each $X \in U$, such that each potential $p_{\mathrm{fa}(X)} \in K_{\mathrm{fa}(X)}$ satisfies $\sum_{x : x^{\downarrow \mathrm{pa}(X)} = \pi} p_{\mathrm{fa}(X)}(x) = 1$ for all $\pi \in \Omega_{\mathrm{pa}(X)}$ (i.e., they represent conditional probability mass functions $p(X \mid \mathrm{pa}(X))$). Figure 1 depicts a simple extensively specified credal network over 3 binary-valued variables.
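The strong extension of a credal network is given by the credal set generated by the convex closure of the product of all extensive credal sets in $K$,

$$K_U^{\mathrm{strong}} \triangleq H\Big( \bigotimes_{X \in U} K_{\mathrm{fa}(X)} \Big). \tag{3}$$

Since the product of local extremes $K_U^{\mathrm{ext}} \triangleq \bigotimes_{X \in U} \mathrm{ext}[K_{\mathrm{fa}(X)}]$ is a subset of the strong extension (by definition), we have that $\mathrm{ext}[K_U^{\mathrm{strong}}] = \mathrm{ext}[H(K_U^{\mathrm{ext}})] \subseteq K_U^{\mathrm{ext}}$. Notice that $K_U^{\mathrm{ext}}$ contains a finite number of elements.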
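Let $q, e \subset U$ denote disjoint sets of query and evidence variables, respectively, and $(q, e)$ an element of $\Omega_{q \cup e}$. Inference with credal networks consists in computing lower and upper posterior probabilities (we assume $p^{\downarrow e}(e) > 0$ for all $p \in K_U^{\mathrm{strong}}$):

$$\underline{p}(q \mid e) \triangleq \min_{p \in K_U^{\mathrm{strong}}} \frac{p^{\downarrow q \cup e}(q, e)}{p^{\downarrow e}(e)}, \tag{4}$$

$$\overline{p}(q \mid e) \triangleq \max_{p \in K_U^{\mathrm{strong}}} \frac{p^{\downarrow q \cup e}(q, e)}{p^{\downarrow e}(e)}. \tag{5}$$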
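For very small networks, the bounds in Equations (4) and (5) can be computed by brute force, assuming (as is well known for strong extensions) that the optima are attained at extreme points, which lie among the products of local extreme points. The sketch below does this for a hypothetical two-node network A → B. The credal sets and their numbers are invented for the example and are not those of Figure 1.

```python
from itertools import product

# Hypothetical extensive credal sets, each listed by its extreme points.
K_A = [{0: 0.1, 1: 0.9}, {0: 0.2, 1: 0.8}]
K_BA = [  # conditional tables p(B | A), keyed by (b, a)
    {(0, 0): 0.2, (1, 0): 0.8, (0, 1): 0.3, (1, 1): 0.7},
    {(0, 0): 0.4, (1, 0): 0.6, (0, 1): 0.5, (1, 1): 0.5},
]

def posterior(p_a, p_ba, a, b):
    """p(A = a | B = b) for the joint p(a, b) = p_a(a) * p_ba(b | a)."""
    numerator = p_a[a] * p_ba[(b, a)]
    evidence = sum(p_a[s] * p_ba[(b, s)] for s in (0, 1))
    return numerator / evidence

# Enumerate every combination of local extreme points (4 joint distributions).
values = [posterior(p_a, p_ba, a=1, b=0) for p_a, p_ba in product(K_A, K_BA)]
lower, upper = min(values), max(values)
```

The enumeration visits one joint distribution per combination of local extremes, so its cost grows with the product of the local set sizes; this is the blow-up that propagating only maximal pairs is meant to contain.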
Our goal in the rest of this section is to show that the
continuous optimizations of Equations (4) and (5) can
be mapped into problems of computing maximal sets of
marginals of the combinations of finite sets of pairs of potentials. We begin with a well-known result that the solutions to the convex optimizations in Equation (5) are attained at extreme points of the strong extension [18]. Since