Simple Random Logic Programs.
ABSTRACT We consider random logic programs with two-literal rules and study their properties. In particular, we obtain results on the
probability that random “sparse” and “dense” programs with two-literal rules have answer sets. We study experimentally how
hard it is to compute answer sets of such programs. For programs that are constraint-free and purely negative we show that the easy-hard-easy pattern emerges. We provide arguments to explain that behavior. We also show that the hardness
of programs from the hard region grows quickly with the number of atoms. Our results point to the importance of purely negative
constraint-free programs for the development of ASP solvers.

Gayathri Namasivayam and Mirosław Truszczyński
Department of Computer Science, University of Kentucky, Lexington, KY
40506-0046, USA
1 Introduction
The availability of a simple model of a random CNF theory was one of the
enabling factors behind the development of fast satisfiability testing programs,
known as SAT solvers. The model constrains the length of each clause to a fixed
integer, say k, and classifies k-CNF theories according to their density, that
is, the ratio of the number of clauses to the number of atoms. k-CNF theories
with low densities have few clauses relative to the number of atoms. Thus, most
of them have many solutions, and solutions are easy to find. k-CNF theories with
high densities have many clauses relative to the number of atoms. Thus, most of
them are unsatisfiable. Moreover, due to the abundance of clauses, proofs of
contradiction are easy to find. As theories in low- and high-density regions are
“easy,” they played essentially no role in the development of SAT solvers.
There is, however, a narrow range of densities “in between,” called the phase
transition, where random k-CNF theories change rapidly from most being
satisfiable to most being unsatisfiable. Somewhere in that narrow range is a
value d such that random k-CNF theories with density d are satisfiable with
probability 1/2. The problem of determining that value has received much
attention. For instance, for 3-CNF theories, the phase-transition density was
found experimentally to be about 4.25 [1]. A paper by Achlioptas discusses
recent progress on the problem, including some lower and upper bounds on the
phase transition value [2]. A key property of 3-CNF theories from the phase
transition region is that they are hard.¹ Thus, we have the easy-hard-easy
difficulty pattern as a function of density. Moreover, deciding satisfiability
of theories from the hard region is very hard indeed! Designing solvers that
could solve random unsatisfiable 3-CNF theories with 700 atoms generated from
the phase-transition region was one of the grand challenges for SAT research
posed by Selman, Kautz and McAllester [4]. It resulted in major advances in SAT
solver technology.
¹ It should be noted that the low- and high-density regions also contain
challenging theories, but they are relatively rare [3].
As in the case of the SAT research, work on random logic programs is likely
to lead to new insights into the properties of answer sets of programs, and lead
to advances in ASP solvers — software for computing them. Yet, the question
of models of random logic programs has received little attention so far, with the
work of Zhao and Lin [5] being a notable exception. Our objective is to propose
a model of simple random logic programs and investigate its properties.
As in SAT, we consider random programs with rules of the same length. For
the present study, we further restrict our attention to programs with
two-literal rules. These programs are simple, which facilitates theoretical
studies. But despite their simplicity, they are of considerable interest.
First, every problem in NP can be reduced in polynomial time to the problem of
deciding the existence of an answer set of a program of that type [6]. Second,
many problems of interest have a simple encoding in terms of such programs [7].
We study experimentally and analytically properties of programs with
two-literal rules. We obtain results on the probability that random programs
with two-literal rules, both “sparse” and “dense,” have answer sets. We study
experimentally how hard it is to compute answer sets of such programs. We show
that for programs that are constraint-free and purely negative the
easy-hard-easy pattern emerges. We give arguments to explain that phenomenon,
and show that the hardness of programs from the hard region grows quickly with
the number of atoms. Our results point to the importance of constraint-free
purely negative programs for the development of ASP solvers, as they can serve
as useful benchmarks when developing good search heuristics. However, unlike in
the case of SAT, depending on the parameters of the model, we either do not
observe the phase transition or, when we do, it is gradual, not sudden.
Even relatively small programs from the hard region are very hard for the
current generation of ASP solvers. Interestingly, that observation may also have
implications for the design of SAT solvers. If P is a purely negative program,
answer sets of P are models of its completion comp(P), a certain propositional
theory [8]. For programs with two-literal rules the completion is (essentially) a
CNF theory. Our experiments showed that these theories are very hard for
present-day SAT solvers, despite the fact that most of their clauses are binary.
2 Preliminaries
Logic programs consist of rules, that is, of expressions of the form
    a ← b_1, ..., b_m, not c_1, ..., not c_n        (1)
and
    ← b_1, ..., b_m, not c_1, ..., not c_n,         (2)
where a, b_i and c_j are atoms. Rules (1) are called definite, and rules (2)
are called constraints. A rule is proper if no atom occurs in it more than
once. A rule is k-regular if it consists of k literals (that is, it is a
definite rule with k−1 literals in the body, or a constraint with k literals in
the body).
If r is a rule of type (1) or (2), the expression b_1, ..., b_m, not c_1, ..., not c_n
(understood as the conjunction of its literals) is the body of r. We denote it by
bd(r). The set of atoms {b_1, ..., b_m} is the positive body of r, denoted bd^+(r),
and the set of atoms {c_1, ..., c_n} is the negative body of r, denoted bd^-(r).
In addition, the head of r, hd(r), is defined as a, if r is of type (1), and as
⊥, otherwise. A program P is constraint-free if it contains no constraints. A
program P is purely negative if for every non-constraint rule r ∈ P, bd^+(r) = ∅.
A set of atoms M is an answer set of a program P if it is the least model
of the reduct of P with respect to M, that is, the program P^M obtained by
removing from P every rule r such that M ∩ bd^-(r) ≠ ∅, and by removing all
literals of the form not c from all other rules of P.
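The definition of an answer set can be checked directly on very small programs. The Python sketch below is illustrative only: the tuple encoding of rules and the function names (reduct, least_model, is_answer_set, answer_sets) are assumptions made for this example, not notation from the paper, and enumerating all subsets of atoms is practical only for a handful of atoms.

```python
from itertools import combinations

# Assumed encoding (hypothetical, for illustration): a rule is a triple
# (head, positive_body, negative_body); head is None for a constraint and
# both bodies are tuples of atoms.

def reduct(program, m):
    """Reduct with respect to m: drop every rule whose negative body meets m,
    and strip the 'not c' literals from the remaining rules."""
    return [(head, pos) for head, pos, neg in program if not set(neg) & m]

def least_model(positive_program):
    """Least model of a negation-free program, or None if a constraint fires."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in positive_program:
            if set(pos) <= model:
                if head is None:      # constraint body holds: no model
                    return None
                if head not in model:
                    model.add(head)
                    changed = True
    return model

def is_answer_set(program, m):
    """True if m is the least model of the reduct of program w.r.t. m."""
    m = set(m)
    return least_model(reduct(program, m)) == m

def answer_sets(program, atoms):
    """Brute-force enumeration over all subsets of atoms."""
    atoms = sorted(atoms)
    return [set(c) for k in range(len(atoms) + 1)
            for c in combinations(atoms, k) if is_answer_set(program, c)]

# a <- not b, b <- not a
even_loop = [("a", (), ("b",)), ("b", (), ("a",))]
# a <- not b, b <- not c, c <- not a
odd_loop = [("a", (), ("b",)), ("b", (), ("c",)), ("c", (), ("a",))]
print(answer_sets(even_loop, "ab"))
print(answer_sets(odd_loop, "abc"))
```

For the two-rule program the enumeration finds the answer sets {a} and {b}; for the three-rule cycle it finds none.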
Computing answer sets of propositional logic programs is the basic reasoning
task of answer-set programming, and fast programs that can do that, known as
answer-set programming solvers (ASP solvers, for short), have been developed in
recent years [9–13].
3 2-Regular Programs
We assume a fixed set of atoms At = {a_1, a_2, ...}. There are five types of
2-regular rules: a ← not b; a ← b; ← not a, not b; ← a, not b; ← a, b.
Accordingly, we define five classes of programs, mR^-_n, mR^+_n, mC^-_n,
mC^±_n, and mC^+_n, with atoms from At_n = {a_1, ..., a_n} and consisting of m
proper rules of each of these types, respectively. Without the reference to m,
the notation refers to all programs with n atoms of the corresponding type (for
instance, R^+_n stands for the class of all programs over At_n consisting of
proper rules of the form a ← b).
The maximum value of m for which mR^-_n, mR^+_n and mC^±_n are not empty is
n(n−1). The maximum value of m for which mC^-_n and mC^+_n are not empty is
n(n−1)/2. Let 0 ≤ m_1, m_2, c_2 ≤ n(n−1) and 0 ≤ c_1, c_3 ≤ n(n−1)/2 be integers.
By [m_1R^- + m_2R^+ + c_1C^- + c_2C^± + c_3C^+]_n we denote the class of programs
P that are unions of programs from the corresponding classes. We refer to these
programs as components of P. If any of the integers m_i and c_i is 0, we omit
the corresponding term from the notation. When we do not specify the numbers
of rules, we allow any programs from the corresponding classes. For instance,
[R^- + R^+ + C^- + C^± + C^+]_n stands for the class of all proper programs with
atoms from At_n.
Given integers n and m, it is easy to generate uniformly at random programs
from each class mR^-_n, mR^+_n, mC^-_n, mC^±_n, and mC^+_n. For instance, a random
program from mR^-_n can be viewed as the result of a process in which we start
with the empty program on the set of atoms At_n and then, in each step, we
add a randomly generated proper rule of the form a ← not b, with repeating
rules discarded, until m rules are generated. This approach generalizes easily
to programs from other classes we consider, in particular, to programs from
[m_1R^- + m_2R^+ + c_1C^- + c_2C^± + c_3C^+]_n. Our goal is to study properties of
such random programs.
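The generation process for purely negative programs can be sketched in a few lines of Python. This is only an illustration of the procedure described above; the function name and the encoding of a rule a_i ← not a_j as the pair (i, j) are assumptions made for the example.

```python
import random

def random_purely_negative_program(n, m, rng=None):
    """Sample m distinct proper rules of the form a_i <- not a_j over n atoms.

    A rule is encoded (hypothetically) as the pair (i, j) with i != j and
    1 <= i, j <= n. Rules are drawn one at a time and repeats are discarded,
    so the result is a uniformly random m-element set of rules.
    """
    if not 0 <= m <= n * (n - 1):
        raise ValueError("m must lie between 0 and n(n-1)")
    rng = rng or random.Random()
    rules = set()
    while len(rules) < m:
        head, negated = rng.sample(range(1, n + 1), 2)  # distinct atoms: proper rule
        rules.add((head, negated))                      # the set drops repeated rules
    return rules

program = random_purely_negative_program(5, 8, random.Random(2009))
for head, negated in sorted(program):
    print(f"a{head} <- not a{negated}")
```

Passing a seeded random.Random instance makes a run reproducible. The same rejection scheme carries over to the other four rule types by changing what a pair stands for, with unordered pairs for the symmetric constraint forms.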
We start with a general observation. If P ∈ [m_2R^+ + c_1C^- + c_2C^± + c_3C^+]_n
(m_1 = 0), then either P has no answer sets (if c_1 ≠ 0) or, otherwise, ∅ is the
unique answer set of P. Thus, in order to obtain interesting classes of programs,
we must have m_1 > 0. In other words, programs from R^-_n (proper, purely
negative and constraint-free) play a key role.
4 The Probability of a Program to Have an Answer Set
We study first the probability that a random program in the class
[m_1R^- + m_2R^+ + c_1C^- + c_2C^± + c_3C^+]_n has an answer set. In several places
we use results from random graph theory [14,15]. To this end, we exploit graphs
associated with programs. Namely, with a program P ∈ [R^- + R^+ + C^±]_n we
associate a directed graph D(P) with the vertex set At_n, in which a is connected
to b with a directed edge (a,b) if b ← not a, b ← a or ← b, not a is a rule of P.
For P ∈ [R^- + R^+]_n, the graph D(P) is known as the dependency graph of a
program. Similarly, with a program P ∈ [R^- + R^+ + C^- + C^± + C^+]_n we associate
an undirected graph G(P) with the vertex set At_n, in which a is connected to b
with an undirected edge {a,b} if a and b appear together in a rule of P. If
P ∈ [R^- + R^+ + C^±]_n, then D(P) may have fewer edges than P has rules (the
rules a ← not b, a ← b and ← a, not b determine the same edge). A similar
observation holds for G(P).
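The two graphs can be computed from a program in a few lines. The Python sketch below assumes a hypothetical encoding of 2-regular rules as triples (kind, x, y); the kind labels and function names are chosen for this example and do not come from the paper.

```python
# Assumed encoding (for illustration only): a rule is (kind, x, y), where
#   "R-"  stands for  x <- not y        "R+"  stands for  x <- y
#   "C-"  stands for  <- not x, not y   "C+-" stands for  <- x, not y
#   "C+"  stands for  <- x, y
DIRECTED_KINDS = {"R-", "R+", "C+-"}

def directed_graph(program):
    """Edge set of D(P); defined only for programs built from R-, R+ and C+- rules."""
    edges = set()
    for kind, x, y in program:
        if kind not in DIRECTED_KINDS:
            raise ValueError(f"D(P) is not defined for rules of kind {kind!r}")
        edges.add((y, x))  # x <- not y, x <- y and <- x, not y all give the edge (y, x)
    return edges

def undirected_graph(program):
    """Edge set of G(P): x and y are adjacent if they occur together in a rule."""
    return {frozenset((x, y)) for _, x, y in program}

program = [("R-", "a", "b"), ("R+", "a", "b"), ("C+-", "a", "b"), ("R-", "b", "c")]
print(sorted(directed_graph(program)))
print(sorted(sorted(edge) for edge in undirected_graph(program)))
```

In this example the first three rules collapse to a single directed edge from b to a, so the four-rule program yields only two edges in each graph.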
These graphs contain much information about the underlying programs. For
instance, it is well known that if P ∈ [R^- + R^+]_n and D(P) has no cycles then P
has a unique answer set. Similarly, if P ∈ [m_1R^- + m_2R^+ + c_1C^- + c_2C^± + c_3C^+]_n
and M is an answer set of P then M is an independent set in the graph G(P_1),
where P_1 is the component of P from m_1R^-_n.
We denote by AS^+ the class of all programs over At that have answer sets.
We write Prob(P ∈ AS^+) for the probability that a random program P from one
of the classes defined above has an answer set. That probability depends on n
(technically, it also depends on the numbers of rules of particular types but,
whenever it is so, the relevant numbers are themselves expressed as functions
of n). We are interested in understanding the behavior of Prob(P ∈ AS^+) for
random programs P from the class [R^- + R^+ + C^- + C^± + C^+]_n (or one of its
subclasses). More specifically, we will investigate Prob(P ∈ AS^+) as n grows to
infinity. If Prob(P ∈ AS^+) → 1 as n → ∞, we say that P asymptotically almost
surely, or a.a.s. for short, has answer sets. If Prob(P ∈ AS^+) → 0 as n → ∞, we
say that P a.a.s. has no answer sets.
To ground our results in some intuitions, we first consider the probability that
a program from mR^-_150 has an answer set as a function of the density d = m/150
(or equivalently, the number of rules m). The graphs, shown in Figure 1, were
obtained experimentally. For each value of d, we generated 1000 programs from the
set mR^-_150, where m = 150d. The graph on the left shows the behavior of the
probability across the entire range of d. The graph on the right shows in more
detail the behavior for small densities.
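A scaled-down version of this kind of experiment can be sketched in Python. The sketch makes several assumptions that are not taken from the paper: it uses brute-force enumeration instead of an ASP solver, so it is feasible only for far fewer than 150 atoms, and all function names are hypothetical. It relies on a consequence of the definition of the reduct: for a constraint-free purely negative program, the reduct with respect to M consists of facts only, so M is an answer set exactly when M equals the set of heads of the rules a ← not b with b not in M.

```python
import random
from itertools import combinations

def random_program(n, m, rng):
    """m distinct proper rules a <- not b over atoms 0..n-1, encoded as pairs (a, b)."""
    rules = set()
    while len(rules) < m:
        a, b = rng.sample(range(n), 2)
        rules.add((a, b))
    return rules

def has_answer_set(n, rules):
    """Brute force: M is an answer set iff M == {a : (a <- not b) in P, b not in M}."""
    for size in range(n + 1):
        for candidate in combinations(range(n), size):
            m_set = set(candidate)
            if {a for a, b in rules if b not in m_set} == m_set:
                return True
    return False

def estimate_probability(n, d, samples, seed=0):
    """Fraction of sampled programs with round(d*n) rules that have an answer set."""
    rng = random.Random(seed)
    m = round(d * n)
    hits = sum(has_answer_set(n, random_program(n, m, rng)) for _ in range(samples))
    return hits / samples

if __name__ == "__main__":
    for d in (0.5, 1, 2, 4, 8):
        print(d, estimate_probability(10, d, 200))
```

With so few atoms the estimates need not resemble the curves for n = 150; the sketch only illustrates the sampling procedure.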
[Figure 1: two plots of the probability (vertical axis, ticks from 0.6 to 1)
against the density d (horizontal axis). Panel (a) covers d from 0 to 160;
panel (b) covers d from 0 to 10.]
Fig. 1. The probability that a program from mR^-_150 (m = 150d) has an answer
set, as a function of d.
The graphs show that the probability is close to 1 for very small densities,
then drops rapidly. After reaching a low point (around 0.6, in this case), it
starts growing again and, eventually, reaches 1. We also note that the drop is
faster than the ascent. We will now present theoretical results that quantify
some of these observations. Our results concern the two extremes: programs of
low density and programs of high density.
We start with programs of low density and assume first that they do not
have constraints. In this case, the results do not depend on whether or not we
allow positive rules.
Theorem 1. If m_1 + m_2 = o(n) and P ∈ [m_1R^- + m_2R^+]_n, then P a.a.s. has
a unique answer set.
Proof. (Sketch) Let P be a random program from [m_1R^- + m_2R^+]_n. The directed
graph D(P) can be viewed as a random directed graph with n vertices and
m′ = o(n) edges (m′ ≤ m_1 + m_2, as different rules in P may map onto the same
edge). Thus, D(P) a.a.s. has no directed cycles (the claim can be derived from
the property of random undirected graphs: a random undirected graph with n
vertices and o(n) edges a.a.s. has no cycles [15]). It follows that P a.a.s. has a
unique answer set. □
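The acyclicity claim used in the proof sketch can be probed empirically. The Python sketch below is an assumption-laden illustration, not the paper's method: it samples the edges of D(P) directly as distinct ordered pairs (ignoring whether an edge comes from a rule a ← not b or a ← b), checks for directed cycles with Kahn's topological-sort algorithm, and reports the fraction of acyclic samples. All names are hypothetical.

```python
import random
from collections import defaultdict, deque

def is_acyclic(n, edges):
    """Kahn's algorithm: a directed graph on vertices 0..n-1 is acyclic
    iff every vertex can be removed in topological order."""
    indegree = [0] * n
    successors = defaultdict(list)
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1
    queue = deque(v for v in range(n) if indegree[v] == 0)
    removed = 0
    while queue:
        u = queue.popleft()
        removed += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    return removed == n

def acyclic_fraction(n, m, samples, seed=0):
    """Fraction of random directed graphs with n vertices and m distinct
    edges (no self-loops) that contain no directed cycle."""
    rng = random.Random(seed)
    acyclic = 0
    for _ in range(samples):
        edges = set()
        while len(edges) < m:
            u, v = rng.sample(range(n), 2)
            edges.add((u, v))
        acyclic += is_acyclic(n, edges)
    return acyclic / samples

if __name__ == "__main__":
    # m grows like sqrt(n), one possible choice of m = o(n)
    for n in (100, 1000, 10000):
        print(n, acyclic_fraction(n, int(n ** 0.5), 200))
```

The choice m = ⌊√n⌋ is just one example of a sublinear number of edges; the theorem itself is an asymptotic statement and is not established by such runs.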
If there are constraints in the program, the situation changes. Even a single
constraint of the form ← not a, not b renders a sparse random program
inconsistent.
Corollary 1. If c_1 ≥ 1, m_1 + m_2 = o(n), and P is a random program from
[m_1R^- + m_2R^+ + c_1C^-]_n, then P a.a.s. has no answer sets.
Proof. Let P be a random program from [m_1R^- + m_2R^+ + c_1C^-]_n. Then,
P = P_1 ∪ P_2, where P_1 is a random program from [m_1R^- + m_2R^+]_n and P_2 is a
random program from c_1C^-_n. By Theorem 1, P_1 a.a.s. has a unique answer set,
say M. Since P_1 has o(n) non-constraint rules, |M| = o(n). The probability that