# A Decomposition-Based Approach to OWL DL Ontology Diagnosis

**ABSTRACT** Computing all diagnoses of an inconsistent ontology is important in ontology-based applications. However, the number of diagnoses can be very large. It is impractical to enumerate all diagnoses before identifying the target one to render the ontology consistent. Hence, we propose to represent all diagnoses by multiple sets of partial diagnoses, where the total number of partial diagnoses can be small and the target diagnosis can be directly retrieved from these partial diagnoses. We also propose methods for computing the new representation of all diagnoses in an OWL DL ontology. Experimental results show that computing the new representation of all diagnoses is much easier than directly computing all diagnoses.


Jianfeng Du∗†, Guilin Qi‡, Jeff Z. Pan§, Yi-Dong Shen†

∗Guangdong University of Foreign Studies, Guangzhou 510006, China

Email: jfdu@mail.gdufs.edu.cn

†State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, China

‡School of Computer Science and Engineering, Southeast University, Nanjing 211189, China

§Department of Computing Science, The University of Aberdeen, Aberdeen AB24 3UE, UK


Keywords: ontology diagnosis; description logics; OWL DL; decomposition; inconsistency handling

I. INTRODUCTION

Ontologies provide formal representations of shared knowledge and play a core role in many applications. The W3C proposed the Web Ontology Language (OWL) for modeling ontologies. Among the species in the OWL family, OWL DL is an important one; it corresponds to the Description Logic (DL) $\mathcal{SHOIN}(\mathbf{D})$ [1]. An OWL DL ontology, treated as a set of axioms, consists of a TBox (intensional part) and an ABox (extensional part).

OWL DL ontologies can exhibit two types of logical inconsistency. The first type is called inconsistency: an OWL DL ontology is inconsistent iff it has no models. The second type is called incoherency: an OWL DL ontology is incoherent iff it has some unsatisfiable atomic concepts. Incoherency can be viewed as a potential inconsistency, since adding concept assertions to an incoherent ontology will probably render the ontology inconsistent. Resolving logical inconsistencies is crucial to making an ontology usable under the standard reasoning mechanisms.

Ontology diagnosis, a well-known approach to handling logical inconsistencies, computes diagnoses of an inconsistent ontology, where a diagnosis of a TBox is a minimal subset of axioms in the TBox whose removal renders the TBox coherent, and a diagnosis of an ontology is a minimal subset of axioms in the ontology whose removal renders the ontology consistent [2], [3]. This notion of diagnosis originates from the model-based diagnosis (MBD) field [4]. To resolve logical inconsistencies, the ultimate goal of ontology diagnosis is to identify, among all diagnoses, the target diagnosis, i.e., the one in which all axioms need to be changed.

In the MBD field, the target diagnosis is usually identified by performing a sequence of tests on the set of all diagnoses [5]. By generating a sequence of queries posed to an oracle such as a user, and by using the answer to each query to filter diagnoses, the set of diagnoses finally shrinks to a singleton set containing the target diagnosis. This approach has been adapted by Shchekotykhin and Friedrich [6] to identify the target diagnosis of an inconsistent ontology. Another approach ranks all diagnoses and proposes a diagnosis with a minimal rank or cost [7], [8]; it determines a target diagnosis based on some predefined measures. In contrast, by exploiting additional information, the approach that uses a sequence of tests to identify the target diagnosis is more elaborate and reliable. Hence, in this paper we only consider this approach. Since the approach requires all diagnoses to be computed beforehand, it may become impractical when the number of diagnoses is huge.

A sample study of real-world incoherent TBoxes, presented in Table I, shows that the number of diagnoses can reach the millions, e.g., the MGED TBox has 27,317,365 diagnoses. Even enumerating all these diagnoses is hard. The situation may become even worse for an inconsistent ontology. Consider the following example.

Example 1: Let $O$ be an inconsistent ontology whose TBox consists of the following two axioms

$$\mathit{PhdStudent} \sqsubseteq \mathit{Student}, \qquad \mathit{Student} \sqsubseteq \neg\mathit{Worker},$$

and whose ABox consists of the following $3n$ axioms

$$\mathit{PhdStudent}(a_i), \quad \mathit{Worker}(a_i), \quad \mathit{isFriendOf}(a_i, a_{i+1}) \qquad (i = 1, \ldots, n)$$

and the axiom $a_{n+1} \approx a_1$.

This ontology $O$ has $2^n + 2$ diagnoses. The first one is $\{\mathit{PhdStudent} \sqsubseteq \mathit{Student}\}$, the second one is $\{\mathit{Student} \sqsubseteq \neg\mathit{Worker}\}$, and the others are of the form $\{\alpha_1, \ldots, \alpha_n\}$, where $\alpha_i$ is either $\mathit{PhdStudent}(a_i)$ or $\mathit{Worker}(a_i)$. When $n$ is large, enumerating all diagnoses of $O$ is hard or even impossible. Ontologies of this kind can appear in practice. For example, when an ontology is populated with Web data, concept assertions are retrieved from different Web sources, and an inconsistent ontology like $O$ can be generated.

Table I

SOME REAL-WORLD INCOHERENT TBOXES

| TBox | #C | #R | #A | #D |
|---|---|---|---|---|
| University | 30 | 12 | 46 | 90 |
| Chemical | 48 | 20 | 114 | 6 |
| Tambis | 395 | 100 | 596 | 147 |
| Geography | 400 | 0 | 1621 | 34,200 |
| MGED | 225 | 68 | 1654 | 27,317,365 |
| Proton | 266 | 111 | 1826 | 6,531,840 |

Note: #C/#R/#A/#D are the numbers of atomic concepts, atomic roles, axioms and diagnoses. The first three TBoxes were downloaded from http://www.mindswap.org/ontologies/debugging/, the next two from http://www.few.vu.nl/~schlobac/software.html, and the last one from an experiment of disjointness learning [9].

In this paper, we focus on a general diagnosis problem that generalizes the problems of diagnosing incoherence and inconsistency. It computes all diagnoses of an ontology $O$ w.r.t. a set $O_{pst}$ of unchangeable axioms in $O$, namely all minimal subsets of $O$ that are disjoint from $O_{pst}$ and whose removal renders $O$ consistent. Although the introduction of $O_{pst}$ prunes some diagnoses, the number of diagnoses of $O$ w.r.t. $O_{pst}$ can still be large. For the ontology $O$ given in Example 1, the number of diagnoses w.r.t. $\{\mathit{PhdStudent} \sqsubseteq \mathit{Student}\}$ is still $2^n + 1$.

To deal with the large number of diagnoses, it is critical to compute a new representation of all diagnoses, which should be easier to compute and should facilitate the identification of the target diagnosis. To this end, we propose to represent the set $D$ of diagnoses of an ontology by multiple sets of partial diagnoses $D_1, \ldots, D_n$ (where a partial diagnosis is a subset of some diagnosis) such that

$$D = \mathrm{reduce}(D_1 \bowtie D_2 \bowtie \cdots \bowtie D_n),$$

where $X \bowtie Y$ denotes the cross join of $X$ and $Y$, i.e.,

$$X \bowtie Y = \{S_1 \cup S_2 \mid S_1 \in X,\ S_2 \in Y\},$$

and $\mathrm{reduce}(X)$ denotes the set of irreducible sets in $X$, i.e.,

$$\mathrm{reduce}(X) = \{S \in X \mid \nexists S' \in X : S' \subset S\}.$$

For example, the set of all diagnoses of $O$ given in Example 1 w.r.t. $\{\mathit{PhdStudent} \sqsubseteq \mathit{Student}\}$ can be represented by $D_1, \ldots, D_n$ with $D_i = \{\{\mathit{Student} \sqsubseteq \neg\mathit{Worker}\}, \{\mathit{PhdStudent}(a_i)\}, \{\mathit{Worker}(a_i)\}\}$, where the cardinality of each $D_i$ is three. Hence only $3n$ partial diagnoses need to be computed.
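The cross join and the reduce operator can be illustrated with a short Python sketch. It is not taken from the paper: the function names `cross_join`, `reduce_sets`, `expand` and `example1_partials`, as well as the encoding of axioms as strings, are assumptions made for illustration. The sketch materializes the full cross join, which is exactly the blow-up the new representation is meant to avoid, so it is only suitable for small $n$.

```python
def cross_join(x, y):
    """Cross join: every union of one set from x with one set from y."""
    return {s1 | s2 for s1 in x for s2 in y}


def reduce_sets(x):
    """Keep only the irreducible sets of x, i.e. those with no proper subset in x."""
    return {s for s in x if not any(t < s for t in x)}


def expand(partial_sets):
    """Compute reduce(D_1 join ... join D_n) for a list of sets of partial diagnoses."""
    joined = {frozenset()}
    for d in partial_sets:
        joined = cross_join(joined, d)
    return reduce_sets(joined)


def example1_partials(n):
    """The sets D_1, ..., D_n of Example 1 w.r.t. {PhdStudent SubClassOf Student}."""
    tbox_axiom = "Student SubClassOf not Worker"
    return [
        {
            frozenset({tbox_axiom}),
            frozenset({f"PhdStudent(a{i})"}),
            frozenset({f"Worker(a{i})"}),
        }
        for i in range(1, n + 1)
    ]


diagnoses = expand(example1_partials(3))
print(len(diagnoses))  # 9, i.e. 2**3 + 1
```

With $n = 3$, the nine partial diagnoses expand to the $2^3 + 1$ diagnoses predicted above.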

This new representation of all diagnoses facilitates the identification of the target diagnosis. Suppose there exists a method $M$ which, when given a set $X$ of partial diagnoses, returns a partial diagnosis in $X$ that is a subset of the target diagnosis. Such a method $M$ can be adapted from existing methods for identifying the target diagnosis by performing a sequence of tests, such as the method given in [6]. Then the target diagnosis can be sequentially retrieved from $D_1, \ldots, D_n$ through the method $M$. That is, for each $i$ from 1 to $n$, identify a subset $S_i$ of the target diagnosis in $D_i$ by $M$. Then $\bigcup_{i=1}^{n} S_i$ is the target diagnosis. This is because $\bigcup_{i=1}^{n} S_i$ is a subset of the target diagnosis; if it is a proper subset, then since $\bigcup_{i=1}^{n} S_i \in D_1 \bowtie \cdots \bowtie D_n$, $O \setminus \bigcup_{i=1}^{n} S_i$ must be consistent, contradicting that the target diagnosis is a minimal subset $S$ of $O$ such that $O \setminus S$ is consistent.

We propose two methods for computing the new representation of all diagnoses. Both methods compute a certain set of subsets of $O$, namely $\{O_1, \ldots, O_n\}$, and then separately compute $D_i$ in $O_i$ for all $i \in \{1, \ldots, n\}$. To compute the set $\{O_1, \ldots, O_n\}$, we adopt the method proposed in [10]. That is, we first compile $O$ to a finite propositional program $\Pi$, then decompose $\Pi$ into disjoint subsets, and finally extract $O_1, \ldots, O_n$ from these subsets. To compute each $D_i$, we propose two methods. The first one is a simple adaptation of Reiter's method [4], which computes all minimal inconsistent subsets [11] and then computes all diagnoses from them. The second one combines the method proposed in [12] for enumerating all diagnoses and the QUICKXPLAIN framework [13]. Our experimental results show that computing the new representation of all diagnoses is much easier than directly computing all diagnoses.

Table II

THE SYNTAX AND SEMANTICS OF OWL DL

| Symbol | Semantics |
|---|---|
| $\top$ | $\Delta^I$ |
| $\bot$ | $\emptyset$ |
| $\{a\}$ | $\{a^I\}$ |
| $\neg C$ | $\Delta^I \setminus C^I$ |
| $C \sqcap D$ | $C^I \cap D^I$ |
| $C \sqcup D$ | $C^I \cup D^I$ |
| $\exists R.C$ | $\{x \in \Delta^I \mid \exists y : (x,y) \in R^I,\ y \in C^I\}$ |
| $\forall R.C$ | $\{x \in \Delta^I \mid \forall y : (x,y) \in R^I \rightarrow y \in C^I\}$ |
| $\geq n\,S$ | $\{x \in \Delta^I \mid \lvert\{y \in \Delta^I \mid (x,y) \in S^I\}\rvert \geq n\}$ |
| $\leq m\,S$ | $\{x \in \Delta^I \mid \lvert\{y \in \Delta^I \mid (x,y) \in S^I\}\rvert \leq m\}$ |

| Part | Axiom Name | Syntax | Semantics |
|---|---|---|---|
| TBox | concept inclusion axiom | $C \sqsubseteq D$ | $C^I \subseteq D^I$ |
| TBox | role inclusion axiom | $R_1 \sqsubseteq R_2$ | $R_1^I \subseteq R_2^I$ |
| TBox | transitivity axiom | $\mathrm{Trans}(R)$ | $R^I \circ R^I \subseteq R^I$ |
| ABox | concept assertion | $C(a)$ | $a^I \in C^I$ |
| ABox | role assertion | $R(a,b)$ | $(a^I, b^I) \in R^I$ |
| ABox | equality assertion | $a \approx b$ | $a^I = b^I$ |
| ABox | inequality assertion | $a \not\approx b$ | $a^I \neq b^I$ |

II. A GENERAL DIAGNOSIS PROBLEM IN OWL DL

The Web Ontology Language (OWL) comes with three species, OWL Lite, OWL DL and OWL Full, with increasing expressivity. We only consider OWL DL because OWL DL covers OWL Lite and OWL Full is undecidable. OWL DL corresponds to the Description Logic (DL) $\mathcal{SHOIN}$ [1]. Although OWL DL also contains datatypes, we do not complicate our presentation by considering them here.

An OWL DL vocabulary consists of a set $N_C$ of atomic concepts, a set $N_R$ of atomic roles, and a set $N_I$ of individuals. A role is either an atomic role $r \in N_R$ or an inverse role $r^-$ with $r \in N_R$. By $R^-$ we denote the inverse of a role $R$, defined as $r^-$ when $R = r$, and $r$ when $R = r^-$. The set of OWL DL concepts is recursively defined using atomic concepts $A \in N_C$ and the constructor symbols given in Table II. An ontology $O$, treated as a set of axioms in this paper, consists of a TBox and an ABox, where the axioms are specified in Table II.

An interpretation $I = (\Delta^I, \cdot^I)$ of $O$ consists of a domain $\Delta^I$ and a function $\cdot^I$ that maps every $A \in N_C$ to a set $A^I \subseteq \Delta^I$, every $r \in N_R$ to a binary relation $r^I \subseteq \Delta^I \times \Delta^I$, and every $a \in N_I$ to $a^I \in \Delta^I$. The interpretation is extended to roles by defining $(r^-)^I$ as $\{(x,y) \mid (y,x) \in r^I\}$ and to concepts according to Table II, where $|S|$ denotes the cardinality of a set $S$. An interpretation $I$ satisfies an axiom $\alpha$ if the respective condition to the right of the axiom in Table II holds; $I$ is a model of $O$ if $I$ satisfies every axiom in $O$. $O$ is said to be consistent if it has a model. A concept $C$ is said to be satisfiable in $O$ if there exists a model $I$ of $O$ such that $C^I \neq \emptyset$. $O$ is said to be coherent if every atomic concept in $O$ is satisfiable.

In this paper, we focus on a general diagnosis problem which computes all diagnoses of an ontology $O$ w.r.t. a set $O_{pst}$ of unchangeable axioms in $O$, where a diagnosis $S$ of $O$ w.r.t. $O_{pst}$ (simply called a diagnosis if $O$ and $O_{pst}$ are clear from the context) is a subset of axioms in $O$ such that $S \cap O_{pst} = \emptyset$, $O \setminus S$ is consistent, and for all proper subsets $S'$ of $S$, $O \setminus S'$ is inconsistent.
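The three conditions of this definition can be checked directly, as in the following Python sketch. It is an illustration only: a real implementation would call a DL reasoner for the consistency test, whereas the hypothetical `make_oracle` helper below simulates consistency by a list of known conflict sets (a set of axioms counts as inconsistent iff it contains a whole conflict set). Because consistency is preserved under taking subsets, it suffices to test the subsets of $S$ obtained by dropping a single axiom.

```python
def make_oracle(conflicts):
    """Toy consistency check: axioms are consistent iff they contain no whole conflict set."""
    return lambda axioms: not any(c <= axioms for c in conflicts)


def is_diagnosis(s, ontology, fixed, is_consistent):
    """Check whether s is a diagnosis of ontology w.r.t. the unchangeable axioms in fixed."""
    if s & fixed or not s <= ontology:
        return False                      # s must avoid the unchangeable axioms
    if not is_consistent(ontology - s):
        return False                      # removing s must restore consistency
    # Minimality: putting any single axiom of s back must break consistency again.
    return all(not is_consistent(ontology - (s - {e})) for e in s)


# Example 1 with n = 2 (role and equality assertions omitted, they join no conflict).
AX1 = "PhdStudent SubClassOf Student"
AX2 = "Student SubClassOf not Worker"
O = frozenset([AX1, AX2, "PhdStudent(a1)", "Worker(a1)", "PhdStudent(a2)", "Worker(a2)"])
consistent = make_oracle(
    [frozenset({AX1, AX2, f"PhdStudent(a{i})", f"Worker(a{i})"}) for i in (1, 2)]
)

fixed = frozenset({AX1})
print(is_diagnosis(frozenset({"PhdStudent(a1)", "Worker(a2)"}), O, fixed, consistent))  # True
print(is_diagnosis(frozenset({"PhdStudent(a1)"}), O, fixed, consistent))  # False
```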

The problem of diagnosing an incoherent TBox $\mathcal{T}$, namely computing all diagnoses of $\mathcal{T}$, can be reduced to the above problem, because the set of diagnoses of $\mathcal{T}$ is the set of diagnoses of $\mathcal{T} \cup \mathcal{A}$ w.r.t. $\mathcal{A}$, where $\mathcal{A} = \{A(a_A) \mid A \text{ is an atomic concept in } \mathcal{T},\ a_A \text{ is a fresh individual for } A\}$. It should be noted that any method for identifying the target diagnosis from all diagnoses w.r.t. $\emptyset$ can still work on all diagnoses w.r.t. $O_{pst}$. This is because the target diagnosis cannot contain any unchangeable axiom, so the target diagnosis is also a diagnosis of $O$ w.r.t. $O_{pst}$.

III. COMPUTING ALL DIAGNOSES

Since the number of diagnoses of $O$ w.r.t. $O_{pst}$ can be very large, we propose to represent all diagnoses by multiple sets of partial diagnoses $D_1, \ldots, D_n$ such that $\mathrm{reduce}(D_1 \bowtie D_2 \bowtie \cdots \bowtie D_n)$ is equal to the set of diagnoses of $O$ w.r.t. $O_{pst}$. In the following, we propose methods for computing $D_1, \ldots, D_n$. The key idea is to compute a set of subsets of $O$, namely $\{O_1, \ldots, O_n\}$, such that $O \setminus \bigcup_{i=1}^{n} S_i$ is consistent for any diagnosis $S_i$ of $O_i$ w.r.t. $O_{pst} \cap O_i$, and then to separately compute the set $D_i$ of diagnoses of $O_i$ w.r.t. $O_{pst} \cap O_i$ in $O_i$. The correctness is shown in the following theorem.

Theorem 1: Let $\{O_1, \ldots, O_n\}$ be a set of subsets of $O$ such that $O \setminus \bigcup_{i=1}^{n} S_i$ is consistent for any diagnosis $S_i$ of $O_i$ w.r.t. $O_{pst} \cap O_i$. Let $D_i$ be the set of diagnoses of $O_i$ w.r.t. $O_{pst} \cap O_i$, and $D$ be the set of diagnoses of $O$ w.r.t. $O_{pst}$. Then (1) any set in $D_i$ is a subset of some set in $D$ and (2) $D = \mathrm{reduce}(D_1 \bowtie D_2 \bowtie \cdots \bowtie D_n)$.

Proof: (1) Let $S_i$ be a set in $D_i$. If $O \setminus S_i$ is inconsistent, since $O_i \subseteq O$, there must be some $S \in D$ such that $S_i \subseteq S$. If $O \setminus S_i$ is consistent, then there is some $S \in D$ such that $S \subseteq S_i$. Since $S_i$ is a diagnosis of $O_i$ w.r.t. $O_{pst} \cap O_i$ and $O_i \setminus S \subseteq O \setminus S$ is consistent, $S = S_i$.

(2) We first show that for any set $S \in D$, $S \in \mathrm{reduce}(D_1 \bowtie D_2 \bowtie \cdots \bowtie D_n)$. Let $S_i = S \cap O_i$ for all $i \in \{1, \ldots, n\}$. Since $O \setminus S$ is consistent, $O_i \setminus S_i = O_i \setminus S$ is also consistent. Hence there exists $S'_i \in D_i$ such that $S'_i \subseteq S_i$. By the condition on $\{O_1, \ldots, O_n\}$, $O \setminus \bigcup_{i=1}^{n} S'_i$ is consistent. Since $S \in D$ and $\bigcup_{i=1}^{n} S'_i \subseteq S$, we have $\bigcup_{i=1}^{n} S'_i = S$, thus $S \in D_1 \bowtie D_2 \bowtie \cdots \bowtie D_n$. Suppose $S \notin \mathrm{reduce}(D_1 \bowtie D_2 \bowtie \cdots \bowtie D_n)$, then there exists $S' \subset S$ such that $S' \in \mathrm{reduce}(D_1 \bowtie D_2 \bowtie \cdots \bowtie D_n)$. But then, since $O \setminus S'$ is consistent and $S' \cap O_{pst} = \emptyset$, $S$ will not be a diagnosis of $O$ w.r.t. $O_{pst}$, contradiction.

We then show that for any set $S \in \mathrm{reduce}(D_1 \bowtie D_2 \bowtie \cdots \bowtie D_n)$, $S \in D$. Suppose $S \notin D$, then since $O \setminus S$ is consistent and $S \cap O_{pst} = \emptyset$, there exists $S' \in D$ such that $S' \subset S$. But then, by the conclusion shown in the previous paragraph, we have $S' \in \mathrm{reduce}(D_1 \bowtie D_2 \bowtie \cdots \bowtie D_n)$, contradicting that $S \in \mathrm{reduce}(D_1 \bowtie D_2 \bowtie \cdots \bowtie D_n)$.

According to the above theorem, it is crucial to compute from $O$ a set of subsets $\{O_1, \ldots, O_n\}$, called a diagnosis-preserving set for $O$ w.r.t. $O_{pst}$, such that $O \setminus \bigcup_{i=1}^{n} S_i$ is consistent for any diagnosis $S_i$ of $O_i$ w.r.t. $O_{pst} \cap O_i$.

A. Computing a Diagnosis-preserving Set

For computing a diagnosis-preserving set for $O$ w.r.t. $O_{pst}$, a direct decomposition of $O$ based on the connectivity of entities in $O$ is generally ineffective. To obtain smaller subsets from $O$, we adapt the method proposed in [10] to compute a diagnosis-preserving set for $O$. The method is outlined in the following.

Initially, the method compiles $O$ to a labeled propositional program $\Pi$ that consists of a finite set of labeled clauses of the form $(cl, ax)$, where the label $ax$ is the axiom from which the ground clause $cl$ is translated; $ax$ is the empty label $\epsilon$ if $cl$ is a clause used to axiomatize equality. There are three steps in this compilation.

The first step is translating $O$ to a labeled first-order logic (FOL) program with equality $P$, which consists of a set of labeled clauses of the form $(cl, ax)$, where $ax$ is the axiom from which the clause $cl$ is translated. This translation is adapted from some well-known methods, including structural transformation (e.g., as given in [14]) and clausification (with skolemization), by adding labels. For each axiom $ax$ in $O$, this translation is applied to yield a set of labeled clauses for $ax$. For example, let $ax_1$ be the first axiom in the TBox of the ontology $O$ given in Example 1; then applying this translation to $ax_1$ will yield the singleton set $\{(\neg\mathit{PhdStudent}(x) \vee \mathit{Student}(x), ax_1)\}$. By $P$ we denote the union of all resulting sets returned by the application of this translation to all axioms in $O$.

The second step is transforming $P$ to a labeled FOL program without equality $P'$ by adding some labeled clauses $(cl, \epsilon)$ to $P$. These clauses $cl$ are standard clauses for axiomatizing equality, e.g., as defined in [15]. $P'$ is simply set to $P$ if no clause in $P$ has a positive equational literal.

The last step is transforming $P'$ to $\Pi$. Instead of directly grounding $P'$, which may not terminate when $P'$ contains function symbols, this step computes a finite superset $\Pi$ of $\lambda(\Pi')$ for a convergent mapping function $\lambda$ for the bottom-up grounding [10] $\Pi'$ of $P'$. The notion of convergent mapping function was first given in [16] and then adapted in [10] to deal with labeled propositional programs $\Pi^\dagger$. A convergent mapping function $\lambda$ for $\Pi^\dagger$ is defined as a mapping function from ground terms occurring in $\Pi^\dagger$ to constants occurring in $\Pi^\dagger$, such that (i) for every functional term $f_1(\ldots f_n(a))$ (where $a$ is a constant) occurring in $\Pi^\dagger$, $\lambda(f_1(\ldots f_n(a))) = \lambda(a)$, and (ii) for every equational atom $s \approx t$ positively occurring in $\Pi^\dagger$, $\lambda(s) = \lambda(t)$. Here, we simply use $\lambda(\Pi^\dagger)$ to denote the result obtained from $\Pi^\dagger$ by replacing every occurrence of a ground term $t$ with $\lambda(t)$. Note that $\lambda(\Pi')$ must be finite because it contains no functional terms but only constants occurring in $P'$. Using the method proposed in [16], $\Pi$ is directly computed from $P'$.
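Conditions (i) and (ii) of a convergent mapping function can be made concrete with a small Python sketch. It is a toy under strong assumptions and not the construction of [10] or [16]: ground terms are plain strings built from unary function symbols (such as `f(g(a))`), the terms and the positively occurring equalities are given up front rather than derived from a grounding, and the names `innermost_constant` and `convergent_mapping` are hypothetical. The sketch merges constants with a union-find structure and maps every term to the representative of its innermost constant.

```python
def innermost_constant(term):
    """Return the constant a inside a ground term of the form f1(...fn(a))."""
    return term.rstrip(")").rsplit("(", 1)[-1]


def convergent_mapping(terms, positive_equalities):
    """Map each term to a constant such that conditions (i) and (ii) hold."""
    parent = {}

    def find(c):
        parent.setdefault(c, c)
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    # Condition (ii): both sides of a positive equality share one representative.
    for s, t in positive_equalities:
        rs, rt = find(innermost_constant(s)), find(innermost_constant(t))
        parent[max(rs, rt)] = min(rs, rt)

    # Condition (i): a functional term is mapped like its innermost constant.
    return {term: find(innermost_constant(term)) for term in terms}


# The equality a3 = a1 mirrors the axiom a_{n+1} = a_1 of Example 1 with n = 2.
lam = convergent_mapping(["a1", "a2", "a3", "f(a1)", "f(f(a3))"], [("a3", "a1")])
print(lam["f(f(a3))"])  # a1
```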

Let $Cl(\Pi)$ and $Ax(\Pi)$ respectively denote the clause part and the axiom part of $\Pi$, i.e., $Cl(\Pi) = \{cl \mid (cl, ax) \in \Pi\}$ and $Ax(\Pi) = \{ax \in O \mid (cl, ax) \in \Pi\}$, and let $\rho(\Pi, O')$ denote the projection of $\Pi$ on a subset $O'$ of $O$, defined as $\{(cl, ax) \in \Pi \mid ax \in O' \cup \{\epsilon\}\}$.

After $\Pi$ is computed, the method first computes a satisfiable core $\Pi_0$ of $\Pi$, then decomposes $\Pi \setminus \Pi_0$ into maximal connected components $\Pi_1, \ldots, \Pi_n$, and finally sets $O_i$ as $Ax(\rho(\Pi_i, O))$. A satisfiable core $\Pi_0$ of $\Pi$ is a subset of $\Pi$ such that $Cl(\Pi_0) \cup S$ is satisfiable for any satisfiable subset $S$ of $Cl(\Pi \setminus \Pi_0)$. It is computed by some fix-point computation methods [10]. A connected component of $\Pi$ is a subset of $\Pi$ in which any two clauses are connected through common ground atoms. A maximal connected component of $\Pi$ is a connected component of $\Pi$ that is not a proper subset of any other connected component of $\Pi$.
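The decomposition into maximal connected components can be sketched in Python as follows. The sketch covers only this one step and rests on assumptions: the satisfiable core is taken to be already removed, a ground clause is a set of literal strings with a leading `-` marking negation, the empty label is represented by `None`, and the function names are hypothetical. Clauses sharing a ground atom are merged with a union-find structure.

```python
from collections import defaultdict


def atoms(clause):
    """Ground atoms of a clause; a leading '-' on a literal marks negation."""
    return {literal.lstrip("-") for literal in clause}


def components(labeled_clauses):
    """Group labeled clauses (clause, axiom) into maximal connected components."""
    parent = list(range(len(labeled_clauses)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    first_seen = {}  # ground atom -> index of the first clause containing it
    for i, (clause, _) in enumerate(labeled_clauses):
        for atom in atoms(clause):
            j = first_seen.setdefault(atom, i)
            parent[find(i)] = find(j)  # clauses sharing an atom are connected

    groups = defaultdict(list)
    for i, item in enumerate(labeled_clauses):
        groups[find(i)].append(item)
    return list(groups.values())


def axioms_of(component):
    """Axiom part of a component, ignoring the empty label (None)."""
    return {ax for _, ax in component if ax is not None}


# Ground clauses for Example 1 with n = 2, restricted to the concept axioms.
clauses = []
for i in (1, 2):
    clauses += [
        (frozenset({f"-PhdStudent(a{i})", f"Student(a{i})"}), "ax1"),
        (frozenset({f"-Student(a{i})", f"-Worker(a{i})"}), "ax2"),
        (frozenset({f"PhdStudent(a{i})"}), f"PhdStudent(a{i})"),
        (frozenset({f"Worker(a{i})"}), f"Worker(a{i})"),
    ]

for component in components(clauses):
    print(sorted(axioms_of(component)))
```

In this toy input each individual yields one component, so the two TBox axioms appear in both extracted subsets.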

Let $S_i$ be a diagnosis of $O_i$ w.r.t. $O_{pst} \cap O_i$ for all $i \in \{1, \ldots, n\}$. It has been shown in [10] that $O \setminus \bigcup_{i=1}^{n} S_i$ is consistent. Hence the set $\{O_1, \ldots, O_n\}$ is a diagnosis-preserving set for $O$ w.r.t. $O_{pst}$.

B. Two Methods for Computing All Diagnoses

We develop two methods for computing all diagnoses of $O$ w.r.t. $O_{pst}$. These methods are also used to compute all diagnoses of $O_i$ w.r.t. $O_{pst} \cap O_i$ in our experiments.

The first method, adapted from the diagnosis method proposed by Reiter [4], computes the set $M$ of minimal inconsistent subsets (MISs) of $O$, and then computes the set $D$ of minimal hitting sets (MHSs) of $M$ that do not contain axioms in $O_{pst}$. Since the set of MHSs of $M$ is the set of diagnoses of $O$ w.r.t. $\emptyset$ [4], $D$ is the set of diagnoses of $O$ w.r.t. $O_{pst}$. In this method, we employ an efficient algorithm, proposed in [11], to compute all MISs. Moreover, we employ the well-known Hitting Set Tree algorithm [4] to compute all MHSs. This method is called the MIS-based method.

    The main procedure EnumerateAllDiagnoses(O, Opst)
    Input: An ontology O and a set Opst of unchangeable axioms.
    Output: The set of diagnoses of O w.r.t. Opst.
    Comments: The maximum subscript n of diagnoses and the
              diagnoses D0, D1, ..., Dn of O w.r.t. Opst are used globally.
      n ← 0;
      ConstructDiagnosis(0, ∅);
      return {D0, ..., Dn};

    Procedure ConstructDiagnosis(i, M)
      if i = n then
        if M ∪ Opst is inconsistent then exit;
        Di ← O \ FindMaximalSuperset(M);
        n ← n + 1;
      end if
      if Di ∩ M ≠ ∅ then ConstructDiagnosis(i + 1, M)
      else for each e ∈ Di such that
             M ∪ {e} is a minimal hitting set (MHS) of {D0, ..., Di}
           do ConstructDiagnosis(i + 1, M ∪ {e});
      end if

    Function FindMaximalSuperset(M)
      Otmp ← O \ (Opst ∪ M);
      return mcs(Opst ∪ M, Otmp) ∪ Opst ∪ M;

Figure 1. The method adapted from [12] for computing all diagnoses
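The second stage of the MIS-based method, going from MISs to diagnoses, can be sketched in Python. The sketch assumes the MISs are already available and replaces the Hitting Set Tree algorithm of [4] by a plain brute-force search over candidate sets in order of increasing size, which is only viable for tiny inputs. The function name `minimal_hitting_sets` and the string encoding of axioms are assumptions for illustration.

```python
from itertools import combinations


def minimal_hitting_sets(conflicts, forbidden=frozenset()):
    """All minimal sets that intersect every conflict and avoid the forbidden axioms."""
    universe = sorted(set().union(*conflicts) - forbidden)
    found = []
    for size in range(len(universe) + 1):
        for candidate in combinations(universe, size):
            s = frozenset(candidate)
            if any(h <= s for h in found):
                continue  # contains a smaller hitting set, hence not minimal
            if all(s & c for c in conflicts):
                found.append(s)
    return found


# MISs of Example 1 with n = 2: one conflict per individual.
AX1 = "PhdStudent SubClassOf Student"
AX2 = "Student SubClassOf not Worker"
mis = [frozenset({AX1, AX2, f"PhdStudent(a{i})", f"Worker(a{i})"}) for i in (1, 2)]

print(len(minimal_hitting_sets(mis)))                     # 6, i.e. 2**2 + 2
print(len(minimal_hitting_sets(mis, frozenset({AX1}))))   # 5, i.e. 2**2 + 1
```

The two counts agree with the numbers of diagnoses stated for Example 1 w.r.t. $\emptyset$ and w.r.t. $\{\mathit{PhdStudent} \sqsubseteq \mathit{Student}\}$.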

The second method, adapted from the method proposed by Satoh and Uno [12], computes all diagnoses of $O$ w.r.t. $O_{pst}$ by alternately generating the set $D$ of diagnoses of $O$ w.r.t. $O_{pst}$ and computing the set $M$ of MHSs of $D$. This method does not compute all MISs beforehand, thus we call it the direct method. The pseudo-code is given in Figure 1. In the procedure ConstructDiagnosis($i$, $M$), the function FindMaximalSuperset($M$) calls $\mathrm{mcs}(O_{pst} \cup M, O_{tmp})$ and returns a maximal consistent superset $M^+$ of $M \cup O_{pst}$, where $M$ is an MHS of $\{D_0, \ldots, D_{i-1}\}$, $O_{tmp} = O \setminus (O_{pst} \cup M)$, and $\mathrm{mcs}(B, C)$, in which $B$ (resp. $C$) is treated as an unchangeable (resp. a changeable) part, denotes a maximal subset $S$ of $C$ such that $B \cup S$ is consistent. It follows that $O \setminus M^+$ is a diagnosis of $O$ w.r.t. $O_{pst}$ and is different from $D_0, \ldots, D_{i-1}$. We exploit the QUICKXPLAIN framework [13] to compute $\mathrm{mcs}(B, C)$. Let $C_1$ consist of an arbitrary half of the axioms in $C$ (i.e., $|C|/2$ axioms, rounded to an integer), $C_2 = C \setminus C_1$ and $\Delta_1 = \mathrm{mcs}(B, C_1)$. It is shown in [13] that $\mathrm{mcs}(B, C)$ is equal to $\Delta_1 \cup \mathrm{mcs}(B \cup \Delta_1, C_2)$ and can be computed in $B \cup C_1$ and $B \cup \Delta_1 \cup C_2$ in turn. We use this way to compute $\mathrm{mcs}(O_{pst} \cup M, O_{tmp})$. It is essentially a divide-and-conquer method and will be more efficient than the way suggested in [12], which handles the axioms in $O_{tmp}$ one by one.
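The recursion $\mathrm{mcs}(B, C) = \Delta_1 \cup \mathrm{mcs}(B \cup \Delta_1, C_2)$ can be written down as a short Python function. The sketch below is an illustration, not the authors' Java implementation: the consistency check is a toy oracle built from known conflict sets by the hypothetical helper `make_oracle`, the split point rounds down, and $B$ is assumed to be consistent.

```python
def make_oracle(conflicts):
    """Toy consistency check: axioms are consistent iff they contain no whole conflict set."""
    return lambda axioms: not any(c <= axioms for c in conflicts)


def mcs(base, candidates, is_consistent):
    """Return a maximal S within candidates such that base | S is consistent.

    Assumes that base itself is consistent.
    """
    candidates = list(candidates)
    if not candidates or is_consistent(base | set(candidates)):
        return set(candidates)           # everything can be kept at once
    if len(candidates) == 1:
        return set()                     # the single axiom conflicts with base
    half = len(candidates) // 2          # split point (rounding down is an assumption)
    delta1 = mcs(base, candidates[:half], is_consistent)
    return delta1 | mcs(base | delta1, candidates[half:], is_consistent)


# Example 1 with n = 2 and O_pst = {PhdStudent SubClassOf Student}.
AX1 = "PhdStudent SubClassOf Student"
AX2 = "Student SubClassOf not Worker"
O = frozenset([AX1, AX2, "PhdStudent(a1)", "Worker(a1)", "PhdStudent(a2)", "Worker(a2)"])
consistent = make_oracle(
    [frozenset({AX1, AX2, f"PhdStudent(a{i})", f"Worker(a{i})"}) for i in (1, 2)]
)

fixed = frozenset({AX1})
kept = mcs(fixed, sorted(O - fixed), consistent)
diagnosis = O - (fixed | kept)
print(sorted(diagnosis))  # ['Worker(a1)', 'Worker(a2)']
```

Whenever a whole half is consistent with the base, it is accepted with a single consistency check, which is where the saving over the one-by-one strategy comes from.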

The direct method computes all MHSs of the set of generated diagnoses using depth-first search. After an MHS $M$ of the set of generated diagnoses is computed, the consistency of $M \cup O_{pst}$ is checked. If $M \cup O_{pst}$ is inconsistent, $M$ will be an MHS of all diagnoses of $O$ w.r.t. $O_{pst}$ [12], thus the method computes another MHS of the set of generated diagnoses; otherwise, the method computes a diagnosis of $O$ w.r.t. $O_{pst}$ (whose complement is a maximal consistent superset of $M \cup O_{pst}$) and computes an MHS (which can still be $M$) of the new set of generated diagnoses. This method guarantees that a newly generated MHS is different from previously generated ones, while all diagnoses of $O$ w.r.t. $O_{pst}$ are finally generated [12].
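The alternation between hitting sets and diagnoses can be illustrated by a simplified Python sketch. It departs from Figure 1 in two labeled ways: the depth-first enumeration of MHSs is replaced by a brute-force search for one suitable hitting set per round, and the maximal consistent superset is grown one axiom at a time instead of by the divide-and-conquer computation of mcs. The toy consistency oracle and all function names are assumptions for illustration.

```python
from itertools import combinations


def make_oracle(conflicts):
    """Toy consistency check: axioms are consistent iff they contain no whole conflict set."""
    return lambda axioms: not any(c <= axioms for c in conflicts)


def next_seed(found, changeable, fixed, is_consistent):
    """Smallest set M that hits every found diagnosis with fixed | M consistent, or None."""
    for size in range(len(changeable) + 1):
        for candidate in combinations(changeable, size):
            m = frozenset(candidate)
            if all(m & d for d in found) and is_consistent(fixed | m):
                return m  # smallest such set, hence a minimal hitting set
    return None


def enumerate_diagnoses(ontology, fixed, is_consistent):
    """All diagnoses of ontology w.r.t. fixed, generated one per round."""
    changeable = sorted(ontology - fixed)
    found = []
    while True:
        seed = next_seed(found, changeable, fixed, is_consistent)
        if seed is None:
            return found  # no consistent hitting set is left: all diagnoses found
        kept = set(fixed | seed)
        for ax in changeable:  # grow to a maximal consistent superset, one axiom at a time
            if is_consistent(kept | {ax}):
                kept.add(ax)
        found.append(frozenset(ontology - kept))  # the complement is a new diagnosis


# Example 1 with n = 2.
AX1 = "PhdStudent SubClassOf Student"
AX2 = "Student SubClassOf not Worker"
O = frozenset([AX1, AX2, "PhdStudent(a1)", "Worker(a1)", "PhdStudent(a2)", "Worker(a2)"])
consistent = make_oracle(
    [frozenset({AX1, AX2, f"PhdStudent(a{i})", f"Worker(a{i})"}) for i in (1, 2)]
)

print(len(enumerate_diagnoses(O, frozenset({AX1}), consistent)))  # 5
```

Each seed contains an axiom of every diagnosis found so far, so the complement of its maximal consistent superset differs from all of them; this is why every round yields a new diagnosis.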

IV. EXPERIMENTAL EVALUATION

We compared four methods for computing all diagnoses. The first two methods are the MIS-based method (M1) and the direct method (M2) presented in the previous section. The implementations of these two methods (in Java) use the Pellet [17] API (version 2.2.2) to compute all MISs and to check consistency, respectively. The last two methods compute the new representation of all diagnoses. To compute all diagnoses of an extracted subset, one method applies the MIS-based method and is called the decomposition-and-MIS based method (M3); the other applies the direct method and is called the decomposition-based direct method (M4).

We used two groups of inconsistent ontologies. The ontologies in the first group were obtained from the incoherent TBoxes presented in Table I by adding a concept assertion $A(a_A)$ for every atomic concept $A$, where $a_A$ is a fresh individual for $A$. We tested these ontologies on computing all diagnoses w.r.t. the set of added concept assertions. The set of these diagnoses is equal to the set of diagnoses of the original TBox. The characteristics of the ontologies and the numbers of diagnoses are shown in Table I.

The ontologies in the second group were obtained from the University Benchmark (UOBM-Lite) [18] by inserting a specified number of conflicts using the Injector tool presented in [8], where a conflict is a set of axioms violating a functional role restriction or a disjointness constraint. By UOBM-Lite$n$+$m$ we denote a UOBM-Lite ontology with axioms of $n$ universities and with $m$ conflicts inserted. We generated UOBM-Lite$n$+$m$ ontologies for all combinations of $n \in \{1, 10\}$ and $m \in \{50, 100, 150, 200, 250, 300\}$ and obtained 12 UOBM-Lite$n$+$m$ ontologies in total. All these ontologies have large ABoxes. The number of axioms in an ABox ranges from 246,144 (UOBM-Lite1+50) to 2,097,873 (UOBM-Lite10+300). We tested these ontologies on computing all diagnoses w.r.t. the TBox.

All experiments were conducted on a PC with a Pentium Dual Core 2.60GHz CPU and 2GB RAM, running Windows XP, where the maximum Java heap size was set to 1280MB. See http://jfdu.limewebs.com/diagnosis/ for more details on our experiments, including implementations of the four methods, auxiliary tools and test ontologies.

The test results for the first group are shown in Table III. For each of the last three ontologies, since the number of diagnoses is very large, the direct computation of all diagnoses by M1 or M2 exceeds 12 hours. In contrast, since the total number of partial diagnoses is rather small, the computation of the new representation of all diagnoses is efficient. For the Tambis ontology, the number of diagnoses

Table III

THE EXECUTION TIME (IN SECONDS) OF DIFFERENT METHODS

Ontology

University

Chemical

Tambis

Geography

MGED

Proton

M1M2M3

19

113

TO

52

264

74

M4

19

96

#PDs

1.

2.

3.

4.

5.

6.

87

2

40

TO

TO

TO

TO

TO

514

34,1637,117 11,219

TO

TO

TO

5744

390

77

1972

165

Note: TO means that the run exceeded the 12-hour time limit. #PDs is the total number of computed partial diagnoses.

is smaller than the total number of partial diagnoses, but since partial diagnoses are computed in extracted subsets rather than in the whole ontology, M4 is still faster than M2. There are some undesirable results: M3 exceeds 12 hours while computing all MISs in some extracted subset of the Tambis ontology, whereas M1 exceeds 12 hours while computing all MISs for the last five ontologies. A probable reason for these undesirable results is that the method for computing all MISs [11], which is applied in M1 and M3, facilitates MIS computation by simultaneously computing diagnoses w.r.t. $\emptyset$. When there are too many diagnoses w.r.t. $\emptyset$, M1 and M3 may not be practical. For the first two ontologies, the computation of the new representation is generally slower than the direct computation of all diagnoses. This is because the first two ontologies are small and the compilation process in the course of computing the new representation is relatively costly.

The test results for the second group are shown in Figure 2. Since each test ontology in this group has about 350 diagnoses, the direct computation of all diagnoses exceeds 12 hours and even seems impossible, so the results for the direct computation are not shown. The figure shows that computing the new representation of all diagnoses is feasible, except that M3 exceeds 12 hours for UOBM-Lite1+300, as it needs to compute many diagnoses w.r.t. $\emptyset$ to facilitate MIS computation in some extracted subsets.

As can be seen from Figure 2, the execution time on UOBM-Lite1 increases more sharply than the execution time on UOBM-Lite10 when the number of inserted conflicts increases. This is because in UOBM-Lite10 ontologies, the inserted conflicts are distributed over more extracted subsets and the computation of partial diagnoses is easier. Figure 2 and Table III also show that M3 is sometimes faster than M4. This is because the computation of MISs in M3 is integrated into the consistency check (i.e., it is a glass-box method), while the computation of diagnoses in M4 calls a consistency checker as an oracle (i.e., it is a black-box method); hence when the number of diagnoses w.r.t. $\emptyset$ is close to the number of diagnoses w.r.t. $O_{pst}$, M3 can be faster than M4.

To sum up, the main implication of our evaluation is that computing the new representation of all diagnoses is much easier than directly computing all diagnoses, especially for large ontologies.