# DL-Lite_R in the Light of Propositional Logic for Decentralized Data Management

**ABSTRACT** This paper provides a decentralized data model and associated algorithms for peer data management systems (PDMS) based on the DL-Lite_R description logic. Our approach relies on reducing query reformulation and consistency checking for DL-Lite_R to reasoning in propositional logic. This enables a straightforward deployment of DL-Lite_R PDMSs on top of SomeWhere, a scalable propositional peer-to-peer inference system. We also show how to use the state-of-the-art MiniCon algorithm for rewriting queries using views in DL-Lite_R in the centralized and decentralized cases.


N. Abdallah and F. Goasdoué

LRI: Univ. Paris-Sud, CNRS, and INRIA

{nada,fg}@lri.fr

M.-C. Rousset

LIG: Univ. of Grenoble, CNRS, and INRIA

Marie-Christine.Rousset@imag.fr


## 1 Introduction

Ontologies are the backbone of the Semantic Web: they provide a conceptual view of data and services made available worldwide through the Web. Description logics are the formal foundations of the OWL ontology web language recommended by the W3C. They cover a broad spectrum of logical languages for which reasoning is decidable, with a computational complexity that varies depending on the set of constructors allowed in the language. Answering conjunctive queries over ontologies is a reasoning problem of major interest for the Semantic Web, and its associated decision problem is not reducible to (un)satisfiability checking. The DL-Lite family [Calvanese et al., 2007] has been specially designed to guarantee that query answering is polynomial in data complexity. This is achieved by a query reformulation approach which (1) computes the most general conjunctive queries which, together with the axioms in the Tbox, entail the initial query and (2) evaluates each of those query reformulations against the Abox seen as a relational database. Such an approach has the practical interest that it makes it possible to use an SQL engine for the second step, thus taking advantage of well-established query optimization strategies supported by standard relational data management systems. The reformulation step is necessary for guaranteeing the completeness of the answers but is a reasoning step independent of the data. A major result in [Calvanese et al., 2007] is that DL-Lite_R is one of the maximal fragments of the DL-Lite family supporting tractable query answering over large amounts of data. DL-Lite_R is a fragment of OWL-DL¹ which extends RDFS² with interesting constructors such as inverse roles and disjointness between concepts and between roles. RDFS is the first standard of the W3C concerning the Semantic Web. Its use for associating semantic metadata to web resources is rapidly spreading at a large scale, as shown by the Billion Triple Track of the Semantic Web Challenge (http://challenge.semanticweb.org/).

For scalability and robustness but also for data protection, it is important to investigate a fully decentralized model of the Semantic Web, viewed as a huge peer data management system (PDMS). Each peer may have its own local ontology for describing its data, and interacts with some other peers by establishing mappings with their ontologies. The result is a network of peers with no centralized knowledge and thus no global control on the data and knowledge distributed over the web.

The contribution of this paper is a decentralized data model and associated algorithms for data management in the Semantic Web based on distributed DL-Lite_R. We revisit the current centralized approach of [Calvanese et al., 2007] for data consistency checking and query answering by reformulation in order to design corresponding decentralized algorithms. We also extend the current work on DL-Lite by providing both a centralized and a decentralized algorithm for rewriting queries using views when queries and views are conjunctive queries over DL-Lite_R ontologies.

Our approach relies on reducing the above data management problems for DL-Lite_R to decentralized reasoning in distributed propositional logic, in order to deploy DL-Lite_R PDMSs on top of the SomeWhere platform. SomeWhere is a propositional P2P inference system whose scalability has been demonstrated experimentally [Adjiman et al., 2006].

The paper is organized as follows. In Section 2, we present the distributed DL-Lite_R data model, which is based on bridging distributed DL-Lite_R ontologies with mappings. In Section 3, we provide decentralized algorithms for query answering by reformulation and for data consistency checking. In Section 4, we investigate the problem of query rewriting using views in DL-Lite_R in the centralized and decentralized cases. Finally, we conclude with a short discussion on related work in Section 5.

¹ http://www.w3.org/2004/OWL/

² http://www.w3.org/TR/rdf-schema/


## 2 Distributed DL-Lite_R

DL-Lite_R concepts and roles are of the following form:

$$B \rightarrow A \mid \exists R, \qquad C \rightarrow B \mid \neg B, \qquad R \rightarrow P \mid P^{-}, \qquad E \rightarrow R \mid \neg R$$

where $A$ denotes an atomic concept, $P$ an atomic role, and $P^{-}$ the inverse of $P$. $B$ denotes a basic concept (i.e., an atomic concept $A$ or an unqualified existential quantification on a basic role $\exists R$) and $R$ a basic role (i.e., an atomic role $P$ or its inverse $P^{-}$). Finally, $C$ denotes a general concept (i.e., a basic concept or its negation) and $E$ a general role (i.e., a basic role or its negation).

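An interpretation $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ consists of a nonempty interpretation domain $\Delta^{\mathcal{I}}$ and an interpretation function $\cdot^{\mathcal{I}}$ that assigns a subset of $\Delta^{\mathcal{I}}$ to each atomic concept, and a binary relation over $\Delta^{\mathcal{I}}$ to each atomic role. The semantics of non-atomic concepts and roles is defined as follows:

$$\begin{aligned}
(P^{-})^{\mathcal{I}} &= \{(o_2,o_1) \mid (o_1,o_2) \in P^{\mathcal{I}}\}\\
(\exists R)^{\mathcal{I}} &= \{o_1 \mid \exists o_2\, (o_1,o_2) \in R^{\mathcal{I}}\}\\
(\neg B)^{\mathcal{I}} &= \Delta^{\mathcal{I}} \setminus B^{\mathcal{I}} \quad\text{and}\quad (\neg R)^{\mathcal{I}} = \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}} \setminus R^{\mathcal{I}}
\end{aligned}$$

An interpretation $\mathcal{I}$ is a model of a concept $C$ (resp. a role $E$) if $C^{\mathcal{I}} \neq \emptyset$ (resp. $E^{\mathcal{I}} \neq \emptyset$).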
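The semantics above can be made concrete on a small finite interpretation. The following is a minimal sketch, not part of the original paper: the domain, the names `Professor` and `teaches`, and the tuple encodings `("inv", P)` and `("exists", R)` are assumptions chosen purely for illustration.

```python
from itertools import product

# Hypothetical finite interpretation: a domain plus extensions of atomic names.
domain = {"a", "b", "c"}
concepts = {"Professor": {"a"}}                 # atomic concepts A
roles = {"teaches": {("a", "b"), ("a", "c")}}   # atomic roles P


def role_ext(role):
    """Extension of a basic role: 'P' or ('inv', 'P')."""
    if isinstance(role, tuple) and role[0] == "inv":
        return {(o2, o1) for (o1, o2) in roles[role[1]]}
    return set(roles[role])


def concept_ext(concept):
    """Extension of a basic concept: 'A' or ('exists', R)."""
    if isinstance(concept, tuple) and concept[0] == "exists":
        return {o1 for (o1, _o2) in role_ext(concept[1])}
    return set(concepts[concept])


def neg_concept_ext(concept):
    """Extension of the negation of a basic concept."""
    return domain - concept_ext(concept)


def neg_role_ext(role):
    """Extension of the negation of a basic role."""
    return set(product(domain, repeat=2)) - role_ext(role)
```

For instance, `concept_ext(("exists", ("inv", "teaches")))` collects the objects that are taught by someone, which mirrors the definition of $(\exists R)^{\mathcal{I}}$ with $R = P^{-}$.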

**DL-Lite_R knowledge bases.** A DL-Lite_R knowledge base (KB) is made of a Tbox representing a conceptual view of the domain of interest (i.e., an ontology), and either an Abox (a local set of facts) [Calvanese et al., 2007] or view extensions (predefined queries over the Tbox together with their answers) [Calvanese et al., 2008b] for representing the data.

A DL-Lite_R Tbox is a finite set of inclusion statements of the form $B \sqsubseteq C$ and/or $R \sqsubseteq E$. General concepts or roles are only allowed on the right-hand side of inclusion statements, whereas only basic concepts or roles may occur on the left-hand side. Inclusions of the form $B_1 \sqsubseteq B_2$ or of the form $R_1 \sqsubseteq R_2$ are called positive inclusions (PIs), whereas inclusions of the form $B_1 \sqsubseteq \neg B_2$ or of the form $R_1 \sqsubseteq \neg R_2$ are called negative inclusions (NIs). An interpretation $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ is a model of an inclusion $B \sqsubseteq C$ (resp. $R \sqsubseteq E$) if $B^{\mathcal{I}} \subseteq C^{\mathcal{I}}$ (resp. $R^{\mathcal{I}} \subseteq E^{\mathcal{I}}$). It is a model of a Tbox if it satisfies all of its inclusion statements. A Tbox $\mathcal{T}$ logically entails an inclusion statement $\alpha$, written $\mathcal{T} \models \alpha$, if every model of $\mathcal{T}$ is a model of $\alpha$.

A DL-Lite_R Abox consists of a finite set of membership assertions on atomic concepts and roles of the form $A(a)$ and $P(a,b)$, stating respectively that $a$ is an instance of $A$ and that the pair of constants $(a,b)$ is an instance of $P$. The interpretation function of an interpretation $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ is extended to constants by assigning to each constant $a$ a distinct object $a^{\mathcal{I}} \in \Delta^{\mathcal{I}}$ (i.e., the so-called unique name assumption holds). An interpretation $\mathcal{I}$ is a model of the membership assertion $A(a)$ (resp. $P(a,b)$) if $a^{\mathcal{I}} \in A^{\mathcal{I}}$ (resp. $(a^{\mathcal{I}},b^{\mathcal{I}}) \in P^{\mathcal{I}}$). It is a model of an Abox if it satisfies all of its assertions.

When the extensional knowledge is modeled using view extensions, the KB is of the form $\langle \mathcal{T}, \mathcal{V}, \mathcal{E} \rangle$ such that $\mathcal{E}$ is a set of facts of the form $v(\bar{t})$ where $v$ is a view of $\mathcal{V}$.

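**Queries and views over a DL-Lite_R KB.** We consider (unions of) conjunctive queries of the form $q(\bar{x}) : \exists \bar{y}\ \mathit{conj}(\bar{x},\bar{y})$ where $\mathit{conj}(\bar{x},\bar{y})$ is a conjunction of atoms, the variables of which are only the free variables $\bar{x}$ and the existential variables $\bar{y}$, and the predicates of which are either atomic concepts or roles of the KB. The arity of a query is the number of its free variables, e.g., 0 for a boolean query.

Given an interpretation $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$, the semantics $q^{\mathcal{I}}$ of a boolean query $q$ is defined as true if $[\exists \bar{y}\ \mathit{conj}(\emptyset,\bar{y})]^{\mathcal{I}} = \mathit{true}$, and false otherwise, while the semantics $q^{\mathcal{I}}$ of a query $q$ of arity $n \geq 1$ is the relation of arity $n$ defined on $\Delta^{\mathcal{I}}$ as follows: $q^{\mathcal{I}} = \{\bar{e} \in (\Delta^{\mathcal{I}})^n \mid [\exists \bar{y}\ \mathit{conj}(\bar{e},\bar{y})]^{\mathcal{I}} = \mathit{true}\}$.

A view $v$ is defined by a query $v(\bar{x}) : \exists \bar{y}\ \mathit{conj}(\bar{x},\bar{y})$, and has an extension $\mathcal{E}(v)$ which is a set of facts of the form $v(\bar{t})$. Following the open world assumption, we adopt the sound semantics, i.e., for every interpretation $\mathcal{I}$ and for each $v(\bar{t}) \in \mathcal{E}(v)$, $\bar{t}^{\mathcal{I}} \in v^{\mathcal{I}}$.

A model of a KB $\mathcal{K} = \langle \mathcal{T}, \mathcal{A} \rangle$ (resp. $\mathcal{K} = \langle \mathcal{T}, \mathcal{V}, \mathcal{E} \rangle$) is an interpretation $\mathcal{I}$ that is a model of both $\mathcal{T}$ and $\mathcal{A}$ (resp. of $\mathcal{T}$, $\mathcal{V}$ and $\mathcal{E}$). A KB $\mathcal{K}$ is consistent if it has at least one model. $\mathcal{K}$ logically entails a membership assertion $\beta$, written $\mathcal{K} \models \beta$, if every model of $\mathcal{K}$ is a model of $\beta$.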

**(Certain) answers of a query over a DL-Lite_R KB.** To define the answers of a query over a KB, we need to distinguish the case where the extensions of the query predicates are given in an Abox from the case where they can only be (partially) inferred from extensions of views. In the latter case, they are called the certain answers.

The answer set of a non-boolean query $q$ over $\mathcal{K} = \langle \mathcal{T}, \mathcal{A} \rangle$ is defined as $\mathit{ans}(q,\mathcal{K}) = \{\bar{t} \in \mathcal{C}^n \mid \mathcal{K} \models q(\bar{t})\}$, where $\mathcal{C}$ is the set of the constants appearing in the KB, and $q(\bar{t})$ is the closed formula obtained by replacing in the query definition the free variables in $\bar{x}$ by the constants in $\bar{t}$.

The certain answer set of a non-boolean query $q$ over $\mathcal{K} = \langle \mathcal{T}, \mathcal{V}, \mathcal{E} \rangle$ is defined as $\mathit{cert}(q,\mathcal{K}) = \{\bar{t} \in \mathcal{C}^n \mid \mathcal{K} \models q(\bar{t})\}$.

By convention, the (certain) answer set of a boolean query is $\{()\}$, where $()$ is the empty tuple, if $\mathcal{K} \models q()$, and $\emptyset$ otherwise.

**DL-Lite_R PDMSs.** A DL-Lite_R PDMS $\mathcal{S}$ is a set of peers $\{\mathcal{P}_i\}_{i=1..n}$, where the index $i$ models the identifier of the peer $\mathcal{P}_i$ (e.g., its IP address). Each peer $\mathcal{P}_i$ manages its own DL-Lite_R KB $\mathcal{K}_i$ written in terms of its own vocabulary, i.e., atomic concepts and roles. We write $A_i$ (resp. $P_i$) for the atomic concept $A$ (resp. the atomic role $P$) of $\mathcal{P}_i$.

Mappings are here inclusion assertions (PIs and/or NIs) involving concepts and/or roles of two different peers. To simplify the presentation, we consider that a mapping belongs to the KBs of both peers it involves.

From a logical viewpoint, a PDMS $\mathcal{S} = \{\mathcal{P}_i\}_{i=1..n}$ is a standard (yet distributed) DL-Lite_R KB $\mathcal{K} = \bigcup_{i=1}^{n} \mathcal{K}_i$, i.e., in contrast with other approaches ([Calvanese et al., 2008a], [Franconi et al., 2004], [Serafini et al., 2005]) we adopt a standard logical semantics for the mappings.

## 3 Decentralized Query Answering

We first recall the Answer, Consistent, and PerfectRef algorithms of [Calvanese et al., 2007] that are used for answering queries over a DL-Lite_R KB $\mathcal{K} = \langle \mathcal{T}, \mathcal{A} \rangle$ in the centralized case (Section 3.1). Then we provide their decentralized versions (in Sections 3.3 and 3.4). They are based on the propositional encoding summarized in Section 3.2 and on the use of the DeCA algorithm [Adjiman et al., 2006], which is the decentralized algorithm for propositional reasoning implemented in the SomeWhere platform.


### 3.1 Existing DL-Lite_R algorithms: reminder

Given a union of conjunctive queries $Q$ over a KB $\mathcal{K} = \langle \mathcal{T}, \mathcal{A} \rangle$, Answer (Algorithm 1) first checks whether $\mathcal{K}$ is inconsistent (line 1). In that case, it returns all the tuples of the arity of $Q$ that can be generated from the constants occurring in $\mathcal{A}$ (line 2). Otherwise, it gets $\mathit{ans}(Q,\mathcal{K})$ by evaluating against $\mathcal{A}$, considered as a relational database, the union of conjunctive queries obtained by reformulation of $Q$ (line 3).

**Algorithm 1:** The original Answer algorithm

    Answer(Q, K)
    Input: a union of conjunctive queries Q and a KB K = ⟨T, A⟩
    Output: ans(Q, K)
    (1) if not Consistent(K)
    (2)     return Alltup(Q, K)
    (3) else return (⋃_{q_i ∈ Q} PerfectRef(q_i, T))^{db(A)}

Consistent (Algorithm 2) builds a boolean query $q_{unsat}$ that checks that the DL-Lite_R formulae that must be disjoint, according to the intensional knowledge modeled in $\mathcal{T}$, indeed have disjoint instances in $\mathcal{A}$. $q_{unsat}$ is obtained as the union of the first-order logic (FOL) translations of the NI-closure of $\mathcal{T}$, denoted $\mathit{cln}(\mathcal{T})$, i.e., the set of all the NIs entailed by $\mathcal{T}$. The FOL translations of NIs are defined by:

$$\delta(B_1 \sqsubseteq \neg B_2) = \exists x\ \gamma_1(x) \wedge \gamma_2(x) \quad\text{such that}\quad
\gamma_i(x) = \begin{cases}
A_i(x) & \text{if } B_i = A_i\\
\exists y_i\, P_i(x,y_i) & \text{if } B_i = \exists P_i\\
\exists y_i\, P_i(y_i,x) & \text{if } B_i = \exists P_i^{-}
\end{cases}$$

$$\delta(R_1 \sqsubseteq \neg R_2) = \exists x,y\ \rho_1(x,y) \wedge \rho_2(x,y) \quad\text{such that}\quad
\rho_i(x,y) = \begin{cases}
P_i(x,y) & \text{if } R_i = P_i\\
P_i(y,x) & \text{if } R_i = P_i^{-}
\end{cases}$$

**Algorithm 2:** The original Consistent algorithm

    Consistent(K)
    Input: a KB K = ⟨T, A⟩
    Output: true if K is satisfiable, false otherwise
    (1) q_unsat = ⊥ (i.e., q_unsat is false)
    (2) foreach α ∈ cln(T)
    (3)     q_unsat = q_unsat ∨ δ(α)
    (4) if q_unsat^{db(A)} = ∅
    (5)     return true
    (6) else return false

Finally, PerfectRef (Algorithm 3) reformulates each conjunctive query $q$ in $Q$ by using the PIs in $\mathcal{T}$ as rewriting rules. PIs are seen as logical rules that can be applied in backward-chaining to query atoms in order to expand them. More specifically, a PI $I$ is applicable to an atom $A(x)$ of a query if $I$ has $A$ in its right-hand side, and a PI $I$ is applicable to an atom $P(x_1,x_2)$ of a query if (i) $x_2 = \_$ and the right-hand side of $I$ is $\exists P$; or (ii) $x_1 = \_$ and the right-hand side of $I$ is $\exists P^{-}$; or (iii) $I$ is a role inclusion assertion and its right-hand side is either $P$ or $P^{-}$. Note that $\_$ denotes here an unbound existential variable of a query.

The following definition (Definition 32 from [Calvanese et al., 2007]) defines the result $\mathit{gr}(g,I)$ of the (backward) application of the PI $I$ to the atom $g$, which is the core of PerfectRef (loop (a), lines 5 to 7).

**Definition 1 (Backward application of a PI to an atom)** Let $I$ be an inclusion assertion that is applicable to the atom $g$. Then, $\mathit{gr}(g,I)$ is the atom defined as follows:

- if $g = A(x)$ and $I = A_1 \sqsubseteq A$, then $\mathit{gr}(g,I) = A_1(x)$
- if $g = A(x)$ and $I = \exists P \sqsubseteq A$, then $\mathit{gr}(g,I) = P(x,\_)$
- if $g = A(x)$ and $I = \exists P^{-} \sqsubseteq A$, then $\mathit{gr}(g,I) = P(\_,x)$
- if $g = P(x,\_)$ and $I = A \sqsubseteq \exists P$, then $\mathit{gr}(g,I) = A(x)$
- if $g = P(x,\_)$ and $I = \exists P_1 \sqsubseteq \exists P$, then $\mathit{gr}(g,I) = P_1(x,\_)$
- if $g = P(x,\_)$ and $I = \exists P_1^{-} \sqsubseteq \exists P$, then $\mathit{gr}(g,I) = P_1(\_,x)$
- if $g = P(\_,x)$ and $I = A \sqsubseteq \exists P^{-}$, then $\mathit{gr}(g,I) = A(x)$
- if $g = P(\_,x)$ and $I = \exists P_1 \sqsubseteq \exists P^{-}$, then $\mathit{gr}(g,I) = P_1(x,\_)$
- if $g = P(\_,x)$ and $I = \exists P_1^{-} \sqsubseteq \exists P^{-}$, then $\mathit{gr}(g,I) = P_1(\_,x)$
- if $g = P(x_1,x_2)$ and either $I = P_1 \sqsubseteq P$ or $I = P_1^{-} \sqsubseteq P^{-}$, then $\mathit{gr}(g,I) = P_1(x_1,x_2)$
- if $g = P(x_1,x_2)$ and either $I = P_1 \sqsubseteq P^{-}$ or $I = P_1^{-} \sqsubseteq P$, then $\mathit{gr}(g,I) = P_1(x_2,x_1)$

The subtle point of PerfectRef is the need to simplify the produced reformulations (loop (b), lines 8 to 10), so that some PIs that were not applicable to a reformulation become applicable to its simplifications. A simplification amounts to unifying two atoms of a reformulation using their most general unifier (using reduce, line 10) and then switching the possibly new unbound existential variables to $\_$ (using $\tau$, line 10).

**Algorithm 3:** The original PerfectRef algorithm

    PerfectRef(q, T)
    Input: a conjunctive query q and a Tbox T
    Output: a union of conjunctive queries
    (1)  PR := {q}
    (2)  repeat
    (3)      PR' := PR
    (4)      foreach q ∈ PR'
    (5)      (a) foreach g ∈ q
    (6)              if a PI I of T is applicable to g
    (7)                  PR := PR ∪ {q[g/gr(g, I)]}
    (8)      (b) foreach g1, g2 ∈ q
    (9)              if g1 and g2 unify
    (10)                 PR := PR ∪ {τ(reduce(q, g1, g2))}
    (11) until PR' = PR
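The eleven cases of Definition 1 and the applicability conditions stated before it can be sketched compactly. This is an illustrative encoding of our own, not code from the paper or from [Calvanese et al., 2007]: atoms are `(name, args)` pairs, a PI is a `(lhs, rhs)` pair, and each side is tagged `"c"` (atomic concept), `"ex"` ($\exists P$), `"ex-"` ($\exists P^{-}$), `"r"` ($P$) or `"r-"` ($P^{-}$). The PI is assumed to be well-formed.

```python
UNBOUND = "_"  # stands for an unbound existential variable


def applicable(atom, pi):
    """Applicability of a PI (lhs, rhs) to an atom (name, args)."""
    name, args = atom
    kind, rhs_name = pi[1]
    if rhs_name != name:
        return False
    if len(args) == 1:                       # concept atom A(x)
        return kind == "c"
    x1, x2 = args                            # role atom P(x1, x2)
    return ((kind == "ex" and x2 == UNBOUND)
            or (kind == "ex-" and x1 == UNBOUND)
            or kind in ("r", "r-"))


def gr(atom, pi):
    """Backward application of an applicable PI to an atom (Definition 1)."""
    (lkind, lname), (rkind, _) = pi
    _, args = atom
    if rkind in ("r", "r-"):                 # role inclusion
        x1, x2 = args
        same_direction = (lkind == rkind)    # P1 <= P or P1^- <= P^-
        return (lname, (x1, x2) if same_direction else (x2, x1))
    # Concept inclusion: x is the bound argument of the atom.
    if len(args) == 1:
        x = args[0]
    else:
        x = args[0] if rkind == "ex" else args[1]
    if lkind == "c":
        return (lname, (x,))
    return (lname, (x, UNBOUND) if lkind == "ex" else (UNBOUND, x))
```

For example, applying $\exists P^{-} \sqsubseteq A$ to $A(x)$ is `gr(("A", ("x",)), (("ex-", "P"), ("c", "A")))`, which yields the atom $P(\_,x)$.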

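### 3.2 Propositional encoding of a DL-Lite_R Tbox

The propositional encoding of a DL-Lite_R Tbox $\mathcal{T}$, denoted $\Phi(\mathcal{T})$, is the formula of propositional logic (PL) that corresponds to the union of the PL encoding of every inclusion assertion $I$ of $\mathcal{T}$: $\Phi(\mathcal{T}) = \bigcup_{I \in \mathcal{T}} \Phi(I)$.

The PL encoding of a concept inclusion $B \sqsubseteq C$, denoted $\Phi(B \sqsubseteq C)$, is inductively defined by $\{\Phi(B) \Rightarrow \Phi(C)\}$ where $\Phi(B) = A$ when $B = A$, $\Phi(B) = P^{\exists}$ when $B = \exists P$, $\Phi(B) = P^{\exists-}$ when $B = \exists P^{-}$, $\Phi(C) = \Phi(B)$ when $C = B$, and $\Phi(C) = \neg\Phi(B)$ when $C = \neg B$.

The PL encoding of a role inclusion $R \sqsubseteq E$, denoted $\Phi(R \sqsubseteq E)$, is defined as follows:

$$\begin{aligned}
\Phi(P \sqsubseteq Q) &= \{P \Rightarrow Q,\ P^{-} \Rightarrow Q^{-},\ P^{\exists} \Rightarrow Q^{\exists},\ P^{\exists-} \Rightarrow Q^{\exists-}\}\\
\Phi(P^{-} \sqsubseteq Q) &= \{P^{-} \Rightarrow Q,\ P \Rightarrow Q^{-},\ P^{\exists-} \Rightarrow Q^{\exists},\ P^{\exists} \Rightarrow Q^{\exists-}\}\\
\Phi(P \sqsubseteq Q^{-}) &= \{P \Rightarrow Q^{-},\ P^{-} \Rightarrow Q,\ P^{\exists} \Rightarrow Q^{\exists-},\ P^{\exists-} \Rightarrow Q^{\exists}\}\\
\Phi(P^{-} \sqsubseteq Q^{-}) &= \{P^{-} \Rightarrow Q^{-},\ P \Rightarrow Q,\ P^{\exists} \Rightarrow Q^{\exists},\ P^{\exists-} \Rightarrow Q^{\exists-}\}\\
\Phi(P \sqsubseteq \neg Q) &= \{P \Rightarrow \neg Q,\ P^{-} \Rightarrow \neg Q^{-}\}\\
\Phi(P^{-} \sqsubseteq \neg Q) &= \{P^{-} \Rightarrow \neg Q,\ P \Rightarrow \neg Q^{-}\}\\
\Phi(P \sqsubseteq \neg Q^{-}) &= \{P \Rightarrow \neg Q^{-},\ P^{-} \Rightarrow \neg Q\}\\
\Phi(P^{-} \sqsubseteq \neg Q^{-}) &= \{P^{-} \Rightarrow \neg Q^{-},\ P \Rightarrow \neg Q\}
\end{aligned}$$

Note also that in the following $\Phi(E) = \Phi(R)$ when $E = R$, $\Phi(E) = \neg\Phi(R)$ when $E = \neg R$, $\Phi(R) = P$ when $R = P$, and $\Phi(R) = P^{-}$ when $R = P^{-}$.

The PL encoding of the distributed Tbox $\bigcup_{i=1}^{n} \mathcal{T}_i$ of a DL-Lite_R PDMS is the distributed propositional theory $\bigcup_{i=1}^{n} \Phi(\mathcal{T}_i)$ obtained by the encoding of each local Tbox $\mathcal{T}_i$.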
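The encoding $\Phi$ of a single inclusion can be written as one small function. The sketch below is only an illustration under our own conventions: basic concepts and roles are tagged pairs (`"c"`, `"ex"`, `"ex-"`, `"r"`, `"r-"`), propositional variables are plain strings such as `"P∃-"`, and an implication is a pair `(antecedent_variable, (consequent_variable, is_positive))`. None of these names come from the paper or from SomeWhere.

```python
def var(kind, name):
    """Propositional variable for a basic concept or basic role."""
    suffix = {"c": "", "r": "", "r-": "-", "ex": "∃", "ex-": "∃-"}[kind]
    return name + suffix


def inverse(kind):
    """Tag of the inverse of a basic role."""
    return {"r": "r-", "r-": "r"}[kind]


def exists(kind):
    """Tag of the existential concept built on a basic role."""
    return {"r": "ex", "r-": "ex-"}[kind]


def encode(lhs, rhs, negated=False):
    """Phi(lhs <= rhs), or Phi(lhs <= not rhs) when negated is True."""
    (lk, ln), (rk, rn) = lhs, rhs
    sign = not negated
    rules = {(var(lk, ln), (var(rk, rn), sign))}
    if lk in ("r", "r-"):
        # Role inclusion: the same statement holds between the inverses.
        rules.add((var(inverse(lk), ln), (var(inverse(rk), rn), sign)))
        if not negated:
            # Positive role inclusions also relate the existential variables.
            rules.add((var(exists(lk), ln), (var(exists(rk), rn), True)))
            rules.add((var(exists(inverse(lk)), ln),
                       (var(exists(inverse(rk)), rn), True)))
    return rules
```

Calling `encode(("r-", "P"), ("r", "Q"))` produces the four implications listed above for $\Phi(P^{-} \sqsubseteq Q)$, while a negative role inclusion only yields the two implications between role variables.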


DeCA is a message-based algorithm implemented in the SomeWhere platform [Adjiman et al., 2006] which computes in a decentralized manner the logical consequences of propositional clausal theories distributed in a P2P system. More precisely, with a copy of DeCA running locally on each peer and exchanging messages that convey literals and clauses, $\mathit{DeCA}_i(l_i)$ (denoting DeCA running on the peer $\mathcal{P}_i$ and triggered with an input literal $l_i$ of the vocabulary of $\mathcal{P}_i$) produces the set of all the proper prime implicates of $l_i$ w.r.t. the distributed theory $\bigcup_{i=1}^{n} \Phi(\mathcal{T}_i)$, i.e., the set of prime implicates of $\{l_i\} \cup \bigcup_{i=1}^{n} \Phi(\mathcal{T}_i)$ which are not implicates of $\bigcup_{i=1}^{n} \Phi(\mathcal{T}_i)$ alone.
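To get a feel for what such a consequence computation returns on encoded Tboxes, here is a centralized stand-in. It is not DeCA and involves no message passing: it simply follows implications (and their contrapositives) from an input literal. It assumes that the theory contains only implications between literals, as produced by the encoding of Section 3.2, that the theory together with the input literal is satisfiable, and it does not filter out literals that the theory might already entail on its own. The literal representation `(variable, is_positive)` is our own.

```python
from collections import deque


def negate(lit):
    """Complement of a literal (variable, is_positive)."""
    variable, positive = lit
    return (variable, not positive)


def consequences(implications, start):
    """Literals reachable from `start` in the implication graph.

    `implications` is an iterable of (antecedent, consequent) literal pairs;
    each pair a => b also contributes its contrapositive (not b) => (not a).
    """
    graph = {}
    for a, b in implications:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(negate(b), set()).add(negate(a))
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in graph.get(queue.popleft(), ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

With the hypothetical theory $\{A \Rightarrow B,\ B \Rightarrow \neg C\}$, starting from the literal $C$ reaches $\neg B$ and then $\neg A$.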

### 3.3 Decentralized Consistency Checking

Our approach relies on decentralizing the computation of the NI-closure of a distributed Tbox $\bigcup_{i=1}^{n} \mathcal{T}_i$ of a DL-Lite_R PDMS without empty roles (the Tbox does not entail a NI $P \sqsubseteq \neg P$) by exploiting a transfer property of the propositional encoding (Theorem 1) and then by using DeCA. The subtle point is that in a decentralized setting, we have to launch the computation of the NI-closure from each peer and thus possibly start from local PIs and NIs that could lead to the derivation of new NIs by interacting with NIs and PIs of other peers. For doing so, we define (Definition 2) and compute with DeCA the NI-closure of a peer w.r.t. a PDMS without empty roles.

**Theorem 1 (NI-entailment reduced to PL entailment)** Let $\mathcal{T}$ be the distributed Tbox of a DL-Lite_R PDMS without empty roles, and $\Phi(\mathcal{T})$ its PL encoding. Let $X$ and $Y$ be both distinct basic concepts or distinct basic roles:

$$\mathit{cln}(\mathcal{T}) \models X \sqsubseteq \neg Y \quad\text{iff}\quad \Phi(\mathcal{T}) \models \neg\Phi(X) \vee \neg\Phi(Y).$$

The proof is by induction on the number of rules defining the NI-closure (Definition 9 in [Calvanese et al., 2007]) used for producing $X \sqsubseteq \neg Y$, for the if direction, and on the smallest length of the resolution proof for producing $\Phi(X) \Rightarrow \Phi(\neg Y)$ for the converse direction.
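As a small worked illustration of Theorem 1, consider a hypothetical Tbox with three atomic concepts $A$, $B$ and $C$ (these names are ours and do not come from the paper). The NI obtained by combining the PI with the NI matches the clause obtained by a single resolution step on the encoding:

```latex
\begin{aligned}
\mathcal{T} &= \{A \sqsubseteq B,\ B \sqsubseteq \neg C\}
  &&\text{hence } A \sqsubseteq \neg C \in \mathit{cln}(\mathcal{T}),\\
\Phi(\mathcal{T}) &= \{A \Rightarrow B,\ B \Rightarrow \neg C\}
  = \{\neg A \vee B,\ \neg B \vee \neg C\},\\
\neg A \vee B,\ \neg B \vee \neg C &\vdash \neg A \vee \neg C
  = \neg\Phi(A) \vee \neg\Phi(C).
\end{aligned}
```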

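**Definition 2 (NI-closure of a peer w.r.t. a PDMS)** Let $\mathcal{T} = \bigcup_{i=1}^{n} \mathcal{T}_i$ be the distributed Tbox of a DL-Lite_R PDMS $\mathcal{S} = \{\mathcal{P}_i\}_{i=1..n}$ without empty roles. The NI-closure of $\mathcal{P}_i$ w.r.t. $\mathcal{S}$, denoted $\mathit{cln}(\mathcal{P}_i)$, is obtained from $\Phi(\mathcal{T})$ using DeCA as follows:

- for every PI $Z \sqsubseteq Y \in \mathcal{T}_i$ such that $Z$ is in the vocabulary of $\mathcal{P}_i$ and $Y$ in that of $\mathcal{P}_j$ ($j$ may be $i$), $Z \sqsubseteq \neg X \in \mathit{cln}(\mathcal{P}_i)$ for any $\neg\Phi(X) \in \mathit{DeCA}_j(\Phi(Y))$.
- for every NI $Z \sqsubseteq \neg Y \in \mathcal{T}_i$
  - if $Z$ is in the vocabulary of $\mathcal{P}_i$ and $Y$ in that of $\mathcal{P}_j$ ($j$ may be $i$), $Z \sqsubseteq \neg X \in \mathit{cln}(\mathcal{P}_i)$ for any $\neg\Phi(X) \in \mathit{DeCA}_j(\neg\Phi(Y))$
  - if $Y$ is in the vocabulary of $\mathcal{P}_i$ and $Z$ is in that of $\mathcal{P}_j$ ($j$ may be $i$), $X \sqsubseteq \neg Y \in \mathit{cln}(\mathcal{P}_i)$ for any $\neg\Phi(X) \in \mathit{DeCA}_j(\neg\Phi(Z))$.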

The decentralized version of the original Consistent algorithm, denoted $\mathit{Consistent}_i$ when running on peer $\mathcal{P}_i$, is simply obtained by replacing "foreach $\alpha \in \mathit{cln}(\mathcal{T})$" in Line 2 of Algorithm 2 by "foreach $\alpha \in \mathit{cln}(\mathcal{P}_i)$". The conjunctive queries of $q_{unsat}$ do not have to be evaluated by $\mathcal{P}_i$ against the (unknown) global Abox of the whole PDMS. Indeed, by construction of $q_{unsat}$, each of its conjunctive queries has two conjuncts, one from $\mathcal{P}_i$ and another from $\mathcal{P}_j$ ($j$ may be $i$), the latter providing in its atomic concept or role the identifier $j$ of the peer to contact for the evaluation.
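The evaluation of the conjunctive queries of $q_{unsat}$ can be sketched for NIs between basic concepts. The code below is a simplified, centralized illustration and not the paper's implementation: a single Python set of facts stands for the union of the Aboxes that the two peers involved would consult, NIs between roles are left out, and the tagged-pair encoding of basic concepts is our own assumption.

```python
def instances(basic, abox):
    """Constants satisfying a basic concept ('c' | 'ex' | 'ex-', name)."""
    kind, name = basic
    if kind == "c":
        return {args[0] for (pred, args) in abox
                if pred == name and len(args) == 1}
    position = 0 if kind == "ex" else 1   # exists P binds x first, exists P^- second
    return {args[position] for (pred, args) in abox
            if pred == name and len(args) == 2}


def violations(nis, abox):
    """Map each violated NI (B1, B2), read as B1 <= not B2, to its witnesses."""
    found = {}
    for ni in nis:
        witnesses = instances(ni[0], abox) & instances(ni[1], abox)
        if witnesses:
            found[ni] = witnesses
    return found


def consistent(nis, abox):
    """True when no conjunctive query of q_unsat has an answer."""
    return not violations(nis, abox)
```

Each entry returned by `violations` corresponds to one conjunctive query $\delta(B_1 \sqsubseteq \neg B_2)$ having a non-empty answer, i.e., to a constant that belongs to two concepts required to be disjoint.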

Theorem 2 states the correctness of locally running $\mathit{Consistent}_i$ on each peer $\mathcal{P}_i$ for global consistency checking of a PDMS without empty roles.

**Theorem 2 (Correctness of PDMS consistency checking)** Let $\mathcal{S} = \{\mathcal{P}_i\}_{i=1..n}$ be a DL-Lite_R PDMS without empty roles. $\mathcal{S}$ is consistent iff $\mathit{Consistent}_i$ returns true for every $i = 1..n$.

The proof relies first on Theorem 1, which shows the equivalence between the logical entailment of a NI from a Tbox and the entailment in PL of its propositional encoding. Then, both Lemma 12 in [Calvanese et al., 2007] and the completeness of DeCA (proved in [Adjiman et al., 2006]) ensure that $\mathit{cln}(\mathcal{P}_i)$ defined in Definition 2 contains all the NIs entailed by the PDMS and involving a concept or role in the vocabulary of the peer $\mathcal{P}_i$. Finally, it is easy to see that by running $\mathit{Consistent}_i$ for every peer $\mathcal{P}_i$ of the PDMS, we obtain all the NIs entailed by the PDMS. Therefore Theorem 15 and Lemma 16 in [Calvanese et al., 2007] ensure that consistency checking can be made by evaluating the union of conjunctive queries in $q_{unsat}$ against the relevant part of the Abox. This is exactly what running $\mathit{Consistent}_i$ on all the peers in the PDMS does in a decentralized manner.

### 3.4 Decentralized Query Reformulation

Our approach relies on the propositional encoding and the use of DeCA for decentralizing the backward-closure w.r.t. the PIs of each atom in the query. Definition 3 defines the backward-closure of an atom w.r.t. the PIs as the iteration of the one-step backward application of PIs (Definition 1). Proposition 1 states the termination of this iterative process.

**Definition 3 (Backward-closure of an atom w.r.t. PIs)** Let $PI$ be a set of PIs, $g$ an atom, and $A$ a set of atoms. We define the backward-closure of $g$ w.r.t. $PI$ as

$$\mathit{cl\_gr}(g,PI) = \bigcup_{i \geq 1} \mathit{cl\_gr}^{i}(\{g\},PI)$$

where $\mathit{cl\_gr}^{i}(\{g\},PI)$ is recursively defined as follows:

- $\mathit{cl\_gr}^{1}(A,PI) = \{\mathit{gr}(g,I) \mid g \in A,\ I \in PI \text{ and } I \text{ is applicable to } g\}$
- $\mathit{cl\_gr}^{i+1}(A,PI) = \mathit{cl\_gr}^{1}(\mathit{cl\_gr}^{i}(A,PI),PI)$

**Proposition 1 (Termination of backward-closure w.r.t. PIs)** The backward-closure of an atom w.r.t. a set of PIs is finite, i.e., there exists a constant $n$ such that $\mathit{cl\_gr}(g,PI) = \bigcup_{i=1}^{n} \mathit{cl\_gr}^{i}(\{g\},PI)$.

The proof corresponds to the termination proof of PerfectRef (Lemma 34 in [Calvanese et al., 2007]).
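The iteration of Definition 3 is a plain fixpoint computation, and Proposition 1 is what guarantees that the loop below stops. This is a minimal centralized sketch under our own assumptions: the applicability test and the one-step function $\mathit{gr}$ are passed in as parameters (named `applicable` and `gr` here for readability), and atoms must be hashable.

```python
def backward_closure(atom, pis, applicable, gr):
    """cl_gr(atom, PI): union of cl_gr^i({atom}, PI) for i >= 1."""
    closure, frontier = set(), {atom}
    while frontier:
        # One-step backward application of every applicable PI.
        step = {gr(g, pi) for g in frontier for pi in pis if applicable(g, pi)}
        frontier = step - closure   # only genuinely new atoms are expanded again
        closure |= step
    return closure
```

As a usage example restricted to inclusions between atomic concepts (a hypothetical simplification), a PI can be a pair `(sub, sup)`, `applicable` can test `g[0] == pi[1]`, and `gr` can return `(pi[0], g[1])`. Note that, as in Definition 3, the starting atom itself is not part of the result unless some PI produces it.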

Theorem 3 is the equivalent for the PIs of the transfer property of the propositional encoding for the NIs stated in Theorem 1. Its proof is also by induction (on the number of one-step applications of a PI and on the smallest length of resolution proofs).


**Theorem 3 (Backward-closure reduced to PL entailment)** Let $\mathcal{T}$ be a DL-Lite_R Tbox the PIs of which form the set $PI$. Let $g, g'$ be atoms, and $V_1, V_2$ propositional variables. $g' \in \mathit{cl\_gr}(g,PI)$ iff $\Phi(\mathcal{T}) \cup \{\neg V_1\} \models \neg V_2$ where:

- $g = A(x)$, $g' = A'(x)$, $V_1 = A$, and $V_2 = A'$;
- $g = A(x)$, $g' = P(x,\_)$, $V_1 = A$, and $V_2 = P^{\exists}$;
- $g = A(x)$, $g' = P(\_,x)$, $V_1 = A$, and $V_2 = P^{\exists-}$;
- $g = P(x,y)$, $g' = Q(x,y)$, $V_1 = P$, and $V_2 = Q$;
- $g = P(x,y)$, $g' = Q(y,x)$, $V_1 = P$, and $V_2 = Q^{-}$;
- $g = P(x,\_)$, $g' = A(x)$, $V_1 = P^{\exists}$, and $V_2 = A$;
- $g = P(x,\_)$, $g' = Q(x,\_)$, $V_1 = P^{\exists}$, and $V_2 = Q^{\exists}$;
- $g = P(x,\_)$, $g' = Q(\_,x)$, $V_1 = P^{\exists}$, and $V_2 = Q^{\exists-}$;
- $g = P(\_,x)$, $g' = A(x)$, $V_1 = P^{\exists-}$, and $V_2 = A$;
- $g = P(\_,x)$, $g' = Q(x,\_)$, $V_1 = P^{\exists-}$, and $V_2 = Q^{\exists}$;
- $g = P(\_,x)$, $g' = Q(\_,x)$, $V_1 = P^{\exists-}$, and $V_2 = Q^{\exists-}$.

Based on Theorem 3, the decentralized computation of $\mathit{cl\_gr}(g,PI)$ is straightforward using DeCA: if $g$ is built from the vocabulary of the peer $\mathcal{P}_i$, $g' \in \mathit{cl\_gr}(g,PI)$ iff $\neg V_2 \in \mathit{DeCA}_i(\neg V_1)$ for the same values of $g$, $g'$, $V_1$, and $V_2$ as in the corresponding cases of Theorem 3.

The decentralized version of PerfectRef, denoted $\mathit{PerfectRef}_i$ when running on peer $\mathcal{P}_i$, is given in Algorithm 4. For each atom in the query, it first computes (in the decentralized manner explained previously) the set of all of its reformulations, and then it produces a first set of reformulations of the original query by building all the conjunctions between the atomic reformulations (denoted $\bigotimes_{i=1}^{n} \mathit{cl\_gr}(g_i,PI)$ at Line 5). Those reformulations are then possibly simplified by unifying some of their atoms (Lines 8 to 11), and the reformulation process is iterated on those newly produced reformulations until no simplification is possible (general loop starting on Line 4).

**Algorithm 4:** The decentralized PerfectRef algorithm running on the peer $\mathcal{P}_i$ of the PDMS $\mathcal{S}$

    PerfectRef_i(q)
    Input: a conjunctive query q over the Tbox T_i of the peer P_i
    Output: a union of conjunctive queries over the Tbox T of the PDMS S
    (1)  PR := {q}
    (2)  PR' := PR
    (3)  while PR' ≠ ∅
    (4)  (a) foreach q' = g1 ∧ g2 ∧ ... ∧ gn ∈ PR'
    (5)          PR'' = ⊗_{i=1..n} cl_gr(g_i, PI)
    (6)      PR' = ∅
    (7)  (b) foreach q'' ∈ PR''
    (8)          foreach g'1, g'2 ∈ q''
    (9)              if g'1 and g'2 unify
    (10)                 PR' := PR' ∪ {τ(reduce(q'', g'1, g'2))}
    (11)     PR = PR ∪ PR' ∪ PR''
    (12) return PR
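The conjunction-building step written $\bigotimes$ above is a Cartesian product over the per-atom sets of reformulations. The following sketch shows only that step and rests on assumptions of ours: a conjunctive query body is modeled as a `frozenset` of atoms, and each input set is assumed to already contain whichever alternatives are wanted for the corresponding atom (for instance the atom itself in addition to its backward-closure, which is our reading rather than something the text states).

```python
from itertools import product


def conjunctions(alternatives):
    """All conjunctions that pick exactly one atom from each set of alternatives."""
    return {frozenset(choice) for choice in product(*alternatives)}
```

For a query with two atoms whose alternative sets have sizes 2 and 1, `conjunctions` returns two candidate reformulations; the unification-based simplification of the algorithm would then be applied to each of them.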

The following theorem states the correctness of the decentralized reformulation algorithm $\mathit{PerfectRef}_i$.

**Theorem 4 (Correctness of PerfectRef_i)** Let $\mathcal{T} = \bigcup_{i=1}^{n} \mathcal{T}_i$ be a Tbox of a PDMS. Let $q$ be a conjunctive query over $\mathcal{T}_i$. $\mathit{PerfectRef}_i(q)$ returns the same set of conjunctive queries as $\mathit{PerfectRef}(q,\mathcal{T})$.

Its proof results (1) from the observation that the centralized version of $\mathit{PerfectRef}_i$ (in which $\mathit{cl\_gr}(g_i,PI)$ is computed by iterating the one-step application of PIs on each atom $g_i$ of the query) produces the same results as the original PerfectRef, and (2) from Theorem 3 and the completeness of DeCA, ensuring the decentralized computation of the whole set $\mathit{cl\_gr}(g_i,PI)$.

In contrast with the original Answer algorithm, the global consistency of the PDMS cannot be checked at query time since the queried peer $\mathcal{P}_i$ does not know all the peers in the PDMS. However, it can get the identifiers $id_1, \ldots, id_k$ of the peers involved in a reformulation of the query (to contact them) from the identifiers used in the atomic concept and role names involved in that reformulation. Algorithm 5 describes the decentralized $\mathit{Answer}_i$ algorithm that checks in a decentralized manner whether $\bigcup_{j=1}^{k} (\mathcal{T}_{id_j} \cup \mathcal{A}_{id_j})$ is consistent and computes the set of corresponding answers by evaluating each reformulation against the relevant Aboxes.

**Algorithm 5:** The decentralized Answer algorithm running on the peer $\mathcal{P}_i$ of the PDMS $\mathcal{S}$

    Answer_i(Q)
    Input: a union of conjunctive queries Q over the KB K_i = ⟨T_i, A_i⟩ of P_i
    Output: ans(Q, K) where K = ⟨T, A⟩ is the KB of the PDMS S
    (1) q = ⋃_{q' ∈ Q} PerfectRef_i(q')
    (2) if Consistent_{id_j} returns true for every peer P_{id_j} involved in q
    (3)     return q^{db(⋃_{j=1..k} A_{id_j})}
    (4) else return the singleton {⊥}

In that algorithm, $\bot$ replaces $\mathit{Alltup}(Q,\mathcal{K})$ of the original Answer algorithm.

The interest of Algorithm 5 is to provide well-founded answers, i.e., answers that can be entailed from a consistent subset of the (possibly inconsistent) KB of the PDMS.

## 4 Query Answering using Views by Rewriting

We provide algorithms for computing the certain answers of a query over a (centralized or decentralized) DL-Lite_R KB $\mathcal{K} = \langle \mathcal{T}, \mathcal{V}, \mathcal{E} \rangle$ where $\mathcal{E}$ is the extension of views in $\mathcal{V}$ that are conjunctive queries over $\mathcal{T}$. For doing so, we make use of the scalable MiniCon algorithm [Pottinger and Halevy, 2001], which produces the maximally-contained conjunctive rewritings of a conjunctive query $q$ using a set $\mathcal{V}$ of conjunctive views. A conjunctive rewriting of $q$ is a conjunctive query $q_v$ whose body predicates are the head predicates of views in $\mathcal{V}$ such that $\mathcal{T} \cup \mathcal{V} \models \forall \bar{x}\,(q_v(\bar{x}) \Rightarrow q(\bar{x}))$. In that setting ([Halevy, 2001]), the set of certain answers of a query can be obtained by evaluating against the view extensions the (finite) union of its maximally-contained conjunctive rewritings.

**Centralized Case.** First, ViewConsistent($\mathcal{K}$) checks the consistency of the KB. It is a variant of the original Consistent algorithm obtained by replacing:

- $q_{unsat} = q_{unsat} \vee \delta(\alpha)$ in Line 3 of Algorithm 2 by $q_{unsat} = q_{unsat} \vee \mathit{MiniCon}(\delta(\alpha),\mathcal{V})$, where $\mathit{MiniCon}(\delta(\alpha),\mathcal{V})$ provides the maximally-contained rewritings using $\mathcal{V}$ of the FOL translation $\delta(\alpha)$ of the NI $\alpha$,
- $q_{unsat}^{db(\mathcal{A})}$ in Line 4 of Algorithm 2 by the evaluation against the view extensions: $q_{unsat}^{db(\mathcal{E})}$.


Algorithm 6 describes the CertainAnswer algorithm. If the

KB is inconsistent, the algorithm returns Alltup(Q,E), the

set of all the tuples of the arity of Q generated from the con-

stants in E. Otherwise, it computes (using MiniCon(q?,V))

the rewritings in terms of the views of the conjunctivequeries

q?returned by PerfectRef as the different ways of unfold-

ing the initial query using the PIs of T .

Algorithm 6: The CertainAnswer algorithm

CertainAnswer($Q$, $K$)

Input: a union of conjunctive queries $Q$ and a KB $K = \langle T, V, E \rangle$

Output: $cert(Q, K)$

(1) if not ViewConsistent($K$)

(2) return Alltup($Q$, $E$)

(3) else

(4) $Q' = \bigcup_{q \in Q} \mathrm{PerfectRef}(q, T)$

(5) return $\left(\bigcup_{q' \in Q'} \mathrm{MiniCon}(q', V)\right)^{db(E)}$
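The control flow of Algorithm 6 can be sketched in Python as follows. This is a minimal sketch, not an implementation of the paper's procedures: ViewConsistent, PerfectRef, MiniCon and the evaluation against the view extensions are passed in as caller-supplied functions (hypothetical parameters `view_consistent`, `perfect_ref`, `minicon`, `evaluate`), the arity of $Q$ is given explicitly, and the view extensions are assumed to be a dictionary from view names to sets of tuples.

```python
from itertools import product
from typing import Iterable, Set, Tuple


def all_tuples(arity: int, constants: Iterable[str]) -> Set[Tuple[str, ...]]:
    """Alltup: every tuple of the given arity built from the given constants."""
    return set(product(sorted(set(constants)), repeat=arity))


def certain_answer(queries, arity, tbox, views, ext, *,
                   view_consistent, perfect_ref, minicon, evaluate):
    """Mirror of Algorithm 6 with all reasoning steps supplied by the caller."""
    # Lines (1)-(2): an inconsistent KB entails every tuple over the constants of E.
    if not view_consistent(tbox, views, ext):
        constants = {c for rows in ext.values() for row in rows for c in row}
        return all_tuples(arity, constants)
    # Line (4): reformulate each conjunctive query of Q using the TBox.
    reformulations = [r for q in queries for r in perfect_ref(q, tbox)]
    # Line (5): rewrite each reformulation over the views, then evaluate the union.
    answers = set()
    for reformulation in reformulations:
        for rewriting in minicon(reformulation, views):
            answers |= evaluate(rewriting, ext)
    return answers
```

Keeping the four reasoning steps as parameters makes the sketch independent of any particular PerfectRef or MiniCon implementation, and lets the two branches be exercised separately with stub functions.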

Theorem 5 (Correctness of CertainAnswer) Let $K = \langle T, V, E \rangle$ be a consistent DL-LITE_R KB and $Q$ a union of conjunctive queries. Then $cert(Q, K) = \mathrm{CertainAnswer}(Q, K)$.

For one direction, from $q' \in \mathrm{PerfectRef}(q, T)$ and $q_v \in \mathrm{MiniCon}(q', V)$ we infer $T \cup V \models \forall \bar{x}\,(q_v(\bar{x}) \Rightarrow q(\bar{x}))$; thus $q_v$ is a conjunctive rewriting of $q$, the evaluation of which provides certain answers. For the converse direction, if $\bar{t}$ is a certain answer, then by adapting the notion of witness of a tuple introduced in Lemma 39 of [Calvanese et al., 2007] to $E$ instead of $A$, we build from the witness of $\bar{t}$ a specific conjunctive query $q_v$ over view atoms such that $\bar{t}$ is in its answer set, and we show by induction that this query is a rewriting of a reformulation $q'$ of $q$. Since MiniCon computes all the maximally-contained rewritings (Theorem 1 in [Pottinger and Halevy, 2001]), $q_v$ is contained in one of them, and $\bar{t}$ will be returned by Line 5 of Algorithm 6.

Decentralized Case

For space limitation, we just sketch the approach: the decentralized versions of ViewConsistent and of CertainAnswer are denoted ViewConsistent$_i$ and CertainAnswer$_i$ when running on a peer $P_i$. They are extensions of the decentralized algorithms Consistent$_i$ and Answer$_i$ presented in the previous section. They use the function fetchViews($q$), where $q$ is a conjunctive query possibly involving the vocabulary of different peers: fetchViews($q$) retrieves from those peers the views a body atom of which can be unified with a body atom of the query.

The ViewConsistent$_i$ algorithm extends Consistent$_i$ by replacing the evaluation of $q_{unsat}$ against the relevant Aboxes by the evaluation of $Q_{unsat}$ against the relevant view extensions, where $Q_{unsat}$ is obtained from $q_{unsat}$ (which is computed as in Consistent$_i$) as follows:

$$Q_{unsat} = \bigvee_{q \in q_{unsat}} \mathrm{MiniCon}(q, \mathrm{fetchViews}(q)).$$

The CertainAnswer$_i$ algorithm extends Answer$_i$ by replacing the evaluation against the relevant Aboxes of the union $q$ of conjunctive queries obtained by reformulation of the initial query (Line 2 in Algorithm 5) by the evaluation against the relevant view extensions of the following query $Q$:

$$Q = \bigvee_{q' \in q} \mathrm{MiniCon}(q', \mathrm{fetchViews}(q')).$$
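The selection criterion of fetchViews (a view is retrieved when one of its body atoms unifies with a body atom of the query) can be illustrated by the Python sketch below. Everything specific in it is an assumption made for the example: predicate names are prefixed with a peer identifier (`"P1:Teaches"`), variables start with `?`, each peer is modelled as a local list of views, and the names `unifiable` and `fetch_views` are hypothetical. The actual function works by message passing between peers, which is not modelled here.

```python
from typing import Dict, List, Tuple

Atom = Tuple[str, Tuple[str, ...]]   # ("peer:Predicate", argument terms)
View = Tuple[str, List[Atom]]        # (view name, body atoms)


def is_var(term: str) -> bool:
    # Assumed convention: variables start with '?', anything else is a constant.
    return term.startswith("?")


def unifiable(a: Atom, b: Atom) -> bool:
    """Unifiability of two function-free atoms whose variables are renamed apart."""
    (pred_a, args_a), (pred_b, args_b) = a, b
    if pred_a != pred_b or len(args_a) != len(args_b):
        return False
    subst = {}  # variable (tagged by side) -> term it is bound to

    def resolve(term):
        while term in subst:
            term = subst[term]
        return term

    for s, t in zip(args_a, args_b):
        # Tagging variables with "L"/"R" keeps the two atoms' variables distinct.
        s = resolve(("L", s) if is_var(s) else s)
        t = resolve(("R", t) if is_var(t) else t)
        if s == t:
            continue
        if isinstance(s, tuple):      # s is an unbound variable
            subst[s] = t
        elif isinstance(t, tuple):    # t is an unbound variable
            subst[t] = s
        else:                         # two distinct constants clash
            return False
    return True


def fetch_views(query_body: List[Atom], peers: Dict[str, List[View]]) -> List[View]:
    """Views of the peers named in the query that share a unifiable body atom with it."""
    involved = {pred.split(":", 1)[0] for pred, _ in query_body}
    selected = []
    for peer in sorted(involved):
        for view in peers.get(peer, []):
            _, body = view
            if any(unifiable(va, qa) for va in body for qa in query_body):
                selected.append(view)
    return selected


peers = {
    "P1": [("v1", [("P1:Teaches", ("?x", "?y"))]), ("v2", [("P1:Student", ("?s",))])],
    "P2": [("v3", [("P2:Course", ("?c",))]), ("v4", [("P2:Course", ("math",))])],
    "P3": [("v5", [("P3:Lab", ("?l",))])],
}
query = [("P1:Teaches", ("?t", "?c")), ("P2:Course", ("db",))]
names = [name for name, _ in fetch_views(query, peers)]
```

With this data, `names` is `["v1", "v3"]`: `v2` shares no predicate with the query, `v4` clashes on the constants `math` and `db`, and peer `P3` is never contacted because the query does not use its vocabulary.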

5 Conclusion

This paper builds on and extends existing work in data integration ([Calvanese et al., 2007; 2008b] and [Pottinger and Halevy, 2001]). For view-based query answering in DL-LITE_R, we provide a centralized and a decentralized algorithm to compute the certain answers based on rewritings. For the decentralized case, our work extends the data model of the SOMEOWL and SOMERDFS PDMSs ([Adjiman et al., 2006; 2007]). We follow the same approach, based on limiting the data model so that consistency checking and query reformulation reduce to reasoning in propositional logic. It is a way of getting decidability for query answering in PDMSs, which is not guaranteed in general ([Halevy et al., 2003]), while adopting a standard logical semantics, in contrast with other works (e.g., [Calvanese et al., 2008a], [Franconi et al., 2004], [Serafini et al., 2005]). It is also a distinguishing point from the approach in [Bertossi and Bravo, 2007] (based on answer set programs) for defining consistent answers in possibly inconsistent PDMSs.

References

[Adjiman et al., 2006] P. Adjiman, P. Chatalic, F. Goasdoué, M.-C. Rousset, and L. Simon. Distributed reasoning in a peer-to-peer setting. JAIR, 25, 2006.

[Adjiman et al., 2007] P. Adjiman, F. Goasdoué, and M.-C. Rousset. SomeRDFS in the semantic web. JODS, 8, 2007.

[Bertossi and Bravo, 2007] L. E. Bertossi and L. Bravo. The semantics of consistency and trust in peer data exchange systems. In LPAR, 2007.

[Calvanese et al., 2007] D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, and R. Rosati. Tractable reasoning and efficient query answering in description logics: The DL-Lite family. JAR, 39(3):385–429, 2007.

[Calvanese et al., 2008a] D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, and R. Rosati. Inconsistency tolerance in P2P data integration: An epistemic logic approach. Information Systems, 33(4-5), 2008.

[Calvanese et al., 2008b] D. Calvanese, G. De Giacomo, M. Lenzerini, and R. Rosati. View-based query answering over description logic ontologies. In KR, 2008.

[Franconi et al., 2004] E. Franconi, G. Kuper, A. Lopatenko, and I. Zaihrayeu. Queries and updates in the coDB peer-to-peer database system. In VLDB, 2004.

[Halevy et al., 2003] A. Y. Halevy, Z. G. Ives, D. Suciu, and I. Tatarinov. Schema mediation in peer data management systems. In ICDE, 2003.

[Halevy, 2001] A. Y. Halevy. Answering queries using views: A survey. VLDB J., 10(4):270–294, 2001.

[Pottinger and Halevy, 2001] R. Pottinger and A. Y. Halevy. MiniCon: A scalable algorithm for answering queries using views. VLDB J., 10(2-3):182–198, 2001.

[Serafini et al., 2005] L. Serafini, A. Borgida, and A. Tamilin. Aspects of distributed and modular ontology reasoning. In IJCAI, 2005.

- Available from Marie-Christine Rousset · May 17, 2014
- Available from psu.edu