
Computed answer based on fuzzy knowledgebase
– a model for handling uncertain information –

Ágnes Achs
Department of Computer Science
University of Pécs, Faculty of Engineering
Pécs, Boszorkány u. 2, Hungary
achs@witch.pmmf.hu

Abstract

The basic question of our study is how we can give a possible model for handling uncertain information. This model is worked out in the framework of DATALOG. The concept of a fuzzy knowledge-base will be defined as a quadruple of any background knowledge; a deduction mechanism; a connecting algorithm; and a decoding set of the program, which helps us determine the uncertainty level of the results. A possible evaluation strategy is also given.

Keywords: fuzzy knowledgebase, fuzzy DATALOG.

1. Introduction

A large part of human knowledge cannot be modelled by pure inference systems, because this knowledge is often ambiguous, incomplete and vague. When knowledge is represented as a set of facts and rules, this uncertainty can be handled by means of fuzzy logic.

A few years ago, [1] and [3] gave a possible combination of Datalog-like languages and fuzzy logic. These works introduced the concept of fuzzy Datalog by completing the Datalog rules and facts with an uncertainty level and an implication operator. In [2] an extension of fuzzy Datalog to fuzzy relational databases was given.

In parallel with these works, there has also been research on possible combinations of the Prolog language and fuzzy logic. Several solutions have arisen for this problem, proposing different methods for handling uncertainty. Most of them use the concept of similarity, but in various ways. Several papers deal with fuzzy unification and fuzzy resolution, for example [5], [9], [10], [12].

In this paper, continuing our former concept, we give another possible model for handling uncertain information, based on the extension of fuzzy Datalog. Following our former papers, in the next chapter we describe the concept of fuzzy Datalog; in the further chapters we discuss our new idea of a fuzzy knowledgebase. We build the model of a fuzzy knowledge-base as a quadruple of any kind of background knowledge; a deduction mechanism; a connecting algorithm; and a decoding set.

2. The fuzzy Datalog

A Datalog program consists of facts and rules. In fuzzy Datalog, we complete the facts with an uncertainty level, and the rules with an uncertainty level and an implication operator. Applying this operator, the uncertainty level of the rule's head can be determined from the uncertainties of the rule's body and of the rule. For example, if in the program

beautiful(’Mary’); 0.7
likes(’John’,x) ← beautiful(x); 0.8; I.

the implication operator is the Gödel operator (I(x,y) = 1 if x ≤ y; otherwise I(x,y) = y), then the level of the rule-head can be calculated as the minimum of the level of the rule-body and the level of the rule. For John and Mary this gives

likes(’John’,’Mary’), 0.7,

that is, John likes Mary at least at level 0.7.
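This computation can be sketched in a few lines of Python (an illustration assuming the Gödel operator; the helper names are ours, not the paper's):

```python
# With the Gödel implication I(x, y) = 1 if x <= y else y, the derived
# level of the rule head, max(0, min{gamma | I(body, gamma) >= beta}),
# reduces to min(body_level, rule_level).

def godel(x, y):
    """Gödel implication operator."""
    return 1.0 if x <= y else y

def head_level(body_level, rule_level):
    """Least gamma with godel(body_level, gamma) >= rule_level."""
    return min(body_level, rule_level)

# likes('John', x) <- beautiful(x); 0.8; I.  with (beautiful('Mary'), 0.7)
print(head_level(0.7, 0.8))  # 0.7 -> (likes('John','Mary'), 0.7)
```

As a sanity check, a brute-force search over a grid of gamma values recovers the same minimum as the closed form.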

EUSFLAT - LFA 2005


In [1], [2] an extension of Datalog-like languages to fuzzy relational databases is given. In these papers lower bounds of the degrees of uncertainty in facts and rules are used. This language is called fuzzy Datalog (fDATALOG). In this language the rules are completed with an implication operator and a level. We can deduce the level of a rule-head from the level of the body and the level of the rule by the implication operator of the rule. Now we summarise the concept of fDATALOG based on [1], [2].

Similarly to DATALOG programs, an fDATALOG program consists of rules and facts. The notion of a fuzzy rule is given in the definition below:

Definition 1.
An fDATALOG rule is a triplet (r; β; I), where r is a formula of the form

Q ← Q1,...,Qn   (n ≥ 0),

Q is an atom (the head of the rule), Q1,...,Qn are literals (the body of the rule); I is an implication operator and β ∈ (0,1] (the level of the rule).

To get a finite result, all the rules in the program must be safe, that is, all variables occurring in the head also occur in the body, and all variables occurring in a negative literal also occur in a positive literal.

An fDATALOG program is a finite set of safe fDATALOG rules.

There is a special type of rule, called a fact. A fact has the form A ← ; β; I. Further on we refer to facts as (A, β).

The semantics of fDATALOG is defined as the fixpoints of consequence transformations. Depending on these transformations we can define two semantics for fDATALOG, based on a deterministic and a nondeterministic transformation. Under the deterministic transformation the rules of a program are evaluated in parallel, while in the nondeterministic case the rules are considered independently, one after another. In [2] it is proved that, starting from the set of facts, both the deterministic and the nondeterministic transformation have a fixpoint, which is the least fixpoint (lfp) in the case of a positive program P. It was also proved that these fixpoints are models of P, so we could define these fixpoints as the deterministic and the nondeterministic semantics of fDATALOG programs.


For function- and negation-free fDATALOG, the two semantics are the same, but they differ if the program contains any negation. In this case the least fixpoint of the deterministic transformation is not always a minimal model, but the nondeterministic semantics is minimal under a certain condition. This condition is stratification. Stratification gives an evaluating sequence in which the negative literals are evaluated first. (Detailed in [2].)

Further on we deal only with the nondeterministic semantics. This transformation is the following:

Definition 2.
Let BP be the Herbrand base of the program P, and let F(BP) denote the set of all fuzzy sets over BP. The nondeterministic consequence transformation NTP : F(BP) → F(BP) is defined as

NTP(X) = {(A, αA)} ∪ X, where

(A ← A1,…,An; β; I) ∈ ground(P), (|Ai|, αAi) ∈ X, 1 ≤ i ≤ n,
αA = max(0, min{γ | I(αbody, γ) ≥ β}).

|A| denotes p(c) if either A = p(c) or A = ¬p(c), where p is a predicate symbol with arity k and c is a list of k ground terms.

Example 1.
Given the following fDATALOG program:

1. (r(a), 0.8).
2. p(x) ← r(x), ¬q(x); 0.6; I.
3. q(x) ← r(x); 0.5; I.
4. p(x) ← q(x); 0.8; I.

The stratification is P1 = {r, q}, P2 = {p}, so the evaluation order is 1., 3., 2., 4. Then, in the case of the Gödelian implication operator, the nondeterministic semantics of the program is lfp(NTP) = {(r(a), 0.8); (p(a), 0.5); (q(a), 0.5)}.
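The fixpoint of Example 1 can be replayed in a short sketch. It assumes, as is usual in fuzzy Datalog with negation, that a negated literal ¬A is taken at level 1 − level(A); the helper names are illustrative:

```python
# Bottom-up evaluation of Example 1 under the Gödelian operator,
# following the stratified order 1., 3., 2., 4.

def head_level(body, beta):
    # Gödel: max(0, min{gamma | I(body, gamma) >= beta}) = min(body, beta)
    return min(body, beta)

facts = {('r', 'a'): 0.8}                               # rule 1. (fact)

# stratum P1 = {r, q}: rule 3.  q(x) <- r(x); 0.5; I.
facts[('q', 'a')] = head_level(facts[('r', 'a')], 0.5)  # 0.5

# stratum P2 = {p}: rule 2.  p(x) <- r(x), ¬q(x); 0.6; I.
body2 = min(facts[('r', 'a')], 1 - facts[('q', 'a')])   # min(0.8, 0.5)
# rule 4.  p(x) <- q(x); 0.8; I.
body4 = facts[('q', 'a')]
facts[('p', 'a')] = max(head_level(body2, 0.6),
                        head_level(body4, 0.8))         # 0.5

print(sorted(facts.items()))
```

The result agrees with lfp(NTP) = {(r(a), 0.8); (p(a), 0.5); (q(a), 0.5)}.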

3. Background knowledge

The facts and rules of an fDATALOG program can be regarded as any kind of knowledge, but sometimes – as in the case of our model – we need other information in order to get an answer for a query. In this section we give a model of background knowledge. We define similarity between predicates and between constants, and these similarity structures will serve as the background knowledge.


Definition 3.
A similarity on a domain D is a fuzzy subset SD : D × D → [0,1] such that the following properties hold:

SD(x, x) = 1 for any x ∈ D (reflexivity)
SD(x, y) = SD(y, x) for any x, y ∈ D (symmetry).

In our model the background knowledge is a set of similarity sets:

Definition 4.
Let d ∈ D be any element of domain D. The similarity set of d is a fuzzy subset over D:

Sd = {(d1, λ1), (d2, λ2), …, (dn, λn)},

where di ∈ D and SD(d, di) = λi for i = 1,…,n.

Based on similarities we can construct the background knowledge, which is information about the similarity of terms and predicate symbols.

Definition 5.
Let T be any set of ground terms and P any set of predicate symbols. Let ST and SP be any similarities over T and P respectively. The background knowledge is:

Bk = {STt | t ∈ T} ∪ {SPp | p ∈ P}
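Definitions 3–5 can be sketched as follows; the dictionary encoding and the names are assumptions of this illustration, not part of the formalism:

```python
# A similarity is stored as a dict keyed by unordered pairs (frozensets),
# which enforces symmetry automatically; reflexivity S(d, d) = 1 is
# implicit. similarity_set(S, d, domain) builds the fuzzy subset S_d
# of Definition 4 (elements with similarity 0 are simply omitted).

def similarity_set(S, d, domain):
    """Fuzzy subset {(d_i, S(d, d_i))} over the domain."""
    return {di: (1.0 if di == d else S[frozenset((d, di))])
            for di in domain
            if di == d or frozenset((d, di)) in S}

# S_T over ground terms: 'Bach' and 'Vivaldi' are similar at level 0.9
ST = {frozenset(('Bach', 'Vivaldi')): 0.9}
terms = ['Bach', 'Vivaldi', 'Mary']

print(similarity_set(ST, 'Vivaldi', terms))  # {'Bach': 0.9, 'Vivaldi': 1.0}
```

Storing each pair once under an unordered key is one simple way to guarantee that the symmetry property of Definition 3 can never be violated by the data.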

4. Fuzzy knowledge-base

We have made two steps on the way leading to the concept of a fuzzy knowledgebase: we defined the concept of a fuzzy Datalog program and the concept of background knowledge. Now the question is: how can we connect this program with the background knowledge? In [4] the author gave a possible connecting algorithm, replacing all predicates and all ground terms of the program by their similarity sets. Now we show a new and different possibility: instead of modifying the program, we modify the evaluation transformation of the program.

To solve this problem, we define the concept of a modified fDATALOG program, which is the extension of the original one according to similarity, and the concept of decoding functions, which compute the final value of the uncertainty.

When an extended fDATALOG program is evaluated, the facts and the result atoms are completed with an uncertainty level. The final uncertainty level can be computed from this level and from the similarity values of the actual predicate and its arguments. It is to be expected that in the case of identity the level must be unchanged, but in other cases it should be less than or equal to the original level and to the similarity values. Furthermore we require the decoding function to be monotone increasing.

Definition 6.
A decoding function is an (n+2)-ary fuzzy function

ϕ(α, λ, λ1,…, λn) : (0,1] × (0,1] × … × (0,1] → [0,1]

such that

ϕ(α, λ, λ1,…, λn) ≤ min(α, λ, λ1,…, λn),
ϕ(α, 1, 1,…, 1) = α, and
ϕ(α, λ, λ1,…, λn) is monotone increasing in all arguments.

Example 2.

ϕ1(α, λ, λ1,…, λn) = min(α, λ, λ1,…, λn);
ϕ2(α, λ, λ1,…, λn) = min(α, λ, λ1 ⋅⋅⋅ λn);
ϕ3(α, λ, λ1,…, λn) = α ⋅ λ ⋅ λ1 ⋅⋅⋅ λn

are decoding functions.
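A quick numerical check (an illustrative sketch, not from the paper) confirms that ϕ1, ϕ2, ϕ3 satisfy the bound and identity axioms of Definition 6 on a grid of argument values:

```python
# Verify two axioms of Definition 6 for the three example decoding
# functions: phi <= min of the arguments, and phi(alpha, 1, ..., 1) = alpha.
# (Monotonicity also holds for all three but is not tested here.)

from itertools import product
from math import prod

phi1 = lambda a, l, *ls: min(a, l, *ls)
phi2 = lambda a, l, *ls: min(a, l, prod(ls))
phi3 = lambda a, l, *ls: a * l * prod(ls)

grid = [0.25, 0.5, 0.75, 1.0]
for phi in (phi1, phi2, phi3):
    for a, l, l1 in product(grid, repeat=3):
        assert phi(a, l, l1) <= min(a, l, l1)   # bounded by the minimum
    for a in grid:
        assert phi(a, 1.0, 1.0) == a            # identity case
print("axioms hold on the grid")
```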

We have to assign a decoding function to every predicate of the program. The set of decoding functions is the decoding set of the program:

Definition 7.
Let P be a fuzzy DATALOG program. The decoding set of P is:

ΦP = {ϕq(αq, λq, λt1,…, λtn) | ∀q(t1, t2,…, tn) ∈ BP}

Let P be a fuzzy Datalog program, Bk any background knowledge and ΦP the decoding set of P. Now we want to connect the program with the background knowledge. For this purpose we define the modified fDATALOG program, mP. This is the original program with a modified consequence transformation. The original consequence transformation is defined over the set of all fuzzy sets of P's Herbrand base, that is, over F(BP). To define the modified transformation's domain, let us extend P's Herbrand universe with all possible ground terms occurring in the background knowledge – so we get the modified Herbrand universe, mHP. Let the modified Herbrand base, mBP, be the set of all possible ground atoms whose predicate symbols occur in P ∪ Bk and whose arguments are elements of mHP. So:

Definition 8.
The modified consequence transformation mNTP : F(mBP) → F(mBP) is defined as

mNTP(X) = {(q(s1,...,sn), ϕp(α, λq, λs1,…, λsn)) | (q, λq) ∈ SPp; (si, λsi) ∈ STti, 1 ≤ i ≤ n} ∪ X,

where

(p(t1,...,tn) ← A1,...,Ak; β; I) ∈ ground(P), (|Ai|, αAi) ∈ X, 1 ≤ i ≤ k,
α = max(0, min{γ | I(αbody, γ) ≥ β}).

(|A| denotes r(c) if either A = r(c) or A = ¬r(c), where r is a predicate symbol with arity k and c is a list of k ground terms.)

Then, starting from the facts of the program and creating the powers of the transformation mNTP, we finally reach the fixpoint.

The next proposition can easily be proved:

Proposition 1.
In the case of a positive or stratified program P, mNTP has a least fixpoint.

It can be shown that this fixpoint is a model of P, but lfp(NTP) ⊆ lfp(mNTP), so it is not a minimal model.

Now we have everything together to define the concept of a fuzzy knowledge-base:

Definition 9.
A fuzzy knowledge-base (fKB) is a quadruple {Bk, P, ΦP, mA}, where Bk is a background knowledge, P is a fuzzy DATALOG program, ΦP is a decoding set of P and mA is any modifying (or connecting) algorithm.

Definition 10.
Let {Bk, P, ΦP, mA} be a fuzzy knowledge-base. Then lfp(mNTP) is the consequence of the knowledge-base, denoted by C(Bk, P, ΦP, mA).

Example 3.
Let us consider the following dialogue:

– Do you know whether Mary likes Bach?
– I think so, because her favourite composer is Vivaldi.

Now we try to give a computed answer for this question. Let us suppose that musicians generally love the good composers, Mary is fairly a music founder, and Vivaldi is nearly the favourite composer. Let us denote "love" by "lo", "likes" by "li", "good composer" by "gc", "favourite" by "fv", "musician" by "mu", and "music founder" by "mf". Then let the fDATALOG program, the background knowledge and the decoding set be as follows (according to the modifying algorithm, it is enough to consider only the decoding functions of the head predicates). Let the implication operator of the program be the Gödelian one (I(x,y) = 1 if x ≤ y; otherwise I(x,y) = y).

The program:

lo(x,y) ← gc(y), mu(x); 0.7; I.
(fv(V), 0.9).
(mf(M), 0.8).

The background knowledge (similarity values; pairs not listed are non-similar):

      lo    li           gc    fv           mu    mf
lo    1     0.8    gc    1     0.75   mu    1     0.6
li    0.9   1      fv    0.75  1      mf    0.6   1

      B     V     M
B     1     0.9
V     0.9   1
M                 1

The decoding set:

ϕlo := ϕ, ϕ(α,x,y,z) := min(α,x,y,z)
ϕfv := θ, θ(α,x,y) := α⋅x⋅y
ϕmf := ω, ω(α,x,y) := min(α, x⋅y)

Let us apply the modification algorithm and start from the facts {(fv(V),0.9), (mf(M),0.8)}. According to similarity we get:

{(fv(V), 0.9), (gc(V), θ(0.9, 0.75, 1) = 0.675), (fv(B), θ(0.9, 1, 0.9) = 0.81), (gc(B), θ(0.9, 0.75, 0.9) = 0.6075), (mf(M), 0.8), (mu(M), ω(0.8, 0.6, 1) = min(0.8, 0.6) = 0.6)}.

Applying the rule on the above set of facts:

lo(M,V) ← gc(V), mu(M); 0.7; I.
lo(M,B) ← gc(B), mu(M); 0.7; I.

we get:

(lo(M,V), min(0.675, 0.6, 0.7) = 0.6)
(lo(M,B), min(0.6075, 0.6, 0.7) = 0.6)

and according to similarity:

(li(M,V), ϕ(0.6, 0.8, 1, 1) = min(0.6, 0.8, 1, 1) = 0.6),
(li(M,B), ϕ(0.6, 0.8, 0.9, 1) = min(0.6, 0.8, 0.9, 1) = 0.6).

So the consequence of the knowledge-base is:

C(Bk, P, ΦP, mA) = {(fv(V), 0.9); (gc(V), 0.675); (fv(B), 0.81); (gc(B), 0.6075); (mf(M), 0.8); (mu(M), 0.6); (lo(M,V), 0.6); (lo(M,B), 0.6); (li(M,V), 0.6); (li(M,B), 0.6)}.

That is, Mary likes Bach at least at level 0.6.
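The whole chain of computations in Example 3 can be replayed in a short Python sketch. The data structures are illustrative, not the paper's implementation, and for the lo/li pair we use the similarity value 0.8, the one appearing in the computations:

```python
# Replaying Example 3: expand the facts by similarity with the decoding
# functions, apply the rule under the Gödelian operator, then expand the
# rule heads by similarity.

# similarity sets (background knowledge)
S_pred = {'fv': {'fv': 1.0, 'gc': 0.75},
          'mf': {'mf': 1.0, 'mu': 0.6},
          'lo': {'lo': 1.0, 'li': 0.8}}
S_term = {'V': {'V': 1.0, 'B': 0.9}, 'M': {'M': 1.0}}

# decoding functions of the head predicates
theta = lambda a, x, y: a * x * y            # phi_fv
omega = lambda a, x, y: min(a, x * y)        # phi_mf
phi   = lambda a, x, y, z: min(a, x, y, z)   # phi_lo

facts = {}
# expand the facts (fv(V), 0.9) and (mf(M), 0.8) by similarity
for q, lq in S_pred['fv'].items():
    for t, lt in S_term['V'].items():
        facts[(q, t)] = theta(0.9, lq, lt)
for q, lq in S_pred['mf'].items():
    for t, lt in S_term['M'].items():
        facts[(q, t)] = omega(0.8, lq, lt)

# rule lo(x,y) <- gc(y), mu(x); 0.7; I.  Gödel: head = min(body, 0.7)
for y in ('V', 'B'):
    alpha = min(facts[('gc', y)], facts[('mu', 'M')], 0.7)
    for q, lq in S_pred['lo'].items():        # similarity of the head
        for t, lt in S_term[y].items() if y == 'V' else [(y, 1.0)]:
            key = (q, 'M', t)
            # lambda_M = 1 since S_M = {(M, 1)}
            facts[key] = max(facts.get(key, 0.0), phi(alpha, lq, 1.0, lt))

print(round(facts[('li', 'M', 'B')], 4))  # 0.6
```

The sketch reproduces the consequence set of the example, in particular (li(M,B), 0.6): Mary likes Bach at least at level 0.6.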


5. Evaluation strategies

According to the previous example (especially in the case of enlarging the program with other facts and rules), it can be seen that the fixpoint query – that is, the bottom-up evaluation – may involve many superfluous calculations, because sometimes we want to give an answer to a concrete question and are not interested in the whole consequence. If a goal (query) is specified together with the fuzzy knowledgebase, it is enough to consider only the rules and facts necessary to reach the goal. In this section we deal with the top-down evaluation of the knowledgebase, which starts from the goal and applies the suitable rules and similarities to find the starting facts and rules required to answer this query.

A goal is a pair (q(t1, t2,…, tn), α), where q(t1, t2,…, tn) is an atom and α is the level of the atom. It is possible that among the arguments of q there are variables or constants, and α can be either a constant or a variable.

In some cases – in the top-down evaluation – the goal is evaluated through sub-queries. This means that all possible rules whose head can be unified with the given goal are selected, and the atoms of their bodies are considered as new sub-goals. This procedure continues until the facts are obtained.

[3] deals with the evaluation strategies of fuzzy Datalog. The top-down evaluation of fuzzy Datalog does not terminate on obtaining the facts, because we still need to determine the uncertainty level of the goal. The algorithm given in [3] calculates this level in a bottom-up manner: starting from the leaves of the evaluating graph, going backward to the root, and calculating the actual uncertainty level along the suitable path of this graph, we finally get the uncertainty level of the root.

In the present model we rather rely on the bottom-up evaluation, but the selection of the required starting facts takes place in a top-down way. As only the required starting facts are searched for, in the top-down part of the evaluation there is no need for the uncertainty levels, so we search only among the ordinary facts and rules. To do this, we need the concepts of substitution and unification, which are detailed for example in [3], [6], [11]. But now we sometimes also need other kinds of substitutions: substituting some predicate p or term t by their similarity sets Sp and St, and substituting some similarity sets by their members.

In what follows, for the sake of simpler terminology, we mean by goal, rules and facts these concepts without uncertainty levels. An AND/OR tree arises during the evaluation; this is the searching tree. Its root is the goal; its leaves are either YES or NO. The parent nodes of YES are the required starting facts. This tree is built up by an alternation of similarity-based and rule-based unification.

The rule-based unification unifies the sub-goals with the heads of suitable rules, and continues the evaluation with the bodies of these rules. This unification is special, because during it the similarity sets of terms are considered as ordinary constants, and a constant can be unified with its similarity set.

The similarity-based unification replaces the predicates of sub-goals by the members of their similarity sets, except for the first and the last unification. The first similarity-based unification unifies the ground terms of the goal with their similarity sets, and the last one unifies the similarity sets among the arguments of the resulting facts with their members.

The searching graph is built up according to its depth in the following way: if the goal is at depth 0, then the successors of any node at depth 3k+2 (k = 0, 1,…) are in AND connection; the others are in OR connection. In detail:

The successors of the goal g(t1, t2,…, tn) are all possible g’(t’1, t’2,…, t’n), where g’ ∈ Sg; t’i = ti if ti is a variable and t’i = Sti if ti is a ground term.

If the atom p(t1, t2,…, tn) is at depth 3k (k = 1, 2,…), then the successor nodes are all possible p’(t1, t2,…, tn), where p’ ∈ Sp.

If the atom L is at depth 3k+1 (k = 1, 2,…), then the successor nodes are the bodies of the suitably unified rules, or the unified facts if L is unifiable with any fact of the program, or NO if there is no unifiable rule or fact. That is, if the head of a rule M ← M1,...,Mn (n > 0) is unifiable with L, then the successors of L are M1θ,...,Mnθ, where θ is the most general unifier of L and M. If n = 0, that is, there is a fact in the program with the predicate of L, then the successors are the unified facts: if L = p(t1, t2,…, tn) and the program contains a fact with predicate p, then the successor nodes are all possible p(t’1, t’2,…, t’n), where t’i ∈ Sti if ti = Sti, or t’i = tiθ if ti is a variable and θ is a suitable unification.
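Two pieces of this construction can be sketched in a few lines: the AND/OR connection rule by depth, and the first similarity-based unification of the goal. The names and the leading-'?' variable convention are assumptions of this illustration:

```python
# connection(depth): successors of a node at depth 3k+2 are AND-connected,
# all others OR-connected. expand_goal: the goal's predicate is replaced
# by each member of its similarity set; each ground term is replaced by
# its similarity set; variables (marked with a leading '?') are kept.

def connection(depth):
    """How the successors of a node at this depth are connected."""
    return 'AND' if depth % 3 == 2 else 'OR'

def expand_goal(pred, args, S_pred, S_term):
    """Successor nodes of the goal g(t1,...,tn) at depth 0."""
    new_args = tuple(t if t.startswith('?') else frozenset(S_term[t])
                     for t in args)
    return [(p, new_args) for p in S_pred[pred]]

S_pred = {'li': {'li', 'lo'}}   # predicates similar to 'li'
S_term = {'B': {'B', 'V'}}      # terms similar to 'B'

# goal li(?x, B): two OR-connected successors, li(?x,{B,V}) and lo(?x,{B,V})
succ = expand_goal('li', ('?x', 'B'), S_pred, S_term)
print([connection(d) for d in range(6)])  # ['OR', 'OR', 'AND', 'OR', 'OR', 'AND']
```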

