Extension Information-Knowledge-Strategy System for Semantic Interoperability
Li Weihua
Guangdong University of Technology, Guangzhou, China
lw@gdut.edu.cn
Yang Chunyan
Guangdong University of Technology, Guangzhou, China
wyw@gdut.edu.cn
Abstract—This paper discusses the issue of information interoperability. Because semantic conflicts in information interoperability are difficult to resolve, the paper shows how to build an extension ontology model for information interoperability based on the Extenics information-knowledge-strategy system. Taking advantage of the strength of Extenics in solving contradiction problems, the extension system can eliminate semantic conflicts by means of extension transformations. It overcomes a drawback of current ontology models, which lack transformation mechanisms, and therefore supports information interoperability effectively. The paper presents examples of the semantic interoperability process with the extension system and describes some implementation technologies of the extension system.
Index Terms—information interoperability, semantic conflict, extension, ontology, model, implementation
I. INTRODUCTION
Information interoperability is the ability to meaningfully exchange information among separately developed systems, including the understanding of the information’s format, meaning, and quality [1].
Automated search engines, while being the most comprehensive tools for Web coverage, are particularly prone to inaccuracy. They often provide users with poor-quality or irrelevant Web information. Manually maintained classified directories, although intuitive to use and largely accurate, cover just a small fraction of the information available. Intelligent agency is a crucial tool in coping with the complexities of the information-rich problems imposed by the explosion of data residing on the Web. Intelligent and autonomous problem-solving agents can greatly facilitate users’ access to Web-available information sources [2]. However, automated processing of Web content requires explicit machine-processable semantics associated with those Web resources. That is why we need a “semantic” Web. The “Semantic Web” is a Web that includes documents, or portions of documents, describing explicit relationships between things and containing semantic information intended for automated processing by our machines [3]. If the Semantic Web vision comes true, information interoperability in Web mining tasks performed by agents will be much more effective.
Currently, agents that try to perform Web mining tasks do so by screen scraping: retrieving information by interpreting regularities in the layout of Web pages. They typically retrieve only limited information from the Web pages. Agents are faced with the problem of semantic interoperability, i.e., the difficulty of integrating resources that were developed using different vocabularies and different perspectives on the data. These differences are the reasons why semantic conflicts exist. To achieve semantic interoperability, agents must be able to exchange data in such a way that the precise meaning of the data is readily accessible and the data itself can be translated by any agent into a form that it understands.
If we try to retrieve information on the Web with intelligent agents, one basic component of the Semantic Web is useful: collections of information called ontologies.
Ontologies will be used to provide structured vocabularies that explicate the relationships between different terms, allowing intelligent agents (and humans) to interpret their meaning flexibly yet unambiguously. Ontologies must be described in some language, and many such description languages exist. OWL (Web Ontology Language) [4] is a new formal language for representing ontologies in the Semantic Web. It plays an important role in helping agents to process information in Web mining.
While ontologies can help agents to eliminate some kinds of semantic conflicts, other kinds of semantic conflicts still occur. Although ontologies are gradually being built for Web mining, we should not count on a uniform ontology for the Web. If we try to process information between different communities that have different ontologies, we need to tackle the problems of semantic conflicts due to ontology mismatch.
This work is supported by the Guangdong Provincial Natural Science Foundation (grant no. 05001832) and the National Natural Science Foundation of China (grant no. 70671031).
JOURNAL OF COMPUTERS, VOL. 3, NO. 8, AUGUST 2008
© 2008 ACADEMY PUBLISHER
There is no obvious method for dealing with the above problems, so different kinds of solutions have to be found. This paper proposes to use the extension information-knowledge-strategy system to eliminate different kinds of semantic conflicts in Web mining. The extension methods are an important part of Extenics [5], a new discipline that studies the extensibility of objects and the laws and methods of extension in order to solve contradiction problems. The paper is organized as follows. Section II describes information interoperability in Web mining. Section III presents the main concept of the extension ontology model. Section IV introduces the extension information-knowledge-strategy system. Section V gives examples of the semantic conflict elimination process using the extension methods for Web mining users. The following section describes some implementation technologies. The last section is the conclusion.
II. INFORMATION INTEROPERABILITY IN WEB MINING
Web mining is not just searching the pages of the World Wide Web, but also taking advantage of the numerous databases and other information repositories available on the Web. According to O. Etzioni, Web mining may be organized into the following three subtasks [6]:
• Resource discovery. According to Etzioni, this
subtask means locating unfamiliar documents and
services on the Web.
• Information extraction. It means automatically
extracting specific information from newly discovered
Web resources.
• Generalization. The third subtask means uncovering general patterns at individual Web sites and across multiple sites.
In the second subtask, Web miners need to dynamically extract information from unfamiliar resources where semantic conflicts may occur. Semantic conflict problems include the use of [7]:
• same terms for different concepts
• different terms for the same concepts
• semantically similar attributes which have different
meanings in their domains
• attributes which have different generalization and aggregation levels
• same attributes, but different data quality
requirements, e.g. accuracy
Intelligent agents can use test queries and domain-specific knowledge to learn descriptions of Web services to enable automatic information extraction. Realizing this potential requires a new integration of syntactic and semantic interoperability.
The key challenges of syntactic interoperability are [8]:
• identifying all the elements in various systems
• establishing rules for structuring these elements
• mapping, bridging, creating crosswalks between
equivalent elements using schemas etc.
• agreeing on equivalent rules to bridge different
cataloguing and registry systems.
The advent of XML leveraged a promising consensus
on the encoding syntax for machine-processable
information. However, XML does not support semantic
interoperability.
If there were effective semantic interoperability methods, intelligent agents could benefit from them. Currently, such agents are very sensitive to changes in the format of a Web page. Although these agents would not be affected by presentation changes if the pages were available in XML, they would still break if the XML representation of the data were changed slightly [9].
Gerhard Budin has identified six methods for semantic
interoperability [10]:
• Mapping methods based on conceptual specifications
(conceptual relations in hierarchies)
• XMI (Extensible Markup Language Metadata
Interchange) based approaches
• SQL based approaches
• RDF (Resource Description Framework) based
approaches
• Schema based approaches
• Description Logic based approaches
Let us discuss the first method in detail. This method uses ontologies to support mapping, since an ontology is an explicit specification of a conceptualization [11]. According to Gruber, ontologies
are a specification of the conceptualization and the
corresponding vocabulary used to describe a domain.
They can be used to describe the structure of semantics of
much more complex objects than common databases and
are therefore well-suited for describing heterogeneous,
distributed and semistructured information sources such
as found on the Web. These properties make ontologies
ideal for machine processing and enabling interoperation.
In fact, ontologies form the backbone of the Semantic
Web and are the key to enable automated interoperation
and collaboration [12].
However, as old contradictions are resolved, new ones
will arise. With the wide range of Web information
resources and more ontologies being built, it is unrealistic
to hope that there will be an agreement on one or even a
small set of ontologies [13]. Therefore, even though
ontologies can help to eliminate concept-level or
language-level semantic conflicts, there still exist
ontology-level semantic conflicts caused by ontology
mismatches which must also be eliminated.
For ontology mismatch problems, some researchers use ontology mapping methods [14], but there is no obvious mapping solution. N. F. Noy and M. A. Musen noted that the work of mapping, merging, or aligning ontologies is performed mostly by hand, without any tools to automate the process fully or partially. They developed and implemented PROMPT, an algorithm that provides a semi-automatic approach to ontology merging and alignment [15].
We try to find a method that can eliminate different
kinds of semantic conflicts including conflicts caused by
ontology mismatches. Since semantic conflicts are
contradiction problems, we introduce the extension
methodology [16] to solve them.
III. EXTENSION ONTOLOGY MODEL
Building ontologies requires some kinds of ontology
description language. OWL [4] suggested by W3C can
describe object types, classes, and existing entities in the
world. It can also describe attributes of these classes,
types, and instances. For example, to describe “Virus A is a living thing”, the OWL description is:
<!-- rdf:ID values may not contain spaces, so the names are written as VirusA and LivingThing -->
<owl:Class rdf:ID="VirusA">
  <rdfs:subClassOf rdf:resource="#LivingThing"/>
</owl:Class>
This concept can also be expressed with a matter-element in Extenics. The founder of Extenics, Cai Wen, put forward the concept of the “matter-element” [17], which combines quality and quantity: it is an ordered triple of a matter O, its characteristic c, and its measure v with respect to that characteristic, denoted by M=(O, c, v). Therefore, we express “Virus A is a living thing” as:
M1 =(virus A, subclass, living thing)
This is equivalent to the owl:Class description in OWL.
For the owl:disjointWith description, we can use a relation-element in Extenics instead:
\[
R_1 = \begin{bmatrix}
\text{intersect relation}, & \text{first item}, & \text{living thing } A_1 \\
 & \text{second item}, & \text{program } B_2 \\
 & \text{result}, & \Phi
\end{bmatrix}
\]
This means A1 and B2 are disjoint.
For the owl:ObjectProperty description, we can use an affair-element in Extenics instead:
\[
A_1 = \begin{bmatrix}
\text{damage}, & \text{controlled object}, & \text{living thing } A_1 \\
 & \text{execute object}, & M_1
\end{bmatrix}
\]
This means virus A damages living thing A1.
The following part of an ontology is described in OIL (Ontology Inference Layer):
ontology-definitions
slot-def eats
  inverse is-eaten-by
slot-def has-part
  inverse is-part-of
  properties transitive
class-def animal
class-def plant
disjoint animal plant
class-def defined carnivore
  subclass-of animal
  slot-constraint eats
    value-type animal
class-def defined herbivore
  subclass-of animal
  slot-constraint eats
    value-type (plant or (slot-constraint is-part-of has-value plant))
disjoint carnivore herbivore
…
We can use basic-elements (matter-element, relation-
element, affair-element) and complex-elements in
Extenics to build this part of ontology:
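\[
R_1 = \begin{bmatrix}
\text{intersect relation}, & \text{first item}, & \text{animal } A_1 \\
 & \text{second item}, & \text{plant } B_2 \\
 & \text{result}, & \Phi
\end{bmatrix}
\]
M1 = (carnivore A, subclass, animal)
M2 = (herbivore B, subclass, animal)
\[
A_1 = \begin{bmatrix}
\text{eat}, & \text{controlled object}, & \text{animal } A_1 \\
 & \text{execute object}, & M_1
\end{bmatrix}
\]
\[
A_2 = \begin{bmatrix}
\text{eat}, & \text{controlled object}, & \text{plant } B_2 \\
 & \text{execute object}, & M_2
\end{bmatrix}
\]
\[
R_2 = \begin{bmatrix}
\text{intersect relation}, & \text{first item}, & A_1 \\
 & \text{second item}, & A_2 \\
 & \text{result}, & \Phi
\end{bmatrix}
\]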
Therefore, we can build the extension ontology model with extension theory (Extenics).
What is the difference between current ontology models and our extension ontology model? The latter can perform extension transformations to solve contradiction problems, because our ontology model is composed of basic-elements and complex-elements that have extensibility. Extensibility is the key idea of Extenics. We will discuss it in the following section.
IV. EXTENSION INFORMATION-KNOWLEDGE-STRATEGY
SYSTEM
Extenics is a new discipline that studies rules and
methods of solving contradiction problems by employing
formalized tools, i.e. qualitative analysis and quantitative
analysis. Extenics has three important parts as follows:
• The basic-element theory. Basic-element concept is
the cornerstone of Extenics.
• The extension set theory. Extension set differs from
classical set and fuzzy set.
• Extension Logic. It combines the dialectical logic and
the formal logic.
The basic method of Extenics is called the extension methodology. The extension methodology acts as a “bridge” between extension theory (Extenics) and its actual application. Applications of the extension methodology in individual fields are called extension engineering methods.
Extension theory and its methodology can form an
information-knowledge-strategy system [18]. Basic-
elements describe information. Extension rules express
knowledge. Extension strategy generation methods can
generate strategy from extension information and/or
knowledge [19].
How can we use Extenics to solve contradiction
problems? The founder of Extenics, Cai Wen, pointed out
that we should consider the changeability of matters and
their characteristics [17]. He studied the changeability of
matters and the laws of their changes to see how to turn
contradiction problems into compatible ones. The
changeability of matters is called the extensibility of matter-elements, and the changes of matters are described by transformations of matter-elements. Matter-element transformations provide us with a feasible tool for solving contradiction problems.
Let us cite an example. Weighing an elephant of several tons with a steelyard that can weigh only one hundred kilograms is a contradiction problem. Let matter-elements M0 and m be:
M0 = (elephant A, weight, x kg)
m = (steelyard B, measuring capacity, y kg)
M0 is the aim that we want to realize, and m is the condition we have now. According to Extenics, we may change M0 or m to find the solution. Proposition 2 of the divergence of matter-elements states that one characteristic can be shared by countless matters, which is denoted by:
(O, c, v) ┤ {(O1, c, v1), (O2, c, v2),⋯, (On, c, vn)}
The ┤ symbol means “can be extended to”.
This Proposition means: If a contradiction problem
cannot be solved by a matter with the characteristic c, it
may be solved by another matter with the same
characteristic c.
Let’s change M0 into matter-element M:
M = (stones D, weight, xkg)
where D is decomposable. Decomposing D into D1, D2, …, Dn, we get:
M1 = (stone D1, weight, x1kg)
M2 = (stone D2, weight, x2kg)
…
Mn = (stone Dn, weight, xnkg)
where x1+ x2+…+xn=x and stone D1, D2, …, Dn can be
weighed by the steelyard B. We say M1, M2, …, Mn are
compatible with m. Therefore, the contradiction problem
has been solved.
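The decomposition step above can be illustrated with a small numeric sketch. The paper leaves x and y symbolic, so the two weights below are made-up values, and `decompose` and `is_compatible` are hypothetical helper names. The sketch only shows that every part M_i respects the condition m while the parts still sum to x.

```python
# Illustrative values only: the paper leaves x and y symbolic.
ELEPHANT_WEIGHT_KG = 3000     # x in M0 = (elephant A, weight, x kg)
STEELYARD_CAPACITY_KG = 100   # y in m = (steelyard B, measuring capacity, y kg)


def decompose(total_kg, capacity_kg):
    """Split a total weight into parts that each fit the measuring capacity."""
    if total_kg < 0 or capacity_kg <= 0:
        raise ValueError("total must be non-negative and capacity positive")
    full_parts, remainder = divmod(total_kg, capacity_kg)
    parts = [capacity_kg] * full_parts
    if remainder:
        parts.append(remainder)
    return parts


def is_compatible(part_kg, capacity_kg):
    """A part M_i is compatible with m when the steelyard can weigh it."""
    return part_kg <= capacity_kg


# x1, x2, ..., xn for the stones D1, D2, ..., Dn
stone_weights = decompose(ELEPHANT_WEIGHT_KG, STEELYARD_CAPACITY_KG)
```

Summing `stone_weights` recovers the original weight x, which is the property the transformation relies on.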
Of course, the key to solving this problem is to find stones that together are as heavy as the elephant. Historically, a person called Tsao Chung put the elephant into a boat and marked the water line, then substituted stones for the elephant in the boat until the same water line was reached. By weighing the stones he solved the problem successfully.
Now we try to use Extenics to eliminate semantic conflicts in Web mining. Suppose an agent meets “Virus damages my computer” and “Virus kills fowl” in different Web pages; it could be confused. According to the basic-element theory, we represent the term “virus” by matter-elements:
M1=(virus A, subclass, program)
M2=(virus B, subclass, biology)
Now the agent concludes that “M1 damages my computer” and “M2 kills fowl” have no semantic conflict.
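A minimal sketch of how an agent might use these two matter-elements is given below. It assumes the agent has some background knowledge linking the affected thing to a superclass; the `CONTEXT_HINTS` table and the `resolve` function are hypothetical and are not part of the paper.

```python
from collections import namedtuple

# M = (O, c, v): object, characteristic, measure.
MatterElement = namedtuple("MatterElement", ["obj", "characteristic", "measure"])

# The two senses of "virus" from the example above.
SENSES = [
    MatterElement("virus A", "subclass", "program"),
    MatterElement("virus B", "subclass", "biology"),
]

# Hypothetical background knowledge: the superclass suggested by the affected thing.
CONTEXT_HINTS = {"computer": "program", "fowl": "biology"}


def resolve(term, affected_thing):
    """Return the matter-element for `term` whose measure fits the context, or None."""
    wanted = CONTEXT_HINTS.get(affected_thing)
    for sense in SENSES:
        if sense.obj.startswith(term) and sense.measure == wanted:
            return sense
    return None
```

With this table, `resolve("virus", "computer")` selects M1 and `resolve("virus", "fowl")` selects M2, so the two sentences no longer share one ambiguous term.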
We can regard this method as one of the RDF based approaches in Section II, but our method has a better theoretical basis. In addition, the extension methodology includes the divergent tree method, the decomposition and combination chains method, the correlative net method, the implication system method, the conjugate pair method, and so on. These different kinds of extension methods for solving contradiction problems are suitable for eliminating ontology-level semantic conflicts.
Our extension information-knowledge-strategy system for information interoperability consists of three layers. The information layer uses basic-elements to represent objects and their characteristics. The knowledge layer is the extension ontology model. The strategy layer is the semantic conflict elimination process, which will be described in the next section.
As written by Noy, a partial list of ontology-level
mismatches includes using the same linguistic terms to
describe different concepts; using different terms to
describe the same concept; using different modeling
paradigms (e.g., using interval logic or points for
temporal representation); using different modeling
conventions and levels of granularity; having ontologies with differing coverage of the domain; and so on [13]. We can use the extension information-knowledge-strategy system to solve them.
For the “using the same linguistic terms to describe different concepts” problem, we can extend terms to basic-elements. The basic-element concept stems from the matter-element concept. Its expression is the ordered triple of object, characteristic, and corresponding measure. Different objects should have different characteristics or corresponding measures, so we can distinguish the terms.
For the “using different terms to describe the same concept” problem, we can list all characteristics of the basic-elements corresponding to these different terms to show what they have in common.
For the “using different modeling paradigms” problem,
we can choose the transition transformation in extension
methodology to eliminate mismatches of the different
ontologies.
For the “using different modeling conventions and
levels of granularity” problem, we can use the
decomposition and combination chains in extension
methodology to change the conventions and decompose
or integrate the granularity.
For the “having ontologies with differing coverage of
the domain” problem, we can solve it by the universe of
discourse transformation in extension methodology [20].
Each of these semantic conflict elimination processes
cannot be easily described in a few words. We will
describe some examples of these elimination processes in
detail in the following section as case studies.
V. SEMANTIC CONFLICTS ELIMINATION PROCESS WITH
THE EXTENSION SYSTEM
The semantic conflicts elimination process is the
strategy layer in our information-knowledge-strategy
system for information interoperability. We suggest the
process of semantic conflict elimination be as follows:
• analyze what kind of conflict occurs.
• (if necessary) represent objects for different concepts
by basic-elements.
• choose suitable extension methods to eliminate the
conflict.
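The first and third steps can be sketched as a simple lookup from the kind of mismatch to the extension method that Section IV proposes for it. The enum and function names below are hypothetical; only the pairing of mismatch kinds with methods comes from the text, and how the conflict kind is detected in the first place is left open.

```python
from enum import Enum


class Mismatch(Enum):
    """Ontology-level mismatch kinds listed in Section IV (names are illustrative)."""
    SAME_TERM_DIFFERENT_CONCEPTS = 1
    DIFFERENT_TERMS_SAME_CONCEPT = 2
    DIFFERENT_MODELING_PARADIGMS = 3
    DIFFERENT_CONVENTIONS_OR_GRANULARITY = 4
    DIFFERENT_DOMAIN_COVERAGE = 5


# Pairing taken from the text of Section IV.
METHOD_FOR = {
    Mismatch.SAME_TERM_DIFFERENT_CONCEPTS: "extend terms to basic-elements",
    Mismatch.DIFFERENT_TERMS_SAME_CONCEPT: "list and compare characteristics of the basic-elements",
    Mismatch.DIFFERENT_MODELING_PARADIGMS: "transition transformation",
    Mismatch.DIFFERENT_CONVENTIONS_OR_GRANULARITY: "decomposition and combination chains",
    Mismatch.DIFFERENT_DOMAIN_COVERAGE: "universe of discourse transformation",
}


def choose_method(kind):
    """Step 3 of the process: pick the extension method for an analyzed conflict kind."""
    return METHOD_FOR[kind]
```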
Example 1. When an agent visits some Web pages, it finds that one page contains the sentence “I use my PC to browse Web pages” while another page contains the sentence “I use my desktop machine to browse Web pages”. The agent could report a conflict.
Now we start our conflict elimination process. First, we identify this as the “using different terms to describe the same concept” problem.
Then we represent term “PC” and term “desktop
machine” by complex-element Z:
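\[
Z = \begin{bmatrix}
O, & c_1, & v_1 \\
 & c_2, & v_2 \\
 & c_3, & A_i
\end{bmatrix}
\]
where
\[
A_i = \bigvee_{j=1}^{n} A_{ij} = \bigvee_{j=1}^{n} (d_{ij}, b_{ij}, u_{ij}), \quad i = 1, 2, \ldots, m
\]
is a logic OR combination of affair-elements.
\[
M_1 = \begin{bmatrix}
\text{PC}, & \text{volume}, & \langle x_1, x_2 \rangle \\
 & \text{weight}, & \langle y_3, y_4 \rangle \\
 & \text{function}, & A_1
\end{bmatrix}
\]
\[
\begin{aligned}
A_1 = \bigvee_{j=1}^{n} A_{1j} &= \bigvee_{j=1}^{n} (d_{1j}, b_{1j}, u_{1j}) \\
&= (\text{run}, \text{controlled object}, \text{program}) \\
&\quad \lor (\text{process}, \text{controlled object}, \text{data}) \\
&\quad \lor \cdots \lor (\text{browse}, \text{controlled object}, \text{Web page})
\end{aligned}
\]
\[
M_2 = \begin{bmatrix}
\text{desktop machine}, & \text{volume}, & \langle x_1, x_2 \rangle \\
 & \text{weight}, & \langle y_3, y_4 \rangle \\
 & \text{function}, & A_2
\end{bmatrix}
\]
\[
\begin{aligned}
A_2 = \bigvee_{j=1}^{n} A_{2j} &= \bigvee_{j=1}^{n} (d_{2j}, b_{2j}, u_{2j}) \\
&= (\text{run}, \text{controlled object}, \text{program}) \\
&\quad \lor (\text{process}, \text{controlled object}, \text{data}) \\
&\quad \lor \cdots \lor (\text{browse}, \text{controlled object}, \text{Web page})
\end{aligned}
\]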
The next step is choosing a suitable extension method for the problem. We choose the divergent tree method. We summarize the divergence of matter-elements as “one matter has many characteristics; a characteristic is shared by many matters; a measure can be used to describe many matters”. The method that employs divergence to solve knowledge-oriented problems and implementation-oriented problems is called the divergent tree method. Proposition 4 in this method states:
A characteristic-element (c, v) can be shared by many matters, which is denoted as:
(O, c, v) ┤ {(O1, c, v), (O2, c, v),⋯, (On, c, v)}
It tells us that two objects may have the same characteristics and corresponding measures. We have an algorithm with this function: given n characteristic-elements (ci, vi) (i=1,2,…,n) of an object and the universe of discourse U, the algorithm can find other objects O' that satisfy ci(O')=vi (i=1,2,…,n). With this help, the agent can check that “PC” and “desktop machine” are the same object.
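The paper does not list the algorithm itself, so the following is only a minimal sketch of the function it describes, assuming the universe of discourse U is available as a mapping from object names to their characteristics. The function name, the sample objects, and the placeholder measures are all assumptions made for illustration.

```python
def find_matching_objects(universe, characteristic_elements):
    """Return the names of all objects O in `universe` with c_i(O) == v_i for every i."""
    return sorted(
        name
        for name, characteristics in universe.items()
        if all(
            c in characteristics and characteristics[c] == v
            for c, v in characteristic_elements
        )
    )


# Hypothetical universe of discourse U; the measures are placeholder values.
FUNCTIONS = frozenset({"run program", "process data", "browse Web page"})
U = {
    "PC": {"volume": ("x1", "x2"), "weight": ("y3", "y4"), "function": FUNCTIONS},
    "desktop machine": {"volume": ("x1", "x2"), "weight": ("y3", "y4"), "function": FUNCTIONS},
    "steelyard": {"weight": ("y5", "y6"), "function": frozenset({"weigh goods"})},
}

# Characteristic-elements (c_i, v_i) of the object "PC".
pc_elements = list(U["PC"].items())
matches = find_matching_objects(U, pc_elements)
print(matches)  # ['PC', 'desktop machine']
```

Because “desktop machine” is returned for the characteristic-elements of “PC”, an agent using such a lookup could treat the two terms as naming the same object. A linear scan is used here for clarity; a real universe of discourse would likely need indexing.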
Example 2. Suppose an agent searches an ontology and gets “The voltage of an electrical appliance is 220 Volt”, while from another ontology it gets “The voltage of an electrical appliance is 240 Volt”. The agent could report a conflict.
We believe this conflict belongs to the “having ontologies with differing coverage of the domain” problem. In China’s electricity system, the standard voltage for civil use is 220V. The former ontology may cover the China market domain while the latter may cover the UK market domain.
We can eliminate this ontology-mismatch conflict by the universe of discourse transformation method in extension methodology. In classical sets and fuzzy sets, the universe of discourse is fixed. This is suitable for search. However, it also limits our thinking. To solve contradiction problems, we may extend the original set to find solutions. That is why Extenics tries to solve contradiction problems under a changeable universe of discourse.
The universe of discourse U has five basic
transformations:
• Replacement transformation
TU=U’
• Increasing or decreasing transformation
TU=U⊕U1 or
TU=U⊖U1, U⊃U1
• Expansion or contraction transformation
TU =αU
• Decomposition transformation
TU ={U1, U2, …, Un}
• Duplicating transformation
TU ={U, U *}
We choose the increasing transformation so that the new universe of discourse becomes U⊕U1, where U1 stands for the overseas market domain. Now the universe of discourse covers the international market domain. The corresponding ontology merges the two parts, U and U1, of the specification of the conceptualization and the corresponding vocabulary for the new domain. When the agent searches the extended ontology, it can conclude that 220V is suitable for use in China but 240V is not. By this method it eliminates the original conflict.
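The increasing transformation in this example can be sketched as follows, under the assumption that each universe of discourse is modeled as a mapping from a market domain to the facts its ontology states. The dictionary layout and the names `increase` and `suitable_voltage` are hypothetical, and U1 holds only the UK domain mentioned in the example rather than the whole overseas market.

```python
# Each universe of discourse maps a market domain to the facts its ontology states.
U = {"China": {"standard voltage (V)": 220}}
U1 = {"UK": {"standard voltage (V)": 240}}  # stands in for the overseas market domain


def increase(universe, addition):
    """Increasing transformation TU = U (+) U1, modeled here as a union of domains."""
    overlap = universe.keys() & addition.keys()
    if overlap:
        raise ValueError(f"domains already covered: {sorted(overlap)}")
    merged = dict(universe)
    merged.update(addition)
    return merged


def suitable_voltage(universe, domain, voltage):
    """Check a voltage against the standard recorded for one market domain."""
    return universe[domain]["standard voltage (V)"] == voltage


extended = increase(U, U1)
```

Once the voltage statements are qualified by their domain in `extended`, the 220V and 240V statements no longer contradict each other, which mirrors the conclusion drawn above.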
There are many kinds of extension methods. We
believe they are suitable to eliminate different kinds of