Ontology Development for Machine Translation:
Ideology and Methodology
Kavi Mahesh
MCCS-96-292
Computing Research Laboratory
New Mexico State University
The Computing Research Laboratory was established by the
New Mexico State Legislature
under the Science and Technology Commercialization Commission
as part of the Rio Grande Research Corridor.
Contents

1 Introduction to the Mikrokosmos Ontology 1
  1.1 The MT Situation 2
  1.2 What Is an Ontology? 5
  1.3 A Situated Ontology 6
  1.4 What Does the Ontology Do for NLP? 7
  1.5 Computational Ontologies for NLP 9
  1.6 Our Product: The Mikrokosmos Ontology 10
2 The Structure of the Ontology 11
  2.1 Slots and Facets 17
    2.1.1 Special Slots 18
  2.2 Subgraph Patterns in the Ontology 19
    2.2.1 The RELATION Pattern 19
    2.2.2 The ATTRIBUTE Pattern 21
  2.3 A Taxonomy of Symbols 24
    2.3.1 Characteristics of Literal Symbols 24
  2.4 The Need for Nonmonotonic Inheritance 25
  2.5 Other Structural Problems 26
  2.6 Complex Events: A Proposal and its Rejection 29
    2.6.1 Implications of Introducing Complex Concepts 31
    2.6.2 Ontological Instances 32
3 Principles of Ontology Design 32
  3.1 Ontology Development: Desiderata 33
  3.2 Limited Expressiveness and Usability 35
  3.3 Structural Principles and Why We Violate Them 35
4 Ontology as a Sharable Resource 36
  4.1 Sharability and the Ten Commitments 37
  4.2 Ontologies for MT vs. Encyclopedia 41
5 Ontological Redundancy: A Theoretical Problem 41
  5.1 Duality between States and PROPERTYs 42
    5.1.1 Duality between OBJECTs and PROPERTYs 42
    5.1.2 Duality between EVENTs and PROPERTYs 42
    5.1.3 On the Need for Properties in the Ontology 43
  5.2 Duality between OBJECTs and EVENTs 44
6 Constraints on Concept Representation 45
  6.1 PROPERTYs are Not Stand-Alone Concepts 45
  6.2 PROPERTYs Cannot be Fillers in Other Slots of OBJECTs and EVENTs 45
  6.3 Binary RELATIONs Only 46
  6.4 Literal and Scalar ATTRIBUTEs 46
  6.5 All Scoping Is over Concepts 47
7 Ontology Acquisition: Situated Development 47
  7.1 Approaches to Ontology Acquisition 47
  7.2 The Situated Methodology 48
  7.3 Technology for Ontology Development 50
    7.3.1 The Customer Support Interface 52
  7.4 Distinctions between the Ontology and the Onomasticon 52
  7.5 Case Study: What Concepts to Add to the Ontology 54
  7.6 Guidelines 55
    7.6.1 Guidelines for Deciding What Concepts to Put in 59
    7.6.2 General Guidelines and Constraints 60
    7.6.3 Guidelines for Naming Concepts 60
8 The Development of the Mikrokosmos Ontology 61
  8.1 Quality Improvement 62
    8.1.1 Ways of Controlling the Quality of the Ontology 63
Acknowledgements 64
References 65
Appendix A: BNF for the Mikrokosmos Ontology 68
Appendix B: Axiomatic Specification of the Mikrokosmos Ontology 71
List of Figures

1 The Role of the Ontology in Mikrokosmos 3
2 The Ontology Situated in the Mikrokosmos NLP Architecture. It Supplies Conceptual Knowledge both for Lexical Representation and for Constraining Semantic Interpretation. 4
3 Top-Level Hierarchy of the Mikrokosmos Ontology Showing the First Three Levels of the OBJECT, EVENT, and PROPERTY Taxonomies. 12
4 A Snapshot of the ORGANIZATION Hierarchy Under OBJECT in the µK Ontology. 13
5 A Snapshot of the SOCIAL-EVENT Hierarchy in the µK Ontology. 14
6 A Snapshot of the RELATION Hierarchy Under PROPERTY in the µK Ontology. 15
7 Frame Representation for Concept ACQUIRE. Also Shown is a Part of the Lexical Entry for the Spanish Verb ``adquirir'' with Semantic Mappings to ACQUIRE and LEARN Events. The Mappings Modify the Constraints in the Ontology and Add New Information such as Aspect. 16
8 The RELATION Pattern. 20
9 Complex RELATION Patterns 22
10 The ATTRIBUTE Pattern. 23
11 Pattern to Produce Overlapping Subclasses. 27
12 Overlapping Subclasses: Intended Meaning. 28
13 Overlapping Subclasses: Possible Misinterpretation. 28
14 Disjoint Subclasses: Intended Meaning. 29
15 CORPORATION Concept in a Taxonomy Only Ontology. 39
16 CORPORATION Concept in the µK Ontology Showing Non-Taxonomic Links. 39
17 Searching for Concepts Related to FOOD. 40
18 Knowledge Acquisition in the Mikrokosmos Situation. 51
19 The Customer Interface: Main Window. 53
20 Words for SWIM and FLOAT in Different Languages 54
21 SWIM and FLOAT: The ``Agentivity'' Dimension. 54
22 SWIM and FLOAT: The Specific Gravity Dimension. 56
23 SWIM and FLOAT: The Vertical/Horizontal Dimension. 56
24 SWIM and FLOAT: The Upward/Downward Dimension. 57
25 A Classification of MOTION-EVENTs in the µK Ontology. 58
26 Rate of Growth of the µK Ontology. 62
Ontology Development for MT: Ideology and Methodology
Kavi Mahesh
Computing Research Laboratory
New Mexico State University
Las Cruces, NM 88003-8001
Ph: (505) 646-5466 Fax: (505) 646-6218
mahesh@crl.nmsu.edu
Abstract
In the Mikrokosmos approach to knowledge-based machine translation, lexical representation of
word meanings as well as text meaning representation is grounded in a broad-coverage ontology of the
world. We have developed such a language-neutral ontology for the purpose of machine translation in
the situation of the Mikrokosmos project. In order to acquire a fairly large ontology with limited time and
manpower resources, a number of problems in knowledge representation, natural language semantics,
and software technology were solved by taking a practical, situated approach which does not always
produce formally clean or provably correct solutions. The ontology thus developed has already proven
to be of great value in almost every stage of natural language processing and linguistic knowledge
acquisition and has potential applications in many other fields as well. In this report, we describe the
Mikrokosmos ontology and the ideology behind it and present some of the practical solutions adopted to
solve a range of problems. We also present concrete details of the methodology for concept acquisition
that evolved from our exercises in machine translation.
1 Introduction to the Mikrokosmos Ontology
This report documents the methodology we followed in developing a broad-coverage ontology for use
primarily in natural language processing (NLP) and machine translation (MT). The report describes our
product, the Mikrokosmos (µK) ontology, and addresses a number of issues that arose during its acquisition.
A common thread running through our solutions to the many problems in ontological engineering is a
strong preference for submitting to practical considerations as dictated by the entire situation of our MT
project. The technology we use for ontology development as well as a detailed set of guidelines and sample
scenarios for concept acquisition are documented in a separate report (Mahesh and Wilson, forthcoming).
Extensive documentation on the µK ontology is also available on the World Wide Web at the URL
http://crl.nmsu.edu/users/mahesh/onto-intro-page.html [1]
This report explains our solutions to problems in ontological engineering in a somewhat informal
manner and with the help of concrete illustrations taken from our experience. The reader is encouraged to
refer to the BNF specification in Appendix A and the axiomatic specification in Appendix B at the end of
the report for precise formulations of the structure and semantics of the
µK ontology.
[1] The µK ontology is available to interested researchers in a variety of forms including C++ objects, Lisp-like lists, or plain ASCII text. Interfaces for browsing the ontology and searching for concepts are also available in several forms. Further information may be obtained from the author or on the World Wide Web at the above address.
1.1 The MT Situation
Knowledge-Based Machine Translation (KBMT) views the task of translating a text in one natural language
to a different natural language as a problem in mapping meanings. Meanings in the source language,
rather than just its syntax, must be mapped to the target language. A basic requirement for doing this
is a representation of text meaning which is not in the source or target languages. This interlingual
meaning representation must be grounded in a language-independent ontology that supplies not only the
primitive symbols that constitute the representation but also a set of well-defined composition operations
for combining different primitive symbols to represent bigger chunks of meaning.
Mikrokosmos (µK) is a KBMT system under development at New Mexico State University jointly with the US Department of Defense (Beale, Nirenburg, and Mahesh, 1995; Mahesh and Nirenburg, 1995a, 1995b; Onyshkevych and Nirenburg, 1994). Unlike many previous projects in interlingual MT, µK is
a large-scale, practical MT system focusing currently on translating Spanish news articles to English. A
lexicon of approximately 7,000 Spanish words supported by an ontology of about 4500 concepts is already
in place. High quality semantic analyses of 5 article-length Spanish texts in the domain of company mergers and acquisitions have already been produced.
The role of the ontology in
µK is shown in Figure 1. The ontology sits centrally serving at least three
different purposes. It provides concepts for representing word meanings in the lexicon for a source or target
language. The figure shows an analysis lexicon where words are mapped to concepts in the ontology along
with any modifiers that augment the selectional constraints represented by the conceptual relationships in
the ontology. Such modification enables the system to capture the many nuances of meanings in different
languages. Every lexicon in the
µK system, irrespective of its language and whether it is constructed for
analysis or for generation, maps to the same set of concepts in the ontology.
The network of concepts in the ontology also serves the role of a semantic network by providing the
search space for a powerful search mechanism (called Onto-Search in Figure 1) that finds the shortest path
in the ontology between a pair of concepts. Measures of such ‘‘distances’’ between concepts are very
useful for checking selectional constraints in service of basic disambiguation processes in the semantic
analyzer. Moreover, the search also supports far more powerful ‘‘semantic affinity’’ tests for the purpose of
interpreting various non-literal expressions (such as metaphor and metonymy) and anomalous inputs using
special purpose microtheories.
Concepts in the ontology are the building blocks used by the lexicon (statically) and the analyzer
(dynamically) to construct Text Meaning Representations (TMRs). A TMR can be viewed as an instantiation
of a subgraph of the ontology (typically much smaller than the entire ontology) with certain linguistic
augmentations (such as aspect, attitudes, modalities, and so on). TMRs serve as the interlingual
representation of meanings that are fed to target language generators. The role of the ontology in text
generation is not shown in Figure 1 or discussed in this report.
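To make the notion of instantiating a subgraph concrete, here is a minimal Python sketch; the frame layout, instance-naming scheme, and aspect values are illustrative and are not the actual µK data structures or TMR format.

    # Sketch: build a tiny TMR by instantiating ontology concepts, linking the
    # instances, and adding a linguistic augmentation (aspect). Illustrative data.
    ONTOLOGY = {
        "ACQUIRE":     {"IS-A": ["TRANSFER-POSSESSION"],
                        "AGENT": {"Sem": ["HUMAN"]},
                        "THEME": {"Sem": ["OBJECT"]}},
        "CORPORATION": {"IS-A": ["ORGANIZATION"]},
    }
    counters = {}

    def instantiate(concept):
        """Create a TMR instance frame named CONCEPT-n."""
        counters[concept] = counters.get(concept, 0) + 1
        return {"name": f"{concept}-{counters[concept]}", "INSTANCE-OF": concept}

    # ``El grupo Roche adquirio Docteur Andreu.''
    roche = instantiate("CORPORATION")
    andreu = instantiate("CORPORATION")
    acquire = instantiate("ACQUIRE")
    acquire["AGENT"] = roche["name"]        # instances linked along ontological slots
    acquire["THEME"] = andreu["name"]
    acquire["ASPECT"] = {"phase": "begin", "duration": "momentary"}
    print(acquire)

Only a small subgraph of the ontology (ACQUIRE, CORPORATION, and the links between them) is instantiated here, which is the sense in which a TMR is typically much smaller than the ontology itself.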
Figure 2 illustrates the µK architecture for analyzing input texts. The µK ontology can be seen
situated in this architecture which also includes components where the ontology does not play a direct role.
However, the ontology also plays a significant role in the acquisition of lexicons and in the building and
testing of analyzers and generators. We return to this issue in the discussion of acquisition methodology
later in this report.
[Figure 1 here. Its boxes and labels include: the ontology as a concept network; lexicon entries (word, syntax, semantics) mapped to concepts with modifiers; selectional constraints; the Onto-Search engine performing shortest-path search between concepts and returning scores; the semantic analyzer; nonliteral microtheories (metonymy, metaphor, ...); and the TMR as an instance network produced by instantiation.]

Figure 1: The Role of the Ontology in Mikrokosmos
[Figure 2 here. Its boxes and labels include: input text; syntactico-semantic preprocessor (including morphology and part-of-speech tagging); syntactic parser (PANGLOSS system); syntactic parse forest; transformed parse forest; lexicon (lexical syntactic and semantic specifications); ontology concepts; word sense selection; Onto-Search (constraint checking); instantiation and syntax-semantics variable binding; candidate TMR frames and slots; microtheories (proposition, aspect, gender, apposition, set and conjunction); combination/TMR building; and finished TMRs.]

Figure 2: The Ontology Situated in the Mikrokosmos NLP Architecture. It Supplies Conceptual Knowledge both for Lexical Representation and for Constraining Semantic Interpretation.
1.2 What Is an Ontology?
In the knowledge-based approach to machine translation, meanings of source language (e.g., Spanish)
texts are represented internally in a language-neutral interlingua (e.g., Nirenburg, 1989).[2] The interlingual
meaning representation (that we call a TMR) is derived from representations of word meanings in
computational lexicons and from representations of world knowledge in ontologies (and possibly episodic
knowledge bases). Aninterlingual meaning representation oncederived is inputto alanguage generator that
produces the translation in the target language (e.g., English). A key issue in the design and development
of KBMT systems is the set of symbols used to represent interlingual meaning as well as the structure of
the meaning representation. In our methodology, the set of symbols and possible relationships among them
are grounded in a language-independent knowledge source called the ontology. The symbols are defined as
concepts in the ontology. The same set of concepts are used to represent word meanings in lexicons. The
internal structure of these concepts is reflected in both lexical and text meaning representations. TMRs are
essentially instantiations of ontological concepts connected together according to constraints derived both
from the ontology and elsewhere.
A typical dictionary definition of an ontology is ‘‘The branch of metaphysics that studies the nature
of existence.’’ For us, an ontology is a computational entity, a resource containing knowledge about what
‘‘concepts’’ exist in the world and how they relate to one another. A concept is a primitive symbol for
meaning representation with well defined attributes and relationships with other concepts. An ontology is a
network of such concepts forming a symbol system where there are no uninterpreted symbols (except for
numbers and a small number of known literals).
An ontology for NLP purposes is a body of knowledge about the world (or a domain) that a) is a
repository of primitive symbols used in meaning representation; b) organizes these symbols in a tangled
subsumption hierarchy; and c) further interconnects these symbols using a rich system of semantic relations
defined among the concepts. In order for such an ontology to become a computational resource for solving
problems such as ambiguity and reference resolution, it must be actually constructed, not merely defined
formally. The ontology must be put into well-defined relationships with other knowledge sources in the
system. In an NLP application, the ontology supplies world knowledge to lexical, syntactic, semantic, and
pragmatic processes, and other microtheories.
An ontology is a database with information about
what categories (or concepts) exist in the world/domain,
what properties they have,
and how they relate to one another.
In interlingual machine translation, the principal reasons for using an ontology are (Figure 1):
to provide a grounding for representing text meaning in an interlingua;
to enable lexicons for different languages to share knowledge;
to enable source language analyzers and target language generators to share knowledge.
[2] This and the following subsections in Section 1 reuse parts of the introductory text in Mahesh and Nirenburg (1995a; 1995b).
In addition, ontologies are also used in KBMT
to store selectional restrictions and other pieces of world knowledge;
to ‘‘fill gaps’’ in text meaning by making inferences based on the content of conceptual knowledge
in the ontology;
to resolve semantic ambiguities and interpret non-literal language by making inferences using the
topology of the ontology to measure the semantic affinity between meanings;
as a classification of people, places, social roles, and organizations which forms the basis for
organizing an onomasticon, a gazetteer, or other such databases.
In addition, the same ontology can be of great value in a variety of other tasks including database
merging (Dowell, Stephens, and Bonnell, 1995; Van Baalen and Looby; 1995), integration of software or
business enterprise models (Fillion, Menzel, Mayer, and Blinn, 1995), and so on. Essentially, an ontology
such as the
µK ontology is invaluable wherever a ``semantic wall'' is to be scaled, be it to translate between
a pair of natural languages, a pair of database schemas, or to integrate different models of the same domain
or similar phenomena in the world. We provide specific illustrations of the uses of the ontology in NLP and MT below, but refer the reader to other literature for examples of broader uses of ontologies (IJCAI
Ontology Workshop, 1995).
In the Mikrokosmos project, we have developed an ontology covering a wide range of categories in
the world. Several illustrative concepts from the
µK ontology will be shown below. The above uses of
the ontology in machine translation will also be illustrated through examples. See Mahesh and Nirenburg
(1995b) and Beale, Nirenburg, and Mahesh (1995) for more detailed examples.
1.3 A Situated Ontology
A situated ontology is a world model used as a computational resource for solving a particular set of problems
(Mahesh and Nirenburg, 1995a). It is treated as neither a ‘‘natural’’ entity waiting to be discovered nor
a purely theoretical construct. World models (ontologies) in computational applications are artificially
constructed entities. They are created, not discovered. Many ontologies are developed for purely theoretical
purposes and never really constructed to become a computational resource. Even those that are constructed,
Cyc (Lenat and Guha, 1990) being the best example, are often developed without the context of a practical
situation (e.g., Smith, 1993). Many practical knowledge-based systems, on the other hand, employ world
or domain models without recognizing them as a separate knowledge source (e.g., Farwell, et al. 1993).
In the field of NLP, there is now a consensus that all NLP systems that seek to represent and manipulate
meanings of texts need an ontology (e.g., Bateman, 1993; Nirenburg, Raskin, and Onyshkevych, 1995). In
our continued efforts to build a multilingual KBMT system using an interlingual meaning representation
(e.g., Onyshkevych and Nirenburg, 1994), we have developed an ontology to facilitate natural language
interpretation and generation. The central goal of the Mikrokosmos project is to develop a system that
produces a comprehensive Text Meaning Representation (TMR) for an input text in any of a set of source
languages.[3]
Knowledge that supports this process is stored both in language-specific knowledge sources
and in an independently motivated, language-neutral ontology (Carlson and Nirenburg, 1990; Mahesh and
Nirenburg, 1995a).
[3] The current system prototype is an analyzer of Spanish.
Not only is the
µK ontology situated in our machine translation architecture, its development is also
very much situated in the processes of lexical knowledge acquisition, development of analysis programs,
and system testing and evaluation. The primary source of meanings to be encoded as new concepts
in the ontology is the continual stream of requests for concepts from lexicographers trying to represent
meanings of words using concepts in the ontology. Concepts acquired per such requests are in turn tested
in the semantic analyzer almost immediately and changes and corrections sent to the ontology acquirers
within a few days. Moreover, since the number of people browsing the ontology (lexicographers, system
builders, and testing and evaluation experts, over 10 in our situation) is many times the number of ontology
developers (at most two in our case), it is highly likely that any error will be noticed during browsing,
especially by those who requested the concept in error, and corrected through the feedback process. In
fact, because of this imbalance in the numbers of ontology developers and customers, we had to resort to
computer support in the form of interfaces and bookkeeping programs to assist in sending and keeping track
of requests and complaints to ontology developers. Ontology developers at present interact regularly with
lexicographers building lexicons in Spanish, Japanese, and Russian.[4]

[4] Construction of Japanese and Russian lexicons has begun recently. Although the ontology is already being used for representing lexical meanings in all three languages, the µK analyzer has so far only been tested on Spanish texts.
1.4 What Does the Ontology Do for NLP?
As already noted, an ontology has several different uses in KBMT. Brief examples of how the ontology
aids NLP and MT are shown below:
It is the main repository of selectional preferences on meaning composition. Knowledge of such
constraints is invaluable for resolving ambiguities by means of the constraint satisfaction process
shown in the form of the ‘‘Onto Search’’ (Onyshkevych, 1995) box in Figures 1 and 2. For example,
consider the Spanish sentence ``El grupo Roche adquirio Docteur Andreu.'' Did Roche ACQUIRE[5] or LEARN Docteur Andreu? The verb ``adquirir'' can mean either of these. However, selectional
constraints on the ACQUIRE and LEARN concepts in the ontology tell us that if the theme is not an
ABSTRACT-OBJECT, then the meaning of the verb is likely to be ACQUIRE and not LEARN (see Figure 7).
Since Docteur Andreu is known to be the name of a CORPORATION (which is a SOCIAL-OBJECT, not an
ABSTRACT-OBJECT), the correct meaning of ‘‘adquirir’’ in this sentence is ACQUIRE. The ontology is
also used for checking selectional constraints. For example, it is used to determine if Docteur Andreu is an ABSTRACT-OBJECT.

[5] Names of concepts in the ontology are shown in small capitals in this report.
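The check described here amounts to walking up the taxonomy from the proposed theme. The following is a minimal sketch of that test (the IS-A chain is illustrative and the boolean two-way choice is a simplification; the actual analyzer scores alternatives with Onto-Search rather than returning a hard decision):

    # Sketch: choose between ACQUIRE and LEARN for ``adquirir'' by testing the
    # theme against the selectional constraint on LEARN (illustrative IS-A data).
    IS_A = {"CORPORATION": "SOCIAL-OBJECT", "SOCIAL-OBJECT": "OBJECT",
            "ABSTRACT-OBJECT": "OBJECT", "OBJECT": "ALL"}

    def isa(concept, ancestor):
        while concept is not None:
            if concept == ancestor:
                return True
            concept = IS_A.get(concept)
        return False

    def pick_sense(theme):
        # Per Figure 7: LEARN requires an ABSTRACT-OBJECT theme; otherwise
        # ACQUIRE is the more likely reading.
        return "LEARN" if isa(theme, "ABSTRACT-OBJECT") else "ACQUIRE"

    print(pick_sense("CORPORATION"))      # -> ACQUIRE
    print(pick_sense("ABSTRACT-OBJECT"))  # -> LEARN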
It enables inferences to be made from the input text using knowledge contained in the concepts.
This can help resolve ambiguities as well as fill gaps in the text meaning. A default value from an
ontological concept can be filled in a slot, for instance, when a text does not provide a specific value.
For example, if the text says ‘‘John went swimming although the water was cold,’’ using the default
value of the SUBSTRATE slot of SWIM, namely WATER, the analyzer can infer that the relationship
between the MOTION-EVENT SWIM and WATER is one of SUBSTRATE (and its inverse SUBSTRATE-OF)
rather than other possibilities such as a MATERIAL that is INGESTed.
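Such gap filling reduces to a facet lookup. A minimal sketch, with an illustrative SWIM frame (the real concept has many more slots) and a hypothetical helper name:

    # Sketch: use the ontological Default facet when the text supplies no filler.
    SWIM = {"SUBSTRATE": {"Default": ["WATER"]}}    # per the example in the text

    def fill_slot(frame, slot, value_from_text=None):
        if value_from_text is not None:
            return value_from_text                   # explicit text meaning wins
        defaults = frame.get(slot, {}).get("Default", [])
        return defaults[0] if defaults else None

    print(fill_slot(SWIM, "SUBSTRATE"))              # -> WATER
    print(fill_slot(SWIM, "SUBSTRATE", "LAKE-1"))    # -> LAKE-1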
It enables inferences to be made using the topology of the network, as in searching for the shortest
path between two concepts. Such search-based inferences are used all the time (in the ‘‘onto search’’
box in Figures 1 and 2) to check how well a selectional constraint is satisfied by a piece of text. It is
often the case in NLP that none of the alternatives satisfies the constraints completely. The topology
of the ontology helps the analyzer pick the meaning that is ‘‘closest’’ to meeting the constraints.
Such inferences can also support metonymy and metaphor processing, figuring out the meaning of a
complex nominal, or be used in other types of constraint relaxation when the input cannot be treated
with the available knowledge.
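As a rough picture of such topological inference, here is a minimal, unweighted breadth-first sketch over a toy concept graph. The real Onto-Search engine (Onyshkevych, 1995) computes weighted scores over slots and facets, so this shows only the skeleton of the idea.

    from collections import deque

    # Illustrative concept graph: each concept lists its neighbours
    # (taxonomic and non-taxonomic links alike).
    LINKS = {
        "ACQUIRE": ["TRANSFER-POSSESSION", "CORPORATION"],
        "CORPORATION": ["ORGANIZATION", "ACQUIRE"],
        "ORGANIZATION": ["SOCIAL-OBJECT", "CORPORATION"],
        "SOCIAL-OBJECT": ["OBJECT", "ORGANIZATION"],
        "TRANSFER-POSSESSION": ["EVENT", "ACQUIRE"],
        "OBJECT": ["SOCIAL-OBJECT"],
        "EVENT": ["TRANSFER-POSSESSION"],
    }

    def shortest_path(start, goal):
        """Breadth-first search; path length serves as a crude semantic distance."""
        frontier, seen = deque([[start]]), {start}
        while frontier:
            path = frontier.popleft()
            if path[-1] == goal:
                return path
            for nxt in LINKS.get(path[-1], []):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(path + [nxt])
        return None

    print(shortest_path("CORPORATION", "EVENT"))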
The ontology parallels µK lexicons in developing the MT system. Word meanings are represented partly
in the lexicon and partly in the ontology (see Figure 7 for an example). In principle, the separation between
ontology and lexicon is as follows: language-neutral meanings are stored in the former; language-specific
information in the latter. In a multilingual situation, it is not easy, however, to determine this boundary. As
a result, ontology and lexicon acquisition involves a process of daily negotiations between the two teams
of acquirers. The easiest solutions to many difficult problems in lexical semantic representation require the
addition of a new concept to the ontology under certain ‘‘catch all’’ frames (this is the only solution in
the ‘‘word sense approach’’ to ontology development where word senses are mapped almost one to one to
concept names (e.g., Bateman, et al. 1990; Knight and Luk, 1994)). In Mikrokosmos, a set of guidelines
was developed for suggesting ways of finding solutions to lexical problems without adding ‘‘catch all’’
concepts. (A sample of these guidelines is presented in Section 7.6 of this report.)
The ontology also aids meaning representation and, in particular, lexical representation as follows:
It forms a substrate upon which word meanings in any language are grounded and constructed in the
lexicon (see, e.g., Nirenburg and Levin, 1992). It guarantees that every symbol used in representing
lexical semantics is defined as a concept, is well-formed, and has known relations to all other
symbols. Moreover, it provides the basic internal structure of meaning elements (i.e., concepts)
so that lexicographers only need to add in the lexicon any necessary modifiers to the ontological
meaning.
The ontology guides lexicographers along particular ways of decomposing and representing word
meanings. Lexicographers constantly face the choice between making a word sense a primitive in the
meaning representation and decomposing it in various ways to arrive at other, existing primitives.
Such a choice is almost always underconstrained given only the meaning of the word in that language.
The ontology provides a basis for making a good decision by bringing in a range of ontological and
representational constraints that cover the classification of all related meanings, irrespective of the
language the word belongs to.
It helps partition multilingual MT work. This is a big methodological advantage since it allows
us to partition the task of developing multilingual MT systems into the independent development
of analysis and generation lexicons for the different languages. The ontology serves as a common
ground both between an analyzer and a generator and between different languages. It provides a
way for a clean separation of language independent world knowledge from linguistic knowledge to
maximize the sharing of knowledge between the different representations (Mahesh and Nirenburg,
1996).
An ontology-based approach to lexical semantics makes lexical representations highly parsimonious
by reducing the number of different entries needed in the lexicon. Meaning common to many words
in a language can be factored out and represented in the ontology in a concept to which the words
map. Moreover, this approach provides rich expressiveness for modifying and augmenting base
meanings in a variety of ways, instead of simply enumerating a multitude of senses. Concepts in
the ontology can be modified both through the relations and attributes in their slots and using the
representational apparatus external to the ontology for encoding aspect, attitude, modality, and other
elements of linguistic semantics. As a result, the ontology allows the lexicon to capture word senses
through far fewer entries than found in a typical dictionary. For example, Nirenburg, Raskin, and
Onyshkevych (1995) have shown that the 54 meanings for the Spanish verb ‘‘dejar’’ listed in the
Collins Spanish-English dictionary can be collapsed into just 7 different lexical mappings using the
µK approach. Much of this power comes from the expressiveness of the representation that allows
complex combinations and modifications of ontological concepts in meaning representations. See
Onyshkevych and Nirenburg (1994) for detailed examples of lexical representations.
It allows generic and incomplete lexical representations which are nevertheless well-formed. For
example, in mapping certain adjectives and adverbs, the lexicon needs to refer to the AGENT slot
(say) of an EVENT without knowing which particular event it is. This can be done in Mikrokosmos
by directly referring to the AGENT slot of the EVENT concept (even though not every EVENT has an
AGENT) and using a variable-binding or coreference mechanism to ultimatelymap the event to a more
specific event in the TMR.
It allows variable-depth semantics where lexical semantics can range from a direct mapping to a
concept without any modifications all the way to a complex combination of several concepts with
various additional constraints, relaxations, and additional information such as time, aspect, quantities,
and their relationships. It helps avoid unnecessarily deep decomposition of complex meanings by
resorting to a controlled proliferation of primitive concepts (see Section 4.1).
It also enables sophisticated ways of reducing manual effort in lexicon acquisition. For example, it
allows a derivational morphology engine to automatically suggest concepts for the lexical mappings
of the derived words which can then be refined by the acquirer instead of acquiring from scratch.
1.5 Computational Ontologies for NLP
Just as no one so far has found ``the right'' grammar for a natural language (such as English), it is reasonable
to argue that any ontology we build will not be ‘‘the right’’ or the only ontology for a domain or a task.
The
µK ontology is one possible classification of concepts in its domain constructed manually according
to a well developed set of guidelines. Its utility in NLP can only be evaluated by the quality of the meaning
representations or translations produced by the overall system or through some other evaluation of the
overall NLP system (such as in an information extraction or retrieval test). This is not to say that the
ontology is randomly constructed. It is not. Its construction has been constrained throughout by guidelines
(see Section 7.6 below), by constant scrutiny by lexicographers, testers, and other experts, as well as by the
requirements of meaning representation and lexicon acquisition.
In NLP work, the term ‘‘ontology’’ is sometimes also used to refer to a different kind of knowledge
base which is essentially a hierarchical organization of a set of symbols with little or no internal structure to
each node in the hierarchy (e.g., Farwell, et al. 1993; Knight and Luk, 1994). Frames in the
µK ontology, however, have a rich internal structure through which are represented various types of relationships between
concepts and the constraints, defaults, and values upon these relationships. It is from this rich structure
and connectivity that one can derive most of the power of the ontology in producing a TMR from an input
text. Mere subsumption relations between nearly atomic symbols do not afford the variety of ways listed
above in which the
µK ontology aids lexicon acquisition and semantic processes such as disambiguation
and non-literal interpretation.
The above distinction between highly structured concepts and nearly atomic concepts can be traced
to a difference in the grain size of decomposing meanings. A highly decompositional (or compositional)
meaning representation relies on a very limited set of primitives (i.e., concept names). If we do this, the
representation of many basic concepts becomes too complex and convoluted. The other extreme is to map
each word sense in a language to an atomic concept. As a result, the nature of interconnection among these
concepts becomes unclear, to say nothing about the explanatory power of the system (cf. the argument
about the size of the set of conceptual primitives in Hayes, 1979). In Mikrokosmos, we take a hybrid
approach and strive to contain the proliferation of concepts for a variety of methodological reasons, such
as tradeoffs between the parsimony of ontological representation and that of lexical representation and the
need for language independent meaning representations. Control over proliferation of concepts is achieved
by a set of guidelines that tell the ontology acquirer when not to introduce a new concept (see Section 7.6
below).
We do not make any distinction between an ontology and a world knowledge base. The µK ontology is
a taxonomic classification of concepts as well as a repository of world knowledge expressed in the form of
conceptual relationships. Instead of separating the classification (often called ‘‘ontology’’) from the rest of
the knowledge about the concepts being classified (often called ``knowledge base''), we keep all knowledge
about concepts in the ontology. This way all knowledge that is not specific to any particular natural
language will be in the ontology while language-specific knowledge is confined to individual lexicons for
the different languages.
The µK ontology, however, makes a clear distinction between conceptual and episodic knowledge and
includes only conceptual knowledge. Instances and episodes are acquired in a separate knowledge base
called the onomasticon. The methodology for acquiring the onomasticon includes a significant amount of
automation and is very different from ontology acquisition which is done manually via continual interactions
with lexicographers.
1.6 Our Product: The Mikrokosmos Ontology
The
µK system currently processes Spanish texts about mergers and acquisitions of companies. However,
since the input language is unrestricted, the ontology must, in fact, cover a wide range of concepts outside
this particular domain. In addition, the analyzer encounters a variety of well-known problems in NLP which
require the application of a significant amount of world knowledge.
We are currently in the process of a massive acquisition of concepts (OBJECTs, EVENTs as well as
PROPERTYs) related to the domain of company mergers and acquisitions (Mahesh and Nirenburg, 1995a,
1995b).[6]
However, since our input texts are unedited, real-world texts, the ontology must support a wide
range of word meanings many of which are outside the domain of mergers and acquisitions by any stretch
of imagination. Moreover, we also need to support alternative meanings of words in the texts which must
be represented in the lexical entries for the word. As a result, the µK ontology is best described as a broad-coverage ontology with more complete coverage in the domain of mergers and acquisitions than in many other domains.

[6] In parallel, a Spanish lexicon that maps lexemes to concepts in this ontology is also being acquired on a massive scale.
Over a period of several months, the
µK ontology was developed by starting with an ontology that had fewer than 2,000 concepts and acquiring nearly 3,000 new concepts organized in a tangled hierarchy with
ample interconnectivity across the branches. Continued efforts have been made to monitor some of the
qualities of the ontology and maintain certain overall characteristics. For example, the ontology emphasizes depth in organizing concepts and reaches depth 10 or more along a number of paths. The branching factor is
kept much less than 10 at most points (the overall average being less than 5). Each concept has, on average,
10 to 15 slots linking it to other concepts or literal constants. The top levels of the hierarchy (Figure 3
below) have proved very stable as we are continuing to acquire new concepts at the lower levels.
Unlike many other ontologies with a narrow focus (e.g., Casati and Varzi, 1993; Hayes, 1985; Mars,
1993), our ontology must cover a wide variety of concepts in the world. In particular, our ontology
cannot stop at organizing terminological nouns into a taxonomy of objects and their properties; it must also
represent a taxonomy of (possibly, complex) events and include many interconnections between objects
and events to support a variety of disambiguation tasks. At present, the
µK ontology has about 4,600 concepts in its hierarchies. Each concept, on average, is connected to at least 14 other concepts through
the relationships represented in its slots.
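Characteristics such as depth, branching factor, and slot counts are easy to monitor mechanically. The sketch below runs over a toy taxonomy; the data and helper names are illustrative (the actual quality-control programs are discussed in Section 8.1).

    # Sketch: depth and branching statistics from SUBCLASSES links (toy data).
    SUBCLASSES = {
        "ALL": ["OBJECT", "EVENT", "PROPERTY"],
        "OBJECT": ["PHYSICAL-OBJECT", "SOCIAL-OBJECT"],
        "EVENT": ["PHYSICAL-EVENT", "SOCIAL-EVENT"],
        "PROPERTY": ["RELATION", "ATTRIBUTE"],
    }

    def max_depth(concept, depth=1):
        children = SUBCLASSES.get(concept, [])
        return depth if not children else max(max_depth(c, depth + 1) for c in children)

    def average_branching():
        counts = [len(children) for children in SUBCLASSES.values()]
        return sum(counts) / len(counts)

    print("depth:", max_depth("ALL"))         # -> 3 for this toy tree
    print("branching:", average_branching())  # -> 2.25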
All entities in the
µK ontology are classified into free-standing entities[7] and properties. Free-standing
entities are in turn classified into OBJECTs and EVENTs. Figure 3 shows the top-level hierarchy in the
ontology. OBJECTs, EVENTs, and PROPERTYs constitute the concepts in the ontology which are represented
as frames. Figures 4, 5, and 6 show snapshots of parts of the OBJECT, EVENT, and PROPERTY hierarchies in
the
µK ontology. These figures show only the names of concepts and the taxonomic relations among them.
Each node shown in these figures in fact has an elaborate internal structure of slots and fillers. An example
of the internals of a concept can be seen in Figure 7. A more detailed tour of the
µK ontology is presented
in ‘‘A Guided Tour of the Mikrokosmos Ontology’’ (see Mahesh and Wilson, forthcoming).
Each frame is a collection of slots with one or more facets and fillers. The slots (including inherited ones)
collectively define the concept by specifying how the concept is related to other concepts in the ontology
(through relations) or to literal or numerical constants (through attributes). Lexicon entries represent word
or phrase meanings by mapping them to concepts in the ontology.
A TMR is a result of instantiating concepts from the ontology that are referred to in a text and linking
them together according to the constraints in the concepts as well as those listed in lexicon entries and special
TMR building rules. A number of concepts in the domain of mergers and acquisitions are located under the
ORGANIZATION subtree under SOCIAL-OBJECTs and the BUSINESS-ACTIVITY subtree under SOCIAL-EVENTs
(see Figures 3, 4, and 5). Figure 7 shows a sample frame in the ontology along with a lexical mapping to
that concept.
2 The Structure of the Ontology
The ontology is a directed graph where nodes are concepts. Links between nodes are represented as slots
and fillers. Slot names themselves are the class of concepts known as PROPERTYs except for the names of
some ‘‘bookkeeping’’ slots. Links are in fact multi-dimensional since each link can have several facets.
Each facet can take one or more fillers. No importance is attached to the order of multiple fillers in a slot.[8]
A complete BNF description of the ontology is provided in Appendix A. A corresponding set of axioms
defining the structure of the ontology is shown in Appendix B at the end of this report.
[7] Free-standing entity is not an actual concept in the ontology; we use the term when describing the top-level organization of the ontology in order to distinguish between OBJECTs and EVENTs on the one hand and PROPERTYs on the other.

[8] See Section 6.4 for an exception in the case of Range specifications of LITERAL-ATTRIBUTEs.
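The concept-slot-facet-filler organization just described can be pictured as a nested mapping. The sketch below is only an illustration of the layout; it is not the µK distribution format (which includes C++ objects and Lisp-like lists).

    # Sketch: a concept frame as a nested mapping of slot -> facet -> fillers.
    ONTOLOGY = {
        "ALL":     {"SUBCLASSES": {"Value": ["OBJECT", "EVENT", "PROPERTY"]}},
        "ACQUIRE": {"IS-A":  {"Value": ["TRANSFER-POSSESSION"]},
                    "AGENT": {"Sem": ["HUMAN"]},
                    "THEME": {"Sem": ["OBJECT"]}},
    }

    def fillers(concept, slot, facet):
        """Return the fillers of a facet of a slot of a concept ([] if absent)."""
        return ONTOLOGY.get(concept, {}).get(slot, {}).get(facet, [])

    print(fillers("ACQUIRE", "IS-A", "Value"))  # -> ['TRANSFER-POSSESSION']
    print(fillers("ACQUIRE", "THEME", "Sem"))   # -> ['OBJECT']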
[Figure 3 here, showing ALL dividing into OBJECT, EVENT, and PROPERTY; OBJECT into PHYSICAL-OBJECT, MENTAL-OBJECT, and SOCIAL-OBJECT (among others); EVENT into PHYSICAL-EVENT, MENTAL-EVENT, and SOCIAL-EVENT; and PROPERTY into RELATION and ATTRIBUTE with their subtypes.]

Figure 3: Top-Level Hierarchy of the Mikrokosmos Ontology Showing the First Three Levels of the OBJECT, EVENT, and PROPERTY Taxonomies.
[Figure 4 here. Concepts shown include FOR-PROFIT-ORGANIZATION, NON-PROFIT-ORGANIZATION, RELIGIOUS-ORGANIZATION, CORPORATION, CONGLOMERATE, CONSORTIUM, PARTNERSHIP, COOPERATIVE, SOLE-PROPRIETORSHIP, PRIVATE-ORGANIZATION, MILITARY-ORGANIZATION, PUBLIC-ADMINISTRATION-ORGANIZATION, GOVERNMENTAL-ORGANIZATION, and POLITICAL-ENTITY.]

Figure 4: A Snapshot of the ORGANIZATION Hierarchy Under OBJECT in the µK Ontology.
[Figure 5 here. Concepts shown include WORK-ACTIVITY, BUSINESS-ACTIVITY, FINANCIAL-EVENT, INVESTMENT-EVENT, BANKING-EVENT, COMMERCE-EVENT, CORPORATE-EVENT, MERGE, POSSESSION-EVENT, TRANSFER-POSSESSION, ACQUIRE, BUY, OWN, ACADEMIC-ACTIVITY, FARMING-ACTIVITY, MANUFACTURING-ACTIVITY, MILITARY-ACTIVITY, RELIGIOUS-ACTIVITY, POLITICAL-EVENT, and NON-WORK-ACTIVITY.]

Figure 5: A Snapshot of the SOCIAL-EVENT Hierarchy in the µK Ontology.
[Figure 6 here. Concepts shown include OBJECT-RELATION, EVENT-RELATION, EVENT-STATE-RELATION, EVENT-OBJECT-RELATION, CASE-ROLE, CASE-ROLE-INVERSE, SPATIAL-RELATION, SOCIAL-RELATION, ORGANIZATION-RELATION, CORPORATE-RELATION, FIELD-OF-STUDY-RELATION, PRODUCTION-RELATION, and the corresponding INVERSE relations.]

Figure 6: A Snapshot of the RELATION Hierarchy Under PROPERTY in the µK Ontology.
[Figure 7 here. The ACQUIRE frame carries the definition ``The transfer of possession event where the agent transfers an object to its possession,'' an IS-A Value of TRANSFER-POSSESSION, a Time-Stamp, an AGENT slot with Sem HUMAN, a THEME slot with Sem OBJECT, SOURCE, INSTRUMENT, PRECONDITION-OF (OWN), and PURPOSE-OF (BID) slots, and an inherited LOCATION slot filled by PLACE. The lexical entry for the Spanish verb ``adquirir'' maps one sense to ACQUIRE, relaxing AGENT and SOURCE to ORGANIZATION and adding aspect information (phase begin, momentary duration, single iteration, telic), and another sense to LEARN, with THEME constrained to ABSTRACT-OBJECT and SOURCE defaulting to TEACHER or BOOK.]

Figure 7: Frame Representation for Concept ACQUIRE. Also Shown is a Part of the Lexical Entry for the Spanish Verb ``adquirir'' with Semantic Mappings to ACQUIRE and LEARN Events. The Mappings Modify the Constraints in the Ontology and Add New Information such as Aspect.
Formally, the ontology is a graph with only two kinds of patterns in its subgraphs. The structure of these
patterns will be elaborated further below. All concepts in the ontology are classified into one of OBJECTs,
EVENTs, or PROPERTYs. OBJECTs and EVENTs are stand-alone concepts; they are instantiated in the TMR.
PROPERTYs, on the other hand, are not normally individually instantiated;[9] they become slots in OBJECTs
and EVENTs. As such, PROPERTYs cannot modify other PROPERTYs. They are specified only in terms of their
Domains and Ranges.
PROPERTYs are of two types: RELATIONs and ATTRIBUTEs. Corresponding to these two are the two
subgraph patterns that constitute the ontology (shown in Section 2.2 below). RELATIONs differ from
ATTRIBUTEs in that RELATIONs map an OBJECT or EVENT to another OBJECT or EVENT while ATTRIBUTEs
map an OBJECT or an EVENT to a scalar (i.e., number) or a literal symbol. In other words, a filler of an
ATTRIBUTE slot is a number, a literal symbol, an open or closed range of numbers, or, in unfortunate cases,
an undefined symbol not in the known set of literals.
2.1 Slots and Facets
A slot is the basic mechanism for representing relationships between concepts. In fact, slot is the
fundamental meta-ontological predicate based on which the entire ontology can be described axiomatically
(see Appendix B). Most ‘‘content’’ slots are PROPERTYs which are themselves defined as concepts in the
ontology. There is a closed class of special slots. These slots are described below.
All slots have all permissible facets except as mentioned for the special slots below. An additional
constraint from the distinction between concepts and instances is that slots in concepts (other than the
special slots listed below) have only Sem facets and slots in instances have only Value facets. Permissible
facets are:
Value: the filler of this facet is an actual value; it may be an instance of a concept, a literal symbol, a
number, or another concept (in the case of the special slots listed below).
Sem: the filler of a Sem facet is either another concept or a literal, number, or scalar range. In any
case, the filler serves as a selectional restriction on the fillers of the Value and Default facets of the
slot. It is through these selectional restrictions that concepts in the ontology are related (or linked) to
other concepts in the ontology (in addition to taxonomic links).
Default: the filler of a Default facet is the value of the slot in the absence of an explicit Value facet.
For the sake of simplicity, Default facets are not inherited in the
µK ontology (and hence are not in
much use either).
Measuring-Unit: this facet is used to add a measuring unit for the number that fills the Value, Default, or Sem facets of the same slot. MEASURING-UNITs are also defined as concepts in the ontology.[10]
Salience: the Salience facet can be used to indicate how important or central a slot is to the entire
concept. This may be used in lexicons or actual TMRs to indicate the focus of a part of the text, but
is not used in the ontology itself.
[9] They may, however, be instantiated in a TMR by means of a reification operation (e.g., Russell and Norvig, 1995), thereby making them stand-alone instances in the TMR. Such reification can also be triggered from the lexicon.

[10] ``Measuring unit'' has been used as the name of the facet as well as the concept. The facet should perhaps be renamed ``Unit.'' However, this facet is not used much in the ontology.
Relaxable-to: this facet is used only in the lexicon and indicates to what extent a language permits a
selectional constraint to be violated in nonliteral usage such as a metaphor or metonymy. The filler of
this facet is a concept that indicates the maximal set of possible fillers beyond which the text should
be considered anomalous.
In addition to slots and facets, additional structure is available through the notion of a View. Each facet
can have multiple views. In the current representation, we use only one view called Common. All facets in
all slots of concepts have the Common view. In fact, the Common view is transparent in the Mikrokarat
knowledge acquisition tool used for building the µK ontology.
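The division of labour between the Sem facet and the lexicon's Relaxable-to facet can be sketched as a two-stage check. The helper names below are hypothetical, and the real analyzer returns graded Onto-Search scores rather than a three-way label.

    # Sketch: check a candidate filler against a slot's Sem constraint, then
    # against a Relaxable-to constraint from the lexicon (illustrative IS-A data).
    IS_A = {"ORGANIZATION": "SOCIAL-OBJECT", "SOCIAL-OBJECT": "OBJECT",
            "HUMAN": "ANIMAL", "ANIMAL": "OBJECT", "OBJECT": "ALL"}

    def isa(concept, ancestor):
        while concept is not None:
            if concept == ancestor:
                return True
            concept = IS_A.get(concept)
        return False

    def check(filler, sem, relaxable_to=None):
        if isa(filler, sem):
            return "satisfied"
        if relaxable_to and isa(filler, relaxable_to):
            return "relaxed"        # acceptable in nonliteral or extended usage
        return "anomalous"

    # AGENT of ACQUIRE: Sem HUMAN, Relaxable-To ORGANIZATION (as in Figure 7).
    print(check("ORGANIZATION", "HUMAN", "ORGANIZATION"))  # -> relaxed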
2.1.1 Special Slots
Definition: This slot is mandatory in all concepts and instances. It has only a Value facet whose filler
is a definition of the concept in English intended only for human consumption.
Time-Stamp: This is used to encode a signature showing who created or updated this concept and
when. It may be noted that not all concepts have an up to date Time-Stamp since the tools we have
been using have not provided this feature at various times.
Referenced-By-Token: This slot was used in the past to encode a sentence from a source text which
led to a need for this concept. Many English sentences were thus inserted into concept frames in
the past. In the current multi-lingual setup of this project, we have done away with this slot (mainly
because of inconsistent ways of encoding the example sentences in the ontology we started with).
This slot perhaps belongs only in a lexicon and will not be used in the ontology.
Is-A: This slot is mandatory for all concepts except ALL which is the root of the hierarchy. Instances
do not have an Is-A slot. This slot has only a Value facet which is filled by the names of the immediate
parents of the concept. A concept missing an Is-A slot is called an orphan. Ideally, only ALL should
be an orphan.
Subclasses: This slot is mandatory for all concepts except leaves (which do not have other concepts
as their immediate children). Concepts that only have instances as children do not have this slot. This
slot also has only a Value facet which is filled by the names of the immediate children of the concept.
Instances: This slot is present in any concept that has instances as children. A concept may have
both Subclasses and Instances. There is no requirement that only leaf concepts have instances. This
slot also has only a Value facet filled by the names of the instances of this concept.
Instance-Of: This slot is mandatory for all instances and is present only in instances. It has only a
Value facet which is filled by the name of the parent concept.
Inverse: This slot is present in all RELATIONs and only in RELATIONs. It has only a Value facet which
is filled by the name of the RELATION which is the Inverse of this RELATION.
Domain: This slot is present in all PROPERTYs and only in PROPERTYs. It has only a Sem facet which
is filled by the names of concepts which can be in the domain of this RELATION or ATTRIBUTE. A
Domain slot never has a Value facet since there is never an instance of a PROPERTY in the ontology.
A PROPERTY enters a text meaning representation (TMR) only as a slot in an instance of an OBJECT
or EVENT unless the slot has been reified in which case the reified slot has Domain and Range slots
in the TMR.
Range: This slot is also present in all PROPERTYs and only in PROPERTYs. It too has only a Sem
facet. In RELATIONs, the filler of the Sem facet is the names of concepts that are in the range of this
RELATION. In an ATTRIBUTE, the Sem facet is filled by all the possible literal or numerical values
permissible for that ATTRIBUTE. The filler can also be a numerical range specified using appropriate
mathematical comparison operators (such as < and >). Again, the Range slot never has a Value facet
since there is never an instance of a PROPERTY in the ontology.
All other slots have Sem facets in concepts and Value facets in instances. Any slot either in a concept
or in an instance may have in addition a Default, a Salience, or a Measuring-Unit facet. These facets have
only been used sparingly in the current ontology. In fact, the Salience facet has never been used. It may also
be noted that no facet is mandatory in these non-special slots. For example, a slot may have just a Default
facet specified (with the implication that there is no constraint on other possible values for the filler).
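These requirements translate directly into mechanical well-formedness checks of the kind used for quality control (Section 8.1.1). A minimal sketch with illustrative data and hypothetical check names:

    # Sketch: only ALL may lack Is-A (be an orphan); any concept with an Inverse
    # slot (a RELATION) should also carry Domain and Range slots.
    ONTOLOGY = {
        "ALL":      {"SUBCLASSES": {"Value": ["OBJECT", "PROPERTY"]}},
        "OBJECT":   {"IS-A": {"Value": ["ALL"]}},
        "PROPERTY": {"IS-A": {"Value": ["ALL"]}},
        "OWNED-BY": {"IS-A": {"Value": ["PROPERTY"]},
                     "INVERSE": {"Value": ["OWNER-OF"]},
                     "DOMAIN": {"Sem": ["OBJECT"]}, "RANGE": {"Sem": ["HUMAN"]}},
    }

    def orphans():
        return [c for c, frame in ONTOLOGY.items() if c != "ALL" and "IS-A" not in frame]

    def relations_missing_domain_or_range():
        return [c for c, frame in ONTOLOGY.items()
                if "INVERSE" in frame and not {"DOMAIN", "RANGE"} <= frame.keys()]

    print(orphans())                             # -> []
    print(relations_missing_domain_or_range())   # -> []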
2.2 Subgraph Patterns in the Ontology
The entire ontology is built of only two different subgraph patterns. These two patterns correspond to a
slot that is a RELATION and a slot that is an ATTRIBUTE. In this analysis, we ignore certain special slots.
Some of them like Definition and Time-Stamp are like ATTRIBUTEs but have fillers that are not in the set of
literals. Others are like RELATIONs but are not defined as RELATIONs. Examples are the subsumption links
of Is-A and Subclasses and the instance links Instances and Instance-Of. There are also two complexities
associated with the RELATION pattern that we illustrate briefly.
2.2.1 The RELATION Pattern
When a slot in an EVENT or OBJECT is a RELATION, the filler of the slot will be another EVENT or OBJECT.
This fillershould have a corresponding link to thefirst concept through a different slot. This slot will also be
a RELATION that is an Inverse of the first RELATION. This pattern is shown in a simplified form in Figure 8.
For simplicity, the first two slots are simply labeled PROPERTY. Only the third slot which is a RELATION is
expanded. It may be noted that the RELATION itself is defined as a concept in the ontology. This RELATION,
shown in the bottom half of the figure, has only Domain, Range, and Inverse slots.
Additional complexities arise from the fact that the Inverse link may be inherited by the second concept from one of its ancestors rather than defined as a local slot in that concept, or may point to a concept that is an ancestor of the first concept, or may be implicit in the domain and range specifications of the RELATIONs involved.
The resulting complex patterns are illustrated in Figure 9 which shows patterns developing from both
inheritance and implicit links. Part (a) of the Figure shows an inherited link that does not have a direct
inverse. There is a pair of links, HEAD-OF and HEADED-BY, between CORPORATION and PRESIDENT-
CORPORATION. The two RELATIONs HEAD-OF and HEADED-BY are inverses of each other. The concept
BANK is a descendant of CORPORATION and inherits the HEADED-BY link to PRESIDENT-CORPORATION.
[Figure 8 here. Two EVENT/OBJECT concepts are linked through a RELATION slot (Sem facet) and its inverse; the RELATION and its inverse are themselves concepts with Domain, Range, and Inverse slots. Legend: concept, slot, facet, filler.]

Figure 8: The RELATION Pattern.
Now, if we just look at BANK and PRESIDENT-CORPORATION, we find that BANK has a HEADED-BY link to
PRESIDENT-CORPORATION but the inverse link is not to BANK but to its ancestor CORPORATION. Intuitively,
we refer to a pattern such as the one in Part (a) of the Figure as an inheritance triangle.
Inheritance triangles can occur on both sides of a pair of reciprocal links. For example, Part (b)
of Figure 9 shows a pair of relations CORPORATE-ASSET and CORPORATE-ASSET-OF linking the concepts
CORPORATION and FINANCIAL-OBJECT. When these slots are inherited down to MONEY (via ASSET) and
DRUGSTORE as shown, the resulting pattern has the shape of an X: MONEY has a link to CORPORATION but
not vice versa; DRUGSTORE has a link to FINANCIAL-OBJECT but not vice versa. Nor is there any direct
link between MONEY and DRUGSTORE. However, these links are implicitly present and a constraint that is
looking for them will be satisfied through inheritance. For example, MONEY can fill the CORPORATE-ASSET
slot of a DRUGSTORE.
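A constraint checker therefore has to look not only at a concept's local slots but also at slots inherited from its ancestors, on both the concept side and the filler side. Below is a minimal sketch of that check, with illustrative data mirroring Part (b) of Figure 9 and hypothetical helper names.

    # Sketch: does MONEY satisfy the CORPORATE-ASSET constraint of DRUGSTORE?
    # Look for the slot on the concept or an ancestor, then test the filler's ancestry.
    IS_A = {"DRUGSTORE": "CORPORATION", "CORPORATION": "ORGANIZATION",
            "MONEY": "ASSET", "ASSET": "FINANCIAL-OBJECT"}
    SLOTS = {"CORPORATION": {"CORPORATE-ASSET": "FINANCIAL-OBJECT"}}

    def ancestors(concept):
        while concept is not None:
            yield concept
            concept = IS_A.get(concept)

    def satisfies(concept, relation, filler):
        for holder in ancestors(concept):             # inherit the slot upward
            constraint = SLOTS.get(holder, {}).get(relation)
            if constraint is not None:
                return any(a == constraint for a in ancestors(filler))
        return False

    print(satisfies("DRUGSTORE", "CORPORATE-ASSET", "MONEY"))  # -> True

For the implicit links of Parts (c) and (d), one further fallback would consult the Domain and Range slots of the PROPERTY concept itself when no explicit or inherited slot is found.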
A complex pattern resulting partly from inheritance and partly from implicit inverse links is shown in
Part (c) of Figure 9. Here, HUMAN has a slot OWNER-OF which is constrained to OBJECT. This link gets
inherited, for example, to TERRORIST. However, there is no explicitly encoded inverse link from OBJECT
to HUMAN. Instead, when we look at the PROPERTY concept OWNED-BY (or its inverse OWNER-OF) and
examine its Domain and Range slots, we find that they implicitly say that HUMAN can be OWNED-BY of an
OBJECT. This information in PROPERTY concepts is in fact extracted and used by the
µK analyzer to check
constraints. Thus, an OBJECT can be OWNED-BY a TERRORIST although there is no such link from OBJECT to
TERRORIST or to any of its ancestors.
Finally, Part (d) of Figure 9 shows another pattern similar to Part (c) but with redundant links. In this
case, there is no direct link in either direction between EVENT and OBJECT as far as the pair of RELATIONs
THEME/THEME-OF is concerned. These links are implicit in the Domain and Range slots of THEME and
THEME-OF. However, there is an explicit THEME-OF slot in ANIMATE, a descendant of OBJECT. This link is
redundant, but does not hurt either the consistency of the ontology or the analysis processes in any way.
Similarly, there is a redundant THEME link from PHYSICAL-EVENT to OBJECT.
Because the tools we use for concept acquisition are limited in their ability to help us visualize these
complex patterns, our ontology does contain several such redundant links which are harmless for our
purposes. It may also be noted that, when there is an explicit link, it takes precedence over the information
in the corresponding Domain and Range slots. Typically, Domains and Ranges of PROPERTYs contain the
most general constraints; we add explicit links between concepts to encode a more specific constraint on a
conceptual relation.[11]

[11] There are other reasons for adding redundant links. A slot must have a filler. In fact, our acquisition tool does not retain a slot that has no fillers. As such, we are sometimes forced to introduce a redundant filler just to retain the slot. We would want to retain a slot to fulfill lexical requirements for indicating to the analyzer that it should look for a filler for this slot whenever the concept is instantiated.
2.2.2 The ATTRIBUTE Pattern
In the second subgraph pattern shown in Figure 10, the slot in the concept is an ATTRIBUTE. The filler
therefore is a number or a literal. The ATTRIBUTE is also defined as a concept in the ontology. It has only
Domain and Range slots.
[Figure 9 here. Panel (a), inheritance on one side: HEAD-OF/HEADED-BY between CORPORATION and PRESIDENT-CORPORATION, with BANK inheriting the HEADED-BY link. Panel (b), inheritance on both sides: CORPORATE-ASSET/CORPORATE-ASSET-OF between CORPORATION and FINANCIAL-OBJECT, inherited down to DRUGSTORE and MONEY (via ASSET). Panel (c), implicit link in one direction: OWNER-OF/OWNED-BY between HUMAN (and TERRORIST) and OBJECT. Panel (d), implicit links in both directions: THEME/THEME-OF between EVENT and OBJECT, with PHYSICAL-EVENT, ANIMATE, and HUMAN shown.]

Figure 9: Complex RELATION Patterns
Figure 10: The ATTRIBUTE Pattern. (A CONCEPT, i.e., an EVENT or OBJECT, has an ATTRIBUTE slot whose Sem facet is filled by a scalar or literal value; the ATTRIBUTE concept itself carries only Domain and Range slots.)
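As an illustration of this pattern, the following minimal sketch shows an ATTRIBUTE frame carrying only Domain and Range slots and a concept using it as a slot with a literal filler. The Python dictionary layout, the DOCUMENT frame, and the checking function are illustrative assumptions, not the actual frame database; the OFFICIAL-ATTRIBUTE literals are taken from the discussion below.

ATTRIBUTE_FRAMES = {
    "OFFICIAL-ATTRIBUTE": {
        "Domain": {"Sem": ["EVENT", "DOCUMENT"]},       # concepts it applies to
        "Range": {"Sem": ["official", "unofficial"]},   # permitted literal fillers
    },
}

CONCEPT_FRAMES = {
    # In a DOCUMENT, the ATTRIBUTE appears as a slot whose Sem facet holds literals.
    "DOCUMENT": {"OFFICIAL-ATTRIBUTE": {"Sem": ["official", "unofficial"]}},
}

def legal_literal(attribute, value):
    # A literal filler is legal if it appears in the Range of the ATTRIBUTE frame.
    return value in ATTRIBUTE_FRAMES[attribute]["Range"]["Sem"]

print(legal_literal("OFFICIAL-ATTRIBUTE", "official"))   # True
print(legal_literal("OFFICIAL-ATTRIBUTE", "blue"))       # False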
2.3 A Taxonomy of Symbols
All symbols in the ontology are alphanumeric strings with the addition of only the hyphen character. No
accents are permitted on any of the characters. Such enhancements are permitted only in lexicons. As
far as ontology development is concerned, all symbols that we encounter can be classified into one of the
following types:
concept names: typically English words with at most four words in a name separated by hyphens.
instance names: following the standard practice in AI, an instance is given a name by appending its
concept name with a hyphen followed by an arbitrary (but unique) integer.
special slot names: Is-A, Subclasses, Definition, Instances, Instance-Of, Time-Stamp, Domain,
Range, and Inverse. These are not defined as PROPERTYs in the ontology.
literal constants: these are also usually English words, in fact single words most of the time.
the special symbol *nothing*: used to block inheritance as described below.
other miscellaneous symbols used in the K system including:
-- TMR symbols
-- lexicon symbols
-- numbers and mathematical symbols
-- other extraneous symbols (which are detected and eliminated periodically using a set of quality
control programs)
Brief listings/glossaries of TMR and lexicon symbols are provided in the accompanying report by
Mahesh and Wilson (forthcoming). Ideally, numbers are not present either in the lexicon or the ontology
(though for current purposes we may have added some to the lexicon). They are recognized as numbers by
a morpho-syntactic analyzer (such as the Panglyzer system for Spanish from the Pangloss (1994) project)
and directly incorporated into the TMR. Mathematical symbols such as the multiplication and division
operators, * and /, are used in certain slot fillers in the ontology, such as, for example, to represent
conversion factors between different MEASURING-UNITs. All other symbols are considered extraneous and
should not be present.
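The two purely lexical conventions above, alphanumeric symbols with hyphens only and instance names of the form concept-name plus a unique integer, can be checked mechanically. The following is a minimal sketch of such a check, not the project's actual quality-control code; the function names are illustrative.

import itertools
import re

SYMBOL_RE = re.compile(r"^[A-Za-z0-9]+(-[A-Za-z0-9]+)*$")
_counter = itertools.count(1)

def is_well_formed_symbol(name: str) -> bool:
    # Alphanumeric characters and hyphens only; no accents or other characters.
    return bool(SYMBOL_RE.match(name))

def make_instance_name(concept: str) -> str:
    # Append a hyphen and an arbitrary but unique integer to the concept name.
    return f"{concept}-{next(_counter)}"

print(is_well_formed_symbol("COMMUNICATIVE-EVENT"))   # True
print(is_well_formed_symbol("café"))                  # False: accented character
print(make_instance_name("CORPORATION"))              # CORPORATION-1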
2.3.1 Characteristics of Literal Symbols
Literal symbols in the ontology are used to stop unending decompositions of meanings. These symbols
are used to fill certain slots (namely LITERAL-ATTRIBUTEs) but are not defined in any way in the ontology.
Some characteristics of literal symbols worth noting include:
Literal symbols are used in our representations in much the same way as the qualitative values used
in qualitative physics and other areas of artificial intelligence (AI) that deal with modeling and design
of physical artifacts and systems (de Kleer and Brown, 1984; Goel, 1992).
Literal symbols are either binary or constitute approximate positions along an implied scale. For
binary values, it is often preferable to use attribute-specific literal symbols rather than a generic pair
(such as ‘‘yes’’ and ‘‘no’’ or ‘‘on’’ and ‘‘off’’). The primary benefit of doing this is being able to
map the lexical entries of corresponding words in a language to these literal symbols.
Literal symbols are often used when there is no numerical scale in common use in physical or social
models of the concerned part of the world. For example, OFFICIAL-ATTRIBUTE has in its range the
literals ‘‘official’’ and ‘‘unofficial.’’ Although one can talk about an EVENT or a DOCUMENT being
more official than another, there is no obvious scale that is in use in the world for this attribute. Thus
the two literals seem to serve well in the range of this ATTRIBUTE.
It is not always true that scalar and literal ATTRIBUTEs correspond to the existence or absence of a
numerical scale in the physical or social world. A classic example of this is COLOR. Although
several well-defined numerical scales exist in models of physics (such as the frequency spectrum,
hue and intensity scales, etc.), such a scale does not serve our purposes well at all. First of all, it would
make our TMRs more or less unreadable by humans if they contained a numerical frequency in place of a literal
such as ‘‘red’’ or ‘‘green.’’ Moreover, it makes lexicon acquisition more expensive; lexicographers
would have to consult a physics reference to find the semantic mapping for the word ‘‘red’’ instead
of quickly using their own intuitive understanding of its meaning.
Often, values of LITERAL-ATTRIBUTEs nevertheless need to be quantified to express the meanings
of phrases such as ‘‘light red.’’ It is not practical to introduce more and more literals (such as
‘‘light-red’’) to solve this problem. As a solution, we have introduced two general RELATIONs
in the ontology called GREATER-THAN and LESS-THAN. Using these along with a reification operation
called attribute raising, we can say LESS-THAN ‘‘red’’ in a TMR to indicate a COLOR that is
somewhere between ‘‘red’’ and the previous literal in the list of literals in the Range of COLOR, as
sketched below.
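The following minimal sketch shows how such a reified LESS-THAN or GREATER-THAN over an ordered literal Range might be interpreted. The ordering of COLOR literals below is an assumption made only for illustration; the actual Range in the ontology may differ.

COLOR_RANGE = ["black", "brown", "red", "orange", "yellow", "green", "blue", "white"]

def less_than(literal, ordered_range):
    # LESS-THAN <literal>: the region between the previous literal and <literal>.
    i = ordered_range.index(literal)
    previous = ordered_range[i - 1] if i > 0 else None
    return (previous, literal)

def greater_than(literal, ordered_range):
    # GREATER-THAN <literal>: the region between <literal> and the next literal.
    i = ordered_range.index(literal)
    following = ordered_range[i + 1] if i + 1 < len(ordered_range) else None
    return (literal, following)

# ''light red'' can be mapped to (LESS-THAN ''red''), i.e., a COLOR somewhere
# between ''red'' and the previous literal in the Range.
print(less_than("red", COLOR_RANGE))    # ('brown', 'red') under the assumed ordering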
2.4 The Need for Nonmonotonic Inheritance
The special symbol *nothing* has been introduced to block inheritance. If a parent has a slot with
particular facets and fillers and if one of its children has a *nothing* in the Sem facet for that same slot,
then the slot won’t be inherited from the parent. Since the local slot in the child has *nothing* as a filler
of the Sem facet, no instance of any OBJECT or EVENT or any number or literal will match this symbol.
As such, no filler is acceptable to this slot and the slot will never be present in any instance of the child
concept. This has the same effect as removing the slot from the child concept by blocking its inheritance
from parent concepts.
There are two reasons why we need to block inheritance using *nothing*:
In a subtree in the ontology, all but a few concepts might have a slot. It is much easier to put the slot
at the root of the subtree and block it in those few concepts (or subtrees) which don’t have that slot
rather than put the slot explicitly in each of the concepts that do accept the slot.
For example, all EVENTs take the AGENT slot except PASSIVE-COGNITIVE-EVENTs and INVOLUNTARY-
PERCEPTUAL-EVENTs. We can put the AGENT slot (with a Sem ANIMAL) in the root EVENT and put
a Sem *nothing* in PASSIVE-COGNITIVE-EVENT and INVOLUNTARY-PERCEPTUAL-EVENT. This will
effectively block the AGENT slot in the subtrees rooted under these two EVENTs while all other EVENTs
automatically have the AGENT slot.
A stronger reason for introducing this mechanism comes from the needs of lexical semantics.
Sometimes, lexical mappings of certain words need to refer to a slot of an entire class of concepts
even though a few Subclasses in that class do not have the slot. For example, in the lexical entry
for one of the senses of the Spanish word ‘‘activo,’’ we must refer to the AGENT of EVENT without
knowing which EVENT it is. This requires us to add an AGENT slot to EVENT even though there are
two Subclasses of EVENT that do not have AGENT slots. The alternative would be to list every type of
EVENT other than the above two in the entry for this word, which is not practical at all.
Another example where we have used *nothing* is RUN. All CHANGE-LOCATIONs have a THEME with
Sem PHYSICAL-OBJECT except RUN, because one does not run somebody else. Hence the THEME slot of
RUN has been blocked using a *nothing*.
In a sense, this mechanism introduces the power of having default slots, just as we have a Default facet
in a slot. A slot specified for a class of concepts acts like a default slot: it is present in every concept unless
there is an explicit Sem *nothing* filler in the concept.
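The blocking behavior can be summarized in a minimal sketch, not the actual inheritance code; the frames below are reduced to an Is-A link and one slot, and the dictionary layout is an assumption for illustration.

FRAMES = {
    "EVENT": {"Is-A": None, "AGENT": {"Sem": "ANIMAL"}},
    "PHYSICAL-EVENT": {"Is-A": "EVENT"},
    "PASSIVE-COGNITIVE-EVENT": {"Is-A": "EVENT", "AGENT": {"Sem": "*nothing*"}},
}

def inherited_slots(concept):
    # Walk up the Is-A chain, then apply frames from the most general down;
    # a Sem *nothing* filler in a descendant removes (blocks) the slot.
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = FRAMES[concept]["Is-A"]
    slots = {}
    for name in reversed(chain):
        for slot, facets in FRAMES[name].items():
            if slot == "Is-A":
                continue
            if facets.get("Sem") == "*nothing*":
                slots.pop(slot, None)
            else:
                slots[slot] = facets
    return slots

print(inherited_slots("PHYSICAL-EVENT"))            # {'AGENT': {'Sem': 'ANIMAL'}}
print(inherited_slots("PASSIVE-COGNITIVE-EVENT"))   # {} -- AGENT blocked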
2.5 Other Structural Problems
In reality, an ontology should be an And-Or graph. This, however, makes inheritance very complex. Our
ontology is an ‘‘And graph’’ where children with multiple parents must be subsumed by each of the parents.
Multiple parents mean a conjunction of each of those parents. There is no way to represent a disjunction of
parent links. This can be a problem when a child can have any of several parents but not all instances of the
child are subsumed by each of the parents.
Sometimes it is possible to overcome this problem by introducing an ATTRIBUTE as a slot in the concept,
instead of classifying further with disjunctive multiple inheritance. For example, we have classified all
SERVICE-EVENTs into MEDICAL-SERVE, TRANSPORTATION-SERVE, COMMUNICATION-SERVE, and so on. Some
of each of these types of service involve direct interaction with customers and others do not. Rather
than classifying each of these concepts into ‘‘medical-serve-customer-end,’’ ‘‘medical-serve-back-end,’’
etc., we introduce a CUSTOMER-CONTACT-ATTRIBUTE as a slot in SERVICE-EVENTs and specify ‘‘yes’’ and
‘‘no’’ as possible literal fillers. This allows us to classify the various types of service events without
unnecessarily proliferating the ontology with a large number of concepts. It may be noted here (as illustrated
in Carlson and Nirenburg, 1990) that another choice would have been to classify SERVICE-EVENTs first into
‘‘service-customer-end’’ and ‘‘service-back-end.’’ However, it should be clear that this choice would lead
to an even greater number of concepts under this tree. We would then need medical service EVENTs under
both ‘‘service-customer-end’’ and ‘‘service-back-end,’’ and so on.
The above solution of introducing an ATTRIBUTE works when there is no need to encode any other
knowledge about that slot or ATTRIBUTE. For example, we assumed in the above example that we do
not need to elaborate on what a CUSTOMER-CONTACT-ATTRIBUTE is or how it is related to customers and
contacts. If we need to elaborate on such information we need a different solution.
For example, we have classified ORGANIZATIONs into FOR-PROFIT- and NON-PROFIT-ORGANIZATIONs.
There are some ORGANIZATIONs such as a LABORATORY which can be of either type. We could not introduce
a ‘‘Laboratory-Attribute’’ and say ‘‘yes’’ or ‘‘no’’ for each ORGANIZATION. This is not only awkward, but
12
See any textbook on artificial intelligence for an explanation of And-Or graphs. For example, see Rich and Knight (1991).
13
It may also be noted here that such long names may violate the kind of naming guidelines outlined later in this document.
it also does not allow us to represent all the knowledge about laboratories that we want to represent. We
do not have the expressive power to make a slot in a concept conditional on the value of a filler in another
slot. For example, we cannot associate a ‘‘research-area’’ slot with ORGANIZATIONs and say that this slot can
be filled only when the ‘‘Laboratory-Attribute’’ slot is filled with ‘‘yes.’’ In our representation, all slots
represented in a concept are valid and can be filled at any time. Moreover, no matter what dimensions (such
as profit/non-profit) we choose for classifying a concept into its children concepts, there will often be some
children or descendants which do not belong exclusively under any one branch. Since there is no disjunction
in subsumption links (Is-A and Subclasses) in the ontology, we cannot make that descendant a child of more
than one such concept. For example, we cannot put LABORATORY directly under both FOR-PROFIT-ORGANIZATION
and NON-PROFIT-ORGANIZATION. Such a thing would be self-contradictory since it would mean that every
LABORATORY is both a FOR-PROFIT-ORGANIZATION and a NON-PROFIT-ORGANIZATION. Our solution to this
problem is shown in Figure 11. This solution guarantees that there is still a place to put all the knowledge
that is common to both types of laboratories. However, classifications along multiple dimensions are
collapsed into the same level of the hierarchy, which can be unintuitive. For example, ORGANIZATIONs
are now classified as LABORATORYs, NON-PROFIT-ORGANIZATIONs, and FOR-PROFIT-ORGANIZATIONs, which
could be read as ‘‘laboratories are neither for-profit nor non-profit,’’ though this is not the intended meaning
of the classification.
Figure 11: Pattern to Produce Overlapping Subclasses. (ORGANIZATION has Subclass links to LABORATORY, FOR-PROFIT-ORGANIZATION, and NON-PROFIT-ORGANIZATION; LABORATORY has Subclass links to LABORATORY-INDUSTRIAL and LABORATORY-ACADEMIC, which are in turn also tied by Subclass links to the profit-based branches.)
In effect, we are saying that when a concept has more than one child, the children do not represent
disjoint subsets that cover the parent concept. Rather, they represent overlapping subsets that together cover
the parent concept. The relationships between these sets and subsets are shown in Figure 12 for our example
of the LABORATORY. While Figure 12 shows the intended meaning of this representation, Figure 13 shows
a possible misinterpretation of it. If the ontology developers themselves misinterpret such a representation
in this manner (which is possible and in fact likely), they could add a third child to LABORATORY, thereby
making the interpretation in Figure 13 the only possible interpretation. There is nothing in our current
representation or our tools to prevent such an eventuality. In addition, in some other case of a parent and a
set of children, if the intended meaning is disjoint classification (as shown in Figure 14), we do not have a
representational mechanism to distinguish this from the overlapping cover of a parent shown in Figure 12.
Figure 12: Overlapping Subclasses: Intended Meaning. (Intended meaning: organizations are either for-profit or non-profit; laboratories are either for-profit or non-profit; all laboratories are either for-profit or non-profit organizations.)
Figure 13: Overlapping Subclasses: Possible Misinterpretation. (Possible meaning: organizations are either for-profit, non-profit, or laboratories; some laboratories are for-profit and some are non-profit; there are laboratories that are neither for-profit nor non-profit organizations.)
Figure 14: Disjoint Subclasses: Intended Meaning. (Intended meaning: all animals are either vertebrates or invertebrates; there are no animals that are neither, nor any vertebrates or invertebrates that are not animals.)
One solution to the above problems is to explicitly represent multiple orthogonal classifications in the
ontology. For example, we could allow multiple sets of fillers for the Subclasses of each concept, where
each individual set is a partition (i.e., an exclusive-or, or disjoint classification) of the parent concept into
children concepts. The cost of doing this is a significant decrease in the usability and ease of comprehension
of the ontology. It would no longer be possible to look at an individual frame in the ontology and understand
the full meaning of the concept it represents. (See Section 3.2 for a discussion of limited expressiveness
and its virtues.)
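Purely for illustration, and only of the rejected alternative, the following sketch shows what multiple orthogonal classifications might look like if each entry of Subclasses were itself one partition along a single dimension. The dictionary layout and the name NON-LABORATORY-ORGANIZATION are hypothetical; this machinery is not part of the µK ontology.

# Hypothetical only: this representational machinery was considered and NOT adopted.
ORGANIZATION = {
    "Subclasses": [
        ["FOR-PROFIT-ORGANIZATION", "NON-PROFIT-ORGANIZATION"],   # profit dimension
        ["LABORATORY", "NON-LABORATORY-ORGANIZATION"],            # illustrative second dimension
    ],
}

def all_children(frame):
    # Collapse the orthogonal partitions into a single set of children.
    return {child for partition in frame["Subclasses"] for child in partition}

print(sorted(all_children(ORGANIZATION)))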
2.6 Complex Events: A Proposal and its Rejection
A complex concept is one that cannot be represented in a single frame in the ontology. Each concept in
the current ontology is represented in a single frame. A complex concept may also involve the need for
ontological instances (see below) and variable binding within the complex concept. It is desirable to be able
to introduce complex concepts into the ontology for the following reasons:
Cleanliness of Knowledge Partitioning: A good criterion for partitioning knowledge between lexicons
and the ontology is whether a piece of knowledge is specific to a language (and hence belongs in
the lexicon for that language) or is general (and hence belongs in the ontology). However, in order to
adhere to this rule, we must be able to represent complex concepts in the ontology. For example, in
the current ontology, we cannot represent the fact that there must be at least two ORGANIZATIONs for
14
We use the term complex concept because we might have a need for complex objects at some point in addition to complex
events (and be able to use the same representation for both). However, only the representation of complex events is discussed here.
a MERGE to take place. The only alternative at present is to violate the above rule and duplicate this
knowledge in the lexicon for each language.
Inference and Context: In order to take semantic analysis beyond disambiguation using selectional
restrictions, we need to be able to model context and make inferences from nascent TMRs. This
requires having more detailed representations of situations and therefore complex concepts in the
ontology.
A complex event can be represented as follows:
a head or main frame which bears the name of the complex concept will be attached to the hierarchies
just like any other concept (through Is-A links).
this main frame has a Subevent slot that is filled by one or more fillers that are "ontological instances."
an ontological instance is an "orphan" in the ontology with no Is-A or Instance-Of link; it hangs off
of a Subevent-Of slot from the complex event. However, we need to introduce an Onto-Instance-Of
link to attach these subevents to the parent event of their types. This will serve to distinguish them
from onomasticon instances.
the fillers of a Subevent slot are implicitly ordered in sequence. Any other ordering should be handled
by temporal relations.
the head frame will have an Onto-Part slot where we put all additional relations such as temporal or
any other reified relation frames that are part of the description of the complex event.
This way, we can represent entire TMRs as complex concepts in the ontology. For example, the concept
of a conversation can be represented as follows:
Example: CONVERSATION
(make-frame CONVERSATION
(Definition (Value "a complex event that involves more than one
communicative-event between 2 or more humans where
more than one person does the communication."))
(Time-Stamp (Value "Not yet created!"))
(Is-A (Value COMMUNICATIVE-EVENT))
(Subevent (Value CONVERSATION-COMMUNICATE-1 CONVERSATION-COMMUNICATE-2))
)
(make-frame CONVERSATION-COMMUNICATE-1
(Definition (Value "a subevent of the complex event CONVERSATION."))
(Time-Stamp (Value "Not yet created!"))
(Subevent-Of (Value CONVERSATION))
(Onto-Instance-Of (Value COMMUNICATIVE-EVENT))
(Agent (Value CONVERSATION-COMMUNICATE-2.Destination))
15
Of course, it is possible to introduce a ‘‘number-of-organizations’’ ATTRIBUTE and fill (≥ 2) in it, but that is a hack and is
redundant with the way we represent sets. Since we employ set notations in our meaning representations, we should be able to use
cardinalities of sets to represent the minimum-2 requirement for MERGE. Using sets implies using more than one frame to represent
the MERGE concept.
(Destination (Value CONVERSATION-COMMUNICATE-2.Agent))
(Aspect (Value (duration prolonged) (iteration multiple)))
)
(make-frame CONVERSATION-COMMUNICATE-2
(Definition (Value "a subevent of the complex event CONVERSATION."))
(Time-Stamp (Value "Not yet created!"))
(Subevent-Of (Value CONVERSATION))
(Onto-Instance-Of (Value COMMUNICATIVE-EVENT))
(Agent (Value CONVERSATION-COMMUNICATE-1.Destination))
(Destination (Value CONVERSATION-COMMUNICATE-1.Agent))
(Aspect (Value (duration prolonged) (iteration multiple)))
)
;; Add temporal relations here to indicate that each subevent occurs
;; "during" the other.
We can also make full use of the set notation in describing complex concepts in the ontology. For
example, for MERGE, we will now be able to say that there must be at least two ORGANIZATIONs that merge.
In summary, new machinery needed for the representation of complex events proposed above includes:
the Subevent and Subevent-Of relations
the Onto-Instance-Of and Onto-Instances relations
the Onto-Part and Onto-Part-Of relations.
Subevents are also Onto-Part-Ofs. However, we introduce Subevents to distinguish sub-events from
other components of a complex concept and to order them implicitly. We could also introduce any
relationships between the subevents (other than a sequence) as needed and add them to the Onto-Part slot.
It may be noted in addition that the dot notation above, such as in ‘‘conversation-communicate-
2.Destination,’’ refers to slots of other subevents without any direct link to them. That is, there is no link between
‘‘conversation-communicate-1’’ and ‘‘conversation-communicate-2’’ above. The above representation
assumes that we can get the necessary information by going up the Subevent-Of link and then down the
Subevent link. Alternatively, we could introduce an Other-Subevent relation to fill in this information
directly in each subevent.
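The up-and-down traversal just described can be sketched as follows. This is a minimal illustration, not the analyzer's code; the dictionary layout is assumed, and SPEAKER-A is a hypothetical concrete filler introduced only so the example produces a value.

FRAMES = {
    "CONVERSATION": {"Subevent": ["CONVERSATION-COMMUNICATE-1",
                                  "CONVERSATION-COMMUNICATE-2"]},
    "CONVERSATION-COMMUNICATE-1": {"Subevent-Of": "CONVERSATION",
                                   "Agent": "CONVERSATION-COMMUNICATE-2.Destination"},
    # SPEAKER-A is a hypothetical filler used only to make the example run.
    "CONVERSATION-COMMUNICATE-2": {"Subevent-Of": "CONVERSATION",
                                   "Destination": "SPEAKER-A"},
}

def filler(frame_name, slot):
    value = FRAMES[frame_name][slot]
    if isinstance(value, str) and "." in value:       # a dotted cross-reference
        head = FRAMES[frame_name]["Subevent-Of"]      # go up the Subevent-Of link
        sibling, sibling_slot = value.split(".")
        assert sibling in FRAMES[head]["Subevent"]    # and back down the Subevent link
        return FRAMES[sibling][sibling_slot]
    return value

print(filler("CONVERSATION-COMMUNICATE-1", "Agent"))  # SPEAKER-A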
2.6.1 Implications of Introducing Complex Concepts
Introducing complex concepts into the ontology will have many implications for lexicon acquisition and
semantic analysis, some of which make it an expensive proposition. Some of the costs of introducing
complex concepts include:
A new semantic analyzer. Significant non-local changes need to be made to the µK semantic analyzer
to deal with variable binding, instantiation, and constraint checking with complex concepts.
Better tools. We will need at least some improvements to our acquisition, browsing, and display tools
to deal with complex concepts with new types of links between them.
New training: Additional training will be needed not only for ontology acquirers but also for all the
lexicographers and testing personnel.
For example, in the analyzer, instantiation and variable binding become very complex. We get
syntax-semantics variable binding information from the complex concept in addition to the lexicon. How
do we combine all of this information so that we can use the selectional constraints buried inside the
subconcepts of the complex event? Should the lexicon duplicate this information from the ontology and
map the variables to syntactic variables, or can it provide only partial mappings? For example, the lexicon can
provide the mappings for the head concept in the complex event and the analyzer can somehow identify the
fillers for the other concepts in the complex concept.
Software tools for acquiring and browsing lexical and ontological knowledge bases also need to be
made more sophisticated to handle complex concepts. This will complicate their design, make them more
expensive, and perhaps also slow. For example, in the ontology acquisition editor, we should at least be
able to see Subevent and Subevent-Of links just as Is-A and Instance-Of links (so that we can use graphical
editing operations, such as the add-link operation in the Mikrokarat tool, on them).
Given these costs, we have chosen not to introduce complex concepts at this time. It is a desirable
enhancement of our ontological representations but not a practical one at the present time. Another
justification for this stance is that complex concepts may not be as necessary for machine translation as they
are for other NLP tasks, such as question answering or reasoning of some sort.
2.6.2 Ontological Instances
The introduction of complex concepts also entails the need for variable binding across frames in the
ontology. For example, the DESTINATION of one Subevent is the same as the AGENT of the other Subevent
in the CONVERSATION complex event above. (See Carlson and Nirenburg, 1990 for a second example, the
TEACH complex event.) As noted above, such binding can be done by introducing the notion of ontological
instances, that is, instances of concepts that reside in the ontology purely for the purpose of variable binding
across frames in a complex concept. These instances add to the complexities of representation, browsing,
and software design for ontology development in significant ways, and hence we have not attempted to
include them in the µK ontology.
3 Principles of Ontology Design
Though there is no set of formal principles that determine precisely the structure or content of the ontology
we are developing, we can list several principles that constitute the foundation of our methodology and that
suggest the guidelines we follow. The principles can also be viewed as the desiderata for the ontology we
are developing.
These principles are task oriented or situated unlike the structural principles for ontologies that have
been proposed in the literature (Bouaud, Bachimont, Charlet, and Zweigenbaum, 1995; Zarri, 1995). In
fact, we show below that many of the structural principles had to be violated in our ontology in order to
16
The issue of generic instances is a separate one; it deals with instances of concepts, simple or complex, such as ‘‘the Big
Mac,’’ that do not refer to any particular entity in the world.
meet the practical objectives of simplicity in representation, software requirements, understandability and
ease of browsing by non-experts.
3.1 Ontology Development: Desiderata
The µK ontology is based on the following desiderata:
1. An ontology must be language independent. An ontology is not a transfer dictionary between a pair
of languages. Our ontology is language independent in two ways:
(a) It is not specific to any particular natural language such as English or Spanish.
(b) The concepts in the ontology do not have a one-to-one mapping to word senses in natural
languages. Many concepts may not map to any single word in a language; other concepts may
map to more than one word in the same language.
What counts as a concept should be a decision that is quite independent of what words exist in a
particular language. An illustration of our approach for deciding what concepts to add to the ontology
is provided in Section 7.5 of this report. In the µK project, ontology developers are not Spanish or
Japanese speakers, so we hope to balance any Spanish or Japanese bias of lexicographers against
an English (or other) bias on the part of ontology developers. This situation, where lexicons for two
different languages are being developed simultaneously with the development of the ontology, is an
ideal one for ensuring language independence.
2. An ontology must be independently motivated, not dictated by the lexicon of a particular language.
Ontology development is not subservient to lexicography. The two are sister processes that aid each
other and at the same time constrain each other in significant ways.
3. An ontology must be well-formed according to a precise, axiomatic specification (such as the one
shown in Appendix B) and internally consistent in structure, naming, and content across concepts.
4. It must be consistent and compatible with other knowledge and processing resources such as the
lexicon, the semantic analyzer, the language of the TMR, and any expert microtheories.
5. An ontology must be rich in content, conceptual structure, and degree of connectivity among concepts.
For machine translation and other NLP tasks, its structure is necessarily rich; it cannot be just a
hierarchy of concept names. A concept cannot be just a label; it must have a rich internal structure
resulting in a high degree of connectivity with other concepts in the ontology.
6. Limited expressiveness is a virtue. In particular, we reject full first-order predicate logic and the
ability to make arbitrary assertions within the ontology. Limited expressiveness is essential for the
acquisition of large-scale lexicons that conform to the ontology.
7. An ontology must be easy to comprehend. It must be simple to search for concepts. It must be easy to
browse, easy to train acquirers, presentable, and so on. For example, an And-Or tree with disjunctive
inheritance is not suitable for our ontology because it is too hard to comprehend and use for both
ontology acquirers and lexicographers. Computational complexity of the inheritance algorithm is not
the reason for rejecting complex inheritance methods.
17
Although, for practical reasons, one might want to introduce phrasal lexical entries such as compound nouns that map directly
to many concepts in the ontology.
8. An ontology must have a high utility for the task it is meant for. It must ultimately aid language
processing in resolving a variety of ambiguities and making necessary inferences. Ontology
development is a goal driven process: first there must be one or more tasks, well understood ways
of using the ontology for the task(s), and an immediate need for the knowledge to be acquired. For
example, although the ontology must be language independent, it could be rather language-processing
dependent in the following ways:
(a) It must have all the necessary knowledge to sufficiently constrain language interpretation and
generation.
(b) It must be a richly connected network of concepts to enable a variety of inferences.
(c) It must support deep but variable-depth semantics. The semantics must be deeper than what is
possible with just labels for word senses.
9. The ontology must be cost effective. Not all knowledge that we can lay our hands on should be added
to the ontology. Given that we do not have the resources to build a complete encyclopedic ontology
of the entire world, we should be acquiring only those concepts and conceptual relations that are of
immediate utility in the chosen task(s). This can be monitored in a simple way, in the µK situation
for example, by calculating the number of concepts that are not used in any of the lexical entries (nor
any of whose descendants are used); a sketch of such a check appears after this list. It turns out that in the
current µK ontology, the maximum height of an unused subtree is just two, indicating that there is no large
portion of the ontology that is of no utility for the µK system. This shows clearly that we have in fact
been doing a situated development of the µK ontology (see Section 7).
10. It must have a limited scope to make its acquisition tractable. The ontology is not an Encyclopedia
Britannica; it has a significantly smaller scope than Cyc (Lenat and Guha, 1990).
11. The ontology has conceptual but not episodic knowledge. Acquisition of episodic knowledge (i.e., an
onomasticon) is significantly different in scope, methodology, and cost from ontology development.
Our methods for concept acquisition are not cost-effective for acquiring instances in the onomasticon.
Onomasticon acquisition must be automated to a far higher degree.
12. The ontology is not limited to a domain but does focus on a domain of choice.
13. Ontology development must be technology aided. The task is made more tractable by the deployment
of the latest technologies: faster machines, color graphical user interfaces, graphical browsers and
editors, on-line lexicons, corpora, other ‘‘ontologies,’’ semi-automated interfaces for customer
interactions and database maintenance, and programs to detect errors and inconsistencies and enforce
a comprehensive set of guidelines.
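As promised in item 9, the following is a minimal sketch of the cost-effectiveness check described there: find subtrees of the ontology none of whose concepts is referenced by any lexical entry, and report the height of the largest such subtree. The data structures, the toy hierarchy, and the convention of counting a single concept as height one are assumptions for illustration; this is not the project's actual audit program.

CHILDREN = {
    "EVENT": ["SOCIAL-EVENT", "PHYSICAL-EVENT"],
    "SOCIAL-EVENT": ["COMMUNICATIVE-EVENT"],
    "PHYSICAL-EVENT": [],
    "COMMUNICATIVE-EVENT": [],
}
USED_IN_LEXICON = {"EVENT", "COMMUNICATIVE-EVENT"}   # concepts referenced by some lexical entry

def subtree(concept):
    # The concept together with all of its descendants.
    nodes = {concept}
    for child in CHILDREN[concept]:
        nodes |= subtree(child)
    return nodes

def height(concept):
    kids = CHILDREN[concept]
    return 1 if not kids else 1 + max(height(c) for c in kids)

def max_unused_subtree_height(root):
    best, stack = 0, [root]
    while stack:
        concept = stack.pop()
        if not (subtree(concept) & USED_IN_LEXICON):  # wholly unused subtree
            best = max(best, height(concept))
        else:                                         # some descendant is used; look deeper
            stack.extend(CHILDREN[concept])
    return best

print(max_unused_subtree_height("EVENT"))             # 1 for this toy hierarchy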
In the situation of NLP, a good way to operationalize the desire for a limited proliferation of concepts
is to say that the number of concepts must be much smaller than the number of words in any one lexicon.
In µK, at present, more than two Spanish words refer, on average, to each concept that is used in the lexicon.
The task oriented and situated nature of ontology development can be characterized as an attempt to
maximize the ratio of the number of word senses in the lexicons to the number of concepts in the ontology.
18
This ratio is expected to go up to about 6 once the lexicon is populated by entries generated automatically by means of a
derivational morphology engine currently under development.
In a more fine-grained analysis, one can actually consider each slot of a concept (after accounting
appropriately for inheritance) to see how much of the actual knowledge encoded in the ontology is used in
processing a set of texts. Such an analysis of the µK ontology will be attempted at a later time.
3.2 Limited Expressiveness and Usability
In our practical, situated approach to ontology building, the usability of the ontology in browsing and
searching for the right concepts is a fundamental requirement. To this end we have chosen a representation
for concepts that is rather limited in its expressiveness. The ontology is essentially a frame hierarchy with
multiple inheritance. In general, it is not possible to represent, within the ontology, negations, ternary and
higher-order relations (without creating an OBJECT or EVENT), universal and existential quantification or
any form of sets, disjunctions, or other arbitrary assertions. Such expressiveness is available to a certain
extent in other components of the µK system, such as in lexicons and TMRs. Some of the above features
are also available in restricted forms in particular parts of the ontological representation (as can be seen
from examples in this report).
Methodologically, a primary use of the ontology is in supporting the representation of word meanings
in the lexicon. Lexicon acquirers must be able to comprehend the ontology, visualize the ways in which it
organizes concepts, and find the right concept(s) to construct the representation of the meaning of a word.
It is critical in the µK situation that users be able to browse the ontology and see an entire concept together
in one place. Therefore, the entire set of slots and fillers for a concept must be represented in one place. The
meaning of a concept cannot depend on which set of parents a particular instance has. Nor should it require
inferences to be drawn from assertions which may be distributed all over the ontology. Introducing full
first-order assertions makes the ontology impossible to browse. For our purposes, we must ensure that the
ontology is an entirely declarative body of knowledge that can be browsed statically. We must be able to
compute all inheritance effects a priori and let our users see the resulting static description of each concept.
Increasing the expressiveness of the ontology beyond simple frame inheritance makes it a black box that
can be used only to ask queries and get answers back. Our customers need to visualize the entire ontology,
not just derive answers to specific queries.
3.3 Structural Principles and Why We Violate Them
Several structural principles have been proposed in ontology literature as methodological principles for
constructing a taxonomy. Instead of starting with a predetermined structure for the ontology and attempting
to carve the world to fit that structure, we look at the world (or domain) and decide what is a reasonable
structure given the above desiderata. In this section, we take up a set of four principles suggested by
Bouaud, Bachimont, Charlet, and Zweigenbaum (1995) and illustrate why we had to violate each of these
principles on several occasions. Our methodology relies on guidelines that we developed to help adhere to
a set of more complex axioms (see Appendix B) instead of mandating a simple set of structural principles.
The principles suggested include:
Similarity Principle: This principle essentially says that a child must share the meaning of a parent.
Since, in our case, the meaning of a concept is the combination of the information in all its slots, this
19
Some of these principles are very well known and, in fact, date back to Aristotle.
principle means that a child must inherit all the slots of a parent. As already seen in Section 2.4, we
often need to block inheritance using the special symbol *nothing*. When we do that, a child
does not share the meaning that is part of the parent concept. For example, many ANIMALs have a
MATERIAL-OF RELATION to AGRICULTURAL-PRODUCTs. However, HUMANs do not constitute materials
of which agricultural products are made. As such, we must block the MATERIAL-OF slot in HUMAN.
Specificity Principle: This states that the child must differ from its parent in a distinctive way which
is the necessary and sufficient condition for being the child concept. Often, in our ontology, there is
no distinction represented between a parent and its child. The distinction may not be useful for our
purposes in the MT situation. We often cannot justify the cost of formulating and representing all
such distinctions. For example, the distinction between ANIMAL and its child concept INVERTEBRATE
is not encoded in our ontology.
Opposition Principle: This states that a concept must be distinguishable from its siblings and the
distinction between each pair of siblings must be represented. As noted below in Section 4.2, and
illustrated by the pair of concepts WALK and RUN, such distinctions are often unnecessary in our
situation and are not represented.
Unique Semantic Axis Principle: This states that all children of a given concept must differ from
one another along a single dimension. As already illustrated with the example of LABORATORY in
Section 2.5, this too is unduly restrictive for our purposes. If we were to adhere to this principle, the
tradeoff would be to introduce disjunctions among parents when there are multiple parents, a choice
that we believe hurts the comprehensibility of the ontology to a far greater extent than not having the
semantic axis principle.
4 Ontology as a Sharable Resource
Only ontologies that are constructed as computational entities can be shared effectively. The informational
content of a computational ontology is much more important in solving practical problems than the form
in which it is represented. We do not place much emphasis on the representational formalism used in an
ontology database. We are open to converting the µK ontology into another format if a standard
emerges. Until such time, we must live with different representation formats in addition to different
choices of primitives, conceptual relations and configurations.
Given the diversity in ontological designs, a good way to share them as computational resources may
be to share a set of tools which provides a common substrate. Various translators can be built upon this
substrate for converting an ontology from one representation to another. Such translations of ontologies can
then be shared between modules and subgroups of a project, with other projects, and with the community
at large. We already have such a system in operation within the µK project, where the ontology is routinely
translated between two different representational forms and is shared among the large group of people
working for the project. We also have the beginnings of other translators such as one that converts TMRs
to a template of the kind employed in the TIPSTER text processing initiative.
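As a simple illustration of the kind of translator mentioned above, the following minimal sketch converts a frame from an in-memory dictionary form into the textual make-frame form used elsewhere in this report. The function name and the slot fillers shown for ACQUIRE are illustrative placeholders, not the project's actual translator or the actual ACQUIRE frame.

def to_make_frame(name, frame):
    # Emit a frame in a textual make-frame form, one slot per line.
    lines = [f"(make-frame {name}"]
    for slot, facets in frame.items():
        facet_text = " ".join(f"({facet} {value})" for facet, value in facets.items())
        lines.append(f"  ({slot} {facet_text})")
    lines.append(")")
    return "\n".join(lines)

# The slot values below are illustrative placeholders only.
acquire = {
    "Is-A": {"Value": "EVENT"},
    "Agent": {"Sem": "HUMAN"},
}
print(to_make_frame("ACQUIRE", acquire))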
20
Surprisingly, Bouaud et al. (1995) argue that there is no need for multiple inheritance and that ontologies should be simple
trees. We find this vastly inadequate for our requirements and those of most knowledge representation systems.
4.1 Sharability and the Ten Commitments
Although we are not committed to the format we use for concept representation, we are committed to the
following characteristics of our ontology. If a standard emerged, either in knowledge representation or
in ontological representations, such as, for example, KIF (Genesereth and Fikes, 1992), these are the
characteristics that we would consider in deciding whether and how to port the µK ontology to the standard
format.
The 10 Commitments:
1. Broad coverage: Since our input texts are real-world, unedited news articles, they contain words
which have meanings from practically any domain in the world. In order to represent those word
meanings in the lexicons and to process the texts in the µK system, the ontology must contain
concepts that cover a broad range of topics in the world. A domain-specific ontology, such as one
that may be sufficient for database merging or process model integration in a domain, will not serve
our purposes. This does not mean that our ontology must contain every conceivable concept in the
world before we start using it for machine translation. On the other hand, it must cover many of
the commonly used terms from a wide range of domains, leaving out the more technical terms from
domains that are not the focus of our texts. For example, we do not need all the terminology used in
neuroscience and brain surgery, but we are very likely to need commonly used terms such as BRAIN
and SURGERY in our ontology.
2. Rich properties and interconnections: One of the biggest uses of our ontology in processing natural
language texts is in checking how well selectional constraints are satisfied. In the majority of cases,
constraints are not directly satisfied by natural language texts. They are often partially satisfied by
each of the possible meanings of an ambiguous piece of text. In order to compare the various choices
against each other and determine the best choice, there must be a rich set of connections between
concepts in the ontology in the form of a number of properties of concepts. Given any pair of concepts,
we must be able to find the best (as in shortest or least-cost) path between the two in the ontology.
We want our ontology to act not only as a taxonomic classification of concepts in the world, but
also very much like a semantic network. The difference between the two types of knowledge bases
is shown in Figures 15 and 16, which show the concept CORPORATION with hierarchical information
only and in its actual form in the µK ontology, respectively.
3. Ease of understanding, searching and browsing: The majority of our customers (lexicographers
and testing and evaluating personnel) are not experts in knowledge representation. They are expert
linguists or experts in a particular language or domain. It must be easy for them to search for a
concept in the ontology, to browse the hierarchies, and understand the relationships between different
concepts in the ontology. They should be able to do all this starting with just a rough sense (or
gloss) of the meaning they are trying to represent in the lexicon or find in the ontology. In order
to meet this goal, we had to lean towards simplicity rather than aim for better expressive power or
theoretical cleanliness in resolving a number of issues as noted throughout this report. Choices made
for enhancing ease of understanding include:
No complex concepts: each concept is represented in exactly one frame.
No disjunction in inheritance: a simple, depth-first, conjunctive inheritance algorithm is used;
inheritance is precomputed and inherited slots displayed to the user.
No default inheritance: default values are not inherited.
No ontological instances.
No multiple views: there are no alternative views at the hierarchy, concept, slot, or facet levels.
No sets and quantifiers: there are no set or quantifier notations in the ontology; such
expressiveness is confined to concepts in lexical and TMR representations.
All slots are equal: there is no distinction between definitional and factual or necessary and
peripheral slots; a concept is the conjunction of all its slots.
Searching for a concept is facilitated by a simple string matching technique in our browsing tools.
Users can search for a concept by providing a sub-string of its name or any sub-string of its English
definition. An example is shown in Figure 17 where all concepts with the string ‘‘food’’ in them are
listed on the left while all those with ‘‘food’’ mentioned somewhere in their definition strings are
shown on the right. In addition, our tools enable users to view the hierarchies graphically as well as
jump from one concept to any other concept that it is related to.
4. NLP-oriented: The µK ontology is designed and built for machine translation. As illustrated in
Section 4.2, the kinds of knowledge that are needed in concept descriptions for machine translation
may not be the same as what is needed for a reasoning or inference task. Our task is to extract and
represent the ‘‘direct’’ meaning of the input text using the concepts in the ontology. We often do not
need to make elaborate inferences for this purpose.
5. Economy/cost-effectiveness/tractability: As in most projects, we have limited resources to build the
ontology. We don’t have either the time or the number of people to expend several person-centuries
to build the ontology. Instead, we must build a broad-coverage, usable ontology in only a few person
months. The ontology must be usable at every stage in its development no matter how partial or
inconsistent it is.
6. Language independence: Concepts in the ontology must not be based on words in any one natural
language. Our goal is to derive an interlingual meaning representation from any of a set of natural
languages. Although we use English names for concepts, this is merely for convenience and
readability. Concept names do not correspond one to one with English words or words in another
language such as Spanish. A good example of a language dependent ontology is WordNet (Miller,
1990).
7. No unconnected terms: Every concept in the ontology must be related to other concepts in well-
formed ways. There should be no disjoint components in the ontological graph. This is necessary
since virtually any two concepts might have to be compared for ‘‘closeness’’ in checking selectional
constraints during NLP.
8. Taxonomic organization and Inheritance: These are inevitable for representing word and text
meanings succinctly. If we had just a list of concepts not organized in any hierarchical way, lexical
semantic entries would be prohibitively long and tedious to acquire.
9. Intermediate-level grain size: As illustrated in the discussion on complex concepts (Section 2.6),
we do not want either an elaborate decomposition of meanings into a small set of primitives or little
decomposition where each word sense is its own atomic concept. We want word meanings to be
decomposed to a significant extent so that relationships and commonalities in meanings between
words become clear and so that meaning representations are not language specific. Yet, we do not
want to enforce a closed set of primitive concepts into which all meanings should be decomposed.
The search for such an ‘‘ideal’’ set of primitives becomes a research enterprise of its own, has been
attempted by other researchers in the past, and does not suit our practical, situated approach well.
10. Equal status