A Computational Framework for Non-Lexicalist Semantics
Jimmy Lin
MIT Computer Science and Artificial Intelligence Laboratory
Cambridge, MA 02139
jimmylin@csail.mit.edu
Abstract
Under a lexicalist approach to semantics, a verb
completely encodes its syntactic and semantic
structures, along with the relevant syntax-to-
semantics mapping; polysemy is typically at-
tributed to the existence of different lexical en-
tries. A lexicon organized in this fashion con-
tains much redundant information and is un-
able to capture cross-categorial morphological
derivations. The solution is to spread the “se-
mantic load” of lexical entries to other mor-
phemes not typically taken to bear semantic
content. This approach follows current trends
in linguistic theory, and more perspicuously ac-
counts for alternations in argument structure.
I demonstrate how such a framework can be
computationally realized with a feature-based,
agenda-driven chart parser for the Minimalist
Program.
1 Introduction
The understanding of natural language text includes not
only analysis of syntactic structure, but also of semantic
content. Due to advances in statistical syntactic parsing
techniques (Collins, 1997; Charniak, 2001), attention has
recently shifted towards the harder question of analyzing
the meaning of natural language sentences.
A common lexical semantic representation in the com-
putational linguistics literature is a frame-based model
where syntactic arguments are associated with various se-
mantic roles (essentially frame slots). Verbs are viewed
as simple predicates over their arguments. This approach
has its roots in Fillmore’s Case Grammar (1968), and
serves as the foundation for two current large-scale semantic
annotation projects: FrameNet (Baker et al., 1998)
and PropBank (Kingsbury et al., 2002).
Underlying the semantic roles approach is a lexicalist
assumption, that is, each verb’s lexical entry completely
encodes (more formally, projects) its syntactic and
semantic structures. Alternations in argument structure
are usually attributed to multiple lexical entries (i.e., verb
senses). Under the lexicalist approach, the semantics of
the verb break might look something like this:
(1) break(agent, theme)
      agent: subject
      theme: object
    break(agent, theme, instrument)
      agent: subject
      theme: object
      instrument: oblique(with)
    break(theme)
      theme: subject
    ...
The lexicon explicitly specifies the different subcate-
gorization frames of a verb, e.g., the causative frame, the
causative instrumental frame, the inchoative frame, etc.
The major drawback of this approach, however, is the
tremendous amount of redundancy in the lexicon—for
example, the class of prototypical transitive verbs where
the agent appears as the subject and the theme as the di-
rect object must all duplicate this pattern.
The typical solution to the redundancy problem is
to group verbs according to their argument realization
patterns (Levin, 1993), possibly arranged in an inheri-
tance hierarchy. The argument structure and syntax-to-
semantics mapping would then only need to be specified
once for each verb class. In addition, lexical rules could
be formulated to derive certain alternations from more basic
forms.
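The class-based solution can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `causative_alternating`, the second verb `shatter`, and the helper `frames_for` are hypothetical and do not come from any existing lexicon. The point is that the three frames of example (1) are stored once and shared by every verb in the class.

```python
# Sketch: subcategorization frames stored once per verb class, not per verb.
# Class names, class membership, and helper names are illustrative assumptions.

# Each frame maps semantic roles to grammatical functions, as in example (1).
VERB_CLASSES = {
    "causative_alternating": [
        {"agent": "subject", "theme": "object"},
        {"agent": "subject", "theme": "object", "instrument": "oblique(with)"},
        {"theme": "subject"},
    ],
}

# The lexicon records only class membership, so frames are never duplicated.
LEXICON = {
    "break": "causative_alternating",
    "shatter": "causative_alternating",  # assumed member, for illustration
}


def frames_for(verb):
    """Return the frames a verb inherits from its class."""
    return VERB_CLASSES[LEXICON[verb]]


if __name__ == "__main__":
    for frame in frames_for("break"):
        print(frame)
```

Even in this form, the lexicon has no way of relating a verb to a morphologically related adjective or noun, which is the limitation discussed next.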
Nevertheless, the lexicalist approach does not capture
productive morphological processes that pervade natural
language, for example, flat.ADJ → flatten.V or hammer.N
→ hammer.V; most frameworks for computational
semantics fail to capture the deeper derivational relationship
between morphologically related terms. For languages
with rich derivational morphology, this problem
is often critical: the standard architectural view of morphological
analysis as a preprocessor presents difficulties
in handling semantically meaningful affixes.
In this paper, I present a computational implementation
of Distributed Morphology (Halle and Marantz, 1993), a
non-lexicalist linguistic theory that erases the distinction
between syntactic derivation and morphological derivation.
This framework leads to finer-grained semantics capable
of better capturing linguistic generalizations.
2 Event Structure
It has previously been argued that representations based
on a fixed collection of semantic roles cannot adequately
capture natural language semantics. The actual inventory
of semantic roles, along with precise definitions and diagnostics,
remains an unsolved problem; see (Levin and
Rappaport Hovav, 1996). Fixed roles are too coarse-grained
to account for certain semantic distinctions; the
only recourse, to expand the inventory of roles, comes
at the price of increased complexity, e.g., in the
syntax-to-semantics mapping.
There is a general consensus among theoretical linguists
that the proper representation of verbal argument
structure is event structure: representations grounded in
a theory of events that decompose semantic roles in
terms of primitive predicates representing concepts such
as causality and inchoativity (Dowty, 1979; Jackendoff,
1983; Pustejovsky, 1991b; Rappaport Hovav and Levin,
1998). Consider the following example:
(2) He sweeps the floor clean.
[ [ DO(he, sweeps(the floor)) ] CAUSE
[ BECOME [ clean(the floor) ] ] ]
Dowty breaks the event described by (2) into two
subevents, the activity of sweeping the floor and its result,
the state of the floor being clean. A more recent approach,
advocated by Rappaport Hovav and Levin (1998), de-
scribes a basic set of event templates corresponding to
Vendler’s event classes (Vendler, 1957):
(3) a. [ x ACT<MANNER> ] (activity)
    b. [ x <STATE> ] (state)
    c. [ BECOME [ x <STATE> ] ] (achievement)
    d. [ x CAUSE [ BECOME [ y <STATE> ] ] ] (accomplishment)
    e. [ [ x ACT<MANNER> ] CAUSE [ BECOME [ y <STATE> ] ] ] (accomplishment)
A process called Template Augmentation allows basic
event templates to be freely “augmented” to any other
event template. This process, for example, explains the
resultative form of surface contact verbs like sweep:
(4) a. Phil swept the floor.
       [ Phil ACT<SWEEP> floor ]
    b. Phil swept the floor clean.
       [ [ Phil ACT<SWEEP> floor ] CAUSE
       [ BECOME [ floor <CLEAN> ] ] ]
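Template Augmentation as illustrated in (4) can be sketched as an operation on nested structures. The Python below is a minimal illustration, not Rappaport Hovav and Levin's formalization: the tuple encoding and the function names `activity`, `state`, `augment`, and `show` are assumptions introduced here.

```python
# Sketch of Template Augmentation over event templates encoded as nested tuples.
# The encoding and all function names are illustrative assumptions.

def activity(actor, manner, patient=None):
    """Build an activity template: [ x ACT<MANNER> (y) ]."""
    return ("ACT", manner, actor, patient)


def state(entity, name):
    """Build a state template: [ x <STATE> ]."""
    return ("STATE", name, entity)


def augment(act, result):
    """Embed an activity as the causing subevent of a change of state."""
    return ("CAUSE", act, ("BECOME", result))


def show(template):
    """Render a template in the bracket notation used in (3) and (4)."""
    tag = template[0]
    if tag == "ACT":
        _, manner, actor, patient = template
        tail = f" {patient}" if patient else ""
        return f"[ {actor} ACT<{manner}>{tail} ]"
    if tag == "STATE":
        _, name, entity = template
        return f"[ {entity} <{name}> ]"
    if tag == "BECOME":
        return f"[ BECOME {show(template[1])} ]"
    if tag == "CAUSE":
        return f"[ {show(template[1])} CAUSE {show(template[2])} ]"
    raise ValueError(f"unknown template tag: {tag}")


if __name__ == "__main__":
    sweep = activity("Phil", "SWEEP", "floor")
    print(show(sweep))                                    # (4a)
    print(show(augment(sweep, state("floor", "CLEAN"))))  # (4b)
```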
Following this long tradition of research, I propose a
syntactically-based event representation specifically de-
signed to handle alternations in argument structure. Fur-
thermore, I will show how this theoretical analysis can
be implemented in a feature-driven computational framework.
The product is an agenda-driven, chart-based
parser for the Minimalist Program.
3 A Decompositional Framework
A primary advantage of decompositional (non-lexicalist)
theories of lexical semantics is the ability to transpar-
ently relate morphologically related words—explaining,
for example, categorial divergences in terms of differ-
ences in event structure. Consider the adjective flat and
the deadjectival verb flatten:
(5) a. The tire is flat.
    b. The tire flattened.
Clearly, (5a) is a stative sentence denoting a static situ-
ation, while (5b) denotes an inchoative event, i.e., a tran-
sition from “tire is not flat” to “tire is flat”. One might
assign the above two sentences the following logical forms:
(6) a. BE(tire, [state flat])
    b. ARGδ(tire, e) ∧ BECOME(BE([state flat]), e)
In Davidsonian terms, dynamic events introduce event
arguments, whereas static situations do not. In (6b), the
semantic argument that undergoes the change of state
(ARGδ) is introduced externally via the event argument.
Considering that the only difference between flat.ADJ
and flatten.V is the suffix -en, it must be the source of
inchoativity and contribute the change of state reading
that distinguishes the verb from the adjective. Here, we
have evidence that derivational affixes affect the seman-
tic representation of lexical items, that is, fragments of
event structure are directly associated with derivational
morphemes. We have the following situation:
(7) ⟦flat⟧ = [state flat]
    ⟦is flat⟧ = λx.BE(x, [state flat])
    ⟦-en⟧ = λs.λx.ARGδ(x, e) ∧ BECOME(BE(s), e)
    ⟦flat-en⟧ = λx.ARGδ(x, e) ∧ BECOME(BE([state flat]), e)
In this case, the complete event structure of a word
can be compositionally derived from its component morphemes.
This framework, where the “semantic load” is
spread more evenly throughout the lexicon to lexical categories
not typically thought to bear semantic content, is
essentially the model advocated by Pustejovsky (1991a),
among many others. Note that such an approach is no
longer lexicalist: each lexical item does not fully encode
its associated syntactic and semantic structures. Rather,
meanings are composed from component morphemes.
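The composition of meanings from morphemes can be made concrete with ordinary function application. The sketch below is illustrative only: logical forms are encoded as plain strings, and the denotation assigned to the copula on its own (`is_`) is an assumption, since the text above gives a denotation only for the whole phrase "is flat".

```python
# Sketch: morpheme denotations as Python functions that build logical-form strings.
# The string encoding and the names below are illustrative assumptions.

FLAT = "[state flat]"  # denotation of the root "flat"


def is_(s):
    """Assumed copula denotation: λs.λx.BE(x, s)."""
    return lambda x: f"BE({x}, {s})"


def en(s):
    """Suffix -en: λs.λx.ARGδ(x, e) ∧ BECOME(BE(s), e)."""
    return lambda x: f"ARGδ({x}, e) ∧ BECOME(BE({s}), e)"


# Applying a morpheme to the root yields the denotation of the derived word.
is_flat = is_(FLAT)
flatten = en(FLAT)

if __name__ == "__main__":
    print(is_flat("tire"))   # logical form of "The tire is flat."
    print(flatten("tire"))   # logical form of "The tire flattened."
```

Nothing about `flatten` is stored as a separate lexical entry here; it exists only as the result of applying `en` to the root.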
In addition to -en, other productive derivational suf-
fixes in English such as -er, -ize, -ion, just to name a
few, can be analyzed in a similar way. In fact, we may
view morphological rules for composing morphemes into
larger phonological units the same way we view syntac-
tic rules for combining constituents into higher-level pro-
jections, i.e., why distinguish VP → V + NP from V
→ Adj + -en? With this arbitrary distinction erased, we
are left with a unified morpho-syntactic framework for
integrating levels of grammar previously thought to be
separate; this is indeed one of the major goals of Distributed
Morphology. This theoretical framework translates
into a computational model better suited for analyzing
the semantics of natural languages, particularly those
rich in morphology.
A conclusion that follows naturally from this analysis
is that fragments of event structure are directly encoded
in the syntactic structure. We could, in fact, further pos-
tulate that all event structure is encoded syntactically, i.e.,
that lexical semantic representation is isomorphic to syntactic
structure. Sometimes, the functional elements that encode
event structure are overtly realized, e.g., -en. Often, however,
the functional elements responsible for licensing event
interpretations are not phonologically realized.
These observations and this line of reasoning have not
escaped the attention of theoretical linguists: Hale and
Keyser (1993) propose that argument structure is, in fact,
encoded syntactically. They describe a cascading verb
phrase analysis with multiple phonetically empty verbal
projections corresponding to concepts such as inchoativity
and agentivity. The present framework builds on the
work of Hale and Keyser, but in addition to advancing a
more refined theory of verbal argument structure, I also
describe a computational implementation.
4 Event Types
Although the study of event types can be traced back
to Aristotle, it was not until the twentieth century that
philosophers and linguists developed classifications of
events that capture logical entailments and the co-
occurrence restrictions between verbs and other syntactic
elements such as tenses and adverbials. Vendler’s (1957)
four-way classification of events into states, activities, ac-
complishments, and achievements serves as a good start-
ing point for a computational ontology of event types.
Examples of the four event types are given below:
(8) States             Activities
    know               run
    believe            walk

    Accomplishments    Achievements
    paint a picture    recognize
    make a chair       find
Under Vendler’s classification, activities and states
both depict situations that are inherently temporally un-
bounded (atelic); states denote static situations, whereas
activities denote on-going dynamic situations. Accom-
plishments and achievements both express a change of
state, and hence are temporally bounded (telic); achieve-
ments are punctual, whereas accomplishments extend
over a period of time. Tenny (1987) observes that ac-
complishments differ from achievements only in terms of
event duration, which is often a question of granularity.
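The distinctions drawn in this paragraph amount to a small decision procedure over three binary properties. The Python below is a sketch under that reading; the function name `vendler_class` and the boolean encoding are assumptions, and the feature values assigned to the predicates of (8) are this sketch's reading of the text rather than annotated data.

```python
# Sketch: Vendler's four event classes as a decision over three properties.
# The function name and the boolean encoding are illustrative assumptions.

def vendler_class(dynamic, telic, punctual=False):
    """Classify a situation as state, activity, accomplishment, or achievement."""
    if not dynamic:
        return "state"           # static situations; telicity does not apply
    if not telic:
        return "activity"        # dynamic but temporally unbounded
    # Telic events differ only in duration.
    return "achievement" if punctual else "accomplishment"


# Predicates from (8), with feature values assumed for illustration:
# (dynamic, telic, punctual)
EXAMPLES = {
    "know": (False, False, False),
    "run": (True, False, False),
    "paint a picture": (True, True, False),
    "recognize": (True, True, True),
}

if __name__ == "__main__":
    for predicate, features in EXAMPLES.items():
        print(predicate, "->", vendler_class(*features))
```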
From typological studies, it appears that states, changes
of state, and activities form the most basic ontology of
event types. They correspond to the primitives BE, BECOME,
and DO proposed by a variety of linguists; let us
adopt these conceptual primitives as the basic vocabulary
of our lexical semantic representation.
Following the non-lexicalist tradition, these primitives
are argued to occupy functional projections in the syntac-
tic structure, as so-called light verbs. Here, I adopt the
model proposed by Marantz (1997) and decompose lexi-
cal verbs into verbalizing heads and verbal roots. Verbal-
izing heads introduce relevant eventive interpretations in
the syntax, and correspond to (assumed) universal primi-
tives of the human cognitive system. On the other hand,
verbal roots represent abstract (categoryless) concepts
and basically correspond to open-class items drawn from
encyclopedic knowledge. I assume an inventory of three
verbalizing heads, each corresponding to an aforemen-
tioned primitive:
(9) vDO [+dynamic, −inchoative] = DO
    vδ  [+dynamic, +inchoative] = BECOME
    vBE [−dynamic]              = BE
The light verb vDO licenses an atelic non-inchoative
event, and is compatible with verbal roots expressing activity.
It projects a functional head, voice (Kratzer, 1994),
whose specifier is the external argument.
(10) John ran.
     voiceP
       DP  John
       voice
       vDOP
         vDO
         √run
     ARGext(John, e) ∧ DO([activity run], e)
The entire voiceP is further embedded under a tense
projection (not shown here), and the verbal complex un-
dergoes head movement and left adjoins to any overt
tense markings. Similarly, the external argument raises to
[Spec, TP]. This is in accordance with modern linguistic
theory, more specifically, the VP-internal subject hypothesis.
The verbal root can itself idiosyncratically license a
DP to give rise to a transitive sentence (subject, naturally,
to selectional restrictions). These constructions
correspond to what Levin calls “non-core transitive sen-
tences” (1999):
(11) John ran the marathon.
     voiceP
       DP  John
       voice
       vDOP
         vDO
         √P
           run
           DP  the marathon
     ARGext(John, e) ∧ DO([activity run(marathon)], e)
Similarly, vBE licenses static situations, and is compatible
with verbal roots expressing state:
(12) Mary is tall.
     vBEP
       DP  Mary
       vBE  is
       √tall
     BE(Mary, [state tall])
The light verb vδ licenses telic inchoative events (i.e.,
changes of state), which correspond to the BECOME
primitive:
(13) The window broke.
     vδP
       DP  window
       vδ
       vBE
       √break
     ARGδ(window, e) ∧ BECOME(BE([state break]), e)
The structure denotes an event where an entity under-
goes a change of state to the end state specified by the
root. vδP can be optionally embedded as the complement
of a vDO, accounting for the causative/inchoative alterna-
tion. Cyclic head movement (incorporation) of the verbal
roots into the verbalizing heads up to the highest verbal
projection accounts for the surface form of the sentence.
(14) John broke the window.
     voiceP
       DP  John
       voice
       vDOP
         vDO
         vδP
           DP  window
           vδ
           vBE
           √break
     CAUSE(e1, e2) ∧ ARGext(John, e1) ∧
     DO([activity undef], e1) ∧ ARGδ(window, e2) ∧
     BECOME(BE([state break]), e2)
Note that in the causative form, vDO is unmodified by
a verbal root—the manner of activity is left unspecified,
i.e., “John did something that caused the window to un-
dergo the change of state break.”
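The way the three verbalizing heads contribute fragments of the logical form, and the way embedding vδP under vDO yields the causative, can be sketched with a few string-building functions. This is an illustration under stated assumptions: the function names are hypothetical, and `voice_do` folds the voice head and vDO into a single step for brevity, which the analysis above keeps as separate projections.

```python
# Sketch: logical forms assembled from the contributions of the light verbs.
# Function names are illustrative; voice and vDO are merged into one step here.

def v_be(root):
    """vBE: a state named by the root."""
    return f"BE([state {root}])"


def v_delta(root, theme, event="e"):
    """vδ: the theme undergoes a change to the state named by the root."""
    return f"ARGδ({theme}, {event}) ∧ BECOME({v_be(root)}, {event})"


def voice_do(agent, root=None, event="e"):
    """voice + vDO: an activity with an external argument.

    With no root, the manner of the activity is left as 'undef'.
    """
    manner = root if root is not None else "undef"
    return f"ARGext({agent}, {event}) ∧ DO([activity {manner}], {event})"


def causative(agent, root, theme):
    """Embed the change of state (e2) under an unspecified activity (e1)."""
    cause = voice_do(agent, event="e1")
    result = v_delta(root, theme, event="e2")
    return f"CAUSE(e1, e2) ∧ {cause} ∧ {result}"


if __name__ == "__main__":
    print(voice_do("John", "run"))               # cf. (10)
    print(v_delta("break", "window"))            # cf. (13)
    print(causative("John", "break", "window"))  # cf. (14)
```

The inchoative and causative readings share the same `v_delta` fragment, which is the sense in which the alternation needs no second lexical entry for break.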
Given this framework, deadjectival verbs such as flat-
ten can be directly derived in the syntax:
(15) The tire flattened.
     vδP
       DP  tire
       vδ  -en
       vBEP
         vBE
         √flat
     ARGδ(tire, e) ∧ BECOME(BE([state flat]), e)
In (Lin, 2004), I present evidence from Mandarin Chi-
nese that this analysis is on the right track. The rest of
this paper, however, will be concerned with the computa-
tional implementation of my theoretical framework.
5 Minimalist Derivations
My theory of verbal argument structure can be imple-
mented in a unified morpho-syntactic parsing model
that interleaves syntactic and semantic parsing. The
system is in the form of an agenda-driven chart-based
parser whose foundation is similar to previous formalizations
of Chomsky’s Minimalist Program (Stabler, 1997;
Harkema, 2000; Niyogi, 2001).
Lexical entries in the system are minimally specified,
each consisting of a phonetic form, a list of relevant fea-
tures, and semantics in the form of a λ expression.
The basic structure building operation, MERGE, takes
two items and creates a larger item. In the process,
compatible features are canceled and one of the items
projects. Simultaneously, the λ expression associated
with the licensor is applied to the λ expression associated
with the licensee (in theoretical linguistic terms, Spell-Out).
The most basic feature is the =x licensor feature,
which cancels out a corresponding x licensee feature and
projects. A simple example is a determiner selecting a
noun to form a determiner phrase (akin to the context-free
rule DP → det noun). This is shown below (underline indicates
canceled features, and the node label < indicates
that the left item projects):
(16)  <
        the       =n d -k
                  --
        shelf     n
                  -
The features >x and <x trigger head movement (in-
corporation), i.e., the phonetic content of the licensee is
affixed to the left or right of the licensor’s phonetic con-
tent, respectively. These licensor features also cancel cor-
responding x licensee features:
(17)  <
        book -s   >n d -k
                  --
        book      n
                  -
      <
        de- bone  <n V
                  --
        bone      n
                  -
Finally, feature checking is implemented by +x/-x fea-
tures. The +x denotes a need to discharge features, and
the -x denotes a need for features. A simple example of
this is the case assignment involved in building a preposi-
tional phrase, i.e., prepositions must assign case, and DPs
must receive case.
(18)  <
        on        =d +k ploc
                  -- --
        <
          the     =n d -k
                  -- - --
          shelf   n
                  -
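The feature cancellation in (16) through (18) can be sketched as a small function. This is a deliberately reduced illustration, not the parser discussed in this paper or Niyogi's algorithm: the `Item` class and the `merge` function are assumptions introduced here, the λ expressions are omitted, and no chart or agenda is modeled. Since movement is not modeled either, any licensee feature left unchecked is treated as an error.

```python
# Sketch of MERGE with =x, >x, <x selection and +x/-x feature checking.
# Item and merge are illustrative assumptions; semantics and the chart are omitted.
from dataclasses import dataclass
from typing import List


@dataclass
class Item:
    phon: str            # phonetic form
    features: List[str]  # uncanceled features, leftmost first


def merge(licensor: Item, licensee: Item) -> Item:
    """Combine two items; the licensor projects its remaining features."""
    head, rest = licensor.features[0], licensor.features[1:]
    kind, category = head[0], head[1:]
    if kind not in ("=", ">", "<"):
        raise ValueError(f"{licensor.phon!r} has no licensor feature to cancel")
    if not licensee.features or licensee.features[0] != category:
        raise ValueError(f"{licensee.phon!r} does not offer feature {category!r}")
    remaining = licensee.features[1:]

    # Feature checking: +x on the licensor discharges -x on the licensee.
    if rest and remaining and rest[0].startswith("+") \
            and remaining[0] == "-" + rest[0][1:]:
        rest, remaining = rest[1:], remaining[1:]
    if remaining:
        raise ValueError(f"unchecked features on {licensee.phon!r}: {remaining}")

    if kind == ">":   # head movement: licensee affixed to the left
        phon = f"{licensee.phon} {licensor.phon}"
    else:             # '=' selection, or '<': licensee follows the licensor
        phon = f"{licensor.phon} {licensee.phon}"
    return Item(phon, rest)


if __name__ == "__main__":
    dp = merge(Item("the", ["=n", "d", "-k"]), Item("shelf", ["n"]))  # (16)
    print(dp)
    print(merge(Item("on", ["=d", "+k", "ploc"]), dp))                # (18)
    print(merge(Item("-s", [">n", "d", "-k"]), Item("book", ["n"])))  # (17)
    print(merge(Item("de-", ["<n", "V"]), Item("bone", ["n"])))       # (17)
```

Keeping the features in an ordered list means an item can only be selected once its own licensor features are used up, which is what forces the DP in (18) to be built before the preposition applies.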
Niyogi (2001) has developed an agenda-driven chart
parser for the feature-driven formalism described above;
please refer to his paper for a description of the parsing
algorithm. I have adapted it for my needs and developed
grammar fragments that reflect my non-lexicalist seman-
tic framework. As an example, a simplified derivation of
the sentence “The tire flattened.” is shown in Figure 1.
      <
        //           >s vbe    λx.BE(x)
                     --
        /flat/       s         [state flat]
                     -
      <
        /flat -en/   >be =d    λx.λy.ARGδ(y, e) ∧ BECOME(x, e)
                     ---
        <
                     >s vbe    BE([state flat])
                     -- ---
                     s
                     -
      >
        /the tire/   d         tire
                     -
        <
          /flat -en/ >be =d    λy.ARGδ(y, e) ∧ BECOME(BE([state flat]), e)
                     --- --
          <
                     >s vbe
                     -- ---
                     s
                     -
      ARGδ(tire, e) ∧ BECOME(BE([state flat]), e)
Figure 1: Simplified derivation for the sentence “The tire
flattened.”
The currently implemented system is still at the “toy
parser” stage. Although the effectiveness and coverage
of my parser remain to be seen, similar approaches have
been successful at capturing complex linguistic phenom-
ena. With a minimal set of features and a small num-
ber of lexical entries, Niyogi (2001) has successfully
modeled many of the argument alternations described by
Levin (1993) using a Hale and Keyser (1993) style anal-
ysis. I believe that with a suitable lexicon (either hand
crafted or automatically induced), my framework can be
elaborated into a system whose performance is compara-
ble to that of current statistical parsers, but with the added
advantage of simultaneously providing a richer lexical semantic
representation of the input sentence than flat predicate-argument
structures based on semantic roles.
6 Conclusion
A combination of factors in the natural development of
computational linguistics as a field has conspired to nar-
row the diversity of techniques being explored by re-
searchers. While empirical and quantitative research is
the mark of a mature field, such an approach is not with-
out its adverse side-effects. Both syntactic and semantic
parsing technology faces a classic chicken-and-egg prob-