Approaching Textual Entailment with LFG and FrameNet Frames

Aljoscha Burchardt
Dept. of Computational Linguistics
Saarland University
Saarbrücken, Germany
albu@coli.uni-sb.de

Anette Frank
Dept. of Computational Linguistics
Saarland University &
Language Technology Lab, DFKI GmbH
Saarbrücken, Germany
frank@coli.uni-sb.de
Abstract

We present a baseline system for modeling textual entailment that combines deep syntactic analysis with structured lexical meaning descriptions in the FrameNet paradigm. Textual entailment is approximated by degrees of structural and semantic overlap of text and hypothesis, which we measure in a match graph. The encoded measures of similarity are processed in a machine learning setting.¹

¹ This work has been carried out in the project SALSA, funded by the German Science Foundation DFG, title PI 154/9-2. We thank Katrin Erk and Sebastian Pado for providing and supporting the Fred and Rosy systems and Alexander Koller for his contributions and for implementing the FEFViewer.
1 Introduction
In this paper, we present a baseline system for approaching the textual entailment task as presented in the PASCAL RTE Challenge. This task involves complex examples from unrestricted domains, a challenge for deep semantics-based processing. Similar to previous work (Dagan et al., 2005), we explore semantically informed approximations of textual entailment. As shown by Bos and Markert (2005), fine-grained semantic analysis and reasoning models can yield high precision, but are severely restricted in recall. The architecture we present is open for extension to deeper methods.
We assess the utility of approximating entailment in terms of structural and semantic overlap of text and hypothesis, combining wide-coverage LFG parsing with frame semantics to project a lexical semantic representation with semantic roles. We compute various measures of overlap to train a machine learning model for entailment.
In Section 2, we describe the linguistic resources and our system architecture. In Section 3, we present our approach for modeling the similarity of text and hypothesis in a match graph. In Section 4, we report on our machine learning experiments and the results in the RTE task, and provide some error analysis, including a discussion of typical examples that show the strengths and weaknesses of our approach. We conclude with a discussion of perspectives.
2 Base Components and Architecture
2.1 Basic Analysis Components
Our primary linguistic analysis components are the probabilistic LFG grammar for English developed at PARC (Riezler et al., 2002) and a combination of systems for frame semantic annotation: two probabilistic systems for frame and role annotation, Fred and Rosy (Erk and Pado, 2006), and a rule-based system for frame assignment, called Detour (to FrameNet) (Burchardt et al., 2005), which uses WordNet to address coverage problems in the current FrameNet data. In addition, we use the word sense disambiguation system of Banerjee and Pedersen (2003) and mappings from WordNet to SUMO (Niles and Pease, 2003) to assign WordNet synsets and SUMO ontological classes to main predicates.
2.2 Frame Semantics
Frame Semantics (Baker et al., 1998) models the lexical meaning of predicates and their argument structure in terms of frames and roles. A frame describes a conceptual structure or prototypical situation together with a set of semantic roles that identify participants involved in the situation. FrameNet currently contains more than 600 frames with almost 9000 lexicalizations (word-frame pairs). Figure 1 displays examples involving the frame COMMERCE_GOODS-TRANSFER.

Figure 1: Frame COMMERCE_GOODS-TRANSFER.
  Role    Example
  SELLER  BMW bought Rover from British Aerospace.
  BUYER   Rover was bought by BMW, which financed [...] the new Range Rover.
  GOODS   BMW, which acquired Rover in 1994, is now dismantling the company.
  MONEY   BMW's purchase of Rover for $1.2 billion was a good move.
Frame-semantic analysis is especially interesting for the task of recognizing textual entailment if we aim at robust, yet high-quality measures for semantic overlap. Frames provide normalisations over diverse surface realizations (lexicalisation, verb vs. nominalisation, etc.), including variations in argument structure realisation (cf. Fig. 1). Thus, we can determine semantic similarity based on lexical semantic meaning, combined with measuring similarity of argument structure at a high level of abstraction. Moreover, the coarse-grained frame structures make it possible to assess the core meaning of a sentence ("what is it about?") in a shallow analysis, separated from the pitfalls of deep, structural analysis of scope, modality, etc., which must be treated by other components, or can be selectively introduced, as will be illustrated for the case of modality.
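To make the normalisation effect concrete, the following toy sketch (in Python, using an ad-hoc dictionary representation rather than the system's actual data structures) shows how the surface variants of Figure 1 collapse onto a single frame with shared roles.

```python
# Toy illustration (not the system's data format): the surface variants of
# Figure 1 all evoke the same frame, so overlap can be measured on frames
# and roles instead of on surface strings.
examples = [
    {"text": "BMW bought Rover from British Aerospace.",
     "frame": "Commerce_goods-transfer",
     "roles": {"Buyer": "BMW", "Goods": "Rover", "Seller": "British Aerospace"}},
    {"text": "Rover was bought by BMW.",
     "frame": "Commerce_goods-transfer",
     "roles": {"Buyer": "BMW", "Goods": "Rover"}},
    {"text": "BMW's purchase of Rover for $1.2 billion was a good move.",
     "frame": "Commerce_goods-transfer",
     "roles": {"Buyer": "BMW", "Goods": "Rover", "Money": "$1.2 billion"}},
]

# One frame, and shared BUYER and GOODS fillers across active, passive and
# nominalised realisations, which is what frame-level matching exploits.
assert len({e["frame"] for e in examples}) == 1
shared_roles = set.intersection(*(set(e["roles"]) for e in examples))
print(shared_roles)   # {'Buyer', 'Goods'}
```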
2.3 Enriched Frame Semantic Representations
As displayed in Figure 2, LFG-based syntactic analysis is integrated with frames and roles assigned by Fred, Detour and Rosy, as well as WordNet synsets and SUMO concepts, to yield an f-structure with frame-semantic projection (Frank and Erk, 2004), including conceptual class assignments.²

² The integration and semantics projection is defined using the XLE rewrite system of Crouch (2005).

Figure 2: Architecture of linguistic analysis. LFG f-structures, the frames and roles assigned by Fred/Detour/Rosy, and WordNet/SUMO information are combined into an f-structure with semantics projection; rule-based frame assignment and normalisations (NEs, extra-thematic roles, modality, co-reference) are applied, and the result is exported in the Frame Exchange Format (FEF).

Additional rules introduce frames and concepts based on named entities recognized in LFG parsing (companies, political offices, etc.), as well as extra-thematic semantic roles (TIME, LOCATION, REASON, etc.) for corresponding adjunct types in f-structure. As a heuristic device to establish co-referential links, we collect possible antecedent referents for pronominals. Finally, we identify various types of modal contexts, introduced by negation, modals, conditionals or future tense, which allows us to detect text-hypothesis pairs that preclude entailment.
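A rough sketch of the kind of post-processing rules described above is given below; the adjunct-to-role mapping and the modality trigger lexicon are illustrative stand-ins, not the actual SALSA rule set.

```python
# Illustrative enrichment rules: map LFG adjunct types to extra-thematic
# roles and flag modal contexts; the two mappings below are toy stand-ins.
ADJUNCT_ROLE = {"temporal": "TIME", "locative": "LOCATION", "causal": "REASON"}
MODAL_TRIGGERS = {"not": "negation", "might": "diamond", "must": "box",
                  "if": "conditional", "would": "subjunctive"}

def enrich(facts):
    """facts: list of (relation, (arg0, arg1)) tuples over f-structure nodes."""
    extra = []
    for rel, (a0, a1) in facts:
        if rel == "adjunct_type" and a1 in ADJUNCT_ROLE:
            extra.append(("extra_role", (a0, ADJUNCT_ROLE[a1])))
        if rel == "pred" and a1 in MODAL_TRIGGERS:
            extra.append(("modal_context", (a0, MODAL_TRIGGERS[a1])))
    return facts + extra

facts = [("pred", ("f(0)", "must")), ("adjunct_type", ("f(7)", "locative"))]
print(enrich(facts)[-2:])
# [('modal_context', ('f(0)', 'box')), ('extra_role', ('f(7)', 'LOCATION'))]
```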
The resulting structures are converted to a Frame Exchange Format (FEF), a flat predicate representation comprising the syntactic and semantic analysis. Table 1 displays the FEF for (1). The parts relating to the predicate manufacturer show information from different levels: the f-structure node f(5), its semantics projection to node s(61), which is labeled with the frame MANUFACTURING (with roles PRODUCT and MANUFACTURER), plus a projection to ontological information (s(71)), in this case a WordNet synset and a SUMO super-class. A FEFViewer (Figure 3) displays the major elements of the graphs.

(1) Mercedes-Benz is a German car manufacturer.

Figure 3: FEFView for example (1).
Table 1: FEF for example (1).

Normalized f-structure with syntax-semantics projections:
  xcomp(f(0),f(5)).  tense(f(0),pres).  stmt_type(f(0),declarative).
  pred(f(0),be).  mood(f(0),indicative).  dsubj(f(0),f(1)).
  pred(f(1),'Mercedes-Benz').  num(f(1),sg).  subj(f(5),f(1)).
  pred(f(5),manufacturer).  num(f(5),sg).  mod(f(5),f(11)).
  det_type(f(5),indef).  adjunct(f(5),f(7)).  pred(f(7),'German').
  atype(f(7),attributive).  adjunct_type(f(7),nominal).  adegree(f(7),positive).
  pred(f(11),car).  num(f(11),sg).
  sslink(f(1),s(67)).  sslink(f(5),s(61)).  sslink(f(7),s(66)).  sslink(f(11),s(60)).

Frames, roles and ontological info (WordNet/SUMO):
  frame(s(60),'Vehicle').  vehicle(s(60),s(60)).  descriptor(s(60),s(66)).
  rel(s(66),'German').  frame(s(61),'Manufacturing').  product(s(61),s(60)).
  manufacturer(s(61),s(67)).  rel(s(67),'Mercedes-Benz').
  ont(s(60),s(72)).  ont(s(66),s(73)).  ont(s(61),s(71)).
  wn_syn(s(71),'manufacturer#1').  sumo_sub(s(71),'Corporation').  milo_sub(s(71),'Corporation').
  wn_syn(s(72),'car#n#1').  sumo_sub(s(72),'Transp~Device').  milo_sub(s(72),'Transp~Device').
  wn_syn(s(73),'german#a#1').  sumo_inst(s(73),'Nation').  milo_syn(s(73),'Germany').
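The flat FEF facts are straightforward to consume programmatically. The following minimal Python sketch (an illustration, not the actual FEF reader) parses a handful of the facts from Table 1 and follows the semantics projection of the predicate manufacturer.

```python
import re

# A few of the flat FEF facts from Table 1 (illustrative excerpt).
FEF = """pred(f(5),manufacturer).
sslink(f(5),s(61)).
frame(s(61),'Manufacturing').
product(s(61),s(60)).
manufacturer(s(61),s(67)).
ont(s(61),s(71)).
wn_syn(s(71),'manufacturer#1').
sumo_sub(s(71),'Corporation')."""

facts = [(m.group(1), m.group(2).split(","))
         for m in re.finditer(r"(\w+)\((.*)\)\.", FEF)]

def values(rel, arg0):
    """Second arguments of all facts rel(arg0, X)."""
    return [args[1] for r, args in facts if r == rel and args[0] == arg0]

sem_node = values("sslink", "f(5)")[0]        # semantics projection: s(61)
print(values("frame", sem_node))              # ["'Manufacturing'"]
ont_node = values("ont", sem_node)[0]         # ontological node: s(71)
print(values("wn_syn", ont_node), values("sumo_sub", ont_node))
```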
2.4 Overall RTE Architecture
Our RTE system architecture comprises the following steps: We compute LFG f-structures with extended frame semantics projections for text and hypothesis pairs. We identify their structural and semantic similarities and represent them in a match graph. From text, hypothesis, and match graph we extract features that characterize their syntactic and semantic properties, as well as various proportional measures that can be relevant for establishing or rejecting entailment. These features are fed into a machine learning system for training on the development set and testing on the test set.
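The control flow of this pipeline can be summarised as follows; the component functions are trivial placeholders for the real LFG, Fred/Detour/Rosy and Weka components, so only the overall data flow is meant to be accurate.

```python
# Skeleton of the RTE pipeline in Section 2.4; the analysis step is a
# placeholder (word sets instead of f-structures with frame projections).
def analyse(sentence):
    tokens = set(sentence.lower().rstrip(".").split())
    return {"preds": tokens, "frames": set()}   # real system: LFG + frames

def build_match_graph(text, hyp):
    return {"preds": text["preds"] & hyp["preds"],
            "frames": text["frames"] & hyp["frames"]}

def extract_features(text, hyp, match):
    return {
        "preds_m_relto_h": len(match["preds"]) / max(len(hyp["preds"]), 1),
        "frames_m_relto_h": len(match["frames"]) / max(len(hyp["frames"]), 1),
    }

t = analyse("Mercedes-Benz is a German car manufacturer.")
h = analyse("Mercedes-Benz manufactures cars.")
features = extract_features(t, h, build_match_graph(t, h))
print(features)   # in the real system these values are handed to the learner
```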
3 Computing Semantic Overlap
We approximate textual entailment by statistical prediction on the basis of measurements of structural and semantic overlap between text and hypothesis.
3.1 Matching Text and Hypothesis
In the graph matching process we compute the overlap of the f-structures with semantics projection for text and hypothesis, which we record in a match graph. The latter consists of matched predicates and features from both input graphs. We distinguish various (sub)types of matches in order to selectively extract features for the learning phase.
Node (predicate) matching. Node matching rules match nodes for identical syntactic predicates and frames. We also allow matches for predicates that are semantically related on the basis of WordNet. To prevent overgeneration, WordNet-based matching is restricted to predicates that are related by an edge in the match graph. Further, the respective synsets have to be closely related in terms of WordNet path distance (< 3). Using (heuristically defined) antecedent sets for pronouns, we allow special types of predicate matches for pronouns and non-pronominal predicates in text and hypothesis.

In addition, we allow matches between frame nodes that are known to be related by FrameNet frame relations, such as inheritance, or those that are considered related by the Detour system, measuring frame distance on the basis of WordNet.
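A schematic rendering of these node-matching rules is given below, with a toy stand-in for the WordNet distance computation (the real system additionally requires WordNet-matched predicates to be connected by an edge in the match graph).

```python
# Schematic node matching: identical predicates, identical frames, related
# frames, or WordNet-related predicates within a small path distance.
# The distance table and frame-relation set are toy stand-ins.
WORDNET_DISTANCE = {frozenset(("die", "pass away")): 1}
RELATED_FRAMES = {frozenset(("Commerce_buy", "Commerce_goods-transfer"))}

def wn_distance(a, b):
    return WORDNET_DISTANCE.get(frozenset((a, b)), 99)

def node_match(t_node, h_node, max_dist=3):
    """Return the type of match between two nodes, or None."""
    if t_node["pred"] == h_node["pred"]:
        return "identical_pred"
    if t_node["frame"] and t_node["frame"] == h_node["frame"]:
        return "identical_frame"
    if t_node["frame"] and frozenset((t_node["frame"], h_node["frame"])) in RELATED_FRAMES:
        return "related_frame"
    if wn_distance(t_node["pred"], h_node["pred"]) < max_dist:
        return "wordnet_related"
    return None

t = {"pred": "pass away", "frame": None}       # cf. ex. 103 in Table 5
h = {"pred": "die", "frame": "Death"}
print(node_match(t, h))                        # 'wordnet_related'
```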
Feature (edge) matching. Feature matches are restricted to features that connect matching nodes, or those that take identical atomic values. The linguistic nature of these edges ranges from morpho-syntactic features in LFG f-structure, such as NUM and PERS, through grammatical functions ((deep) subject, (deep) object, adjunct, oblique, complement, etc.), to frame semantic roles in the semantic projection.
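The following sketch illustrates this restriction under the simplifying assumption that edges are plain labeled triples and that node correspondences are already known; it is not the system's actual matching code.

```python
# Sketch of edge matching: a hypothesis edge enters the match graph only if
# its source node has a matched counterpart in the text and the same edge
# (with a matched target node, or the same atomic value) exists there too.
def matching_edges(text_edges, hyp_edges, node_map):
    """node_map maps hypothesis node ids to their matched text node ids."""
    text_set = set(text_edges)
    matched = []
    for label, src, tgt in hyp_edges:
        if src not in node_map:
            continue
        mapped_tgt = node_map.get(tgt, tgt)   # node target or atomic value
        if (label, node_map[src], mapped_tgt) in text_set:
            matched.append((label, src, tgt))
    return matched

text_edges = [("dsubj", "tf0", "tf1"), ("num", "tf1", "sg")]
hyp_edges = [("dsubj", "hf0", "hf1"), ("num", "hf1", "sg"), ("tense", "hf0", "past")]
node_map = {"hf0": "tf0", "hf1": "tf1"}
print(matching_edges(text_edges, hyp_edges, node_map))
# [('dsubj', 'hf0', 'hf1'), ('num', 'hf1', 'sg')]
```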
Modality contexts. Besides finding matches for similar nodes and edges, some rules are intended to detect semantic dissimilarity in terms of incompatible modality types. We normalise the different modal contexts to five basic types: conditional, subjunctive, diamond, box and negation. An example of incompatible modalities is the pair "A pet must have rabies protection confirmed by a blood test" (text) and "A case of rabies was confirmed" (hypothesis).
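A minimal sketch of this check, assuming a toy trigger lexicon and a deliberately strict compatibility criterion (any difference between the normalised context sets counts as incompatible), is shown below.

```python
# Sketch of the modality check: both sides are normalised to the five basic
# modal types, and differing context sets block the corresponding matches.
# The trigger lexicon and the strict inequality test are illustrative.
MODAL_TYPES = {"must": "box", "may": "diamond", "not": "negation",
               "if": "conditional", "would": "subjunctive"}

def modal_contexts(tokens):
    return {MODAL_TYPES[t] for t in tokens if t in MODAL_TYPES}

def incompatible(text_tokens, hyp_tokens):
    return modal_contexts(text_tokens) != modal_contexts(hyp_tokens)

text = "a pet must have rabies protection confirmed by a blood test".split()
hyp = "a case of rabies was confirmed".split()
print(incompatible(text, hyp))   # True: {'box'} vs. set() blocks the match
```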
3.2 Feature Extraction
The features we extract from the text, hypothesis and match graphs to train a machine learning model for textual entailment can be classified according to (i) their nature in terms of level of representation (lexical, syntactic, semantic), (ii) their degree of connectedness in matching, (iii) their source (text, hypothesis or match graph), and (iv) their proportional relation (hypothesis/text, match/hypothesis ratio).
Lexical features count the number of lexical items; syntactic features record the number of LFG predicate matches, including pronominal and co-referential matches in the match graph, and syntactic feature matches. Semantic features distinguish between those frames and roles that were assigned by the Fred, Detour and Rosy systems and those that were successfully interfaced with the LFG analyses.³ We further distinguish semantic node matches of different types as discussed above (e.g. identical or semantically related frames, modal properties). Finally, we compute the number and size of connected clusters in the match graph, as well as their size in relation to that of the hypothesis graph.

³ A number of frames and roles could not be ported from Fred and Detour onto the f-structure due to mismatches in lemmatisation/tokenisation and fragmentary or failed parses.
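The connected-cluster features mentioned at the end of this section can be computed with a plain graph traversal; the following is a minimal sketch with toy numbers, not the system's implementation.

```python
# Sketch of the cluster features: connected components of the match graph
# and their sizes relative to the hypothesis graph (plain BFS, toy numbers).
from collections import defaultdict, deque

def connected_clusters(nodes, edges):
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, clusters = set(), []
    for start in nodes:
        if start in seen:
            continue
        component, queue = set(), deque([start])
        while queue:
            n = queue.popleft()
            if n in component:
                continue
            component.add(n)
            seen.add(n)
            queue.extend(adj[n] - component)
        clusters.append(component)
    return clusters

match_nodes = ["m1", "m2", "m3", "m4"]
match_edges = [("m1", "m2"), ("m2", "m3")]
hyp_graph_size = 6   # toy value: number of nodes in the hypothesis graph
clusters = connected_clusters(match_nodes, match_edges)
print(len(clusters), [len(c) / hyp_graph_size for c in clusters])
# 2 [0.5, 0.16666666666666666]
```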
4 Experiments and Results
4.1 Training and Classification
Feature selection. We experimented with various learners and the attribute selection module of Weka (Witten and Frank, 2005). Many learners (evaluators) selected features that seem intuitively important. However, unintuitive features, such as the number of predicates in the hypothesis graph, also showed up as high-valued features, which could be due to idiosyncrasies in the development set. We chose to submit a run that is based on a small and intuitively plausible feature set which led to consistent results across a number of classifiers. This feature set is listed in Table 2.

Table 2: Feature Set for Submitted Test Runs
  1. No. of predicate matches relative to hypothesis.
  2. No. of frame (Fred, Detour) matches relative to hypothesis.
  3. No. of role (Rosy) matches relative to hypothesis.
  4. Match graph size relative to hypothesis graph size, including syntactic, semantic and ontological information.
Results. We submitted two runs for different classifiers from Weka, using the feature set from Table 2. For run 1, we used a simple conjunctive rule classifier. It generated a single rule measuring predicate and frame matches relative to the hypothesis:

  (frames_m_relto_h <= 0.954546) and (preds_m_relto_h <= 0.485294)
      => rte_entails = 0
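Transcribed into executable form, one reading of this rule looks as follows; the comparison operators are not fully legible in the output reproduced above, so the direction of the thresholds is inferred from the discussion in Section 4.2.

```python
# One reading of the learned run-1 rule: reject entailment when both the
# frame-match and the predicate-match ratios fall below their thresholds.
# The <= direction is inferred, not taken verbatim from the Weka output.
def run1_decision(frames_m_relto_h, preds_m_relto_h):
    if frames_m_relto_h <= 0.954546 and preds_m_relto_h <= 0.485294:
        return 0   # rte_entails = 0 (no entailment)
    return 1       # otherwise predict entailment

print(run1_decision(frames_m_relto_h=0.98, preds_m_relto_h=0.60))  # 1
print(run1_decision(frames_m_relto_h=0.50, preds_m_relto_h=0.30))  # 0
```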
For run 2, we used the LogitBoost⁴ classifier from Weka's meta classifiers, which used features (1), (2) and (4) in its iteration steps. Table 3 lists the results on the current task; Table 4 gives the results on the RTE-2005 data.

⁴ LogitBoost performs additive logistic regression using the classifier DecisionStump.

Table 3: RTE 2006 results (accuracy in %).
          All tasks   IE     IR     QA     SUM    dev set
  run 1   59.0        49.5   59.5   54.5   72.5   61.1
  run 2   57.8        48.5   58.5   57.0   67.0   59.8

Table 4: Results on the RTE 2005 data (accuracy in %).
          test set   dev set
  run 1   54.6       51.2
  run 2   53.3       54.3
4.2 Discussion of Results and Error Analysis
The conjunctive rule of run 1 imposes a medium and a high threshold, respectively, on predicate and frame matches as criteria for rejection. The system thus accepts high degrees of semantic similarity based on frames, combined with a medium degree of overlap at the syntactic predicate level, to model entailment.

This is in accordance with the view that frame semantics models "aboutness", on the basis of coarse-grained conceptual meaning, as opposed to veridicality as it is modeled by truth-conditional semantics. This is further confirmed by the results for the different RTE tasks (Table 3): we obtain higher accuracy for SUM (and IR), as opposed to QA and IE, which (in the RTE setting) need deeper modeling in terms of veridicality. Run 2, which uses the more "informative" feature set of Table 2, performs only slightly worse than run 1, and better on QA.
True positives. Table 5 lists typical examples of true positives. Entailment is triggered by high semantic overlap between hypothesis and match graph in terms of matching predicates, frames, and f-structure. In ex. 602, frames establish a semantic match for predicates without a syntactic match: the verb purchase and the nominal purchase are both assigned the frame COMMERCE_BUY.

Missing or non-matching frame assignments can be compensated by WordNet relatedness: in ex. 103, die is matched with pass away although the latter has not been assigned a frame. Active-passive diathesis as in ex. 129 is resolved on the f-structure level, where we normalize to deep subject and object. As seen in ex. 626 and 129, due to proportional measures of overlap, we also obtain good results for longer hypotheses.

Table 5: Examples from RTE 2006.

True positives
  103  T: Everest summiter David Hiddleston has passed away in an avalanche of Mt. Tasman.
       H: A person died in an avalanche.
  129  T: In one of the latest attacks, a US soldier on patrol was killed by a single shot from a sniper in northern Baghdad, the military said yesterday.
       H: A sniper killed a U.S. soldier on patrol in Baghdad with a single shot.
  602  T: The system of government purchases of food under the U.N. Oil-for-Food Program was alleged to have many abuses.
       H: A government purchases food.
  626  T: An earthquake has hit the east coast of Hokkaido, Japan, with a magnitude of 7.0 Mw.
       H: An earthquake occurred on the east coast of Hokkaido, Japan.

True negatives
  233  T: The goal of preserving indigenous culture can hardly be achieved by a handful of researchers and curators at museums of ethnology and folk culture.
       H: Indigenous folk art is preserved.
  322  T: Even today, within the deepest recesses of our mind, lies a primordial fear that will not allow us to enter the sea without thinking about the possibility of being attacked by a shark.
       H: A shark attacked a human being.
True negatives. 27% of the justified rejections involve mismatches of modality, while only 11.9% of all sentences contain modal contexts. The algorithm for construction of the match graph rejects predicate (and feature) matches if the predicates (features) are embedded in inconsistent modal contexts. Thus, mismatching modalities are reflected in two ways: by (distinct) modality features in text and hypothesis, and in terms of a reduced size of the match graph. Ex. 233 and 322 in Table 5 are true negatives where matches for the relevant main predicates are blocked.
Error analysis for base components. LFG parsing yielded 99% coverage for the test set. 24% of the sentence pairs involved a fragmentary parse. For these, we rely on non-LFG-integrated frame and role assignments by Fred, Rosy and Detour. To assess the impact of losses in syntactic analysis, enriched semantic representations and the resulting overlap measures, we restricted the test set to pairs without fragmentary parses, which yielded an improvement of 1-3% for various learners and feature sets.

Overall, the system assigned 14326 frames and 13325 roles, including 3199 frames and 1736 roles added by default rules. On average, this amounts to 8.9 frames per sentence and 1.1 roles per frame. We identified losses in the interface that projects frames and roles onto the LFG f-structures (10% for frames, 38.9% for roles) that are due to failed or partial parses, but also to remaining differences in tokenisation and lemmatisation. Losses in porting frame and role assignments to LFG are compensated by the fall-back to the non-integrated frames and roles, yet they do have an impact on the graph connectedness measures.
Sparse features. From a machine learning point of view, the size of the development corpus is very small. Features that do not occur in the majority of sentence pairs are neglected by the machine learning systems. Currently, we have many high-frequency features that measure similarity (e.g. predicate and frame overlap), but only few, low-frequency features that identify dissimilarity, such as mismatching modalities. Therefore, the learners have a tendency to reject too little: 29.5% false positives as opposed to 12.75% false negatives.
False positives and negatives. False positives often involve dissimilar, non-matching main predicates within larger match graphs. In line with the above observation of sparse features for dissimilarity, we see potential for improvement by including specific dissimilarity measures between non-matching nodes in otherwise connected match graphs.

A related problem arises for nodes in the match graph that are, e.g., closely connected in the hypothesis graph but match with far-distant parts of the text graph, as in ex. 198, where the hypothesis "4.4 million people were executed in Singapore" is matched against the text "Some 420 people have been hanged in Singapore [...]. That gives the country of 4.4 million people the highest execution rate." For such configurations, we could introduce weights that reflect the relative distance of matching node pairs in the text and hypothesis graphs, measured in terms of f-structure or frame structure path distance. This, we hope, could help the learner to establish further criteria for rejection.
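The proposed weighting could look roughly like the sketch below; the weighting function is hypothetical, and plain BFS distance over an edge list stands in for f-structure or frame-structure path distance.

```python
# Hypothetical weighting of matched node pairs: pairs whose text-side nodes
# are much further apart than their hypothesis-side counterparts get a
# weight well below 1 (plain BFS distance over undirected edge lists).
from collections import defaultdict, deque

def path_distance(edges, a, b):
    adj = defaultdict(set)
    for x, y in edges:
        adj[x].add(y)
        adj[y].add(x)
    queue, seen = deque([(a, 0)]), {a}
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in adj[node] - seen:
            seen.add(nxt)
            queue.append((nxt, dist + 1))
    return float("inf")

def pair_weight(text_edges, hyp_edges, text_pair, hyp_pair):
    t_dist = path_distance(text_edges, *text_pair)
    h_dist = path_distance(hyp_edges, *hyp_pair)
    return min(1.0, (h_dist + 1) / (t_dist + 1))

text_edges = [("t1", "t2"), ("t2", "t3"), ("t3", "t4"), ("t4", "t5")]
hyp_edges = [("h1", "h2")]
# h1/h2 are adjacent in the hypothesis, but their text counterparts t1/t5
# lie four edges apart, as in configurations like ex. 198 -> weight 0.4.
print(pair_weight(text_edges, hyp_edges, ("t1", "t5"), ("h1", "h2")))
```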
Inferences on partial structures. Our architecture is open for extension to deeper methods. We have started to integrate inferences on partial structures in order to bridge partially non-matching text and hypothesis graphs: e.g., joins(x1, y1) in the text graph supports the hypothesis member_of(x2, y2) for matching node pairs (x1/x2, y1/y2). In the graph matching process, inferences of this type introduce special types of matches, which can be exploited by the learner directly, or indirectly, through the ensuing extension of the match graph. However, due to the small, manually crafted rule set, this feature was not yet effective. The next step is thus to identify and integrate suitable, large-scale resources for inferences, both lexical and based on world knowledge.
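A minimal sketch of such an inference step, with an illustrative one-rule base (joins licenses member_of), is given below; the rule inventory and data layout are assumptions, not the actual rule set.

```python
# Sketch of partial-structure inference: a hand-written rule base licenses a
# hypothesis relation when a related relation holds between the matched
# counterparts in the text graph. The single rule below is illustrative.
INFERENCE_RULES = {"member_of": {"joins"}}   # hypothesis relation -> licensing text relations

def inferred_matches(text_relations, hyp_relations, node_map):
    """node_map maps hypothesis node ids to their matched text node ids."""
    text_set = set(text_relations)
    inferred = []
    for rel, x2, y2 in hyp_relations:
        x1, y1 = node_map.get(x2), node_map.get(y2)
        if any((lic, x1, y1) in text_set for lic in INFERENCE_RULES.get(rel, ())):
            inferred.append(("inferred_match", rel, x2, y2))
    return inferred

text_relations = [("joins", "t_x", "t_y")]
hyp_relations = [("member_of", "h_x", "h_y")]
node_map = {"h_x": "t_x", "h_y": "t_y"}
print(inferred_matches(text_relations, hyp_relations, node_map))
# [('inferred_match', 'member_of', 'h_x', 'h_y')]
```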
5 Conclusions and Perspectives
We presented a baseline system for textual entailment that is based on "informed" features for structural and semantic overlap between text and hypothesis. The system's performance is on a par with the best systems in last year's RTE Challenge. We consider this to demonstrate the usefulness of a frame-based approach to textual entailment, combined with deep syntactic analysis and further components that complement aspects of semantic modeling not covered in frame semantics.

We identified various possibilities for further improvement. The current bias towards positive entailment judgments can be compensated by introducing more negative features that measure the distance (semantic or constructional) between material involved in partial match graphs. More generally, starting from the determination of structural and semantic overlap, or similarity, we can now improve the modeling of dissimilarity. The detection of incompatible modalities has proved rather effective, but can be further extended to lexically induced modalities (e.g. possibility of, alleged, promise). The use of an integrated syntactic-semantic-ontological representation supports the integration of selected deeper and fine-grained methods for semantic analysis.
References
Collin F. Baker, Charles J. Fillmore, and John B. Lowe. 1998. The Berkeley FrameNet project. In Proceedings of COLING-ACL, Montreal, Canada.

Satanjeev Banerjee and Ted Pedersen. 2003. Extended gloss overlaps as a measure of semantic relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, Acapulco, Mexico.

Johan Bos and Katja Markert. 2005. Combining shallow and deep NLP methods for recognizing textual entailment. In Proceedings of the First Challenge Workshop, Recognizing Textual Entailment. PASCAL.

Aljoscha Burchardt, Katrin Erk, and Anette Frank. 2005. A WordNet Detour to FrameNet. In B. Fisseni, H.-C. Schmitz, B. Schröder, and P. Wagner, editors, Sprachtechnologie, mobile Kommunikation und linguistische Ressourcen, volume 8 of Computer Studies in Language and Speech, pages 408-421. Peter Lang, Frankfurt am Main.

Richard Crouch. 2005. Packed Rewriting for Mapping Semantics to KR. In Proceedings of the Sixth International Workshop on Computational Semantics, IWCS-6, Tilburg.

Ido Dagan, Oren Glickman, and Bernardo Magnini. 2005. The PASCAL recognising textual entailment challenge. In Proceedings of the First Challenge Workshop, Recognizing Textual Entailment. PASCAL.

Katrin Erk and Sebastian Pado. 2006. Shalmaneser: a toolchain for shallow semantic parsing. In Proceedings of LREC-2006 (to appear), Genoa, Italy.

Anette Frank and Katrin Erk. 2004. Towards an LFG Syntax-Semantics Interface for Frame Semantics Annotation. In A. Gelbukh, editor, Computational Linguistics and Intelligent Text Processing, LNCS, pages 1-12. Springer.

Ian Niles and Adam Pease. 2003. Linking lexicons and ontologies: Mapping WordNet to the Suggested Upper Merged Ontology. In H.R. Arabnia, editor, IKE. CSREA Press.

Stefan Riezler, Tracy H. King, Ronald M. Kaplan, Richard Crouch, John T. Maxwell III, and Mark Johnson. 2002. Parsing the Wall Street Journal using a Lexical-Functional Grammar and Discriminative Estimation Techniques. In Proceedings of ACL'02, Philadelphia, PA.

Ian H. Witten and Eibe Frank. 2005. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann, San Francisco, 2nd edition.