Using Dialogue Corpora to Extend Information Extraction Patterns for Natural
Language Understanding of Dialogue
Roberta Catizone, Alexiei Dingli, Robert Gaizauskas
Department of Computer Science
University of Sheffield
E-mail: roberta@dcs.shef.ac.uk, alexiei@dingli.org, r.gaizauskas@dcs.shef.ac.uk
Abstract
This paper examines how Natural Language Processing (NLP) resources and online dialogue corpora can be used to extend the coverage of
Information Extraction (IE) templates in a Spoken Dialogue system. IE templates are used as part of a Natural Language Understanding
module for identifying meaning in a user utterance. The use of NLP tools in dialogue systems is a difficult task because 1) spoken dialogue is
often not well-formed and 2) there is a serious lack of dialogue data. In spite of that, we have devised a method for extending IE patterns
using standard NLP tools and available dialogue corpora found on the web. In this paper, we explain our method, which includes a set
of NLP modules developed using GATE (a General Architecture for Text Engineering), as well as a general-purpose editing tool that we
built to facilitate the IE rule creation process. Lastly, we present directions for future work in this area.
1. Information Extraction for Dialogue
Why use Information Extraction for Dialogue? Information
Extraction techniques for extracting meaning are generally
applied to text documents, for example newspaper reports,
scientific papers, or blogs, rather than to transcribed spoken
dialogues. However, we have chosen to apply IE to
dialogue for the following reason: Dialogue utterances
tend not to be well-formed sentences, yet convey meaning
to the hearer. Since utterances are not well-formed, a
full-parsing method is not as desirable as a pattern
matching approach with shallow syntactic parsing to
identify NPs and VPs. This lends itself to an IE
template-based approach. We devised our method when
developing a demonstrator for a dialogue system in the
domain of Office chat (for the EU-funded Companions
project), but it could be applied to any dialogue domains.
In our system, IE patterns are part of a Natural Language
Understanding module (Figure 1). While all inter-module
communication in the overall system takes place via a
blackboard, this module in effect takes input from the
upstream Speech Recognition and Dialogue Act tagging
modules for the current user utterance and from the
downstream Dialogue Manager for the system response to
the previous utterance. It outputs a shallow meaning
representation for the current user utterance which is passed
on to the Dialogue Manager for formulating a response to
the user.
For example, one sort of input our system must handle in
the domain of Office Chat is utterances that express the
user’s emotional attitude about their day or project or task.
Such utterances may be conceived of as ATTITUDE
relations between a person and a day, project or task, with
subtypes WORRY, HAPPY, ANNOY, etc. The WORRY
Relation is signaled by words such as ‘worry’, ‘be worried’,
‘be troubled’, ‘be concerned’ and ‘be afraid’ and so on.
Relations also have attributes which, in our domain, are
attitude-type (WORRY, HAPPY, etc), attitude-subject
(Person) and attitude-object (Person, Project or Task).
Given this framework, the NLU output for the sentence
“I'm a bit worried.” is:
<entity id=e1 type=person user=yes>I</entity>’m a bit
<relation-signal id=rs1 type=attitude
subtype=worry>worried</relation-signal>
<relation id=R1 type=attitude subtype=worry
attitude-subject=e1 relation-signal=rs1/>
which may be expressed more abstractly in a logical form
representation as:
object(user:person),
object(e1:person),
attribute(e1, user, true),
object(r1:attitude),
attribute(r1, subtype, worry),
attribute(r1, attitude-subject, e1),
attribute(r1, attitude-object, unknown)
Such representations, in which both entities and relations
are reified, are convenient given partial information, which
may result either from imperfect analysis or from
information being distributed across multiple sentences.
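A minimal sketch of how such a reified representation might be held in code (the function and attribute names here are illustrative assumptions, not the Companions system's actual data structures):

```python
# Reified meaning representation: entities and relations are both
# first-class objects, so partial information can be stored now and
# completed later as more of the dialogue is analysed.

def make_object(obj_id, obj_type, **attrs):
    """An object such as object(e1:person) together with its attributes."""
    return {"id": obj_id, "type": obj_type, "attributes": dict(attrs)}

# The NLU output for "I'm a bit worried.":
e1 = make_object("e1", "person", user=True)
r1 = make_object("r1", "attitude",
                 subtype="worry",
                 attitude_subject="e1",
                 attitude_object="unknown")  # unfilled slot

def fill_slot(relation, slot, value):
    """Fill a slot left unknown by an earlier, partial analysis."""
    if relation["attributes"].get(slot) == "unknown":
        relation["attributes"][slot] = value
    return relation

# A later utterance (say, "about the project") supplies the object:
fill_slot(r1, "attitude_object", "p1")
```

Because the relation r1 is itself an object, the unknown attitude-object slot can be filled in when a subsequent utterance supplies it, which is exactly the convenience under partial information noted above.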
To derive such meaning representations automatically we
use entity and relation extraction techniques. To develop
entity and relation extractors one can pursue either a
manually authored rule-based approach or a machine
learning approach. The advantages and disadvantages of
each have been well-explored in the literature (McLeron et
al., 2006; Feldman et al., 2006; Riloff, 1996). Given the
impossibility of obtaining significant amounts of annotated
training data, we decided to explore a data-driven approach
to manually creating rules for our IE system. This paper
describes this approach, focusing on relation extraction,
including the data sources we used and a special tool we
created to support the creation and modification of rules for
our system. The outcome is a methodology which supports
the rapid development of wide coverage entity and
relation extractors for shallow dialogue understanding.
The existing research on using Information Extraction for
Dialogue is more akin to finding database style information
(Flycht-Eriksson, 2003), whereas our use of
Information Extraction to extract meaning from dialogue
chat that is not goal-directed is, we believe, novel.
2. Overview of the Methodology
We start by identifying all the Entities and Relations that
are significant in our domain, together with the words or
phrases that we believe trigger each Relation, which we call
Relation Signals. We assume entity extractors already exist,
these being either well-known named entity types or other
semantic classes that can be extracted using standard entity
extraction techniques. For each relation, we proceed as
follows:
1. Gather a pool of examples of utterances expressing the
relation by searching the data sources (next section)
using the trigger/signal words already identified.
Negative examples are important here too (examples
containing the trigger word(s) but not expressing the
intended relation).
2. Select an example from the pool and pre-process using
a pipeline of GATE modules.
3. Input the example and associated annotations resulting
from the preprocessing into the JAPE editor.
4. Manually create a JAPE pattern action rule in the
editor by specifying (a) which of the annotations
should form part of the pattern portion of the rule and
(b) what output annotations should be added by the
action part of the rule.
5. Run the new rule from 4 over the set of examples and
refine the rule by further generalization/specialization
to cover correctly as many of the positive examples as
possible and as few of the negative ones as possible.
6. Remove the examples covered by the rule from the
example pool and go to 2.
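The six steps above amount to the following covering loop. In the paper the rule-construction step is performed by a human in the JAPE editor; in this sketch it is any function mapping a seed utterance to a compiled regular expression, with a toy rule constructor for the WORRY signal (all names and the regex are illustrative assumptions):

```python
import re

def covering_loop(pool, make_rule, max_iters=50):
    """Covering algorithm: repeatedly build a rule from a seed example
    (steps 2-5), remove the examples it covers (step 6), and continue
    until the pool is empty."""
    rules = []
    pool = list(pool)
    while pool and max_iters > 0:
        max_iters -= 1          # guard against rules covering nothing
        seed = pool[0]
        rule = make_rule(seed)  # human-in-the-loop step, here automated
        rules.append(rule)
        pool = [ex for ex in pool if not rule.search(ex)]
    return rules

def make_worry_rule(seed):
    """Toy rule constructor: generalise the seed to its signal word."""
    m = re.search(r"worr\w*", seed, re.I)
    return re.compile(re.escape(m.group(0)), re.I)

examples = ["I'm worried about the deadline",
            "She is worried about the project",
            "Don't worry too much about it"]
rules = covering_loop(examples, make_worry_rule)
```

Two iterations suffice here: the rule built from the first seed covers both "worried" examples, and a second rule covers the remaining "worry" turn.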
Figure 1: Natural Language Understanding Architecture
Note that this is a classic covering algorithm, well-known
in the machine learning community (see, e.g. Mitchell
(1997)). Here, however, we employ a human in the loop to
carry out the generalization.
3. Dialogue Data on the Web
Once we have manually identified the basic set of Entities,
Relations and Relation Signals, we can expand the
coverage by finding more examples of dialogue turns (from
a dialogue corpus) containing the relation signals for each
of our relations. Several useful internet sites that contain
dialogue corpora are:
The Linguistic Data Consortium’s corpus of
Spoken Office Dialogues1;
The Dialogue Diversity Corpus2;
Saarbrucken Corpus of Spoken English3.
We have also done some preliminary work using movie/TV
scripts. We have used some of the scripts from the TV
series Friends4, which look very promising for collecting
non-goal-directed dialogue turns.
For example, for the ATTITUDE relation discussed above,
we found over 200 examples in two dialogue corpus
sources.
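Step 1 of the methodology, gathering a pool of candidate turns by signal word, can be sketched as follows (the signal list and function names are illustrative assumptions):

```python
import re

# Assumed trigger words for the WORRY subtype of the ATTITUDE relation.
WORRY_SIGNALS = ["worry", "worried", "troubled", "concerned", "afraid"]

def gather_examples(turns, signals):
    """Collect every dialogue turn containing one of the relation-signal
    words.  Matches include negative examples (signal present, intended
    relation absent), which are kept for refining the rule later."""
    pattern = re.compile(r"\b(" + "|".join(signals) + r")\b", re.I)
    return [turn for turn in turns if pattern.search(turn)]

turns = ["So, don't worry too much about it.",
         "It's just self rating.",
         "I'm a bit worried."]
pool = gather_examples(turns, WORRY_SIGNALS)
```

Run over a corpus such as the LDC office dialogues, a search like this yields the sort of example pool shown below.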
Here is a sample from the LDC corpus of Spoken Office
Dialogues:
Um, th- what would - would - would - what would
<Emphasis> worry </Emphasis> me is that maybe we
might miss a little detail
we're not exactly sure either. So, don't worry too much
about it. <Comment Description="while laughing"/> The -
It's just self rating.
O_K. So, then in terms of people worrying about,
then we can start worrying about how we would
And <Emphasis> then </Emphasis> we can start worrying
about where to get this input, what -
<Emphasis> that </Emphasis> I don't think is
<Emphasis> even </Emphasis> worth us <Emphasis>
worrying
</Emphasis> about just yet. worry about converting it to
<Emphasis> English </Emphasis> and worry about how it
could ex- extract the parameters we need for the
<Emphasis> belief-net. </Emphasis>
1 www.ldc.upenn.edu/
2 www-rcf.usc.edu/~billmann/diversity/DDivers-site.htm
3 www.uni-saarland.de/fak4/norrick/scose.html
4 ufwebsite.tripod.com/scripts/scripts.htm
uh, although we haven't worried about this yet, you might
wanna worry about something that would
And here is a sample from the Friends scripts:
Friends15.txt:CHAN: It doesn't matter. I just don't want to
be one of those guys that's in his office until twelve o'clock
at night worrying about the WENUS.
Friends15.txt:RACH: No. But don't worry, I'm sure they're
still there.
Friends21.txt:Rachel: Oh God, oh. Great, Monica, y'know
what, you could've called, I have been up here, I've been
worried...
Friends21.txt:Fake Monica: Monica, I started my day by
peeing in front of twenty-five other women, and you're
worried about who's gonna take you to the Big Apple
Circus?
Friends21.txt:Monica: Well, not... worried, just...
wondering.
Friends23.txt:RACH: No, honey, they're not, but don't
worry, because we are going to find them, and
until we do, we are all here for you, ok?
Friends3.txt:MONICA: No, he'll be fine. It's the other five
I'm worried about.
4. Creating Extraction Rules
Given such examples we then apply our method, as
described above in Section 2, for creating pattern-action
rules which match the examples and extract key relational
content. Here we describe the creation of a single rule,
driven by a set of seed examples.
4.1 Preprocessing/Feature Extraction
First the seed example(s) are passed to a pipeline of GATE
(Cunningham et al., 1997) modules which perform
linguistic preprocessing, but which may also be thought of
as feature extraction. These modules include:
Gazetteer/Lexicon lookup: matches token
sequences against pre-stored lists of named
entities (persons, places, organizations) and lexical
items in various semantic categories
Sentence splitting: splits the document into sentences
POS tagging: assigns POS tags to each token in
each sentence
Entity/signal recognition: identifies and types
token sequences as entities or relation signals in
the domain of interest
NP+VP chunk parsing: identifies and types token
sequences as NP or VP chunks and determines
grammatical dependency relations between words
Relation extraction: uses evidence from previous
stages to determine whether there are instances of
relations in the domain of interest
Temporal interpretation: interprets information
about temporal expressions from the entity
recognition stage to assign calendrical time values,
where possible, to temporal expressions
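The module chain might be sketched as a sequence of annotators over a shared document record. The implementations below are toy stand-ins for the real GATE components, included only to show how each stage layers annotations on the previous ones (the signal lexicon and all names are assumptions):

```python
# Toy stand-ins for the GATE preprocessing/feature-extraction modules.
# Each annotator adds one layer of annotations to a shared document
# record; the real modules are far richer, this only shows the shape.

def sentence_splitter(doc):
    doc["sentences"] = [s.strip() for s in doc["text"].split(".") if s.strip()]
    return doc

def tokeniser_and_pos(doc):
    # stand-in for POS tagging: every token gets the dummy tag "UNK"
    doc["tokens"] = [(tok, "UNK")
                     for s in doc["sentences"] for tok in s.split()]
    return doc

def signal_recogniser(doc):
    # illustrative lexicon mapping signal words to (type, subtype)
    signals = {"worried": ("attitude", "worry")}
    doc["relation_signals"] = [(tok,) + signals[tok.lower()]
                               for tok, _ in doc["tokens"]
                               if tok.lower() in signals]
    return doc

PIPELINE = [sentence_splitter, tokeniser_and_pos, signal_recogniser]

def run_pipeline(text):
    doc = {"text": text}
    for module in PIPELINE:
        doc = module(doc)
    return doc

doc = run_pipeline("I'm a bit worried.")
```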
4.2 The Rule Editor
The extraction rules are written in JAPE (the Java
Annotation Patterns Engine), a pattern-action rule
language that supports the creation of rules that match and add
annotations to linguistically annotated data5. The rules are
created and modified using the JAPE Editor, a tool
especially developed for the current application.
The editor (as seen in Figure 2) is composed of three parts:
the top part is the input box where the user is requested to
enter a textual example. Apart from entering a sample
sentence, the user is also asked to highlight with the mouse
which of the elements need to be extracted by the
algorithm. As soon as she is done and the “Process Text”
button is pressed, the sentence is sent to the GATE pipeline
and passed through the various functions described earlier.
The whole extraction rule process, including preprocessing,
is actually controlled from within the JAPE Editor.
5 See http://gate.ac.uk.
This brings us to the middle part of the interface: the rule
construction area, made up of three grids. The middle grid
represents the analysis performed by the GATE pipeline on
the word selected in the top window by the user. This might
include the part-of-speech value of the selected text, semantic
annotations (such as whether the word is actually the name of a
person), etc. The other two grids perform exactly the same
function, but they show only the analysis for the text before
the selection and the text after the selection. So if we look
at the example sentence “The deadline is Friday.”, where
“Friday” was selected, the first grid would show the
processing for “The deadline is ”, the middle grid would
show the processing for “Friday” and the last grid would
simply process the full stop found after the word “Friday”. If
we just focus on the middle grid, we’ll find a variety of
information, such as the fact that “Friday” is a word, it starts
with a capital letter, etc. This kind of information is so
general that it is useless for building the Information
Extraction pattern, so these features are simply discarded
by the user. To discard such information, the user simply
right-clicks with the mouse and selects the delete option. The
system offers various options in its menu, such as the
facility to modify existing values, add new conditions,
etc. If we go back to pruning the output, we
immediately notice that the semantic tag which links “Friday”
to a date is a rather important clue, so that is retained, as can
be seen in the diagram (Fig 2). The user continues pruning
and modifying the pattern until she is happy with the result.
However, the effectiveness of such a rule can only be
evaluated on a test corpus.
This is the role of the last box in the interface, which is the
testing box whereby the rule is tested on a set of
documents containing examples from the test corpus.
Essentially, the user simply presses the Test button and the
rule constructed through the interface is automatically
applied to the document set in a specified directory. The
result is shown in the adjacent box together with the name
of the document from which it was extracted.
Once we obtain a simplified rule from the system, we then
need to bootstrap with more examples. To achieve this, our
dialogue corpora are used to find further examples. As
mentioned previously, we are not limited to traditional
dialogue corpora, but are also looking at film/TV examples
found on the web. These are a rich source of conversational
dialogues, albeit ones that mimic real-world dialogues within a
particular genre: humor, drama, tragedy, etc. Nonetheless,
they still contain a series of important examples which can
be used. To identify/organise the newly collected examples,
a concordance module is used to align the examples by the
Relation Signal. With these new examples, we make use of
the newly generated rule and we try to generalise over
further examples. The idea is that from a base rule, we will
be capable of covering more examples using simple
modifications of that rule. This is something we learned
from experience because when we handcrafted the rules, it
Figure 2: JAPE Editor Graphical User Interface
was obvious that minor modifications of the base rules
allowed the system to increase its coverage by several
orders of magnitude. The most successful pattern is then
identified and stored. This pattern will be used as a new
JAPE rule inside our system. The examples which were
identified using our concordance system, but which were
not covered by the generated JAPE rule are extremely
important and are used as new seeds in the subsequent
iteration of the system.
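The concordance step, aligning the newly collected examples on their Relation Signal, amounts to a keyword-in-context view; a sketch (names assumed):

```python
import re

def concordance(turns, signal, width=30):
    """Align collected examples on the relation signal (a simple
    keyword-in-context view), so that variations in the surrounding
    pattern are easy to compare by eye."""
    rows = []
    pattern = re.compile(r"\b" + signal + r"\w*\b", re.I)
    for turn in turns:
        m = pattern.search(turn)
        if m:
            left = turn[:m.start()][-width:]
            right = turn[m.end():][:width]
            rows.append((left.rjust(width), m.group(0), right))
    return rows

rows = concordance(["I'm a bit worried.",
                    "So, don't worry too much about it."], "worr")
```

With the signal column aligned, it is easy to see which left and right contexts a base rule already covers and which minor variations a generalised rule should absorb.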
5. Using the Paraphrase and Textual
Entailment work done in IE
The method we have outlined has a serious limitation: it
aims to expand the different ways of expressing a Relation
given a Relation Signal, but does not identify other ways of
expressing the same meaning with other Relation Signals.
To approach this problem, we are studying the work done
in IE on paraphrases, i.e. finding equivalent ways of
expressing the same concept. Some approaches rely on
identifying the same Named Entities in similar news
items (Yusuke et al., 2002); however, we cannot use exactly
this approach since a dialogue is generally made up of a
limited number of repeated Named Entities. Thus we
cannot find similarities between sentences using Named
Entities. Using the process described in Section 4, we will
increase our number of seed examples and use these results
to discover new Relation Signals by applying work on
finding paraphrases for IE. We are particularly interested in
the work on discovering paraphrases using Machine
Translation (Finch et al., 2005) to help discover different
ways of saying the same thing: translating the target
sentence/turn in language A into language B and then
translating the sentence in language B back into language A.
So, using our example with Google Translator, if we
translate “I’m a bit worried.” into Czech and then translate
the Czech, “Mám trochu strach.”, back to English, we get “I
am a little scared.”, which does in fact give us another way
of saying the same thing. Our plan is to try to use this
approach on corpus examples containing our base Relation
Signals to discover new Relation Signals. This work
applied to our task looks promising and we hope to be able
to report results on this in the near future.
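The round-trip translation idea can be sketched with a pluggable translation function; the toy lookup table below merely reproduces the Czech example above and stands in for a real MT service:

```python
# Round-trip machine translation for paraphrase discovery: translate
# the turn into a pivot language and back, and keep the result if it
# differs from the original.  `translate` is an assumed stand-in for a
# real MT service such as the one used in the example above.

def round_trip_paraphrase(sentence, translate, pivot="cs"):
    pivoted = translate(sentence, src="en", tgt=pivot)
    back = translate(pivoted, src=pivot, tgt="en")
    return back if back.lower() != sentence.lower() else None

# Toy translator reproducing the paper's Czech example.
TABLE = {("en", "cs", "I'm a bit worried."): "Mám trochu strach.",
         ("cs", "en", "Mám trochu strach."): "I am a little scared."}

def toy_translate(sentence, src, tgt):
    return TABLE[(src, tgt, sentence)]

paraphrase = round_trip_paraphrase("I'm a bit worried.", toy_translate)
```

The returned paraphrase ("I am a little scared.") contains a new candidate Relation Signal, "scared", which could then seed a further iteration of the rule-creation process.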
The other work we are looking at is work on textual
entailment, which also addresses the issue of alternative
ways of conveying the same meaning, perhaps through
saying something which entails something else, i.e. where
an entailment holds between two text variants that express
the same target relation (Dagan et al., 2006).
6. Acknowledgements
This work was funded by the Companions project
(www.companions-project.org) sponsored by the European
Commission as part of the Information Society
Technologies (IST) programme under EC grant number
IST-FP6-034434.
7. References
Cunningham, H., Humphreys, K., Gaizauskas, R., Wilks, Y.
(1997). GATE -- a TIPSTER based General Architecture
for Text Engineering. In Proceedings of the TIPSTER
Text Program (Phase III) 6 Month Workshop. DARPA,
Morgan Kaufmann, California.
Dagan, I., Glickman, O. and Magnini, B. (2006). The
PASCAL Recognising Textual Entailment Challenge. In
Quiñonero-Candela, J.; Dagan, I.; Magnini, B.;
d'Alché-Buc, F. (Eds.) Machine Learning Challenges.
Lecture Notes in Computer Science , Vol. 3944, pp.
177-190, Springer.
Feldman, R., Sanger, J. (2006). The Text Mining
Handbook: Advanced Approaches in Analyzing
Unstructured Data. Cambridge University Press.
Finch, A., Hwang, Y.S. and Sumita, E. (2005). Using
machine translation evaluation techniques to determine
sentence-level semantic equivalence. In 3rd International
Workshop on Paraphrasing (IWP2005), South Korea.
Flycht-Eriksson, A. & Jönsson, A. (2003). Some Empirical
Findings on Dialogue Management and Domain
Ontologies in Dialogue System - Implications from an
Evaluation of BirdQuest. Proceedings of 4th SIGdial
Workshop on Discourse and Dialogue, Sapporo, Japan.
McLeron, B., Kushmerick, N. (2006). Transductive Pattern
Learning for Information Extraction. In Proc. of EACL
2006 Workshop on Adaptive Text Extraction and
Mining, Trento.
Mitchell, T. (1997). Machine Learning. McGraw Hill
Publishers.
Riloff, E. (1996). Automatically generating extraction
patterns from untagged text. In Proc. AAAI-96, Portland,
OR, pp. 1044-1049.
Yusuke, S., Sekine, S. (2002). Automatic paraphrase
acquisition from news articles. In Proceedings of the Second
International Conference on Human Language
Technology Research, San Diego, California, pp. 313-318.