Gary Geunbae Lee

  • Pohang University of Science and Technology

About

Publications: 239
Reads: 38,298
Citations: 3,185

Publications (239)
Article
Domain adaptation using source domain data is preferable in the development of Deep Q Network (DQN) based dialog systems, because of the training cost for additional target domains. However, the inherent domain shift hinders feature-space level generalization, which degrades performance. We introduce Adversarial Calibrator based Transfer learning (...
Preprint
To ensure satisfactory user experience, dialog systems must be able to determine whether an input sentence is in-domain (ID) or out-of-domain (OOD). We assume that only ID sentences are available as training data because collecting enough OOD sentences in an unbiased way is a laborious and time-consuming job. This paper proposes a novel neural sent...
Article
Full-text available
This paper presents a system to detect multiple intents (MIs) in an input sentence when only single-intent (SI)-labeled training data are available. To solve the problem, this paper categorizes input sentences into three types and uses a two-stage approach in which each stage attempts to detect MIs in different types of sentences. In the first stag...
Article
To ensure satisfactory user experience, dialog systems must be able to determine whether an input sentence is in-domain (ID) or out-of-domain (OOD). We assume that only ID sentences are available as training data because collecting enough OOD sentences in an unbiased way is a laborious and time-consuming job. This paper proposes a novel neural sent...
Conference Paper
Question answering (QA) using triples has been widely studied. One important aspect is answer ranking, that is, which answer candidates should be used to find correct answers. We are proposing a new method using type-matching information for ranking QA triples. We recommend using a new ranking method that includes type-matching scores from semantic...
Conference Paper
In natural language understanding, extraction of named entity (NE) mentions in given text and classification of the mentions into pre-defined NE types are important processes. Most NE recognition (NER) relies on resources such as a training corpus or NE dictionary, but collecting them manually is laborious and time-consuming. This paper proposes a...
Article
This paper proposes a sentence stress feedback system in which sentence stress prediction, detection, and feedback provision models are combined. This system provides non-native learners with feedback on sentence stress errors so that they can improve their English rhythm and fluency in a self-study setting. The sentence stress feedback system was...
Article
Previous studies in Open Information Extraction (Open IE) are mainly based on extraction patterns. They manually define patterns or automatically learn them from a large corpus. However, these approaches are limited when grasping the context of a sentence, and they fail to capture implicit relations. In this paper, we address this problem with the...
Chapter
This paper describes a method of ASR (automatic speech recognition) engine independent error correction for a dialog system. The proposed method can correct ASR errors only with a text corpus which is used for training of the target dialog system, and it means that the method is independent of the ASR engine. We evaluated our method on two test cor...
Article
We proposed an automatic speech recognition (ASR) error management method using recurrent neural network (RNN) based syllable prediction for spoken dialog applications. ASR errors are detected and corrected by syllable prediction. For accurate prediction of the next syllable, we used the current syllable, previous syllable context, and phonetic informatio...
Conference Paper
We present a mathematical model that integrates multiparty/multimodal components with a stochastic dialog model based on a partially observable Markov decision process (POMDP). Our suggested model consists of three subcomponents: a discrete input streamer, a dialog stack manager, and a back-end dialog system. The discrete input streamer extracts di...
Chapter
Full-text available
This paper presents DietTalk, a diet and health assistant based on a spoken dialog system. The purpose of DietTalk is to help people to control their weight by consulting with it using natural language. DietTalk stores personal status, provides food and exercise information, and recommends appropriate food and exercise. To evaluate the effectivenes...
Chapter
In this demonstration, we present a multi-source hybrid Question Answering (QA) system. Our system consists of four sub-systems: (1) a knowledgebase based QA, (2) an information retrieval based QA, (3) a keyword QA and (4) an information-extraction to construct our own knowledgebase from web texts. With these sub-systems, we can query three types of...
Conference Paper
Full-text available
When implementing a conversational educational teaching agent, user-intent understanding and dialog management in a dialog system are not sufficient to give users educational information. In this paper, we propose a conversational educational teaching agent that gives users some educational information or triggers interests on educational contents....
Chapter
One of the main problems with partially observable Markov decision process (POMDP) in development of spoken dialog system (SDS) is lack of scalability. In development of an SDS with electronic program guide (EPG) domain, we devised a POMDP approach which is operated with summary spaces to respond accurately to multiple drifting goals and massive nu...
Article
Full-text available
This article presents an approach to nonnative pronunciation variants modeling and prediction. The pronunciation variants prediction method was developed by generalized transformation-based error-driven learning (GTBL). The modified goodness of pronunciation (GOP) score was applied to effective mispronunciation detection using logistic regression m...
Conference Paper
Full-text available
This study introduces a personalization framework for dialog systems. Our system automatically collects user-related facts (i.e. triples) from user input sentences and stores the facts in one-shot memory. The system also keeps track of changes in user interests. Extracted triples and entities (i.e. NP-chunks) are stored in a personal knowledge base...
Conference Paper
We proposed an automatic speech recognition (ASR) error correction method using hybrid word sequence matching and recurrent neural network for dialog system applications. Basically, the ASR errors are corrected by the word sequence matching whereas the remaining OOV (out of vocabulary) errors are corrected by the secondary method which uses a recur...
Article
In this paper, we propose a novel CALL (Computer-Assisted Language Learning) system, which is called POMY (POSTECH Immersive English Study). In our system, students can study English while talking to characters in a computer-generated virtual environment. POMY also supports haptic feedback, so students can study English in a more interesting manner...
Chapter
This paper discusses a domain selection method for multi-domain dialog systems to generate the most appropriate system utterance in response to a user utterance. We present a two-step approach for efficient domain selection. In our proposed approach, the domain candidates are listed in descending order of scores and then each domain is verified by...
Article
Full-text available
This study examines the dialog-based language learning game (DB-LLG) realized in a 3D environment built with game contents. We designed the DB-LLG to communicate with users who can conduct interactive conversations with game characters in various immersive environments. From the pilot test, we found that several technologies were identified as esse...
Conference Paper
In spoken English, vowels in non-stressed syllables are often reduced to a brief neutral vowel (e.g., ə or ɪ). Non-native speakers of English may not use this 'vowel reduction' correctly, so their utterances may sound unnatural. We propose an automatic system to provide feedback about vowel-reduction to non-native speakers of English. The system has...
Patent
Disclosed is an apparatus and method for processing documents to extract expressions and descriptions. The apparatus for processing documents includes a document collection unit, which collects documents from websites and divides each of the collected documents into a script portion and a description portion to thus generate a script document and a...
Article
Although researchers have conducted extensive studies on relation extraction in the last decade, statistical systems based on supervised learning are still limited, because they require large amounts of training data to achieve high performance level. In this article, we propose cross-lingual annotation projection methods that leverage parallel cor...
Conference Paper
We propose a method to automatically answer SAT-style sentence completion questions using web-scale data. Web-scale data have been used in many language studies and have been found to be a very useful resource for improving accuracy in the sentence completion task. Our method employs assorted N-gram probability information for each candidate word. We...
Conference Paper
Multi-domain dialog systems often encounter user requests for out-of-domain (OOD) service. This paper focuses on detecting these requests. The proposed OOD detection method is included in a multi-domain detection component naturally. This component consists of multiple in-domain verifiers: an in-domain verifier accepts a user utterance when it belo...
Article
This paper proposes an unsupervised spoken language understanding (SLU) framework for a multi-domain dialog system. Our unsupervised SLU framework applies a non-parametric Bayesian approach to dialog acts, intents and slot entities, which are the components of a semantic frame. The proposed approach reduces the human effort necessary to obtain a se...
Article
Associative classification methods have been recently applied to various categorization tasks due to their simplicity and high accuracy. To improve the coverage for test documents and to raise classification accuracy, some associative classifiers generate a huge number of association rules during the mining step. We present two algorithms to increase...
Article
This paper presents a new hybrid dialog management framework that integrates a statistical ranking algorithm into an example-based dialog management approach for chat-like dialogs. The proposed model uses ranking features that consider various aspects of dialogs, including the relative importance of speech acts, dialog history sequences, and the ca...
Conference Paper
Full-text available
This paper examines how grammar questions are automatically generated for L2 learning by applying a sequential labeling technique to learner corpora. We developed a model that helps detect possible error positions and select the most appropriate form among choices. Discriminant models such as conditional random field and maximum entropy are used to...
Article
This article presents a prosodic phrasing model for a general purpose Korean speech synthesis system. To reflect the factors affecting prosodic phrasing in the model, linguistically motivated machine-learning features were investigated. These features were effectively incorporated using a stacking model. The phrasing performance was also improved t...
Conference Paper
Full-text available
We introduce a novel method for grammatical error correction with a number of small corpora. To make the best use of several corpora with different characteristics, we employ a meta-learning with several base classifiers trained on different corpora. This research focuses on a grammatical error correction task for article errors. A series of experi...
Article
Full-text available
Although there have been enormous investments into English education all around the world, not many differences have been made to change the English instruction style. Considering the shortcomings for the current teaching-learning methodology, we have been investigating advanced computer-assisted language learning (CALL) systems. This paper aims at...
Article
This paper presents a new extraction pattern, called a modified Document Type Definition (mDTD), which relies on an analytical interpretation method to identify textual fragments of documents from the Web. We make two major modifications which differ from the conventional DTD. Regarding syntax, we introduced an extended content model with more oper...
Conference Paper
In this paper, we propose an error correction interface for a voice word processor. This correction interface includes user intention understanding and automatic error region detection. For accurate correction, we include a confirmation process that includes an error region control command and a re-uttering command. We evaluate the performance of t...
Conference Paper
In data-driven spoken dialog system development, developers should prepare a dialog corpus with semantic annotation. However, the labeling process is a laborious and time consuming task. To reduce human efforts, we propose an unsupervised approach based on non-parametric Bayesian Hidden Markov Model to the problem of modeling user actions. With the...
Article
Although researchers have conducted extensive studies on relation extraction in the last decade, supervised approaches are still limited because they require large amounts of training data to achieve high performances. To build a relation extractor without significant annotation effort, we can exploit cross-lingual annotation projection, which l...
Article
Word alignment is a crucial component in applications that use bilingual resources. Statistical methods are widely used because they are portable and allow simple system building. However, pure statistical methods often incorrectly align functional words in the English–Korean language pair due to differences in the typology of the languages and a l...
Article
Computational methods for predicting protein subcellular localization have used various types of features, including N-terminal sorting signals, amino acid compositions, and text annotations from protein databases. Our approach does not use biological knowledge such as the sorting signals or homologues, but uses only protein sequence information. Th...
Chapter
As the importance of learning foreign languages increases, the demand for language learning materials is increasing. A dialog system can be one of the useful multimedia resources for language learning. Contrary to information-seeking dialog systems, ranking user/system dialog acts can be very useful for language tutoring purposes. Therefore, we propose a...
Article
The demand for computer-assisted language learning systems that can provide corrective feedback on language learners’ speaking has increased. However, it is not a trivial task to detect grammatical errors in oral conversations because of the unavoidable errors of automatic speech recognition systems. To provide corrective feedback, a novel method t...
Article
This paper presents an automated method to generate realistic grammatical errors that can perform crucial functions for advanced technologies in computer-assisted language learning (CALL), including generating corrective feedback in dialog-based CALL (DB-CALL) systems, simulating a language learner to optimize tutoring strategies, and generating co...
Article
In this paper, we address the problem of relation extraction of multiple arguments where the relation of entities is framed by multiple attributes. Such complex relations are successfully extracted using a syntactic tree-based pattern matching method. While induced subtree patterns are typically used to model the relations of multiple entities, we...
Article
Full-text available
This study introduces the educational assistant robots that we developed for foreign language learning and explores the effectiveness of robot-assisted language learning (RALL). To achieve this purpose, a course was designed in which students have meaningful interactions with intelligent robots in an immersive environment. A total of 24 elementary...
Article
This paper proposes a novel user intention simulation method which is data-driven but can integrate diverse user discourse knowledge to simulate various types of user behaviors. A method of data-driven user intention modeling based on logistic regression is introduced in the Markov logic framework. Human dialog knowledge is designed into two layers...
Conference Paper
We describe Let's Buy Books, a dialog system that helps users search for eBook titles. In this paper we compare different vector space approaches to voice search and find that a hybrid approach using a weighted sub-space model smoothed with a general model provides the best performance over different conditions and evaluated using both synthetic qu...
Conference Paper
This study introduces the speech and language technologies used in the educational assistant robots that we developed for language learning and exploring the affective effects of robot-assisted language learning (RALL). To achieve this purpose, a course was designed in which students have meaningful interaction with intelligent robots in an immersi...
Article
This study introduces the educational assistant robots that we developed for foreign language learning and explores the effectiveness of robot-assisted language learning (RALL) which is in its early stages. To achieve this purpose, a course was designed in which students have meaningful interactions with intelligent robots in an immersive environme...
Conference Paper
Full-text available
This demonstration will illustrate an interactive immersive computer game, POMY, designed to help Korean speakers learn English. This system allows learners to exercise their visual and aural senses, receiving a full immersion experience to increase their memory and concentration abilities to a greatest extent. In POMY, learners can have free conve...
Conference Paper
The demand for computer-assisted language learning systems that can provide corrective feedback on language learners' speaking has increased. However, it is not a trivial task to detect grammatical errors in oral conversations because of the unavoidable errors of automatic speech recognition systems. To provide corrective feedback, a novel method t...
Article
Full-text available
Although there have been enormous investments into English education all around the world, not many differences have been made to change the English instruction style. Considering the shortcomings for the current teaching-learning methodology, we have been investigating advanced computer-assisted language learning (CALL) systems. This paper aims at...
Book
Spoken Dialogue Systems Technology and Design covers all key topics in spoken language dialogue interaction, through perspectives from a variety of leading researchers. This volume brings together valuable information in the areas of spoken dialogue analysis, processing emotions in dialogue, multimodality, as well as resources and evaluation. The b...
Article
Spoken language understanding (SLU) has received recent interest as a component of spoken dialog systems to infer intentions of the speaker and to provide natural human-centric interfaces for ambient intelligence. The goal of SLU is to map natural language speech onto a frame structure that encodes its meaning. While most approaches based on rule-...
Article
Full-text available
This paper proposes a method to model confirmations for example-based dialog management. To enable the system to provide a confirmation to the user in an appropriate time, we employed a multiple dialog state representation approach for keeping track of user input uncertainty and implemented a confirmation agent which decides when the information ga...
Article
Spoken dialog systems have difficulty selecting which action to take in a given situation because recognition and understanding errors are prevalent due to noise and unexpected inputs. To solve this problem, this paper presents a hybrid approach to improving robustness of the dialog manager by using agenda-based and example-based dialog modeling. T...
Conference Paper
One of the enduring problems in developing high-quality TTS (text-to-speech) system is pitch contour generation. Considering language specific knowledge, an adjusted Fujisaki model for Korean TTS system is introduced along with refined machine learning features. The results of quantitative and qualitative evaluations show the validity of our system...
Article
Full-text available
The field of spoken dialog systems is a rapidly growing research area because the performance improvement of speech technologies motivates the possibility of building systems that a human can easily operate in order to access useful information via spoken languages. Among the components in a spoken dialog system, the dialog management plays major rol...
Conference Paper
Full-text available
Query relaxation refers to the process of reducing the number of constraints on a query if it returns no result when searching a database. This is an important process to enable extraction of an appropriate number of query results because queries that are too strictly constrained may return no result, whereas queries that are too loosely constraine...
Conference Paper
Full-text available
While extensive studies on relation extraction have been conducted in the last decade, statistical systems based on supervised learning are still limited because they require large amounts of training data to achieve high performance. In this paper, we develop a cross-lingual annotation projection method that leverages parallel corpora to bootstrap...
Conference Paper
Full-text available
One of the most effective ways to learn a language is to have a conversation with a native speaker. However, this is often very expensive. A good alternative is to use Dialog-Based Computer Assisted Language Learning (DB-CALL) systems. The feedback quality in DB-CALL systems is very important. Therefore, to provide various expressions as feedback...
Conference Paper
Full-text available
In order to facilitate language acquisition, when language learners speak incomprehensible utterances, a Dialog-based Computer Assisted Language Learning (DB-CALL) system should provide matching fluent utterances by inferring the actual learner's intention both from the utterance itself and from the dialog context as human tutors do. We propose a h...
Conference Paper
Full-text available
Recently, various data-driven spoken language technologies have been applied to spoken dialog system development. However, the high cost of maintaining spoken dialog systems is one of the biggest challenges. In addition, a fixed corpus collected by humans is never enough to cover the diverse utterances of real users. The concept of a daydreaming dialog sys...
Article
Spoken language understanding (SLU) aims to map a user's speech into a semantic frame. Since most of the previous works use the semantic structures for SLU, we verify that the structure is useful even for noisy input. We apply a structured prediction method to SLU problem and compare it to an unstructured one. In addition, we present a combined met...
Article
This paper proposes a novel integrated dialog simulation technique for evaluating spoken dialog systems. A data-driven user simulation technique for simulating user intention and utterance is introduced. A novel user intention modeling and generating method is proposed that uses a linear-chain conditional random field, and a two-phase data-driven d...
Article
This paper presents a data-driven Korean grapheme-to-phoneme conversion method including alignment, rule extraction, and rule pruning procedures. Novel rule extraction and pruning techniques are introduced to effectively handle the exceptional pronunciation of speech databases. The performances with the full rules and the reduced rules are 99.22% a...
Conference Paper
Full-text available
In the grapheme to phoneme conversion problem for Korean, two main approaches have been discussed: knowledge-based and data-driven methods. However, both camps have limitations: the knowledge-based hand-written rules cannot handle some of the pronunciation changes due to the lack of capability of linguistic analyzers and many exceptions; data-d...
Article
This paper proposes a generic dialog modeling framework for a multi-domain dialog system to simultaneously manage goal-oriented and chat dialogs for both information access and entertainment. We developed a dialog modeling technique using an example-based approach to implement multiple applications such as car navigation, weather information, TV pr...
Article
This paper addresses the problem of multi-domain spoken language understanding (SLU) where domain detection and domain-dependent semantic tagging problems are combined. We present a transfer learning approach to the multi-domain SLU problem in which multiple domain-specific data sources can be incorporated. To implement multi-domain SLU with transf...
Conference Paper
Full-text available
Various knowledge sources are used for spoken dialog systems such as task model, domain model, and agenda. An agenda graph is one of the knowledge sources for a dialog management to reflect a discourse structure. This paper proposes a clustering and linking method to automatically construct an agenda graph from human-human dialogs. Preliminar...
Conference Paper
Full-text available
This paper presents a new soft pattern matching method which aims to improve the recall with minimized precision loss in information extraction tasks. Our approach is based on a local tree alignment algorithm, and an effective strategy for controlling flexibility of the pattern matching will be presented. The experimental results show that th...
Conference Paper
Full-text available
This paper proposes a novel user intention simulation method which is a data-driven approach but able to integrate diverse user discourse knowledge together to simulate various types of users. In Markov logic framework, logistic regression based data-driven user intention modeling is introduced, and human dialog knowledge are designed into...
Conference Paper
Full-text available
This paper presents an efficient inference algorithm of conditional random fields (CRFs) for large-scale data. Our key idea is to decompose the output label state into an active set and an inactive set in which most unsupported transitions become a constant. Our method unifies two previous methods for efficient inference of CRFs, and also d...
Conference Paper
Full-text available
The development of Dialog-Based Computer-Assisted Language Learning (DB-CALL) systems requires research on the simulation of language learners. This paper presents a new method for generation of grammar errors, an important part of the language learner simulator. Realistic errors are generated via Markov Logic, which provides an effective way...
Conference Paper
Full-text available
In this paper, we present a semi-supervised method for automatic speech act recognition in email and forums. The major challenge of this task is due to lack of labeled data in these two genres. Our method leverages labeled data in the Switchboard-DAMSL and the Meeting Recorder Dialog Act database and applies simple domain adaptation techni...
Article
Sequential modeling is a fundamental task in scientific fields, especially in speech and natural language processing, where many problems of sequential data can be cast as a sequential labeling or a sequence classification. In many applications, the two problems are often correlated, for example named entity recognition and dialog act classificatio...
Conference Paper
Associative classification is a novel and powerful method originating from association rule mining. In the previous studies, a relatively small number of high-quality association rules were used in the prediction. We propose a new approach in which a large number of association rules are generated. Then, the rules are filtered using a new method wh...
Conference Paper
Full-text available
Many natural language dialog systems have been developed with relational database (RDB) as a machine-readable knowledge source. However, RDB has some problems for answering the questions which need complex domain-specific information. In addition to the typical problems of RDB such as dependency and redundancy problems, limitations of meaning repre...
Article
Recently, data-driven speech technologies have been widely used to build speech user interfaces. However, developing and managing data-driven spoken dialog systems are laborious and time consuming tasks. Spoken dialog systems have many components and their development and management involves numerous tasks such as preparing the corpus, training, te...
Conference Paper
Full-text available
Google entered China market as a late-comer in late-2005, with no local employees, an inadequate product line, and small market share. This talk will discuss Google China's efforts to build up a team, learn about local user needs, apply its global innovation ...
Article
Spoken language understanding (SLU) addresses the problem of mapping natural language speech to frame structure encoding of its meaning. The statistical sequential labeling method has been successfully applied to SLU tasks; however, most sequential labeling approaches lack long-distance dependency information handling method. In this paper, we expl...
Article
Full-text available
Spoken dialog tasks incur many errors including speech recognition errors, understanding errors, and even dialog management errors. These errors create a big gap between the user's intention and the system's understanding, which eventually results in a misinterpretation. To fill in the gap, people in human-to-human dialogs try to clarify the major...
Article
The main issues of practical spoken-language applications for human-computer interface are how to overcome speech recognition errors and guarantee the reasonable end-performance of spoken-language applications. Therefore, handling the erroneously recognized outputs is a key in developing robust spoken-language systems. To address this problem, we presen...
Conference Paper
Full-text available
We present an alignment-based approach to semi-supervised relation extraction task including more than two arguments. We concentrate on improving not only the precision of the extracted result, but also on the coverage of the method. Our relation extraction method is based on an alignment-based pattern matching approach which provides more flexibi...
Conference Paper
Error handling has become an important issue in spoken dialog systems. We describe an example-based approach to detect and repair errors in an example-based dialog modeling framework. Our approach to error recovery is focused on the re-phrase strategy with a system and a task guidance to help the novice users to re-phrase well-recognizable and well...
Conference Paper
Full-text available
This work presents an agenda-based approach to improve the robustness of the dialog manager by using dialog examples and n-best recognition hypotheses. This approach supports n-best hypotheses in the dialog manager and keeps track of the dialog state using a discourse interpretation algorithm with the agenda graph and focus stack. Given the...
Article
Full-text available
This paper proposes a probabilistic framework for spoken dialog management using dialog examples. To overcome the complexity problems of the classic partially observable Markov decision processes (POMDPs) based dialog manager, we use a frame-based belief state representation that reduces the complexity of belief update. We also used dialog ex...
