Wen-tau Yih

University of Illinois, Urbana-Champaign, Urbana, IL, USA

Publications (21) · 0 Total impact

  • Article: The Importance of Syntactic Parsing and Inference in Semantic Role Labeling.
    Vasin Punyakanok, Dan Roth, Wen-tau Yih
    Computational Linguistics. 01/2008; 34:257-287.
  • Conference Proceeding: Generalized Inference with Multiple Semantic Role Labeling Systems
    Proceedings of the Ninth Conference on Computational Natural Language Learning (CoNLL); 06/2005
  • Conference Proceeding: Integer linear programming inference for conditional random fields.
    Dan Roth, Wen-tau Yih
    Machine Learning, Proceedings of the Twenty-Second International Conference (ICML 2005), Bonn, Germany, August 7-11, 2005; 01/2005
  • Conference Proceeding: Demonstrating an Interactive Semantic Role Labeling System.
    HLT/EMNLP 2005, Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference, 6-8 October 2005, Vancouver, British Columbia, Canada; 01/2005
  • Conference Proceeding: The Necessity of Syntactic Parsing for Semantic Role Labeling.
    Vasin Punyakanok, Dan Roth, Wen-tau Yih
    IJCAI-05, Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence, Edinburgh, Scotland, UK, July 30-August 5, 2005; 01/2005
  • Conference Proceeding: Learning and Inference over Constrained Output.
    IJCAI-05, Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence, Edinburgh, Scotland, UK, July 30-August 5, 2005; 01/2005
  • Article: A Linear Programming Formulation for Global Inference in Natural
    Dan Roth, Wen-tau Yih
    ABSTRACT: Given a collection of discrete random variables representing outcomes of learned local predictors in natural language, e.g., named entities and relations, we seek an optimal global assignment to the variables in the presence of general (non-sequential) constraints. Examples of these constraints include the type of arguments a relation can take, and the mutual activity of different relations, etc. We develop a linear programming formulation for this problem and evaluate it in the context of simultaneously learning named entities and relations. Our approach allows us to efficiently incorporate domain and task specific constraints at decision time, resulting in significant improvements in the accuracy and the "human-like" quality of the inferences.
    06/2004;
  • Article: Semantic Role Labeling Via Generalized Inference Over Classifiers
    ABSTRACT: We present a system submitted to the CoNLL2004 shared task for semantic role labeling.
    04/2004;
  • Article: Probabilistic Reasoning for Entity Relation Recognition
    Dan Roth, Wen-tau Yih
    ABSTRACT: This paper develops a method for recognizing relations and entities in sentences, while taking mutual dependencies among them into account. E.g., the kill (Johns, Oswald) relation in: "J. V. Oswald was murdered at JFK after his assassin, K. F. Johns..." depends on identifying Oswald and Johns as people, JFK being identified as a location, and the kill relation between Oswald and Johns; this, in turn, enforces that Oswald and Johns are people.
    03/2004;
  • Article: A Linear Programming Formulation for Global Inference in Natural Language Tasks
    Dan Roth, Wen-tau Yih
    ABSTRACT: The typical processing paradigm in natural language processing is the "pipeline" approach, where learners are being used at one level, their outcomes are being used as features for a second level of predictions and so on. In addition to accumulating errors, it is clear that the sequential processing is a crude approximation to a process in which interactions occur across levels and downstream decisions often interact with previous decisions.
    01/2004;
  • Article: Mapping Dependencies Trees: An Application to Question Answering
    Vasin Punyakanok, Dan Roth, Wen-tau Yih
    ABSTRACT: We describe an approach for answer selection in a free form question answering task. In order to go beyond the key-word based matching in selecting answers to questions, one would like to incorporate both syntactic and semantic information in the question answering process. We achieve this goal by representing both questions and candidate passages using dependency trees, and incorporating semantic information such as named entities in this representation. The sentence that best answers a question is determined to be the one that minimizes the generalized edit distance between it and the question tree, computed via an approximate tree matching algorithm. We evaluate the approach on question-answer pairs taken from previous TREC Q/A competitions. Preliminary experiments show its potential by significantly outperforming common bag-of-word scoring methods.
    01/2004;
  • Article: Question-Answering via Enhanced Understanding of Questions
    ABSTRACT: [Figure 1: System Architecture. Component labels recoverable from the figure: Preprocessing, POS Tagger, Shallow Parser, NE Tagger, Question Classification, Abstract Representation Construction, Relation Extraction, Term Selection and Query Formulation, Passage Retrieval, Indexed Documents, Knowledge Base, Rule-Based R/E, Extracting Satisfaction, Answer Selection. The output is the answer phrase with the top score, e.g. "Answer: 1559 NYT19990902.0264 Lincoln Memorial".] NE example: Who was the [Num first] woman killed in the [Event Vietnam War]? Like the shallow parser, the named entity recognition process centers around the SNoW based CSCL [13], with the addition of some predefined lists for some of the semantic categories. One of the major setbacks in the development of this tool was a lack of sufficient training data. Since our decision process crucially depends on this categorization process, both in question classification and answer selection, we are planning to work on improving the accuracy of this tool.
    09/2003;
  • Article: Learning Components for A Question-Answering System
    ABSTRACT: We describe a machine learning approach to the development of several key components in a question answering system and the way they were used in the UIUC QA system.
    06/2002;
  • Article: Relational Learning via Propositional Algorithms: An Information Extraction Case Study
    Dan Roth, Wen-tau Yih
    ABSTRACT: This paper develops a new paradigm for relational learning which allows for the representation and learning of relational information using propositional means. This paradigm suggests different tradeoffs than those in the traditional approach to this problem -- the ILP approach -- and as a result it enjoys several significant advantages over it. In particular, the new paradigm is more flexible and allows the use of any propositional algorithm, including probabilistic algorithms, within it. We evaluate the new approach on an important and relation-intensive task -- Information Extraction -- and show that it outperforms existing methods while being orders of magnitude more efficient.
    05/2001;
  • Conference Proceeding: Learning Components for A Question-Answering System.
    01/2001
  • Article: Inference & Learning with Linear Constraints
    ABSTRACT: We present a discriminatory learning framework for the problem of assigning globally optimal values to a set of variables with complex and expressive dependencies among them. The problem is modeled as an integer linear program (ILP) where the cost values associated with the variables are represented and trained as linear classifiers. The framework unifies and extends several existing discriminatory approaches; most importantly, it supports more complex dependencies among variables than existing ones. This presentation concentrates on the benefits of the additional expressivity and on comparing different training paradigms – with and without global feedback – in the context of semantic role labeling.
  • Article: Semantic role labeling via integer linear programming inference
    ABSTRACT: We present a system for the semantic role labeling task. The system combines a machine learning technique with an inference procedure based on integer linear programming that supports the incorporation of linguistic and structural constraints into the decision process. The system is tested on the data provided in the CoNLL-2004 shared task on semantic role labeling and achieves very competitive results.
  • Article: Learning via inference over structurally constrained output
    ABSTRACT: We experimentally analyze learning structured output in a discriminative framework where values of the output variables are estimated by local classifiers. In this framework, complex dependencies among the output variables are captured by constraints that dictate how global labels can be inferred. We compare two strategies, learning plus inference and inference based training, by observing their behaviors in different conditions. We conclude that using inference during learning helps when the local classifiers are difficult to learn but requires more examples.
  • Article: Global inference for entity and relation identification via a linear programming formulation
    Dan Roth, Wen-Tau Yih
    ABSTRACT: Natural language decisions often involve assigning values to sets of variables, representing low level decisions and context dependent disambiguation. In most cases there are complex relationships among these variables representing dependencies that range from simple statistical correlations to those that are constrained by deeper structural, relational and semantic properties of the text. In this work we study a specific instantiation of this problem in the context of identifying named entities and relations between them in free form text. Given a collection of discrete random variables representing outcomes of learned local predictors for entities and relations, we seek an optimal global assignment to the variables that respects multiple constraints, including constraints on the type of arguments a relation can take, and the mutual activity of different relations. We develop a linear programming formulation to address this global inference problem and evaluate it in the context of simultaneously learning named entities and relations. We show that global inference improves stand-alone learning; in addition, our approach allows us to efficiently incorporate expressive domain and task specific constraints at decision time, resulting, beyond significant improvements in the accuracy, in "coherent" quality of the inference.