Conference Paper

Parser Combination by Reparsing

DOI: 10.3115/1614049.1614082 Conference: Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, Proceedings, June 4-9, 2006, New York, New York, USA
Source: DBLP


We present a novel parser combination scheme that works by reparsing input sentences once they have already been parsed by several different parsers. We apply this idea to dependency and constituent parsing, generating results that surpass state-of-the-art accuracy levels for individual parsers.
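
For dependency parsing, the combination idea can be sketched as weighted voting over head assignments. The sketch below is a simplification with hypothetical parser outputs: it takes a per-word argmax over votes, whereas the paper's reparsing step searches for a maximum spanning tree over the vote-weighted graph, which guarantees a well-formed dependency tree.

```python
from collections import Counter

def combine_dependencies(parses):
    """Combine head assignments from several parsers by voting.

    Each parse is a list of head indices: parse[i] is the head of
    word i (0 denotes the root). NOTE: this greedy per-word argmax
    is only a sketch; the actual reparsing method extracts a
    maximum spanning tree from the vote weights so the output is
    guaranteed to be a tree.
    """
    n = len(parses[0])
    combined = []
    for i in range(n):
        votes = Counter(parse[i] for parse in parses)
        combined.append(votes.most_common(1)[0][0])
    return combined

# Three hypothetical parser outputs for a 4-word sentence.
p1 = [2, 0, 2, 3]
p2 = [2, 0, 2, 2]
p3 = [3, 0, 2, 3]
print(combine_dependencies([p1, p2, p3]))  # [2, 0, 2, 3]
```

Each word's head is decided independently here; disagreements (words 1 and 4 above) are settled by majority, with ties broken arbitrarily.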

    • "It has been long identified in NLP that a diverse set of solutions from a decoder can be reranked or recombined in order to improve the accuracy in various problems (Henderson and Brill, 1998). Such problems include machine translation (Macherey and Och, 2007), syntactic parsing (Charniak and Johnson, 2005; Sagae and Lavie, 2006; Fossum and Knight, 2009; Zhang et al., 2009; Petrov, 2010) and others (Van Halteren et al., 2001). "
    ABSTRACT: We describe an approach to incorporate diversity into spectral learning of latent-variable PCFGs (L-PCFGs). Our approach works by creating multiple spectral models where noise is added to the underlying features in the training set before the estimation of each model. We describe three ways to decode with multiple models. In addition, we describe a simple variant of the spectral algorithm for L-PCFGs that is fast and leads to compact models. Our experiments for natural language parsing, for English and German, show that we get a significant improvement over baselines comparable to state of the art. For English, we achieve the $F_1$ score of 90.18, and for German we achieve the $F_1$ score of 83.38.
    • "To our knowledge, the first works on predicting both MWEs and dependency trees are those presented to the SPMRL 2013 Shared Task that provided scores for French (which is the only dataset containing MWEs). Constant et al. (2013) proposed to combine pipeline and joint systems in a reparser (Sagae and Lavie, 2006), and ranked first at the Shared Task. Our contribution with respect to that work is the representation of the internal syntactic structure of MWEs, and use of MWE-specific features for the joint system. "
    ABSTRACT: In this paper, we investigate various strategies to predict both syntactic dependency parsing and contiguous multiword expression (MWE) recognition, testing them on the dependency version of French Treebank (Abeillé and Barrier, 2004), as instantiated in the SPMRL Shared Task (Seddah et al., 2013). Our work focuses on using an alternative representation of syntactically regular MWEs, which captures their syntactic internal structure. We obtain a system with comparable performance to that of previous works on this dataset, but which predicts both syntactic dependencies and the internal structure of MWEs. This can be useful for capturing the various degrees of semantic compositionality of MWEs.
    Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers); 06/2014
    • "Regarding the system combination study, Henderson and Brill [8] proposed a parser constituent combination method: it breaks each parse tree into constituents, counts each constituent, then applies majority voting to decide which constituents appear in the final tree, reaching an F1-score of 90.6. Sagae and Lavie [9] improved on this approach by presenting a reparsing scheme that combines the parsers' results to produce output more accurate than the best individual parser available. They report an F1-score of 91.0. "
    ABSTRACT: We propose a simple approach for postprocessing Chinese syntactic parses. It uses verb subcategorization patterns to match the n-best candidate parse trees output by a baseline parser. We extract verb subcategorization features from the training corpus and use them to rerank the n-best list via a pattern-matching, rule-based method that uses no statistical information. We call this method rule-based reranking. Results show that our approach achieves good performance.
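
The constituent-voting step described in the excerpt above can be sketched as counting labeled spans across parsers and keeping the majority. This is only the counting-and-voting part with hypothetical inputs; a real combiner must additionally resolve crossing spans so the surviving constituents nest into a valid tree.

```python
from collections import Counter

def majority_constituents(parses, threshold=None):
    """Keep constituents proposed by a majority of parsers.

    Each parse is a set of (label, start, end) spans. NOTE: this
    sketch only performs the voting; it does not check that the
    kept spans are mutually compatible (non-crossing), which a
    full combination scheme must enforce.
    """
    if threshold is None:
        threshold = len(parses) / 2
    counts = Counter(span for parse in parses for span in set(parse))
    return {span for span, n in counts.items() if n > threshold}

# Hypothetical constituent sets from three parsers; they disagree
# only on the left boundary of the VP.
t1 = {("NP", 0, 2), ("VP", 2, 5), ("S", 0, 5)}
t2 = {("NP", 0, 2), ("VP", 3, 5), ("S", 0, 5)}
t3 = {("NP", 0, 2), ("VP", 2, 5), ("S", 0, 5)}
print(sorted(majority_constituents([t1, t2, t3])))
# [('NP', 0, 2), ('S', 0, 5), ('VP', 2, 5)]
```

With three parsers the threshold of "more than half" means a span must appear in at least two outputs, so the minority span ("VP", 3, 5) is dropped.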
