Conference Paper

Nonparametric Bayesian Models for Unsupervised Event Coreference Resolution.

Conference: Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, Vancouver, British Columbia, Canada.
Source: DBLP

ABSTRACT We present a sequence of unsupervised, nonparametric Bayesian models for clustering complex linguistic objects. In this approach, we consider a potentially infinite number of features and categorical outcomes. We evaluated these models for the task of within- and cross-document event coreference on two corpora. All the models we investigated show significant improvements when compared against an existing baseline for this task.
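The "potentially infinite number of categorical outcomes" in the abstract refers to the standard nonparametric Bayesian device of letting the number of clusters grow with the data rather than fixing it in advance, as in the Chinese restaurant process (CRP). The sketch below is not the paper's model; it is a minimal illustration of that device, with a hypothetical `crp_partition` helper: each item joins an existing cluster with probability proportional to its current size, or opens a new cluster with probability proportional to a concentration parameter `alpha`.

```python
import random

def crp_partition(n_items, alpha, seed=0):
    """Illustrative CRP draw (not the paper's model): item i joins
    existing cluster k with probability counts[k] / (i + alpha),
    or a new cluster with probability alpha / (i + alpha)."""
    rng = random.Random(seed)
    counts = []        # counts[k] = current size of cluster k
    assignments = []
    for i in range(n_items):
        r = rng.uniform(0.0, i + alpha)
        chosen = len(counts)          # default: open a new cluster
        acc = 0.0
        for k, c in enumerate(counts):
            acc += c
            if r < acc:
                chosen = k
                break
        if chosen == len(counts):
            counts.append(1)
        else:
            counts[chosen] += 1
        assignments.append(chosen)
    return assignments

clusters = crp_partition(20, alpha=1.0)
print("items:", clusters)
```

Note that the number of clusters is an outcome of the draw, not an input: larger `alpha` tends to produce more clusters, which is what makes such priors suitable when the true number of event clusters is unknown.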

