Conference Proceeding
Improvements to the Sequence Memoizer.
01/2010; In proceeding of: Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, Vancouver, British Columbia, Canada.
Source: DBLP
 Citations (13)

Conference Proceeding: A Neural Probabilistic Language Model.
ABSTRACT: A goal of statistical language modeling is to learn the joint probability function of sequences of words. This is intrinsically difficult because of the curse of dimensionality: we propose to fight it with its own weapons. In the proposed approach one learns simultaneously (1) a distributed representation for each word (i.e. a similarity between words) along with (2) the probability function for word sequences, expressed with these representations. Generalization is obtained because a sequence of words that has never been seen before gets high probability if it is made of words that are similar to words forming an already seen sentence. We report on experiments using neural networks for the probability function, showing on two text corpora that the proposed approach very significantly improves on a state-of-the-art trigram model.
Advances in Neural Information Processing Systems 13, Papers from Neural Information Processing Systems (NIPS) 2000, Denver, CO, USA; 01/2000
Conference Proceeding: A Note on the Implementation of Hierarchical Dirichlet Processes.
ABSTRACT: The implementation of collapsed Gibbs samplers for nonparametric Bayesian models is nontrivial, requiring considerable bookkeeping. Goldwater et al. (2006a) presented an approximation which significantly reduces the storage and computation overhead, but we show here that their formulation was incorrect and, even after correction, is grossly inaccurate. We present an alternative formulation which is exact and can be computed easily. However this approach does not work for hierarchical models, for which case we present an efficient data structure which has a better space complexity than the naive approach.
ACL 2009, Proceedings of the 47th Annual Meeting of the Association for Computational Linguistics and the 4th International Joint Conference on Natural Language Processing of the AFNLP, 2-7 August 2009, Singapore, Short Papers; 01/2009
Article: Hierarchical Dirichlet processes
ABSTRACT: We consider problems involving groups of data where each observation within a group is a draw from a mixture model and where it is desirable to share mixture components between groups. We assume that the number of mixture components is unknown a priori and is to be inferred from the data. In this setting it is natural to consider sets of Dirichlet processes, one for each group, where the well-known clustering property of the Dirichlet process provides a nonparametric prior for the number of mixture components within each group. Given our desire to tie the mixture models in the various groups, we consider a hierarchical model, specifically one in which the base measure for the child Dirichlet processes is itself distributed according to a Dirichlet process. Such a base measure being discrete, the child Dirichlet processes necessarily share atoms. Thus, as desired, the mixture models in the different groups necessarily share mixture components. We discuss representations of hierarchical Dirichlet processes in terms of a stick-breaking process, and a generalization of the Chinese restaurant process that we refer to as the "Chinese restaurant franchise." We present Markov chain Monte Carlo algorithms for posterior inference in hierarchical Dirichlet process mixtures and describe applications to problems in information retrieval and text modeling.
Teh, Y.W. and Jordan, M.I. and Beal, M.J. and Blei, D.M. (2006) Hierarchical Dirichlet processes. Journal of the American Statistical Association, 101 (476). pp. 1566-1581. ISSN 0162-1459. 01/2006;