
Kai Chen - Google Inc.
About
8 Publications · 94,204 Reads
90,706 Citations
Current institution: Google Inc.
Publications (8)
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing numeric representations of words. One of the methods includes obtaining a set of training data, wherein the set of training data comprises sequences of words; training a classifier and an embedding function on the set of training data, wher...
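A minimal sketch of the joint training step this abstract describes, in which an embedding function (a lookup table) and a classifier are trained together on word sequences; the toy corpus, dimensions, and softmax classifier are illustrative assumptions, not details from the patent.

```python
# Hedged sketch: jointly train an embedding lookup table and a softmax
# classifier to predict the next word in a sequence. Data and sizes are
# toy assumptions for illustration only.
import numpy as np

rng = np.random.default_rng(0)
sentences = [["the", "cat", "sat"], ["the", "dog", "sat"]]
vocab = sorted({w for s in sentences for w in s})
idx = {w: i for i, w in enumerate(vocab)}
V, D = len(vocab), 8

E = rng.normal(scale=0.1, size=(V, D))   # embedding function (one row per word)
W = rng.normal(scale=0.1, size=(D, V))   # classifier weights over the vocabulary

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

lr = 0.1
for _ in range(50):
    for sent in sentences:
        for center, target in zip(sent[:-1], sent[1:]):
            c, t = idx[center], idx[target]
            h = E[c]                       # embed the input word
            p = softmax(h @ W)             # classify: distribution over the next word
            g = p.copy()
            g[t] -= 1.0                    # cross-entropy gradient w.r.t. the logits
            E[c] -= lr * (W @ g)           # update the embedding
            W -= lr * np.outer(h, g)       # update the classifier
```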
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a model using parameter server shards. One of the methods includes receiving, at a parameter server shard configured to maintain values of a disjoint partition of the parameters of the model, a succession of respective requests for parameter...
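A hedged sketch of the pattern the abstract names: a parameter server shard that maintains a disjoint partition of the model's parameters and answers a succession of requests from workers. Class and method names are illustrative, not taken from the patent.

```python
# Illustrative (assumed) names: ParameterShard, pull, push_gradient.
import numpy as np

class ParameterShard:
    # Owns a disjoint partition of the model parameters.
    def __init__(self, names, dim, lr=0.01):
        rng = np.random.default_rng(0)
        self.params = {n: rng.normal(size=dim) for n in names}
        self.lr = lr

    def pull(self, name):
        # respond to a request for current parameter values
        return self.params[name].copy()

    def push_gradient(self, name, grad):
        # apply a gradient update to the owned partition
        self.params[name] -= self.lr * grad

# Parameters are partitioned across shards; a worker only contacts the
# shard that owns the parameters it needs.
shards = [ParameterShard(["w0", "w1"], dim=4), ParameterShard(["w2"], dim=4)]
owner = {"w0": 0, "w1": 0, "w2": 1}

w2 = shards[owner["w2"]].pull("w2")
shards[owner["w2"]].push_gradient("w2", grad=0.1 * w2)
```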
The recently introduced continuous Skip-gram model is an efficient method for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships. In this paper we present several extensions that improve both the quality of the vectors and the training speed. By subsampling of th...
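Where the abstract breaks off, the paper subsamples frequent words: each occurrence of a word is discarded with a probability that grows with its corpus frequency. A small sketch of that rule, with the corpus and threshold chosen only for this illustration (real corpora use much smaller thresholds):

```python
# Sketch of frequent-word subsampling: keep an occurrence of w with
# probability sqrt(t / f(w)), i.e. discard it with probability 1 - sqrt(t / f(w)),
# where f(w) is the relative frequency of w and t a small threshold.
# The corpus and t below are toy assumptions.
import math, random
from collections import Counter

corpus = "the cat sat on the mat the cat ate the fish".split()
t = 0.05
freq = Counter(corpus)
total = len(corpus)

def keep(word):
    f = freq[word] / total
    return random.random() < math.sqrt(t / f)   # frequent words are dropped more often

subsampled = [w for w in corpus if keep(w)]
```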
We propose two novel model architectures for computing continuous vector representations of words from very large data sets. The quality of these representations is measured in a word similarity task, and the results are compared to the previously best performing techniques based on different types of neural networks. We observe large improvements...
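The two architectures this abstract proposes, commonly referred to as the continuous bag-of-words and continuous Skip-gram models, can be contrasted in a few lines: one predicts a word from the average of its context embeddings, the other predicts each context word from the center word. A toy-scale sketch of each forward pass; shapes and data are assumptions.

```python
# Toy contrast of the two architectures' forward passes.
import numpy as np

V, D = 10, 4                      # vocabulary size and embedding dimension (assumed)
rng = np.random.default_rng(0)
E = rng.normal(size=(V, D))       # input embeddings
W = rng.normal(size=(D, V))       # output (softmax) weights

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

context, center = [1, 2, 4, 5], 3

# Continuous bag-of-words: context -> center word
h_cbow = E[context].mean(axis=0)
p_center = softmax(h_cbow @ W)

# Continuous Skip-gram: center word -> each context word
h_sg = E[center]
p_context = softmax(h_sg @ W)     # the same distribution scores every context position
```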
The recently introduced continuous Skip-gram model is an efficient method for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships. In this paper we present several extensions that improve both the quality of the vectors and the training speed. By subsampling of...
We propose two novel model architectures for computing continuous vector representations of words from very large data sets. The quality of these representations is measured in a word similarity task, and the results are compared to the previously best performing techniques based on different types of neural networks. We observe large improvements...
Recent work in unsupervised feature learning and deep learning has shown that being able to train large models can dramatically improve performance. In this paper, we consider the problem of training a deep network with billions of parameters using tens of thousands of CPU cores. We have developed a software framework called DistBelief that can ut...
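A hedged, single-process sketch of the data-parallel idea behind a framework like the one described: several model replicas each work through their own shard of the data and apply gradient updates to shared parameters without waiting on one another. The toy least-squares model and all names are assumptions, not the DistBelief implementation.

```python
# Simulated data-parallel, asynchronous-style SGD on a toy least-squares model.
import numpy as np

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])

# Four data shards, one per simulated model replica.
shards = []
for _ in range(4):
    X = rng.normal(size=(50, 2))
    shards.append((X, X @ true_w + 0.01 * rng.normal(size=50)))

shared_w = np.zeros(2)            # parameters shared by all replicas
lr = 0.05

for step in range(200):
    for X, y in shards:           # each "replica" samples from its own shard
        i = rng.integers(len(y))
        x_i, y_i = X[i], y[i]
        grad = 2.0 * (x_i @ shared_w - y_i) * x_i
        shared_w -= lr * grad     # applied immediately; no barrier across replicas

print(shared_w)                   # approaches true_w
```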
We consider the problem of building high-level, class-specific feature detectors from only unlabeled data. For example, is it possible to learn a face detector using only unlabeled images? To answer this, we train a 9-layered locally connected sparse autoencoder with pooling and local contrast normalization on a large dataset of images (the model...
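The building block named in this abstract is a sparse autoencoder; a minimal sketch of one follows, reconstructing its input through a small hidden layer while an L1 penalty keeps the hidden activations sparse. The dimensions, ReLU activation, L1 penalty, and random data are illustrative assumptions, and the sketch omits the pooling, local contrast normalization, and local connectivity the paper trains at scale.

```python
# Minimal sparse autoencoder sketch: reconstruction loss plus L1 sparsity on h.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))        # stand-in for image patches (assumed data)
n_in, n_hid = 16, 8
W1 = 0.1 * rng.normal(size=(n_in, n_hid))
W2 = 0.1 * rng.normal(size=(n_hid, n_in))
lr, lam = 0.01, 1e-3                  # learning rate and sparsity weight

for _ in range(100):
    for x in X:
        h = np.maximum(0.0, x @ W1)   # encoder with ReLU nonlinearity
        x_hat = h @ W2                # linear decoder
        err = x_hat - x               # reconstruction error
        gW2 = np.outer(h, err)
        gh = err @ W2.T + lam * np.sign(h)   # add gradient of the L1 penalty
        gh[h <= 0.0] = 0.0                   # gate through the ReLU
        gW1 = np.outer(x, gh)
        W1 -= lr * gW1
        W2 -= lr * gW2
```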