G.s. Corrado
Google Inc. | Google · Machine Intelligence
PhD
49 Publications
131,986 Reads
129,549 Citations
Publications (49)
Recent work in unsupervised feature learning and deep learning has shown that being able to train large models can dramatically improve performance. In this paper, we consider the problem of training a deep network with billions of parameters using tens of thousands of CPU cores. We have developed a software framework called DistBelief that can ut...
The recently introduced continuous Skip-gram model is an efficient method for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships. In this paper we present several extensions that improve both the quality of the vectors and the training speed. By subsampling of th...
We propose two novel model architectures for computing continuous vector representations of words from very large data sets. The quality of these representations is measured in a word similarity task, and the results are compared to the previously best performing techniques based on different types of neural networks. We observe large improvements...
Model explanation techniques play a critical role in understanding the source of a model's performance and making its decisions transparent. Here we investigate if explanation techniques can also be used as a mechanism for scientific discovery. We make three contributions: first, we propose a framework to convert predictions from explanation techni...
TensorFlow.js is a library for building and executing machine learning algorithms in JavaScript. TensorFlow.js models run in a web browser and in the Node.js environment. The library is part of the TensorFlow ecosystem, providing a set of APIs that are compatible with those in Python, allowing models to be ported between the Python and JavaScript e...
Each year, the treatment decisions for more than 230,000 breast cancer patients in the U.S. hinge on whether the cancer has metastasized away from the breast. Metastasis detection is currently performed by pathologists reviewing large expanses of biological tissues. This process is labor intensive and error-prone. We present a framework to automati...
We propose a simple, elegant solution to use a single Neural Machine Translation (NMT) model to translate between multiple languages. Our solution requires no change in the model architecture from our base system but instead introduces an artificial token at the beginning of the input sentence to specify the required target language. The rest of th...
Neural Machine Translation (NMT) is an end-to-end learning approach for automated translation, with the potential to overcome many of the weaknesses of conventional phrase-based translation systems. Unfortunately, NMT systems are known to be computationally expensive both in training and in translation inference. Also, most NMT systems have difficu...
Generalized linear models with nonlinear feature transformations are widely used for large-scale regression and classification problems with sparse inputs. Memorization of feature interactions through a wide set of cross-product feature transformations is effective and interpretable, while generalization requires more feature engineering effort. W...
In this paper we propose and investigate a novel end-to-end method for automatically generating short email responses, called Smart Reply. It generates semantically diverse suggestions that can be used as complete email responses with just one tap on mobile. The system is currently used in Inbox by Gmail and is responsible for assisting with 10% of...
TensorFlow is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms. A computation expressed using TensorFlow can be executed with little or no change on a wide variety of heterogeneous systems, ranging from mobile devices such as phones and tablets up to large-scale distributed systems of hund...
A method, computer readable storage device, and apparatus for determining the distance a computing device is located from a user's face. An image of an individual is obtained. A first pupil location and a second pupil location are identified based on the obtained image. A first distance between the identified first and second pupil location is dete...
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing numeric representations of words. One of the methods includes obtaining a set of training data, wherein the set of training data comprises sequences of words; training a classifier and an embedding function on the set of training data, wher...
TensorFlow [1] is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms. A computation expressed using TensorFlow can be executed with little or no change on a wide variety of heterogeneous systems, ranging from mobile devices such as phones and tablets up to large-scale distributed systems of...
We introduce BilBOWA ("Bilingual Bag-of-Words without Alignments"), a simple and computationally-efficient model for learning bilingual distributed representations of words which can scale to large datasets and does not require word-aligned training data. Instead it trains directly on monolingual data and extracts a bilingual signal from a smaller...
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a model using parameter server shards. One of the methods includes receiving, at a parameter server shard configured to maintain values of a disjoint partition of the parameters of the model, a succession of respective requests for parameter...
Several recent publications have proposed methods for mapping images into continuous semantic embedding spaces. In some cases the semantic embedding space is trained jointly with the image transformation, while in other cases the semantic embedding space is established independently by a separate task, such as a natural language processing task on...
In general, the subject matter described in this specification can be embodied in methods, systems, and program products. A computing system presents graphical content on a display device. The computing system determines a change in distance between a user of the computing system and a camera by tracking a visible physical feature of the user throu...
Modern visual recognition systems are often limited in their ability to scale to large numbers of object categories. This limitation is in part due to the increasing difficulty of acquiring sufficient training data in the form of labeled images as the number of object categories grows. One remedy is to leverage data from other sources - such as tex...
The recently introduced continuous Skip-gram model is an efficient method for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships. In this paper we present several extensions that improve both the quality of the vectors and the training speed. By subsampling of...
We propose two novel model architectures for computing continuous vector representations of words from very large data sets. The quality of these representations is measured in a word similarity task, and the results are compared to the previously best performing techniques based on different types of neural networks. We observe large improvements...
We consider three hypotheses concerning the primate neocortex which have influenced computational neuroscience in recent years. Is the mind modular in terms of its being profitably described as a collection of relatively independent functional units? Does the regular structure of the cortex imply a single algorithm at work, operating on many differ...
We consider the problem of building high-level, class-specific feature detectors from only unlabeled data. For example, is it possible to learn a face detector using only unlabeled images? To answer this, we train a 9-layered locally connected sparse autoencoder with pooling and local contrast normalization on a large dataset of images (the model...
Neural responses are typically characterized by computing the mean firing rate, but response variability can exist across trials. Many studies have examined the effect of a stimulus on the mean response, but few have examined the effect on response variability. We measured neural variability in 13 extracellularly recorded datasets and one intracell...
This chapter focuses on the challenge of studying the neurobiology of decision-making. Establishing causal links between neural responses and perceptual or cognitive phenomena is a fundamental challenge faced by researchers not only in neuroeconomics, but also in all of cognitive neuroscience. Historically, support for links between anatomy and functio...
We present a new approach to learning sparse, spatiotemporal codes in which the number of basis vectors, their orientations, velocities and the size of their receptive fields change over the duration of unsupervised training. The algorithm starts with a relatively small, initial basis with minimal temporal extent. This initial basis is obtained thr...
The study of decision making poses new methodological challenges for systems neuroscience. Whereas our traditional approach linked neural activity to external variables that the experimenter directly observed and manipulated, many of the key elements that contribute to decisions are internal to the decider. Variables such as subjective value or sub...
We have measured the transverse asymmetry A_T' in the quasielastic ^3He(e,e') process with high precision at Q^2 values from 0.1 to 0.6 (GeV/c)^2. The neutron magnetic form factor G_M^n was extracted at Q^2 values of 0.1 and 0.2 (GeV/c)^2 using a nonrelativistic Faddeev calculation which includes both final-state interactions (FSI) and meson-exchange current...
Archival paper, 17 pages, 10 figures, 5 tables, submitted to Physical Review C. v2: shortened considerably, updated comparison to theory
The equilibrium phenomenon of matching behavior traditionally has been studied in stationary environments. Here we attempt to uncover the local mechanism of choice that gives rise to matching by studying behavior in a highly dynamic foraging environment. In our experiments, 2 rhesus monkeys (Macaca mulatta) foraged for juice rewards by making eye...
To make adaptive decisions, animals must evaluate the costs and benefits of available options. The nascent field of neuroeconomics has set itself the ambitious goal of understanding the brain mechanisms that are responsible for these evaluative processes. A series of recent neurophysiological studies in monkeys has begun to address this challenge u...
Psychologists and economists have long appreciated the contribution of reward history and expectation to decision-making. Yet we know little about how specific histories of choice and reward lead to an internal representation of the “value” of possible actions. We approached this problem through an integrated application of behavioral, computationa...
A high precision measurement of the transverse spin-dependent asymmetry A_T' in ^3He(e,e') quasielastic scattering was performed in Hall A at Jefferson Lab at values of the squared four-momentum transfer, Q^2, between 0.1 and 0.6 (GeV/c)^2. A_T' is sensitive to the neutron magnetic form factor, G_M^n. Values of G_M^n at Q^2 = 0.1 and 0.2 (GeV/c)...
We present the first precision measurement of the spin-dependent asymmetry in the threshold region of ^3He(e,e') at Q^2 values of 0.1 and 0.2 (GeV/c)^2. The agreement between the data and nonrelativistic Faddeev calculations which include both final-state interactions and meson-exchange current effects is very good at Q^2 = 0.1 (GeV/c)^2, while a small...
We present the first precision measurement of the spin-dependent asymmetry in the threshold region of ^3He(e,e') at Q^2 values of 0.1 and 0.2 (GeV/c)^2. The agreement between the data and non-relativistic Faddeev calculations which include both final-state interactions (FSI) and meson-exchange currents (MEC) effects is very goo...
The anterior prefrontal cortex (APC) is known to subserve higher cognitive functions like planning. However, little is known about the functional specialization within this region in humans. Using functional magnetic resonance imaging we report a double dissociation: the medial APC and the ventral striatum were engaged when subjects executed tasks...
We have measured the transverse asymmetry A_T' in ^3He(e,e') quasielastic scattering in Hall A at Jefferson Laboratory with high precision for Q^2 values from 0.1 to 0.6 (GeV/c)^2. The neutron magnetic form factor G_M^n was extracted based on Faddeev calculations for Q^2 = 0.1 and 0.2 (GeV/c)^2 with an experimental uncertainty of less than 2%.
We have measured the transverse asymmetry A_T' in ^3\vec{He}(\vec{e},e') quasielastic scattering in Hall A at Jefferson Laboratory with high precision for Q^2 values from 0.1 to 0.6 (GeV/c)^2. The neutron magnetic form factor G_M^n was extracted based on Faddeev calculations for Q^2 = 0.1 and...
The anterior prefrontal cortex is known to subserve higher cognitive functions such as task management and planning. Less is known, however, about the functional specialization of this cortical region in humans. Using functional MRI, we report a double dissociation: the medial anterior prefrontal cortex, in association with the ventral striatum, wa...
We have measured the transverse asymmetry from inclusive scattering of longitudinally polarized electrons from polarized ^3He nuclei at quasi-elastic kinematics in Hall A at Jefferson Lab with high statistical and systematic precision. The neutron magnetic form factor was extracted based on Faddeev calculations with an experimental uncertainty of le...