About
8 Publications · 32,395 Reads
53,386 Citations
Publications (8)
Initialization Strategies of Spatio-Temporal Convolutional Neural Networks
We propose a new way of incorporating temporal information present in videos into Spatial Convolutional Neural Networks (ConvNets) trained on images, which avoids training Spatio-Temporal ConvNets from scratch. We describe several initializations of weights in 3D Convolutional Layers of a Spatio-Temporal ConvNet using 2D Convolutional Weights learned...
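
As a rough illustration of this family of initializations, here is a minimal numpy sketch (an assumption-laden stand-in, not the paper's exact schemes; the function name is ours) that inflates a learned 2D kernel across time and rescales it so that a video of identical frames yields the same activation as the original image filter:

```python
import numpy as np

def inflate_2d_to_3d(w2d, temporal_depth):
    """Initialize a 3D conv kernel from a 2D kernel learned on images.

    w2d: (out_ch, in_ch, kH, kW) weights from a spatial ConvNet.
    Returns (out_ch, in_ch, T, kH, kW) weights whose response to T
    identical frames matches the 2D filter's response to one frame.
    """
    # Copy the spatial filter into every temporal slice...
    w3d = np.repeat(w2d[:, :, None, :, :], temporal_depth, axis=2)
    # ...and divide by T so summing over time preserves the activation.
    return w3d / temporal_depth

w2d = np.random.randn(64, 3, 7, 7).astype(np.float32)  # e.g. an ImageNet conv layer
w3d = inflate_2d_to_3d(w2d, temporal_depth=5)
assert w3d.shape == (64, 3, 5, 7, 7)
```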
Exploiting Image-trained CNN Architectures for Unconstrained Video Classification
We conduct an in-depth exploration of different strategies for doing event detection and action recognition in videos using convolutional neural networks (CNNs) trained for image classification. We study different ways of performing frame calibration, spatial and temporal pooling, feature normalization, choice of CNN layer, as well as choice of clas...
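
To make the pooling and normalization choices concrete, here is a hedged numpy sketch of one such combination (max pooling over time followed by L2 normalization); the frame features are assumed to come from some chosen layer of an image-trained CNN, and the helper name is ours:

```python
import numpy as np

def video_descriptor(frame_feats, temporal_pooling="max"):
    """frame_feats: (T, D) array, one D-dim CNN feature per sampled frame."""
    if temporal_pooling == "max":
        pooled = frame_feats.max(axis=0)   # keep the strongest response per dim
    else:
        pooled = frame_feats.mean(axis=0)  # average evidence over time
    # Feature normalization (here L2) before a linear classifier.
    return pooled / (np.linalg.norm(pooled) + 1e-8)

feats = np.random.rand(30, 4096).astype(np.float32)  # 30 frames of CNN features
descriptor = video_descriptor(feats, temporal_pooling="max")
```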
Unsupervised Learning of Video Representations using LSTMs
We use multilayer Long Short Term Memory (LSTM) networks to learn representations of video sequences. Our model uses an encoder LSTM to map an input sequence into a fixed-length representation. This representation is decoded using single or multiple decoder LSTMs to perform different tasks, such as reconstructing the input sequence, or predicting t...
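
A minimal PyTorch sketch of the encoder-decoder structure described here (the module, names, and dimensions are illustrative assumptions; the paper also studies decoders that predict future frames rather than reconstruct the input):

```python
import torch
import torch.nn as nn

class VideoSeqAutoencoder(nn.Module):
    """Hypothetical sketch: an encoder LSTM maps frames to a fixed-length
    state, and a decoder LSTM conditioned on that state reconstructs them."""

    def __init__(self, frame_dim, hidden_dim):
        super().__init__()
        self.encoder = nn.LSTM(frame_dim, hidden_dim, batch_first=True)
        self.decoder = nn.LSTM(frame_dim, hidden_dim, batch_first=True)
        self.readout = nn.Linear(hidden_dim, frame_dim)

    def forward(self, frames):              # frames: (B, T, frame_dim)
        _, state = self.encoder(frames)     # (h, c): the fixed-length code
        dec_in = torch.zeros_like(frames)   # an unconditioned decoder input
        out, _ = self.decoder(dec_in, state)
        return self.readout(out)            # per-step reconstructions

model = VideoSeqAutoencoder(frame_dim=1024, hidden_dim=256)
clips = torch.randn(2, 16, 1024)             # 2 clips, 16 frames of features
loss = nn.functional.mse_loss(model(clips), clips)  # reconstruction objective
```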
Multimodal Learning with Deep Boltzmann Machines
A Deep Boltzmann Machine is described for learning a generative model of data that consists of multiple and diverse input modalities. The model can be used to extract a unified representation that fuses modalities together. We find that this representation is useful for classification and information retrieval tasks. The model works by learning a p...
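
As a hedged sketch of how such a model fuses modalities: with an image pathway and a text pathway joined at a shared top hidden layer, the energy has roughly the following bilinear form (biases and the modality-specific visible distributions are omitted, and the superscripts are our notation, not the paper's):

```latex
E(\mathbf{v}_m, \mathbf{v}_t, \mathbf{h}) =
  -\,\mathbf{v}_m^{\top} W^{(1m)} \mathbf{h}^{(1m)}
  - \mathbf{h}^{(1m)\top} W^{(2m)} \mathbf{h}^{(2m)}
  -\,\mathbf{v}_t^{\top} W^{(1t)} \mathbf{h}^{(1t)}
  - \mathbf{h}^{(1t)\top} W^{(2t)} \mathbf{h}^{(2t)}
  -\,\mathbf{h}^{(2m)\top} W^{(3m)} \mathbf{h}^{(3)}
  - \mathbf{h}^{(2t)\top} W^{(3t)} \mathbf{h}^{(3)},
\qquad
P(\mathbf{v}_m, \mathbf{v}_t) \propto \sum_{\mathbf{h}} e^{-E(\mathbf{v}_m, \mathbf{v}_t, \mathbf{h})}
```

The unified representation referred to above is the posterior over the shared layer, which can be inferred even when one modality is missing.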
Dropout: A Simple Way to Prevent Neural Networks from Overfitting
Deep neural nets with a large number of parameters are very powerful machine learning systems. However, overfitting is a serious problem in such networks. Large networks are also slow to use, making it difficult to deal with overfitting by combining the predictions of many different large neural nets at test time. Dropout is a technique for address...
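
The training-time procedure is small enough to sketch directly. This is the "inverted" variant commonly implemented (rescaling the kept units during training rather than scaling weights at test time, which is equivalent in expectation):

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(x, p_drop=0.5, train=True):
    """Inverted dropout: zero each unit with prob p_drop during training and
    rescale the survivors, so the full network is used unchanged at test time."""
    if not train:
        return x
    mask = rng.random(x.shape) >= p_drop   # keep each unit with prob 1 - p_drop
    return x * mask / (1.0 - p_drop)       # rescale so the expected output is x

h = rng.random((4, 8))                     # a batch of hidden activations
h_train = dropout(h, p_drop=0.5)           # roughly half the units are zeroed
h_test = dropout(h, train=False)           # identity at test time
```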
Learning Generative Models with Visual Attention
Attention has long been proposed by psychologists as important for effectively dealing with the enormous sensory stimulus available in the neocortex. Inspired by visual attention models in computational neuroscience and by the need for deep generative models to learn on object-centric data, we describe a framework for generative learning using atte...
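
As a much simplified, assumption-heavy stand-in for the attention mechanism described here, the following numpy helper crops a canonical window around an attended location (the real model infers smooth 2D transformations rather than hard crops, and the function name is ours):

```python
import numpy as np

def extract_glimpse(image, center_yx, size):
    """Crop a square attention window of side `size` around a gaze location,
    clamped so the window stays inside the image."""
    y, x = center_yx
    half = size // 2
    H, W = image.shape[:2]
    y0 = np.clip(y - half, 0, H - size)
    x0 = np.clip(x - half, 0, W - size)
    return image[y0:y0 + size, x0:x0 + size]

img = np.zeros((128, 128), dtype=np.float32)
patch = extract_glimpse(img, center_yx=(40, 70), size=24)  # 24x24 window
assert patch.shape == (24, 24)
```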
Modeling Documents with Deep Boltzmann Machines
We introduce a Deep Boltzmann Machine model suitable for modeling and extracting latent semantic representations from a large unstructured collection of documents. We overcome the apparent difficulty of training a DBM with judicious parameter tying. This parameter tying enables an efficient pretraining algorithm and a state initialization scheme th...
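
A hedged illustration of the kind of parameter tying involved, borrowed from the Replicated Softmax construction this line of work builds on (not a restatement of the paper's exact energy): tying the weights of all D word positions in a document collapses the softmax visible units into word counts, and the hidden biases are scaled by the document length to stay calibrated:

```latex
E(\mathbf{V}, \mathbf{h})
  = -\sum_{j,k} W_{jk}\, h_j\, \hat{v}_k
    - \sum_{k} b_k\, \hat{v}_k
    - D \sum_{j} a_j\, h_j,
\qquad
\hat{v}_k = \sum_{i=1}^{D} v_{ik}
```

Here v_{ik} indicates that word position i takes value k, so \hat{v}_k is the count of word k; because W is shared across positions, the same weights model documents of any length.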
Improving neural networks by preventing co-adaptation of feature detectors
When a large feedforward neural network is trained on a small training set, it typically performs poorly on held-out test data. This "overfitting" is greatly reduced by randomly omitting half of the feature detectors on each training case. This prevents complex co-adaptations in which a feature detector is only helpful in the context of several oth...
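
The test-time consequence follows from a one-line expectation: if each detector's output into a unit is kept with probability 1/2 via an independent mask m_j, then

```latex
\mathbb{E}_{m_j \sim \mathrm{Bernoulli}(1/2)}
  \Big[\sum_j m_j\, w_j\, h_j\Big]
  = \tfrac{1}{2} \sum_j w_j\, h_j
```

so running the full network with halved outgoing weights reproduces, in expectation, the input each unit saw during training, letting a single "mean network" stand in for averaging the predictions of the exponentially many thinned networks.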