Alex Krizhevsky's research while affiliated with Google Inc. and other places
What is this page?
This page lists the scientific contributions of an author who either does not have a ResearchGate profile or has not yet added these contributions to their profile.
It was automatically created by ResearchGate to provide a record of this author's body of work. We create such pages to advance our goal of building and maintaining the most comprehensive scientific repository possible. In doing so, we process publicly available (personal) data relating to the author as a member of the scientific community.
If you're a ResearchGate member, you can follow this page to keep up with this author's work.
If you are this author, and you don't want us to display this page anymore, please let us know.
Publications (16)
Our goal is to train a policy for autonomous driving via imitation learning that is robust enough to drive a real vehicle. We find that standard behavior cloning is insufficient for handling complex driving scenarios, even when we leverage a perception system for preprocessing the input and a controller for executing the output on the car: 30 milli...
We describe a learning-based approach to hand-eye coordination for robotic grasping from monocular images. To learn hand-eye coordination for grasping, we trained a large convolutional neural network to predict the probability that task-space motion of the gripper will result in successful grasps, using only monocular camera images and independentl...
We describe a learning-based approach to hand-eye coordination for robotic grasping from monocular images. To learn hand-eye coordination for grasping, we trained a large convolutional neural network to predict the probability that task-space motion of the gripper will result in successful grasps, using only monocular camera images and independentl...
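The grasping approach summarized above scores candidate gripper motions with a CNN and executes the best one. Below is a minimal sketch of that selection loop; `score_grasp` is a hypothetical stand-in for the trained network, and none of the names or sizes come from the paper.

```python
import numpy as np

def score_grasp(image, motion):
    """Stand-in for the trained CNN: returns the predicted probability
    that executing `motion` from the state shown in `image` yields a
    successful grasp. (Hypothetical placeholder.)"""
    rng = np.random.default_rng(abs(hash(motion.tobytes())) % (2**32))
    return rng.random()

def select_motion(image, n_candidates=64):
    """Sample candidate task-space motions and return the one the
    network scores highest -- the core of CNN-based visual servoing."""
    candidates = [np.random.uniform(-1.0, 1.0, size=3) for _ in range(n_candidates)]
    scores = [score_grasp(image, m) for m in candidates]
    return candidates[int(np.argmax(scores))]

image = np.zeros((472, 472, 3))  # monocular camera frame (size assumed)
print("commanded motion:", select_motion(image))
```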
Pedestrian detection is of crucial importance to autonomous driving applications. Methods based on deep learning have shown significant improvements in accuracy, which makes them particularly suitable for applications, such as pedestrian detection, where reducing the miss rate is very important. Although they are accurate, their runtime has been at...
Deep neural nets with a large number of parameters are very powerful machine learning systems. However, overfitting is a serious problem in such networks. Large networks are also slow to use, making it difficult to deal with overfitting by combining the predictions of many different large neural nets at test time. Dropout is a technique for address...
I present a new way to parallelize the training of convolutional neural networks across multiple GPUs. The method scales significantly better than all alternatives when applied to modern convolutional neural networks.
When a large feedforward neural network is trained on a small training set, it typically performs poorly on held-out test data. This "overfitting" is greatly reduced by randomly omitting half of the feature detectors on each training case. This prevents complex co-adaptations in which a feature detector is only helpful in the context of several oth...
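A minimal NumPy sketch of the mechanism the dropout abstracts above describe: during training each hidden unit is omitted with probability 0.5, and at test time the full network is used with activity halved, approximating the average over the exponentially many thinned networks. This is an illustration, not the authors' code; all sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(256, 128))  # hidden-layer weights (toy sizes)

def hidden_train(x):
    """Training pass: omit each feature detector with probability 0.5."""
    h = np.maximum(0.0, x @ W)           # ReLU feature detectors
    mask = rng.random(h.shape) < 0.5     # keep each unit with p = 0.5
    return h * mask

def hidden_test(x):
    """Test pass: use all units but halve their activity, which is
    equivalent to halving the outgoing weights."""
    return 0.5 * np.maximum(0.0, x @ W)

x = rng.normal(size=(4, 256))
print(hidden_train(x).shape, hidden_test(x).shape)
```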
April 8, 2009. Groups at MIT and NYU have collected a dataset of millions of tiny colour images from the web. It is, in principle, an excellent dataset for unsupervised training of deep generative models, but previous researchers who have tried this have found it difficult to learn a good set of filters from the images. We show how to train a multi-layer...
We describe how to train a two-layer convolutional Deep Belief Network (DBN) on the 1.6 million tiny images dataset. When training a convolutional DBN, one must decide what to do with the edge pixels of the images. As the pixels near the edge of an image contribute to the fewest convolutional filter outputs, the model may
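The edge-pixel issue mentioned above can be made concrete with a toy coverage count: under a "valid" convolution, a corner pixel falls inside only one k x k filter window, while an interior pixel falls inside k*k of them. An illustrative check, not the paper's code:

```python
import numpy as np

def coverage(image_size=8, k=3):
    """Count how many valid k x k filter windows each pixel falls in."""
    counts = np.zeros((image_size, image_size), dtype=int)
    for i in range(image_size - k + 1):
        for j in range(image_size - k + 1):
            counts[i:i + k, j:j + k] += 1
    return counts

c = coverage()
print(c[0, 0], c[4, 4])  # corner pixel: 1 window; interior pixel: 9 windows
```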
We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0% which is considerably better than the previous state-of-the-art. The neural network, which has 60...
We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0% which is considerably better than the previous state-of-the-art. The neural network, which has 60 m...
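For orientation, here is a PyTorch sketch of an AlexNet-style network with five convolutional and three fully connected layers. Layer sizes follow the published architecture, but this is an illustrative reimplementation (single-GPU, without local response normalization), not the original code.

```python
import torch
import torch.nn as nn

class AlexNetStyle(nn.Module):
    """Five convolutional layers followed by three fully connected
    layers, roughly following the 2012 architecture."""
    def __init__(self, num_classes=1000):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=11, stride=4), nn.ReLU(),
            nn.MaxPool2d(3, stride=2),
            nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool2d(3, stride=2),
            nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(3, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Dropout(0.5),
            nn.Linear(256 * 6 * 6, 4096), nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(4096, 4096), nn.ReLU(),
            nn.Linear(4096, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        return self.classifier(torch.flatten(x, 1))

model = AlexNetStyle()
logits = model(torch.randn(1, 3, 227, 227))
print(logits.shape)  # torch.Size([1, 1000])
```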
The artificial neural networks that are used to recognize shapes typically use one or more layers of learned feature detectors that produce scalar outputs. By contrast, the computer vision community uses complicated, hand-engineered features, like SIFT [6], that produce a whole vector of outputs including an explicit representation of the pose of t...
We show how to learn many layers of features on color images and we use these features to initialize deep autoencoders. We then use the autoencoders to map images to short binary codes. Using semantic hashing [6], 28-bit codes can be used to retrieve images that are similar to a query image in a time that is independent of the size of the database....
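The retrieval step described above can be sketched as follows: threshold the autoencoder's 28-unit code layer to bits, pack them into an integer, and use that integer as a hash-table key, so lookup time does not depend on database size. The random `W` below is a stand-in for the learned encoder, and all sizes are arbitrary.

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(0)
W = rng.normal(size=(3072, 28))  # stand-in for the learned 28-bit encoder

def binary_code(image_vec):
    """Threshold the code-layer activations to 28 bits and pack them
    into a single integer hash key."""
    bits = (image_vec @ W) > 0.0
    return sum(int(b) << i for i, b in enumerate(bits))

# Hash table from code to image ids: lookup cost is independent of
# how many images are stored. (Real semantic hashing also probes
# codes within a small Hamming distance of the query code.)
table = defaultdict(list)
database = rng.normal(size=(10000, 3072))  # toy stand-in for image vectors
for idx, vec in enumerate(database):
    table[binary_code(vec)].append(idx)

query = database[123] + 0.01 * rng.normal(size=3072)
print(table.get(binary_code(query), []))
```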
Deep belief nets have been successful in modeling handwritten characters, but it has proved more difficult to apply them to real images. The problem lies in the restricted Boltzmann machine (RBM) which is used as a module for learning deep belief nets one layer at a time. The Gaussian-Binary RBMs that have been used to model real-valued data are no...
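For context, the Gaussian-binary RBM mentioned here is conventionally defined by the energy function below, with visible biases b_i, hidden biases c_j, and per-pixel standard deviations sigma_i. This is the standard form; the notation is not taken from this abstract.

```latex
E(\mathbf{v}, \mathbf{h}) = \sum_i \frac{(v_i - b_i)^2}{2\sigma_i^2}
  - \sum_j c_j h_j - \sum_{i,j} \frac{v_i}{\sigma_i} w_{ij} h_j
```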
Citations
... Researchers have also investigated how the performance of imitation learning can be further improved to achieve robust driving in reality [34]. They claimed that standard behavior cloning alone cannot handle complex scenarios (even 30 million examples would not be sufficient). ...
... By training the CNN on four datasets, our method achieves improved accuracy and robustness in COVID-19 detection, making it a highly promising, innovative approach for medical image analysis. • Leveraging Transfer Learning Technique: To detect COVID-19 effectively in patients, transfer learning is employed with seven commonly used CNN architectures, namely, GoogleNet [10], SqueezeNet [11], ResNet-18, ResNet-50 [12], AlexNet [13], DarkNet [14], and ShuffleNet [15], along with four diverse datasets for X-ray and CT images. The advantage of our technique lies in its adaptability and scalability because it eliminates the need for the manual assignment of hyperparameter values for the CNN architectures. ...
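The transfer-learning recipe in this excerpt follows a standard pattern: take an ImageNet-pretrained backbone, replace its classification head, and fine-tune on the target data. A generic PyTorch sketch, with ResNet-18 chosen arbitrarily from the architectures listed; the data below is random, standing in for X-ray/CT batches.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load an ImageNet-pretrained backbone (downloads weights on first use).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pretrained feature extractor (optional).
for p in model.parameters():
    p.requires_grad = False

# Replace the 1000-way ImageNet head with a 2-way COVID / non-COVID head.
model.fc = nn.Linear(model.fc.in_features, 2)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One toy fine-tuning step on random data.
x, y = torch.randn(8, 3, 224, 224), torch.randint(0, 2, (8,))
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
print(float(loss))
```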
... As part of the meal preparation task in healthcare facilities, packaged food is picked from various large boxes and placed onto a moving tray on the conveyor belt. The challenge of grasping objects from the cluttered scenes in such bins for automation purposes has been an active area of research in recent years [4,6,[9][10][11][12][13]20]. Approaches for finding robotic grasping poses in cluttered scenes are usually divided into two categories: model-based and model-free techniques [4,5]. ...
... Automatic resetting real-world environments after each epoch poses a significant challenge, often requiring human intervention. For example, in Kalashnikov et al. (2018), Levine et al. (2018) and Zeng et al. (2018), the authors created convex workspaces, restricted view to top-down or limited robot movements to address this problem. In contrast, simulation's resets can be done easily, enabling faster, safer, and cheaper prototyping. ...
... Angelova et al. (2015) [24] designed a real-time pedestrian detection architecture using a cascade of deep convolutional networks (DCNNs). The tiny model was used to reject the large number of easy negatives, and a huge DNN was used to classify the hard proposals that remained. ...
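The cascade described in this excerpt can be sketched as a two-stage filter: a cheap first-stage model rejects the many easy negatives, and only the surviving hard proposals reach the expensive deep network. Both models below are hypothetical stand-ins, and the thresholds are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def tiny_model(patch):
    """Cheap first stage: scores a proposal quickly so the large
    number of easy negatives can be rejected. (Stand-in.)"""
    return float(patch.mean())

def huge_dnn(patch):
    """Expensive second stage: classifies only the hard proposals
    that survive the first stage. (Stand-in.)"""
    return float(patch.std())

proposals = [rng.random((64, 32)) for _ in range(1000)]
hard = [p for p in proposals if tiny_model(p) > 0.5]   # reject easy negatives
detections = [p for p in hard if huge_dnn(p) > 0.29]   # costly check on the rest
print(len(proposals), len(hard), len(detections))
```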
... Dropout means that some neurons in the network are dropped out. However, instead of being deleted, they are temporarily dropped out of the network, including their output and input connections [65], as shown in Figure 2. The neurons that are dropped out are randomly selected; each neuron is retained with a fixed probability p. ...
... Nowadays, owing to the fast growth of computer technology, neural networks are increasingly being employed not only in experimental tasks, but also in practical solutions to a variety of everyday issues [27,28]. Tasks involving information processing, the classification of patterns [29] or situations [30,31], prediction, and optimization arise in a variety of applications in industry, finance, environmental monitoring [32], telecommunications [33], healthcare, military technology, and other fields. ...
... However, as mentioned earlier, one of the biggest limitations in the medical community is the inability to access larger, labelled, high-quality data that are sensitive, confidential, and difficult to collect. Due to the insufficient amount of data, in this work, in addition to using data augmentation techniques to augment data, we also incorporate a simplified version of the AlexNet [34] architecture, which consists of two main parts (convolutional layers for feature extraction and fully connected layers for classification), similar to [20]. Furthermore, it is worth pointing out that for fair comparisons, the one-, two-, and three-dimensional convolutional neural networks use the same network architecture, and the only difference is the size of the convolution kernel and the convolution method. ...
... Dataset description. For the general classification setting, we consider three real-world datasets: Cifar10, Cifar100 [12], and TinyImageNet [14]. For the out-of-distribution setting, we consider the corrupted version of the Cifar10 and Cifar100 datasets which are named Cifar10-C and Cifar100-C [10]. ...
... Considering the relatively large size of the VGG16 model [18], a modified version of VGG (referred to as "VGGnet") is employed as the initial hardware implementation for performance analysis. The image classification task will be performed using the CIFAR-10 dataset [19]. This modification is required to relax the requirement to perform the simulations, especially the circuit-level simulations. ...