Zhouhan Lin

Université de Montréal | UdeM · Department of Computer Science and Operations Research

About

31
Publications
44,234
Reads
4,470
Citations
Additional affiliations
August 2008 - July 2014
Harbin Institute of Technology
Position
  • Student

Publications (31)
Article
Full-text available
Recently, a great many deep convolutional neural network (CNN)-based methods have been proposed for hyperspectral image (HSI) classification. Although the proposed CNN-based methods have the advantage of spatial feature extraction, they have difficulty handling sequential data, and CNNs are not good at modeling long-range dependencies. Howev...
Preprint
It is commonly believed that knowledge of syntactic structure should improve language modeling. However, effectively and computationally efficiently incorporating syntactic structure into neural language models has been a challenging topic. In this paper, we make use of a multi-task objective, i.e., the models simultaneously predict words as well a...
Preprint
Stack-augmented recurrent neural networks (RNNs) have been of interest to the deep learning community for some time. However, the difficulty of training memory models remains a problem obstructing the widespread use of such models. In this paper, we propose the Ordered Memory architecture. Inspired by Ordered Neurons (Shen et al., 2018), we introdu...
Preprint
Full-text available
Humans observe and interact with the world to acquire knowledge. However, most existing machine reading comprehension (MRC) tasks miss the interactive, information-seeking component of comprehension. Such tasks present models with static documents that contain all necessary information, usually concentrated in a single short substring. Thus, models...
Preprint
Recurrent Neural Networks (RNNs) with attention mechanisms have obtained state-of-the-art results for many sequence processing tasks. Most of these models use a simple form of encoder with attention that looks over the entire sequence and assigns a weight to each token independently. We present a mechanism for focusing RNN encoders for sequence mod...
Preprint
Full-text available
In this work, we propose a novel constituency parsing scheme. The model predicts a vector of real-valued scalars, named syntactic distances, for each split position in the input sentence. The syntactic distances specify the order in which the split points will be selected, recursively partitioning the input, in a top-down fashion. Compared to tradi...
Article
Full-text available
We present MILABOT: a deep reinforcement learning chatbot developed by the Montreal Institute for Learning Algorithms (MILA) for the Amazon Alexa Prize competition. MILABOT is capable of conversing with humans on popular small talk topics through both speech and text. The system consists of an ensemble of natural language generation and retrieval m...
Article
Full-text available
We propose a neural language model capable of unsupervised syntactic structure induction. The model leverages the structure information to form better semantic representations and better language modeling. Standard recurrent neural networks are limited by their structure and fail to efficiently use syntactic information. On the other hand, tree-str...
Article
Full-text available
This paper proposes a new model for extracting an interpretable sentence embedding by introducing self-attention. Instead of using a vector, we use a 2-D matrix to represent the embedding, with each row of the matrix attending on a different part of the sentence. We also propose a self-attention mechanism and a special regularization term for the m...
Article
Recurrent Neural Networks (RNNs) produce state-of-the-art performance on many machine learning tasks, but their demand on resources in terms of memory and computational power is often high. Therefore, there is great interest in optimizing the computations performed with these models, especially when considering development of specialized low-power har...
Article
Full-text available
Theano is a Python library that allows users to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers, especially in the machine learning community, and has shown steady performance improvements. Theano is being...
Article
Full-text available
In this paper, we systematically analyse the connecting architectures of recurrent neural networks (RNNs). Our main contribution is twofold: first, we present a rigorous graph-theoretic framework describing the connecting architectures of RNNs in general. Second, we propose three architecture complexity measures of RNNs: (a) the recurrent depth, wh...
Article
Full-text available
We propose ways to improve the performance of fully connected networks. We found that two approaches in particular have a strong effect on performance: linear bottleneck layers and unsupervised pre-training using autoencoders without hidden unit biases. We show how both approaches can be related to improving gradient flow and reducing sparsity in t...
Article
Full-text available
For most deep learning algorithms training is notoriously time consuming. Since most of the computation in training neural networks is typically spent on floating point multiplications, we investigate an approach to training that eliminates the need for most of these. Our method consists of two parts: First we stochastically binarize weights to con...
Article
Full-text available
Neuroscientists have long criticised deep learning algorithms as incompatible with current knowledge of neurobiology. We explore more biologically plausible versions of deep representation learning, focusing here mostly on unsupervised learning but developing a learning mechanism that could account for supervised, unsupervised and reinforcement lea...
Conference Paper
The paper presents a novel ensemble system which unites Adaboost with multifeature to increase diversity among individual classifiers. Adaboost gives rise to convenience for hyperspectral data classification. To improve the method further, we propose joint Adaboost and multifeature based ensemble (JAME), which assigns different multifeature sets to...
Article
Full-text available
Classification is one of the most popular topics in hyperspectral remote sensing. In the last two decades, a huge number of methods have been proposed to deal with the hyperspectral data classification problem. However, most of them do not hierarchically extract deep features. In this paper, the concept of deep learning is introduced into hyperspectral...
Article
Full-text available
In hyperspectral remote sensing image classification, ensemble systems with support vector machine (SVM), such as the Random Subspace SVM Ensemble (RSSE), have significantly outperformed single SVM on the robustness and overall accuracy. In this paper, we introduce a novel subspace mechanism, the Optimizing Subspace SVM Ensemble (OSSE), to improve...
Conference Paper
Full-text available
Hyperspectral image (HSI) classification is a hot topic in the remote sensing community. This paper proposes a new framework of spectral-spatial feature extraction for HSI classification, in which for the first time the concept of deep learning is introduced. Specifically, the model of autoencoder is exploited in our framework to extract various ki...
Conference Paper
The existence of nonlinear characteristics in hyperspectral data is considered an influential factor curtailing the classification accuracy of canonical linear classifiers such as k-nearest neighbor (k-NN). To deal with the problem, we investigated approaches to combine manifold learning methods and the k-NN classifier to preserve nonlinear characte...
Conference Paper
The nonlinear characteristics in hyperspectral data are considered an influential factor curtailing the classification accuracy. To deal with the problem, a new method for classification is developed, especially for hyperspectral imagery (HSI). It is a supervised method based on Locally Linear Embedding (LLE) and k-Nearest Neighbor (KNN), named w...
Article
Full-text available
Advances in sensor and computer technology are revolutionizing the way that remote sensing data with hundreds or even thousands of channels for the same area on the surface of the earth is collected, managed and analyzed. In this paper, the classical Spectral Angle Mapper (SAM) algorithm, which is fit for parallel and distributed computing, is impl...
Article
The simplex volume algorithm (SVA) is an endmember extraction algorithm based on the geometrical properties of a simplex in the feature space of a hyperspectral image. By utilizing the relation between a simplex volume and its corresponding parallelohedron volume in the high-dimensional space, the algorithm extracts endmembers from the initial hyper...
Article
The PCA (principal component analysis) algorithm is the most basic method of dimension reduction for high-dimensional data, and it plays a significant role in hyperspectral data compression, decorrelation, denoising and feature extraction. With the development of imaging technology, the number of spectral bands in a hyperspectral image is getting larg...
Conference Paper
Full-text available
Advances in sensor and computer technology are revolutionizing the way that remote sensing data with hundreds or even thousands of channels for the same area on the surface of the earth is collected, managed and analyzed. In this paper, the classical Spectral Angle Mapper (SAM) algorithm, which is fit for parallel and distributed computing, is impl...
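
For readers unfamiliar with the Spectral Angle Mapper (SAM) named in the abstract above, the following is a minimal serial sketch of the standard textbook formulation: each pixel spectrum is treated as a vector and assigned to the reference spectrum with which it forms the smallest angle. It is not the paper's parallel implementation, which the truncated abstract does not describe. The function names `spectral_angle` and `classify`, and the three-band reference spectra, are hypothetical and chosen only for illustration.

```python
import math


def spectral_angle(pixel, reference):
    """Return the angle in radians between two spectra treated as vectors."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_r = math.sqrt(sum(r * r for r in reference))
    if norm_p == 0.0 or norm_r == 0.0:
        raise ValueError("spectra must be non-zero")
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, dot / (norm_p * norm_r)))
    return math.acos(cos_angle)


def classify(pixel, references):
    """Return the label of the reference spectrum with the smallest angle."""
    return min(references, key=lambda label: spectral_angle(pixel, references[label]))


# Hypothetical three-band reference spectra, for illustration only.
references = {"vegetation": [0.1, 0.2, 0.8], "soil": [0.4, 0.4, 0.4]}
print(classify([0.2, 0.4, 1.5], references))
```

Because every pixel is scored independently of the others, the per-pixel loop is a natural candidate for parallel or distributed execution, which is consistent with the abstract's remark that SAM suits such settings.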