SN Applied Sciences (2019) 1:1124 | https://doi.org/10.1007/s42452-019-1165-1
Research Article
Hierarchical LSTM network for text classification
Keivan Borna · Reza Ghanbari
© Springer Nature Switzerland AG 2019
Abstract
Text classification has always been an important and practical problem: we need computers to classify text and discover the information it contains. If we want to recognize the offending words in a text without human intervention, we need such an automatic classifier. In this article we compare recurrent neural networks, convolutional neural networks and hierarchical attention networks, with detailed information about each. We present a HAN model implemented in the Theano framework, which achieves higher validation accuracy on large datasets. For the text classification problem on large datasets, we therefore use hierarchical attention networks to obtain better results.
Keywords Computer science · Machine learning · Text classification · Hierarchical attention network
1 Introduction
Text classification is an important and practical problem that arises in many applications, such as spam detection, smart automatic customer replies, and sentiment analysis. These are commonly regarded as among the most important topics in natural language processing (NLP) and natural language generation (NLG). The main goal in text classification is to assign a text to one or more categories. Suppose, in a profanity-check problem, we have to find the offensive words in a document. Nowadays, machine learning is the outstanding way to create such classifiers. These classifiers are based on classification rules, so with the help of labeled documents we can train them. There are many traditional methods for text classification, such as n-grams with a linear model. Recent research uses supervised and unsupervised machine learning methods, such as convolutional neural networks (CNN) [1], recurrent neural networks (RNN) and hierarchical attention networks (HAN). In this article we benchmark these three methods by building a general text classifier with each of them on the GloVe d-300 (300-dimensional) embeddings. Our primary contribution is benchmarking these methods and building a hierarchical LSTM network whose input tensor is 3D rather than 2D, so that documents are represented hierarchically before categories are retrieved. The key difference from previous work is that our algorithm uses tokens taken from context (not just filtered sequences of tokens). To check the performance of our model, we evaluated three datasets to compare CNN, RNN and HAN. Our model uses a hierarchical LSTM network.
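The hierarchical idea described above, with a 3D input tensor of shape (sentences, words, embedding dimension), can be sketched in plain NumPy: a word-level LSTM encodes each sentence into a vector, and a sentence-level LSTM encodes those vectors into a single document representation. This is an illustrative sketch, not the authors' Theano implementation; all weight shapes and the toy dimensions are assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_lstm(input_dim, hidden_dim):
    # One stacked weight block for all four gates (input, forget, output, candidate).
    return {
        "W": rng.normal(0, 0.1, (4 * hidden_dim, input_dim)),
        "U": rng.normal(0, 0.1, (4 * hidden_dim, hidden_dim)),
        "b": np.zeros(4 * hidden_dim),
    }

def lstm_last_hidden(params, seq):
    """Run an LSTM over seq of shape (steps, input_dim); return the final hidden state."""
    hd = params["U"].shape[1]
    h, c = np.zeros(hd), np.zeros(hd)
    for x in seq:
        z = params["W"] @ x + params["U"] @ h + params["b"]
        i, f, o = sigmoid(z[:hd]), sigmoid(z[hd:2*hd]), sigmoid(z[2*hd:3*hd])
        g = np.tanh(z[3*hd:])
        c = f * c + i * g       # cell state: forget old, add new candidate
        h = o * np.tanh(c)      # hidden state gated by the output gate
    return h

def hierarchical_encode(doc, word_lstm, sent_lstm):
    """doc: 3D tensor (sentences, words, embed_dim).
    Word-level LSTM encodes each sentence; sentence-level LSTM
    encodes the resulting sentence vectors into one document vector."""
    sent_vecs = np.stack([lstm_last_hidden(word_lstm, s) for s in doc])
    return lstm_last_hidden(sent_lstm, sent_vecs)

# Toy document: 4 sentences x 6 words x 50-dim embeddings (hypothetical sizes).
embed_dim, hidden = 50, 32
doc = rng.normal(size=(4, 6, embed_dim))
word_lstm = init_lstm(embed_dim, hidden)
sent_lstm = init_lstm(hidden, hidden)
doc_vec = hierarchical_encode(doc, word_lstm, sent_lstm)
print(doc_vec.shape)  # (32,)
```

The resulting document vector would then feed a softmax layer over the target categories; training the weights (here left random) would be done with backpropagation in a framework such as Theano.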
2 Convolutional neural networks
Convolutional neural networks are groups of neurons with learnable weights and biases. Through a score function, for example in a classification problem mapping raw text to categories, the network receives inputs and computes a differentiable score. Unlike a common 3-layer neural network, a convolutional neural network arranges its neurons in three dimensions (x, y, z) in a Euclidean space. The duty of every layer in a CNN is to convert a 3-dimensional input volume into a 3-dimensional output volume of neurons. The input layer depends on the problem: its value is the 2D document (rows, columns), and the subsequent layers hold characteristic values for the input's properties. We can find that every CNN is a
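The 3D-volume-to-3D-volume transformation described above can be made concrete with a minimal NumPy sketch of one convolutional layer: a (height, width, channels) input convolved with K filters yields a (height', width', K) output. The filter count, sizes, and ReLU choice are illustrative assumptions, not details from the paper.

```python
import numpy as np

def conv_layer(volume, filters, stride=1):
    """Valid convolution of a (H, W, C_in) volume with
    filters of shape (K, f, f, C_in) -> output volume (H', W', K)."""
    H, W, C = volume.shape
    K, f, _, _ = filters.shape
    Ho = (H - f) // stride + 1
    Wo = (W - f) // stride + 1
    out = np.zeros((Ho, Wo, K))
    for k in range(K):
        for i in range(Ho):
            for j in range(Wo):
                # Dot product of one filter with one local patch of the input.
                patch = volume[i*stride:i*stride+f, j*stride:j*stride+f, :]
                out[i, j, k] = np.sum(patch * filters[k])
    return np.maximum(out, 0.0)  # ReLU non-linearity

rng = np.random.default_rng(1)
x = rng.normal(size=(28, 28, 1))   # a 2D document/grid with one channel
w = rng.normal(size=(8, 3, 3, 1))  # 8 learnable 3x3 filters
y = conv_layer(x, w)
print(y.shape)  # (26, 26, 8)
```

Each layer thus turns one 3D volume into another: spatial extent shrinks with valid convolution, while the channel depth becomes the number of filters.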
Received: 21 May 2019 / Accepted: 26 August 2019 / Published online: 30 August 2019
* Keivan Borna, borna@khu.ac.ir; Reza Ghanbari, reza91@aut.ac.ir | 1 Department of Computer Science, Faculty of Mathematics and Computer Science, Kharazmi University, Tehran, Iran.