Sentiment Analysis of Bangla text using Gated
Recurrent Neural Network
Nasif Alvi1, Kamrul Hasan Talukder1, Abdul Hasib Uddin1
1Khulna University, Khulna, Bangladesh
Emails: nasif.cse12@gmail.com, khtalukder@gmail.com, abdulhasibuddin@gmail.com
Abstract. Sentiment analysis is a fundamental part of Natural Language
Processing. There are numerous works on this topic in English and other
languages; however, it is still a comparatively new practice in Bangla. The
absence of a suitable Bangla corpus is the primary obstacle for sentiment
analysis tasks in Bangla. Long Short-Term Memory (LSTM) is a common technique
for resolving sentiment from a dataset containing a large amount of text data,
whereas the Gated Recurrent Unit (GRU) is very efficient for datasets with a
small amount of text data. In this manuscript, we present a 5-layered GRU
neural network model, each layer comprising 48 neurons, and apply the model to
an existing Bangla corpus. We have implemented the 10-fold cross-validation
approach and repeated the same process three times. Each time, we have taken
the averages of the ten validation accuracies and losses and compared the
results with the state-of-the-art published outcome (77.85% highest accuracy)
for the Bi-directional LSTM (BLSTM). The highest average accuracy for our model
is 78.41%, while the lowest is 76.34%.
Keywords: Sentiment analysis, Natural Language Processing, Corpus, Neural
Network, Text data, LSTM, GRU, BLSTM.
1 Introduction
Sentiment Analysis (SA) is a technique for identifying and classifying the opinions
expressed in a piece of text, typically by computational means, in order to decide
whether the writer's attitude toward a particular subject, product, and so on is
positive, negative, or neutral. Sentiment analysis is frequently applied to opinions,
thoughts, and subjective texts. SA offers detailed information about public judgment,
since it covers the many different forms of posts, ratings, and feedback. SA is
essentially a validated tactic for forecasting a variety of important outcomes, such
as the box-office performance of films and broader public events. Public opinions are
used to evaluate a particular target such as an individual, commodity, or venue, and
they can be found on websites such as Amazon and Yelp. Emotions can be assigned to
positive, negative, or neutral classes, as well as to finer-grained categories.
Whether a user reports a satisfying, affirmative experience or a bad one, SA can
quickly reveal the overall polarity of the feedback or opinion. In emotion
classification, views or people's emotions are analyzed. These kinds of applications
are used in social media and in virtually any system. Views and emotions reflect the
values, choices, and actions of individuals. With these techniques, it is possible for
corporations to make policy decisions. In recent years, a great number of individuals
have been sharing their opinions and thoughts over the internet using Bangla [1].
The growth of restaurants across numerous online channels has been observed over the
last few years. Websites have become the most common forums where restaurants are
rated on the basis of the opinions of customers. The picture of consumer sentiment
that emerges from such online customer feedback magnifies a restaurant's overall
quality. The contact between customers and owners through online portals provides the
ability to examine customers' insights. It is therefore necessary to be able to
measure consumer opinion in order to improve quality according to the demands of
consumers. A classifier trained on labelled data provides the basis for such research.
Prior works include a CNN model for aspect extraction from Bangla reviews and mixed
machine learning models for forecasting reviews. Sentiment analysis has already become
a common means of forecasting consumer ratings. A few studies have been performed on
Bangla text in particular, but not very effectively. A sentiment analysis model for
restaurant ranking was developed by the authors of [2] on the basis of food price,
quality, service, ambience, and related aspects.
SA is essentially an application of natural language technology. It is also known as
opinion mining or sentiment extraction. At present, sentiment analysis broadly refers
to the quantitative treatment of opinions and reviews, drawing on computational
linguistics, natural language processing, and biometrics for the systematic detection,
retrieval, quantification, and study of affective states and subjective knowledge.
Moreover, recent advances in machine learning research, especially deep learning
methods such as the recurrent neural network (RNN), take advantage of the ability to
infer decisions by learning representations directly from the data in SA [3].
Micro-blogging platforms such as Twitter, YouTube, Facebook, etc. have become very
popular for social connection. Through social media, people communicate their sadness,
which can be studied to determine the reasons behind their depression. Most studies on
emotion and depression analysis are based on questionnaires and clinical interviews,
and on non-Bengali languages, especially English. These conventional approaches are
not always sufficient for identifying human depression. The aim of Artificial
Intelligence is to mimic human behaviour and then evaluate it. Machine learning and
deep learning approaches are nowadays being widely used for analyzing human behaviour
and human sentiment. Detecting emotion and analyzing sentiment have become important
tasks, and several types of learning methods are used for them. The classification of
feelings and emotions can be studied from two separate viewpoints: the detection of
feelings and emotions from image data, and the detection of feelings and emotions from
textual data. The whole field of positive, negative, and neutral emotion
classification is covered in a general way by sentiment analysis. Emotions such as
happiness, grief, depression, disgust, etc. are very profound and are often harder to
analyze. Some of these feelings are stronger than others, requiring high-level
clinical expertise and much more specialized empirical methods. For this reason,
sentiment analysis is of foremost importance [4].
In this research study, we have used the Gated Recurrent Unit (GRU) for sentiment
analysis on 7,000 Bangla text samples. The process has been implemented with the
10-fold cross-validation approach, and the highest average accuracy (78.41%) has been
obtained for the sentiment analysis.
2 Related Work
Hoque et al. [3] examined the performance of various ML approaches along with doc2vec
for categorizing the sentiment of Bangla natural language. They trained a doc2vec
model using a corpus of seven thousand Bangla sentences with 120-dimensional feature
vectors and two kinds of labels: positive and negative. Then they applied several ML
algorithms (LR, SGD, SVM, K-Neighbors Classifier, DT, LDA, SM, BLSTM and GaussianNB)
for the analysis, among which BLSTM acquired the highest accuracy. The data was split
randomly into 80% for training and the remaining 20% for testing.
Uddin et al. [5] established a Gated Recurrent Unit based method for depression
detection through sentiment analysis. All of the data was culled from Bangla posts on
Twitter, Facebook, and other sources. There were four hyper-parameters; in particular,
the number of GRU layers was 5, the batch size was 10, and the number of epochs was 5.
They collected 5,000 Bangla posts from Twitter and 210 depressive Bangla statements
from native Bengali speakers using a Google Form. They used GRU sizes of 64, 128, 256,
512, and 1024 for this investigation.
Hossain et al. [6] proposed a joint CNN-LSTM model to conduct sentiment analysis on
online restaurant reviews. They used 80% of the dataset for training, with a CONV
layer of size 256 and an LSTM layer of size 128. They collected information on
restaurants associated with online platforms like FoodPanda and Shohoz Food,
consisting of 1000 reviews with two fields: review and category. In the end, the
average recall, precision, and f1-score values were 0.70, 0.70, and 0.71,
respectively.
Sharfuddin et al. [1] accomplished their work on sentiment classification of Bangla
content using an RNN with BLSTM (Bidirectional LSTM). Around 15,000 comments were
obtained from Facebook, of which 10,000 comments were kept, consisting of 5,000
negative and 5,000 positive comments; all symbols, emojis, stickers, and numbers were
erased in order to work on plain Bangla content.
Hasan et al. [7] developed a model that recognized sentiment from Bangla text using
contextual valency analysis. In this investigation, they used WordNet to get the sense
of each word according to its part of speech (POS) and SentiWordNet to get the prior
valence of each word. They then determined the total positivity, negativity, and
neutrality of a sentence or document with respect to the overall sense. They created
an XML document to store the Bangla words and their related POS, and took the
assessments of 20-30 people about the sentiment of each passage.
Tripto et al. [8] introduced an extensive set of methods to recognize sentiment and
extract emotions from Bangla texts. In this study, LSTM, SVM, NB, and CNN classifiers
were used on a dataset of Bangla sentences labelled with 3 sentiment classes
(positive, neutral, negative) and 5 sentiment classes (strongly positive, positive,
neutral, negative, strongly negative), along with six fundamental emotions (anger,
fear, disgust, sadness, joy, and surprise). They assessed the performance of the
models using another dataset of Bangla, Romanized Bangla, and English comments from
various kinds of YouTube videos. Their methods showed 54.24% and 65.97% accuracy for
the 3-class and 5-class sentiment labels, respectively.
Al-Amin et al. [9] analyzed a methodology for sentiment classification and sentiment
extraction of words and Bangla comments with word2vec. The dataset contained 16,000
single-line and multi-line Bangla comments that were gathered from popular blogging
websites, and every comment was tagged as either positive or negative by taking
opinions from various kinds of people through surveys. They trained on 90% of the
tagged comments, picked randomly, and used the remaining 10% for testing.
In our previous analysis [10], we classified English tweets into five categories:
happy, surprise, sad, disgust, and neutral. We used a total of 4,000 tweets as our
dataset: 3,750 as the training set and 250 as the test set. Using a unigram model and
a unigram model with POS tags, accuracies of 66% and 64.8% were achieved,
respectively.
3 Methodology
The flowchart of the research methodology is shown in Fig. 1.
3.1 Dataset Collection:
The dataset for sentiment analysis of Bangla text, covering positive and negative
sentiment, was collected from Hoque et al. [3]. The total number of samples was 7,000,
where 3,500 samples carried positive sentiment and the remaining 3,500 carried
negative sentiment.
3.2 Features Extraction:
To extract the feature information, the integer encoding method was used in this
study. Integer values have a natural ordered relationship between one another, and
machine learning algorithms may be able to understand and harness this relationship.
The total length obtained for the dataset was 21,889. The vector size was taken to be
the same as the maximum sentence length. After that, zero padding was used to keep the
length of each text the same.
Fig. 1. Working procedure of our system
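To make the feature-extraction step above concrete, the following is a minimal sketch of integer encoding followed by zero padding, assuming a Keras/TensorFlow environment; the file name and column names are hypothetical and not taken from the paper.

```python
# Minimal sketch of the integer-encoding + zero-padding step (Keras/TensorFlow).
# The file name and column names below are hypothetical, not from the paper.
import pandas as pd
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

df = pd.read_csv("bangla_sentiment.csv")           # 7,000 labelled sentences (assumed layout)
texts, labels = df["text"].tolist(), df["label"].tolist()

tokenizer = Tokenizer()                            # map each distinct word to an integer
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)

max_len = max(len(seq) for seq in sequences)       # vector size = longest sentence
X = pad_sequences(sequences, maxlen=max_len, padding="post")   # zero padding
print(X.shape)                                     # (7000, max_len)
```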
3.3 Dataset Training:
10-fold cross-validation is a popular technique for training and evaluating a model on
a dataset. It is a re-sampling procedure that evaluates predictive models by
partitioning the original sample into a training set to build the model and a test set
to evaluate it. It shuffles the dataset randomly, partitions the dataset into 10
subsets, and finally summarizes the skill of the model using the sample of model
evaluation scores. In this study, 10-fold cross-validation (6,300 training samples and
700 test samples per fold) was used, and the whole procedure was repeated 3 times.
Each time, the training data was shuffled so that the model could learn efficiently.
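A minimal sketch of this evaluation protocol is given below, assuming scikit-learn for the fold splits and a Keras-style model; `build_model`, the epoch count, and the variable names are placeholders rather than the authors' actual code.

```python
# Minimal sketch of the protocol above: 10-fold cross-validation repeated three
# times with shuffling. X, y are numpy arrays (padded sequences, one-hot labels);
# build_model() is a placeholder returning a compiled Keras model.
import numpy as np
from sklearn.model_selection import KFold

def cross_validate(build_model, X, y, n_runs=3, n_folds=10):
    run_averages = []
    for run in range(n_runs):
        kf = KFold(n_splits=n_folds, shuffle=True, random_state=run)  # reshuffle each run
        fold_acc = []
        for train_idx, test_idx in kf.split(X):    # 6,300 train / 700 test per fold
            model = build_model()
            model.fit(X[train_idx], y[train_idx], epochs=10, verbose=0)  # epoch count is a placeholder
            _, acc = model.evaluate(X[test_idx], y[test_idx], verbose=0)
            fold_acc.append(acc)
        run_averages.append(float(np.mean(fold_acc)))  # average of the 10 fold accuracies
    return run_averages
```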
3.4 Gated Recurrent Unit Network (GRU):
The GRU network is a streamlined form of the recurrent neural network. When the input
sequence grows beyond a certain length, a plain RNN cannot associate the distant,
significant information. The GRU network is aimed at tackling the long-range
dependency problem as well as the gradient-vanishing problem of the RNN. The GRU
neural network, with its smaller gate structure and better efficiency, has, for
example, been chosen directly for gear pitting fault diagnosis. Like the GRU, another
recurrent unit used in RNNs is known as Long Short-Term Memory (LSTM). LSTM and GRU
have the same objective of tracking long-term dependencies effectively while
mitigating the vanishing/exploding gradient problems [11]. The GRU neural network
model adapts to dependencies over a variety of time scales by arranging a range of
recurrent units that regulate the flow of information through gate units [12].
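For reference, the standard GRU gate equations, written in one common convention from the literature (they are not reproduced from this paper), are:

```latex
% x_t: input at step t, h_{t-1}: previous hidden state,
% \sigma: logistic sigmoid, \odot: element-wise product
\begin{aligned}
z_t &= \sigma(W_z x_t + U_z h_{t-1} + b_z)                    && \text{(update gate)}\\
r_t &= \sigma(W_r x_t + U_r h_{t-1} + b_r)                    && \text{(reset gate)}\\
\tilde{h}_t &= \tanh(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h) && \text{(candidate state)}\\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t        && \text{(new hidden state)}
\end{aligned}
```

The update gate z_t decides how much of the previous state is carried forward, and the reset gate r_t decides how much of it contributes to the candidate state; this is how the GRU mitigates vanishing gradients with fewer parameters than the LSTM.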
In our case, we have used 5 GRU layers, each layer containing 48 neurons. A flatten
layer has then been used to convert the whole output matrix into a 1D vector before
the dense layer, which is defined as the output layer. The dense layer has 2 neurons,
corresponding to positive sentiment and negative sentiment. The "tanh" activation
function has been used in the hidden layers and the "softmax" function in the final
activation. To reduce the loss or error rate, the categorical cross-entropy loss
function has been used.
4 Result and Analysis
We have applied 10-fold cross-validation three times on our dataset to compare the
results. In each iteration, we have shuffled our training dataset to train the network
properly.
Table 1 shows all the validation accuracies and validation losses along with the
number of epochs in each fold for the three runs. Fig. 2 and Fig. 3 represent the
graphical view of the validation accuracy and validation loss in each fold for all
three runs. The lowest validation accuracies were found in fold 10 in all three runs.
On the other hand, we achieved the highest validation accuracy of 90.71% in fold 3 of
our first run using 13 epochs. We achieved average accuracies of 78.41%, 78.04%, and
76.34% in our three runs, respectively. Fig. 4 represents the graphical view of the
average accuracy and average loss of all three runs.
Table 1. Results of tuning GRU hyper-parameters

                 RUN 1                            RUN 2                            RUN 3
Fold             Epochs  Val. acc.  Val. loss     Epochs  Val. acc.  Val. loss     Epochs  Val. acc.  Val. loss
Fold1            9       76.14%     0.5033        10      73.71%     0.5760        9       72.14%     0.5551
Fold2            11      85.14%     0.3924        10      84.86%     0.3868        10      85.71%     0.3614
Fold3            13      90.71%     0.2626        12      89.86%     0.2828        15      90.00%     0.2892
Fold4            11      84.71%     0.3270        12      85.29%     0.3138        11      84.86%     0.3477
Fold5            10      86.86%     0.3193        11      87.00%     0.3458        11      86.29%     0.3570
Fold6            10      86.71%     0.3429        10      84.57%     0.3582        12      85.57%     0.3554
Fold7            10      82.14%     0.4216        9       81.43%     0.4031        9       81.14%     0.4154
Fold8            7       62.14%     0.6760        8       63.86%     0.6811        7       61.57%     0.6781
Fold9            13      78.71%     0.5899        8       71.71%     0.5966        8       70.00%     0.6233
Fold10           6       50.86%     0.6971        8       58.14%     0.6883        6       46.14%     0.6939
Fig. 2. Validation accuracy of 10-fold cross-validation in the three runs
Fig. 3. Validation loss of 10-fold cross-validation in the three runs
Fig. 4. Graphical view of (a) average accuracy and (b) average loss in the three runs
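As a sanity check, the reported per-run averages can be reproduced directly from the fold-wise validation accuracies in Table 1:

```python
# Sanity check: reproduce the reported per-run averages from Table 1.
run1 = [76.14, 85.14, 90.71, 84.71, 86.86, 86.71, 82.14, 62.14, 78.71, 50.86]
run2 = [73.71, 84.86, 89.86, 85.29, 87.00, 84.57, 81.43, 63.86, 71.71, 58.14]
run3 = [72.14, 85.71, 90.00, 84.86, 86.29, 85.57, 81.14, 61.57, 70.00, 46.14]

for name, accs in (("RUN 1", run1), ("RUN 2", run2), ("RUN 3", run3)):
    print(f"{name}: average validation accuracy = {sum(accs) / len(accs):.2f}%")
# Prints 78.41%, 78.04%, and 76.34%, matching the averages reported above.
```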
Table 2 shows the comparison of our system with Hoque et al. [3]. For their analysis,
Hoque et al. [3] randomly split their data into 80% as a training set and 20% as a
test set. However, a single random split is not the most rigorous way to evaluate a
model, because it does not ensure the participation of all data in training. Hence, in
our analysis, we applied 10-fold cross-validation three times, shuffling the dataset
each time, to achieve a more reliable result. 10-fold cross-validation is one of the
most popular and accepted techniques for evaluating a model, since it ensures the
participation of all data in training the model. We achieved a highest average
accuracy of 78.41% and a lowest average accuracy of 76.34%. On the other hand, Hoque
et al. [3] achieved a highest accuracy of 77.85% using BLSTM and a lowest accuracy of
59.21% using GaussianNB.
Table 2. Comparison of our system with Hoque et al. [3]

                    Our system    Hoque et al. [3]
Highest accuracy    78.41%        77.85%
Lowest accuracy     76.34%        59.21%
5 Conclusion
There exist only a few research works on Bangla text sentiment analysis. For this
reason, datasets of Bangla text are rarely available. Sentiment analysis is an
emerging research topic, and we should try to build more accurate models for native
languages. In our research, we have used an existing dataset and trained a GRU model
on it. Our system outperforms the previous one. Some limitations remain in our
research. In the future, we want to apply more preprocessing techniques and other
feature extraction methods to our data to get better results. We also want to use
other classification algorithms to compare their performances. We want to continue our
study further to develop multi-class sentiment analysis or emotion analysis.
Acknowledgement:
This research work has been funded by Information and Communication Technology
(ICT) Division, Ministry of Post, Telecommunication, and Information Technology,
Government of the People’s Republic of Bangladesh through ICT fellowship.
References:
1. A. Aziz Sharfuddin, M. Nafis Tihami and M. Saiful Islam, "A Deep Recurrent Neural
Network with BiLSTM model for Sentiment Classification," International Conference on
Bangla Speech and Language Processing (ICBSLP), Sylhet, 2018, pp. 1-4, doi:
10.1109/ICBSLP.2018.8554396.
2. N. Hossain, M. R. Bhuiyan, Z. N. Tumpa and S. A. Hossain, "Sentiment Analysis of
Restaurant Reviews using Combined CNN-LSTM," 11th International Conference on
Computing, Communication and Networking Technologies (ICCCNT), Kharagpur, India,
2020, pp. 1-5, doi: 10.1109/ICCCNT49239.2020.9225328.
3. M. T. Hoque, A. Islam, E. Ahmed, K. A. Mamun and M. N. Huda, "Analyzing
Performance of Different Machine Learning Approaches With Doc2vec for Classifying
Sentiment of Bengali Natural Language," International Conference on Electrical,
Computer and Communication Engineering (ECCE), Cox's Bazar, Bangladesh, 2019, pp.
1-5, doi: 10.1109/ECACE.2019.8679272.
4. X. Wang, C. Zhang, Y. Ji, L. Sun, L. Wu and Z. Bao, "A Depression Detection Model
Based on Sentiment Analysis in Micro-blog Social Network", Lecture Notes in
Computer Science, 2013, pp. 201-213, doi: 10.1007/978-3-642-40319-4_18.
5. A. H. Uddin, D. Bapery and A. S. Mohammad Arif, "Depression Analysis of Bangla Social
Media Data using Gated Recurrent Neural Network," 1st International Conference on
Advances in Science, Engineering and Robotics Technology (ICASERT), Dhaka,
Bangladesh, 2019, pp. 1-6, doi: 10.1109/ICASERT.2019.8934455.
6. N. Hossain, M. R. Bhuiyan, Z. N. Tumpa and S. A. Hossain, "Sentiment Analysis of
Restaurant Reviews using Combined CNN-LSTM," ICCCNT, Kharagpur, India, 2020, pp.
1-5, doi: 10.1109/ICCCNT49239.2020.9225328.
7. K. M. A. Hasan, Mosiur Rahman and Badiuzzaman, "Sentiment detection from Bangla text
using contextual valency analysis," 2014 17th International Conference on Computer and
Information Technology (ICCIT), Dhaka, 2014, pp. 292-295, doi:
10.1109/ICCITechn.2014.7073151.
8. N. Irtiza Tripto and M. Eunus Ali, "Detecting Multilabel Sentiment and Emotions from
Bangla YouTube Comments," ICBSLP, Sylhet, 2018, pp. 1-6, doi:
10.1109/ICBSLP.2018.8554875.
9. M. Al-Amin, M. S. Islam and S. Das Uzzal, "Sentiment analysis of Bengali comments with
Word2Vec and sentiment information of words," ECCE, Cox's Bazar, 2017, pp. 186-190,
doi: 10.1109/ECACE.2017.7912903.
10. A. Z. Riyadh, N. Alvi and K. H. Talukder, "Exploring human emotion via Twitter," ICCIT,
Dhaka, 2017, pp. 1-5, doi: 10.1109/ICCITECHN.2017.8281813.
11. X. Li, J. Li, Y. Qu and D. He, "Gear Pitting Fault Diagnosis Using Integrated CNN and
GRU Network with Both Vibration and Acoustic Emission Signals", Applied Sciences,
2019, vol. 9, no. 4, p. 768, doi: 10.3390/app9040768.
12. B. Liu, C. Fu, A. Bielefield and Y. Liu, "Forecasting of Chinese Primary Energy
Consumption in 2021 with GRU Artificial Neural Network", Energies, 2017, vol. 10, no.
10, p. 1453, doi: 10.3390/en10101453.