
Sentiment analysis on large scale Amazon product reviews


Abstract

The world is becoming increasingly digital, and e-commerce is gaining ground by putting products within reach of customers who no longer have to leave their homes. As people rely more on buying online, the importance of reviews keeps growing. To choose a product, a customer may have to read thousands of reviews. With machine learning, this becomes much easier if a model is used to polarize those reviews and learn from them. We applied supervised learning methods to a large-scale Amazon dataset to polarize the reviews and obtained satisfactory accuracy.
2018 IEEE International Conference on Innovative Research and Development (ICIRD)
978-1-5386-5283-1/18/$31.00 ©2018 IEEE
Keywords: sentiment analysis, pool-based active learning, feature extraction, text classification, machine learning.
As commerce has moved almost entirely onto online platforms, people trade products through many different e-commerce websites. Reading reviews before buying has therefore become common practice, and customers increasingly rely on reviews when deciding what to buy. Analyzing the data in those customer reviews to make it more useful is an essential field nowadays. Even in this age of machine learning, reading thousands of reviews to understand a product is time consuming, whereas polarizing the reviews of a particular category shows its popularity among buyers all over the world.
The objective of this paper is to categorize positive and negative customer feedback on different products and to build a supervised learning model that polarizes a large number of reviews. A study on Amazon last year revealed that over 88% of online shoppers trust reviews as much as personal recommendations. A large number of positive reviews on an online item is a powerful statement of the item's legitimacy. Conversely, books, or any other online items, without reviews leave potential buyers in a state of distrust. Quite simply, more reviews look more convincing. People value the opinions and experience of others, and the reviews of an item are the only way to understand others' impressions of the product. Opinions collected from users' experiences with specific products or topics directly influence future customer purchase decisions [1]. Similarly, negative reviews often cause sales loss [2]. Understanding customer feedback and polarizing it accordingly over a large amount of data is therefore the goal. Some similar work has been done on Amazon datasets. The author of [5] performed opinion mining over a small set of Amazon product reviews to understand the polarized attitudes towards the
In our model, we used both manual labeling and an active learning approach to label our datasets. In the active learning process, different classifiers are run and their accuracy is measured until it reaches a satisfactory level. After obtaining a satisfactory result, we took the labeled datasets and processed them. From the processed dataset we extracted features, which were then classified by different classifiers. To obtain higher accuracy, we combined two kinds of feature extraction approaches: the bag of words approach and the TF-IDF & Chi square approach.
Tanjim Ul Haque, Nudrat Nawal Saber, Faisal Muhammad Shah
Department of Computer Science & Engineering
Ahsanullah University of Science & Technology
Dhaka, Bangladesh
Much research related to product reviews, sentiment analysis and opinion mining has been published recently. In [3], Elli and Wang extracted sentiment from reviews and analyzed the results to build a business model. They claimed that the demonstrated tools were robust enough to give them high accuracy, and that the use of business analytics made their decisions more appropriate. They also worked on detecting emotions from reviews, inferring gender from names, and detecting fake reviews. The programming languages used were Python and R, and the main classifiers were Multinomial Naïve Bayes (MNB) and support vector machine (SVM). In [4], the authors applied existing supervised learning algorithms to predict a review's rating on a given numerical scale using only its text. They used hold-out validation with 70% of the data for training and 30% for testing, and used different classifiers to determine the precision and recall values. The author of [5] applied and extended current work in natural language processing and sentiment analysis to data from Amazon review datasets. Naïve Bayes and decision list classifiers were used to tag a given review as positive or negative. The reviews were selected from the books and Kindle sections of Amazon. The authors of [6] aimed to build a system that visualizes review sentiment in the form of charts. They scraped the data from Amazon URLs and preprocessed it, then applied NB, SVM and maximum entropy. Because the paper presents summarizing product reviews as its main point, no accuracy is reported; the results are shown in statistical charts. In [7], the authors built a model for predicting product ratings from the review text using a bag-of-words representation. The models tested used unigrams and bigrams.
They used a subset of Amazon video game user reviews from UCSD. Time-based models did not work well, as the variance in average rating between years, months or days was relatively small. Between unigrams and bigrams, unigrams produced the more accurate results, and popular unigrams were extremely useful predictors of ratings because of their larger variance. Unigrams performed 15.89% better than bigrams. In [8], various feature extraction and selection techniques for sentiment analysis were evaluated. The authors first collected an Amazon dataset and then preprocessed it to remove stop words and special characters. They applied phrase-level, single-word and multiword feature selection or extraction techniques, with Naive Bayes as the classifier, and concluded that Naive Bayes gives better results at the phrase level than for single words and multiwords. The main drawback of this paper is that only the Naive Bayes classifier was used, which does not give a sufficient basis for comparison. The work in [9] used simpler algorithms, so it is easy to understand. The authors used support vector machine (SVM), logistic regression and decision tree methods. The system gives high accuracy with SVM, but it cannot work properly on huge datasets. In [10], TF-IDF is used as an additional experiment, and ratings are predicted using bag of words, but only a few models are used: a linear regression model evaluated with root mean square error. Building on the related works above, we tried to make our work more efficient by choosing the best ideas from them and applying them together.
In our system we used a large amount of data, which gave reliable results and allowed better decisions. Moreover, we used an active learning approach to label the datasets, which can dramatically accelerate many machine learning tasks. Our system also combines several types of feature extraction methods. To the best of our knowledge, our proposed approach gave higher accuracy than the existing research works.
Amazon is one of the largest e-commerce sites, and it hosts an enormous number of reviews. We used the Amazon product data provided by the researchers of [14]. The dataset was unlabeled, so to use it in a supervised learning model we had to label the data. We used three JSON files, in which the data is structured as follows:
"reviewerID": ID of the reviewer
"asin": ID of the product
"reviewerName": name of the reviewer
"helpful": helpfulness rating of the review
"reviewText": text of the review
"overall": rating of the product
"summary": summary of the review
"reviewTime": time of the review (raw)
We selected three categories of Amazon products: Electronics reviews, Cell Phones and Accessories reviews, and Musical Instruments reviews, which together comprise approximately 48,500 product reviews. Of these, 21,600 reviews are from cell phones, 24,352 from electronics and 2,548 from musical instruments. Of the available fields, we used reviewText and overall to analyze review polarity. The figure below gives an overview of our methodology:
Figure 1: Work Process
A. Data Acquisition
We acquired our dataset as three JSON files and labeled it. Because we have a large number of reviews, labeling them all manually was practically impossible for us. We therefore preprocessed our data and used an active learner to label the datasets. Amazon reviews come with a 5-star rating, and 3-star ratings are generally considered neutral, meaning neither positive nor negative. We therefore discarded every review with a 3-star rating from our dataset and proceeded to the next step, labeling the dataset, with the remaining reviews.
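The acquisition step described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes each file stores one JSON object per line, the helper names `read_reviews` and `drop_neutral` are hypothetical, and the sample records are made up.

```python
import json

def read_reviews(lines):
    """Yield (reviewText, overall) pairs from JSON-lines input (assumed layout)."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        yield record["reviewText"], float(record["overall"])

def drop_neutral(reviews, neutral_rating=3.0):
    """Discard reviews whose star rating equals the neutral rating."""
    return [(text, rating) for text, rating in reviews if rating != neutral_rating]

# Made-up sample records using only the two fields the paper relies on.
sample_lines = [
    json.dumps({"reviewText": "Great sound.", "overall": 5.0}),
    json.dumps({"reviewText": "It is okay.", "overall": 3.0}),
    json.dumps({"reviewText": "Stopped working.", "overall": 1.0}),
]
kept = drop_neutral(read_reviews(sample_lines))
```

Only the text and rating are carried forward here because those are the two fields used for polarity analysis.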
Pool Based Active Learning:
Active learning is a special case of semi-supervised learning. The key idea is that a learning algorithm can perform better with less training if it is allowed to choose the data from which it learns [2]. An active learning system tries to overcome the data labeling bottleneck by querying an expert, or oracle, for the labels of selected unlabeled instances. Because manually labeling the whole dataset is practically impossible, we reduced the labeling effort by using this special kind of semi-supervised approach, known as pool-based active learning. The process needs some manually pre-labeled reviews as training and testing sets, together with a pool of unlabeled data. From the pool, the learning method asks the oracle or user to label a few instances, and then runs some classifiers to calculate the accuracy. The accuracy shows whether the decision boundary separates most of the instances into the two classes; the higher the accuracy, the more reliably the data is being labeled. If the accuracy is greater than or equal to 90%, we take that data and combine it with the pre-labeled data to obtain our labeled dataset. If not, we again ask the oracle to label some more data. Once the accuracy reaches 90%, we consider the data to be labeled.
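The loop described above can be sketched in a few lines. This is only an illustration under stated assumptions: the paper does not say which classifier or query strategy the active learner uses, so the word log-odds scorer and the uncertainty sampling rule below are stand-ins, and all function names are hypothetical.

```python
import math
from collections import Counter

def train(labeled):
    """Count word occurrences per class from (text, label) pairs."""
    counts = {"pos": Counter(), "neg": Counter()}
    for text, label in labeled:
        counts[label].update(text.lower().split())
    return counts

def score(counts, text):
    """Smoothed log-odds: above 0 leans positive, below 0 leans negative."""
    return sum(math.log((counts["pos"][w] + 1) / (counts["neg"][w] + 1))
               for w in text.lower().split())

def predict(counts, text):
    return "pos" if score(counts, text) >= 0 else "neg"

def accuracy(counts, test_set):
    hits = sum(predict(counts, text) == label for text, label in test_set)
    return hits / len(test_set)

def active_learn(labeled, pool, test_set, oracle, threshold=0.9):
    """Query the oracle until test accuracy reaches the threshold."""
    labeled, pool = list(labeled), list(pool)
    while True:
        counts = train(labeled)
        if accuracy(counts, test_set) >= threshold or not pool:
            break
        # Uncertainty sampling: ask about the instance nearest the boundary.
        query = min(pool, key=lambda text: abs(score(counts, text)))
        pool.remove(query)
        labeled.append((query, oracle(query)))
    # Label the rest of the pool with the final model.
    auto = [(text, predict(counts, text)) for text in pool]
    return labeled + auto, accuracy(counts, test_set)

# Toy run: the oracle stands in for the human annotator.
seed = [("good phone", "pos"), ("bad phone", "neg")]
pool = ["great battery", "terrible battery", "great value", "terrible value"]
test_set = [("great", "pos"), ("terrible", "neg")]
result, final_accuracy = active_learn(
    seed, pool, test_set, oracle=lambda t: "pos" if "great" in t else "neg")
```

The 0.9 default mirrors the 90% threshold stated in the text; everything else is a design choice made for the sketch.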
B. Data Pre-Processing
Tokenization: Tokenization is the process of splitting a sequence of characters into units such as words, keywords, phrases, symbols and other elements, known as tokens. Tokens can be individual words, phrases or even whole sentences. During tokenization, some characters such as punctuation marks are discarded. The tokens serve as input for further processes such as parsing and text mining.
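As a small illustration of word-level tokenization, the sketch below lowercases the text and keeps runs of letters, digits and apostrophes. The regular expression is an assumption made for this example; the paper does not specify its tokenizer.

```python
import re

def tokenize(text):
    """Split text into lowercase word tokens, discarding punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())

tokens = tokenize("Great phone, doesn't overheat!")
```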
Removing Stop Words: Stop words are words in a sentence that carry little meaning for text mining. We generally ignore these words to improve the accuracy of the analysis. The set of stop words differs by language and context; English has many stop words, such as articles and prepositions.
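Stop-word removal amounts to filtering tokens against a fixed set. The set below is a tiny, hypothetical sample chosen for illustration; a practical list would be much longer.

```python
# Illustrative stop-word set only; not the list used in the paper.
STOP_WORDS = {"a", "an", "the", "is", "it", "this", "of", "and", "to", "in"}

def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Keep only the tokens that are not in the stop-word set."""
    return [t for t in tokens if t.lower() not in stop_words]

filtered = remove_stop_words(["this", "phone", "is", "great"])
```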
POS tagging: The process of assigning a part of speech to each word is called part-of-speech tagging, generally referred to as POS tagging. Parts of speech include nouns, verbs, adverbs, adjectives, pronouns, conjunctions and their sub-categories. A part-of-speech tagger, or POS tagger, is a program that does this job.
C. Feature Extraction
Bag of Words: Bag of words is a feature extraction process that represents text in a simplified form, and it is used in natural language processing and information retrieval. In this model, a text or document is represented as the bag (multiset) of its words. Put simply, a bag of words in sentiment analysis is a list of useful words. We used the bag of words approach to extract our feature sets. After preprocessing the dataset, we used POS tagging to separate the different parts of speech, and from these we selected the nouns and adjectives to create a bag of words. We then ran it through supervised learning to obtain our results, along with the most frequently used words in the review dataset.
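The noun-and-adjective bag of words can be sketched with a counter over tagged tokens. The input is assumed to be already POS-tagged with Penn Treebank style tags (noun tags start with `NN`, adjective tags with `JJ`); that tag set and the function name are assumptions for this example.

```python
from collections import Counter

def bag_of_words(tagged_tokens, keep_prefixes=("NN", "JJ")):
    """Count words whose POS tag marks them as a noun or an adjective."""
    return Counter(word.lower() for word, tag in tagged_tokens
                   if tag.startswith(keep_prefixes))

# Hand-tagged toy sentence standing in for POS tagger output.
tagged = [("The", "DT"), ("battery", "NN"), ("is", "VBZ"),
          ("great", "JJ"), ("battery", "NN")]
bag = bag_of_words(tagged)
top_words = bag.most_common(1)
```

`most_common` gives the top used words mentioned in the text directly from the same counter.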
TF-IDF: TF-IDF is an information retrieval technique that weighs a term's frequency (TF) against its inverse document frequency (IDF). Each word or term has its own TF and IDF scores, and the product of the two is called the TF*IDF weight of that term. Put simply, the higher the TF*IDF score (weight), the rarer the term across the corpus, and vice versa. The TF of a word is the frequency of the word in a document.
The IDF of a word is a measure of how significant that term is throughout the corpus.
In a search setting, content whose words have a high TF*IDF weight tends to rank among the top search results, so anyone can:
1. Stop worrying about using the stop-words,
2. Successfully find words with higher search volumes and lower competition.
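To make the weighting concrete, the sketch below computes TF*IDF for tokenized documents. It assumes one common variant, TF as the term count divided by document length and IDF as log(N / document frequency); the paper does not state which variant was used, and the function name is hypothetical.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

docs = [["good", "phone"], ["bad", "phone"]]
weights = tf_idf(docs)
```

In this toy corpus "phone" appears in every document, so its weight is zero, while "good" and "bad" receive positive weights.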
Chi Square: Chi square (χ²) is a statistic that measures how much the observed data differ from the expected data.
In this approach we preprocessed our dataset and then divided the data into training and testing sets. We used a pipeline to apply TF-IDF, Chi square and the classifiers to our dataset and obtained the results.
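For feature selection, chi square is typically computed per term from a 2x2 table of term presence against class. The sketch below uses the standard closed form for a 2x2 contingency table; the function names and the toy counts are assumptions, and this is not necessarily the exact computation performed inside the authors' pipeline.

```python
def chi_square(a, b, c, d):
    """Chi square for a 2x2 table.

    a: positive documents containing the term
    b: negative documents containing the term
    c: positive documents without the term
    d: negative documents without the term
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: no evidence either way
    return n * (a * d - b * c) ** 2 / denom

def top_terms(term_tables, k):
    """Return the k terms with the highest chi-square score."""
    scored = {term: chi_square(*table) for term, table in term_tables.items()}
    return sorted(scored, key=scored.get, reverse=True)[:k]

# Made-up counts: "great" occurs only in positive reviews, "phone" in both.
tables = {"great": (20, 0, 0, 20), "phone": (10, 10, 10, 10)}
selected = top_terms(tables, 1)
```

A term spread evenly across both classes scores zero, so it is dropped before classification.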
Algorithm for proposed approach
Input: Labeled data = labeled data obtained after active learning
Output: Accuracy of classifiers; precision, recall and F1-measure for positive and negative values
// product review polarity accuracy
1. Load labeled data (positive & negative)
2. Preprocess labeled data
3. for every Xi in X = {X1…Xn} in labeled data
4.     extractfeature(Xi)
5. Cross validate into training & testing sets
6. classifier.train()
7. accuracy = classifier.accuracy()
8. majority_voting(accuracy) using vote classifier
9. show result(accuracy, precision, recall, f1measure)
extractfeature(text) returns n-gram features
majority_voting(accuracy) returns highest accuracy
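The voting step can be illustrated with a plain majority vote over the labels predicted by several classifiers. This is a generic sketch with a hypothetical function name; the paper's own `majority_voting` is described only as returning the highest accuracy, so the details may differ.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists into one list by majority.

    predictions: list of label lists, one per classifier, all the same length.
    """
    voted = []
    for labels in zip(*predictions):
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted

# Three hypothetical classifiers voting on three reviews.
combined = majority_vote([
    ["pos", "neg", "pos"],
    ["pos", "pos", "neg"],
    ["neg", "neg", "neg"],
])
```

With an odd number of classifiers and two classes, ties cannot occur.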
D. Evaluating Measures:
Evaluation metrics play an important role in measuring classification performance. Accuracy is the most common measure for this purpose. The accuracy of a classifier on a given test dataset is the percentage of instances in that dataset that the classifier classifies correctly [48]. For text mining, accuracy alone is not always enough to support a proper decision, so we also used other metrics to evaluate classifier performance. Three important measures are commonly used: precision, recall and F-measure. Before discussing these measures, there are some terms we need to be comfortable with:
TP (True Positive) is the number of positive instances correctly classified as positive
FP (False Positive) is the number of negative instances incorrectly classified as positive
FN (False Negative) is the number of positive instances incorrectly classified as negative
TN (True Negative) is the number of negative instances correctly classified as negative
Precision: Precision measures the exactness of a classifier, that is, how many of the returned documents are correct. Higher precision means fewer false positives, while lower precision means more false positives. Precision (P) is the ratio of the number of instances correctly classified as positive to the total number of instances classified as positive. It can be defined as
$$P = \frac{TP}{TP + FP}$$
Recall: Recall measures the sensitivity of a classifier, that is, how many of the positive instances it returns. Higher recall means fewer false negatives. Recall (R) is the ratio of the number of instances correctly classified as positive to the total number of actual positive instances. This can be shown as
$$R = \frac{TP}{TP + FN}$$
F-Measure: Combining precision and recall produces a single metric known as the F-measure, which is the weighted harmonic mean of precision and recall. With equal weights it can be defined as
$$F = \frac{2 \cdot P \cdot R}{P + R}$$
Accuracy: Accuracy measures how often the classifier makes the correct prediction. It is the ratio between the number of correct predictions and the total number of predictions:
$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN}$$
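The four measures can be computed directly from the confusion counts TP, FP, FN and TN. The sketch below uses the standard definitions with the equally weighted F-measure (F1); the function name and the example counts are made up for illustration.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return precision, recall, F1 and accuracy from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}

# Hypothetical counts for 200 test reviews.
scores = classification_metrics(tp=90, fp=10, fn=30, tn=70)
```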
Several machine learning algorithms were used in our experiment: Naïve Bayes, Support Vector Machine Classifier (SVC), Stochastic Gradient Descent (SGD), Logistic Regression (LR), Random Forest and Decision Tree. We tried different cross validation settings, and 10-fold gave the best accuracy. We ran the best classifiers on the three categories of product reviews and examined the results according to the evaluation measures. The classifiers were applied to the output of different feature selection processes, and the common features from TF-IDF and bag of words gave the best results for all the datasets.
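As a reminder of how k-fold cross validation partitions the data, the sketch below yields train and test index lists for each fold. It does not shuffle and the function name is hypothetical; in practice the data would usually be shuffled first.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

folds = list(k_fold_indices(25, k=10))
```

Each sample lands in exactly one test fold, so every review is used for testing once and for training k - 1 times.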
[Table: column 5 Fold; rows Linear Support Vector Machine, Naïve Bayes, Stochastic Gradient Descent, Random Forest, Logistic Regression, Decision Tree]
Table-1: Experiment result for cellphone & accessories data
[Table: column 5 Fold; rows Linear Support Vector Machine, Naïve Bayes, Stochastic Gradient Descent, Random Forest, Logistic Regression, Decision Tree]
Table-2: Experiment result for musical instruments data
[Table: columns 10 Fold, 5 Fold; rows Linear Support Vector Machine, Naïve Bayes, Gradient Descent, Random Forest, Decision Tree]
Table-3: Experiment result for electronics data
From all the experiments it can be seen that the support vector machine provided the highest accuracy on every dataset. The working dataset is quite large, and the support vector machine worked well on this large-scale dataset without overfitting it. Among these results, the highest accuracy was 94.02%.
In this section we compare our research with other related works. The comparative analysis is based on accuracy. The comparison can be seen in the table below.
Paper Title; Year &
Amazon Reviews, business analytics with sentiment analysis [11]; Review of
Sentiment Analysis in Amazon Reviews Using Probabilistic Machine Learning [5]; 2013 (6); reviews of books, reviews of Kindle
Mining comparative opinions from customer reviews for competitive intelligence [12]; 2011 (234); Customer product
Amazing: A sentiment mining & Retrieval System; 2009 (125); E commerce
"Feature Selection Methods in Sentiment Analysis and Sentiment Classification of Amazon Product Reviews"; Review on books, Review on music, Review on
Review of, Review of, Reviews of music
Table-4: Comparative Analysis
The researches listed in the table used different pre-processing steps and feature extraction processes. In our research we tried to improve on all of these extraction processes and preprocessing steps and to pick the best accuracy among them. The pool-based active learning process contributed to labeling and selecting the best reviews as our training and testing data. The use of different preprocessing steps helped filter out unnecessary words. Finally, by taking the best features extracted from the datasets and learning with suitable classifiers, it was possible to attain greater accuracy. From the table it can be concluded that the approaches used in our proposed model are more effective and could achieve a better result than some of the related works.
In this research we proposed a supervised learning model to polarize a large, unlabeled product review dataset, using a mix of two kinds of feature extraction approaches. We described the basic theory behind the model, the approaches used in our research and the performance measures for the experiment conducted over quite a large dataset. We also reviewed different research papers on sentiment analysis over text-based datasets and compared our results with some of the similar works on product reviews. We were able to achieve accuracy over 90%, with F1 measure, precision and recall also over 90%. We tried different simulations using cross validation, different training-testing ratios and different feature extraction processes, comparing varying amounts of data to achieve promising results. In most cases 10-fold cross validation provided better accuracy, while the Support Vector Machine (SVM) provided the best classification results. It is hard to gather a large gold standard dataset for this purpose, as e-commerce sites limit the data they make public. Scraping can also be a problem, as we cannot scrape enough data to be representative of real-life public reviews across different products.
Several future works can improve the model and make it more effective in practical cases. Our future work includes applying PCA (Principal Component Analysis) in the active learning process to fully automate data labeling with less assistance from the oracle. The model can be incorporated into programs that interact with customers seeking a score for a particular product. As we used a large-scale dataset, we can apply the model to local market sites to get better accuracy and usability. Lastly, we will try to continue this research until we generalize this model to all kinds of text-based reviews and comments.
[1] Samha, Xu, Xia, Wong & Li. "Opinion Annotation in Online Chinese Product Reviews." In Proceedings of LREC conference, 2008.
[2] Nina Isabel Holleschovsky. "The social influence factor: Impact of online product review characteristics on consumer purchasing decisions." 5th IBA Bachelor Thesis Conference, Enschede, The Netherlands, 2015.
[3] Elli, Maria Soledad, and Yi-Fan Wang. "Amazon Reviews, business analytics with sentiment analysis." 2016.
[4] Xu, Yun, Xinhui Wu, and Qinxia Wang. "Sentiment Analysis of Yelp's Ratings Based on Text Reviews." (2015).
[5] Rain, Callen. "Sentiment Analysis in Amazon Reviews Using Probabilistic Machine Learning." Swarthmore College
[6] Bhatt, Aashutosh, et al. "Amazon Review Classification and Sentiment Analysis." International Journal of Computer Science and Information Technologies 6.6 (2015): 5107-5110.
[7] Chen, Weikang, Chihhung Lin, and Yi-Shu Tai. "Text-Based Rating Predictions on Amazon Health & Personal Care Product Review." (2015).
[8] Shaikh, Tahura, and Deepa Deshpande. "Feature Selection Methods in Sentiment Analysis and Sentiment Classification of Amazon Product Reviews." (2016).
[9] Nasr, Mona Mohamed, Essam Mohamed Shaaban, and Ahmed Mostafa Hafez. "Building Sentiment analysis Model using Graphlab." IJSER, 2017.
[10] Wang, Mingshan. "Text mining for yelp dataset challenge." University of California San Diego, (2017).
[11] Elli, Maria Soledad, and Yi-Fan Wang. "Amazon Reviews, business analytics with sentiment analysis." 2016.
[12] Xu, Kaiquan, et al. "Mining comparative opinions from customer reviews for Competitive Intelligence." Decision Support Systems 50.4 (2011): 743-754.
[13] Miao, Q., Li, Q., & Dai, R. (2009). AMAZING: A sentiment mining and retrieval system. Expert Systems with Applications, 36(3), 7192-7198.
[14] He, Ruining, and Julian McAuley. "Ups and downs: Modeling the visual evolution of fashion trends with one-class collaborative filtering." Proceedings of the 25th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 2016.
... E-commerce refers to the purchase and sale of products and services through the internet. It contains a large number of data, processes, and tools for customers and sellers, such as smart device shopping, cash on delivery, and online payment encryption [1]. According to a research report, due to the covid-19 pandemic, online sales have increased [16]. ...
... Consumers value other people's opinions and experiences and reading a review on a product is the sole method to learn what other customers think about it. Opinions derived from consumers' experiences with certain products have a direct impact on future customer purchases [1]. Negative and slightly negative ratings, on the contrary, frequently result in sales loss. ...
... In [1], authors used amazon reviews data to perform research only in English language and applied six machine learning algorithms, where Linear SVM achieved the highest accuracy of 94.02%. By observing amazon review section, we noticed that customers express a variety of sentiments in their reviews. ...
Conference Paper
Customers of e-commerce platforms exchange their thoughts with such kinds of languages. In the age of the present competitive business world, sentiment analysis is widely used in the e-commerce industry to improve efficiency and better understand to make business decisions. Earlier research on sentiment analysis was in English but there is no such significant work in Bangla language and Romanized Bangla language reviews. Therefore, we have developed a machine learning model where reviews on three different languages (Bangla, English, and Romanized Bangla) are used and applied six machine learning algorithms. We have demonstrated a comparative analysis with existing work and have discussed the detailed accuracy, precision, recall, F1 scores, and ROC area. We have prepared three datasets and labeled all the reviews data as Negative, Positive, Neutral, Slightly Negative, and Slightly Positive sentiment. To perform the analysis, the preprocessed datasets were trained using machine learning techniques, and the model performances is evaluated. For the Bangla dataset, Support Vector Machine(SVM) algorithm performed best by achieving 94% accuracy and for the English and Romanized Bangla dataset, Random Forest algorithm performed best by achieving 93% and 94% accuracy respectively.
... A customer first goes through several products reviews before making the decision of buying that product. In today's where machine learning is assuming great importance, the models which would polarize reviews into positive or negative were developed [2]. So supervised learning methods were used on large scale amazon datasets to polarize it and get its outcomes. ...
... So supervised learning methods were used on large scale amazon datasets to polarize it and get its outcomes. The best accuracy of 93.2% was obtained using Linear Support Vector Machine algorithm in [2]. ...
Conference Paper
Full-text available
Amazon is the most popular online shopping market for most people in the world today. Anything from daily necessities to luxurious items can be bought from here. And especially in recent times where people have to avoid going out to crowded places, platforms like Amazon have emerged as the go-to solution. So, when people want to buy products from these platforms it is important for them to have a look at the reviews before being assured about it. But every product has thousands of reviews for it and it's not easy to analyze them quickly. This paper presents an implementation of a Amazon review sentiment analysis with the web application. Various algorithms are implemented for the experimentation purpose. The combination of logistic regression with CountVectorizer performed well in the term of accuracy. Using the proposed methodology, the user can search for a product on this web-based App and analysis of product reviews, price ranges, ratings and much more will be displayed to the user. The accuracy of the different algorithms is reported in this paper.
... Sentiment classification is an application of natural language processing (NLP) that analyzes subjective texts with emotions to determine the views, preferences, and tendencies of a text. Most prior works [7,[17][18][19]31] belong to supervised sentiment learning, which typically requires a training phase that trains a classifier with labelled training texts followed by an inference phase that identifies the polarity of a testing text based on the trained classifier. Their primary concern is to obtain a high accuracy classifier from a large set of collected and labelled datasets, assuming infinite time and 1 ...
... It is noteworthy that, PLStream is continuously updated and can be used to label sentences with even unforeseen vocabularies. This differs significantly from offline approaches [7,19], where the pre-trained model can quickly become outdated, as both the vocabulary and the polarity model evolve, requiring periodically re-training with labelled datasets. ...
Full-text available
Many of the existing sentiment analysis techniques are based on supervised learning, and they demand the availability of valuable training datasets to train their models. When dataset freshness is critical, the annotating of high speed unlabelled data streams becomes critical but remains an open problem. In this paper, we propose PLStream, a novel Apache Flink-based framework for fast polarity labelling of massive data streams, like Twitter tweets or online product reviews. We address the associated implementation challenges and propose a list of techniques including both algorithmic improvements and system optimizations. A thorough empirical validation with two real-world workloads demonstrates that PLStream is able to generate high quality labels (almost 80% accuracy) in the presence of high-speed continuous unlabelled data streams (almost 16,000 tuples/sec) without any manual efforts.
... Such rare feature problems are also prevalent in a variety of fields. Some other examples include the predication of user ratings with absence/presence indicators of hundreds of keywords extracted from customer reviews (Haque et al., 2018), and the studies of the so-called gut-brain axis with absence/presence data of a large number of microbes (Schloss et al., 2009). ...
Statistical learning with a large number of rare binary features is commonly encountered in analyzing electronic health records (EHR) data, especially in the modeling of disease onset with prior medical diagnoses and procedures. Dealing with the resulting highly sparse and large-scale binary feature matrix is notoriously challenging as conventional methods may suffer from a lack of power in testing and inconsistency in model fitting while machine learning methods may suffer from the inability of producing interpretable results or clinically-meaningful risk factors. To improve EHR-based modeling and utilize the natural hierarchical structure of disease classification, we propose a tree-guided feature selection and logic aggregation approach for large-scale regression with rare binary features, in which dimension reduction is achieved through not only a sparsity pursuit but also an aggregation promoter with the logic operator of ``or''. We convert the combinatorial problem into a convex linearly-constrained regularized estimation, which enables scalable computation with theoretical guarantees. In a suicide risk study with EHR data, our approach is able to select and aggregate prior mental health diagnoses as guided by the diagnosis hierarchy of the International Classification of Diseases. By balancing the rarity and specificity of the EHR diagnosis records, our strategy improves both prediction and model interpretation. We identify important higher-level categories and subcategories of mental health conditions and simultaneously determine the level of specificity needed for each of them in predicting suicide risk.
... The main task of sentiment analysis is to determine the polarity expressed in a review (positive, negative, or neutral). In the last twenty years, many sentiment analysis approaches have been applied in various domains: automobiles (Turney, 2002), tourism (Shi and Li, 2011; Valdivia et al., 2017), movies (Pang et al., 2002; Ghorbel and Jacot, 2011; Kennedy and Inkpen, 2006), banks (Turney, 2002) and products (Yang et al., 2020; Cernian et al., 2015; Haque et al., 2018). Returning to automatic text summarization, the main objective of these summarizers is to produce a summary that includes the main ideas in the input document (El-Kassas et al., 2021) in less space (Radev et al., 2002) and to keep repetition to a minimum (Moratanch and Chitrakala, 2017). ...
By dint of the massive daily production of user-generated content (textual reviews) in E-commerce platforms, the need to automatically process it and extract different types of knowledge from it becomes a necessity. In this work, an attempt has been made to summarize some studies that aim to propose systems, which automatically mine textual reviews expressed in natural languages for the purpose of supporting customers’ decision-making process in E-commerce (buying, renting, and booking). The given review is the first work of this type and it includes 44 studies (30 aspect/feature-based summarizers and 14 reputation systems) published from 2004 to 2021. First, it investigates aspect and feature-based summarizers that aim to help customers in making an informed decision toward online entities (products, movies, hotels, services …). Second, it introduces reputation generation systems that seek to provide valuable information about online items. Finally, it provides recommendations for future research directions and open problems.
... In text classification, dealing with sequences that span many time steps is a major challenge [32]. Rao et al. [19] have also identified this problem in traditional neural networks and proposed the LSTM-based model SR-LSTM. ...
Deep neural networks have emerged as a leading approach towards handling many natural language processing (NLP) tasks. Deep networks initially conquered the problems of computer vision. However, dealing with sequential data such as text and sound was a nightmare for such networks, as traditional deep networks are not reliable in preserving contextual information. This may not harm the results in the case of image processing, where we do not care about the sequence, but when we consider the data collected from text for processing, such networks may produce disastrous results. Moreover, establishing sentence semantics in a colloquial text such as Roman Urdu is a challenge. Additionally, the sparsity and high dimensionality of data in such informal text pose a significant challenge for building sentence semantics. To overcome this problem, we propose a deep recurrent architecture, RU-BiLSTM, based on bidirectional LSTM (BiLSTM) coupled with word embedding and an attention mechanism for sentiment analysis of Roman Urdu. Our proposed model uses the bidirectional LSTM to preserve the context in both directions and the attention mechanism to concentrate on more important features. Finally, a dense softmax output layer is used to obtain the binary and ternary classification results. We empirically evaluated our model on two available datasets of Roman Urdu, i.e., RUECD and RUSA-19. Our proposed model outperformed the baseline models on many grounds, and a significant improvement of 6% to 8% is achieved over baseline models. Citation: Chandio, B.A.; Imran, A.S.; Bakhtiar, M.; Daudpota, S.M.; Baber, J. Attention-Based RU-BiLSTM Sentiment Analysis Model for Roman Urdu. Appl. Sci. 2022, 12, 3641.
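As a rough illustration of how an attention mechanism can "concentrate on more important features", the sketch below pools a sequence of hidden states into one context vector using softmax weights. The dot-product scoring against a fixed query vector is an assumption made for this example; the abstract does not specify the scoring function used in RU-BiLSTM, and in a real model the hidden states would come from the BiLSTM and the query would be learned.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    hidden_states: List[List[float]], query: List[float]
) -> Tuple[List[float], List[float]]:
    """Return (attention weights, weighted sum of the hidden states)."""
    # Assumed scoring function: dot product of each hidden state with the query.
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return weights, context


# Toy hidden states for a three-token sequence (hypothetical values).
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]
weights, context = attention_pool(hidden_states, query)
print(weights)
print(context)
```

The third hidden state has the largest score against the query, so it receives the largest weight and contributes most to the context vector that would be passed to the softmax output layer.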
For assessing customer sentiment in Amazon product reviews, this article compares two machine learning algorithms and a deep learning method, BERT (Bidirectional Encoder Representations from Transformers). Machine learning is the most practical approach in the current era of artificial intelligence for training a neural network to handle the majority of real-world issues. In this paper, the real-world scenario of sentiment analysis is considered as a classification problem. Firstly, the data is passed to a model that extracts features using the Term Frequency (TF) and Inverse Document Frequency (IDF) pre-processing methods. Secondly, the Naive Bayes classifier and the Support Vector Machine are used to analyze the sentiment of the consumer comments and compute metrics such as the F1 score. Finally, the input data is fed to BERT, and its F1 score is computed. To summarize, this study provides a detailed comparative analysis of machine learning techniques and deep learning algorithms.
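The TF-IDF weighting and the F1 score named in this abstract can be sketched in a few lines. The version below uses the plain textbook definitions (term frequency normalized by document length, idf = log(N / df)); this is an assumption, since the abstract does not say which variant was used, and common libraries apply smoothed forms. The function names and toy reviews are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Plain TF-IDF weights per document; assumes every document is non-empty."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append(
            {
                term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
                for term, count in counts.items()
            }
        )
    return weights


def f1_score(y_true: List[int], y_pred: List[int], positive: int = 1) -> float:
    """F1 for the `positive` class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Two toy tokenized reviews (hypothetical).
docs = [["good", "product"], ["bad", "product"]]
weights = tf_idf(docs)
print(weights[0])

# Toy labels and predictions (hypothetical): 1 = positive, 0 = negative.
score = f1_score([1, 1, 0, 0], [1, 0, 1, 0])
print(score)
```

A term that appears in every document, such as "product" here, gets weight zero under this definition, while terms that distinguish documents keep a positive weight. Those weights would then be the input features for a classifier such as Naive Bayes or an SVM.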
The Internet is becoming the most useful source of information, ideas, and product evaluations. The ubiquitous use of the Internet has made e-commerce transactions prominent and very practical. The number of daily user reviews for different items keeps growing. This wide range of reviews is useful for both producers and customers. Reading all reviews in order to choose a better product is a difficult task for a prospective consumer. It is advantageous to use customer reviews on popular items from several sources as feedback for manufacturers to improve their products and help tackle the variety of problems consumers face. From a user viewpoint, users may submit their own reviews via various social media, including social networking sites, micro blogs, and forums. For a manufacturer, many social media websites provide application programming interfaces (APIs), allowing developers to gather and analyze data to mine users' opinions. To that end, several opinion mining approaches were developed, in which identifying the direction of a review phrase (e.g., positive versus negative) is one of the main difficulties. We propose a new framework for the classification of sentiments for product evaluation that uses the most broadly utilized ratings. Experimental findings on the Amazon product review dataset indicate that our model can leverage the previous domains’ expertise to lead learning in new fields and can handle ongoing upgrades in multiple areas of products.
Sentiment analysis, also called opinion mining, is the field of study that analyzes people’s opinions, sentiments, evaluations, appraisals, attitudes, and emotions towards entities such as products, services, organizations, individuals, issues, events, topics, and their attributes. This paper is motivated by the importance of sentiment analysis for individuals in general and for large organizations in particular. Graphlab was used to build the sentiment models. Several algorithms, including SVM, logistic regression, and boosted trees, were used along with text feature selection techniques to predict positive and negative sentiments. These classifiers were applied to a hotel reviews dataset obtained from the TripAdvisor website to emulate real customer opinions. The results showed that the SVM classifier combined with the N-grams feature selection technique was superior to the others.
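The N-gram features mentioned in this abstract can be sketched as follows. This is a minimal illustration assuming whitespace tokenization and word-level unigrams plus bigrams; the abstract does not state the tokenizer or the N-gram order used, and the function names and sample sentence are hypothetical.

```python
from collections import Counter
from typing import List


def word_ngrams(text: str, n: int) -> List[str]:
    """Contiguous word n-grams from lowercased, whitespace-split text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i : i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(text: str, max_n: int = 2) -> Counter:
    """Counts of all word n-grams for n = 1 .. max_n."""
    counts: Counter = Counter()
    for n in range(1, max_n + 1):
        counts.update(word_ngrams(text, n))
    return counts


# Hypothetical review sentence.
features = ngram_features("The room was not clean")
print(sorted(features))
```

Bigrams such as "not clean" keep the negation attached to the word it modifies, which unigram counts alone would lose. That is one common reason N-gram features help polarity classifiers.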
This paper presents the design and construction of a Chinese opinion corpus. Based on the observation that opinion expression in Chinese online product reviews differs considerably from that in formal texts such as news, an annotation framework is proposed to guide the construction of an opinion corpus based on online product reviews. The opinionated sentences are manually identified from the review text. Furthermore, for each comment in the opinionated sentences, its 13 describing elements are annotated, including the expressions related to the target product attributes and user opinion expressions as well as the polarity and degree of the opinions. Currently, 12,724 comments are annotated in 10,935 sentences from product reviews. Through statistical observation on the opinion corpus, some interesting characteristics of Chinese opinion expression are presented. This corpus is helpful to support systematic research on Chinese opinion analysis.
Building a successful recommender system depends on understanding both the dimensions of people's preferences as well as their dynamics. In certain domains, such as fashion, modeling such preferences can be incredibly difficult, due to the need to simultaneously model the visual appearance of products as well as their evolution over time. The subtle semantics and non-linear dynamics of fashion evolution raise unique challenges especially considering the sparsity and large scale of the underlying datasets. In this paper we build novel models for the One-Class Collaborative Filtering setting, where our goal is to estimate users' fashion-aware personalized ranking functions based on their past feedback. To uncover the complex and evolving visual factors that people consider when evaluating products, our method combines high-level visual features extracted from a deep convolutional neural network, users' past feedback, as well as evolving trends within the community. Experimentally we evaluate our method on two large real-world datasets from, where we show it to outperform state-of-the-art personalized ranking measures, and also use it to visualize the high-level fashion trends across the 11-year span of our dataset.
With the rapid growth of e-commerce, there are a great number of customer reviews on e-commerce websites. Potential customers usually wade through many online reviews in order to make an informed decision. However, retrieving sentiment information relevant to a customer’s interest remains challenging. Developing a sentiment mining and retrieval system is a good way to overcome the problem of information overload in customer reviews. In this paper, we propose a sentiment mining and retrieval system which mines useful knowledge from consumer product reviews by utilizing data mining and information retrieval technology. A novel ranking mechanism taking temporal opinion quality (TOQ) and relevance into account is developed to meet customers’ information needs. In addition, the trend of customer reviews and the comparison between positive and negative evaluations are presented visually in the system. Experimental results on a real-world data set show the system is feasible and effective.
Competitive Intelligence is one of the key factors for enterprise risk management and decision support. However, the functions of Competitive Intelligence are often greatly restricted by the lack of sufficient information sources about the competitors. With the emergence of Web 2.0, the large numbers of customer-generated product reviews often contain information about competitors and have become a new source of mining Competitive Intelligence. In this study, we proposed a novel graphical model to extract and visualize comparative relations between products from customer reviews, with the interdependencies among relations taken into consideration, to help enterprises discover potential risks and further design new products and marketing strategies. Our experiments on a corpus of Amazon customer reviews show that our proposed method can extract comparative relations more accurately than the benchmark methods. Furthermore, this study opens a door to analyzing the rich consumer-generated data for enterprise risk management.
Bhatt, Aashutosh, et al. "Amazon Review Classification and Sentiment Analysis." International Journal of Computer Science and Information Technologies 6.6 (2015): 5107-5110.
Elli, Maria Soledad, and Yi-Fan Wang. "Amazon Reviews, business analytics with sentiment analysis." 2016.
Rain, Callen. "Sentiment Analysis in Amazon Reviews Using Probabilistic Machine Learning." Swarthmore College (2013).
Chen, Weikang, Chihhung Lin, and Yi-Shu Tai. "Text-Based Rating Predictions on Amazon Health & Personal Care Product Review." (2015).