INTERNATIONAL JOURNAL OF MULTIDISCIPLINARY SCIENCES AND ENGINEERING, VOL. 8, NO. 4, JUNE 2017
[ISSN: 2045-7057] www.ijmse.org 28
Hybrid Tools and Techniques for Sentiment
Analysis: A Review
Munir Ahmad1, Shabib Aftab2, Iftikhar Ali3 and Noureen Hameed4
1-4Department of Computer Science, Virtual University of Pakistan
1munirahmad@gmail.com, 2shabib.aftab@gmail.com
Abstract— Sentiment analysis and opinion mining are closely
coupled. Extensive research is being carried out in these areas
using different methodologies, which classify the sentiment of
a given text as positive, negative or neutral. Tweets, Facebook
posts, user comments on particular topics and reviews of
products, software and movies can be good sources of
information. Business executives can apply sentiment analysis
techniques to such data for future planning and forecasting.
Because the data comes from multiple sources and is written by
users from any part of the world, noise such as spelling
mistakes, grammatical errors and improper punctuation is a
common issue. Different approaches are available for sentiment
analysis that can automatically sort and categorize the data.
These approaches are mainly categorized as machine learning
based, lexicon based and hybrid. A hybrid approach combines
the machine learning and lexicon based approaches and
generally yields better results than either alone. In this
research work, different hybrid techniques and tools are
discussed and analyzed from different aspects.
Keywords— Hybrid Technique for Sentiment Analysis, Opinion
Mining, Polarity Detection and Social Media
I. INTRODUCTION
The combination of the lexicon based approach and the
machine learning approach has improved classification
performance compared to either approach alone. Due to the
rapid growth and globalization of the internet, millions of
users come online daily and the amount of user-generated
information is increasing at the same pace. The internet has
become a necessity for many services and businesses in our
daily lives. People generate a large amount of textual data on
social websites such as Facebook and Twitter in the form of
posts and tweets. Many websites and blogs also contain a
section for user comments or feedback, so valuable information
can be taken from these sites about the sentiments of users
on a particular topic or their feedback on a new product or
software. Extracting sentiments from such data can yield
valuable information about a particular topic, movie, product
or service [1]. Several tools and techniques are available
nowadays to extract sentiments from the provided data and
classify them as positive, negative or neutral. Tools and
techniques from the lexicon based approach use domain-specific
dictionaries and lexicons as the major lookup source for
sentiment classification [2]. These lexicons have predefined
semantic orientations that are compared with the input data
set for classification, as explained by [1]–[7]. Machine
learning based approaches, on the other hand, use supervised
learning algorithms such as Naive Bayes and Support Vector
Machines, which are trained on a labeled training data set
[8]–[10]. The trained model then classifies the inputs as
positive, negative or any other sentiment [11], [12]. The
hybrid approach uses a combination of the lexicon based and
machine learning approaches. The basic goal of this
combination is to obtain the best results by using the
effective feature sets of both lexicon and machine learning
based techniques, and to overcome the deficiencies and
limitations of each. Many researchers have combined different
lexicon and machine learning based techniques to build better
and more effective hybrid tools [13]–[17]. In this research,
we study, analyze and compare different hybrid tools and
techniques for sentiment classification and discuss the
feature sets and accuracies of the studied approaches.
II. HYBRID TOOLS AND TECHNIQUES
A. pSenti
pSenti is a concept-level sentiment analysis tool presented
by [18]. It combines lexicon based and learning based
sentiment classification methods. Compared to pure lexicon
based methods, pSenti achieved greater accuracy in sentiment
strength detection and polarity classification. Compared
against pure machine learning based methods, it yielded
slightly lower accuracy. Extensive experiments were performed
on two datasets, the CNet Software Reviews Dataset and the
IMDB Movie Reviews Dataset, to evaluate the proposed approach.
The learning based component of the proposed method is not
only responsible for minor tasks such as adjusting sentiment
values or detecting sentiment words; it is also responsible
for evaluating all aspects of the sentiment system.
The main component of the system measures the given
opinionated text, such as customer feedback, and outputs its
collective sentiment. The final results are shown as a
real-valued score between -1 and +1 that can be transformed
into either positive/negative or a score of 1-5 stars at a
later stage. The advantages of the proposed approach are that
the system can be extended by adding new linguistic rules and
that the sentiment lexicon can be expanded at any time. The
proposed system is not sensitive to changes in topic. It
performs better than SentiStrength [5] and lexicon-only
methods, but its accuracy is slightly lower than that of
learning-only methods.
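The transformation of a real-valued score between -1 and +1 into a polarity label or a star rating can be sketched as follows. This is only an illustration: the function names, the linear mapping onto stars and the treatment of a score of exactly 0 are assumptions made here, not details reported for pSenti.

```python
import math


def to_polarity(score: float) -> str:
    """Collapse a score in [-1, 1] to a binary polarity label."""
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must lie in [-1, 1]")
    # Assumption: a score of exactly 0 is counted as positive.
    return "positive" if score >= 0 else "negative"


def to_stars(score: float) -> int:
    """Map a score in [-1, 1] linearly onto 1..5 stars."""
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must lie in [-1, 1]")
    # -1 -> 1 star, 0 -> 3 stars, +1 -> 5 stars; halves round up.
    return math.floor(3 + 2 * score + 0.5)


if __name__ == "__main__":
    for s in (-1.0, -0.8, 0.0, 0.6, 1.0):
        print(s, to_polarity(s), to_stars(s))
```

A non-linear mapping or a neutral band around 0 would be equally reasonable choices; the linear form is used only because it is the simplest.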
B. Combining Lexicon Based and Learning Based
methods for Twitter Sentiment Analysis
For entity-level sentiment analysis, [19] used an augmented
lexicon based method. First, they obtained additional
opinionated indicators, i.e. words and symbols, by applying a
Chi-square test to the results of the lexicon-based method.
Additional opinionated tweets were then identified with the
help of these new indicators. For entities in the newly
identified tweets, a sentiment classifier is employed to
assign sentiment polarity scores. The output of the lexicon
method serves as the training data for the classifier, so the
whole process requires no manual labeling except for the test
set. This research used five datasets based on the query
entities Obama, Harry Potter, Tangled, iPad and Packer. The
proposed method achieved 85.4% accuracy on the five datasets.
The proposed technique (LMS) showed a relative improvement
over the lexicon-based method. It performed worse than the
pure learning-based technique, but it has the advantage that
it does not require pre-labeled data. The proposed approach is
therefore easy to implement at some cost in performance.
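The use of a Chi-square test to find additional opinionated indicators can be illustrated with a small sketch. It assumes that the lexicon-based method has already split tokenized tweets into a positive and a negative group; the function names are hypothetical, and the default threshold of 3.841 is the standard critical value for one degree of freedom at the 0.05 level, not a value taken from [19].

```python
from collections import Counter


def chi_square(n11, n10, n01, n00):
    """Pearson chi-square statistic for a 2x2 contingency table.

    n11: positive tweets containing the term
    n10: negative tweets containing the term
    n01: positive tweets without the term
    n00: negative tweets without the term
    """
    n = n11 + n10 + n01 + n00
    den = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    if den == 0:
        return 0.0  # the term (or a class) never varies: no evidence
    return n * (n11 * n00 - n10 * n01) ** 2 / den


def opinion_indicators(pos_docs, neg_docs, threshold=3.841):
    """Return terms strongly associated with one polarity."""
    pos_df = Counter(t for d in pos_docs for t in set(d))
    neg_df = Counter(t for d in neg_docs for t in set(d))
    n_pos, n_neg = len(pos_docs), len(neg_docs)
    indicators = {}
    for term in set(pos_df) | set(neg_df):
        n11, n10 = pos_df[term], neg_df[term]
        stat = chi_square(n11, n10, n_pos - n11, n_neg - n10)
        if stat >= threshold:
            # Compare relative frequencies to decide the direction.
            is_pos = n11 * n_neg > n10 * n_pos
            indicators[term] = "positive" if is_pos else "negative"
    return indicators


if __name__ == "__main__":
    pos = [["love", "it"]] * 10
    neg = [["hate", "it"]] * 10
    print(opinion_indicators(pos, neg))
```

In this toy input the word "it" occurs equally in both groups and is therefore not selected, while "love" and "hate" are.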
C. SAIL
Another hybrid methodology was developed by [15]. This
study proposed a system for Twitter and SMS sentiment
analysis based on a hierarchical model, an affective lexicon
and a language modeling approach. The language model did not
perform well on its own, but performance improved when it was
used together with the lexicon-based model. The hierarchical
model proved very successful using n-grams, affective ratings
and part-of-speech information. The proposed tool uses an
affective lexicon that was automatically generated from a
massive corpus of raw web data. Words and bigrams are used to
calculate affective ratings and statistics. For the
unconstrained data, the lexicon models were combined with a
learning classifier based on MaxEnt language models that were
trained on a large external dataset. These two classification
methods are then combined to produce the final results. The
combination proved to be effective and yielded better results.
D. NILC_USP
The researchers in [14] describe the NILC_USP system,
submitted to SemEval-2013, and propose a three-way
classification process that combines the rule-based approach,
the lexicon-based approach and the machine learning based
approach. The proposed algorithm consists of the following
steps.
Normalization: The first step normalizes the given input
dataset and can also be referred to as pre-processing. It
cleans and normalizes the input text by performing the
following operations:
- Hashtags, URLs and mentions are mapped to a
consistent set of codes
- Emoticons are grouped by their appearance
as happy, sad, laugh, etc. and
assigned particular codes
- Exaltation signals, such as repeated
exclamation marks, are detected and marked
- Misspelled words are corrected
- Part-of-speech tagging is performed
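A few of the normalization operations listed above can be sketched with regular expressions. The code names (URL, MENTION, HASHTAG, EXALT, EMO_*) and the small emoticon table are placeholders chosen for this illustration, not the codes used by NILC_USP; spelling correction and part-of-speech tagging are left out because they need external resources.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
EXCLAIM_RE = re.compile(r"!{2,}")  # two or more exclamation marks

# Hypothetical emoticon groups and codes.
EMOTICON_CODES = {
    ":)": "EMO_HAPPY",
    ":-)": "EMO_HAPPY",
    ":(": "EMO_SAD",
    ":-(": "EMO_SAD",
    ":D": "EMO_LAUGH",
}


def normalize(text: str) -> str:
    """Replace URLs, mentions, hashtags, emoticons and
    repeated exclamation marks with fixed codes."""
    # URLs go first so that "://" is never mistaken for an emoticon.
    text = URL_RE.sub(" URL ", text)
    text = MENTION_RE.sub(" MENTION ", text)
    text = HASHTAG_RE.sub(" HASHTAG ", text)
    text = EXCLAIM_RE.sub(" EXALT ", text)
    for emoticon, code in EMOTICON_CODES.items():
        text = text.replace(emoticon, f" {code} ")
    # Collapse the extra whitespace introduced above.
    return " ".join(text.split())


if __name__ == "__main__":
    print(normalize("@bob loved it!!! :) http://t.co/x #fun"))
```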
Rule Based Classifier: In this step the pre-processed text is
handed over to the rule based classifier. The only rule
applied by this classifier uses the emoticons present in the
given text. Empirically, the presence of positive emoticons in
sentences and tweets indicates overall positivity in the text.
Likewise, the presence of negative emoticons indicates
negative sentiment in the text. This step returns the number
of positive and negative emoticons found.
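The emoticon rule amounts to counting two kinds of tokens. A minimal sketch, assuming whitespace-tokenized input and two small illustrative emoticon sets that are not taken from [14]:

```python
# Hypothetical emoticon sets used only for this illustration.
POSITIVE_EMOTICONS = {":)", ":-)", ":D", ";)"}
NEGATIVE_EMOTICONS = {":(", ":-(", ":'("}


def emoticon_counts(tokens):
    """Return (positive, negative) emoticon counts for a token list."""
    positive = sum(t in POSITIVE_EMOTICONS for t in tokens)
    negative = sum(t in NEGATIVE_EMOTICONS for t in tokens)
    return positive, negative


if __name__ == "__main__":
    print(emoticon_counts("great day :) :D".split()))
    print(emoticon_counts("missed the bus :(".split()))
```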
Lexicon-based Classifier: The proposed system uses the
lexicon provided by SentiStrength [5]. This lexicon provides
a vocabulary of emotion words, an emoticon list, and lists of
negation and boosting words. The algorithm calculates the
semantic orientation of every word in the given text. The
polarity of a word is decreased if the word is negated and
increased if the word is intensified. The classifier then
labels the text as positive, negative or neutral.
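The word-level scoring with negation and boosting words can be sketched as below. The tiny lexicon, the word lists and the weights are invented for illustration and are not the SentiStrength resources. Negation is modelled here as a sign flip, which is one common reading; the exact adjustment used by the system is not specified in this review.

```python
# Hypothetical resources; real lexicons are far larger.
LEXICON = {"good": 2, "great": 3, "bad": -2, "awful": -3}
NEGATIONS = {"not", "never", "no"}
BOOSTERS = {"very": 1, "really": 1, "extremely": 2}


def score_text(tokens):
    """Sum word polarities, adjusting for boosters and negations."""
    total = 0
    for i, token in enumerate(tokens):
        score = LEXICON.get(token, 0)
        if score == 0:
            continue
        j = i - 1
        # A booster directly before the word strengthens its polarity.
        if j >= 0 and tokens[j] in BOOSTERS:
            boost = BOOSTERS[tokens[j]]
            score += boost if score > 0 else -boost
            j -= 1
        # Assumption: a negation before the (boosted) word flips the sign.
        if j >= 0 and tokens[j] in NEGATIONS:
            score = -score
        total += score
    return total


def classify(tokens):
    """Label a token list as positive, negative or neutral."""
    total = score_text(tokens)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for text in ("the movie was very good", "not good", "a table"):
        print(text, "->", classify(text.split()))
```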
Machine Learning Classifier: Machine learning classifiers use
labeled examples to learn how to classify the given text. The
SVM algorithm provided by CLiPS Pattern was used. In the
proposed model, the classifier used bag of words,
part-of-speech sets and the existence of negation in the
sentence as its feature set.
The results of this study showed that a hybrid classifier can
improve on the individual rule-based, lexicon-based and
machine learning methods by exploiting the advantages of
multiple sentiment analysis techniques.
E. Combining Lexicon based and Learning based
approaches for improved performance and convenience in
sentiment classification
[16] proposed a hybrid approach to improve the
performance of the sentiment analysis process. The algorithm
was implemented in Python. After pre-processing, it consists
of three parts. The first part concerns the lexicon-based
model and deals with finding the optimal parameters for the
classifier. The second part concerns the learning-based model
and deals with analyzing which model performs better. The
third part concerns the hybrid model, which analyzes and
decides the optimal MID ratio.
The Lexicon-Based Model: The lexicon-based model does not
require a training set. It requires only a lexicon, from which
the classifier obtains sentiment words and negation words, and
the test set on which the classifier runs. The lexicon used in
the proposed research work was AFINN, as described by [20].
The Learning Based Model: This stage uses the scikit-learn
framework, which provides a pipeline structure that allows
several transformations to be applied to the data in sequence,
ending in a final model that classifies the data. By replacing
the modeling step of the pipeline, different classifiers can
be tested to determine which yields the best results. The
researchers tested three classifiers: Multinomial Naive Bayes,
Bernoulli Naive Bayes and SVM.
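To show what one of the tested classifiers computes, the sketch below implements Multinomial Naive Bayes with Laplace smoothing by hand. It deliberately does not use scikit-learn, so the class name, method names and toy data are assumptions made here rather than the study's code. Any object offering the same `fit` and `predict` methods could be swapped into the final step, which mirrors the idea of replacing the modeling part of a pipeline.

```python
import math
from collections import Counter


class MultinomialNB:
    """Multinomial Naive Bayes over token lists (Laplace smoothing)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.vocab = {t for d in docs for t in d}
        self.doc_counts = Counter(labels)
        self.n_docs = len(docs)
        # Per-class term frequencies.
        self.term_counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.term_counts[label].update(doc)
        return self

    def _log_posterior(self, doc, c):
        total = sum(self.term_counts[c].values())
        denom = total + self.alpha * len(self.vocab)
        log_p = math.log(self.doc_counts[c] / self.n_docs)  # prior
        for t in doc:
            if t in self.vocab:  # unseen words are ignored
                count = self.term_counts[c][t]
                log_p += math.log((count + self.alpha) / denom)
        return log_p

    def predict(self, doc):
        return max(self.classes,
                   key=lambda c: self._log_posterior(doc, c))


if __name__ == "__main__":
    train = [["good", "movie"], ["great", "film"],
             ["bad", "movie"], ["awful", "film"]]
    y = ["pos", "pos", "neg", "neg"]
    model = MultinomialNB().fit(train, y)
    print(model.predict(["good", "film"]))
    print(model.predict(["awful"]))
```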
F. A Hybrid approach for sentiment classification of
Egyptian Dialect Tweets
A hybrid approach was proposed by [21] to improve the
performance of sentiment analysis for the Arabic language.
The study focused on sentiment classification of tweets in
the Egyptian dialect. Arabic is one of the most widely used
languages on the web [22]. Many researchers have worked on
Arabic sentiment analysis using different data sets, tools
and algorithms [23].
The researchers carried out the following steps to implement
the hybrid technique:
- Step 1: The features to be used by the machine
learning approach are identified and separated.
- Step 2: The system builds the annotated corpus used
for training and validating the best classifier at
different corpus sizes.
- Step 3: Sentiment lexicons of different sizes are built
using the annotated corpus.
- Step 4: These different approaches are combined and
tested for better and optimized results.
- Step 5: A simple, straightforward method is devised
to detect negations in the hybrid approach.
The results obtained by this study using the hybrid approach
showed better performance than other sentence-level
classification systems.
G. Sentiment Analysis: A Review and Comparative
Analysis of Web Services
The authors in [24] compared 15 sentiment analysis
techniques/tools, many of which were based on the hybrid
approach (combining machine learning based and lexicon based
algorithms). According to the researchers, tools like Alchemy
and Semantria can be used for any kind of text classification,
even if the texts are large. These tools can also be a good
option if the text is ironic. Other tools such as Wingify and
Viralheat may not be good options because of their less
effective results; however, further testing of these tools on
different data sets is needed.
The authors pointed out that sentiment analysis involves many
interlinked and closely coupled tasks that are difficult to
separate clearly, as most of them are quite close to each
other and share common aspects. Some of the important tasks
are as follows:
- Sentiment Classification: Each text, sentence or document
expresses some sentiment, which may be positive, negative or
neutral. Identifying this sentiment is sometimes referred to
as sentiment orientation or sentiment polarity detection, as
described by [25].
- Subjectivity Classification: An objective sentence contains
factual information, while a subjective sentence contains
opinions, emotions, beliefs, etc. Subjectivity detection is a
crucial task in sentiment analysis. It is considered to be
even more complex than ordinary sentiment classification
(positive, negative or neutral), as explained by [26]–[29].
- Opinion Summarization: This task summarizes the opinions
within a text and detects the major features of an object
discussed in one or more documents, as explained by [30].
Other than these three, there are tasks such as opinion
retrieval [31], sarcasm and irony detection [32] and
others [33].
H. Alchemy API
Alchemy API [34] is offered as a service and is used for
enriching text content through automated tagging, semantic
analysis and semantic mining. It is a hybrid tool based on NLP
and machine learning algorithms. It offers features such as
named entity extraction, concept tagging, keyword extraction,
sentiment analysis, relation extraction, automatic language
identification, structured data extraction and many other
features [35]. IBM acquired Alchemy API in 2015 and this
technology is now a core component of the cognitive APIs
offered on IBM’s Watson Developer Cloud. All the services
are accessed via an HTTP REST interface, and SDKs are
available for Java, C# or Perl. The usage of Alchemy API for
enterprise-grade text analysis is explained in [36]. It
classifies the sentiment of the analyzed text into three
categories: positive, negative and neutral. The degree of
sentiment is measured in the range [-1, 1], and it supports
the English and German languages. The API can perform
sentiment analysis at the document, entity or keyword level
and can detect directional sentiment for
subject-action-object relations.
I. Building Large-Scale Twitter-Specific Sentiment
Lexicon: A Representation Learning Approach
The study [37] proposed TS-Lex, a large-scale
Twitter-specific lexicon based on a representation learning
approach. The proposed methodology comprised two parts. In
the first part, a representation learning algorithm was used
to learn phrase embeddings, which were later used as features
for classification. In the second part, a seed expansion
algorithm was used. This algorithm expands a small list of
sentiment seeds to obtain the training data used for building
the phrase-level classifier. Specifically, a tailored neural
architecture was introduced that integrates the sentiment
information of tweets into its hybrid loss function and is
used to learn sentiment-specific phrase embedding (SSPE). The
tweets used to learn SSPE were labeled by their positive and
negative emoticons, without any manual annotation. To collect
further training data, similar phrases from Urban Dictionary
were used to expand a small list of sentiment seeds, which
were later used to build the phrase-level classifier. The
experimental results showed that TS-Lex outperformed
previously introduced sentiment lexicons and, when its
features were combined, further improved the top-performing
system in SemEval 2013.
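The idea of expanding a small list of sentiment seeds through similar phrases can be sketched as a breadth-first traversal. The `similar` mapping below is a hypothetical stand-in for similar-phrase lookups in a resource such as Urban Dictionary, and the hop limit is an assumption made here; the actual expansion procedure of TS-Lex may differ.

```python
from collections import deque


def expand_seeds(seeds, similar, max_hops=2):
    """Propagate seed labels to similar phrases.

    seeds:   dict mapping phrase -> label
    similar: dict mapping phrase -> list of similar phrases
    A phrase keeps the first label that reaches it.
    """
    labels = dict(seeds)
    queue = deque((phrase, 0) for phrase in seeds)
    while queue:
        phrase, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in similar.get(phrase, ()):
            if neighbour not in labels:
                labels[neighbour] = labels[phrase]
                queue.append((neighbour, hops + 1))
    return labels


if __name__ == "__main__":
    seeds = {"good": "positive", "bad": "negative"}
    similar = {
        "good": ["dope", "awesome"],
        "dope": ["lit"],
        "lit": ["fire"],
        "bad": ["wack"],
    }
    print(expand_seeds(seeds, similar))
```

With a hop limit of 2, "lit" is reached from "good" through "dope", while "fire" lies three hops away and stays unlabeled.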
J. Sentiment Analysis on Twitter
A hybrid approach was proposed by [38] in which both the
corpus based and the dictionary based approaches were used to
detect the semantic orientation of opinion words in a Twitter
dataset. To obtain the sentiment polarity, opinion words (a
combination of adjectives, verbs and adverbs) were extracted
from the dataset. Adjective scores were calculated using a
log-linear classifier, whereas verb and adverb scores were
calculated using a seed word list. Verbs and adverbs not
recognized by WordNet are rejected because they may not be
legitimate words. The corpus based approach was then used to
find the semantic orientation of the adjectives, while the
dictionary based method was used to find the semantic
orientation of verbs and adverbs. Words whose orientation
could not be calculated were removed from the opinion word
list. An emotion intensifier was applied through a linear
equation and the overall sentiment of the tweet was
calculated. A case study of a tweet was presented to
illustrate the effectiveness of the suggested method. The
experimental results indicated that the proposed system can
recognize semantic orientation, although they offer only a
partial view of the phenomenon. The study recommends more
research using larger samples to validate or invalidate these
findings.
K. Sentiment Analysis using Sentiment Features
The study [39] proposed a hybrid approach for Twitter
sentiment analysis. Sentiment lexicons were used to generate a
new feature set, and this feature set was used to train a
linear SVM classifier. The results showed that the suggested
hybrid method outperformed the state-of-the-art unigram
baseline. The study argued that, for sentiment analysis,
moving towards sentiment features is more effective than
relying on conventional text processing features. All the
features can be computed in a very short time, and the method
performs better than the unigram feature set. The proposed
system has low memory and time complexity because of its very
small feature set. The SVM unigram model with emoticons and
stop words was selected as the baseline because it performed
better than all other combinations. The baseline SVM achieved
an overall accuracy of 86.7%; it performed better than Naïve
Bayes, which in turn performed better than MaxEnt. The
proposed method achieved an accuracy of 89.13%, a significant
margin over the baseline.
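The contrast between a small sentiment feature set and a unigram feature set can be made concrete with a sketch. The four features and the two word sets below are illustrative assumptions and are not the features defined in [39]; the point is only that the vector length stays fixed regardless of vocabulary size.

```python
# Hypothetical lexicon entries used only for this illustration.
POSITIVE = {"love", "great", "good"}
NEGATIVE = {"hate", "bad", "awful"}


def sentiment_features(tokens):
    """Return a fixed-length feature vector for a token list:
    [positive count, negative count, difference, opinion density]."""
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    n = len(tokens) or 1  # avoid division by zero on empty input
    return [pos, neg, pos - neg, (pos + neg) / n]


if __name__ == "__main__":
    print(sentiment_features("i love this great phone".split()))
    print(sentiment_features("awful battery".split()))
```

A unigram model would need one dimension per vocabulary word, whereas a vector like this has four dimensions for any input, which is consistent with the low memory and time cost reported for the approach.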
L. Sentiment Analysis using Support Vector Machines
with diverse information sources
Tony Mullen and Nigel Collier [40] worked on sentiment
analysis with support vector machines (SVMs), using diverse
information sources. The authors used SVMs to classify text
as positive or negative. The SVM is a powerful and well-known
tool for classifying vectors of real-valued features. The
proposed method was applied to a movie reviews data set from
Epinions.com, and the results showed that hybrid SVMs, which
combine unigram-style feature-based SVMs with SVMs based on
real-valued favorability measures, performed best. The
technique emphasizes the use of a variety of diverse
information sources, and SVMs serve as an ideal tool to bring
these sources together. The researchers used different
techniques for assigning semantic values to the words and
phrases in the text. They concluded that using the words and
phrases within the text in this way worked more effectively
than the older bag-of-words approach. The model is further
combined with unigram models that have shown effective
results in the past, as explained by [41].
M. Improving Twitter Sentiment Analysis with Topic-
Based Mixture Modeling and Semi-Supervised Training
Multiple approaches to improving Twitter sentiment analysis
were studied by Bing Xiang and Liang Zhou [42]. They proposed
improving Twitter sentiment analysis with a topic-based
mixture modeling approach combined with semi-supervised
training. They first built a state-of-the-art baseline with a
rich feature set; then a topic-based sentiment mixture model
was built, with the topic-specific data arranged in a
semi-supervised training structure. The topic information is
generated through topic modeling based on LDA (Latent
Dirichlet Allocation). Several experiments were carried out
on data from Task B of Sentiment Analysis in Twitter in
SemEval-2013. The data, labeled as positive, negative and
neutral, was used to tune the parameters and features of the
classifier. The proposed approach performed better than the
top system in that task in terms of averaged F-score.
Experiments showed that weighting adds 2% of improvement and
that the universal sentiment model achieved an average
F-score of 69.7 with all features combined.
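One plausible reading of a topic-based sentiment mixture is that the predictions of topic-specific models are combined with weights given by the topic distribution of the tweet. The sketch below illustrates that combination only; the topic weights and per-topic predictions are hard-coded, hypothetical inputs, whereas in the study the topic information comes from LDA and the models are trained classifiers.

```python
def mixture_predict(topic_probs, per_topic_sentiment):
    """Combine topic-specific sentiment distributions.

    topic_probs:         list of P(topic | tweet), one per topic
    per_topic_sentiment: list of dicts, one per topic, mapping
                         label -> P(label | tweet, topic)
    Returns a dict label -> sum over topics of
    P(topic | tweet) * P(label | tweet, topic).
    """
    if len(topic_probs) != len(per_topic_sentiment):
        raise ValueError("one sentiment distribution per topic required")
    combined = {}
    for weight, dist in zip(topic_probs, per_topic_sentiment):
        for label, p in dist.items():
            combined[label] = combined.get(label, 0.0) + weight * p
    return combined


def best_label(distribution):
    """Return the label with the highest combined probability."""
    return max(distribution, key=distribution.get)


if __name__ == "__main__":
    topics = [0.75, 0.25]
    models = [
        {"positive": 0.8, "negative": 0.1, "neutral": 0.1},
        {"positive": 0.2, "negative": 0.6, "neutral": 0.2},
    ]
    mixed = mixture_predict(topics, models)
    print(mixed, best_label(mixed))
```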
N. MSA-COSRs
Multi-aspect sentiment analysis for Chinese online social
reviews was studied by Xianghua et al. [43], based on topic
modeling and the HowNet lexicon. The authors proposed an
efficient way to automatically discover the aspects discussed
in Chinese social reviews. They called this approach
Multi-aspect Sentiment Analysis for Chinese Online Social
Reviews (MSA-COSRs). They first applied the Latent Dirichlet
Allocation (LDA) model to discover the multi-aspect global
topics of social reviews, and then extracted the local topics
and their associated sentiment. Multi-aspect analysis is
composed of two subtasks: extracting the aspects and
calculating the sentiment orientation of each aspect. The
trained LDA model identifies the aspects of local topics, and
the polarity of the associated text is classified with the
HowNet lexicon. The approach identifies multiple fine-grained
topics and their linked sentiments, which helps to study
sentiment orientation more accurately. Despite the success of
this method, it remains difficult to train the LDA model for
a suitable topic. Experimental results showed that the
proposed model not only obtains good topic partitioning
results but also improves sentiment analysis accuracy.
III. DISCUSSION
Sentiment analysis and classification partially depend on
how clearly sentiments are separated in the text, reviews,
comments or other input datasets. The lexicon based approach
works better when there is a clear boundary between positive
and negative sentiments within the input dataset. When there
is no clear boundary between the sentiments in the target
dataset, the machine learning based approach works better.
One of the main reasons for poor sentiment separation in text
obtained from web sources such as Facebook posts, tweets, and
product and movie reviews is that it is user-entered data and
may contain wrong punctuation, grammatical mistakes, and
fuzzy and noisy text. We have discussed different hybrid
techniques in this paper, many of which performed better than
the lexicon based approach and the learning based techniques.
Their ease of implementation also makes the hybrid approach a
substantial and effective option for sentiment analysis. A
comparison of the feature lists and the results obtained on
different data sets has been arranged and presented in this
research for a better understanding of the hybrid approach
and for future reference.
IV. CONCLUSION
Many studies are available on hybrid methods for sentiment
classification, but comprehensive and compact information on
this particular topic was lacking. In this research we have
discussed different hybrid techniques and tools and compared
their features and reported results. Our study will give
researchers a better view of the hybrid approach to sentiment
classification. A comparative analysis of the techniques on
different datasets is also provided in this research and can
be further extended.
Table 1: Tools / Techniques, Features and their accuracy
REFERENCES
[1] J. S. Modha, G. S. Pandi, and S. J. Modi, “Automatic
Sentiment Analysis for Unstructured Data,” Int. J. Adv. Res.
Comput. Sci. Softw. Eng., vol. 3, no. 12, pp. 91–97, 2013.
[2] F. M. Kundi, A. Khan, S. Ahmad, and M. Z. Asghar,
“Lexicon-Based Sentiment Analysis in the Social Web,” J.
Basic. Appl. Sci. Res, vol. 4, no. 6, pp. 238–248, 2014.
[3] M. Taboada, J. Brooke, M. Tofiloski, K. Voll, and M. Stede,
“Lexicon-Based Methods for Sentiment Analysis,” Comput.
Linguist., vol. 37, no. 2, pp. 267–307, 2011.
[4] M. Ahmad, S. Aftab, S. S. Muhammad, and U. Waheed,
“Tools and Techniques for Lexicon Driven Sentiment
Analysis : A Review,” Int. J. Multidiscip. Sci. Eng., vol. 8, no.
1, pp. 17–23, 2017.
[5] M. Thelwall, K. Buckley, G. Paltoglou, and D. Cai,
“Sentiment Strength Detection in Short Informal Text,” J. Am.
Soc. Inf. Sci. Technol., vol. 61, no. 12, pp. 2544–
2558, 2010.
[6] M. Thelwall and K. Buckley, “Topic-based sentiment analysis
for the social web: The role of mood and issue-related words,”
J. Am. Soc. Inf. Sci. Technol., vol. 64, no. 8, pp. 1608–1617,
2013.
[7] X. Ding, B. Liu, and P. S. Yu, “A
holistic lexicon-based approach to opinion mining,” Proc. Int.
Conf. Web search web data Min. - WSDM ’08, p. 231, 2008.
[8] J. Fang and B. Chen, “Incorporating Lexicon Knowledge into
SVM Learning to Improve Sentiment Classification,” Proc.
Work. Sentim. Anal. where AI meets Psychol., pp. 94–100,
2011.
[9] N. Vasfisisi, M. Reza, and F. Derakhshi, “Text Classification
with Machine Learning Algorithms,” J. Basic. Appl. Sci. Res,
vol. 3, pp. 31–35, 2013.
[10] M. Ahmad, S. Aftab, S. S. Muhammad, and S. Ahmad,
“Machine Learning Techniques for Sentiment Analysis: A
Review,” Int. J. Multidiscip. Sci. Eng., vol. 8, no. 3, pp. 27–
32, 2017.
[11] P. Gonçalves, M. Araújo, F. Benevenuto, and M. Cha,
“Comparing and Combining Sentiment Analysis Methods,”
Proc. first ACM Conf. Online Soc. networks, pp. 27–38, 2013.
[12] J. Khairnar and M. Kinikar, “Machine Learning Algorithms
for Opinion Mining and Sentiment Classification,” Int. J. Sci.
Res. Publ., vol. 3, no. 6, pp. 1–6, 2013.
[13] R. Prabowo and M. Thelwall, “Sentiment analysis: A
combined approach,” J. Informetr., vol. 3, no. 2, pp. 143–157,
2009.
[14] P. P. Balage Filho and T. A. S. Pardo, “NILC_USP: A
Hybrid System for Sentiment Analysis in Twitter Messages,”
in Second Joint Conference on Lexical and Computational
Semantics (*SEM), Volume 2: Proceedings of the Seventh
International Workshop on Semantic Evaluation (SemEval
2013), 2013, vol. 2, no. SemEval, pp. 568–572.
[15] N. Malandrakis, A. Kazemzadeh, A. Potamianos, and S.
Narayanan, “SAIL : A hybrid approach to sentiment analysis,”
vol. 2, no. SemEval, pp. 438–442, 2013.
[16] F. Sommar and M. Wielondek, “Combining Lexicon- and
Learning-based Approaches for Improved Performance and
Convenience in Sentiment Classification,” 2015.
[17] S. Tan, Y. Wang, and X. Cheng, “Combining learn-based and
lexicon-based techniques for sentiment detection without
using labeled examples,” Proc. 31st Annu. Int. ACM SIGIR
Conf. Res. Dev. Inf. Retr. SIGIR 08, p. 743, 2008.
[18] A. Mudinas, D. Zhang, and M. Levene, “Combining lexicon
and learning based approaches for concept-level sentiment
analysis,” Proc. First Int. Work. Issues Sentim. Discov. Opin.
Min. - WISDOM ’12, pp. 1–8, 2012.
[19] L. Zhang, R. Ghosh, M. Dekhil, M. Hsu, and B. Liu,
“Combining lexicon-based and learning-based methods for
Twitter sentiment analysis,” Int. J. Electron. Commun. Soft
Comput. Sci. Eng., vol. 89, pp. 1–8, 2015.
[20] F. Å. Nielsen, “A new ANEW: Evaluation of a word list for
sentiment analysis in microblogs,” in CEUR Workshop
Proceedings, 2011, vol. 718, pp. 93–98.
[21] A. Shoukry and A. Rafea, “A hybrid approach for sentiment
classification of Egyptian dialect tweets,” Proc. - 1st Int. Conf.
Arab. Comput. Linguist. Adv. Arab. Comput. Linguist. ACLing
2015, pp. 78–85, 2016.
[22] M. Elhawary and M. Elfeky, “Mining Arabic business
reviews,” in Proceedings - IEEE International Conference on
Data Mining, ICDM, 2010, pp. 1108–1113.
[23] R. Duwairi, M. N. Al-refai, and N. Khasawneh, “Feature
Reduction Techniques for Arabic Text Categorization,” J. Am.
Soc. Inf. Sci., vol. 60, no. 11, pp. 2347–2352, 2009.
[24] J. Serrano-Guerrero, J. A. Olivas, F. P. Romero, and
E. Herrera-Viedma, “Sentiment analysis: A review and
comparative analysis of web services,” Inf. Sci. (Ny)., vol. 311,
pp. 18–38, 2015.
[25] L. C. Yu, J. L. Wu, P. C. Chang, and H. S. Chu, “Using a
contextual entropy model to expand emotion words and their
intensity for the sentiment classification of stock market
news,” Knowledge-Based Syst., vol. 41, pp. 89–97, 2013.
[26] L. Barbosa and J. Feng, “Robust Sentiment Detection on
Twitter from Biased and Noisy Data,” Coling, no. August, pp.
36–44, 2010.
[27] A. Esuli and F. Sebastiani, “SENTIWORDNET: A Publicly
Available Lexical Resource for Opinion Mining,” Proc. 5th
Conf. Lang. Resour. Eval., pp. 417–422, 2006.
[28] I. Maks and P. Vossen, “A lexicon model for deep sentiment
analysis and opinion mining applications,” in Decision
Support Systems, 2012, vol. 53, no. 4, pp. 680–688.
[29] S. Baccianella, A. Esuli, and F. Sebastiani, “SentiWordNet
3.0: An Enhanced Lexical Resource for Sentiment Analysis and
Opinion Mining,” Analysis, vol. 0, pp. 1–12,
2010.
[30] D. Wang, S. Zhu, and T. Li, “SumView: A Web-based engine
for summarizing product reviews and customer opinions,”
Expert Systems with Applications, vol. 40, no. 1. pp. 27–33,
2013.
[31] L. Guo and X. Wan, “Exploiting syntactic and semantic
relationships between terms for opinion retrieval,” Journal of
the American Society for Information Science and Technology,
vol. 63, no. 11. pp. 2269–2282, 2012.
[32] A. Reyes and P. Rosso, “Making objective decisions from
subjective data: Detecting irony in customer reviews,” in
Decision Support Systems, 2012, vol. 53, no. 4, pp. 754–760.
[33] J. Savoy, “Authorship Attribution Based on Specific
Vocabulary,” ACM Trans. Inf. Syst., vol. 30, no. 2, Art. no.
12, pp. 1–30, 2012.
[34] “AlchemyAPI.” [Online]. Available:
https://www.ibm.com/watson/alchemy-api.html.
[35] K. Shaalan and H. Raza, “NERA: Named entity recognition
for Arabic,” J. Am. Soc. Inf. Sci. Technol., vol. 60, no. 8, pp.
1652–1663, 2009.
[36] J. Turian, “Using AlchemyAPI for Enterprise-Grade
Text Analysis,” 2013.
[37] D. Tang, F. Wei, B. Qin, M. Zhou, and T. Liu, “Building
Large-Scale Twitter-Specific Sentiment Lexicon: a
Representation Learning Approach,” Proc. 25th Int. Conf.
Comput. Linguist. (COLING 2014), pp. 172–182, 2014.
[38] A. Kumar and T. M. Sebastian,
“Sentiment Analysis on Twitter,” IJCSI Int. J. Comput. Sci.
Issues, vol. 9, no. 4, pp. 372–378, 2012.
[39] S. A. Bahrainian and A. Dengel, “Sentiment Analysis using
sentiment features,” Proc. - 2013 IEEE/WIC/ACM Int. Jt.
Conf. Web Intell. Intell. Agent Technol. - Work. WI-IATW
2013, vol. 3, pp. 26–29, 2013.
[40] T. Mullen and N. Collier, “Sentiment analysis using support
vector machines with diverse information sources,” Conf.
Empir. Methods Nat. Lang. Process., pp. 412–418, 2004.
[41] B. Pang, L. Lee, and S. Vaithyanathan, “Thumbs up?:
sentiment classification using machine learning techniques,”
Proc. Conf. Empir. Methods Nat. Lang. Process., pp. 79–86,
2002.
[42] B. Xiang and L. Zhou, “Improving Twitter Sentiment
Analysis with Topic-Based Mixture Modeling and
Semi-Supervised Training,” ACL, pp. 434–439, 2014.
[43] F. Xianghua, L. Guo, G. Yanyan, and W. Zhiqiang, “Multi-
aspect sentiment analysis for Chinese online social reviews
based on topic modeling and HowNet lexicon,” Knowledge-
Based Syst., vol. 37, pp. 186–195, 2013.