Twitter Sentiment Analysis: A Case Study in the Automotive Industry

Sarah E. Shukri
Business Information
Technology Department
The University Of Jordan
Amman, Jordan
Sar8141197@fgs.ju.edu.jo
Rawan I. Yaghi
Business Information
Technology Department
The University Of Jordan
Amman, Jordan
Roa8141203@fgs.ju.edu.jo
Ibrahim Aljarah
Business Information
Technology Department
The University Of Jordan
Amman, Jordan
i.aljarah@ju.edu.jo
Hamad Alsawalqah
Computer Information
Systems Department
The University Of Jordan
Amman, Jordan
h.sawalqah@ju.edu.jo
Abstract: Sentiment analysis is one of the fastest-growing
areas that use natural language processing, text mining,
and computational linguistics to extract useful information that
supports decision making. In recent years, social media
websites have spread widely, and their user bases are
growing rapidly. The automotive industry is one of the largest
economic sectors in the world, with more than 90 million cars and
vehicles. It is highly competitive and requires
that sellers, i.e., automotive companies, carefully analyze and attend
to consumers’ opinions in order to achieve a competitive
advantage in the market. Analysing consumers’ opinions using
social media data can be an effective way for automotive
companies to refine their marketing targets and objectives. In
this paper, a sentiment analysis of a case study in the automotive
industry is presented. Text mining and sentiment analysis are
used to analyze unstructured tweets on Twitter and to extract the
polarity and emotion classifications for three automotive
brands: Mercedes, Audi, and BMW. The emotion classification
results indicate that the “joy” category is better for
BMW than for Mercedes and Audi, while the sadness
percentage is larger for Audi and Mercedes than for BMW.
Furthermore, the polarity classification shows that
BMW has 72% positive tweets, compared with 79% for Mercedes and
83% for Audi. In addition, BMW has 8%
negative polarity, compared with 18% for Mercedes and 16% for
Audi.
Keywords: Sentiment Analysis; Twitter; Automotive;
Classification
I. INTRODUCTION
Others’ opinions have always been an important piece of
information for consumers when it is time to make a buying
decision. Long before awareness of the World Wide Web
became widespread, people often relied on their friends’
recommendations and on specialized magazines or websites as
their main sources of information. With the growth of the
web over the last decade, social media now provides
new tools to efficiently create and share useful information
[1]. This has made it possible to find experiences and
opinions almost everywhere (blogs, forums, social
networks, news portals, content-sharing sites, etc.).
Research indicates that using social media sites is
considered the best way to grow a business in terms of
money, time, effort, and other resources [2].
Although these opinions are meant to be helpful, their
massive availability and unstructured
nature make it difficult for companies to benefit from them.
To solve this issue, a number of techniques for analysing data
generated by users on social media sites have been developed.
Sentiment analysis, also known as opinion mining, is one
such recent technique. Sentiment analysis uses natural
language processing, text mining, and computational linguistics
to extract useful information and knowledge from source data.
The purpose of sentiment analysis is to classify the polarity of a
source text as positive, neutral, or negative. Text mining is
a crucial step in sentiment analysis, in which unstructured data
are analysed and scored based on how much they relate to a
specific concept, so that they can be classified later based on
the given score [3].
The automotive industry is one of the largest and most highly
competitive economic sectors in the world. Because of this
competition, automotive companies are moving toward using
social media sites to reach more customers and advertise
their products in a considerably short time.
Twitter is one of the fastest-growing social media websites
in the world. It is a microblogging service that
enables users to tweet about any topic with a maximum
length of 140 characters. As of June 2015¹, Twitter had more
than 500 million users, of which more than 302 million were
active users. With an average of 500 million tweets created
daily, Twitter has become one of the greatest sources of
information available on the Internet [4]. Thus, Twitter
data can be very useful for automotive marketers because it
can be used to mine consumers’ opinions and reviews in
the automotive industry using sentiment analysis. This can
provide useful insights that help companies create a
competitive advantage over their competitors.
1 about.twitter.com/company
2015 IEEE Jordan Conference on Applied Electrical Engineering and Computing Technologies (AEECT)
978-1-4799-7431-3/15/$31.00 ©2015 IEEE
This research applies sentiment analysis to analyse people’s
opinions and reviews about three automotive companies:
Mercedes, Audi, and BMW. To do so, tweets are extracted
from Twitter and processed using text mining techniques.
These tweets are then used in the sentiment analysis, which classifies
them based on the sentiment expressed in the text [5]. At
the end, tweets are classified into three categories: positive
sentiment, negative sentiment, or neutral sentiment. As
attempts to apply sentiment analysis in the
automotive industry are, to the best of our knowledge, very
few [10, 11], the results of this research can provide further
insights into the importance of analysing consumers’
reviews and opinions in this industry.
The remainder of this paper is organized as follows: Section
II presents the related work. Section
III presents the methodology. Section IV presents a
demonstration of the method on the case study and discusses
the results. Section V concludes the paper with a summary and
an outlook on future research directions.
II. RELATED WORK
With the explosion of Web 2.0 platforms, social media sites
have become a huge source of consumer voices. Capturing and
analyzing public opinions from social media sites has recently
enjoyed a huge burst of research activity. One of the resulting
emerging fields is sentiment analysis [1, 5]. Subsequently,
there have been literally hundreds of papers published on the
subject. Among these papers, we focus on those most related to
the work presented in this paper, as follows:
In [6], the authors analyzed three of the most popular
companies in the pizza industry by using text mining. The authors
studied information from social media sites about the users of
those companies and their competitors. The goal was to help
those companies improve their services and strategies to
attract more customers. They found that social media sites
play an important role in creating competitive advantage.
The authors recommended that a good understanding and use of
social media users’ information can improve the relationship
of companies with their users, improve their service levels,
and improve the quality of their decisions.
Another work [7] presented a new approach to provide
decision support for vehicle defect discovery. The authors applied
techniques such as text mining and sentiment analysis
to popular social media communities. Their focus was on
improving vehicle quality management by analyzing social
media. They found that a good analysis of social media data
can improve automotive quality management strategies.
In an attempt to overcome the challenges that developers may face
while developing opinion mining tools, the authors
in [8] developed a modular rule-based approach that can
analyze the linguistics of social media sites.
In [9], we can find a case study that applies sentiment
analysis to Twitter. The authors presented a method to perform
sentiment analysis and opinion mining using tweets. The first
step in the presented method is collecting the corpus and
preparing it for the analysis, while the second is building
a model that classifies the tweets using the Naive Bayes algorithm
(NB) based on sentiments (positive, negative, and neutral).
Another work [10] introduced what is called the J.D. Power
and Associates (JDPA) sentiment Corpus. The JDPA corpus
consists of users’ blog posts containing opinions about
automobiles. Moreover, the authors presented statistics
including inter-annotator agreement and catalogued
components of sentiment that occur naturally.
The authors in [11] analyzed a data set of around 730,000
tweets published in a time frame of 19 weeks using sentiment
analysis. Within this data set, they analyzed those tweets
dealing with the corporate crisis of Toyota in 2010. Their
focus was on the dynamics of discussions in social media in
order to reflect sentiments within these discussions. The
authors identified and investigated specific stages of
communication, which they called “quiet stages” and “peaks”.
III. METHODOLOGY
As the usage of social media sites grows and extends,
companies can use these sites to assess their position in the
market as well as that of their competitors. This can be done by
studying the data generated by users on these sites. Such data
reveal users’ opinions and comments about these
companies’ products or services. Thus, in this paper we
study the automotive industry in social media and try to
answer the following questions:
· What is the rate at which users engage with these companies’ data?
· What is the percentage of negative reviews and comments
compared to the positive ones?
· Who is the leader in the automotive sector based on the polarity
classifications of reviews and comments?
While social media provides strong user engagement
and leads to an incredibly high level of communication
between the user and the seller, there are still some industries
that do not engage in social media. The automotive industry
represents a great example of engagement in social media. As
published in the 2014 CMO Council report, roughly 1 out of 4 (23%)
of car buyers discussed other users’
experiences and reviews before purchasing their car; 38% of
car customers said that they will use social media in their next
purchase; and 84% of car customers use Facebook, with
24% of them having used social media sites when purchasing their last car.
Between October 2012 and April 2013, the number of clicks on
automotive ads on Facebook rose sharply, from 16% to 39%².
In this paper, we first discuss the level of engagement
in social media of these three automotive manufacturers. We
extracted the engagement percentages from the Talkwalker
API³. Because BMW, Mercedes, and Audi are among the
largest automotive brands in Europe, it is important to
discuss the level of their engagement in social media. Figure 1
shows the engagement percentage on different social media
sites.
2 www.cmocouncil.org
3 www.talkwalker.com
As we can note in Figure 1, BMW has the largest
engagement percentage on Twitter, at 62%.
Mercedes has the largest engagement percentage through
online news, blogs, and other sources, with 18%, 6%, and 30%,
respectively. Audi also has a higher engagement percentage on
Twitter than Mercedes: 59% for Audi compared with
47% for Mercedes.
Figure 1. Social Media Sites engagement percentage
A. Data collection
In this paper, we collected data from Twitter using the
Twitter API. The corpus contains 3000 tweets, which were extracted
using R⁴.
B. Data pre-processing
Tweets are filtered to keep only those in English. The corpus
covers three car brands: Mercedes, Audi, and BMW. Each
brand is represented by 1000 tweets. The tweets are extracted
based on a search query that uses the “@” annotation followed by
the brand name. To build a sound experiment, the dataset for each
brand was extracted from Twitter pages and users. After
that, we prepared the extracted datasets by
removing unnecessary elements such as
retweet and username symbols, hashtags, numbers,
punctuation, stop words, extra whitespace, and HTML links. In this
paper, we applied the following text mining pre-processing
techniques:
· Tokenization: reads the text to be mined,
removes all tabs and punctuation between words, and
replaces them with a white space.
· Filtering: removes words such as stop words,
extremely frequent words, and rarely occurring words.
· Lemmatization: transforms all
verbs to the infinitive form and all nouns to the singular
form.
· Stemming: returns all words to their
basic forms by removing, for example, the plural ‘s’ from
nouns and the ‘ing’ from verbs.
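The cleaning and stemming steps listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' R pipeline: the function names `clean_tweet` and `naive_stem`, the tiny `STOP_WORDS` set, and the choice to drop hashtag tokens whole are all assumptions made for the example, and the suffix stripping is far cruder than a real stemmer or lemmatizer.

```python
import re

# Tiny illustrative stop-word list (an assumption; the paper does not list its stop words).
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in",
              "it", "this", "such", "from"}

def clean_tweet(text):
    """Return a list of lowercase tokens with tweet-specific noise removed."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # links
    text = re.sub(r"\brt\b", " ", text)                  # retweet marker
    text = re.sub(r"[@#]\w+", " ", text)                 # usernames and hashtags
    text = re.sub(r"[^a-z\s]", " ", text)                # numbers and punctuation
    tokens = text.split()                                # also collapses whitespace
    return [t for t in tokens if t not in STOP_WORDS]

def naive_stem(token):
    """Crude suffix stripping for 'ing' and plural 's' (not a real stemmer)."""
    if token.endswith("ing") and len(token) > 5:
        return token[:-3]
    if token.endswith("s") and not token.endswith("ss") and len(token) > 3:
        return token[:-1]
    return token
```

For example, `clean_tweet("such a bad car #BMW")` keeps only the tokens `bad` and `car`, which are the words a lexicon or classifier would then see.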
4 https://www.r-project.org/
C. Sentiment Analysis Models
We used the Naïve Bayes (NB) classification algorithm to
classify the polarity and emotions in the sentiment analysis.
The NB algorithm is simple, easy to implement, and efficient,
with acceptable accuracy. Furthermore, two sentiment models
are investigated, based on a polarity lexicon [13] and an emotions
lexicon [14].
The NB algorithm is a simple probabilistic model that
assumes all the data attributes are independent. The
probabilistic model uses Bayes’ theorem to solve
classification problems by calculating the maximum posterior
probability of the class label given the attribute set.
Bayes’ theorem is given by the following equation:
\[ P(C \mid X) = \frac{P(X \mid C)\,P(C)}{P(X)} \qquad (1) \]
where C is a class label, X is the attribute set, and P(C)
and P(X|C) are the prior probability of the class and the
conditional probability of the attributes given the class, respectively.
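To make the role of the prior and the conditional probabilities concrete, the following is a minimal multinomial Naive Bayes sketch in Python. It is not the authors' implementation: the function names, the four-tweet `TRAIN` set, and the use of add-one (Laplace) smoothing are assumptions chosen for illustration. Because P(X) is the same for every class, the sketch compares only log P(C) plus the sum of log P(word | C).

```python
import math
from collections import Counter, defaultdict

def train_nb(labelled_docs):
    """labelled_docs: list of (tokens, label). Returns priors, word counts, vocabulary."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in labelled_docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    total = sum(class_counts.values())
    priors = {c: n / total for c, n in class_counts.items()}
    return priors, word_counts, vocab

def predict_nb(tokens, priors, word_counts, vocab):
    """Return the class with the highest log posterior (up to the constant P(X))."""
    best_label, best_score = None, -math.inf
    for label, prior in priors.items():
        # Add-one smoothing: an assumption, so unseen class/word pairs get nonzero mass.
        denom = sum(word_counts[label].values()) + len(vocab)
        score = math.log(prior)
        for t in tokens:
            if t in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][t] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy training data (hypothetical, for illustration only).
TRAIN = [
    (["nice", "car"], "positive"),
    (["excellent", "beautiful", "car"], "positive"),
    (["bad", "car"], "negative"),
    (["worst", "decision"], "negative"),
]
model = train_nb(TRAIN)
```

With this toy model, `predict_nb(["beautiful", "car"], *model)` selects the positive class, because both words are more frequent in the positive training tweets.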
The first sentiment model uses an NB classifier, which is
trained on the training data set and makes use of Wiebe's
polarity lexicon [13]. The training data set is annotated with
three classes: positive, neutral, and negative tweets.
The NB polarity classifier uses the polarity lexicon based on
the matching criteria between the tweet words and the lexicon
words. When the training process is finished and the model is
well trained, the second step tests the model using the
testing data set, which is not labeled. The testing process is
used to assess the accuracy of the built model. The last step is
to validate the model and extract the polarity percentages for
the three categories: positive, negative, and neutral.
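The idea of matching tweet words against a lexicon and then reporting polarity percentages can be illustrated with a much simpler stand-in than the NB classifier described above: plain counting of lexicon hits. The sketch below is that simplified counting scheme, not the paper's model; `LEXICON` is a hypothetical six-word list, not the lexicon of [13], and the tie-breaking rule (no hits or equal hits means neutral) is an assumption.

```python
from collections import Counter

# Hypothetical mini-lexicon; the lexicon of [13] is not reproduced here.
LEXICON = {
    "nice": "positive", "excellent": "positive", "proud": "positive",
    "bad": "negative", "worst": "negative", "rubbish": "negative",
}

def match_polarity(tokens):
    """Label a tokenized tweet by counting its positive and negative lexicon hits."""
    hits = Counter(LEXICON[t] for t in tokens if t in LEXICON)
    if hits["positive"] > hits["negative"]:
        return "positive"
    if hits["negative"] > hits["positive"]:
        return "negative"
    return "neutral"  # no hits, or a tie

def polarity_percentages(tweets):
    """Return the share of each polarity class, in percent, over tokenized tweets."""
    labels = Counter(match_polarity(t) for t in tweets)
    n = len(tweets)
    return {k: 100.0 * labels[k] / n for k in ("positive", "negative", "neutral")}
```

Applied to a brand's tokenized tweets, `polarity_percentages` yields the kind of per-class breakdown that the polarity results report for each manufacturer.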
The second NB classifier is trained on a training data set and
makes use of the Strapparava emotions
lexicon [14]. The training data set is annotated with seven
classes: anger, disgust, fear, joy, sadness, surprise, and
unknown. As in the polarity classification, the classifier relies on
the matching criteria between the tweet words and the emotions
lexicon words.
IV. RESULTS
The tweets collected about BMW, Mercedes, and Audi
contain the @BMW, @Mercedesbenz, and @Audi tags,
respectively. Each tweet is analysed and classified as
positive, negative, or neutral based on a query term and the
polarity classification. Table I, Table II, and Table III contain
some sample tweets about BMW, Mercedes, and Audi,
respectively, along with their polarity classifications.
TABLE I: TWEETS SAMPLES (BMW)
Tweet | Polarity Classification
#BMW Nice car, you can try it?" | Positive
Elegance and sportiness united in one vehicle: the new #BMW #series Coupé | Positive
such a bad car #BMW | Negative
TABLE II: TWEETS SAMPLES (MERCEDES)
Tweet | Polarity Classification
@MercedesBenz Intelligent innovation and safety as never before. Preview of the future of the #EClass | Positive
Amazing @MercedesBenz 300 SLR | Positive
@MercedesBenz That's not what we'd expect. Please contact your local Workshop so that our Technicians inspect the issue. | Negative
TABLE III: TWEETS SAMPLES (AUDI)
Tweet | Polarity Classification
@audi Probably one of my worst decisions was buying an | Negative
Proud to own an Audi @audi | Positive
@audi Sorry RPM but this is rubbish. There is so much great motor sport happening and you dish up crap | Negative
@Audi Excellent SUV from Audi! Beautiful Car! | Positive
The polarity classifications for BMW, Mercedes, and Audi are
shown in Figure 2. The figure shows that BMW has 72%
positive tweets, compared with 79% for Mercedes and 83% for
Audi. Furthermore, the figure shows that BMW has 8%
negative polarity, compared with 18% for Mercedes and 16% for
Audi. This gives a good indication for customers seeking to
buy cars from manufacturers that have good reviews and
comments from users who own their cars, and it indicates
to competitors that Audi is a strong competitor.
Fig 2. Polarity Classification for BMW, Mercedes, Audi
Figure 3 shows the emotion classification results for the three
automotive companies. BMW emotion classifications are
79% labeled as “Unknown”, 5% “Joy”, 0.5% “Surprise”, 9%
“Sadness”, 0% “Fear”, 5.5% “Anger”, and 1% “Disgust”.
Mercedes emotion categories are 56.6% labeled as
“Unknown”, 31.9% “Joy”, 0.5% “Surprise”, 4.1% “Sadness”,
0.4% “Fear”, 6.4% “Anger”, and 0.1% “Disgust”. Audi
emotion categories are 63.2% labeled as “Unknown”, 10%
“Joy”, 17.7% “Surprise”, 5.1% “Sadness”, 0.2% “Fear”, 1.3%
“Anger”, and 2.4% “Disgust”. These results give a good
indicator for customers seeking to buy cars and help them
make the right decision. We can note that the “joy” category was
better for BMW than for Mercedes and Audi. This may
be because positive reviews are not necessarily always
classified as “Joy”; other categories can also be regarded as
positive when they carry no negative implication.
Fig 3. Emotion Classifications for BMW, Mercedes, and Audi
V. CONCLUSION
Sentiment analysis is considered one of the most attractive
fields to study and apply in various sectors. In
this paper, sentiment analysis models are applied to three of
the leading automotive companies to extract the
polarity and emotions (opinions) of customers around each
company, which is very useful information for
marketing. The results showed that Audi’s positive polarity
(83%) was higher than that of the other companies. On the other hand,
Audi’s negative polarity (16%) is lower than that of Mercedes (18%),
though not lower than BMW’s (8%).
This means that, for example, offers on Audi’s page would
circulate to a higher number of satisfied people than on BMW’s
and Mercedes’ pages.
Furthermore, the analysis results show that the
percentage of positive reviews for Audi is the highest among the
three companies, at 83%. In addition, Audi’s
negative polarity, at 16%, is lower than that of Mercedes.
We can conclude that Audi users are more satisfied
than the other users. This will help users who are
willing to buy a car to compare the three
companies based on previous users' opinions. In addition,
the emotion classification results were consistent with the
polarity classifications and give more information about each
polarity class.
REFERENCES
[1] Cambria, Erik, et al. "New avenues in opinion mining and sentiment
analysis." IEEE Intelligent Systems 2 (2013): 15-21.
[2] Edosomwan, Simeon, et al. "The history of social media and its
impact on business." Journal of Applied Management and
Entrepreneurship 16.3 (2011): 79-91.
[3] Li, Nan, and Desheng Dash Wu. "Using text mining and sentiment
analysis for online forums hotspot detection and forecast." Decision
Support Systems 48.2 (2010): 354-368.
[4] Lima, Ana CES, and Leandro N. de Castro. "Automatic sentiment
analysis of Twitter messages." Computational Aspects of Social
Networks (CASoN), 2012 Fourth International Conference on. IEEE,
2012.
[5] Pang, Bo, and Lillian Lee. "Opinion mining and sentiment
analysis." Foundations and Trends in Information Retrieval 2.1-2
(2008): 1-135.
[6] He, Wu, Shenghua Zha, and Ling Li. "Social media competitive
analysis and text mining: A case study in the pizza
industry." International Journal of Information Management 33.3
(2013): 464-472.
[7] Abrahams, Alan S., et al. "Vehicle defect discovery from social
media." Decision Support Systems 54.1 (2012): 87-97.
[8] Maynard, Diana, Kalina Bontcheva, and Dominic Rout. "Challenges
in developing opinion mining tools for social media." Proceedings of
the @NLP can u tag #usergeneratedcontent (2012): 15-22.
[9] Pak, Alexander, and Patrick Paroubek. "Twitter as a Corpus for
Sentiment Analysis and Opinion Mining." LREC. Vol. 10. 2010.
[10] Kessler, Jason S., and Nicolas Nicolov. "The JDPA Sentiment Corpus
for the Automotive Domain."
[11] Stieglitz, Stefan, and Nina Krüger. "Analysis of sentiments in
corporate Twitter communication: A case study on an issue of
Toyota." Analysis 1 (2011): 1-2011.
[12] Rish, Irina. "An empirical study of the naive Bayes classifier." IJCAI
2001 workshop on empirical methods in artificial intelligence. Vol.
3. No. 22. IBM New York, 2001.
[13] Wilson, Theresa, Janyce Wiebe, and Paul Hoffmann. "Recognizing
contextual polarity in phrase-level sentiment analysis." Proceedings
of the conference on human language technology and empirical
methods in natural language processing. Association for
Computational Linguistics, 2005.
[14] Strapparava, Carlo, and Alessandro Valitutti. "WordNet Affect: an
Affective Extension of WordNet." LREC. Vol. 4. 2004.
2015 IEEE Jordan Conference on Applied Electrical Engineering and Computing Technologies (AEECT)
... On the contrary, however, the figures have been split into Audi and Mercedes-Benz, which show the higher percentages of the sad tweets. The level of engagement rates is the next factor to consider, with BMW boasting the lead on Twitter (62% engagement) and Mercedes dominating on most different online platforms (Shukri et al., 2015). ...
Article
Educational Data Mining (EDM) strategies facilitate the efficient and in-depth analysis of student data. EDM provides useful insights into comprehending student learning patterns and identifying factors that influence academic success. This review aims to evaluate the efficacy of classification algorithms popularly explored in EDM for predicting student performance and identifying common trends in existing EDM research. The review follows a systematic approach, relevant research articles have been cited following an inclusion and exclusion criteria to ensure the selection of studies that specifically address the use of EDM techniques for predicting student academic achievement. According to the review findings, most researchers have utilized the features of cumulative grade point average, internal and external assessment, and demographic information to predict student performance. The most common techniques in EDM for predicting students’ performance are Naïve Bayes and Decision Trees. The review also focuses on the potential for bias, key examination of challenges, and possible future directions in the field. In the context of student performance prediction, ethical considerations regarding privacy, data handling, and the interpretation of results are also identified
... On the contrary, however, the figures have been split into Audi and Mercedes-Benz, which show the higher percentages of the sad tweets. The level of engagement rates is the next factor to consider, with BMW boasting the lead on Twitter (62% engagement) and Mercedes dominating on most different online platforms (Shukri et al., 2015). ...
Article
Full-text available
Amid the ever-growing social media content, sentiment analysis is a technique that is necessary for analyzing public opinion. This paper aims to discover what sentiment prevailed and what reaction took place preceding the election. This paper also presents the results of a study that applied sentiment analysis to Twitter data relating to the 15 th General Elections of Malaysia. This research reviews several sentiment analysis techniques based on Twitter data. Using the method of analyzing 1566 tweets, including re-tweets and replies, gathered between November 12th and November 18th, 2022, the findings give us an understanding of the level of emotions that were tweeted by the users of Twitter towards the 15 th general election in Malaysia. The results indicate that the sentiments expressed in the analysed tweets are distributed as follows: slightly positive (41%), positive (31%), neutral (24%), and slightly negative (4%).
... One of the text mining methods used to identify VoC is sentiment analysis. Sentiment analysis is one of the text mining methods used to determine the contextual polarity of an article, whether negative, neutral, or positive (Shukri et al., 2015). The use of sentiment analysis, as described by Jeong and Yoon (2016), can identify features that can be further developed on smartphones. ...
Article
Full-text available
The research aims to provide the decision-maker with a framework for determining customer requirements during product development. The proposed framework is based on sentiment analysis and supervised multilabel classification techniques. Therefore, the proposed technique can categorise customer reviews based on the “product design criteria” label and the “sentiment of the review” label. To achieve the research goal, the research presented in this article uses the existing product development framework presented in the literature. The modification is conducted especially in the conceptual stage of product development, in which the voice of the customer or a customer review is obtained from the scraping, and a multilabel classification technique is performed to categorise customer reviews. The proposed framework is tested by using the set data on women’s clothing reviews from an e-commerce site downloaded from www.kaggle.com based on data by Agarap (2018). The result shows that the proposed framework can categorise customer reviews. The research presented in this paper has contributed by proposing a technique based on sentiment analysis and multilabel classification that can be used to categorise customers during product development. The research presented in this paper answers one of the concerns in the categorisation of needs raised by Shabestari et al. (2019), namely, the unclear rules or main attributes of a requirement that make these needs fall into certain categories. Categorising customer requirements allows decision-makers to determine the direction of product development to meet customer needs.
... Misopoulos et al. [85] identified various aspects of customer service in the airline industry that consumers found positive or negative by analyzing 67,953 Tweets. Shukri et al. [86] performed sentiment analysis of Tweets about the automotive industry. The work focused on specific car brands: Mercedes, Audi, and BMW. ...
Article
Full-text available
Exoskeletons have emerged as a vital technology in the last decade and a half, with diverse use cases in different domains. Even though several works related to the analysis of Tweets about emerging technologies exist, none of those works have focused on the analysis of Tweets about exoskeletons. The work of this paper aims to address this research gap by presenting multiple novel findings from a comprehensive analysis of about 150,000 Tweets about exoskeletons posted between May 2017 and May 2023. First, findings from temporal analysis of these Tweets reveal the specific months per year when a significantly higher volume of Tweets was posted and the time windows when the highest number of Tweets, the lowest number of Tweets, Tweets with the highest number of hashtags, and Tweets with the highest number of user mentions were posted. Second, the paper shows that there are statistically significant correlations between the number of Tweets posted per hour and the different characteristics of these Tweets. Third, the paper presents a multiple linear regression model to predict the number of Tweets posted per hour in terms of these characteristics of Tweets. The R2 score of this model was observed to be 0.9540. Fourth, the paper reports that the 10 most popular hashtags were #exoskeleton, #robotics, #iot, #technology, #tech, #innovation, #ai, #sci, #construction and #news. Fifth, sentiment analysis of these Tweets was performed, and the results show that the percentages of positive, neutral, and negative Tweets were 46.8%, 33.1%, and 20.1%, respectively. To add to this, in the Tweets that did not express a neutral sentiment, the sentiment of surprise was the most common sentiment. It was followed by sentiments of joy, disgust, sadness, fear, and anger, respectively. Furthermore, hashtag-specific sentiment analysis revealed several novel insights. 
For instance, for almost all the months in 2022, the usage of #ai in Tweets about exoskeletons was mainly associated with a positive sentiment. Sixth, lexicon-based approaches were used to detect possibly sarcastic Tweets and Tweets that contained news, and the results are presented. Finally, a comparison of positive Tweets, negative Tweets, neutral Tweets, possibly sarcastic Tweets, and Tweets that contained news is presented in terms of the different characteristic properties of these Tweets. The findings reveal multiple novel insights related to the similarities, variations, and trends of character count, hashtag usage, and user mentions in such Tweets during this time range.
... On the other hand, the social media analysis found out only 459 results, which is very little compared to the amount of data generally retrieved through the same method but on different topics. To illustrate, Troisi et al. [83] retrieved a total of 12 million posts for the investigation of the main impacting factors on students' university choice, and Shukri at al. [141] analyzed 3000 tweets when focusing on the Twitter sentiment analysis of the automotive industry. Concerning practitioners, a huge database was retrieved on ECs: 9734 articles from 2015 to 2022. ...
Article
Full-text available
The development of energy communities has the potential to support the energy transition owing to the direct engagement of people who have the chance to become “prosumers” of energy. In properly explaining the benefits that this phenomenon can give to the population, a key set of channels is represented by social media, which can hit the target of citizens who have the budget to join the energy communities and can also “nurture” younger generations. In this view, the present work analyzes the performance of the topic “energy communities” on the main social media in order to understand people’s awareness of its benefits and to assess the societal awareness of this topic in terms of engagement and positive sentiment. The analysis conducted first concerned the definitions and conceptualization of energy communities of academics and practitioners, completed through a content analysis; we then focused on the fallout of these themes on social media and on its engagement (to understand if it was capable of generating a positive attitude). The social media analysis took place through a platform that uses artificial intelligence to analyze communication channels. The results show that there is still poor engagement with the energy community theme in social media, and a more structured communication strategy should be implemented with the collaboration between social media and practitioners/academics. Despite previous studies not analyzing how social media recall the topics of academics and practitioners related to energy communities, this is an important aspect to consider in order to conceive integrated marketing communication for promoting energy communities to citizens, as here demonstrated and proposed for the very first time.
Thesis
This thesis analyzes Twitter and LastQuake text data surrounding the 2020 Zagreb earthquake. The analysis yields the polarity and opinion of the text data as well as the prominent topics within it. The overall aim of the study was to determine the successes and failures of the emergency response following the earthquake. Key points include COVID-19 considerations, the validity of the research methods, and future applications for the research.
Conference Paper
Knowing about communication of specific issues in social media has become increasingly important for the reactive and proactive stakeholder-communication of enterprises. Tools have been designed to monitor social media sites and to aggregate data of discussions in social media. However, these tools do not consider the dynamics of discussions and are not able to reflect sentiments within these discussions. In our contribution, we address these aspects by analyzing a data set of around 730,000 Tweets published in a time frame of 19 weeks. Within this data set, we analyzed those Tweets dealing with the corporate crisis of Toyota in 2010. We classified sentiments by using a linguistic approach. In this context, we identified and investigated specific stages of communication ("quiet stages" and "peaks"). Additionally, our study concentrates on the sentiments found in Tweets of the ten most active participants of the discussion.
Article
While much work has recently focused on the analysis of social media in order to get a feel for what people think about current topics of interest, many challenges remain. Text mining systems originally designed for more regular kinds of texts such as news articles may need to be adapted to deal with Facebook posts, tweets, etc. In this paper, we discuss a variety of issues related to opinion mining from social media, and the challenges they impose on a Natural Language Processing (NLP) system, along with two example applications we have developed in very different domains. In contrast with the majority of opinion mining work, which uses machine learning techniques, we have developed a modular rule-based approach which performs shallow linguistic analysis and builds on a number of linguistic subcomponents to generate the final opinion polarity and score.
Article
In this paper we present a linguistic resource for the lexical representation of affective knowledge. This resource (named WORDNET-AFFECT) was developed starting from WORDNET, through a selection and tagging of a subset of synsets representing the affective meanings. Affective computing is advancing as a field that allows a new form of human-computer interaction, in addition to the use of natural language. There is a wide perception that the future of human-computer interaction is in themes such as entertainment, emotions, aesthetic pleasure, motivation, attention, engagement, etc. Studying the relation between natural language and affective information and dealing with its computational treatment is becoming crucial. For the development of WORDNET-AFFECT, we considered as a starting point WORDNET DOMAINS (Magnini and Cavaglia, 2000), a multilingual extension of WordNet, developed at ITC-irst. In WORDNET DOMAINS each synset has been annotated with at least one domain label (e.g. SPORT, POLITICS, MEDICINE), selected from a set of about two hundred labels hierarchically organized. A domain may include synsets of different syntactic categories: for instance the domain MEDICINE groups together senses from Nouns, such as doctor#1 (i.e. the first sense of the word doctor) and hospital#1, and from Verbs such as operate#7. For WORDNET-AFFECT, our goal was to have an additional hierarchy of "affective domain labels", independent from the domain hierarchy, with which the synsets representing affective concepts are annotated.
Article
This paper presents a rich annotation scheme for mentions, co-reference, meronymy, sentiment expressions, modifiers of sentiment expressions including neutralizers, negators, and intensifiers, and describes a large corpus annotated with this scheme. We describe how this corpus relates to recent, state-of-the-art work in sentiment analysis, and define the various annotation types, provide examples, and show statistics on occurrence and inter-annotator agreement. This resource is the largest sentiment-topical corpus to date and is publicly available. It helps quantify sentiment phenomena, and allows for the construction of advanced sentiment systems and enables direct comparison of different algorithms.
Article
The naive Bayes classifier greatly simplifies learning by assuming that features are independent given class. Although independence is generally a poor assumption, in practice naive Bayes often competes well with more sophisticated classifiers. Our broad goal is to understand the data characteristics which affect the performance of naive Bayes. Our approach uses Monte Carlo simulations that allow a systematic study of classification accuracy for several classes of randomly generated problems. We analyze the impact of the distribution entropy on the classification error, showing that low-entropy feature distributions yield good performance of naive Bayes. We also demonstrate that naive Bayes works well for certain nearly-functional feature dependencies, thus reaching its best performance in two opposite cases: completely independent features (as expected) and functionally dependent features (which is surprising). Another surprising result is that the accuracy of naive Bayes is not directly correlated with the degree of feature dependencies measured as the class-conditional mutual information between the features. Instead, a better predictor of naive Bayes accuracy is the amount of information about the class that is lost because of the independence assumption.
Chapter
This chapter presents a rich annotation scheme for mentions, co-reference, meronymy, sentiment expressions, modifiers of sentiment expressions including neutralizers, negators, and intensifiers, and describes a large corpus annotated with this scheme. We define the various annotation types, provide examples, and show statistics on occurrence and inter-annotator agreement. This resource is the largest sentiment-topical corpus to date and is publicly available. It helps quantify sentiment phenomena, and allows for the construction of advanced sentiment systems and enables direct comparison of different algorithms.
Article
The distillation of knowledge from the Web—also known as opinion mining and sentiment analysis—is a task that has recently raised growing interest for purposes such as customer service, predicting financial markets, monitoring public security, investigating elections, and measuring a health-related quality of life. This article considers past, present, and future trends of sentiment analysis by delving into the evolution of different tools and techniques—from heuristics to discourse structure, from coarse- to fine-grained analysis, and from keyword- to concept-level opinion mining.
Article
A pressing need of vehicle quality management professionals is decision support for the vehicle defect discovery and classification process. In this paper, we employ text mining on a popular social medium used by vehicle enthusiasts: online discussion forums. We find that sentiment analysis, a conventional technique for consumer complaint detection, is insufficient for finding, categorizing, and prioritizing vehicle defects discussed in online forums, and we describe and evaluate a new process and decision support system for automotive defect identification and prioritization. Our findings provide managerial insights into how social media analytics can improve automotive quality management.