FIGURE 1 - uploaded by Abdulmotaleb El Saddik
Source publication
Sports fans generate a large amount of tweets which reflect their opinions and feelings about what is happening during various sporting events. Given the popularity of football events, in this work, we focus on analyzing sentiment expressed by football fans through Twitter. These tweets reflect the changes in the fans’ sentiment as they watch the g...
Contexts in source publication
Context 1
... analysis can be considered a document classification problem, aimed at separating documents which express positive and negative sentiments by exploiting certain syntactic and linguistic features [23]. In the literature, the sentiment analysis follows the general framework for the classification problem, which is depicted in Figure 1. Feature extraction and classifier learning are the main components. ...
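The two components named in the excerpt above, feature extraction and classifier learning, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the bag-of-words features, the multinomial Naive Bayes learner with add-one smoothing, the function names, and the toy training sentences are not taken from the source publication, which only states the general framework.

```python
import math
from collections import Counter, defaultdict


def extract_features(text):
    """Feature extraction: lowercase bag-of-words counts (an assumed choice)."""
    return Counter(text.lower().split())


def train(documents):
    """Classifier learning: collect multinomial Naive Bayes statistics.

    `documents` is a list of (text, label) pairs.
    """
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in documents:
        features = extract_features(text)
        label_counts[label] += 1
        word_counts[label].update(features)
        vocabulary.update(features)
    return label_counts, word_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log posterior (add-one smoothing)."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total_docs)
        denominator = sum(word_counts[label].values()) + len(vocabulary)
        for word, count in extract_features(text).items():
            if word in vocabulary:  # ignore words never seen in training
                smoothed = (word_counts[label][word] + 1) / denominator
                score += count * math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy, made-up training data for illustration only.
TRAINING = [
    ("what a brilliant goal", "positive"),
    ("great save by the keeper", "positive"),
    ("terrible foul and awful defending", "negative"),
    ("awful penalty decision", "negative"),
]
MODEL = train(TRAINING)
```

Calling `predict(MODEL, "brilliant save")` on this toy model selects the label whose training documents make the observed words most probable; a real system would replace both stages with richer features and a tuned classifier.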
Similar publications
Social media has grown in popularity in recent years, and many people use these digital platforms to share their thoughts and opinions. Although social media has served many good causes, some users are present on these platforms for radical activities. In this paper, the tweets form the digital...
The rapid development of information technology has changed the way people interact and express their opinions on public policies, including the People's Housing Savings (Tapera) policy in Indonesia. People now primarily express their views openly on social media platforms like Twitter, generating a substantial amount of text data for analysis to u...
With the rapid growth in the use of online social media platforms in daily life, there has also been an increase in opinion mining, or sentiment analysis, to extract users' sentiments or views on any topic. Twitter data, or tweets, have been a focal point among researchers because they provide abundant data across a wide variety of fields. While...
PT Tiki Jalur Nugraha Ekakurir (JNE) is the most popular courier service in the country and ranked the first in the category of courier service companies in Indonesia Top Brand 2018. Despite being a top brand, JNE still faces a variety of complaints posted by customers on the internet and social media. This makes it interesting to study as a means...
Citations
... Sports fans generate many tweets that reflect their feelings throughout a sporting event. Soccer is very popular, so it is possible to analyze the fans' sentiments expressed in the tweets; they reflect the feelings expressed during the game and the events that occur, such as goals, penalties, and fouls [13]. ...
... Aloufi and Saddik [13] developed a manually annotated dataset of feelings expressed during football matches, collecting tweets posted during two popular football events: the FIFA World Cup 2014 and the UEFA Champions League (CL) 2016/2017. From this dataset they created a lexicon, and they then developed a classifier capable of recognizing the feelings expressed in a football game, testing the SVM, Random Forest, and Naive Bayes classifiers for this purpose. The sentiment models trained on the CL 2016/2017 data performed better than those trained on the FIFA dataset. ...
This article presents an approach applying artificial intelligence techniques for sentiment analysis to identify potentially negative situations in soccer games. Two artificial intelligence techniques were employed for sentiment analysis: (1) bag-of-words and (2) computer vision. The first is used for Natural Language Processing (NLP) and sentiment identification, and the second is used for computer-based emotion recognition. Four soccer matches were analyzed using data from the X social media platform (formerly Twitter). The evaluation was performed over real scenarios in Mexico: an atypical match dated March 5th, 2002; the final of the 2023 closing season; a regular match of 2024; and the game of the closing season 2024. The results indicate an average negative perception of 77.5% for a critical event; a positive perception of 54.9% for a closing season match; an average negative perception of 58.55% for a regular boring season match; and an average negative perception of 62.2% for a final match of the closing session 2024.
... Analyzing such data provides valuable insights for football stakeholders such as team coaches, players, management, and government. This helps them make data-driven decisions that can enhance the performance of athletes, improve the viewing experience for fans, and gain competitive superiority, which revolutionized the way football events are planned, organized, managed, and executed (Wunderlich and Memmert 2020; Aloufi and El Saddik 2018). ...
... Further, polarity analysis was conducted using NB, K-nearest neighbor (KNN), and random forest algorithms. In the same context, Aloufi et al. introduced a football sentiment dataset containing around 54,000 manually labeled tweets from both the 2014 FIFA World Cup in Brazil and the 2016/2017 UEFA Champions League (Aloufi and El Saddik 2018). They evaluated three learning algorithms (SVM, MNB, and RF) and found that SVM had the highest accuracy rate. ...
... Several researchers have analyzed the sentiments and emotions of sports fans. Aloufi and El Saddik (2018) for example, evaluated sentiments expressed by football fans on the X platform. They developed a football-specific sentiment dataset and utilized the dataset to automatically create a football-specific sentiment lexicon. ...
The 2022 Qatar World Cup created massive global attention and generated widespread discussions on different social media platforms, including the X platform. The event was the subject of intense debate after Qatar was announced as the host. Opinions were divided, with supporters and critics weighing in based on political, ethical, cultural, and social considerations. Public sentiments evolved throughout three key phases-before, during, and after the event-shaped by numerous factors. This study aims to analyze these sentiments during these three stages based on a novel hybrid evolutionary approach. Three versions for each stage were produced by applying pre-trained word embeddings with 100 and 400 features and sentiment features combined with word embeddings. In total, nine different versions of datasets were employed to examine the proposed approach. Furthermore, five different metaheuristic algorithms were applied: the multi-verse optimizer (MVO), the genetic algorithm (GA), the particle swarm optimization (PSO), the salp swarm algorithm (SSA), and the whale optimization algorithm (WOA). The five metaheuristic algorithms were combined with the feature selection-support vector machine (FS-SVM) and weighting-support vector machine (WSVM) to examine the newly created dataset versions. The results reveal that people’s perspectives shifted from negative before the event to positive during and after the event. Moreover, a comparison of the proposed MVO-WSVM and MVO-SVM-Fs approaches with other metaheuristic algorithms showed the superior accuracy of the proposed approaches in sentiment prediction.
... Sentiment analysis has also been applied to sports-related data. Aloufi and El Saddik (2018) developed a domain-specific strategy to comprehend the sentiment of the posts of football supporters on X. The researchers manually labelled a football-specific sentiment dataset that they had developed, and this dataset was employed to create a sentiment lexicon that is tailored to football. ...
In the last few years, various topics, including sports, have seen social media platforms emerge as significant sources of information and viewpoints. Football fans use social media to express their opinions and sentiments about their favourite teams and players. Analysing these opinions can provide valuable information on the satisfaction of football fans with their teams. In this article, we present Soutcom, a scalable real‐time system that estimates the satisfaction of football fans with their teams. Our approach leverages the power of social media platforms to gather real‐time opinions and emotions of football fans and applies state‐of‐the‐art machine learning‐based sentiment analysis techniques to accurately predict the sentiment of Arabic posts. Soutcom is designed as a cloud‐based scalable system integrated with the X (formerly known as Twitter) API and a football data service to retrieve live posts and match data. The Arabic posts are analysed using our proposed bidirectional LSTM (biLSTM) model, which we trained on a custom dataset specifically tailored for the sports domain. Our evaluation shows that the proposed model outperforms other machine learning models such as Random Forest, XGBoost and Convolutional Neural Networks (CNNs) in terms of accuracy and F1‐score with values of 0.83 and 0.82, respectively. Furthermore, we analyse the inference time of our proposed model and suggest that there is a trade‐off between performance and efficiency when selecting a model for sentiment analysis on Arabic posts.
... Aloufi and Saddik [5] presented a football-specific sentiment classifier trained on 54,526 manually labeled tweets using SVM, multinomial Naive Bayes (MNB), and random forest (RF), with the SVM classifier exhibiting the most robust performance. Pacheco et al. [6] discovered increased audience engagement on as teams exited the competition. ...
... This result aligns with previous research conducted by [10], [11], who also obtained an accuracy of 72% using the SVM algorithm with the word2vec model in predicting sentiment in United States airline tweets and classifying Turkish tweets, respectively. The results from the study are consistent with [5]. SVM outperformed NB irrespective of the feature extraction technique, implying SVM's robustness in classifying sentiment in Ghanaian football tweets. ...
Football as an attractive sport generates huge volumes of tweets concerning fans' opinions, feelings, and judgments during prime events. Such data can be leveraged in sentiment analysis, an algorithmic approach to analyzing text in tweets by extracting emotional tones. This paper presents a novel benchmark dataset of 132,115 tweets collected during the 2022 World Cup on 𝕏 (formerly Twitter) for football-related sentiment classification. We also performed sentiment analysis on the dataset using lexicon-based tools, traditional machine learning algorithms, and pre-trained models, namely the robustly optimized bidirectional encoder representations from transformers (BERT) pretraining approach (RoBERTa) and the distilled version of BERT (DistilBERT), to understand the emotions and reactions of football fans during different phases of the football matches. Results from the study indicate that most tweets had neutral sentiments in both context-aware and context-free analysis. We also describe our novel GhaFootBERT, a sentiment classification model based on transfer learning on BERT, which provides an effective approach to sentiment classification of football-related tweets. Our model performs robustly, outperforming the traditional models with 92% accuracy.
... However, aspect categorization is out of the scope of this paper. [37] used an ngram model with TF-IDF for DW extraction from football sport corpus. [38] used a POS tag with simple term frequency (TF) for aspect extraction. ...
Domain sentiment lexicon building has become an attractive field in recent years. This is due to the increased volume of user-generated data on the internet, as well as the different sentiments that opinion words carry in different contexts. Domain lexicons mainly consist of opinion pairs and their associated sentiment. Any opinion pair is formed by a domain word and one of its associated opinion words. Therefore, to generate a domain lexicon from a domain corpus, domain words must be extracted together with their associated opinion words. Frequency-based approaches are among the traditional approaches. However, ambiguity is a major concern with these approaches. This paper introduced a frequency-based equation that considers the context of the words for domain word extraction. The equation was tested on five Amazon review datasets and proved more efficient than other frequency-based equations in terms of recall and precision. Therefore, lexicons more closely related to the domains were generated.
... Analyzing social media content, especially tweets, has grown as a prominent research area. While previous research focused on sentiment analysis in contexts such as sporting events [15], new products [9,10], and restaurants [16], limited research has targeted sentiment analysis of COVID-19 vaccine-related tweets. This study addresses this gap, providing insights into social media users' attitudes towards the COVID-19 vaccine and factors affecting vaccine acceptance. ...
... II. RELATED WORK Sentiment analysis of tweets has been widely studied in the past, with a focus on various topics including sporting events [18][19][20][21], new products [9,10], restaurants [16], and COVID-19 [17]. Hasan et al. [9] proposed a supervised method for determining tweet sentiments using Twitter hashtags and emoticons as training labels, evaluating different feature types including punctuation, patterns, words, and n-grams. M. S. Neethu and R. Rajasree [10] conducted sentiment analysis of tweets related to electronic products, extracting Twitter-specific features and using classifiers including Naïve Bayes, SVM, Maximum Entropy, and Ensemble classifiers to test the classification accuracy. Dhirajgurkhe et al. [11] presented a method for processing Twitter data, collecting it from various sources and eliminating non-contributing features before submitting it to a Naïve Bayes classification algorithm to measure probabilities. Gautam and Yadav [12] focused on the classification of consumer reviews using machine learning algorithms including Naïve Bayes, SVM, and Maximum Entropy, finding better accuracy with Naïve Bayes. Amolik et al. [4] developed a model for sentiment analysis of tweets related to Bollywood or Hollywood films, using function vectors and classifiers such as SVM and Naïve Bayes to identify positive, negative, and neutral tweets. ...
... In 2018, research focused on analyzing soccer fans' reactions during matches and constructing a sentiment lexicon from the collected tweets [15]. Alhumaidi Al Otaibi (20XX) compared supervised and unsupervised algorithms to determine the more popular restaurant choice between McDonald's and KFC based on sentiment analysis of 7000 tweets [16]. Abu-Salah, R. et al. conducted a study on the sentiment analysis of tweets related to COVID-19 in the Arab world. ...
... The domain-specific approach of [43] first builds a manually labelled (positive, negative, neutral) football sentiment dataset, which is then used to automatically build a football-specific lexicon. A classifier then identifies the sentiment of the conversation. ...
Sentiment analysis is a part of natural language processing, along with text mining. Over the years, sentiment analysis has become a key study area for researchers and industries all over the world. The major goal of this review is to overview the papers that have found the sentiment of the data under study over the past few years. Also, the various techniques and methods that have enabled us to solve the problem of sentiment analysis are looked into and put forward briefly. The articles are looked into considering the area of sentiment analysis in which the contributions are made, be it sentiment detection, dataset creation, or transfer learning. This would help the researchers to curate advanced and accurate models and methods for analyzing sentiment. Additionally, they would be given a quick rundown of current developments in this area of research. With this in mind, the literature study is conducted taking into account various applications, various machine learning algorithms, the type of data utilized for analysis purposes, and various performance measurements. Challenges and gaps based on all the contributions studied are summarized.
... First, a cleaned dataset is created by removing from the entire dataset the retweets and duplicate tweets as suggested in the scientific literature [3,55]. This action ensures that the resulting dataset contains only unique user-posted tweets. ...
Given the high amount of information available on social media, the paper explores the degree of vaccine hesitancy expressed in English tweets posted worldwide during two different one-month periods of time following the announcement regarding the discovery of new and highly contagious variants of COVID-19—Delta and Omicron. A total of 5,305,802 COVID-19 vaccine-related tweets have been extracted and analyzed using a transformer-based language model in order to detect tweets expressing vaccine hesitancy. The reasons behind vaccine hesitancy have been analyzed using a Latent Dirichlet Allocation approach. A comparison in terms of number of tweets and discussion topics is provided between the considered periods with the purpose of observing the differences both in quantity of tweets and the discussed discussion topics. Based on the extracted data, an increase in the proportion of hesitant tweets has been observed, from 4.31% during the period in which the Delta variant occurred to 11.22% in the Omicron case, accompanied by a diminishing in the number of reasons for not taking the vaccine, which calls into question the efficiency of the vaccination information campaigns. Considering the proposed approach, proper real-time monitoring can be conducted to better observe the evolution of the hesitant tweets and the COVID-19 vaccine hesitation reasons, allowing the decision-makers to conduct more appropriate information campaigns that better address the COVID-19 vaccine hesitancy.
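The cleaning step quoted in the excerpt above (removing retweets and duplicate tweets so that only unique user-posted tweets remain) can be sketched as follows. This is a hedged illustration, not the cited authors' procedure: the function name `clean_tweets`, the `RT @` prefix test for retweets, and the case- and whitespace-insensitive duplicate check are all assumptions introduced here.

```python
def clean_tweets(tweets):
    """Drop retweets and duplicates, keeping first occurrences in order.

    Assumptions (not from the source): a retweet is any text starting
    with "RT @", and two tweets are duplicates when they match after
    collapsing whitespace and lowercasing.
    """
    seen = set()
    cleaned = []
    for text in tweets:
        normalized = " ".join(text.split()).lower()
        if normalized.startswith("rt @"):  # assumed retweet convention
            continue
        if normalized in seen:  # already kept an equivalent tweet
            continue
        seen.add(normalized)
        cleaned.append(text)
    return cleaned
```

In practice, platform metadata (for example a retweet flag on each record) would be a more reliable signal than the text prefix, where such metadata is available.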
... They proved their proposed features' extraction techniques to be efficient by applying orthodox ML classifiers, which proved that the suggested attributes can omit the composite COVID-19 tweets in many cases. In [24], the authors proposed a field-based sentiment analysis model by employing various ML-based methods. Their work is based on football tweets and labels the feature for the sentiment classification as fouls, penalties, and goal-scoring, among others. ...
The task of classifying opinions conveyed in any form of text online is referred to as sentiment analysis. The emergence of social media usage and its spread has given room for sentiment analysis in our daily lives. Social media applications and websites have become the foremost spring of data recycled for reviews for sentimentality in various fields. Various subject matter can be encountered on social media platforms, such as movie product reviews, consumer opinions, and testimonies, among others, which can be used for sentiment analysis. The rapid uncovering of these web contents contains divergence of many benefits like profit-making, which is one of the most vital of them all. According to a recent study, 81% of consumers conduct online research prior to making a purchase. But the reviews available online are too huge and numerous for human brains to process and analyze. Hence, machine learning classifiers are one of the prominent tools used to classify sentiment in order to get valuable information for use in companies like hotels, game companies, and so on. Understanding the sentiments of people towards different commodities helps to improve the services for contextual promotions, referral systems, and market research. Therefore, this study proposes a sentiment-based framework detection to enable the rapid uncovering of opinionated contents of hotel reviews. A Naive Bayes classifier was used to process and analyze the dataset for the detection of the polarity of the words. The dataset from Datafiniti’s Business Database obtained from Kaggle was used for the experiments in this study. The performance evaluation of the model shows a test accuracy of 96.08%, an F1-score of 96.00%, a precision of 96.00%, and a recall of 96.00%. The results were compared with state-of-the-art classifiers and showed a promising performance and much better in terms of performance metrics.
... Men and women are seen to express similar emotional outbursts like anger, fear, etc. over Premier league match outcomes. A sentiment analysis of tweets discussing football matches was conducted in [8] to identify specific emotions expressed over football-related tweets. The domain-specific approach of the study will further unravel the semantic meaning of expressed emotions and thereby design a sentiment classifier capable of classifying expressed sentiments in football tweet conversations. ...
Sentiment analysis has been prominently employed for opinion analytics across different use cases, seeking to compute polarity values for opinions expressed in textual documents. The Vader-based lexicon approach is widely deployed, especially for tweet corpora with emoticon elements. However, there is a need to ascertain the peculiarities of the MultiLingual sentiment analyzer alongside the Vader approach in order to improve future studies in their use case domain. The Twitter application programming interface was employed in this study to extract public opinions in two corpora on the Africa Cup of Nations matches involving Nigeria. Experimental results show no one-to-one mapping between the sentiment scores returned by the two analyzers, while the MultiLingual analyzer proves to be reputable for analyzing tweets with shorter phrases. Tokens returned as the most weighted in the two corpora, as analyzed by the two methodologies, likewise show obvious contrasts in their weights. Two Nigerian players were returned as prominent topics from the two corpora by the topic modelling phase of the study, while in the MultiLingual analysis local dialect tokens outweigh other English unigrams, unlike in the Vader-based lexicon approach.