Conference Paper

Election Result Prediction Using Twitter Sentiments Analysis

Authors:
To read the full-text of this research, you can request a copy directly from the authors.


... Sentiment analysis or opinion mining is a method in the field of NLP that teaches machines to learn, detect and recognize emotional and sentimental information from a given text message [9]. Given an opinion, sentiment analysis can extract its meaning, its sense, and the emotional charge of the subject who wrote it. ...
... The study of [9] collected data from Twitter aiming to predict the Congress election outcome in India by using sentiment analysis. To label tweets, they used the Valence Aware Dictionary and sEntiment Reasoner (VADER) [31], and used manual feature extraction [32], with the features combined in a bag-of-words (BoW) fashion. ...
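The labelling and bag-of-words steps mentioned in the excerpt above can be sketched roughly as follows. This is a minimal illustration only: `TOY_LEXICON`, `label_tweet`, `bag_of_words`, and the `threshold` value are hypothetical names and numbers chosen for the example, and the scoring is a plain sum of word valences, not VADER's actual lexicon or rules.

```python
from collections import Counter

# Hypothetical toy lexicon for illustration; VADER uses its own, far larger,
# empirically rated lexicon plus additional heuristics not reproduced here.
TOY_LEXICON = {
    "good": 1.0, "great": 2.0, "win": 1.5,
    "bad": -1.0, "corrupt": -2.0, "lose": -1.5,
}
PUNCT = ".,!?#@"


def tokenize(text):
    """Lowercase, split on whitespace, and strip simple punctuation."""
    tokens = (tok.strip(PUNCT) for tok in text.lower().split())
    return [tok for tok in tokens if tok]


def label_tweet(text, threshold=0.5):
    """Sum lexicon valences over the tokens and map the total to a label."""
    score = sum(TOY_LEXICON.get(tok, 0.0) for tok in tokenize(text))
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


def bag_of_words(text):
    """Represent a tweet as unordered token counts."""
    return Counter(tokenize(text))
```

In a pipeline of this kind, the lexicon-derived labels would serve as training targets and the token counts as input features for a downstream classifier.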
... Their empirical findings revealed that Maximum Entropy and the N-Gram Language Model outperformed SVM and Naïve Bayes, achieving accuracies in the range of 65% to 77%. Similarly, Batra et al. (2020) used various machine learning algorithms on Tweets to predict election outcomes through sentiment analysis, and were able to achieve an accuracy of 86% with the decision tree classifier. Various studies have deep-dived into the sentiment of political communication in social media platforms (Haselmayer and Jenny 2017). ...
Article
Full-text available
The advent of large language models (LLMs) has marked a new era in the transformation of computational social science (CSS). This paper dives into the role of LLMs in CSS, particularly exploring their potential to revolutionize data analysis and content generation and contribute to a broader understanding of social phenomena. We begin by discussing the applications of LLMs in various computational problems in social science including sentiment analysis, hate speech detection, stance and humor detection, misinformation detection, event understanding, and social network analysis, illustrating their capacity to generate nuanced insights into human behavior and societal trends. Furthermore, we explore the innovative use of LLMs in generating social media content. We also discuss the various ethical, technical, and legal issues these applications pose, and considerations required for responsible LLM usage. We further present the challenges associated with data bias, privacy, and the integration of these models into existing research frameworks. This paper aims to provide a solid background on the potential of LLMs in CSS, their past applications, current problems, and how they can pave the way for revolutionizing CSS.
... On a data set labelled with VADER, the performance of five distinct machine learning models was analysed and compared in the work of P.K. Batra et al. [122]. BOW and tf-idf are two of the feature extraction approaches that were investigated. ...
Article
Full-text available
Modern social media’s rise to prominence has altered the ways in which candidates reach out to voters and conduct campaigns. Researchers often dwell upon the uses of social media platforms as a plethora of information for various tasks, such as election prediction, since they contain a large volume of people’s ideas about politics and leaders. Modern political campaigns and party propaganda make extensive use of social media. It is common practise for political parties and candidates to utilise Twitter and other social media during election season for coverage and promotion. This study analyses and provides estimates for the reliability of several volumetric social media techniques to predict election outcomes from social media activity. Incredibly large datasets made available by social media sites may be mined for insights into societal problems and predictions about the future. However, this is difficult because of the skewed and noisy nature of the data. This literature review aims to enlighten readers about the researchers’ input towards the process of forecasting election outcomes using social media content by outlining an assessment of sentiment analysis and its methodologies. The study also discusses research that aims to foretell upcoming elections in several nations by analysing user textual data on social media sites. In addition, this paper has pointed out some of the research gaps that exist in the area of election outcome forecasting and some of the challenging questions in the domain of sentiment analysis. In addition, this paper makes recommendations for the future of election prediction based on material gleaned from social media.
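The tf-idf feature extraction named in the excerpt above can be illustrated with a short, standard-library-only sketch. The function name `tf_idf` and the whitespace tokenizer are assumptions made for this example; it uses raw term frequency and the unsmoothed inverse document frequency log(N / df), whereas common libraries apply smoothed or normalized variants, so absolute values will differ.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    weight = tf(term, doc) * log(N / df(term)), where N is the number of
    documents and df(term) is the number of documents containing the term.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: count each term once per document.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights
```

A term that appears in every document receives a weight of zero under this definition, which is why tf-idf tends to emphasize party-specific or topic-specific words over ubiquitous ones.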
... They computed a weighted sensitivity score using the RNTN model. In 2019, Lok Sabha election results were predicted by Payal Khurana Batra and her group [11]. Following its preparation, they separated the data into two groups, each of which had a distinct script written by the Congress and BJP. ...
Article
Full-text available
This paper explores the potential of Twitter, a popular social media platform, as a tool for predicting election outcomes. Sentiment analysis has emerged as a powerful tool for predicting election outcomes, with numerous studies showcasing its effectiveness in various countries. For instance, research has utilized sentiment analysis to forecast election results in nations like the USA, India, Pakistan and other countries, demonstrating the utility of social media data in gauging public opinion and predicting electoral results [1]. Elections in India are always considered important events that most people look forward to. The rapid growth of social media in the past has provided end users with powerful tools to share their ideas. Twitter, which is one such platform, provides daily updates on political events through various hashtags and trends. People react to political events and give their opinions. Our approach is to collect tweets from top political parties contesting the Gujarat Assembly Elections 2022, and then calculate sentiment scores. The database includes a variety of recent and well-liked tweets about a specific political party. Party tweets are retrieved with specific keywords like “BJP”, “AAP”, “Congress” and so on. In the context of India, Twitter sentiment analysis tools and classification have been used to predict the outcomes of state assembly elections, underscoring the potential of social media data in forecasting electoral results within the country [2]. We used the VADER sentiment analyzer together with standard machine learning algorithms, Random Forest and Decision Tree, to classify tweets as positive or negative. As a result, this work uses sentiment analysis to evaluate tweets gathered from Twitter and forecast election outcomes. This work shows the growing influence of social media on politics and the feasibility of using such platforms for predictive analysis.
The findings of this study could provide valuable insights for political parties, policymakers, and researchers interested in the intersection of social media and politics. Random Forest and Decision Tree models performed well in predicting election outcomes based on sentiment analysis on Twitter data with 89% and 86% respectively.
... Tumasjan et al. (2010) [15] and Sanders and Van den Bosch (2013) [16] reported some encouraging early findings, but articles expressing scepticism about the accuracy of election outcome prediction based on tweets soon surfaced (Gayo-Avello, 2012) [17]. Despite these results, which lend credence to the scepticism, studies on attempts to predict elections using tweets continue to be published, frequently incorporating sentiment analysis (Nugroho, 2021 [7]; Batra et al., 2020 [18]; Rao et al., 2020 [19]). ...
Article
As a result of the growth of online social networking platforms and applications, a sizeable amount of user-generated text content is created daily in the form of comments, reviews, and short text messages. Users can write messages, share them, and add images and videos to social networking sites like Twitter, Facebook, and others. Consequently, a significant volume of sentiment-rich data is generated. Sentiment analysis then comes into play in this scenario, which evaluates opinions as positive, neutral, or negative by extracting, recognizing, or representing them from various sources, including social media, news, articles, and blogs. This study aims to analyze the results from different sentiment analysis models and technologies that combine natural language processing. Case studies of various industries that can benefit or have been benefiting from sentiment analysis are also discussed to provide an approachable pathway for anyone who wishes to go more deeply into this field. For example, the business world has used it to learn what customers think of a certain company or brand. The impact of profanity on how readers interpret tweets and other social media messages is studied in sociology and psychology. Political scientists are trying to anticipate election results based on tweets to evaluate answers, among other things, and to look for trends, ideological bias, and opinions. Researchers have previously evaluated numerous models using well-known techniques like Naive Bayes, support vector machines, etc., and the findings have been compared with promising outcomes.
... Sentiment analysis or opinion mining is a very particular method in the field of natural language processing (NLP) for teaching a machine to extract emotions from a given text (Khurana Batra et al., 2020). This technique aims to obtain information from texts that may be stored in structured formats such as spreadsheets and HTML documents or in unstructured formats such as plain text (Silva et al., 2022). ...
Article
The popular approval rating is a concept used to explain the increase in popular support for a political figure of a country over a given period. This figure is obtained through commissioned surveys that reach a limited sample of willing citizens and are also expensive to conduct. In this research, an automatic system has been implemented for estimating the popular approval of the president of Peru using Twitter data. The method is simple, fast, and highly sensitive, and can be quickly extended to other opinion analysis cases.
... Payal Khurana Batra et al. [5] proposed a technique, called sentiment analysis, for teaching a machine to extract emotions from text. A keyword might be of any kind, such as a social assertion, a tweet, a message, and so forth. ...
Conference Paper
Full-text available
With the passage of time, a large number of people have come to use social media platforms to share their views. This lets more people communicate with each other. Alongside these benefits, social media also has negative sides, which bring animosity towards some groups of individuals. This can include hate speech. Hate speech is speech that may include abusive or threatening words that affect the community. Such speech needs to be detected and removed from social media platforms before it spreads. Sentiment analysis is the method of deciding whether the sentiment in a text is hateful or not. We analyzed a Twitter dataset using the Weka software. The dataset contained 5000 tweets in total, and we applied two filters (Tweet to Sparse Feature Vector, Tweet to Lexicon Feature Vector) to it to measure the accuracy of the machine learning models. In both cases, the experimental results show that the Random Forest technique has the highest accuracy, i.e., 93%.
... According to experimental results, the model can achieve up to 98.70% accuracy on multiclass prediction (Mohbey 2020). Furthermore, two feature extraction algorithms, BOW and tf-idf, were used and integrated with five machine learning approaches for the 2019 Indian Lok Sabha general elections (Batra et al. 2020). ...
Article
Full-text available
Nowadays, political parties have widely adopted social media for their party promotions and election campaigns. During the election, Twitter and other social media platforms are used for political coverage to promote the party and its candidates. This research discusses and estimates the stability of many volumetric social media approaches to forecast election results from social media activities. Numerous machine learning approaches are applied to opinions shared on social media for predicting election results. This paper presents a machine learning model based on sentiment analysis to predict Pakistan's general election results. In a general election, voters vote for their favorite party or candidate based on their personal interests. Social media has been extensively used for the campaign in Pakistan general election 2018. Using a machine learning technique, we provide a five-step process to analyze the overall election results, whether fair or unfair. The work is concluded with detailed experimental results along with discussion on the outcomes of sentiment analysis for real-world forecasting and approval of general elections in Pakistan.
Article
Full-text available
Opinions of the public and the sentiments originating from them play a pivotal role in social procedures. Sentiment analysis deals with determining the tone or polarity of a text: how positive or negative it is. When applied to news reports, it provides a wide range of applications. This study analyses news reports in real time from reliable sources using a slightly modified Naïve Bayes algorithm. An article is fetched and then pre-processed to get rid of noisy words like English articles. After tokenization, the probability of each word being either positive or negative is determined. This is achieved by training a model using a dataset of brief news headlines, with their sentiment values labelled. The overall probability is summed using the well-known Bayes' theorem, which gives the name to the algorithm. A slight modification is proposed to this algorithm by calculating a sentiment value for the field 'engineering,' which measures how closely a particular report is related to engineering. Based on the relevance to engineering (defined herewith using the dataset), a system is developed that alerts the head of an organization or any competent authorities to the report through an email.
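The word-probability approach described in the abstract above follows the general shape of a multinomial Naïve Bayes classifier, which can be sketched as below. This is a generic textbook version with Laplace smoothing, not the authors' modified algorithm; the function names `train_naive_bayes` and `predict` and the whitespace tokenizer are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(samples):
    """Collect the counts needed for prediction from (text, label) pairs."""
    class_counts = Counter()            # documents per class
    word_counts = defaultdict(Counter)  # word occurrences per class
    vocab = set()
    for text, label in samples:
        tokens = text.lower().split()
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior under Naïve Bayes."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        # Log prior plus the sum of smoothed log likelihoods.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in text.lower().split():
            if tok not in vocab:
                continue  # skip words never seen during training
            count = word_counts[label][tok]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied, which is why the per-word terms are added rather than multiplied.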
Article
Full-text available
This paper explores the complexities of predicting election outcomes in India, focusing on the winning party and the probability of incumbent reelection. Leveraging historical voting data and socio-economic indicators from the Socioeconomic High-resolution Rural-Urban Geographic (SHRUG) dataset and the Lok Dhaba database, the study employs advanced machine learning models to forecast electoral results. The main goal of this paper is to assess these models' ability to forecast the victorious party and determine the likelihood of reelection. Several models, including Random Forest, Gradient Boosting, and Decision Tree, were assessed to meet these goals. With an accuracy of 99.89%, the Random Forest model outperformed the rest of them. This is because of its ensemble learning strategy, which lowers overfitting and increases predictive power. Additionally successful were the Decision Tree and Gradient Boosting models, which yielded accuracies of 98.75% and 99.78%, respectively. The study faced challenges such as computational complexity and potential bias introduced by the dataset, particularly due to the historical dominance of the Indian National Congress (INC) party. Despite these challenges, the models provided valuable insights into voter behaviour and electoral trends. The implications of this study are significant for political analysts and campaign strategists. Accurate predictions can guide the development of targeted campaign strategies and enhance understanding of electoral dynamics. Future research should address dataset biases and explore more efficient algorithms to improve the robustness and applicability of these predictions in real-world scenarios.
Article
Full-text available
Sentiment analysis has become an important task in natural language processing because it is used in many different areas. This paper gives a detailed review of sentiment analysis, including its definition, challenges, and uses. Different approaches to sentiment analysis are discussed, focusing on how they have changed and their limitations. Special attention is given to recent improvements with transformer models and transfer learning. Detailed reviews of well-known transformer models like BERT, RoBERTa, XLNet, ELECTRA, DistilBERT, ALBERT, T5, and GPT are provided, looking at their structures and roles in sentiment analysis. In the experimental section, the performance of these eight transformer models is compared across 22 different datasets. The results show that the T5 model consistently performs the best on multiple datasets, demonstrating its flexibility and ability to generalize. XLNet performs very well in understanding irony and sentiments related to products, while ELECTRA and RoBERTa perform best on certain datasets, showing their strengths in specific areas. BERT and DistilBERT often perform the lowest, indicating that they may struggle with complex sentiment tasks despite being computationally efficient.
Article
Full-text available
The 13th Presidential election attracted wide attention in Turkey as well as in many other countries. In this election period, along with traditional media tools, social media tools were also used frequently in the execution of election campaigns. Interactions received through social media platforms once again proved the effective power of social media tools to reach large masses of all parties and party leaders. For this reason, the Open Microphone program organized by Oğuzhan Uğur, in which many politicians participated, was followed with interest not only in Turkey but also around the world. In this context, this study aims to reveal various findings using Emotion Analysis methods, based on the comments made within the scope of this program. For this purpose, a total of 261,728 user comments, specific to 7 different politicians, were analyzed using the NRC emotion dictionary. With the NRC emotion dictionary, a broader emotional polarity was obtained, including the emotions of anger, fear, trust, anticipation, surprise, sadness, joy, and disgust, in addition to positive or negative emotion polarity. Based on the findings, this study reveals that emotion analysis of the masses through YouTube comments or different platforms can be a critical source of information for political campaigns.
Article
The Billboard chart is a clear barometer for measuring a song's success in the music industry. Therefore, a number of artists and affiliated marketers in the music industry have attempted to determine how to emerge at the top of the chart. In the current study, artist-fan interactions on social media are examined as one of the possible indicators to predict the success of songs on the Billboard Hot 100 chart. The performance of a song on the Billboard chart was predicted based on the artist-fan interaction using the artist-fan dataset composed of posts, comments, and quote tweets, their sentimental levels, and the interaction styles of each post. Overall, the XGBoost model with the quote-tweet interaction data exhibited the highest classification performance (F1-score: 80.75% on Top 1 label), showing that the interaction features extracted from quote-tweets show the strongest relevance to a song's success. We present a simplified approach for observing and understanding public perception for the entertainment industry, specifically for the music industry, through social media interactions. We also suggest the facilitation of artist-fan interactions on social media with similar functions of quote-tweet function on Twitter as a valid strategy to make songs more successful.
Article
Full-text available
Gathering public opinion by analyzing big social data has attracted wide attention due to its interactive and real-time nature. For this, recent studies have relied on both social media and sentiment analysis in order to accompany big events by tracking people’s behavior. In this paper, we propose an adaptable sentiment analysis approach that analyzes social media posts and extracts users’ opinions in real time. The proposed approach consists of first constructing a dynamic dictionary of words’ polarity based on a selected set of hashtags related to a given topic, then classifying the tweets under several classes by introducing new features that strongly fine-tune the polarity degree of a post. To validate our approach, we classified the tweets related to the 2016 US election. The prototype tests achieved good accuracy in detecting positive and negative classes and their sub-classes.
Conference Paper
Full-text available
Social media have received more attention nowadays. Public and private opinions about a wide variety of subjects are expressed and spread continually via numerous social media. Twitter is one of the social media that is gaining popularity. Twitter offers organizations a fast and effective way to analyze customers' perspectives, which is critical to success in the marketplace. Developing a program for sentiment analysis is one approach to computationally measuring customers' perceptions. This paper reports on the design of a sentiment analysis program that extracts a vast number of tweets. Prototyping is used in this development. The results classify customers' perspectives expressed in tweets as positive or negative, and are presented in a pie chart and an HTML page. The program was planned to be developed as a web application; however, due to a limitation of Django, which runs on a Linux server or LAMP, this remains future work.
Article
Full-text available
With the advancement of web technology and its growth, there is a huge volume of data present on the web for internet users, and a lot of data is generated too. The Internet has become a platform for online learning, exchanging ideas and sharing opinions. Social networking sites like Twitter, Facebook, and Google+ are rapidly gaining popularity as they allow people to share and express their views about topics, have discussions with different communities, or post messages across the world. There has been a lot of work in the field of sentiment analysis of Twitter data. This survey focuses mainly on sentiment analysis of Twitter data, which is helpful for analyzing the information in tweets, where opinions are highly unstructured, heterogeneous and are either positive or negative, or neutral in some cases. In this paper, we provide a survey and a comparative analysis of existing techniques for opinion mining, like machine learning and lexicon-based approaches, together with evaluation metrics. Using various machine learning algorithms like Naive Bayes, Max Entropy, and Support Vector Machine, we provide research on Twitter data streams. General challenges and applications of sentiment analysis on Twitter are also discussed in this paper.
Article
Full-text available
We examine sentiment analysis on Twitter data. The contributions of this paper are: (1) We introduce POS-specific prior polarity features. (2) We explore the use of a tree kernel to obviate the need for tedious feature engineering. The new features (in conjunction with previously proposed features) and the tree kernel perform approximately at the same level, both outperforming the state-of-the-art baseline. kernel based model. For the feature based model we use some of the features proposed in past literature and propose new features. For the tree kernel based model we design a new tree representation for tweets. We use a unigram model, previously shown to work well for sentiment analysis for Twitter data, as our baseline. Our experiments show that a unigram model is indeed a hard baseline achieving over 20% over the chance baseline for both classification tasks. Our feature based model that uses only 100 features achieves similar accuracy as the unigram model that uses over 10,000 features. Our tree kernel based model outperforms both these models by a significant margin. We also experiment with a combination of models: combining unigrams with our features and combining our features with the tree kernel. Both these combinations outperform the unigram baseline by over 4% for both classification tasks. In this paper, we present extensive feature analysis of the 100 features we propose. Our experiments show that features that have to do with Twitter-specific features (emoticons, hashtags etc.) add value to the classifier but only marginally. Features that combine prior polarity of words with their parts-of-speech tags are most important for both the classification tasks. Thus, we see that standard natural language processing tools are useful even in a genre which is quite different from the genre on which they were trained (newswire).
Furthermore, we also show that the tree kernel model performs roughly as well as the best feature based models, even though it does not require detailed feature en-
Article
Full-text available
The bag-of-words model is one of the most popular representation methods for object categorization. The key idea is to quantize each extracted key point into one of the visual words, and then represent each image by a histogram of the visual words. For this purpose, a clustering algorithm (e.g., K-means) is generally used for generating the visual words. Although a number of studies have shown encouraging results of the bag-of-words representation for object categorization, theoretical studies on properties of the bag-of-words model are almost untouched, possibly due to the difficulty introduced by using a heuristic clustering process. In this paper, we present a statistical framework which generalizes the bag-of-words representation. In this framework, the visual words are generated by a statistical process rather than using a clustering algorithm, while the empirical performance is competitive with clustering-based methods. A theoretical analysis based on statistical consistency is presented for the proposed framework. Moreover, based on the framework we developed two algorithms which do not rely on clustering, while achieving competitive performance in object categorization when compared to clustering-based bag-of-words representations. Keywords: Object recognition, Bag of words model, Rademacher complexity
Article
Full-text available
A novel probabilistic retrieval model is presented. It forms a basis to interpret the TF-IDF term weights as making relevance decisions. It simulates the local relevance decision-making for every location of a document, and combines all of these “local” relevance decisions as the “document-wide” relevance decision for the document. The significance of interpreting TF-IDF in this way is the potential to: (1) establish a unifying perspective about information retrieval as relevance decision-making; and (2) develop advanced TF-IDF-related term weights for future elaborate retrieval models. Our novel retrieval model is simplified to a basic ranking formula that directly corresponds to the TF-IDF term weights. In general, we show that the term-frequency factor of the ranking formula can be rendered into different term-frequency factors of existing retrieval systems. In the basic ranking formula, the remaining quantity -log p(r̄ | t ∈ d) is interpreted as the probability of randomly picking a nonrelevant usage (denoted by r̄) of term t. Mathematically, we show that this quantity can be approximated by the inverse document-frequency (IDF). Empirically, we show that this quantity is related to IDF, using four reference TREC ad hoc retrieval data collections.
Article
Twitter is a microblogging website where users read and write millions of short messages on a variety of topics every day. This study uses the context of the German federal election to investigate whether Twitter is used as a forum for political deliberation and whether online messages on Twitter validly mirror offline political sentiment. Using LIWC text analysis software, we conducted a content-analysis of over 100,000 messages containing a reference to either a political party or a politician. Our results show that Twitter is indeed used extensively for political deliberation. We find that the mere number of messages mentioning a party reflects the election result. Moreover, joint mentions of two parties are in line with real world political ties and coalitions. An analysis of the tweets’ political sentiment demonstrates close correspondence to the parties' and politicians’ political positions indicating that the content of Twitter messages plausibly reflects the offline political landscape. We discuss the use of microblogging message content as a valid indicator of political sentiment and derive suggestions for further research.
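The volume-based finding in the abstract above (that the number of messages mentioning a party reflects the election result) can be illustrated with a small counting sketch. The function `mention_shares`, the keyword dictionary, and the example party names are hypothetical; the study itself relied on LIWC and its own data collection, which this example does not reproduce.

```python
from collections import Counter


def mention_shares(tweets, party_keywords):
    """Return each party's share of all party mentions.

    A tweet counts once for every party it mentions, so joint mentions
    contribute to several parties. Matching is crude substring search.
    """
    counts = Counter()
    for tweet in tweets:
        text = tweet.lower()
        for party, keywords in party_keywords.items():
            if any(kw in text for kw in keywords):
                counts[party] += 1
    total = sum(counts.values())
    return {
        party: (counts[party] / total if total else 0.0)
        for party in party_keywords
    }
```

The resulting shares could then be compared with actual vote shares, for example by computing the mean absolute difference across parties.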
Conference Paper
Sentiment analysis is an evaluation of the opinion of the speaker, writer or other subject with regard to some topic. In the 2016 US presidential election, Donald Trump, Hillary Clinton and Bernie Sanders were among the top election candidates. The opinion of the public for a candidate will impact the potential leader of the country. Twitter is used to acquire a large, diverse data set representing the current public opinions of the candidates. The collected tweets are analyzed using a lexicon-based approach to determine the sentiments of the public. In this paper, we determine the polarity and subjectivity measures for the collected tweets, which help in understanding the user opinion for a particular candidate. Further, a comparison is made among the candidates over the type of sentiment. Also, a word cloud is plotted representing the most frequently appearing words in the tweets.
Conference Paper
Twitter is a microblogging website where users read and write millions of short messages on a variety of topics every day. This study uses the context of the German federal election to investigate whether Twitter is used as a forum for political deliberation and whether online messages on Twitter validly mirror offline political sentiment. Using LIWC text analysis software, we conducted a content-analysis of over 100,000 messages containing a reference to either a political party or a politician. Our results show that Twitter is indeed used extensively for political deliberation. We find that the mere number of messages mentioning a party reflects the election result. Moreover, joint mentions of two parties are in line with real world political ties and coalitions. An analysis of the tweets' political sentiment demonstrates close correspondence to the parties' and politicians' political positions indicating that the content of Twitter messages plausibly reflects the offline political landscape. We discuss the use of microblogging message content as a valid indicator of political sentiment and derive suggestions for further research. Copyright © 2010, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
Twitter based election prediction and analysis
  • P Salunkhe
  • S Deshmukh