Article

Sentiment Analysis of Lockdown in India during COVID-19: A Case Study on Twitter


Abstract

With the rapid increase in the use of the Internet, sentiment analysis has become one of the most popular fields of natural language processing (NLP). Using sentiment analysis, the emotion implied in a text can be mined effectively for different purposes. During the COVID-19 outbreak, people are using social media on a massive scale to receive and communicate different types of information. Mining such content to evaluate people's sentiments can play a critical role in making decisions to keep the situation under control. The objective of this study is to mine the sentiments of Indian citizens regarding the nationwide lockdown enforced by the Indian government to reduce the spread of the coronavirus. In this work, sentiment analysis of tweets posted by Indian citizens has been performed using NLP and machine learning classifiers. A total of 12,741 tweets containing the keyword "Indialockdown", posted from April 5, 2020 to April 17, 2020, were extracted. The data were extracted from Twitter using the Tweepy API, annotated using the TextBlob and VADER lexicons, and preprocessed using the Natural Language Toolkit (NLTK) for Python. Eight different classifiers were used to classify the data. The experiment achieved its highest accuracy, 84.4%, with the LinearSVC classifier and unigrams. The study concludes that the majority of Indian citizens support the lockdown decision implemented by the Indian government during the coronavirus outbreak.
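
The annotation and unigram steps described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' code: the polarity score is assumed to come from a lexicon tool such as VADER or TextBlob, the ±0.05 cut-offs are the convention commonly used with VADER's compound score rather than values stated in the abstract, and the function names `label_from_polarity`, `unigram_counts`, `build_vocabulary`, and `vectorize` are hypothetical.

```python
import re
from collections import Counter

# Assumption: +/-0.05 on a polarity score in [-1, 1] is the cut-off
# conventionally used with VADER's compound score; the study's own
# thresholds are not given in the abstract.
POS_THRESHOLD = 0.05
NEG_THRESHOLD = -0.05


def label_from_polarity(score):
    """Map a lexicon polarity score to a three-way sentiment label."""
    if score >= POS_THRESHOLD:
        return "positive"
    if score <= NEG_THRESHOLD:
        return "negative"
    return "neutral"


def unigram_counts(tweet):
    """Lowercase a tweet and count its single-word tokens (unigrams)."""
    return Counter(re.findall(r"[a-z']+", tweet.lower()))


def build_vocabulary(tweets):
    """Assign a column index to every unigram seen in the corpus."""
    tokens = sorted({tok for tweet in tweets for tok in unigram_counts(tweet)})
    return {tok: idx for idx, tok in enumerate(tokens)}


def vectorize(tweet, vocabulary):
    """Turn a tweet into a bag-of-unigrams count vector."""
    vector = [0] * len(vocabulary)
    for tok, count in unigram_counts(tweet).items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = count
    return vector
```

Vectors of this kind, paired with the lexicon-derived labels, are the sort of input a linear classifier such as LinearSVC would be trained on.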


... Gupta et al. [18] studied how Indians feel about the state's countrywide shutdown, which was enacted to prevent coronavirus from spreading. Computational models were used to detect emotions in tweets written by Indian citizens. ...
... For sentiment prediction, Nemes and Kiss [13] used the RNN, and Imran et al. [14] investigated sentiment across neighboring countries during the pandemic. Chakraborty et al. [15] and Basiri et al. [16] used SVM and hybrid model for the sentiment analysis, while Kaur and Ranjan [17] and Gupta et al. [18] studied public reactions to lockdowns and pandemics. In emotion detection and sentiment analysis, Nemes and Kiss [13] used the RNN for sentiment prediction and compared it with TextBlob. ...
... Chakraborty et al. [15] and Basiri et al. [16] used SVM and hybrid models for the sentiment analysis with varying accuracy. Kaur and Ranjan [17] and Gupta et al. [18] worked on reactions to lockdowns and pandemics and moved more into sentiment than detailed emotion detection. ...
Article
Full-text available
COVID-19 has significantly impacted peoples’ mental health because of isolation and social distancing measures. It practically impacts every segment of people’s daily lives and causes a medical problem that spreads throughout the entire world. This pandemic has caused an increased emotional distress. Since everyone has been affected by the epidemic physically, emotionally, and financially, it is crucial to examine and comprehend emotional reactions as the crisis affects mental health. This study uses Twitter data to understand what people feel during the pandemic. We collected Twitter data about COVID-19 and isolation, preprocessed the text, and then classified the tweets into various emotion classes. The data are collected using the twarc library and the Twitter academic researcher account and labeled using a Vader analyzer after preprocessing. We trained five machine learning models, namely, support vector machine (SVM), Naïve Bayes, KNN, decision tree, and logistic regression to find patterns and trends in emotions. The emotional reactions of individuals to the COVID-19 crisis are then analyzed. We applied precision, recall, F1-score, and accuracy as the evaluation metrics, which shows that SVM has performed best among other models. Our results show that isolated people felt various emotions, out of which, fear, sadness, and surprise were the most common. This study gives insights into the emotional impact of the pandemic and shows the power of Twitter data in understanding mental health outcomes. Our findings can be used to develop targeted interventions and support strategies to address the emotional toll of the pandemic.
... Thus the proposed model obtained a better sentiment score. Gupta et al. [15] introduced a lexicon-based model (LBM) classification approach based on coronavirus tweets. The Twitter data was analyzed based on drugs, instances, and circumstances across the patients' ideas during lockdown time. ...
... Step 4: Let $x$ be the input text, and $a$ be the aspect or entity we want to analyze the sentiment towards. The ABSA equations are
$$h_{\mathrm{absa}} = \mathrm{RNN}(x) \tag{14}$$
$$y_{\mathrm{absa}} = \mathrm{Softmax}(W_{\mathrm{absa}} \cdot h_{\mathrm{absa}}) \tag{15}$$
where $h_{\mathrm{absa}}$ is the hidden state of the RNN, $y_{\mathrm{absa}}$ is the output sentiment score for the aspect $a$, and $W_{\mathrm{absa}}$ is the weight matrix. ...
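The output equation in this snippet, a softmax applied to the product of a weight matrix and the RNN hidden state, can be sketched in plain Python as follows. This is only an illustration: the hidden state is taken as given rather than computed by an RNN, the numeric values in the usage note are made up, and the helper names `matvec`, `softmax`, and `absa_scores` are hypothetical.

```python
import math


def matvec(weights, hidden):
    """Multiply a weight matrix (list of rows) by a hidden-state vector."""
    return [sum(w * h for w, h in zip(row, hidden)) for row in weights]


def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def absa_scores(w_absa, h_absa):
    """y_absa = Softmax(W_absa * h_absa): one probability per sentiment class."""
    return softmax(matvec(w_absa, h_absa))
```

For example, `absa_scores([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]], [2.0, 0.0])` returns three probabilities that sum to 1, with the first class receiving the largest share because its logit is the largest.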
Article
Full-text available
Sentiment analysis is a sub-domain of opinion mining that extracts sentiments from users’ opinions expressed in text messages. These opinions come from e-commerce websites, blogs, online social media, etc., and take the form of text, suggestions, and comments. This paper describes a new sentiment analysis model that predicts sentiments effectively and can be used to improve product quality and sales. The proposed approach is an integrated model combining several techniques, such as the pre-trained model BERT-large-cased (BLC) for training the dataset. The BLC model has 24 layers, a hidden size of 1024, 16 attention heads, and 340M parameters. Optimization algorithms can fine-tune pre-trained models, such as BERT, for sentiment analysis tasks. Fine-tuning involves training the pre-trained model on a specific sentiment analysis task to improve performance. Stochastic Gradient Descent (SGD) is the optimization algorithm used to analyze the sentiments effectively from the given datasets. The next step is the combination of pre-processing techniques such as tokenization, stop word removal, etc. The following step uses Bag-of-Words (BoW) and word embedding techniques like Word2Vec to extract features from the datasets. The deep sentiment analysis (DSA) based classification is designed to classify the sentiments based on an aspect and priority model to achieve better results. The proposed model combines Aspect and Priority-based Sentiment analysis with a Decision-based Recurrent Neural Network (D-RNN). The experiments are conducted using Twitter, Restaurant, and Laptop datasets publicly available on Kaggle. The proposed model’s performance is analyzed using a confusion matrix. The proposed approach addresses various challenges in sentiment analysis. The Python programming language, with several libraries such as Keras, Pandas, and others, is used to extract the sentiments from the given datasets. 
The comparison between the existing and proposed models shows the effectiveness of the sentiment outputs.
... A Rule-Based framework's main principle is to develop and combine knowledge of a particular topic into a computer program. This research explores TextBlob and Valence Aware Dictionary for Sentiment Reasoning (VADER), two Rule-Based techniques [65,66]. ...
... It is primarily employed for text corpora on social networks. The Sentiment Intensity Analyzer function uses the VADER tool to assist in determining the sentiment score for text content [65,66]. ...
Article
Background Text mining derives information and patterns from textual data. Online social media platforms, which have recently acquired great interest, generate vast text data about human behaviors based on their interactions. This data is generally ambiguous and unstructured. The data includes typing errors and errors in grammar that cause lexical, syntactic, and semantic uncertainties. This results in incorrect pattern detection and analysis. Researchers are employing various text mining techniques that can aid in Topic Modeling, the detection of Trending Topics, the identification of Hate Speeches, and the growth of communities in online social media networks. Objective This review paper compares the performance of ten machine learning classification techniques on a Twitter data set for analyzing users' sentiments on posts related to airline usage. Methods Review and comparative analysis of Gaussian Naive Bayes, Random Forest, Multinomial Naive Bayes, Multinomial Naive Bayes with Bagging, Adaptive Boosting (AdaBoost), Optimized AdaBoost, Support Vector Machine (SVM), Optimized SVM, Logistic Regression, and Long-Short Term Memory (LSTM) for sentiment analysis. Results The results of the experimental study showed that the Optimized SVM performed better than the other classifiers, with a training accuracy of 99.73% and testing accuracy of 89.74% compared to other models. Conclusion Optimized SVM uses the RBF kernel function and nonlinear hyperplanes to split the dataset into classes, correctly classifying the dataset into distinct polarity. This, together with Feature Engineering utilizing Forward Trigrams and Weighted TF-IDF, has improved Optimized SVM classifier performance regarding train and test accuracy. Therefore, the train and test accuracy of Optimized SVM are 99.73% and 89.74% respectively. 
Compared with Random Forest, Optimized SVM shows a marginal improvement of 0.09% in train accuracy and 1.73% in test accuracy; compared with LSTM, the improvement is 1.29% (train accuracy) and 3.63% (test accuracy). Likewise, in the experiments Optimized SVM gave more than 10% better train accuracy than Gaussian Naïve Bayes, Multinomial Naïve Bayes, Multinomial Naïve Bayes with Bagging, and Logistic Regression, and a similar improvement was observed over AdaBoost and Optimized AdaBoost, which are ensemble models. Optimized SVM also outperformed all the other classification models in terms of AUC-ROC train and test scores.
... Tripathi [85] and Situala et al. [86] used multiple machine-learning approaches to perform sentiment analysis of COVID-19-focused tweets that were posted by people who stated their location as Nepal on Twitter. The purpose of the work by Gupta et al. [87] was to examine the perceptions of Indians, as expressed on Twitter, towards the Indian Government's countrywide lockdown, which was implemented to slow the spread of COVID-19. In this context, the authors used the LinearSVC classifier to perform sentiment analysis, and their classifier achieved a performance accuracy of 84.4%. ...
... As can be seen from Table 4, the work presented in this paper is the first work in this area that focuses on sentiment analysis of tweets that focused on COVID-19 and MPox at the same time. Works listed: Pokharel [79]; Chakraborty et al. [80]; Shofiya et al. [81]; Basiri et al. [82]; Cheeti et al. [83]; Ridhwan et al. [84]; Tripathi [85]; Situala et al. [86]; Gupta et al. [87]; Alanezi et al. [88]; Dubey [89]; Rahman et al. [90]; Ainlet et al. [91]; Slobodin et al. [92]; Zou et al. [93]; Alhuzali et al. [94]; Hussain et al. [95]; Liu et al. [96]; Hu et al. [97]; Khan et al. [98]; Ahmed et al. [99]; Lin et al. [100]; Jang et al. [101]; Tsao et al. [102]; Griffith et al. [103]; Chum et al. [104]; Kothari et al. [105]; Barkur et al. [106]; Afroz et al. [107]; Hota et al. [108]; Venigalla et al. [109]; Paliwal et al. [110]; Zhou et al. [111]; Lamsal et al. [112]; Zhou et al. [113]; de Melo et al. [114]; Brum et al. [115]; de Sousa et al. [116]; Iparraguirre-Villanueva et al. [117]; Mohbey et al. [118]; Farahat et al. [119]; Sv et al. [120]; Bengesi et al. [121]; Dsouza et al. [122]; Zuhanda et al. [123]; Cooper et al. [124]; Ng et al. [125]; Thakur [this work] ...
Article
Full-text available
Mining and analysis of the big data of Twitter conversations have been of significant interest to the scientific community in the fields of healthcare, epidemiology, big data, data science, computer science, and their related areas, as can be seen from several works in the last few years that focused on sentiment analysis and other forms of text analysis of tweets related to Ebola, E-Coli, Dengue, Human Papillomavirus (HPV), Middle East Respiratory Syndrome (MERS), Measles, Zika virus, H1N1, influenza-like illness, swine flu, flu, Cholera, Listeriosis, cancer, Liver Disease, Inflammatory Bowel Disease, kidney disease, lupus, Parkinson’s, Diphtheria, and West Nile virus. The recent outbreaks of COVID-19 and MPox have served as “catalysts” for Twitter usage related to seeking and sharing information, views, opinions, and sentiments involving both of these viruses. None of the prior works in this field analyzed tweets focusing on both COVID-19 and MPox simultaneously. To address this research gap, a total of 61,862 tweets that focused on MPox and COVID-19 simultaneously, posted between 7 May 2022 and 3 March 2023, were studied. The findings and contributions of this study are manifold. First, the results of sentiment analysis using the VADER (Valence Aware Dictionary for sEntiment Reasoning) approach shows that nearly half the tweets (46.88%) had a negative sentiment. It was followed by tweets that had a positive sentiment (31.97%) and tweets that had a neutral sentiment (21.14%), respectively. Second, this paper presents the top 50 hashtags used in these tweets. Third, it presents the top 100 most frequently used words in these tweets after performing tokenization, removal of stopwords, and word frequency analysis. The findings indicate that tweets in this context included a high level of interest regarding COVID-19, MPox and other viruses, President Biden, and Ukraine. 
Finally, a comprehensive comparative study that compares the contributions of this paper with 49 prior works in this field is presented to further uphold the relevance and novelty of this work.
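The hashtag ranking and the tokenization, stopword removal, and word-frequency steps mentioned in this abstract can be sketched with the Python standard library alone. This is an illustration under assumptions, not the study's pipeline: the stopword list is a tiny placeholder, the regular expressions are a simplification of real tweet tokenization, and the names `top_words` and `top_hashtags` are hypothetical.

```python
import re
from collections import Counter

# Assumption: a toy stopword list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "are", "for", "on"}


def top_words(tweets, n=100):
    """Tokenize, drop stopwords, and return the n most frequent words."""
    counts = Counter()
    for tweet in tweets:
        tokens = re.findall(r"[a-z0-9]+", tweet.lower())
        counts.update(tok for tok in tokens if tok not in STOPWORDS)
    return counts.most_common(n)


def top_hashtags(tweets, n=50):
    """Return the n most frequent hashtags, compared case-insensitively."""
    counts = Counter(
        tag.lower() for tweet in tweets for tag in re.findall(r"#\w+", tweet)
    )
    return counts.most_common(n)
```

Note that in this sketch a hashtag such as `#covid` also contributes the word `covid` to the word counts; whether hashtags should be excluded from the word-frequency analysis is a design choice the abstract does not specify.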
... In the Journal of Environmental Management, Kumar, Kumar, and Gupta [5] discussed the current state of medical waste management and proposed sustainable strategies for utilizing this waste in the construction industry. They highlighted the environmental and public health benefits of sustainable medical waste management. ...
Article
The sustainable use of medical waste in the construction sector has gained attention as an innovative solution to address the growing environmental concerns linked to biomedical waste disposal. Medical waste contributes significantly to pollution, and its effective management is vital to safeguarding both public health and the environment. Simultaneously, the construction industry, as one of the largest consumers of natural resources, generates considerable waste. Integrating medical waste into construction materials offers a dual benefit-minimizing the environmental footprint of the construction industry while providing an eco-friendly solution for managing medical waste. This paper provides a comprehensive review of the current research on the sustainable incorporation of medical waste into construction materials. It examines various methods of medical waste utilization, identifies the challenges and opportunities this approach presents, and explores the policy and regulatory frameworks that support sustainable medical waste management. The paper concludes by emphasizing the potential of this approach to promote a circular economy, reduce the environmental impact of waste management, and underscores the importance of further research and cross-industry collaboration between the medical and construction sectors.
... Gupta [12] analyzed the sentiments of Indians towards the nationwide lockdown enforced by the Indian leadership to reduce the spread of Covid-19. Sentiment classification of tweets posted by Indian citizens was performed in the study using machine learning and NLP models. ...
... Some of these works are discussed below. Gupta et al. used Twitter data from April 5, 2020, to April 17, 2020, a total of 12741 tweets that mentioned the keywords "India" and "lockdown" to prepare a database (Gupta et al., 2020). They used VADER and TextBlob to annotate the tweets and analyse how Indians are taking the idea of lockdown during COVID-19. ...
... Furthermore, Demilie and Salau [33] address the pervasive issue of hate speech detection on social media platforms, emphasizing the need for advanced research and optimal approaches in this challenging domain. Gupta et al. [34] contribute to the sentiment analysis domain, mining the sentiments of Indian citizens regarding the nationwide lockdown during the COVID-19 outbreak, with a notable accuracy of 84.4%. Jayasurya et al. [35] conduct sentiment analysis on the topic of COVID-19 vaccination, employing 14 different machine learning classifiers and revealing insightful temporal and spatial analyses of textual data. ...
Article
Full-text available
Informal education via social media plays a crucial role in modern learning, offering self-directed and community-driven opportunities to gain knowledge, skills, and attitudes beyond traditional educational settings. These platforms provide access to a broad range of learning materials, such as tutorials, blogs, forums, and interactive content, making education more accessible and tailored to individual interests and needs. However, challenges like information overload and the spread of misinformation highlight the importance of digital literacy in ensuring users can critically evaluate the credibility of information. Consequently, the significance of sentiment analysis has grown in contemporary times due to the widespread utilization of social media platforms as a means for individuals to articulate their viewpoints. Twitter (now X) is well recognized as a prominent social media platform that is predominantly utilized for microblogging. Individuals commonly engage in expressing their viewpoints regarding contemporary events, hence presenting a significant difficulty for scholars to categorize the sentiment associated with such expressions effectively. This research study introduces a highly effective technique for detecting misinformation related to the COVID-19 pandemic. The spread of fake news during the COVID-19 pandemic has created significant challenges for public health and safety because misinformation about the virus, its transmission, and treatments has led to confusion and distrust among the public. The methodology of this work includes gathering a dataset comprising fabricated news articles sourced from a corpus and subjected to the natural language processing (NLP) cycle. 
After applying some filters, a total of five machine learning classifiers and three deep learning classifiers were employed to forecast the sentiment of news articles, distinguishing between those that are authentic and those that are fabricated. This research employs machine learning classifiers, namely Support Vector Machine, Logistic Regression, K-Nearest Neighbors, Decision Trees, and Random Forest, to analyze and compare the obtained results. This research employs Convolutional Neural Networks, Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU) as deep learning classifiers, and afterwards compares the obtained results. The results indicate that the BiGRU deep learning classifier demonstrates high accuracy and efficiency, with the following indicators: accuracy of 0.91, precision of 0.90, recall of 0.93, and F1-score of 0.92. For the same algorithm, the true negatives, and true positives came out to be 555 and 580, respectively, whereas, the false negatives and false positives came out to be 81, and 68, respectively. In conclusion, this research highlights the effectiveness of the BiGRU deep learning classifier in detecting misinformation related to COVID-19, emphasizing its significance for fostering media literacy and resilience against fake news in contemporary society. The implications of this research are significant for higher education and lifelong learners as it highlights the potential for using advanced machine learning to help educators and institutions in the process of combating the spread of misinformation and promoting critical thinking skills among students. By applying these methods to analyze and classify news articles, educators can develop more effective tools and curricula for teaching media literacy and information validation, equipping students with the skills needed to discern between authentic and fabricated information in the context of the COVID-19 pandemic and beyond. 
The implications of this research extrapolate to the creation of a society that is resistant to the spread of fake news through social media platforms.
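The indicators quoted in this abstract (accuracy, precision, recall, F1-score) are standard functions of the four confusion-matrix counts. The sketch below uses the textbook definitions; the function name `classification_metrics` is hypothetical, and the abstract does not say which split or averaging scheme produced its reported figures, so values computed from the quoted counts need not coincide with them.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts.

    Assumes a binary task with at least one predicted positive and one
    actual positive, so that no denominator is zero.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Counts as quoted in the abstract: TP = 580, TN = 555, FP = 68, FN = 81.
quoted = classification_metrics(tp=580, tn=555, fp=68, fn=81)
```

Comparing `quoted` with the reported indicators is a quick consistency check; any gap may reflect a different evaluation split or averaging convention.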
... Also, this model shows the comparative performance with various metrics based on the sentiments. Gupta et al. [13] introduced a new sentiment analysis model that analyzes the tweets given by various users on Twitter social media. All these tweets were analyzed according to the sentiments. ...
Article
Full-text available
Sentiment analysis (SA), the process of determining the emotional tone behind a piece of text, has gained significant importance in various domains, including marketing, customer feedback analysis, and social media monitoring. Traditional SA models often face challenges handling diverse global datasets due to language variations, cultural nuances, and context differences. This paper proposes an Ensemble Recommendation System (ERS) approach to address these challenges and improve SA accuracy. The ERS combines Gated Attention-Based Recurrent Networks (GARNET) Steps with Transfer Learning. The ERS leverages the power of ensemble learning, combining multiple sentiment analysis models trained on global datasets from diverse linguistic backgrounds. The ERS can provide more robust and accurate sentiment classifications by aggregating predictions from these models. Additionally, the system utilizes a recommendation mechanism that dynamically selects the most suitable model based on the characteristics of the input text, such as language, tone, and context. In this paper, an advanced pre-trained model, DistilBERT, is used to train the selected datasets and apply transfer learning. Transfer learning is used to pass the features extracted during training to the proposed ERS approach. Second, we design and train multiple state-of-the-art SA models integrated to handle specific linguistic attributes while also considering the cultural biases present in the datasets. Third, we introduce the ERS recommendation mechanism, which enhances the system's performance and optimizes computational resources. To assess the effectiveness of the proposed ERS, extensive experiments were conducted on benchmark datasets, comparing its performance with individual sentiment analysis models and conventional ensemble techniques. The proposed approach achieves superior sentiment classification accuracy, especially when dealing with challenging global datasets. 
Finally, ERS presents a promising solution for deep sentiment analysis over diverse global datasets. The ERS also focuses on finding the popular items or products recommended by the proposed approach.
... Social media platforms such as Twitter receive an enormous number of messages during health emergencies such as epidemics or pandemics. The real-time data on Twitter is vital for disease-related data analyses and applications [5], [12], [13], [14]. This section summarizes recent works on interpretable classification and summarization models during crisis events such as disease outbreaks. ...
Article
Social media platforms, such as Twitter, are crucial resources to obtain situational information during disease outbreaks. Due to the sheer volume of user-generated content, providing tools that can automatically classify input texts into various types, such as symptoms, transmission, prevention measures, etc., and generate concise situational updates is necessary. Apart from high classification accuracy, interpretability is an important requirement when designing machine learning models for tasks in medical domain. In this article, we provide annotated epidemic-related datasets with labels of information types and rationales, which are short phrases from the original tweets, to support the assigned labels. Next, we introduce a trustworthy approach for the automatic classification of tweets posted during epidemics. Our classification model is able to extract short explanations/rationales for output decisions on unseen data. Moreover, we propose a simple graph-based ranking method to generate short summaries of tweets. Experiments on two epidemic-related datasets show the following: 1) our classification model obtains an average of 82% Macro-F1 and better interpretability scores in terms of Token-F1 (20% improvement) than baselines; 2) the extracted rationales capture essential disease-related information in the tweets; 3) our graph-based method with rationales is simple, yet efficient for generating concise situational updates.
... Sentiment analysis is one of the methods used to gain general information about consumer emotions when using or purchasing OTA services. Sentiment analysis was carried out during the lockdown policy in India, using several techniques [8]. Another application of sentiment analysis was in analyzing film reviews [9]. ...
Article
Full-text available
The growth in the number of internet users has given Indonesia an opportunity to develop internet-based services, such as online travel agents (OTAs). Along with this OTA development, conventional travel agents have been declining. Many conventional travel agents have decided to switch to being online travel agents. The emergence of new OTAs has also made competition among OTAs more challenging. Thus, lessons learned from the market-leading OTAs are expected to help new OTAs survive the competition. This research uses the sentiment analysis method to understand consumers' perceptions of OTAs and the social network analysis method to recognize actors who play significant roles in the travel agent business network. Lastly, the marketing strategies of the major and well-known OTAs as perceived by online consumers were analyzed. Using data collected from the social media networks of three major OTAs (i.e., Traveloka, Tiket, and Booking), it was found that the general impression of consumers towards OTAs is positive. Furthermore, the key actors for each OTA can be recognized. Lastly, marketing strategies can be proposed, namely providing complete product offerings, providing competitive prices, creating special promos for consumers, carrying out promotion on all social media using Bahasa Indonesia, and making the products offered available throughout Indonesia and usable by everyone, especially travelers.
... In recent years, Twitter and Sina Weibo have become popular platforms for users to post and spread microblog messages. Analyzing the sentiment of tweets on large social media platforms can help governments or e-commerce companies, for example, perceive public opinion on various topics (e.g., political events, celebrities, daily life, etc.) and has a wide range of applications in both academic and industrial fields [1,6]. Microblog sentiment analysis aims to quickly discriminate sentiment from massive amounts of data, reducing ...
Article
Full-text available
In sentiment analysis of microblogs that involve social relationships, a common approach is to expand the features of target microblogs using microblog relationship networks. However, the current research methodology only relies on individual interaction behaviors on social platforms to construct these networks, disregarding the guiding influence of microblog relationship networks on microblogs. Consequently, this leads to the feature expansion of microblogs while introducing interference among them. To address this problem, this study aims to construct a more precise microblog relationship network by incorporating multiple interactive behaviors from social platforms. This network will serve as a guide for sentiment interactions among microblog texts, thereby mitigating feature interference. Firstly, we utilize various interaction behaviors on social platforms to build a microblog relationship network. We employ a LINE network embedding to represent the microblog relationship network as microblog relationship features. Secondly, we extract word-level and sentence-level features from the microblog text using a BERT pre-training model. The word-level features are combined using convolutional neural networks. Subsequently, the word-level and sentence-level features of microblogs are separately guided for interaction fusion through relational features. An attention network is then constructed to fuse the post-interaction features in a single step. Prior to secondary fusion, the primary fusion features, post-interaction word-level features, and post-interaction sentence-level features are weighted, and the sentiment categories of microblogs are outputted. Finally, we compare the proposed method with the text-only microblog sentiment analysis approach and the sentiment analysis method that incorporates social relationships on two real datasets. The comparison results demonstrate the superiority of our proposed method.
... This phenomenon can be explained by how the citizens' faith in their governments has gradually eroded [5], especially during the tough pandemic period. We observe an increase in such literature during COVID-19, e.g., [76,77]. During such a time, it is important for governments to pay attention to citizens' voices and make positive responses. ...
Article
Full-text available
Natural language processing (NLP), which is known as an emerging technology creating considerable value in multiple areas, has recently shown its great potential in government operations and public administration applications. However, while the number of publications on NLP is increasing steadily, there is no comprehensive review for a holistic understanding of how NLP is being adopted by governments. In this regard, we present a systematic literature review on NLP applications in governments by following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) protocol. The review shows that the current literature comprises three levels of contribution: automation, extension, and transformation. The most-used NLP techniques reported in government-related research are sentiment analysis, machine learning, deep learning, classification, data extraction, data mining, topic modelling, opinion mining, chatbots, and question answering. Data classification, management, and decision-making are the most frequently reported reasons for using NLP. The salient research topics being discussed in the literature can be grouped into four categories: (1) governance and policy, (2) citizens and public opinion, (3) medical and healthcare, and (4) economy and environment. Future research directions should focus on (1) the potential of chatbots, (2) NLP applications in the post-pandemic era, and (3) empirical research for government work.
... Several studies were conducted on COVID-19 primarily focusing on the Indian sub-continent. Gupta et al (2020) performed a study to identify the sentiments of Indians during the period of lockdown using several machine learning classifiers. Bhattacharya et al (2021) conducted a study on India to identify how various factors, including infodemic, affect the mental state of individuals around the globe. ...
Article
Full-text available
Amidst the persistent COVID-19 pandemic, there has been profound political, economic, and social disruption across the entire world. India has emerged as one of the countries most affected by this pandemic globally. The government has taken extensive measures to combat the disease and is disseminating essential information about it on social media, particularly Twitter. Restricted or polarized interactions and diverging opinions among politicians may hinder the formulation of important policies and measures for managing this crucial situation. This paper, therefore, aims to perform an in-depth investigation of the Twitter activities of Indian political leaders in response to COVID-19. The study presents an analysis of their tweet sentiments and the formation of networks during political discussions. The analysis has been done separately on three different topics pertaining to COVID-19: preventive measures, lockdown, and vaccination. Firstly, the communication ties formed between the politicians during discussions on the respective topics are investigated based on network analysis of their mentions and retweets. The communities formed in the interaction networks and the extent of polarization between the communities are then examined. Secondly, sentiment analysis of the tweets has been performed using some well-known machine learning classifiers to identify the sentiment leaning of the politicians and the communities toward the issue. This combined approach of network-based and sentiment-based analysis provides a better characterization of political communities and their leanings regarding the pandemic. The findings revealed polarized communication in retweets and a high level of cross-party interaction in mentions. The politicians were found to have an overall positive response toward preventive measures and vaccination, while the majority showed negative sentiments toward lockdown.
... The tweets will then be classified as positive, neutral or negative (6) . The opinions of people about various aspects like covid-19 symptoms, vaccination, and quarantine will be then summarized using the T5 summarizer (7,8) . Finally, In the proposed system, we provide an interactive Q/A system in addition to analysis of the tweets which is not present in the existing systems. ...
... They combined outputs of deep learning models and feature-based models using a multilayer perceptron network. In more recent works, Gupta et al. (2020) collected the data from Twitter to analyze the sentiment of Indians towards the nationwide lockdown imposed by the government during the Corona pandemic. They experimented with 8 different ML classifiers and found that LinearSVC performs better than other classifiers. ...
Article
Full-text available
In recent times, there has been tremendous growth in the number of multi-lingual users on social media platforms. Consequently, the code-mixing phenomenon, i.e., mixing of more than one language, has become ubiquitous in Internet content. In this paper, we present a shared-private, multi-lingual, multi-task model coupled with a transformer-based pre-trained encoder for sentiment analysis of code-mixed and English languages. Our model is tailored for multitasking that transfers the knowledge between code-mixed and English sentiment tasks. We consider code-mixed sentiment analysis as the primary task and enhance its performance by English sentiment analysis (auxiliary task) by sharing knowledge between them. We fine-tune the Bidirectional Encoder Representation using Transformer (BERT) encoder in a shared-private fashion to obtain the shared and task-specific features using the multi-task objective function. We evaluate our proposed framework using three benchmark datasets for the Hindi-English (Hinglish), Punjabi-English (Punglish) code-mixed and English sentiment tasks. Experiment results justify that our proposed multi-task framework improves the performance of our primary task in comparison to the state-of-art single-task systems.
... While Python is generally linked with sentiment analysis and text mining because of its large ecosystem of tools and frameworks, with pre-processing typically done using the natural language toolkit provided by Python (P. Gupta et al., 2021), using PHP in this context gives a unique and new perspective. The choice to investigate PHP as a tool for data mining, analysis, categorization, and sentiment analysis was motivated by various factors, including its untapped potential: Python has unquestionably established itself as a data science and analytics powerhouse, with packages as diverse as NLTK, Scikit-learn, and TensorFlow. ...
Article
Full-text available
The purpose of this research is to undertake a complete sentiment analysis of Twitter users' opinions and attitudes about internet services provided by Indosat, Ooredoo, and XL Axiata in Indonesia. This study applies the Naive Bayes Classifier algorithm to efficiently categorize attitudes into positive, negative, and neutral groups, using the Twitter API for data collecting and the PHP programming language for data processing. The results of sentiment analysis reveal striking trends: 56% of Indosat Ooredoo service-related tweets contain unfavorable attitudes, whereas 50% of XL Axiata service-related tweets have comparable negative sentiments. Significantly, the sentiment analysis system constructed using PHP has a remarkable accuracy rate of 78% when compared to manually classified findings. This study adds vital information by throwing light on customer sentiments towards cellular internet service providers. This study allows each provider to fine-tune and optimize their internet services based on a data-driven understanding of consumer sentiment by illuminating user opinions. In conclusion, this study fills a significant information gap by analyzing user attitude towards prominent cellular internet service providers. Its novel methodology, which incorporates PHP and the Naive Bayes Classifier algorithm, not only provides an effective way of sentiment analysis but also gives providers with practical information for improving service quality.
... With the annotated corpus, classification algorithms such as SVM, Decision tree classifier, Linear Regression, Random Forest, and LSTM were investigated, and LSTM and Random Forest proved to be more accurate. As mentioned, Gupta et al. [11] show that linear SVC with unigrams gives the highest accuracy, 84.4%, compared to other classifiers. Using three modalities (unigram, bigram, and trigram), Gulati et al. [12] give a comparative examination of machine-learning classifiers such as the Logistic Regression classifier, passive-aggressive classifier, and linear SVC, with between 97 and 99% accuracy. ...
Article
Full-text available
Coronavirus COVID-19 has been spreading like wildfire all over the world since the year 2019. Nowadays, everyone is interacting with social media on a regular basis. In this study, the primary objective is to examine how Twitter users feel about COVID-19’s social life and to conclude their opinions. The polarity of feelings is determined by machine-learning classifiers that use popular words such as Coronavirus and COVID-19 to identify them. In addition, viruses such as BA-4 and BA-5 types of omicrons are spreading widely all over the globe. In order to prevent themselves from these types of viruses, the public needs to know the exact sentiment of the current social life problem. Using the newly reported topics/themes/issues and the associated sentiments from various factors, the COVID-19 pandemic can be better understood. A large dataset of tweets conveying information regarding COVID-19 is analyzed in this article, in particular, the credibility of the information shared on Twitter. It was evaluated using unigrams, bigrams, and trigrams with different parameters such as f1 score, precision, recall and compared against accuracy using machine-learning techniques such as Support Vector Machine (SVM), Naive Bayes (NB), and Logistic Regression (LR). The model TDGA (Translator Data pre-processing Gram feature Algorithmic model) performs well in individuals’ assessments of COVID-19 and benchmark COVID datasets, with a maximum efficiency of 86%.
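The evaluation described in this abstract, n-gram features scored with precision, recall, and F1, can be illustrated with a minimal Python sketch. This is not the TDGA model or the authors' code: whitespace tokenization and the helper names `ngrams`, `ngram_features`, and `precision_recall_f1` are assumptions made only for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(text, max_n=3):
    """Count unigrams, bigrams, and trigrams of a text.

    Assumption: simple lowercase whitespace tokenization; a real
    pipeline would also strip URLs, mentions, and punctuation.
    """
    tokens = text.lower().split()
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return counts


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 for one class label `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

For example, `ngram_features("Stay home stay safe")` counts four unigrams, three bigrams, and two trigrams, and the resulting counts could be fed to any of the classifiers named in the abstract.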
Article
Full-text available
Background: The adoption of artificial intelligence (AI) in public administration (PA) has the potential to enhance transparency, efficiency, and responsiveness, ultimately creating greater public value. However, the integration of AI into PA faces challenges, including conceptual ambiguities and limited knowledge of the practical applications. This study addresses these gaps by offering an overview and categorization of AI research and applications in PA. Methods: Using a dataset of 3149 documents from the Scopus database, this study identifies the top 200 most-cited articles based on citation per year. It conducts descriptive and content analyses to identify the existing state, applications, and challenges regarding AI adoption. Additionally, selected AI use cases from the European Commission’s database are categorized, focusing on their contributions to public value. The analysis centers on three governance dimensions: internal processes, service delivery, and policymaking. Results: The findings provide a categorized understanding of AI concepts, types, and applications in PA, alongside a discussion of best practices and challenges. Conclusion: This study serves as a resource for researchers seeking a comprehensive overview of the current state of AI in PA and offers policymakers and practitioners insights into leveraging AI technologies to improve service delivery and operational efficiency.
Article
Anxiety disorder is a common mental disorder that has received increasing attention due to its high incidence, comorbidity, and recurrence. In recent years, with the rapid development of information technology, social media platforms have become a crucial source of data for studying anxiety disorders. Existing studies on anxiety disorders have focused on utilizing user-generated contents to study correlations with disorders or identify disorders. However, these studies overlook the emotional information in social media posts, restraining the effective capture of users’ emotions or mental states when posting. This article focuses on the sentiment polarity of anxiety-related posts on a Chinese social media and designs sentiment classification models via fuzing linguistic and semantic features of the posts. First, we extract the linguistic features from posts based on the simplified Chinese–Linguistic inquiry and word count (SC-LIWC) dictionary, and propose a novel recursive feature selection algorithm to reserve important linguistic features. Second, we propose a TextCNN-based model to study the deep semantic features of posts and fuze their linguistic features to obtain a better representation. Finally, to conduct anxiety analysis on Chinese social media, we construct a postlevel sentiment analysis dataset based on anxiety-related posts on Sina Weibo. The experimental results indicate that our proposed fusion models exhibit better performance in the task of identifying the sentiment polarity of anxiety-related posts on Chinese social media.
Article
An adverse event (AE) is any unexpected outcome associated with the use of a medical product in a patient. The tracking and recording of AE are important for reevaluation of safety and effectiveness of medical devices. The response to AE may include recall, which attempts to address the reported problem by repairing, re-labeling, etc. Spinal cord stimulator system (SCS) is an implantable device that sends low levels of electricity directly into the spinal cord to relieve pain; its indications include back pain, postsurgical pain, injuries to spinal cord, etc. As a type of active implantable device, SCS has a relatively high rate of AE. In this article, we analyzed 15 694 AE reports of SCS extracted from the Manufacturer and User Facility Device Experience (MAUDE) database of the United States Food and Drug Administration (FDA) between 1 January 2023 and 30 June 2023. In addition, we analyzed 65 recall reports of SCS extracted from the Recalls of Medical Devices database of FDA from 2005 to 2023 as a supplement. The text data in the AE dataset were preprocessed using the Natural Language Toolkit (NLTK) in Python. To predict the severity of AE, four classifiers as well as four vectorization methods were applied. The experiment achieved the highest accuracy of 90.6% with eXtreme Gradient Boosting Classifier (XGBC) and unigram. This study proposes an automatic approach to labeling the severity of AE data and concludes that SCS AE often occur when recharging the battery of SCS and when transmitting data via wireless communication route.
Article
Code-mixing refers to switching between two or more languages within the same utterance, which is very prevalent in multilingual societies. The amount of code-mixed content has increased due to the spike in multilingual users on review platforms. Analyzing these reviews can be beneficial for both consumers and service providers. Aspect category (AC) sentiment analysis (ACSA) provides a fine-grained analysis of reviews. ACSA identifies the AC and measures the sentiment expressed toward a given AC. The research in this direction has mostly focused on monolingual languages, which are insufficient for analyzing code-mixed reviews. To expedite research in this direction, we propose new tasks in the code-mixed language (ACSA-Mix). We develop a benchmark setup to create a code-mixed Hinglish (i.e., mixing of Hindi and English) dataset for ACSA-Mix, annotated with AC and sentiment values. To demonstrate the practical usage of the dataset, we solve ACSA-Mix tasks in the Seq2Seq framework, where natural language sentences are generated to represent the outputs that allow pretrained language models to be used effectively. Further, a multilingual multitask joint learning framework is proposed that transfers knowledge between Hinglish (ACSA-Mix) and English (ACSA) tasks. We consider ACSA-Mix tasks the primary tasks and enhance their performance by ACSA tasks (auxiliary) by sharing knowledge between them. We observe improvement over the single task ACSA-Mix models. The dataset has been made available at https://www.iitp.ac.in/ai-nlp-ml/resources.html and in the GitHub repository https://github.com/20118/ACSA-Mix
Chapter
The COVID-19 pandemic has brought about unprecedented changes in the world, including the imposition of lockdowns in many countries. In India, a nationwide lockdown was imposed in March 2020 to curb the spread of the virus. The lockdown significantly impacted people's lives, including their mental health and well-being. In this study, we analyzed Twitter data sentiment to understand the public's sentiment toward the lockdown in India. We collected tweets from April 20 to April 27, 2020, using relevant keywords and hashtags and preprocessed the data using natural language processing techniques. We then used machine learning algorithms to classify the tweets as positive, negative, or neutral based on their sentiment. Our results show that the sentiment toward the lockdown in India was predominantly positive, with people expressing their support for the lockdown. We further used some good and bad words to classify the comments. Our study provides insights into the public's perception of the lockdown in India and highlights the need for effective communication and support to address the negative impacts of such measures on people's mental health and well-being.
Chapter
With the massive increase in social media data and the hype around Natural Language Processing, opinion mining has become one of the most popular ways to analyze people’s views on a specific topic. Using hashtags, one can obtain tweet data in millions and analyze sentiments. This can be done effectively using Python with its NLP modules available. Studying the attitudes and sentiments of Indian citizens towards the current unemployment rate is the primary purpose of this study. In situations where there may be negative consequences due to people’s aggression, analyzing such content to gauge people’s sentiments can be extremely valuable in managing the situation. Natural Language Processing and Machine Learning classifiers are used in this research to perform opinion mining of the tweets posted by Indians. About 10,928 tweets have been accumulated, on which sentiment analysis has been performed, classifying tweets into three categories: positive, negative or neutral. ‘Tweepy API’ has been used, along with the hashtags ‘UnemploymentInIndia' and ‘Unemployment’. The data has been cleaned and preprocessed using NLTK, VADER and other modules available in Python. Study findings suggest that most Indian citizens oppose the unemployment rates in their country, but a minority look to political movements to bring about change.
Article
Full-text available
Sentiment analysis of people’s opinions finds widespread application in numerous business and decision-making situations. Despite social media’s informal nature for sharing viewpoints, it has now become a prevalent tool in various business and decision-making contexts. Sentiment analysis, also known as opinion mining, encompasses the utilization of techniques such as text analysis, computational linguistics, natural language processing, and even biometrics to analyze and interpret sentiments, opinions, and emotions expressed within textual data. Subsequently, we conducted an in-depth analysis of this data using a diverse range of machine learning algorithms, including Naive Bayes, Support Vector Machine (SVM), and decision trees. We are proposing a new machine learning model, called a hybrid feature selection-based model, for identifying the sentiment polarity of statements. This model is different from other machine learning models because of its unique architecture. It uses 16 different linguistic features in its hidden layers. To validate and fine-tune the behavior of this algorithm, we employed the Analytic Hierarchy Process method. This technique has been utilized in a wide variety of research projects, and it has also been combined with other methodologies to solve decision-making challenges. By integrating various linguistic features, our model demonstrates enhanced performance compared to other state-of-the-art models, resulting in an improvement of up to 2.8% in Accuracy and 3% in terms of the weighted F1-score.
Article
The COVID-19 pandemic has caused anxiety and fear worldwide, affecting people's physical and mental health. This research work proposes a sentiment analysis approach to better understand the public's perception of COVID-19 in India. Two datasets are created by collecting tweets regarding COVID-19 in India. Pre-processing and analysis of datasets are performed by using natural language processing (NLP) techniques. Various features are extracted from collected tweets using three word embeddings: GloVe, fastText, and ELMo. The optimal features are selected by the cuckoo search optimization algorithm. Finally, the proposed hybrid model of Gated Recurrent Unit (GRU) and Bidirectional Long Short-Term Memory (BiLSTM) is used to categorize the tweets into three sentiment categories. The proposed model achieved 94.44% accuracy, 90.34% precision, 88.53% sensitivity, and 89.53% F1 score. It significantly improved over previous approaches, which achieved 80% accuracy.
Chapter
Identifying and extracting subjective information from text data through the use of machine learning and natural language processing methods is the process of sentiment analysis. It is a useful tool for understanding the opinions and attitudes of individuals or groups towards a particular topic or event. In this study, we used TextBlob and Vader to analyze the sentiment of comments posted in the YouTube comment section of Indian news channels during India’s matches in the 2022 T20 World Cup, a highly anticipated event for cricket fans in India. TextBlob and Vader were used to categorize a sample of remarks from the YouTube comment sections of several Indian news channels as positive, negative, or neutral. We then analyzed the results to understand the overall sentiment of the comments and to identify any trends or patterns in the data. Our analysis provided insights into the public perception of Indian news channels on YouTube and the opinions and feelings of their viewers during the T20 World Cup. It also demonstrates the potential of sentiment analysis to understand the public’s reaction to sporting events and to gauge the sentiment of online conversations. Keywords: Sentiment analysis, Natural language processing, YouTube comment section, T20 World Cup, TextBlob, VADER
Preprint
Full-text available
The COVID-19 Infodemic had an unprecedented impact on health behaviors and outcomes at a global scale. While many studies have focused on a qualitative and quantitative understanding of misinformation, including sentiment analysis, there is a gap in understanding the emotion-carriers of misinformation and their differences across geographies. In this study, we characterized emotion carriers and their impact on vaccination rates in India and the United States. A manually labelled dataset was created from 2.3 million tweets and collated with three publicly available datasets (CoAID, AntiVax, CMU) to train deep learning models for misinformation classification. Misinformation labelled tweets were further analyzed for behavioral aspects by leveraging Plutchik Transformers to determine the emotion for each tweet. Time series analysis was conducted to study the impact of misinformation on spatial and temporal characteristics. Further, categorical classification was performed using transformer models to assign categories for the misinformation tweets. Word2Vec+BiLSTM was the best model for misinformation classification, with an F1-score of 0.92. The US had the highest proportion of misinformation tweets (58.02%), followed by the UK (10.38%) and India (7.33%). Disgust, anticipation, and anger were associated with an increased prevalence of misinformation tweets. Disgust was the predominant emotion associated with misinformation tweets in the US, while anticipation was the predominant emotion in India. For India, the misinformation rate exhibited a lead relationship with vaccination, while in the US it lagged behind vaccination. Our study deciphered that emotions acted as differential carriers of misinformation across geography and time. These carriers can be monitored to develop strategic interventions for countering misinformation, leading to improved public health.
Article
The COVID-19 epidemic is one of the worst disasters to have affected people worldwide. It has impacted the whole of civilization physically, monetarily, and emotionally. Sentiment analysis is an important step in handling a pandemic effectively. In this work, a systematic literature review of sentiment analysis of the Indian population towards COVID-19 and its vaccination is presented. Recent existing works are considered from four primary databases: ACM, Web of Science, IEEE Xplore, and Scopus. A total of 40 publications from January 2020 to August 2022 are selected for systematic review after applying an inclusion and exclusion algorithm. Existing works are analyzed in terms of the various challenges their authors encountered with the collected datasets. The analysis finds that mainly three techniques, namely lexical, machine learning, and deep learning, are used by various authors for sentiment analysis. The performance of the various applied techniques is comparatively analyzed. Directions for future research, with recommendations, are highlighted.
Article
Full-text available
The COVID-19 pandemic has revealed the power of internet disinformation in influencing global health. The deluge of information travels faster than the epidemic itself and is a threat to the health of millions across the globe. Health apps need to leverage machine learning for delivering the right information while constantly learning misinformation trends, and deliver this information effectively in vernacular languages in order to combat the infodemic at the grassroots level in the general public. Our application, WashKaro, is a multi-pronged intervention that uses conversational Artificial Intelligence (AI), machine translation, and natural language processing (NLP) to combat misinformation. WashKaro uses AI to provide accurate information matched against WHO recommendations and delivered in an understandable format in local languages. The primary aim of this study was to assess the use of neural models for text summarization and machine learning for delivering WHO matched COVID-19 information to mitigate the misinfodemic. The secondary aim of this study was to develop a symptom assessment tool and segmentation insights for improving the delivery of information. A total of 5026 people downloaded the app during the study window; among those, 1545 were actively engaged users. Our study shows that 3.4 times more females engaged with the App in Hindi as compared to males, the relevance of AI-filtered news content doubled within 45 days of continuous machine learning, and the prudence of integrated AI chatbot “Satya” increased, thus proving the usefulness of a mHealth platform to mitigate health misinformation. We conclude that a machine learning application delivering bite-sized vernacular audios and conversational AI is a practical approach to mitigate health misinformation.
Article
Full-text available
(Abstracted from Lancet 2020;395:689–697) Wuhan, China, has been investigating the outbreak of the 2019 novel coronavirus (2019-nCoV, COVID-19), now called severe acute respiratory syndrome coronavirus 2 (SARS-CoV2), since December 31, 2019. At the end of January 2020, nearly 6000 cases of infections by SARS-CoV2 have been confirmed in mainland China, and 78 cases had been identified as having been exported from Wuhan to areas outside mainland China.
Article
Full-text available
Background Twitter has been used to track trends and disseminate health information during viral epidemics. On January 21, 2020, the CDC activated its Emergency Operations Center and the WHO released its first situation report about Coronavirus Disease 2019 (COVID-19), sparking significant media attention. How Twitter content and sentiment evolved in the early stages of the COVID-19 pandemic has not been described. Methods We extracted tweets matching hashtags related to COVID-19 from January 14th to 28th, 2020 using Twitter’s application programming interface. We measured themes and frequency of keywords related to infection prevention practices. We performed a sentiment analysis to identify the sentiment polarity and predominant emotions in tweets and conducted topic modeling to identify and explore discussion topics over time. We compared sentiment, emotion, and topics among the most popular tweets, defined by the number of retweets. Results We evaluated 126,049 tweets from 53,196 unique users. The hourly number of COVID-19 related tweets starkly increased from January 21, 2020 onward. Nearly half (49.5%) of all tweets expressed fear and nearly 30% expressed surprise. In the full cohort, the economic and political impact of COVID-19 was the most commonly discussed topic. When focusing on the most retweeted tweets, the incidence of fear decreased and topics focused on quarantine efforts, the outbreak and its transmission, as well as prevention. Conclusion Twitter is a rich medium that can be leveraged to understand public sentiment in real-time and potentially target individualized public health messages based on user interest and emotion.
Article
Full-text available
Along with the Coronavirus pandemic, another crisis has manifested itself in the form of mass fear and panic phenomena, fueled by incomplete and often inaccurate information. There is therefore a tremendous need to address and better understand COVID-19’s informational crisis and gauge public sentiment, so that appropriate messaging and policy decisions can be implemented. In this research article, we identify public sentiment associated with the pandemic using Coronavirus specific Tweets and R statistical software, along with its sentiment analysis packages. We demonstrate insights into the progress of fear-sentiment over time as COVID-19 approached peak levels in the United States, using descriptive textual analytics supported by necessary textual data visualizations. Furthermore, we provide a methodological overview of two essential machine learning (ML) classification methods, in the context of textual analytics, and compare their effectiveness in classifying Coronavirus Tweets of varying lengths. We observe a strong classification accuracy of 91% for short Tweets, with the Naïve Bayes method. We also observe that the logistic regression classification method provides a reasonable accuracy of 74% with shorter Tweets, and both methods showed relatively weaker performance for longer Tweets. This research provides insights into Coronavirus fear sentiment progression, and outlines associated methods, implications, limitations and opportunities.
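The Naïve Bayes method mentioned in this abstract can be illustrated with a minimal multinomial classifier. The authors worked in R; this Python sketch is not their implementation, and the class name `NaiveBayes`, the whitespace tokenization, the add-one (Laplace) smoothing, and the toy tweets in the usage note are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over unigram counts (illustrative sketch)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # additive smoothing constant (assumed value)

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)     # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = text.lower().split()
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = [t for t in text.lower().split() if t in self.vocab]
        n_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_label in self.class_counts.items():
            # log prior plus the sum of smoothed log likelihoods
            score = math.log(n_label / n_docs)
            total = sum(self.word_counts[label].values())
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log(
                    (count + self.alpha) / (total + self.alpha * vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

As a usage example on made-up data, fitting on `["scared of the virus", "panic and fear everywhere", "hopeful about recovery", "grateful for the doctors"]` with labels `["fear", "fear", "calm", "calm"]` and then calling `predict("fear and panic")` returns `"fear"`. Short texts give the classifier few tokens to multiply over, which is one reason results can differ between short and long Tweets.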
Article
Full-text available
Corona Virus Infectious Disease (COVID-19) is an infectious disease. The COVID-19 disease came to earth in early 2019. It is expanding exponentially throughout the world and has affected an enormous number of human beings starting from the last month. On March 11, 2020, the World Health Organization (WHO) declared that COVID-19 was characterized as a “Pandemic”. This paper proposes an approach for confirmation of COVID-19 cases after the diagnosis of doctors. The objective of this study is to use a machine learning method to evaluate how close the predicted results are to the original data related to Confirmed-Negative-Released-Death cases of COVID-19. For this purpose, a verification method is proposed in this paper that uses the concept of a Deep-learning Neural Network. In this framework, Long short-term memory (LSTM) and Gated Recurrent Unit (GRU) are also assimilated for training the dataset. The results obtained from the proposed method have an accuracy of 87 % for the “confirmed Cases”, 67.8 % for “Negative Cases”, 62% for “Deceased Case” and 40.5 % for “Released Case”. The outbreak of Coronavirus has the nature of exponential growth, so it is difficult to control with limited clinical personnel handling a huge number of patients within a reasonable time. It is therefore necessary to build an automated model, based on a machine learning approach, for corrective measures after the decision of clinical doctors.
Preprint
Full-text available
Background: Twitter has been used to track trends and disseminate health information during viral epidemics. On January 21, 2020, the CDC activated its Emergency Operations Center and the WHO released its first situation report about coronavirus disease 2019 (COVID-19), sparking significant media attention. How Twitter content and sentiment has evolved in the early stages of any outbreak, including the COVID-19 epidemic, has not been described. Objective: To quantify and understand early changes in Twitter activity, content, and sentiment about the COVID-19 epidemic. Design: Observational study. Setting: Twitter platform. Participants: All Twitter users who created or sent a message from January 14th to 28th, 2020. Measurements: We extracted tweets matching hashtags related to COVID-19 and measured frequency of keywords related to infection prevention practices, vaccination, and racial prejudice. We performed a sentiment analysis to identify emotional valence and predominant emotions. We conducted topic modeling to identify and explore discussion topics over time. Results: We evaluated 126,049 tweets from 53,196 unique users. The hourly number of COVID-19-related tweets starkly increased from January 21, 2020 onward. Nearly half (49.5%) of all tweets expressed fear and nearly 30% expressed surprise. The frequency of racially charged tweets closely paralleled the number of newly diagnosed cases of COVID-19. The economic and political impact of the COVID-19 was the most commonly discussed topic, while public health risk and prevention were among the least discussed. Conclusion: Tweets with negative sentiment and emotion parallel the incidence of cases for the COVID-19 outbreak. Twitter is a rich medium that can be leveraged to understand public sentiment in real-time and target public health messages based on user interest and emotion.
Preprint
Full-text available
Background: Countries around the world are facing extraordinary challenges in implementing various measures to slow down the spread of the novel coronavirus (COVID-19). Guided by international recommendations, Saudi Arabia has implemented a series of infection control measures after the detection of the first confirmed case in the country. However, in order for these measures to be effective, public attitudes and compliance must be conducive as perceived risk is strongly associated with health behaviors. The primary objective of this study is to assess Saudis’ attitudes towards COVID-19 preventive measures to guide future health communication content. Methods: Naïve Bayes machine learning model was used to run Arabic sentiment analysis of Twitter posts through the Natural Language Toolkit (NLTK) library in Python. Tweets containing hashtags pertaining to seven public health measures imposed by the government were collected and analyzed. Results: A total of 53,127 tweets were analyzed. All measures, except one, showed more positive tweets than negative. Measures that pertain to religious practices showed the most positive sentiment. Discussion: Saudi Twitter users showed support and positive attitudes towards the infection control measures to combat COVID-19. It is postulated that this conducive public response is reflective of the overarching, longstanding popular confidence in the government. Religious notions may also play a positive role in preparing believers at times of crises. Findings of this study broadened our understanding to develop proper public health messages and promote stronger compliance with control measures to control COVID-19.
Article
Full-text available
COVID-19 (Corona Virus Disease 2019) has led to a large number of psychological consequences. The aim of this study is to explore the impacts of COVID-19 on people’s mental health, to assist policy makers in developing actionable policies, and to help clinical practitioners (e.g., social workers, psychiatrists, and psychologists) provide timely services to affected populations. We sampled and analyzed the Weibo posts of 17,865 active Weibo users using the approach of Online Ecological Recognition (OER) based on several machine-learning predictive models. We calculated word frequency, scores of emotional indicators (e.g., anxiety, depression, indignation, and Oxford happiness) and cognitive indicators (e.g., social risk judgment and life satisfaction) from the collected data. Sentiment analysis and a paired sample t-test were performed to examine the differences in the same group before and after the declaration of COVID-19 on 20 January, 2020. The results showed that negative emotions (e.g., anxiety, depression and indignation) and sensitivity to social risks increased, while the scores of positive emotions (e.g., Oxford happiness) and life satisfaction decreased. People were more concerned about their health and family, and less about leisure and friends. The results help fill knowledge gaps about short-term individual changes in psychological conditions after the outbreak. They may provide references for policy makers to plan and fight against COVID-19 effectively by improving stability of popular feelings, and for urgently preparing clinical practitioners to deliver corresponding therapy foundations for the risk groups and affected people.
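The paired sample t-test mentioned above compares each user's score before and after an event. A minimal sketch of the test statistic follows, using only the Python standard library; the function name `paired_t_statistic` and the example scores are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for paired before/after measurements."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [a - b for b, a in zip(before, after)]
    spread = stdev(diffs)  # sample standard deviation of the differences
    if spread == 0:
        raise ValueError("all differences are identical; t is undefined")
    n = len(diffs)
    return mean(diffs) / (spread / math.sqrt(n)), n - 1


# Hypothetical anxiety scores for four users before and after the declaration.
t_value, dof = paired_t_statistic([1, 2, 3, 4], [2, 4, 5, 5])
print(round(t_value, 3), dof)
```

The p-value would then come from a t distribution with `dof` degrees of freedom, which is left out here to keep the sketch dependency-free.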
Article
Full-text available
With the rapid development of the Internet industry, sentiment analysis has grown into one of the popular areas of natural language processing (NLP). Through it, the implicit emotion in text can be effectively mined, which can help enterprises or organizations make effective decisions, and the explosive growth of data undoubtedly brings more opportunities and challenges to sentiment analysis. At the same time, transfer learning has emerged as a new machine learning technique that uses existing knowledge to solve problems in different domains and produces state-of-the-art prediction results. Many scholars apply transfer learning to the field of sentiment analysis. This survey summarizes the relevant research results of sentiment analysis in recent years, focuses on the algorithms and applications of transfer learning in sentiment analysis, and looks ahead to the development trends of sentiment analysis.
Conference Paper
Full-text available
Emotions and sentiment of software developers can largely influence software productivity and quality. However, existing work on emotion mining and sentiment analysis in software engineering is still at an early stage in terms of accuracy, the size of the datasets used and the specificity of the analysis. In this work, we are concerned with conducting entity-level sentiment analysis. We first build a manually labeled dataset containing 3,000 issue comments selected from 231,732 issue comments collected from 10 open source projects on GitHub. Then we design and develop SentiSW, an entity-level sentiment analysis tool consisting of sentiment classification and entity recognition, which can classify issue comments into <sentiment, entity> tuples. We evaluate the sentiment classification using ten-fold cross validation, and it achieves 68.71% mean precision, 63.98% mean recall and 77.19% accuracy, which is significantly higher than existing tools. We evaluate the entity recognition by manual annotation, and it achieves 75.15% accuracy.
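The evaluation protocol named above (ten-fold cross validation with precision, recall and accuracy) can be sketched as follows. This is an illustrative outline rather than the SentiSW evaluation code: the helper names `k_fold_indices` and `precision_recall_accuracy` and the toy labels are assumptions, and the folds are taken contiguously without shuffling.

```python
def k_fold_indices(n_items, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_items))
        yield train_idx, test_idx
        start += size


def precision_recall_accuracy(y_true, y_pred, positive_label):
    """Precision and recall for one label, plus overall accuracy."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive_label and p == positive_label for t, p in pairs)
    fp = sum(t != positive_label and p == positive_label for t, p in pairs)
    fn = sum(t == positive_label and p != positive_label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return precision, recall, accuracy


# Hypothetical gold labels and predictions for one test fold.
gold = ["neg", "neg", "pos", "pos", "neu"]
pred = ["neg", "pos", "pos", "neg", "neu"]
print(precision_recall_accuracy(gold, pred, positive_label="neg"))

# A dataset of 3,000 comments split into ten folds gives 300 test items per fold.
print([len(test) for _, test in k_fold_indices(3000, 10)])
```

In a full run, a classifier would be trained on each `train_idx` set, scored on the matching `test_idx` set, and the per-fold metrics averaged to obtain mean precision and mean recall.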
Article
Full-text available
The Internet provides the opportunity for investors to post online opinions that they share with fellow investors. Sentiment analysis of online opinion posts can facilitate both investors' investment decision making and stock companies' risk perception. This paper develops a novel sentiment ontology to conduct context-sensitive sentiment analysis of online opinion posts in stock markets. The methodology integrates popular sentiment analysis into machine learning approaches based on support vector machine and generalized autoregressive conditional heteroskedasticity modeling. A typical financial website called Sina Finance has been selected as an experimental platform where a corpus of financial review data was collected. Empirical results suggest solid correlations between stock price volatility trends and stock forum sentiment. Computational results show that the statistical machine learning approach has a higher classification accuracy than that of the semantic approach. Results also imply that investor sentiment has a particularly strong effect for value stocks relative to growth stocks.
Article
Full-text available
Little is currently known about the factors that promote the propagation of information in online social networks following terrorist events. In this paper we took the case of the terrorist event in Woolwich, London in 2013 and built models to predict information flow size and survival using data derived from the popular social networking site Twitter. We define information flows as the propagation over time of information posted to Twitter via the action of retweeting. Following a comparison with different predictive methods, and due to the distribution exhibited by our dependent size measure, we used the zero-truncated negative binomial (ZTNB) regression method. To model survival, the Cox regression technique was used because it estimates proportional hazard rates for independent measures. Following a principal component analysis to reduce the dimensionality of the data, social, temporal and content factors of the tweet were used as predictors in both models. Given the likely emotive reaction caused by the event, we emphasize the influence of emotive content on propagation in the discussion section. From a sample of Twitter data collected following the event (N = 427,330) we report novel findings that identify that the sentiment expressed in the tweet is statistically significantly predictive of both size and survival of information flows of this nature. Furthermore, the number of offline press reports relating to the event published on the day the tweet was posted was a significant predictor of size, as was the tension expressed in a tweet in relation to survival. Furthermore, time lags between retweets and the co-occurrence of URLs and hashtags also emerged as significant.
Article
Introduction: Coronavirus disease 2019 (COVID-19) is an infectious disease that emerged in late 2019. It is expanding exponentially throughout the world and has affected an enormous number of people starting from last month. On March 11, 2020, the World Health Organization (WHO) characterized COVID-19 as a pandemic. This paper proposes an approach for confirming COVID-19 cases after diagnosis by doctors. The objective of this study is to use a machine learning method to evaluate how close the predicted results are to the original data on Confirmed-Negative-Released-Death cases of COVID-19. Materials and methods: For this purpose, a verification method is proposed that uses a deep-learning neural network. In this framework, long short-term memory (LSTM) and gated recurrent unit (GRU) layers are also incorporated for training on the dataset. The prediction results tally with the results predicted by clinical doctors. Results: The proposed method achieves an accuracy of 87% for “Confirmed Cases”, 67.8% for “Negative Cases”, 62% for “Deceased Cases” and 40.5% for “Released Cases”. Another important parameter, the RMSE, is 30.15% for Confirmed Cases, 49.4% for Negative Cases, 4.16% for Deceased Cases and 13.72% for Released Cases. Conclusions: The coronavirus outbreak grows exponentially, so it is difficult to control when limited clinical staff must handle a huge number of patients within a reasonable time. It is therefore necessary to build an automated model, based on a machine learning approach, for corrective measures after the decision of clinical doctors.
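The RMSE reported above measures how far predictions fall from observed values. A minimal sketch of the plain root-mean-square error follows; the abstract does not say how its percentage figures were normalized, so this example makes no attempt to reproduce them, and the function name `rmse` and the case counts are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between two equally long numeric sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical daily confirmed-case counts and model predictions.
print(rmse([100, 120, 150], [100, 120, 156]))
```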
Article
The COVID-19 outbreak has focused attention on the use of social distancing as the primary defence against community infection. Forcing social animals to maintain physical distance has presented significant challenges for health authorities and law enforcement. Anecdotal media reports suggest widespread dissatisfaction with social distancing as a policy, yet there is little prior work aimed at measuring community acceptance of social distancing. In this paper, we propose a new approach to measuring attitudes towards social distancing by using social media and sentiment analysis. Over a four-month period, we found that 82.5 percent of tweets were in favour of social distancing. The results indicate a widespread acceptance of social distancing in a selected community. We examine options for estimating the optimal (minimal) social distance required at scale, and the implications for securing widespread community support and for appropriate crisis management during emergency health events.
Article
Background: The recent coronavirus disease (COVID-19) pandemic is taking a toll on the world’s health care infrastructure as well as the social, economic, and psychological well-being of humanity. Individuals, organizations, and governments are using social media to communicate with each other on a number of issues relating to the COVID-19 pandemic. Not much is known about the topics being shared on social media platforms relating to COVID-19. Analyzing such information can help policy makers and health care organizations assess the needs of their stakeholders and address them appropriately. Objective: This study aims to identify the main topics posted by Twitter users related to the COVID-19 pandemic. Methods: Leveraging a set of tools (Twitter’s search application programming interface (API), Tweepy Python library, and PostgreSQL database) and using a set of predefined search terms (“corona,” “2019-nCov,” and “COVID-19”), we extracted the text and metadata (number of likes and retweets, and user profile information including the number of followers) of public English language tweets from February 2, 2020, to March 15, 2020. We analyzed the collected tweets using word frequencies of single words (unigrams) and double words (bigrams). We leveraged latent Dirichlet allocation for topic modeling to identify topics discussed in the tweets. We also performed sentiment analysis and extracted the mean number of retweets, likes, and followers for each topic and calculated the interaction rate per topic. Results: Out of approximately 2.8 million tweets included, 167,073 unique tweets from 160,829 unique users met the inclusion criteria. Our analysis identified 12 topics, which were grouped into four main themes: origin of the virus; its sources; its impact on people, countries, and the economy; and ways of mitigating the risk of infection. The mean sentiment was positive for 10 topics and negative for 2 topics (deaths caused by COVID-19 and increased racism). 
The mean number of account followers per tweet topic ranged from 2722 (increased racism) to 13,413 (economic losses). The highest mean number of likes for the tweets was 15.4 (economic loss), while the lowest was 3.94 (travel bans and warnings). Conclusions: Public health crisis response activities on the ground and online are becoming increasingly simultaneous and intertwined. Social media provides an opportunity to directly communicate health information to the public. Health systems should work on building national and international disease detection and surveillance systems through monitoring social media. There is also a need for a more proactive and agile public health presence on social media to combat the spread of fake news.
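The unigram and bigram frequency analysis described in this abstract can be sketched with the Python standard library. The tokenizer below is a simplifying assumption (lowercase alphanumeric runs, keeping internal hyphens so that "COVID-19" stays one token), and the sample tweets and the names `tokenize` and `ngram_counts` are invented for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens, including inner hyphens."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def ngram_counts(texts, n):
    """Count n-grams (as tuples of tokens) across a collection of texts."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        # zip over n shifted copies of the token list to form sliding windows
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts


# Hypothetical tweets; real input would come from the collected dataset.
tweets = ["Travel ban announced over COVID-19", "travel ban extended"]
print(ngram_counts(tweets, 1).most_common(3))
print(ngram_counts(tweets, 2).most_common(3))
```

Stop-word removal and the latent Dirichlet allocation step are omitted; they would normally come from a dedicated library rather than hand-written code.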
Article
During the ongoing outbreak of coronavirus disease (COVID-19), people use social media to acquire and exchange various types of information at a historic and unprecedented scale. Only situational information is valuable for the public and authorities in responding to the epidemic. Therefore, it is important to identify such situational information and to understand how it is being propagated on social media, so that appropriate information publishing strategies can be informed for the COVID-19 epidemic. This article sought to fill this gap by harnessing Weibo data and natural language processing techniques to classify COVID-19-related information into seven types of situational information. We found specific features that predict the number of reposts for each type of information. The results provide data-driven insights into information needs and public attention.
Article
Background: Since Dec 31, 2019, the Chinese city of Wuhan has reported an outbreak of atypical pneumonia caused by the 2019 novel coronavirus (2019-nCoV). Cases have been exported to other Chinese cities, as well as internationally, threatening to trigger a global outbreak. Here, we provide an estimate of the size of the epidemic in Wuhan on the basis of the number of cases exported from Wuhan to cities outside mainland China and forecast the extent of the domestic and global public health risks of epidemics, accounting for social and non-pharmaceutical prevention interventions. Methods: We used data from Dec 31, 2019, to Jan 28, 2020, on the number of cases exported from Wuhan internationally (known days of symptom onset from Dec 25, 2019, to Jan 19, 2020) to infer the number of infections in Wuhan from Dec 1, 2019, to Jan 25, 2020. Cases exported domestically were then estimated. We forecasted the national and global spread of 2019-nCoV, accounting for the effect of the metropolitan-wide quarantine of Wuhan and surrounding cities, which began Jan 23-24, 2020. We used data on monthly flight bookings from the Official Aviation Guide and data on human mobility across more than 300 prefecture-level cities in mainland China from the Tencent database. Data on confirmed cases were obtained from the reports published by the Chinese Center for Disease Control and Prevention. Serial interval estimates were based on previous studies of severe acute respiratory syndrome coronavirus (SARS-CoV). A susceptible-exposed-infectious-recovered metapopulation model was used to simulate the epidemics across all major cities in China. The basic reproductive number was estimated using Markov Chain Monte Carlo methods and presented using the resulting posterior mean and 95% credible interval (CrI). 
Findings: In our baseline scenario, we estimated that the basic reproductive number for 2019-nCoV was 2·68 (95% CrI 2·47-2·86) and that 75 815 individuals (95% CrI 37 304-130 330) have been infected in Wuhan as of Jan 25, 2020. The epidemic doubling time was 6·4 days (95% CrI 5·8-7·1). We estimated that in the baseline scenario, Chongqing, Beijing, Shanghai, Guangzhou, and Shenzhen had imported 461 (95% CrI 227-805), 113 (57-193), 98 (49-168), 111 (56-191), and 80 (40-139) infections from Wuhan, respectively. If the transmissibility of 2019-nCoV were similar everywhere domestically and over time, we inferred that epidemics are already growing exponentially in multiple major cities of China with a lag time behind the Wuhan outbreak of about 1-2 weeks. Interpretation: Given that 2019-nCoV is no longer contained within Wuhan, other major Chinese cities are probably sustaining localised outbreaks. Large cities overseas with close transport links to China could also become outbreak epicentres, unless substantial public health interventions at both the population and personal levels are implemented immediately. Independent self-sustaining outbreaks in major cities globally could become inevitable because of substantial exportation of presymptomatic cases and in the absence of large-scale public health interventions. Preparedness plans and mitigation interventions should be readied for quick deployment globally. Funding: Health and Medical Research Fund (Hong Kong, China).
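The susceptible-exposed-infectious-recovered (SEIR) structure named in this abstract can be illustrated with a much simpler single-population sketch. It is not the authors' metapopulation model: there is no travel between cities and no Markov Chain Monte Carlo fitting, the integration uses fixed Euler steps, and every parameter value below is a hypothetical placeholder. The only figure taken from the abstract is the 6.4-day doubling time, used to show the standard relation between doubling time and exponential growth rate.

```python
import math


def simulate_seir(population, initial_infectious, beta, sigma, gamma, days, dt=0.1):
    """Integrate a single-population SEIR model with fixed Euler steps.

    beta: transmission rate, sigma: 1 / latent period, gamma: 1 / infectious period.
    Returns a list of (time, S, E, I, R) tuples.
    """
    s = float(population - initial_infectious)
    e, i, r = 0.0, float(initial_infectious), 0.0
    history = [(0.0, s, e, i, r)]
    for step in range(1, int(round(days / dt)) + 1):
        new_exposed = beta * s * i / population * dt
        new_infectious = sigma * e * dt
        new_recovered = gamma * i * dt
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        r += new_recovered
        history.append((step * dt, s, e, i, r))
    return history


def growth_rate_from_doubling_time(doubling_days):
    """Exponential growth rate r such that cases double every doubling_days."""
    return math.log(2) / doubling_days


# Hypothetical parameters chosen only to make the sketch run.
trajectory = simulate_seir(population=1_000_000, initial_infectious=10,
                           beta=0.5, sigma=0.2, gamma=0.2, days=60)
print(trajectory[-1])
print(growth_rate_from_doubling_time(6.4))
```

In this simplified model the basic reproductive number is `beta / gamma`; the study instead estimated it from exported-case data.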