Conference Paper

Data mining of public opinion: An overview


Abstract

The United Nations recently published the “E-government survey 2020” with the main aim of assessing the e-government development status of all United Nations member states. The survey identifies 14 leading countries in e-government development (out of 193 member states), some of which claim to use technologies such as artificial intelligence (AI), big data and blockchain. Moreover, with the outbreak of the COVID-19 pandemic, the development and implementation of e-government services has become an even more pressing topic. However, alongside research on the digitalization of public services, it is important to develop tools that measure how users perceive these rapid changes. Consequently, this paper examines the most recent research devoted to public opinion data mining. On the basis of an extensive literature review, we outline the latest developments and trends in the field of public opinion data mining, with a special focus on sentiment analysis. Our main goal is to provide a self-contained, comprehensive summary that might be used as a basis for the design and development of AI systems aimed at mining public opinion.


... To the best of the authors' knowledge, this is the first study aimed at applying a combination of transformer-based language models for mining citizens' interests, attitudes and emotions towards e-government development and e-services provision in Bulgaria. This finding is also supported by previous research in the field [12,13]. ...
... Sentiment analysis has numerous applications in various domains where people's opinions on various subjects play a crucial role in decision-making and management. The increased awareness of policy-makers about the importance of interaction with the public in social networks has recently led to applications of sentiment analysis in the government domain too [3,13]. Sentiment analysis can be divided into two general categories, namely opinion mining and emotion mining [14]. ...
... Our methodology might also be applied in studying the public opinion on other digital services, important issues or emerging problems in the government domain. Furthermore, our approach utilizes data from different social networks used in the country, unlike other studies in the field that use mainly Twitter data for studying public opinion [13]. ...
Chapter
We live in an era of digital revolution not only in the industry, but also in the public sector. User opinion is key in e-services development. Currently the most established approaches for analyzing citizens’ opinions are surveys and personal interviews. However, governments should focus not only on developing public e-services but also on implementing modern solutions for data analysis based on machine learning and artificial intelligence. The main aim of the current study is to engage state-of-the-art natural language processing technologies to develop an analytical approach for public opinion analysis. We utilize transformer-based language models to derive valuable insights into citizens’ interests and expressed sentiments and emotions towards digitalization of educational, administrative and health public services. Our research brings empirical evidence on the practical usefulness of such methods in the government domain.
... Satish and Yusof (2017) found that CVA strategies can increase customer satisfaction and loyalty, which improves customer performance for businesses. Studies similar to this one carried out by Xiang et al. (2015), Kitsios et al. (2021), and Hristova et al. (2022) demonstrate how the application and acceptance of CVA will benefit businesses, particularly SMEs, by enhancing customer satisfaction and performance. Based on the discussion that came before, the study's hypothesis was stated as: ...
... Utilization of sentiment lexicons and/or 2. Manual annotation. This finding is supported by a more extensive literature review carried out in (6). Sentiment lexicons are essentially dictionaries consisting of words with associated labels/scores specifying their sentiment. ...
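The idea of a sentiment lexicon described in the snippet above can be illustrated with a minimal sketch. The lexicon entries, their scores, the negator list and the function names below are hypothetical and chosen only for illustration; they are not taken from any of the cited studies.

```python
import re

# Hypothetical toy lexicon: word -> sentiment score (illustrative values only).
LEXICON = {"good": 1.0, "excellent": 2.0, "slow": -1.0, "useless": -2.0}
# Hypothetical negators; a lexicon word directly after one has its sign flipped.
NEGATORS = {"not", "never", "no"}


def lexicon_score(text, lexicon=LEXICON):
    """Sum the scores of all lexicon words found in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0.0
    for i, token in enumerate(tokens):
        if token in lexicon:
            value = lexicon[token]
            # Very simple negation handling: look one token back.
            if i > 0 and tokens[i - 1] in NEGATORS:
                value = -value
            score += value
    return score


def label(score):
    """Map a numeric score to a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

For example, `label(lexicon_score("The portal is not good"))` gives `"negative"`, because the only lexicon word is preceded by a negator. Real lexicon-based systems typically use much larger dictionaries and richer rules.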
Conference Paper
Digitalization affects all fields of the modern world, including the economic, political, and social aspects of our life. Governments are also involved in this process. Considering the rapid transition towards digital e-services that we have witnessed since the COVID-19 outbreak, the detection and analysis of public opinion on e-government services becomes even more important. The sentiments and opinions of users on digital public services might be used to foster improvements in the development and implementation of e-services. The main goal of our research is to use advances in NLP to examine how opinions and sentiments towards e-government services, expressed by citizens in various social networks in Bulgaria, can be analyzed in an automated manner. For this purpose, we design an integrated ML-based AI system that aims to support decision makers in e-government and public services provision. The system utilizes a variety of data sources: news websites, web forums and other online social networks. To the best of the authors’ knowledge, this is the first study that develops a methodology for mining public opinion in Bulgaria based on a combination of NLP and ML techniques, rather than relying on surveys or in-depth interviews.
Article
Nowadays, customer opinions play a major role in e-commerce applications such as Flipkart, Amazon and eBay. Customer feedback on a product or seller, in the form of reviews or comments, is difficult for potential buyers to process when choosing products online. The proposed system applies various sentiment analysis techniques to provide a solution in two main areas: 1) extracting customer opinions on a specific product or seller, and 2) analyzing the sentiments towards that specific product or seller. In this paper, we analyze several opinion mining and sentiment analysis techniques and their correctness in categorizing opinions or sentiments.
Article
Text analytics is becoming an integral part of modern business and economic research and analysis. However, the extent to which its application is possible and accessible varies for different languages. The main goal of this paper is to outline fundamental research on text analytics applied on data in Bulgarian. A review of key research articles in two main directions is provided – development of language resources for Bulgarian and experimenting with Bulgarian text data in practical applications. By summarizing the results of a large literature review, we draw conclusions about the degree of development of the field, the availability of language resources for the Bulgarian language and the extent to which text analytics has been applied in practical problems. Future directions for research are outlined. To the best of the author’s knowledge, this is the first study providing a comprehensive overview of progress in the field of text analytics in Bulgarian.
Article
Gathering public opinions on the Internet and Internet-based applications like Twitter has become popular in recent times, as it provides decision-makers with uncensored public views on products, government policies, and programs. Through natural language processing and machine learning techniques, unstructured data forms from these sources can be analyzed using traditional statistical learning. The challenge encountered in machine learning method-based sentiment classification still remains the abundant amount of data available, which makes it difficult to train the learning algorithms in feasible time. This eventually degrades the classification accuracy of the algorithms. From this assertion, the effect of training data sizes in classification tasks cannot be overemphasized. This study statistically assessed the performance of Naive Bayes, support vector machine (SVM), and random forest algorithms on sentiment text classification task. The research also investigated the optimal conditions such as varying data sizes, trees, and kernel types under which each of the respective algorithms performed best. The study collected Twitter data from Ghanaian users which contained sentiments about the Ghanaian Government. The data was preprocessed, manually labeled by the researcher, and then trained using the aforementioned algorithms. These algorithms are three of the most popular learning algorithms which have had lots of success in diverse fields. The Naive Bayes classifier was adjudged the best algorithm for the task as it outperformed the other two machine learning algorithms with an accuracy of 99%, F1 score of 86.51%, and Matthews correlation coefficient of 0.9906. The algorithm also performed well with increasing data sizes. The Naive Bayes classifier is recommended as viable for sentiment text classification, especially for text classification systems which work with Big Data.
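The three evaluation measures named in the abstract above (accuracy, F1 score and Matthews correlation coefficient) can be computed from confusion-matrix counts. The sketch below covers only the binary case with the standard textbook formulas; the function name and the example counts are hypothetical and do not reproduce the study's data or its exact evaluation setup.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall, F1 and MCC for binary counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    # MCC denominator is zero when any row or column of the matrix is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }
```

Because the measures weight errors differently, they need not agree: accuracy counts all correct predictions equally, F1 ignores true negatives, and MCC uses all four cells of the confusion matrix.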
Article
The spread of Covid-19 has resulted in worldwide health concerns. Social media is increasingly used to share news and opinions about it. A realistic assessment of the situation is necessary to utilize resources optimally and appropriately. In this research, we perform Covid-19 tweets sentiment analysis using a supervised machine learning approach. Identification of Covid-19 sentiments from tweets would allow informed decisions for better handling the current pandemic situation. The used dataset is extracted from Twitter using IDs as provided by the IEEE data port. Tweets are extracted by an in-house built crawler that uses the Tweepy library. The dataset is cleaned using the preprocessing techniques and sentiments are extracted using the TextBlob library. The contribution of this work is the performance evaluation of various machine learning classifiers using our proposed feature set. This set is formed by concatenating the bag-of-words and the term frequency-inverse document frequency. Tweets are classified as positive, neutral, or negative. Performance of classifiers is evaluated on the accuracy, precision, recall, and F1 score. For completeness, further investigation is made on the dataset using the Long Short-Term Memory (LSTM) architecture of the deep learning model. The results show that Extra Trees Classifiers outperform all other models by achieving a 0.93 accuracy score using our proposed concatenated features set. The LSTM achieves low accuracy as compared to machine learning classifiers. To demonstrate the effectiveness of our proposed feature set, the results are compared with the Vader sentiment analysis technique based on the GloVe feature extraction approach.
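The feature set described in the abstract above concatenates bag-of-words and TF-IDF representations. A minimal pure-Python sketch of that idea follows. The whitespace tokenizer, the unsmoothed `tf * log(N / df)` weighting and all function names are assumptions made for illustration; the study's actual preprocessing and weighting scheme are not specified here and may differ.

```python
import math
from collections import Counter


def build_vocab(docs):
    """Sorted list of all distinct lowercase tokens in the corpus."""
    return sorted({tok for doc in docs for tok in doc.lower().split()})


def bow_vector(doc, vocab):
    """Raw term counts for one document, ordered by the vocabulary."""
    counts = Counter(doc.lower().split())
    return [float(counts[term]) for term in vocab]


def tfidf_vector(doc, vocab, docs):
    """Unsmoothed TF-IDF: relative term frequency times log(N / df)."""
    tokens = doc.lower().split()
    counts = Counter(tokens)
    n_docs = len(docs)
    vector = []
    for term in vocab:
        tf = counts[term] / len(tokens) if tokens else 0.0
        df = sum(1 for d in docs if term in d.lower().split())
        idf = math.log(n_docs / df) if df else 0.0
        vector.append(tf * idf)
    return vector


def concatenated_features(doc, vocab, docs):
    """Bag-of-words counts followed by TF-IDF weights (length 2 * |vocab|)."""
    return bow_vector(doc, vocab) + tfidf_vector(doc, vocab, docs)
```

The resulting vector has twice the vocabulary length and could be passed to any classifier that accepts dense numeric features.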
Article
Belgium is often portrayed as a textbook example of gradual federalization. Today, however, a rather new debate among political elites has emerged: whether to refederalize some of the powers that have been devolved to the substate entities. Yet, little is known about how citizens see the issue. The objective of this article is therefore to explore and compare citizens’ arguments for more or less regional autonomy. To this end, three citizen forums focusing on federalism were organized in 2017–2018: one in French, a second one in Dutch and a third one in German. They were transcribed and analysed using thematic analysis. Our results suggest that citizen opinions are justified based on two major argumentative themes: identity and efficiency. While one would expect the former to be of traditional importance, our analysis revealed that considerations about efficiency have taken the lead. This can above all be understood given the advanced stage of the Belgian federalization process, for which considerations of identity are still latently important but explicitly not sufficient enough anymore to justify further dynamics. Finally, our analysis also outlined the existence of additional argumentative frames related to path dependencies and the peculiar situation of Brussels.
Article
The Digital India initiative by the Government of India is an initiative by Shri Narendra Modi, the Prime Minister of India. Launched in the year 2015, the programme enhanced its scope to various digital services wings. In this study, the researcher attempted to identify public sentiments by studying their comments through Tweets. The researcher considered the Tweets by Digital India for a period of one year from November 21, 2016 to November 20, 2017. With the help of Quasi Systematic Sampling, 572 Tweets were pulled up for analysis. The researcher studied the first two comments of each Tweet. Every comment was categorized based on primary sentiment of the sentence such as ‘Support’, ‘Criticism’, ‘Neutral’, ‘Enquiry’ and ‘Out of Context’. In order to validate the researcher’s conclusion of the sentiment on any comment, the researcher conducted ‘Inter-Rater Reliability Test’ to calculate Kappa Coefficient (k). This study intends to find out the sentiment of the public in majority and the Twitter accounts, which received the higher percentage of supportive, neutral or criticizing comments. The research further tries to identify if the supportive or criticizing comments are higher in English or Hindi. Keywords: Digital India, Sentiment Analysis, Twitter, Public Relations, Twitter Comments.
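The Kappa coefficient mentioned in the abstract above is commonly computed as Cohen's kappa for two raters. The sketch below uses the standard formula, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function name and the example labels are hypothetical; whether the study used exactly this variant is an assumption.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of category labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item set")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label proportions.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivial.
        return 1.0
    return (observed - expected) / (1 - expected)
```

A value of 1 indicates perfect agreement, while 0 indicates agreement no better than chance.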
Article
Online services depend primarily on customer feedback and communications. When this kind of input is lacking, the overall approach of the service provider can shift in unintended ways. These services rely on feedback to maintain consumer satisfaction. Online social networks are a rich source of consumer data related to services and products. Well developed methods like sentiment analysis can offer insightful analyses and aid service providers in predicting outcomes based on their reviews-which, in turn, enables decision-makers to develop effective strategic plans. However, gathering this data is more challenging on Arabic online social networks, due to the complexity of the Arabic language and its dialects. In this study, we propose an approach to sentiment analysis that combines a neutrality detector model with eXtreme Gradient Boosting and a genetic algorithm to effectively predict and analyze customers' opinions of an e-Payment service through an Arabic social network. The proposed approach yields excellent results compared to other approaches. Feature analysis is also conducted on consumer reviews to identify influencing keywords.
Article
COVID-19 has changed our lives forever. The world we knew until now has been transformed and nowadays we live in a completely new scenario in a perpetual restructuring transition, in which the way we live, relate, and communicate with others has been altered permanently. Within this context, risk communication is playing a decisive role when informing, transmitting, and channeling the flow of information in society. COVID-19 has posed a real pandemic risk management challenge in terms of impact, preparedness, response, and mitigation by governments, health organizations, non-governmental organizations (NGOs), mass media, and stakeholders. In this study, we monitored the digital ecosystems during March and April 2020, and we obtained a sample of 106,261 communications through the analysis of APIs and Web Scraping techniques. This study examines how social media has affected risk communication in uncertain contexts and its impact on the emotions and sentiments derived from the semantic analysis in Spanish society during the COVID-19 pandemic.
Article
Financial and economic news is continuously monitored by financial market participants. According to the efficient market hypothesis, all past information is reflected in stock prices and new information is instantaneously absorbed in determining future stock prices. Hence, prompt extraction of positive or negative sentiments from news is very important for investment decision-making by traders, portfolio managers and investors. Sentiment analysis models can provide an efficient method for extracting actionable signals from the news. However, financial sentiment analysis is challenging due to domain-specific language and unavailability of large labeled datasets. General sentiment analysis models are ineffective when applied to specific domains such as finance. To overcome these challenges, we design an evaluation platform which we use to assess the effectiveness and performance of various sentiment analysis approaches, based on combinations of text representation methods and machine-learning classifiers. We perform more than one hundred experiments using publicly available datasets, labeled by financial experts. We start the evaluation with specific lexicons for sentiment analysis in finance and gradually build the study to include word and sentence encoders, up to the latest available NLP transformers. The results show improved efficiency of contextual embeddings in sentiment analysis compared to lexicons and fixed word and sentence encoders, even when large datasets are not available. Furthermore, distilled versions of NLP transformers produce comparable results to their larger teacher models, which makes them suitable for use in production environments.
Article
The COVID-19 outbreak was first reported in Wuhan, China and has spread to more than 50 countries. WHO declared COVID-19 a Public Health Emergency of International Concern (PHEIC) on 30 January 2020. Naturally, an emerging infectious disease spreads fast, endangers the health of large numbers of people, and thus requires immediate action to prevent the disease at the community level. Therefore, CoronaTracker was born as an online platform that provides the latest reliable news, as well as statistics and analysis on COVID-19. This paper is written by the research team in the CoronaTracker community and aims to predict and forecast COVID-19 cases, deaths, and recoveries through predictive modelling. The model helps to interpret patterns of public sentiment on disseminating related health information, and to assess the political and economic influence of the spread of the virus.
Article
Nowadays, key performance indicators (KPIs) are considered an important factor in evaluating an organization's maturity level. Information and Communication Technology (ICT) is the backbone of modern countries' infrastructure. Kuwait aims to enhance ICT services for citizens in order to increase citizen satisfaction. This cannot be achieved without evaluating the existing services by defining KPIs. The main objective is to provide a solution for e-government in Kuwait, especially in the educational sector, in order to facilitate and enhance the decision-making process. This paper proposes a road map that introduces KPI measurements for e-government services. The proposed road map uses mission, vision, and objectives to define and measure KPIs. We used five key indicators: loyalty, participation, productivity, communication, and satisfaction. A case study was implemented for the Ministry of Education (MOE) using a questionnaire with 291 participants. Data mining (DM), sentiment analysis (SA), and statistical methods were used to analyze the results of the questionnaire, and they produced nearly the same results. The clustering process indicates the degree of agreement regarding the predefined Key Result Indicators (KRIs); based on three clusters, it reaches 63.7% for participation, 64.2% for satisfaction, 65.2% for loyalty, 66.3% for communication, and 63.7% for productivity. The sentiment analysis model correctly predicted 86 positive reviews (67.7%) and 41 negative reviews (32.3%). The statistical methods (mean, standard deviation and percentage) show values close to the data mining (clustering) results: 64% for participation, 64.8% for satisfaction, 66% for loyalty, 64.4% for communication, and 65% for productivity. The results indicate that the outputs of the three evaluation methods are nearly equivalent. 
This leads to an important implication: despite the excellent Information and Communication Technology (ICT) infrastructure, the proposed road map highlights that e-government services need to be enhanced. Enhancements may include increasing training for teachers and students, developing modern schools, and developing long-term educational policies and plans so that Kuwaiti citizens can cope with the tremendous advancements in the ICT sector.
Article
Research on user satisfaction has increased substantially in recent years. To date, most studies have tested the significance of pre‐defined factors thought to influence user satisfaction, with no scalable means of verifying the validity of their assumptions. Digital technology has created new methods of collecting user feedback where service users post comments. As topic models can analyze large volumes of feedback, they have been proposed as a feasible approach to aggregating user opinions. This novel approach has been applied to process reviews of primary‐care practices in England. Findings from an analysis of more than 200,000 reviews show that the quality of interactions with staff and bureaucratic exigencies are the key drivers of user satisfaction. In addition, patient satisfaction is strongly influenced by factors that are not measured by state‐of‐the‐art patient surveys. These results highlight the potential benefits of text mining and machine learning for public administration.
Chapter
Government services are available online and can be provided through multiple digital channels; clients’ feedback on these services can also be submitted and obtained online. Enormous budgets are invested annually by governments to understand their clients and adapt services to meet their needs. In this paper, a unique dataset that consists of Arabic reviews of government smart apps, domain aspects and opinion words is produced. The paper describes the approach used to manually annotate the reviews, assign sentiment scores to opinion words and build the desired lexicons. Furthermore, this paper presents an Arabic Aspect-Based Sentiment Analysis (ABSA) model that combines a lexicon with rule-based models. The proposed model aims to extract aspects from Arabic reviews of smart government applications and classify all corresponding sentiments. This model examines mobile government app reviews from various perspectives to provide an insight into the needs and expectations of clients. In addition, it aims to develop techniques, rules and lexicons for language processing to address a variety of SA challenges. The performance of the proposed approach confirmed that applying rule settings that handle some of the challenges in ABSA improves performance significantly. The results reported in the study show an increase in accuracy and F-measure of 6% and 17%, respectively, compared with the baseline.
Article
Artificial Intelligence (AI) has recently advanced the state-of-art results in an ever-growing number of domains. However, it still faces several challenges that hinder its deployment in the e-government applications–both for improving the e-government systems and the e-government-citizens interactions. In this paper, we address the challenges of e-government systems and propose a framework that utilizes AI technologies to automate and facilitate e-government services. Specifically, we first outline a framework for the management of e-government information resources. Second, we develop a set of deep learning models that aim to automate several e-government services. Third, we propose a smart e-government platform architecture that supports the development and implementation of AI applications of e-government. Our overarching goal is to utilize trustworthy AI techniques in advancing the current state of e-government services in order to minimize processing times, reduce costs, and improve citizens’ satisfaction.
Article
With the growing availability of the internet and opinion-rich resources such as social networks and personal blogs, the task of mining public opinion and exploring facts has become more popular than ever during the last decade. This trend has deeply transformed the way governments interact with their citizens and offer them various services through continuous public engagement. The proposed framework, SCANCPECLENS, is an initiative to support a performance assessment framework for e-government in Pakistan. The research takes into account the opinion of the masses on one of the most crucial and widely discussed development projects, the China Pakistan Economic Corridor (CPEC), considered a game changer due to its promise of bringing economic prosperity to the region. The proposed framework suggests using machine learning algorithms to automatically discover public sentiment on the matter from microblogs, nationally as well as internationally. We also present an automated way to create a sentiment lexicon of positive, negative and neutral words on the subject. To the best of our knowledge, this theme has not been explored for opinion mining before, and it helps in effectively assessing public satisfaction with the government’s policies in the CPEC region. The research is an initiative to discover new avenues of future research and direction for the government, policy-making institutions and other stakeholders, and it demonstrates the power of text mining as an effective tool to extract business value from vast amounts of social media data.
Conference Paper
Big Data is clearly an integral part of modern information societies. A vast amount of data is produced daily, and it is estimated that this amount will grow dramatically in the years to come. In an effort to transform the information hidden in this ocean of data into useful knowledge, the use of advanced technologies such as Machine Learning is deemed appropriate. Machine Learning is a technology that can handle Big Data classification for statistical or even more complex purposes, such as decision making. This fits perfectly with the scope of the new generation of government, Government 3.0, which explores all the new opportunities to tackle any challenge faced by contemporary societies by utilizing new technologies for data-driven decision making. Boosted by these opportunities, Machine Learning can help more and more governments participate in the development of such applications in different governmental domains. But is Machine Learning only beneficial for the public sector? Although there is a large body of research in the literature related to Machine Learning applications, there is a lack of a comprehensive study focusing on the usage of this technology within governmental applications. The current paper addresses this research question by conducting a comprehensive analysis of the use of Machine Learning by governments. The analysis identified several interesting findings, covering both benefits and barriers from the public sector's perspective and pinpointing a wide adoption of Machine Learning approaches in the public sector.
Article
In recent years, there has been an exponential growth in the number of complex documents and texts that require a deeper understanding of machine learning methods to be able to accurately classify texts in many applications. Many machine learning approaches have achieved outstanding results in natural language processing. The success of these learning algorithms relies on their capacity to understand complex models and non-linear relationships within data. However, finding suitable structures, architectures, and techniques for text classification is a challenge for researchers. In this paper, a brief overview of text classification algorithms is discussed. This overview covers different text feature extraction methods, dimensionality reduction methods, existing algorithms and techniques, and evaluation methods. Finally, the limitations of each technique and their application to real-world problems are discussed.
Article
Research on customer satisfaction has increased substantially in recent years. However, the relative importance of, and relationships between, different determinants of satisfaction remain uncertain. Moreover, quantitative studies to date tend to test for the significance of pre-determined factors thought to have an influence, with no scalable means to identify other causes of user satisfaction. These gaps make it difficult to use available knowledge on user preference for public service improvement. Meanwhile, digital technology development has enabled new methods to collect user feedback, for example through online forums where users can comment freely on their experience. New tools are needed to analyze large volumes of such feedback. Use of topic models is proposed as a feasible solution to aggregate open-ended user opinions that can be easily deployed in the public sector. Generated insights can contribute to a more inclusive decision-making process in public service provision. This novel methodological approach is applied to a case of service reviews of publicly-funded primary care practices in England. Findings from the analysis of 145,000 reviews covering almost 7,700 primary care centers indicate that the quality of interactions with staff and bureaucratic exigencies are the key issues driving user satisfaction across England.
Article
This report presents the findings of a survey on illegitimate economic practices in Bulgaria conducted between July and October 2015. This representative survey of 2005 citizens focused on the experiences of Bulgarians with undeclared work, envelope wages and the practice of “pulling strings”, as well as on their opinion about these types of dishonest behaviour. According to the respondents, illegitimate economic practices are strongly ingrained in Bulgarian society. According to the estimation of Schneider (2013), the undeclared economy accounts for 31% of GDP in Bulgaria in 2013, which is the highest estimation for any country in the EU-28. According to the enterprise surveys of World Bank (2009), 54% of corporations admitted participation in the undeclared economy. In our survey, more than seven out of ten respondents were certain that at least one in five citizens regularly violates tax and labour laws. The most important reasons are believed to be the lack of formal employment opportunities and high tax burden. When it comes to the use of personal connections to circumvent rules and procedures, 74.5% of Bulgarians perceive this particular type of misbehavior as important or very important for achieving certain goals in Bulgaria. Moreover, one in four citizens has a positive attitude towards this illegitimate practice, while a further three in ten are neutral in their attitude towards such practices. It is thus unsurprising that there is a high prevalence of these illicit activities. The survey reveals that 17.1% of Bulgarians acquire goods from the undeclared economy, 22.2% pay for undeclared services, 9.6% of employees are employed without a work contract, 15.3% of registered dependent employees earn more than is stated in their contract, 30.1% of Bulgarians rely on illegitimate help/favours from people, and that 15.0% of the population provides such help/favours. 
However, these should all be treated as lower-bound estimates, given that surveys tend to under-report participation when sensitive issues are being investigated. Analysing involvement in undeclared work, nevertheless, the findings reveal that tax morale and personal views on the extent to which others participate are key determinants. The lower one’s tax morale, the higher the propensity to participate in the undeclared economy (and this applies to both the demand and supply sides). Likewise, the higher the perceived number of people engaged in such activity, the stronger one’s personal inclination towards such behaviour. However, the extent of personal views is an important factor on the supply side because in Bulgaria a purchaser of undeclared goods and services is not punished, only the supplier. Undeclared work is found to be particularly prevalent in agriculture and the construction industry. More than one-third of the informal buyers had purchased agricultural products (milk, meat, crops, fruits, etc.) without a receipt over the 12 months prior to the survey, while more than one quarter admitted to having hired an undeclared individual for home repair and maintenance tasks. On the supply side, 14.6% of those reporting participation in undeclared work had provided home repair and renovation services, while 8.9% were selling agricultural products. Social ties play an essential role in unregistered economic transactions in Bulgaria: 26.5% of undeclared transactions in Bulgaria are for or by a close social relation. For instance, from the demand side, 48.3% and 13.6% of the surveyed individuals buy undeclared goods and services from “other private persons or households” and “friends, colleagues or acquaintance”, respectively. This finding that undeclared work is often conducted for close social relations is also reflected in the fact that one-third of the providers of informal goods and services asserted that both parties benefited from it.
Indeed, 19.9% of respondents assert that this is the normal way things are done among friends, neighbors or relatives, and 8.1% of respondents participate to help someone out. The undeclared economy in Bulgaria thus seems to be a parallel universe to the declared economy, offering a similar range of goods and services but for a lower price than the formal market, with this being identified as the most important motivation by purchasers of undeclared goods and services. This was also confirmed on the supply side: 35.9% of undeclared workers admitted that mutual financial benefit of both parties was a key reason to conceal the transaction from the authorities. Also, 28.6% of individuals were engaged in unregistered activities simply due to the lack of government credibility, and 24.7% of suppliers assert that the high tax burden is an important determinant of undeclared work. Also, 22.3% of individuals were engaged in unregistered activities simply due to the lack of formal employment, which indicates that undeclared work indeed has an important role in making ends meet for many individuals in Bulgaria. This in large part explains why unemployed and self-employed individuals are more likely to work undeclared in Bulgaria than other occupational groups. Citizens earning more than 1000 euros and citizens without income are also top of the list of groups regarding the prevalence of envelope wages in Bulgaria, with 20.4% and 17.4% respectively of the formal workforce in these two groups receiving more than they report to the authorities. As in the case of completely undeclared work, tax morale and the perceived commonality of undeclared work (i.e., the lack of vertical and horizontal trust) are key determinants of envelope wage practices in Bulgaria. Under-declaration of wages in Bulgaria is most commonly instigated by the employer. Indeed, this type of noncompliance seems to primarily serve as an efficient tax and social contribution evasion strategy for employers.
In general, underreporting of wages was found to be most common among new entrants to the labour market. Analysing pulling strings to get things done, the survey reveals that Bulgarians most often circumvent procedures related to job seeking, with 5.6% of participants admitting to having relied on personal connections to find a job. Bulgarians also heavily rely on pulling strings for maintenance services, as well as when getting foodstuffs. Also, 2.9% of participants rely on pulling strings to seek services at a better quality or a better price. Almost 70% of participants requested friends to pull strings for them, while 30% used relatives. This explains why pulling strings in Bulgaria is rarely a monetary transaction, given that in most cases either only verbal gratitude was expressed to the provider of the favour/help, or the favour was returned later. In general, younger people are far more likely to provide or use such favours than older generations, while there is no significant difference between genders in this respect. Analysing how illegitimate practices can be tackled, Bulgarians do not believe that increased penalties for violators would be an effective approach, and the same applies to awareness-raising campaigns alone. Instead, the prevalent opinion is that undeclared work in Bulgaria cannot be reduced without improving the psychological contract between the authorities and citizens (i.e., vertical trust), and this should be done first and foremost by changing formal institutions. Citizens widely believe that there is a need for a change in the way in which enforcement agencies treat citizens. This primarily refers to more collaboration and less coercion on the part of the inspectors, as well as the provision of equal treatment across all groups of citizens. Finally, citizens believe that ensuring a sense of fair treatment in public and government institutions would reduce the use of personal connections.
Article
Full-text available
Natural language processing and machine learning can be applied to student feedback to help university administrators and teachers address problematic areas in teaching and learning. The proposed system analyzes student comments from both course surveys and online sources to identify sentiment polarity, the emotions expressed, and satisfaction versus dissatisfaction. A comparison with direct-assessment results demonstrates the system's reliability.
Article
Full-text available
This paper demonstrates state-of-the-art text sentiment analysis tools while developing a new time-series measure of economic sentiment derived from economic and financial newspaper articles from January 1980 to April 2015. We compare the predictive accuracy of a large set of sentiment analysis models using a sample of articles that have been rated by humans on a positivity/negativity scale. The results highlight the gains from combining existing lexicons and from accounting for negation. We also generate our own sentiment-scoring model, which includes a new lexicon built specifically to capture the sentiment in economic news articles. This model is shown to have better predictive accuracy than existing, “off-the-shelf”, models. Lastly, we provide two applications to the economic research on sentiment. First, we show that daily news sentiment is predictive of movements of survey-based measures of consumer sentiment. Second, motivated by Barsky and Sims (2012), we estimate the impulse responses of macroeconomic variables to sentiment shocks, finding that positive sentiment shocks increase consumption, output, and interest rates and dampen inflation.
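The two gains reported above, combining lexicons and accounting for negation, can be illustrated with a minimal sketch. The two tiny lexicons, the negator list, the three-token window and the function names below are all assumptions made for illustration; they are not the lexicons or the scoring model used in the paper.

```python
import re

# Two tiny hypothetical lexicons; real lexicons hold thousands of scored words.
LEXICON_A = {"growth": 1.0, "strong": 1.0, "recession": -1.0}
LEXICON_B = {"strong": 0.5, "weak": -1.0, "recession": -0.5}
NEGATORS = {"not", "no", "never"}  # assumed negation cues


def combined_score(word):
    """Average the scores of every lexicon that contains the word."""
    scores = [lex[word] for lex in (LEXICON_A, LEXICON_B) if word in lex]
    return sum(scores) / len(scores) if scores else 0.0


def article_sentiment(text, window=3):
    """Sum word scores, flipping the sign of a word when a negator
    occurs within `window` tokens before it."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = 0.0
    for i, token in enumerate(tokens):
        score = combined_score(token)
        if score == 0.0:
            continue
        preceding = tokens[max(0, i - window):i]
        if any(p in NEGATORS for p in preceding):
            score = -score
        total += score
    return total


print(article_sentiment("Growth is strong"))      # 1.75
print(article_sentiment("Growth is not strong"))  # 0.25
```

Averaging across lexicons is only one way to combine them; taking a union with a fixed priority order would be an equally plausible choice for a sketch like this.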
Article
Full-text available
With the accelerated evolution of the internet, websites, social networks, blogs, and online portals carry reviews, opinions, recommendations, ratings, and feedback generated by writers. This writer-generated sentiment content can be about books, people, hotels, products, research, events, etc. These sentiments are very beneficial for businesses, governments, and individuals. While this content is meant to be useful, the bulk of it requires text mining techniques and sentiment analysis. However, the sentiment analysis and evaluation process faces several challenges. These challenges become obstacles to analyzing the accurate meaning of sentiments and detecting the suitable sentiment polarity. Sentiment analysis is the practice of applying natural language processing and text analysis techniques to identify and extract subjective information from text. This paper presents a survey of sentiment analysis challenges relevant to their approaches and techniques.
Article
Full-text available
With the advent of Web 2.0, people became more eager to express and share their opinions on the web regarding day-to-day activities as well as global issues. The evolution of social media has also contributed immensely to these activities, providing a transparent platform to share views across the world. These electronic Word of Mouth (eWOM) statements expressed on the web are prevalent in the business and service industries, where they enable customers to share their point of view. Over the last decade and a half, research communities, academia, and the public and service industries have been working rigorously on sentiment analysis, also known as opinion mining, to extract and analyze public mood and views. In this regard, this paper presents a rigorous survey on sentiment analysis, which portrays views presented by over one hundred articles published in the last decade regarding necessary tasks, approaches, and applications of sentiment analysis. Several sub-tasks need to be performed for sentiment analysis, which in turn can be accomplished using various approaches and techniques. This survey, covering literature published during 2002-2015, is organized on the basis of the sub-tasks to be performed, the machine learning and natural language processing techniques used, and the applications of sentiment analysis. The paper also presents open issues along with a summary table of one hundred and sixty-one articles.
Conference Paper
Full-text available
The inherent nature of social media content poses serious challenges to practical applications of sentiment analysis. We present VADER, a simple rule-based model for general sentiment analysis, and compare its effectiveness to eleven typical state-of-practice benchmarks including LIWC, ANEW, the General Inquirer, SentiWordNet, and machine learning oriented techniques relying on Naive Bayes, Maximum Entropy, and Support Vector Machine (SVM) algorithms. Using a combination of qualitative and quantitative methods, we first construct and empirically validate a gold-standard list of lexical features (along with their associated sentiment intensity measures) which are specifically attuned to sentiment in microblog-like contexts. We then combine these lexical features with consideration for five general rules that embody grammatical and syntactical conventions for expressing and emphasizing sentiment intensity. Interestingly, using our parsimonious rule-based model to assess the sentiment of tweets, we find that VADER outperforms individual human raters (F1 Classification Accuracy = 0.96 and 0.84, respectively), and generalizes more favorably across contexts than any of our benchmarks.
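To make the idea of pairing a valence lexicon with emphasis rules concrete, here is a minimal sketch. It is not VADER's implementation: the lexicon entries, the three rules shown (degree modifiers, all-caps emphasis, exclamation marks) and every weight are assumptions picked only for illustration.

```python
# Hypothetical valence lexicon and weights, chosen only for illustration;
# these are not VADER's published values.
VALENCE = {"good": 2.0, "great": 3.0, "bad": -2.0, "awful": -3.0}
DEGREE = {"very": 0.5, "extremely": 1.0, "slightly": -0.5}
CAPS_BONUS = 0.5
EXCLAIM_BONUS = 0.25
MAX_EXCLAIMS = 3
PUNCT = ".,!?;:"


def emphasis_score(text):
    """Sum lexicon valences, adjusted by three simple emphasis rules."""
    words = text.split()
    total = 0.0
    for i, raw in enumerate(words):
        bare = raw.strip(PUNCT)
        key = bare.lower()
        if key not in VALENCE:
            continue
        score = VALENCE[key]
        sign = 1.0 if score > 0 else -1.0
        # Rule 1: an ALL-CAPS sentiment word is treated as emphasized.
        if len(bare) > 1 and bare.isupper():
            score += sign * CAPS_BONUS
        # Rule 2: a degree modifier directly before the word scales intensity.
        if i > 0:
            previous = words[i - 1].strip(PUNCT).lower()
            score += sign * DEGREE.get(previous, 0.0)
        total += score
    # Rule 3: exclamation marks (capped) amplify the overall polarity.
    if total != 0.0:
        sign = 1.0 if total > 0 else -1.0
        total += sign * EXCLAIM_BONUS * min(text.count("!"), MAX_EXCLAIMS)
    return total


print(emphasis_score("The service was good"))            # 2.0
print(emphasis_score("The service was very good"))       # 2.5
print(emphasis_score("The service was extremely bad!"))  # -3.25
```

Each rule only pushes a score further in the direction it already points, which is why emphasis never flips polarity in this sketch.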
Article
At the dawn of the year 2020, the world was hit by a significant pandemic, COVID-19, that traumatized the entire planet. The infectious spread grew in leaps and bounds and forced policymakers and governments to move towards lockdown. The lockdown compelled people to stay under house arrest, which in turn resulted in an outbreak of emotions on social media platforms. Perceiving people's emotional state during these times becomes critically and strategically important for the government and the policymakers. In this regard, a novel emotion care scheme has been proposed in this paper to analyze multimodal textual data contained in real-time tweets related to COVID-19. Moreover, this paper studies 8-scale emotions (Anger, Anticipation, Disgust, Fear, Joy, Sadness, Surprise, and Trust) over multiple categories such as nature, lockdown, health, education, market, and politics. To the best of our understanding, this is the first linguistic analysis of its kind on multiple modes pertaining to the pandemic. Taking India as a case study, we inferred from this textual analysis that 'joy' was lower towards everything (∼9-15%) except nature (∼17%), apparently because of reduced pollution. The education system attracted more trust (∼29%) due to the consistent efforts of the teaching fraternity. The health sector witnessed sadness (∼16%) and fear (∼18%) as the dominant emotions among the masses as human lives were at stake. Additionally, a state-wise and emotion-wise depiction is provided. An interactive internet application has also been developed for the same.
Article
Tracking the sentiment of news entities over time provides important information to governments and enterprises during the decision-making process. Recently, it has also attracted the attention of the research community due to its popularity in many applications, including tracking news about elections, e-commerce, and e-governance. However, most of the work is focused on English, whereas contributions for Arabic have been limited. Moreover, there are no annotated corpora in the Arabic news domain that can be used to perform the sentiment tracking task. In this research, we present an Arabic news corpus and its associated sentiment tracking system to monitor the sentiments towards news entities in the Arab world. Sentiment classification and Named Entity Recognition techniques are used to prepare the corpus for the tracking task. A sample dataset containing 7200 tweets was manually annotated and used to build multiple classifiers and to annotate more than 2.3M tweets using a semi-supervised technique. The results of sentiment classification using different machine learning classifiers and an internal testing set show that the semi-automatically annotated dataset outperforms the manually annotated dataset by 23% and 16% in F1-score on two-way and three-way classification, respectively. The tracking results illustrate that, over time, the sentiment tracking performs well at discovering the most popular entities from social media and tracking their shifts in different Arab regions. It can be used to detect the possible reasons for sentiment change over time and to predict the future sentiment of the news entities.
Article
Dual citizenship has been the subject of intense political debate in Pakistan, where elected representatives in parliament, the provincial assemblies and the presidency are barred from holding dual nationality. The perception that holding a foreign citizenship challenges undiluted loyalty to the country further raises questions over dual citizens' political participation in the country’s affairs. A qualitative study was carried out in the city of Rawalpindi to explore the stance of Pakistani nationals on this exclusion of Pakistani dual citizens from the mainstream politics of Pakistan, asking whether an individual’s association with the state should be an exclusive one. In-depth interviews were conducted with a sample of 40 male and female respondents (22 Pakistani citizens, 18 Pakistani dual citizens) and 6 constitutional experts. The results indicate distrust among Pakistani citizens when considering dual citizens’ right of representation in general elections of Pakistan, stemming from a mistrust of their “split loyalties”.
Book
An important part of our information-gathering behavior has always been to find out what other people think. With the growing availability and popularity of opinion-rich resources such as online review sites and personal blogs, new opportunities and challenges arise as people can, and do, actively use information technologies to seek out and understand the opinions of others. The sudden eruption of activity in the area of opinion mining and sentiment analysis, which deals with the computational treatment of opinion, sentiment, and subjectivity in text, has thus occurred at least in part as a direct response to the surge of interest in new systems that deal directly with opinions as a first-class object. Opinion Mining and Sentiment Analysis covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems. The focus is on methods that seek to address the new challenges raised by sentiment-aware applications, as compared to those that are already present in more traditional fact-based analysis. The survey includes an enumeration of the various applications, a look at general challenges and discusses categorization, extraction and summarization. Finally, it moves beyond just the technical issues, devoting significant attention to the broader implications that the development of opinion-oriented information-access services have: questions of privacy, vulnerability to manipulation, and whether or not reviews can have measurable economic impact. To facilitate future work, a discussion of available resources, benchmark datasets, and evaluation campaigns is also provided. Opinion Mining and Sentiment Analysis is the first such comprehensive survey of this vibrant and important research area and will be of interest to anyone with an interest in opinion-oriented information-seeking systems.
Article
Nowadays, improving governance, ensuring security and detecting propaganda against the government in a timely manner are major problems of e-government. Extraction of hidden social networks is one of the most pressing problems in terms of government security, and extracting hidden social networks operating against the state is one of the key factors in ensuring security in e-government. In this paper, a method is proposed for extracting hidden social networks in order to improve management in e-government, prevent propaganda against the government and ensure security. In this approach, hidden social networks are extracted through the analysis of users' comments via opinion and text mining technologies.
Chapter
As a new classification platform, deep learning has recently received increasing attention from researchers and has been successfully applied to many domains. In some domains, like bioinformatics and robotics, it is very difficult to construct a large-scale well-annotated dataset due to the expense of data acquisition and costly annotation, which limits its development. Transfer learning relaxes the hypothesis that the training data must be independent and identically distributed (i.i.d.) with the test data, which motivates us to use transfer learning to solve the problem of insufficient training data. This survey focuses on reviewing current research on transfer learning using deep neural networks and its applications. We define deep transfer learning, then categorize and review recent research works based on the techniques used in deep transfer learning.
Conference Paper
The goal of this paper is to propose a methodology comprising a range of visualization techniques to analyze the interactions between government and citizens on the issues of public concern taking place on Twitter, mainly through the official government or ministry accounts. The methodology addresses: 1) the level of government activity in different countries and sectors; 2) the topics that are addressed through such activities; 3) the resources shared between government and citizens as part of interactions; 4) the intensity of citizen response to government announcements; 5) the sentiment expressed by citizens when providing such responses; and 6) the combinations of such issues. Example combinations include identifying topics that generated the largest Twitter activity by government but received the least interest from citizens, identifying topics that generated the most polarized reactions from citizens, or determining correlation between policy announcements and trust, fear and other negative emotions expressed by citizens. The methodology uses visual analytics to reveal patterns and trends associated with various questions, complemented with sentiment analysis to study government-citizen interactions on Twitter. The methodology is validated by examining Twitter presence in five sectors --- health, social development, education, environment and work, in five Latin American countries with mature e-Participation capabilities --- Argentina, Chile, Colombia, Mexico and Uruguay.
Conference Paper
The main goal of any government is to secure the basic rights of its citizens, promoting general welfare and economic growth while maintaining domestic tranquility and achieving sustainable development. Governments and their agencies introduce several initiatives for the welfare of citizens and to improve the quality of public services. The performance of all such initiatives needs to be evaluated with the involvement of citizens at large, to draw insights into the public acceptance of such initiatives. The insights gained can be used to restructure or transform government initiatives to make them more successful. Citizens routinely use social networks to post their opinions, views, and comments. Analyzing social media content is thus very important for governments when taking decisions. Sentiment analysis can be used as a tool to analyze the citizens' feedback expressed on such social media.
Article
The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.0 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.
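The attention mechanism at the core of the Transformer can be sketched in a few lines. The abstract does not spell out the formula; the sketch below uses the scaled dot-product form defined in the paper itself, softmax(QKᵀ/√d_k)V, written in plain Python for readability. It covers a single attention head with no masking, learned projections or batching, and the function names are illustrative.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Compute softmax(Q K^T / sqrt(d_k)) V for lists of row vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Output row is the attention-weighted average of the value rows.
        outputs.append([
            sum(w * v[col] for w, v in zip(weights, values))
            for col in range(len(values[0]))
        ])
    return outputs


K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
# A zero query attends to both keys equally, so the output is the mean of V.
print(scaled_dot_product_attention([[0.0, 0.0]], K, V))  # [[2.0, 3.0]]
```

Because every query row is processed independently of the others, the computation parallelizes across positions, which is the property the abstract contrasts with recurrent models.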
Article
Sentiment analysis from text consists of extracting information about opinions, sentiments, and even emotions conveyed by writers towards topics of interest. It is often equated to opinion mining, but it should also encompass emotion mining. Opinion mining involves the use of natural language processing and machine learning to determine the attitude of a writer towards a subject. Emotion mining uses similar technologies but is concerned with detecting and classifying writers' emotions toward events or topics. Textual emotion-mining methods have various applications, including gaining information about customer satisfaction, helping in selecting teaching materials in e-learning, recommending products based on users' emotions, and even predicting mental-health disorders. In surveys on sentiment analysis, which are often old or incomplete, the strong link between opinion mining and emotion mining is understated. This motivates the need for a different and new perspective on the literature on sentiment analysis, with a focus on emotion mining. We present the state-of-the-art methods and propose the following contributions: (1) a taxonomy of sentiment analysis; (2) a survey on polarity classification methods and resources, especially those related to emotion mining; (3) a complete survey on emotion theories and emotion-mining research; and (4) some useful resources, including lexicons and datasets.
Conference Paper
A word's sentiment depends on the domain in which it is used. Computational social science research thus requires sentiment lexicons that are specific to the domains being studied. We combine domain-specific word embeddings with a label propagation framework to induce accurate domain-specific sentiment lexicons using small sets of seed words. We show that our approach achieves state-of-the-art performance on inducing sentiment lexicons from domain-specific corpora and that our purely corpus-based approach outperforms methods that rely on hand-curated resources (e.g., WordNet). Using our framework, we induce and release historical sentiment lexicons for 150 years of English and community-specific sentiment lexicons for 250 online communities from the social media forum Reddit. The historical lexicons we induce show that more than 5% of sentiment-bearing (non-neutral) English words completely switched polarity during the last 150 years, and the community-specific lexicons highlight how sentiment varies drastically between different communities.
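The idea of inducing a lexicon from a few seed words by label propagation can be illustrated with a small sketch. The code below is not the authors' implementation: it assumes a hand-made, symmetric word-similarity graph standing in for the embedding-derived graph, and the names `propagate`, `induce_polarity`, and `TOY_GRAPH`, as well as the parameter values, are hypothetical. It runs a random walk with restart from the positive seeds and from the negative seeds, then scores each word by the share of positive mass it receives.

```python
def propagate(graph, seeds, beta=0.9, iterations=50):
    """Random walk with restart from a seed set over a weighted graph.

    graph: dict mapping word -> {neighbor: similarity weight}; assumed
           symmetric, with every word having at least one neighbor.
    seeds: set of seed words that receive the restart probability.
    """
    restart = {w: (1.0 / len(seeds) if w in seeds else 0.0) for w in graph}
    scores = dict(restart)
    for _ in range(iterations):
        updated = {}
        for word in graph:
            incoming = 0.0
            for neighbor, weight in graph[word].items():
                # Share of the neighbor's mass that flows along this edge.
                incoming += scores[neighbor] * weight / sum(graph[neighbor].values())
            updated[word] = beta * incoming + (1.0 - beta) * restart[word]
        scores = updated
    return scores


def induce_polarity(graph, positive_seeds, negative_seeds):
    """Score each word in [0, 1]; values above 0.5 lean positive."""
    pos = propagate(graph, positive_seeds)
    neg = propagate(graph, negative_seeds)
    polarity = {}
    for word in graph:
        total = pos[word] + neg[word]
        polarity[word] = pos[word] / total if total > 0 else 0.5
    return polarity


# Toy similarity graph (made up for illustration, not real embedding data).
TOY_GRAPH = {
    "good": {"great": 0.9, "excellent": 0.8, "bad": 0.1},
    "great": {"good": 0.9, "excellent": 0.9},
    "excellent": {"good": 0.8, "great": 0.9},
    "bad": {"awful": 0.9, "terrible": 0.8, "good": 0.1},
    "awful": {"bad": 0.9, "terrible": 0.9},
    "terrible": {"bad": 0.8, "awful": 0.9},
}

if __name__ == "__main__":
    lexicon = induce_polarity(TOY_GRAPH, {"good"}, {"bad"})
    for word, score in sorted(lexicon.items(), key=lambda item: -item[1]):
        print(f"{word:10s} {score:.3f}")
```

In a domain-specific setting the graph would be rebuilt from that domain's corpus, so the same seed words could yield different polarities for the same target word.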
Article
With the rapid growth of social media, sentiment analysis, also called opinion mining, has become one of the most active research areas in natural language processing. Its application is also widespread, from business services to political campaigns. This article gives an introduction to this important area and presents some recent developments.
Book
Sentiment analysis is the computational study of people's opinions, sentiments, emotions, and attitudes. This fascinating problem is increasingly important in business and society. It offers numerous research challenges but promises insight useful to anyone interested in opinion analysis and social media analysis. This book gives a comprehensive introduction to the topic from a primarily natural-language-processing point of view to help readers understand the underlying structure of the problem and the language constructs that are commonly used to express opinions and sentiments. It covers all core areas of sentiment analysis, includes many emerging themes, such as debate analysis, intention mining, and fake-opinion detection, and presents computational methods to analyze and summarize opinions. It will be a valuable resource for researchers and practitioners in natural language processing, computer science, management sciences, and the social sciences.