Article

Methods of sentiment detection towards aspect of economic and social development in Russian sentences


Abstract

The article addresses the task of detecting sentiment towards an aspect of economic and social development in Russian sentences. The aspect whose sentiment is determined can be either explicitly mentioned or implied. The authors investigated the use of neural network classifiers and proposed an algorithm for determining the sentiment towards an aspect based on semantic rules implemented with constituency trees. The sentiment towards an aspect is determined in two stages. At the first stage, aspect terms (explicitly mentioned events or phenomena associated with the aspect) are found in the sentence. At the second stage, the sentiment towards the aspect is calculated as the sentiment towards the aspect term that is most closely associated with the aspect. The paper proposes several methods for finding aspect terms. Performance was assessed on a corpus of 468 sentences extracted from election campaign materials. The best result among neural network classifiers was obtained with the BERT-SPC neural network pretrained on the task of identifying the sentiment towards an explicitly mentioned aspect; its macro F-score was 0.74. The best result for the semantic rule-based algorithm was obtained with the aspect term search method based on semantic similarity; its macro F-score was 0.63. Combining BERT-SPC and the rule-based algorithm into an ensemble gave a macro F-score of 0.79, the best result obtained in this work.
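The two-stage scheme in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the three-dimensional vectors in `TOY_VECTORS`, the function names, and the `term_sentiment` callback are all hypothetical, and cosine similarity over word vectors is assumed here as one plausible reading of "semantic similarity".

```python
import math

# Hypothetical toy word vectors; a real system would load pretrained embeddings.
TOY_VECTORS = {
    "healthcare": [0.9, 0.1, 0.0],
    "hospital": [0.8, 0.2, 0.1],
    "road": [0.1, 0.9, 0.0],
    "election": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def find_aspect_term(tokens, aspect, vectors=TOY_VECTORS):
    """Stage 1: pick the token most similar to the aspect, or None."""
    candidates = [t for t in tokens if t in vectors]
    if not candidates or aspect not in vectors:
        return None
    return max(candidates, key=lambda t: cosine(vectors[t], vectors[aspect]))


def sentiment_towards_aspect(tokens, aspect, term_sentiment):
    """Stage 2: reuse the sentiment towards the closest aspect term.

    `term_sentiment(tokens, term)` is an assumed callback that returns
    a label such as "positive", "negative", or "neutral".
    """
    term = find_aspect_term(tokens, aspect)
    if term is None:
        return "neutral"
    return term_sentiment(tokens, term)
```

For example, with the tokens `["the", "new", "hospital", "improved", "the", "road"]` and the aspect `"healthcare"`, stage 1 selects `"hospital"`, and stage 2 delegates to whatever sentiment routine is passed in as `term_sentiment`.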

Conference Paper
The paper describes the RuSentNE-2023 evaluation devoted to targeted sentiment analysis in Russian news texts. The task is to predict sentiment towards a named entity in a single sentence. The dataset for the RuSentNE-2023 evaluation is based on the Russian news corpus RuSentNE, which has rich sentiment-related annotation. The corpus is annotated with named entities and sentiments towards these entities, along with related effects and emotional states. The evaluation was organized using the CodaLab competition framework. The main evaluation measure was the macro-averaged F-measure over the positive and negative classes. The best result achieved was a macro F-measure of 66% (positive and negative classes). We also tested ChatGPT on the test set from our evaluation and found that the zero-shot answers provided by ChatGPT reached an F-measure of 60%, which corresponds to 4th place in the evaluation. ChatGPT also provided detailed explanations of its conclusions. This result can be considered quite high for a zero-shot application.
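A macro-averaged F-measure restricted to the positive and negative classes can be computed as in the sketch below. This is an illustration only, not the evaluation script of the competition; the label strings and function names are assumptions.

```python
def f1_for_class(gold, pred, label):
    """F1 of a single class, computed from parallel gold and predicted labels."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold, pred, labels=("positive", "negative")):
    """Unweighted mean of per-class F1 over the chosen labels."""
    return sum(f1_for_class(gold, pred, label) for label in labels) / len(labels)
```

Neutral examples still influence the score, because a neutral sentence predicted as positive or negative counts as a false positive for that class; the neutral class simply has no F1 term of its own in the average.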
Article
The existing systems for accurate sentiment analysis are mainly based on statistical and mathematical principles. However, more promising are the works that are devoted to the study of the linguistic features of the evaluation expression. The results of this formalization can be applied both in the field of affective computing for further improvement of automatic systems and for linguistics and related sciences. The novelty of this study lies mainly in the development of an algorithm based on the identified linguistic rules. In addition, the research material is political discourse, which has not yet been studied enough by specialists of affective computing. The relevance of this work is justified by the growing need for categorization of information published on the Internet. The purpose of the study is to develop a system for machine sentiment analysis of English-language political texts, as well as to identify aspects and their distribution for subsequent use in enhancement. The article discusses the linguistic features of sentiment analysis and suggests a classification of linguistic units with sentiment potential in relation to levels of language structure. The results of an experiment on testing the operation of the sentiment analysis system, conducted on 300 news articles and user comments taken from reddit.com/r/politics, are also presented. The accuracy of the system is 92%. In addition, the selected 40 comments were manually marked up and tagged; during this process the expert identified 25 aspects. Furthermore, 3 formal patterns were identified in the distribution of aspect terms, which is necessary for creating an automatic system. The first peculiarity is that the aspect terms are repeated in two consecutive sentences. The second is that aspect terms are often the themes of sentences. Finally, the third — a high frequency of distribution of aspect terms at the beginning and end of the text (document) was revealed.
Article
Estimating the semantic similarity between text data is one of the challenging and open research problems in the field of Natural Language Processing (NLP). The versatility of natural language makes it difficult to define rule-based methods for determining semantic similarity measures. To address this issue, various semantic similarity methods have been proposed over the years. This survey article traces the evolution of such methods beginning from traditional NLP techniques such as kernel-based methods to the most recent research work on transformer-based models, categorizing them based on their underlying principles as knowledge-based, corpus-based, deep neural network–based methods, and hybrid methods. Discussing the strengths and weaknesses of each method, this survey provides a comprehensive view of existing systems in place for new researchers to experiment and develop innovative ideas to address the issue of semantic similarity.
Conference Paper
In this paper we present an approach to one of the most popular natural language processing tasks: automatic aspect extraction from product reviews. Our approach is based on clustering of word embeddings, morphological features, information about syntactic dependencies, and word frequencies. We use these features in a well-known machine learning method, the decision tree. A preliminary evaluation of our method on identifying explicit aspects in reviews demonstrates good precision and recall for the cross-domain aspect extraction task.
Article
The domain of Aspect-based Sentiment Analysis, in which aspects are extracted, their sentiments are analyzed, and the evolution of sentiments over time is tracked, is receiving much attention with the increasing feedback of the public and customers on social media. The immense advancements in the field have urged researchers to devise new techniques and approaches, each addressing a different research question, that cope with upcoming issues and complex scenarios of Aspect-based Sentiment Analysis. Therefore, this survey emphasizes the issues and challenges related to the extraction of different aspects and their relevant sentiments, relational mapping between aspects, interactions, dependencies and contextual-semantic relationships between different data objects for improved sentiment accuracy, and prediction of sentiment evolution dynamicity. A rigorous overview of recent progress is given, organized by whether each study contributed towards highlighting and mitigating the issues of Aspect Extraction, Aspect Sentiment Analysis, or Sentiment Evolution. The reported performance of each scrutinized study of Aspect Extraction and Aspect Sentiment Analysis is also given, showing the quantitative evaluation of the proposed approach. Future research directions are proposed and discussed through critical analysis of the presented recent solutions; they will be helpful for researchers and beneficial for improving sentiment classification at the aspect level.
Article
Government services are available online and can be provided through multiple digital channels; clients’ feedback on these services can be submitted and obtained online. Governments invest enormous budgets annually to understand their clients and adapt services to meet their needs. In this paper, a unique dataset is produced that consists of Arabic reviews of government smart apps, domain aspects, and opinion words. The paper describes the approach used to manually annotate the reviews, assign sentiment scores to opinion words, and build the desired lexicons. Furthermore, this paper presents an Arabic Aspect-Based Sentiment Analysis (ABSA) model that combines a lexicon with rule-based models. The proposed model aims to extract aspects from Arabic reviews of smart government applications and classify all corresponding sentiments. This model examines mobile government app reviews from various perspectives to provide insight into the needs and expectations of clients. In addition, it aims to develop techniques, rules, and lexicons for language processing to address a variety of SA challenges. The performance of the proposed approach confirmed that applying rule settings that can handle some challenges in ABSA improves performance significantly. The results reported in the study show an increase in accuracy and F-measure of 6% and 17%, respectively, compared with the baseline.
Conference Paper
The paper presents a free and open-source toolkit whose aim is to quickly deploy web services handling distributed vector models of semantics. It fills the gap between training such models (many tools are already available for this) and dissemination of the results to the general public. Our toolkit, WebVectors, provides all the necessary routines for organizing online access to querying trained models via a modern web interface. We also describe two demo installations of the toolkit, featuring several efficient models for English, Russian and Norwegian.
Article
The paper presents a comparison of two modern popular approaches to the analysis of discussions in the media and the way that the media shape the public's attention to certain issues. The author describes the origin and evolution of these concepts; special attention is paid to the prerequisites for their occurrence and their initial disciplinary affiliation. The major content differences, such as different requirements for the discussions, are considered in the article. The agenda-setting theory is largely focused on the study of media reports concerning issues where personal experience is lacking among the population (given that the empirical object exists in reality). The constructivist approach to social problems does not refer to lacking experience; it mostly implies a connection between a constructed problem and reality. Besides, the agenda-setting theory puts aside the influence of the actors of media discussions, while for the constructivist approach it is one of the key issues under examination.
Conference Paper
The inherent nature of social media content poses serious challenges to practical applications of sentiment analysis. We present VADER, a simple rule-based model for general sentiment analysis, and compare its effectiveness to eleven typical state-of-practice benchmarks including LIWC, ANEW, the General Inquirer, SentiWordNet, and machine learning oriented techniques relying on Naive Bayes, Maximum Entropy, and Support Vector Machine (SVM) algorithms. Using a combination of qualitative and quantitative methods, we first construct and empirically validate a gold-standard list of lexical features (along with their associated sentiment intensity measures) which are specifically attuned to sentiment in microblog-like contexts. We then combine these lexical features with consideration for five general rules that embody grammatical and syntactical conventions for expressing and emphasizing sentiment intensity. Interestingly, using our parsimonious rule-based model to assess the sentiment of tweets, we find that VADER outperforms individual human raters (F1 Classification Accuracy = 0.96 and 0.84, respectively), and generalizes more favorably across contexts than any of our benchmarks.
Conference Paper
pymorphy2 is a morphological analyzer and generator for the Russian and Ukrainian languages. It uses large, efficiently encoded lexicons built from OpenCorpora and LanguageTool data. A set of linguistically motivated rules is developed to enable morphological analysis and generation of out-of-vocabulary words observed in real-world documents. For Russian, pymorphy2 provides state-of-the-art morphological analysis quality. The analyzer is implemented in the Python programming language with optional C++ extensions. Emphasis is put on ease of use, documentation and extensibility. The package is distributed under a permissive open-source license, encouraging its use in both academic and commercial settings.
Article
The paper compares the performance of various methods of automatic implicit aspect detection in publicism sentences in Russian. The task of implicit aspect detection is an auxiliary task in aspect-oriented sentiment analysis. The experiments were conducted on a corpus of sentences extracted from political campaign materials. The best results, with F1-measure reaching 0.84, were obtained using the Navec embeddings and classifiers based on the support vector machine method. Fairly high results, with F1-measure reaching 0.77, were obtained using the bag-of-words model and the naive Bayesian classifier. Other methods showed lower performance. The experiments also revealed that the detection quality can differ significantly between aspects. The detection quality is highest for aspects associated with characteristic marker words, for example, “health care” and “holding elections”. More general aspects, such as “quality of governance”, are detected with the worst quality.
Article
The article is devoted to the task of sentiment detection in Russian sentences, where sentiment is understood as the author’s attitude to the sentence topic expressed through features of linguistic expression. Today most studies on this subject use texts of colloquial style, which limits the applicability of their results to other styles of speech, particularly to publicism. To fill the gap, the authors developed a novel sentiment detection algorithm oriented to publicism sentences. The algorithm recursively applies appropriate rules to sentence parts represented as constituency trees. Most of the rules were proposed by a philology expert, based on knowledge of expression features from Russian philology, and algorithmized using constituency trees generated by the algorithm. A decision tree and a sentiment vocabulary are also used in the work. The article contains the results of evaluating the algorithm on the publicism sentences corpus OpenSentimentCorpus; the F-measure is 0.80. The results of error analysis are also presented.
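Recursive application of rules to a constituency tree can be sketched as follows. This is a toy illustration under loud assumptions: the `Node` class, the English `LEXICON` and `NEGATIONS` sets, and the single negation rule are all hypothetical and do not reproduce the expert rules, decision tree, or sentiment vocabulary described in the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str  # phrase type or part of speech
    word: str = ""  # surface form, set only for leaves
    children: list = field(default_factory=list)


# Hypothetical toy sentiment vocabulary and negation markers.
LEXICON = {"good": 1, "improve": 1, "bad": -1, "crisis": -1}
NEGATIONS = {"not", "no"}


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def sentiment(node):
    """Score a subtree: leaves come from the lexicon, phrases from their parts."""
    if not node.children:
        return LEXICON.get(node.word, 0)
    # Toy rule: a negation leaf directly under a phrase flips that phrase.
    negated = any(c.word in NEGATIONS for c in node.children if not c.children)
    total = sum(sentiment(c) for c in node.children)
    return -sign(total) if negated else sign(total)
```

With this sketch, the phrase "not bad" built as `Node("AP", children=[Node("PART", "not"), Node("ADJ", "bad")])` is scored by first evaluating its leaves and then applying the negation rule at the phrase level, which is the sense in which the rules are applied recursively.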
Article
As an important fine-grained sentiment analysis problem, aspect-based sentiment analysis (ABSA), aiming to analyze and understand people's opinions at the aspect level, has been attracting considerable interest in the last decade. To handle ABSA in different scenarios, various tasks are introduced for analyzing different sentiment elements and their relations, including the aspect term, aspect category, opinion term, and sentiment polarity. Unlike early ABSA works focusing on a single sentiment element, many compound ABSA tasks involving multiple elements have been studied in recent years for capturing more complete aspect-level sentiment information. However, a systematic review of various ABSA tasks and their corresponding solutions is still lacking, which we aim to fill in this survey. More specifically, we provide a new taxonomy for ABSA which organizes existing studies from the axes of concerned sentiment elements, with an emphasis on recent advances of compound ABSA tasks. From the perspective of solutions, we summarize the utilization of pre-trained language models for ABSA, which improved the performance of ABSA to a new stage. Besides, techniques for building more practical ABSA systems in cross-domain/lingual scenarios are discussed. Finally, we review some emerging topics and discuss some open challenges to outlook potential future directions of ABSA.
Article
Aspect-Based Sentiment Analysis (ABSA) is a popular scheme that looks for the prediction of the sentiment of positive characteristics in text. The sentiment of text sequences has been analyzed by deep neural networks with noteworthy results. However, these models also have some problems, such as the limitations of past-training word embeddings and the lack of communication between the context and the particular characteristic in the attention scheme. The main aim of this work is to develop a novel ABSA approach covering both explicit and implicit aspects using demonetization dataset reviews from India. Initially, online tweets are pre-processed by stop word removal, tokenization, lower case conversion, and stemming. Next, the explicit aspects are extracted, as they are simple to extract from the sentence, and the polarity score is computed. A machine learning algorithm termed Neural Network (NN) is trained on data regarding the implicit aspects and then helps to classify the testing data with the exact polarity score. Optimal feature selection is performed using Self Adaptive Beetle Swarm Optimization (SA-BSO). These optimal features are given to a deep structured architecture called a Recurrent Neural Network (RNN) with hidden neuron optimization by SA-BSO, which categorizes the demonetization reviews as positive, negative, or neutral. According to the findings, the accuracy of the proposed SA-BSO-RNN is 4.67%, 6.56%, 3.54%, and 7.12% higher than that of PSO-RNN, FF-RNN, CSA-RNN, and BSO-RNN, respectively, in 3-fold analysis for dataset 1. The results show that the designed ABSA, covering both explicit and implicit aspects on the demonetization data, provides improved performance across diverse performance metrics.
Conference Paper
This study presents an approach to aspect-based sentiment analysis in which a named entity of a certain category is considered as an aspect. Such a task formulation is novel and opens up the opportunity to determine writers' attitudes to organizations and people considered in texts. This task required a dataset of Russian-language sentences in which sentiment with respect to certain named entities is labeled, which we collected using a crowdsourcing platform. Sentiment determination is based on a deep neural network with an attention mechanism and the ELMo language model for word vector representation. The proposed model is validated on available data for a similar task. The resulting performance (by the f1-micro metric) on the collected dataset is 0.72, which is the new state of the art for the Russian language.
Chapter
Targeted sentiment classification aims at determining the sentimental tendency towards specific targets. Most of the previous approaches model context and target words with RNN and attention. However, RNNs are difficult to parallelize and truncated backpropagation through time brings difficulty in remembering long-term patterns. To address this issue, this paper proposes an Attentional Encoder Network (AEN) which eschews recurrence and employs attention based encoders for the modeling between context and target. We raise the label unreliability issue and introduce label smoothing regularization. We also apply pre-trained BERT to this task and obtain new state-of-the-art results. Experiments and analysis demonstrate the effectiveness and lightweight of our model.
Conference Paper
Aspect-level sentiment classification aims at identifying the sentiment polarity of a specific target in its context. Previous approaches have realized the importance of targets in sentiment classification and developed various methods with the goal of precisely modeling their contexts via generating target-specific representations. However, these studies always ignore the separate modeling of targets. In this paper, we argue that both targets and contexts deserve special treatment and need their own representations, learned via interactive learning. We then propose interactive attention networks (IAN) to interactively learn attentions in the contexts and targets, and to generate the representations for targets and contexts separately. With this design, the IAN model can represent a target and its collocative context well, which is helpful for sentiment classification. Experimental results on SemEval 2014 datasets demonstrate the effectiveness of our model.
Article
With the rapid growth of social media, sentiment analysis, also called opinion mining, has become one of the most active research areas in natural language processing. Its application is also widespread, from business services to political campaigns. This article gives an introduction to this important area and presents some recent developments.
A. Naumov et al., "Neural-network method for determining text author’s sentiment to an aspect specified by the named entity".
Muljono, B. Harjo, and R. Abdullah, "Aspect-based sentiment analysis for financial review with implicit aspect and opinion using semantic similarity and hybrid approach", International Journal of Intelligent Engineering & Systems, vol. 17, no. 5, pp. 646-658, 2024. doi: 10.22266/ijies2024.1031.49.
M. A. Pil'gun, "Rechevye osobennosti politicheskoj kommunikacii", Proceedings of Kazan University. Humanities Sciences Series, vol. 152, no. 2, pp. 236-246, 2010, in Russian.
A. Y. Poletaev, I. V. Paramonov, and E. I. Boychuk, "Algorithm of constituency tree from dependency tree construction for a Russian-language sentence", Informatics and Automation, vol. 22, no. 6, pp. 1323-1353, 2023, in Russian. doi: 10.15622/ia.22.6.3.