Figure - uploaded by Nicolay Rusnachenko
The most negative attitudes found in the 2017 news corpus

Source publication
Conference Paper
Texts can convey several types of inter-related information concerning opinions and attitudes. Such information includes the author’s attitude towards mentioned entities, attitudes of the entities towards each other, positive and negative effects on the entities in the described situations. In this paper, we described the lexicon RuSentiFrames for...

Contexts in source publication

Context 1
... also consider the frame entry polarity as inverted, when it was used with negation. Table 5 presents the most negative attitudes found in the corpus. Table 6 shows the most positive attitudes from the same corpus. ...

Citations

... This work investigates language models for sentiment relation extraction that are pretrained on a large corpus of sentiment relations annotated automatically via distant supervision. The approach relies on the RuSentiFrames lexicon [6], which describes sentiment relations between the arguments of Russian predicate words. Thus, the contribution of this work is as follows: ...
... Building a cross-lingual sentence encoder from texts in one hundred different languages [18] became one of the applications of the RoBERTa model. This model is called XLM-R. SpanBERT [19] is a modification of BERT oriented toward the relation extraction task [20]; it changes the algorithm used to mask parts of the text sequence. ...
... The main idea of the approach is as follows. The RuSentiFrames sentiment lexicon [6] is used to automatically annotate sentiment relations in the headlines of a large unlabeled news collection. Relations are extracted from headlines because headlines are usually shorter and contain fewer named entities. ...
Article
Large texts can convey various forms of sentiment information, including the author’s position, positive or negative effects of some events, and attitudes of mentioned entities towards each other. In this paper, we experiment with BERT-based language models for extracting sentiment attitudes between named entities. Given a mass media article and a list of mentioned named entities, the task is to extract positive or negative attitudes between them. The efficiency of language model methods depends on the amount of training data. To enrich the training data, we adopt a distant supervision method, which provides automatic annotation of unlabeled texts using an additional lexical resource. The proposed approach is subdivided into two stages: (1) sentiment pairs list completion (PAIR-BASED), (2) document annotation using PAIR-BASED and FRAME-BASED factors. Applied to a large news collection, the method generates the automatically annotated RuAttitudes2017 collection. We evaluate the approach on RuSentRel-1.0, which consists of mass media articles written in Russian. Adopting RuAttitudes2017 in the training process results in a 10-13% quality improvement by F1-measure over supervised learning and a 25% improvement over the top neural network based model results.
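As a rough illustration of what a FRAME-BASED factor could look like, the sketch below labels a (source, predicate, target) triple using a toy frame lexicon. The lexicon contents, the `Attitude` type, and the `annotate` function are hypothetical and are not taken from the paper; the real RuSentiFrames entries and the published annotation pipeline are richer than this. The polarity flip under negation follows the rule quoted in the context excerpt above.

```python
from typing import NamedTuple, Optional

# Hypothetical toy lexicon: predicate -> polarity of the first argument
# towards the second. This is an assumption made for illustration only.
FRAME_POLARITY = {
    "support": "pos",
    "praise": "pos",
    "criticize": "neg",
    "accuse": "neg",
}


class Attitude(NamedTuple):
    source: str
    target: str
    label: str  # "pos" or "neg"


def invert(label: str) -> str:
    """Flip a polarity label."""
    return "neg" if label == "pos" else "pos"


def annotate(source: str, predicate: str, target: str,
             negated: bool = False) -> Optional[Attitude]:
    """Label the attitude of `source` towards `target`, or return None
    when the predicate is not covered by the lexicon."""
    label = FRAME_POLARITY.get(predicate)
    if label is None:
        return None
    if negated:
        # A frame entry used with negation is treated as inverted.
        label = invert(label)
    return Attitude(source, target, label)
```

Returning `None` for uncovered predicates keeps the automatic annotation conservative: a headline contributes a training example only when the lexicon actually licenses a label.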
... Of course, multilingual embeddings as well as frame-based approaches have proved to be effective in many NLP tasks for Russian other than SRL. For example, a recent study applied a frame-based approach for predicting sentiment attributes towards named entities in political news (Rusnachenko et al., 2019; Loukachevitch and Rusnachenko, 2020) and more recently, multilingual embeddings have been used to achieve state-of-the-art performance on named entity recognition and classification (Miftahutdinov et al., 2020). ...
Conference Paper
This work is devoted to the semantic role labeling (SRL) task in Russian. We investigate the role of transfer learning strategies between the English FrameNet and Russian FrameBank corpora. We perform experiments with embeddings obtained from various types of multilingual language models, including BERT, XLM-R, MUSE, and LASER. For evaluation, we use a Russian FrameBank dataset. As source data for transfer learning, we experimented with the full version of FrameNet and with a reduced dataset containing a smaller number of semantic roles identical to those of FrameBank. Evaluation results demonstrate that BERT embeddings show the best transfer capabilities. The model pretrained on the reduced English SRL data and fine-tuned on the Russian SRL data shows a macro-averaged F1-measure of 79.8%, which is above our baseline of 78.4%.
Chapter
When addressing the challenge of sentiment analysis, it is crucial to consider not only the polarity of certain words but also the polarity between them, particularly between the arguments of a predicate. The RuSentiFrames lexicon was created for this purpose. However, training an ML model requires an annotated data collection, and since manual annotation is laborious and expensive, automating the process is preferable. In this paper, we describe a rule-based approach to automatic annotation of semantic roles for the predicates of the RuSentiFrames lexicon. The algorithm searches for entities with certain morpho-syntactic features in an order that depends on the case of the entity; this order is based on the posterior probabilities of the co-occurrence of a certain case and a certain type of predicate argument. The results of the algorithm evaluation, based on several different characteristics, were relatively high. Solutions for problematic cases have been suggested and are expected to be implemented in further research.
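The case-to-role ordering described in this abstract can be pictured with a small sketch that estimates P(role | case) by relative frequency and ranks candidate roles for a given case. This is only an assumed reading of the abstract: the function names, the role labels, and the sample observations are hypothetical, and the chapter's actual estimation procedure may differ.

```python
from collections import Counter, defaultdict


def role_posteriors(observations):
    """Estimate P(role | case) from (case, role) pairs by relative frequency."""
    counts = defaultdict(Counter)
    for case, role in observations:
        counts[case][role] += 1
    return {
        case: {role: n / sum(roles.values()) for role, n in roles.items()}
        for case, roles in counts.items()
    }


def rank_roles(case, posteriors):
    """Return the candidate roles for a case, most probable first."""
    dist = posteriors.get(case, {})
    return sorted(dist, key=dist.get, reverse=True)


# Hypothetical annotated observations: (grammatical case, argument role).
sample = [
    ("nominative", "A0"),
    ("nominative", "A0"),
    ("nominative", "A1"),
    ("accusative", "A1"),
    ("instrumental", "A2"),
]
posteriors = role_posteriors(sample)
```

With these sample counts, `rank_roles("nominative", posteriors)` tries `A0` before `A1`, and an unseen case yields an empty list so that a caller can fall back to another rule.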
Chapter
The paper describes approaches to verifying sentiment frames in the RuSentiFrames lexicon, which describes sentiment connotations related to specific Russian predicates. Two verification approaches were used: 1) analysis of specific sentences from the Russian National Corpus, 2) annotation via the crowdsourcing platform Yandex.Toloka. The idea was to find similarities and differences between the annotations made by the experts in RuSentiFrames and by non-experts from Yandex.Toloka, thus verifying the RuSentiFrames descriptions. The first approach showed that implicit information greatly influences the author’s attitude and that the context plays a crucial role. The second approach showed that the experts’ and non-experts’ annotations mostly agree on relations between the participants in sentiment frames, but the author’s attitudes were estimated differently in some cases.