Ilya V. Paramonov’s research while affiliated with Yaroslavl State University and other places


Publications (3)


Fig 1. Main page of the ProseRhythmDetector application
The number of rhythmic features in the original text of Ch. Bronte's "Villette" and in its translation
The number of rhythmic features in the original text of I. Murdoch's "The Black Prince" and in its translation
Automated Search of Rhythm Figures in a Literary Text for Comparative Analysis of Originals and Translations Based on the Material of the English and Russian Languages
  • Article
  • Full-text available

September 2019 · 163 Reads · 3 Citations

Modeling and Analysis of Information Systems

Nadezhda Stanislavovna Lagutina · Ksenia Vladimirovna Lagutina · Elena Igorevna Boychuk · [...] · Ilya Vyacheslavovich Paramonov

Analyzing the functional equivalence of an original text and its translation in terms of rhythm equivalence is an extremely important task of modern linguistics. The rhythm component is an integral part of functional equivalence, which cannot be achieved without conveying the rhythm figures of the text. To analyze rhythm figures in an original literary text and its translation, the authors developed the ProseRhythmDetector software tool, which finds and visualizes lexical and syntactic figures in English- and Russian-language prose texts: anaphora, epiphora, symploce, anadiplosis, epanalepsis, reduplication, epistrophe, polysyndeton, and aposiopesis. The goal of this work is to present the results of testing ProseRhythmDetector on two works by English authors and their translations into Russian: Ch. Bronte's “Villette” and I. Murdoch's “The Black Prince”. Based on the tool's output, the authors compared rhythm figures in the original text and its translation, both in terms of the rhythm itself and of the figures' contexts. This experiment made it possible to identify how the features of the author’s style are conveyed by the translator, and to detect and explain cases where rhythm figures in the original and translated texts do not match. Using ProseRhythmDetector significantly reduced the amount of work required of expert linguists by automatically detecting lexical and syntactic figures with fairly high precision (from 62 % to 93 %, depending on the rhythm figure).
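As a rough illustration of what automated detection of lexical figures can look like, the sketch below flags anaphora (adjacent sentences that open with the same word) and epiphora (adjacent sentences that close with the same word), and computes precision against a manually annotated list. It is not the ProseRhythmDetector algorithm, which the abstract does not describe. The sentence splitter, the single-word matching rule, and the function names `split_sentences`, `find_repetitions`, and `precision` are all assumptions made for this example.

```python
import re


def split_sentences(text):
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def words(sentence):
    # Lowercased word tokens; \w also matches Cyrillic letters in Python 3.
    return re.findall(r"\w+", sentence.lower())


def find_repetitions(text, position="first"):
    """Return (sentence_index, word) pairs where sentence i and i+1 share
    their first word (anaphora) or their last word (epiphora)."""
    idx = 0 if position == "first" else -1
    sentences = [words(s) for s in split_sentences(text)]
    hits = []
    for i in range(len(sentences) - 1):
        a, b = sentences[i], sentences[i + 1]
        if a and b and a[idx] == b[idx]:
            hits.append((i, a[idx]))
    return hits


def precision(detected, gold):
    # Share of detected figures that also appear in the expert annotation.
    detected, gold = set(detected), set(gold)
    return len(detected & gold) / len(detected) if detected else 0.0
```

Calling `find_repetitions(text, "first")` yields anaphora candidates and `find_repetitions(text, "last")` yields epiphora candidates. A real tool would need proper sentence segmentation and multi-word matching for both languages.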


Russian-Language Thesauri: Automated Construction and Application for Natural Language Processing Tasks

August 2018 · 42 Reads · 3 Citations

Modeling and Analysis of Information Systems

The paper reviews existing Russian-language thesauri in digital form, together with methods for their automatic construction and application. The authors analyzed the main characteristics of open-access thesauri for scientific research and evaluated trends in their development and their effectiveness in solving natural language processing tasks. They studied statistical and linguistic methods of thesaurus construction that automate development and reduce the labor costs of expert linguists. In particular, the authors considered algorithms for extracting keywords and semantic thesaurus relationships of all types, as well as the quality of thesauri generated with these tools. To illustrate the features of various methods for constructing thesaurus relationships, the authors developed a combined method that generates a specialized thesaurus fully automatically from a text corpus in a particular domain and several existing linguistic resources. With the proposed method, experiments were conducted on two Russian-language text corpora from two subject areas: articles about migrants and tweets. The resulting thesauri were evaluated with an integrated assessment, developed in the authors’ previous study, that analyzes various aspects of a thesaurus and the quality of the generation methods. The analysis revealed the main advantages and disadvantages of various approaches to thesaurus construction and to the extraction of semantic relationships of different types, and made it possible to determine directions for future study.


Table 1. Topical classification of the BBCSport corpus
Analysis of Influence of Different Relations Types on the Quality of Thesaurus Application to Text Classification Problems

January 2017 · 76 Reads · 4 Citations

Modeling and Analysis of Information Systems

The main purpose of the article is to analyze how effectively different types of thesaurus relations can be used to solve text classification tasks. The study is based on an automatically generated thesaurus of a subject area that contains three types of relations: synonymous, hierarchical, and associative. To generate the thesaurus, the authors use a hybrid method based on several linguistic and statistical algorithms for extracting semantic relations. The method makes it possible to create a thesaurus with a sufficiently large number of terms and relations among them. The authors consider two problems: topical text classification and sentiment classification of large newspaper articles. To solve them, the authors developed two approaches that complement standard algorithms with a procedure that takes thesaurus relations into account to determine semantic features of texts. The approach to topical classification combines the standard unsupervised BM25 algorithm with a procedure that takes into account the synonymous and hierarchical relations of the subject-area thesaurus. The approach to sentiment classification consists of two steps. In the first step, a thesaurus is created in which the polarity weights of terms are calculated from the term occurrences in the training set or from the weights of related thesaurus terms. In the second step, the thesaurus is used to compute the features of words from texts and to classify the texts with the SVM or Naive Bayes algorithm. In experiments with the BBCSport, Reuters, and PubMed text corpora and a corpus of articles about American immigrants, the authors varied the types of thesaurus relations involved in the classification and the degree of their use. The results of the experiments make it possible to evaluate how efficiently thesaurus relations can be applied to the classification of raw texts and to determine under what conditions particular relations have a greater or lesser effect. In particular, the most useful thesaurus relations are synonymous and hierarchical, as they provide better classification quality.
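One way to picture the topical approach is BM25 scoring in which the topic terms are first expanded with related thesaurus terms. The sketch below is an assumption-laden illustration, not the authors' procedure. It uses a common Okapi BM25 variant with default-style parameters `k1=1.5` and `b=0.75`, and a hypothetical `thesaurus` dictionary that maps a term to its synonyms and narrower terms. The function names `bm25_scores` and `expand` are invented for the example.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document against query_terms with Okapi BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: number of documents containing each term.
    df = Counter(t for d in docs for t in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in query_terms:
            if tf[t] == 0:
                continue
            idf = math.log((n - df[t] + 0.5) / (df[t] + 0.5) + 1.0)
            norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return scores


def expand(terms, thesaurus):
    """Append related terms (synonyms, narrower terms) from a hypothetical
    thesaurus dict; the original terms stay first and duplicates are skipped."""
    expanded = list(terms)
    for t in terms:
        for related in thesaurus.get(t, []):
            if related not in expanded:
                expanded.append(related)
    return expanded


# Toy data (invented for illustration): two documents, one thesaurus entry.
docs = [["football", "match", "goal"], ["tennis", "serve", "ace"]]
thesaurus = {"soccer": ["football"]}
plain = bm25_scores(["soccer"], docs)
with_thesaurus = bm25_scores(expand(["soccer"], thesaurus), docs)
```

In this toy case the unexpanded query matches neither document, while the expanded query gives the football document a positive score. That is the kind of effect synonymous relations are meant to have.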

Citations (3)


... They look into the detection of anadiplosis, anaphora, polysyndeton 4.15, diacope, epanalepsis, epiphora, epizeuxis, and symploke. For anadiplosis, anaphora, epiphora, they reuse the algorithm from their previous paper [57], which is written in Russian. They do not report any performance metrics of their detection algorithm, as they are not using an annotated dataset. ...

Reference:

Computational Approaches to the Detection of Lesser-Known Rhetorical Figures: A Systematic Survey and Research Challenges
Automated Search of Rhythm Figures in a Literary Text for Comparative Analysis of Originals and Translations Based on the Material of the English and Russian Languages

Modeling and Analysis of Information Systems

... Keyword generation for Russian-language scientific texts using the mT5 model. A. V. Glazkova 1,3 , D. A. Morozov 2,3 , M. S. Vorobeva 1 , A. A. Stupnikov 1 DOI: 10.18255/1818-1015-2023-4-418-428 1 Tyumen State University, 6 Volodarskogo St., ...

Russian-Language Thesauri: Automated Construction and Application for Natural Language Processing Tasks
  • Citing Article
  • August 2018

Modeling and Analysis of Information Systems

... It is of interest to study texts tonality classification methods based on machine learning and tonality dictionaries using support vectors and Bayesian classifier [14,15]. Various statistical characteristics are used: TF-IDF, mutual information, Gini coefficient, Kullback-Leibler distance, χ² criterion, etc. [16]. Associative connectivity measures and their effectiveness are investigated when calculating the strength of connectivity of the word combinations components within bigrams and trigrams. ...

Analysis of Influence of Different Relations Types on the Quality of Thesaurus Application to Text Classification Problems

Modeling and Analysis of Information Systems