Ivan Smirnov

Russian Academy of Sciences | RAS · Artificial Intelligence Research Institute, FRC CSC RAS

Doctor of Philosophy

About

62
Publications
6,067
Reads
418
Citations

Publications (62)
Preprint
Full-text available
This paper compares the effectiveness of traditional machine learning methods, encoder-based models, and large language models (LLMs) on the task of detecting depression and anxiety. Five datasets were considered, each differing in format and the method used to define the target pathology class. We tested AutoML models based on linguistic features,...
Preprint
Full-text available
We propose selective debiasing -- an inference-time safety mechanism that aims to increase the overall quality of models in terms of prediction performance and fairness in the situation when re-training a model is prohibitive. The method is inspired by selective prediction, where some predictions that are considered low quality are discarded at inf...
Article
Full-text available
Social media has become an almost unlimited resource for studying social processes. Seasonality is a phenomenon that significantly affects many physical and mental states. Modeling collective emotional seasonal changes is a challenging task for the technical, social, and humanities sciences. This is due to the laboriousness and complexity of obtain...
Conference Paper
Full-text available
Coreference resolution is the task of identifying and grouping mentions referring to the same real-world entity. Previous neural models have mainly focused on learning span representations and pairwise scores for coreference decisions. However, current methods do not explicitly capture the referential choice in the hierarchical discourse, an import...
Preprint
Full-text available
Coreference resolution is the task of identifying and grouping mentions referring to the same real-world entity. Previous neural models have mainly focused on learning span representations and pairwise scores for coreference decisions. However, current methods do not explicitly capture the referential choice in the hierarchical discourse, an import...
Article
Full-text available
Foreign direct investment (FDI) can have a significant impact on economic development in developing economies like Russia. FDI can bring in capital, technology, and management expertise that can stimulate economic growth, increase employment, and improve productivity. In the case of Russia, FDI has played a vital role in the country’s economic deve...
Conference Paper
Full-text available
We show that using the rhetorical structure automatically generated by the discourse parser is beneficial for paragraph-level argument mining in Russian. First, we improve the structure awareness of the current RST discourse parser for Russian by employing the recent top-down approach for unlabeled tree construction on a paragraph level. Then we de...
Article
The paper addresses the problem of causal attribution of emotions in social network texts. As a solution, it proposes using artificial intelligence methods for mining large text corpora. The presented method of causative-emotive analysis is based on TITANIS, an automatic text analysis tool created at the Federal Research Center Comput...
Chapter
Health preservation is one of the urgent priorities for any group of people. There is a lot of research currently underway on diagnosing and monitoring health using data from social media. In this paper, the problem of the automatic classification of users of the Russian-language social network VK.com in terms of whether they lead a healthy lifesty...
Article
Full-text available
The study focuses on detecting depression by processing and classifying short essays written by 316 volunteers. A set of 93 essays was provided by two different teams of psychologists who asked patients with clinically confirmed depression to write short essays on a neutral topic. The other 223 essays on the same topic were writ...
Chapter
This paper introduces TITANIS, a new social media text analysis tool specifically designed to assess the reaction of social media users to global events from a psycho-emotional point of view. The tool offers an expanded set of text parameters and natural language processing methods suitable for working with texts from social media. In addition to t...
Chapter
The rise of social media platforms and a growing interest in applying machine learning methods to ever-increasing amounts of data create an opportunity to use data from social media to predict lifestyle choices and behaviors. In this study, we examine the possibility of using machine learning methods to classify users of the Russian-speaking socia...
Chapter
This paper examines the problem of insufficient flexibility in modern cognitive assistants for choosing cars. We believe that the inaccuracy and lack of content information in the synthesised responses negatively affect consumer awareness and purchasing power. The study’s main task is to create a personalised interactive system to respond to the us...
Chapter
This work presents the first fully-fledged discourse parser for Russian based on the Rhetorical Structure Theory of Mann and Thompson (1988). For the segmentation, discourse tree construction, and discourse relation classification we employ deep learning models. With the help of multiple word embedding techniques, the new state of the art for disco...
Article
Full-text available
This paper considers the best-known tools for linguistic and statistical analysis of text corpora and introduces the RSA machine, a novel text analysis tool for socio-humanitarian research. The architecture of the RSA machine and the tools used to develop it are described. The results of a pilot study of texts using the RSA machine are presente...
Article
Full-text available
This paper conceptually examines applications of artificial intelligence in socio-humanitarian studies. The global trend of recent decades has been to draw on social media data for research purposes, which makes it possible to solve tasks of monitoring, analysis, forecasting, and management in relation to parameters of network communication...
Chapter
The paper explores the use of the AQJSM method, which is built upon combining the JSM and AQ methods, to identify cause-effect relationships between psychological characteristics and text parameters produced by individuals with these characteristics. The study included two groups of subjects: the “depression” group (patients with clinical depressio...
Chapter
Public healthcare is a big priority for society. The ability to diagnose and monitor various aspects of public health through social networks is one of the new problems that are of interest to researchers. In this paper, we consider the task of automatically classifying people who lead a healthy lifestyle and users who do not lead a healthy lifesty...
Chapter
The problem of early depression detection is one of the most important in the field of psychology. Social network analysis is widely applied to address this problem. In this paper, we consider the task of automatically detecting signs of depression from the textual messages and profile information of users of the Russian social network VKontakte. We describe the...
Article
Full-text available
Aim. Presentation of regression models of the subjectivity of network communities based on automatically determined indicators of the content relational situational analysis (RSA). Methodology. To develop these models 64 network communities of various thematic focus from the open segment of social networks (Facebook, VKontakte, Odnoklassniki, Pikab...
Chapter
Early detection of the risk of mental disorders is an important task for modern society. A large body of clinical work has shown that the five-factor model of personality traits (Big Five) can predict mental disorders. In this paper, we consider the problem of automatically detecting personality traits from user profiles of the Russian social network VKontakte. We descr...
Conference Paper
Full-text available
We build the first full pipeline for semantic role labelling of Russian texts. The pipeline implements predicate identification, argument extraction, argument classification (labeling), and global scoring via integer linear programming. We train supervised neural network models for argument classification using Russian semantically annotated corpu...
Article
Full-text available
Health management, or in other words health optimization, is coming to the forefront in most of the world’s countries. Almost every person has a certain predisposition to the development of chronic diseases, which is determined by corresponding risk factors. Some of these risk factors can be managed in order to minimize the risk of disease, which p...
Conference Paper
Full-text available
The multifactorial nature of human health and the need to personalize the approach to each person mean that full implementation of healthy lifestyle (HLS) technologies is possible only on the basis of artificial intelligence technologies, widely implemented in preventive medicine via modern Internet technologies. Modern computer systems...
Conference Paper
The paper addresses the task of information extraction from scientific literature with machine learning methods. In particular, the tasks of definition and result extraction from scientific publications in Russian are considered. We note that annotating scientific texts to create a training dataset is a very labor-intensive and expensive pro...
Article
This paper studies the contribution of semantic and semantic–syntactic analysis to the effectiveness of solving applied text-processing tasks: question answering and extraction of definitions from scientific publications. Methods for solving these problems, which, in addition to morphological and syntactic structures, also use semantic structures o...
Conference Paper
Full-text available
The paper presents ParaPlag, a large Russian-language text dataset for evaluating and comparing quality metrics of different plagiarism detection approaches that deal with big data. The PlagEvalRus-2017 competition, which aimed to evaluate plagiarism detection methods, uses ParaPlag as its main dataset for the source retrieval and text alignment tasks. The ParaPl...
Conference Paper
The paper presents an overview of Exactus Like – a plagiarism detection system. Deep parsing for text alignment helps the system to find moderate forms of disguised plagiarism. The features of the system and its advantages are discussed. We describe the architecture of the system and present its performance.
Chapter
The paper presents “Exactus Expert”, a search and analytical engine. The system aims to provide comprehensive tools for analysis of large-scale collections of scientific documents for experts and researchers. The system addresses many tasks, among them full-text search, search for similar documents, automatic quality assessment, term and d...
Article
The paper presents a system for intelligent analysis of clinical information. The authors describe methods implemented in the system for clinical information retrieval, intelligent diagnostics of chronic diseases, estimating the importance of patient features, and detecting hidden dependencies between features. Results of the experimental evaluati...
Article
This paper provides an overview of the methods and systems that are applied for scientometric analysis of scientific publications. Methods to identify promising research directions are described. The results of an experimental study aimed at determining the directions of scientific research within the subject area of “regenerative medicine” by usin...
Article
Full-text available
This review is dedicated to the problem of remote monitoring of health status. Existing approaches to organizing outdoor monitoring of a patient using telemedicine technologies are reviewed and analyzed. A new approach to patient risk management that meets the requirements of a pediatric hospital is presented.
Conference Paper
Full-text available
To estimate patients' risks and make clinical decisions, evidence-based medicine (EBM) relies upon the results of reproducible trials and experiments supported by accurate mathematical methods. Experimental and clinical evidence is crucial, but laboratory testing and especially clinical trials are expensive and time-consuming. On the other hand, a n...
Article
Full-text available
This paper presents a multifunctional support system for evidence-based medical technology that permits medical publications and reference information to be searched for in external information resources, makes it possible to formulate questions in medically accepted formats and to perform a critical analysis of publications according to the princi...
Article
Full-text available
The paper introduces two methods for semantic role labeling of Russian texts. The first method is based on a semantic dictionary that contains information about predicates, roles, and the syntaxeme features that correspond to the roles. It also uses heuristics and integer linear programming to find the best joint assignment of roles. The second method is...
Chapter
Full-text available
Research and development (R&D) involves not only researchers but also many other specialists from different areas. All of them solve a variety of tasks that require comprehensive information and analytical support. This chapter discusses the major tasks arising in R&D: study of the state of the art in a given research area, prospects assessment of...
Conference Paper
Full-text available
The paper presents technologies for semantic analysis of scientific publications. We mainly focus on the stages of natural language processing and on the analysis results. The results of experiments on examining scientific publications are presented.
Conference Paper
Full-text available
The paper proposes a method of social tension detection and intention recognition based on natural language analysis of social networks, forums, blogs and news comments. Our approach combines natural language syntax and semantics analysis with statistical processing to identify possible indicators of social tension. The universal components of our...
Article
Full-text available
A relational-situational method for analysis of natural language texts is outlined based on the theory of communicative grammar of the Russian language and the theory of heterogeneous semantic networks. It is shown that the relational-situational method can be used for precise search of documents in local and global networks and electronic libraries.
Conference Paper
Full-text available
The paper presents an approach to text representation for search tasks. Heterogeneous semantic networks are defined and their construction from natural language is described. The successful application of semantic networks in an intelligent search engine is shown.
Conference Paper
Full-text available
The paper presents methods for semantically relevant search. The authors mainly focus on using a linguistic approach to improve search precision in search engines. The effectiveness of the presented approach is demonstrated by experiments with a search engine.
Conference Paper
Full-text available
The paper presents methods and a software implementation for semantically relevant search. We mainly focus on using linguistic knowledge to improve semantic relevance in search engines. We state the effectiveness of information retrieval systems based on semantic search involving refined linguistic processing tools. Advantages of seman...
