Vladimir Ivanov

Innopolis University · Faculty of Computer Science and Engineering

PhD

About

87
Publications
16,151
Reads
354
Citations
Introduction
Vladimir Ivanov is currently an Assistant Professor at the Faculty of Computer Science and Engineering, Innopolis University.
Additional affiliations
January 2012 - December 2014
July 2005 - present
Kazan (Volga Region) Federal University
Position
  • Senior Researcher

Publications (87)
Preprint
Full-text available
The RuNNE Shared Task approaches the problem of nested named entity recognition. The annotation schema is designed in such a way that an entity may partially overlap with or even be nested in another entity. For example, the named entity "The Yermolova Theatre" of type "organization" houses another entity, "Yermolova", of type "person". We adopt the Russi...
Article
Full-text available
In a great deal of theoretical and applied cognitive and neurophysiological research, it is essential to have more vocabularies with concreteness/abstractness ratings. Since creating such dictionaries by interviewing informants is labor-intensive, considerable effort has been made to machine-extrapolate human rankings. The purpose of the article is...
Preprint
Full-text available
Identifying or extracting requirements in textual documents is a tedious and error-prone task that many researchers suggest automating. We manually annotated the PURE dataset and thus created a new one containing both requirements and non-requirements. Using this dataset, we fine-tuned the BERT model and compared the results with several baseline...
Article
Concrete/abstract words are used in a growing number of psychological and neurophysiological studies. For a few languages, large dictionaries have been created manually. This is a very time-consuming and costly process. To generate large high-quality dictionaries of concrete/abstract words automatically, one needs to extrapolate the expert assessmen...
Chapter
Programmers are the most important part of software production, and individual developers are hard to substitute. The essential part of the knowledge-intensive development process is the developer's state of mind. Understanding the mental states of software developers has become a main interest of software production companies since it is the most valua...
Preprint
Full-text available
In this paper, we present NEREL, a Russian dataset for named entity recognition and relation extraction. NEREL is significantly larger than existing Russian datasets: to date it contains 56K annotated named entities and 39K annotated relations. Its important difference from previous datasets is annotation of nested named entities, as well as relati...
Conference Paper
In recent years, the use of biological signals to understand the operations of software engineers has emerged, although with a limited understanding of its successful application. This paper provides primary evidence that biological signals obtained by electroencephalography (EEG) may provide valuable information from the perspective of software en...
Chapter
We showcase an application of information extraction methods, such as named entity recognition (NER) and relation extraction (RE), to a novel corpus consisting of documents issued by a state agency. The main challenges of this corpus are: 1) the annotation scheme differs greatly from the one used for the general domain corpora, and 2) the documen...
Conference Paper
Developers are indeed the most important resource in software production, and individual developers are hard to substitute. The core of the work of the developers of knowledge-intensive systems is in their mind, and there is now a growing interest in understanding how to detect and model the state of their mind. Such analysis would enable to determ...
Preprint
Full-text available
We showcase an application of information extraction methods, such as named entity recognition (NER) and relation extraction (RE), to a novel corpus consisting of documents issued by a state agency. The main challenges of this corpus are: 1) the annotation scheme differs greatly from the one used for the general domain corpora, and 2) the documen...
Chapter
The study explores the problem of assessing complexity of Russian educational texts. In this paper, we focus on measuring conceptual complexity which is rarely selected as a research question and propose to use a thesaurus (or a linguistic ontology) to this end. We also compiled an original corpus of school textbooks on Social Studies, History used...
Chapter
The article presents a new method implemented by the authors to generate dictionaries of concrete/abstract words for Russian. The method, based on pretrained word embeddings, computes a concreteness ranking defined as a function of similarity between word vectors and the distance between a word in question and the ‘seed’ of concrete/abstract words. Imple...
Preprint
Full-text available
The paper presents the solution of team "Inno" to SemEval-2020 Task 11, "Detection of propaganda techniques in news articles". The goal of the second subtask is to classify textual segments that correspond to one of the 18 given propaganda techniques in a dataset of news articles. We tested a pure Transformer-based model with an optimized learning sche...
Conference Paper
Given the emerging importance of individual biophysical data for understanding software development activities, an internship was organized to provide students with first-hand experience in collecting such data. The specific goal of the internship was to offer students the possibility to collect, analyse, understand, and draw conclusions...
Conference Paper
In this paper, we analyse a survey on the usage of software systems in the software development process. The survey was conducted among the students of Innopolis University. Based on the results of the survey, the following conclusions were made: (1) Windows, macOS and Linux-based operating systems have almost equal share of usage among...
Preprint
Full-text available
In this paper we present a corpus of Russian strategic planning documents, RuREBus. This project is grounded in both language technology and e-government perspectives. Not only are new language sources and tools being developed, but also their applications to e-government research. We demonstrate the pipeline for creating a text corpus from scratc...
Article
Creation of dictionaries of abstract and concrete words is a well-known task. Such dictionaries are important in several applications of text analysis and computational linguistics. Usually, the process of assembling concreteness scores for words begins with a lot of manual work. However, the process can be automated significantly using informat...
Conference Paper
The paper is devoted to the development of data collectors for Windows and macOS. The purpose of these plugins is to collect process metrics from the user’s device and send them to the back-end for further processing. The overall open source framework is aimed at energy efficiency analysis of software products under development. The developme...
Conference Paper
Full-text available
The increasing amount of data that organizations worldwide have at their disposal leads to the need to structure, organize and present the information obtained from it. That is because, in today’s rapidly changing business environment, managers and executives need to be able to gain crucial insights about the ongoing project in as little time as possible....
Conference Paper
Being one of the youngest fields of human endeavour, software development has absorbed features of other, older fields, especially engineering, mathematics, and economics. However, since software is a product of creation and is based on the systematic discipline and technical excellence of its participants (the developers), there could also be very...
Conference Paper
Software is mostly a “people” business: the single most important asset of a software company is its developers. Finding an appropriate software developer is a problem that has created the whole area (and business) of IT recruiting, which is mostly an “art” involving a set of practical techniques and approaches. This paper discusses the typical...
Conference Paper
In this study we were able to gather a substantial quantity of detailed responses from a group of individuals and companies that are broadly quite similar to those found in several of the major world centers of technological innovation. As such, our analysis of the results provides some tantalizing hints to organizational and methodological challen...
Book
This book constitutes the refereed proceedings of the 16th IFIP WG 2.13 International Conference on Open Source Systems, OSS 2020, held in Innopolis, Russia, in May 2020. The 12 revised full papers and 8 short papers presented were carefully reviewed and selected from 42 submissions. The papers cover a wide range of topics in the field of free/li...
Conference Paper
Full-text available
In this paper, we present a shared task on core information extraction problems, named entity recognition and relation extraction. In contrast to popular shared tasks on related problems, we try to move away from strictly academic rigor and rather model a business case. As a source for textual data we choose the corpus of Russian strategic document...
Chapter
Full-text available
In this paper we present a corpus of Russian strategic planning documents, RuREBus. This project is grounded in both language technology and e-government perspectives. Not only are new language sources and tools being developed, but also their applications to e-government research. We demonstrate the pipeline for creating a text corpus from scrat...
Conference Paper
Software systems are the enabling technology for the development of sustainable systems. However, such devices consume power both from the client side and from the server side. This scenario poses to software engineering a new challenge that concerns the development of software for sustainable systems i.e. systems that explicitly characterize the r...
Chapter
Formal specification, model checking and model-based testing are recommended techniques for engineering mission-critical systems. At the same time, those techniques struggle to obtain wide adoption due to an inherent learning barrier, i.e. those methods are considered difficult to use. There is also a common difficulty in translating the specifica...
Conference Paper
Full-text available
The use of biological signals to understand software development has become more popular in the last few years but poses new challenges with respect to the overall experimental settings. In this paper we present such challenges and the approach we took to overcome them. We illustrate our approach by evaluating two programming situations: pair progr...
Article
Education policy makers view measuring the readability of academic texts and profiling classroom textbooks as a primary task of education management aimed at sustaining the quality of reading programs. As Russian readability metrics, i.e. “objective” features of texts determining their complexity for readers, are still a research niche, we undertook a comparati...
Conference Paper
This paper describes the application of lean methodology in IT education in the context of an undergraduate course on “Lean Software Development” with full DevOps pragmatics in mind. A strong connection between software development and delivery processes can be built on top of established lean practices. This means that implementation of end-to-end auto...
Article
Developing features based solely on requirement documents and specifications has been a traditional way of building software. This paper provides a different approach by combining notions from Artificial Intelligence (AI), specifically Evolutionary Algorithms (EA), and Complexity Theory. It represents the software to be built (a dashboard) as a Complex System,...
Conference Paper
Despite increasing popularity of developer dashboards, the effectiveness of dashboards is still in question. In order to design a dashboard that is effective and useful for developers, it is important to know (a) what information developers need to see in a dashboard, and (b) how developers want to use a dashboard with that necessary information. T...
Chapter
Full-text available
The authors of the article offer new readability formulas for academic texts which provide a comparatively higher degree of accuracy than other Russian readability formulas. The results achieved are due to using original syntactic, lexical and frequency metrics ignored in previous research on Russian readability. The methods applied by the authors...
Article
Full-text available
Evaluation of software reliability is an important part of the process of developing modern software. Many studies are aimed at improving models for measuring and predicting the reliability of software products. However, little attention is paid to approaches to comparing existing systems in terms of software reliability. Despite the enormous impor...
Conference Paper
Software is mostly, if not entirely, a knowledge artifact. Software best practices are often thought to work because they induce more productive behaviour in software developers. In this paper we deployed a new generation tool, portable multichannel EEG, to obtain direct physical insight into the mental processes of working software developers enga...
Conference Paper
Designing an effective and useful dashboard is expensive, and it would be important to determine if it is possible to elaborate a "generic" useful and effective dashboard, usable in a variety of circumstances. To determine if it is possible to develop such a dashboard and, if so, its structure, we interviewed 67 software engineers from 44 different com...
Article
Full-text available
Software is often produced under significant time constraints. Our idea is to understand the effects of various software development practices on the performance of developers working in stressful environments, and identify the best operating conditions for software developed under stressful conditions collecting data through questionnaires, non-in...
Article
Full-text available
In this paper we explore to what extent text parameters, such as average number of words per sentence, syllables per word, nouns per sentence, frequency of content words, etc. can successfully rank Russian academic texts for different age and grade levels. We provide a brief overview of previous research on readability of Russian texts and describe...
Conference Paper
Although non-invasive software measurement tools have proven their usefulness in software production, their adoption in the software industry is still limited. Reasons for the limited adoption have been studied and analysed recently. In this paper, we propose a new architecture for non-invasive software measurement systems that addresses the pro...
Conference Paper
Full-text available
Analysis of data related to software development helps to increase quality, control and predictability of software development processes and products. However, collecting such data is a complex task. A non-invasive collection of software metrics is one of the most promising approaches to solve the task. In this paper we present an approach which co...
Chapter
The failure rate of software projects is still high. This means that many projects experience crises that managers have to deal with. We believe that human behavior is one of the main reasons that projects fall into crisis and one of the main drivers of the mitigation process. In this paper we are not going to emphasize the importance of a process...
Conference Paper
Although non-invasive software measurement tools have proven their usefulness in software production, their adoption in the software industry is still limited. Reasons for the limited adoption have been studied and analyzed in works like (Coman et al., Proceedings of the 31st International Conference on Software Engineering (ICSE 2009), Van...
Article
Full-text available
Software engineering education and training face obstacles caused by a lack of basic knowledge about the process of program execution. The article is devoted to the development of special tools that help to visualize the process. We analyze existing tools and propose a new approach to stack and heap visualization. The solution is able to overcome maj...
Article
Assessment of software reliability is an inevitable part of the modern software production process. Many works have aimed at better models for measurement and prediction of the reliability of software products. Dozens of approaches have been developed and evaluated so far. However, very few works focus on approaches to compare existing systems with respect to reliability...
Conference Paper
It is a cliche to say that there is a gap between research and practice. As interest in and the importance of the practical impact of research have been growing, the gap between research and practice is expected to be narrowing. However, our study reveals that there still seems to be a wide gap. We survey software engineers about what they care about wh...
Technical Report
Full-text available
Analysis of data related to software development helps to increase quality, control and predictability of software development processes and products. However, collecting such data is a complex task. A non-invasive collection of software metrics is one of the most promising approaches to solve the task. In this paper we present an approach whic...
Conference Paper
Full-text available
The study described in the paper deals with the extraction of relations between organizations from the Russian Wikipedia. We experiment with two data sources for supervised methods – manual annotations made from scratch and relations from infoboxes with subsequent sentence matching, as well as different feature sets and learning methods – SVM, CRF,...
Conference Paper
Full-text available
This paper reports on the experience of the authors in quantitatively assessing the development process of an Eastern European software SME (Small or Medium Size Enterprise). The company produces a very successful workflow and documentation tool, employs about 30 full-time developers and has a customer base of about 40 major organizations. It has...
Article
Full-text available
Automatic event extraction from text is an important step in knowledge acquisition and knowledge base population. Manual work in the development of an extraction system is indispensable either in corpus annotation or in vocabularies and pattern creation for a knowledge-based system. Recent works have been focused on adaptation of existing system (for extr...
Conference Paper
Part-of-speech (POS) tagging is an essential step in many text processing applications. Quite a few works focus on solving this task for Russian; their results are not directly comparable due to the lack of shared datasets and tools. We propose a POS tagging evaluation framework for Russian that comprises existing third-party resources available fo...
Book
Full-text available
The monograph is dedicated to the relevant problem of processing scientific and technological information in the high-tech innovation sphere. The book reveals new trends in creating intelligent systems (ontology, semantic text processing, forecasting), demonstrated on the example of the promising subject area of nanomaterials and nan...