Vadim Strijov

Applied Mathematics and Functional Data Analysis

About

68
Publications
13,252
Reads
540
Citations

Publications (68)
Article
Full-text available
Over the past few decades, a variety of significant scientific breakthroughs have been achieved in the fields of brain encoding and decoding using functional magnetic resonance imaging (fMRI). Many studies have been conducted on the topic of human brain reaction to visual stimuli. However, the relationship between fMRI images and video sequence...
Preprint
Full-text available
This paper investigates the problem of regression model generation. A model is a superposition of primitive functions. The model structure is described by a weighted colored graph. Each graph vertex corresponds to some primitive function. An edge assigns a superposition of two functions. The weight of an edge equals the probability of superposition...
Preprint
Full-text available
In this paper, we develop a decision support system for hierarchical text classification. We consider text collections with a fixed hierarchical structure of topics given by experts in the form of a tree. The system sorts the topics by relevance to a given document. The experts choose one of the most relevant topics to finish the classification...
Preprint
Full-text available
Neural network structures have a critical impact on the accuracy and stability of forecasting. Neural architecture search procedures help design an optimal neural network according to some loss function, which represents a set of quality criteria. This paper investigates the problem of neural network structure optimization. It proposes a way to con...
Article
This paper investigates the problem of cost reduction of data collection procedures. To select an adequate regression or classification model, a sample set of minimum sufficient size must be collected. This sample set is modelled according to the data generation hypotheses. Namely, the generalised linear regression models assume the independ...
Article
This paper investigates the dimensionality reduction problem for signal decoding. Its main application is brain–computer interface modeling. The challenge is high redundancy in the data description. The data combine time series of two origins: brain cortex signals (the design space) and limb motion signals (the target space). High correlations among measurements of...
Chapter
The paper investigates the problem of deep learning model selection. We propose DARTS-CC, a neural architecture search method that respects a desired model complexity. The number of parameters in the model is taken as the model complexity. The proposed method is based on the differentiable architecture search algorithm (DARTS). Instead...
Article
Full-text available
The paper addresses the problem of human activity recognition based on the data from wearable sensors. Human activity recognition depends on a wide context of actions. Activities cannot be recognised from the local shape of sensor signals only. We propose a solution to the problem of human activity recognition applied to labour monitoring. The sol...
Article
Machine learning has solved many challenging problems in computer-assisted synthesis prediction (CASP). We formulate a reaction prediction problem in terms of node classification in a disconnected graph of source molecules and generalize a graph convolution neural network for disconnected graphs. Here we demonstrate that our approach can successfully p...
Preprint
Full-text available
Machine learning has solved many challenging problems in computer-assisted synthesis prediction (CASP). We formulate a reaction prediction problem in terms of node classification in a disconnected graph of source molecules and generalize a graph convolution neural network for disconnected graphs. Here we demonstrate that our approach can successfully p...
Article
Full-text available
The paper investigates the hyperparameter optimization problem. Hyperparameters are the parameters of the model parameter distribution. An adequate choice of hyperparameter values prevents model overfitting and allows the model to obtain higher predictive performance. Neural network models with a large number of hyperparameters are analyzed. The hyperparameter optimiz...
Chapter
With the increasing popularity of document analysis and recognition systems, text detection (TD) and text binarization (TB) in document images remain challenging tasks. In the paper, we introduced a two-step architecture for the TD task. Firstly, a U-net based model is used to get a text mask in terms of word-level bounding boxes. Secondly, we appr...
Book
This book constitutes the refereed proceedings of the 11th International Conference on Intelligent Data Processing, IDP 2016, held in Barcelona, Spain, in October 2016. The 11 revised full papers were carefully reviewed and selected from 52 submissions. The papers of this volume are organized in topical sections on machine learning theory with app...
Article
This paper addresses the problem of optimal recurrent neural network selection. It asserts the neural network evidence lower bound as the optimal criterion for selection. It investigates variational inference methods to approximate the posterior distribution of the network parameters. As a particular case, the normal distribution of the parameters...
Article
The paper is devoted to the problem of constructing a predictive model in a high-dimensional feature space. The space is redundant: there is multicollinearity among the design matrix columns. In this case the model is unstable to changes in data or in parameter values. To build a stable model, the authors solve the dimensionality reduction problem f...
Article
Full-text available
The problem of selecting the optimum system of models for forecasting short-term railway traffic volumes is considered. The historical data is the daily volume of railway traffic between pairs of stations for different types of cargo. The given time series are highly volatile, noisy, and nonstationary. A system is proposed that selects the optimum...
Article
We consider the problem of model selection for deep learning models of suboptimal complexity. The complexity of a model is understood as the minimum description length of the combination of the sample and the classification or regression model. Suboptimal complexity is understood as an approximate estimate of the minimum description length, obtaine...
Article
Full-text available
The paper addresses the problem of designing Brain-Computer Interfaces. It investigates feature selection methods in regression, applied to ECoG-based motion decoding. The problem is to predict hand trajectories from the voltage time series of cortical activity. A special characteristic of this problem is the inherently multi-way structure of featu...
Article
Full-text available
This paper investigates the metric time series classification problem. Distance functions between time series are constructed using the dynamic time warping method. This method aligns two time series and builds a dissimilarity set. The vector-function of distance between the time series is a set of statistics. It describes the distribution of the d...
Article
Full-text available
We address the problem of outlier detection for more reliable credit scoring. Scoring models are used to estimate the probability of loan default based on the customer’s application. To get an unbiased estimation of the model parameters one must select a set of informative objects (customers). We propose an object selection algorithm based on analy...
Conference Paper
Full-text available
We propose here an extended attention model for sequence-to-sequence recurrent neural networks (RNNs) designed to capture (pseudo-)periods in time series. This extended attention model can be deployed on top of any RNN and is shown to yield state-of-the-art performance for time series forecasting on several univariate and multivariate time series.
Article
Full-text available
The paper addresses the classification problem in multidimensional spaces. The authors propose a supervised modification of the t-distributed Stochastic Neighbor Embedding Algorithm. Additional features of the proposed modification are that, unlike the original algorithm, it does not require retraining if new data are added to the training set and...
Article
Full-text available
This paper investigates an approach to construct new ranking models for Information Retrieval. The IR ranking model depends on the document description. It includes the term frequency and document frequency. The model ranks documents upon a user request. The quality of the model is defined by the difference between the documents, which experts asse...
Article
Full-text available
In this paper, we study the use of recurrent neural networks (RNNs) for modeling and forecasting time series. We first illustrate the fact that standard sequence-to-sequence RNNs neither capture periods in time series well nor handle missing values well, even though many real-life time series are periodic and contain missing values. We then propos...
Article
Full-text available
This paper provides a new approach to feature selection based on the concept of feature filters, so that feature selection is independent of the prediction model. Data fitting is stated as a single-objective optimization problem, where the objective function indicates the error of approximating the target vector as some function of given features....
Article
Full-text available
The paper is devoted to the multiclass time series classification problem. The feature-based approach that uses meaningful and concise representations for feature space construction is applied. A time series is considered as a sequence of segments approximated by parametric models, and their parameters are used as time series features. This feature...
Article
Full-text available
The paper discusses the problem of metric time series analysis and classification. The proposed classification model uses a matrix of distances between time series which is built with a fixed distance function. The dimension of this distance matrix is very high and all related calculations are time-consuming. The problem of reducing computational c...
Article
Full-text available
This paper is devoted to the problem of multiclass time series classification. It is proposed to align time series in relation to class centroids. The construction of centroids and the alignment of time series are carried out by the dynamic time warping algorithm. The accuracy of classification depends significantly on the metric used to compute distances betwe...
Article
Full-text available
In this paper we investigate the problem of short-term time series forecasting. We consider time series of different scales, which are interconnected and periodic. The forecasting problem is reduced to a regression problem, which is solved using a linear model. To improve its accuracy we propose to use composition of...
Article
Full-text available
We address the problem of improving the quality of time series forecasting by taking into account information about exogenous time series. We aim to improve a non-parametric forecasting algorithm that minimizes the convolution of a histogram of time series with the loss function. We propose to adjust the histogram, using mixtures of conditional hist...
Article
Full-text available
The paper presents analytic and stochastic methods of structure parameter estimation for a model selection problem. Structure parameters are covariance matrices of parameters of linear and non-linear regression models. To optimize model parameters and structure parameters we maximize the model evidence, a convolution of a data likelihood with a prio...
Conference Paper
Full-text available
The paper presents a method of constructing a supervised topic model of a major conference. The supervised part is the expert information about document-topic correspondence. To exploit the expert information we use a regularization term which penalizes difference between a constructed and an expert-given model. To find optimal parameters we add th...
Article
Full-text available
To detect small movements of the Earth's surface (with a velocity of less than several centimeters per year) using SAR interferometry methods, it is necessary to find a number of surface areas remaining coherent in radar images over a long period. These areas and corresponding image points are called persistent scatterers. Two methods of persist...
Article
Full-text available
This paper presents a new fast clustering algorithm RhoNet, based on the metric concentration location procedure. To locate the metric concentration, the algorithm uses a reduced matrix of pairwise ranks distances. The key feature of the proposed algorithm is that it doesn't need the exhaustive matrix of pairwise distances. This feature reduces com...
Article
Full-text available
This special issue on “Data Analysis and Intelligent Optimization with Applications” follows a previous special issue of this journal on the interplay of Machine Learning and Optimization, “Model Selection and Optimization in ML” (Machine Learning 85:1-2, October 2011). This time we shift our focus to applications of data analysis and optimization...
Article
Full-text available
The paper presents an ordinal classification method using Pareto fronts. An object is described by a set of ordinal features assigned by experts. We describe the class boundaries by the set of Pareto fronts. We propose to predict the object class using the nearest Pareto front boundary. The proposed method is illustrated by a real-world application...
Article
Full-text available
We address the problem of segmenting nearly periodic time series into period-like segments. We introduce a definition of nearly periodic time series via triplets ⟨basic shape, shape transformation, time scaling⟩ that covers a wide range of time series. To split the time series into periods we select a pair of principal components of the Hankel matr...
Article
Full-text available
The current generation of portable mobile devices incorporates various types of sensors that open up new areas for the analysis of human behavior. In this paper, we propose a method for human physical activity recognition using time series, collected from a single tri-axial accelerometer of a smartphone. Primarily, the method solves a problem of on...
Article
Full-text available
This study investigates the multicollinearity problem and the performance of feature selection methods when data sets have multicollinear features. We propose a stress-test procedure for a set of feature selection methods. This procedure generates test data sets with various configurations of the target vector and features. This procedure prov...
Article
Full-text available
This paper solves the problem of time series classification using deep learning neural networks. The paper proposes to use a multilevel superposition of models belonging to the following classes of neural networks: two-layer neural networks, Boltzmann machines, and autoencoders. Lower levels of superposition extract informative features from noisy...
Article
Full-text available
The paper presents new methods for ranking alternatives using expert estimations and measured data. The methods use expert estimations of object quality and criteria weights. These expert estimations are changed during the computation. The expert estimations are assumed to be measured in linear and ordinal scales. Each object is described by the set...
Article
Full-text available
The problem of sample size estimation is important in medical applications, especially in cases of expensive measurements of immune biomarkers. This paper describes the problem of logistic regression analysis with the sample size determination algorithms, namely the methods of univariate statistics, logistic regression, cross-validation and Bayesi...
Article
Full-text available
To construct an adequate regression model, one has to supplement the set of measured features with their generated derivatives. Often the number of these features exceeds the number of samples in the data set. After a feature generation process, the problem of feature selection from a set of highly correlated features arises. The proposed algorithm...
Chapter
The paper is devoted to the logistic regression analysis [1], applied to classification problems in biomedicine. A group of patients is investigated as a sample set; each patient is described with a set of features, named as biomarkers and is classified into two classes. Since the patient measurement is expensive the problem is to reduce number of...
Article
Full-text available
To solve the problem of protein secondary structure recognition, an algorithm for clustering amino-acid subsequences is developed. To reveal clusters it uses the pairwise distances between the subsequences. The algorithm does not require the complete pairwise matrix. This main distinction implies the reduction of the computational complex...
Article
Full-text available
The main goal of this paper is to present the methodology of construction of the Integral Indicator for the Croatian Thermal Power Plants and the Combined Heat and Power Plants. The Integral Indicator is intended to compare the Power Plants according to a certain criterion. The criterion of the ecological impact is chosen. The following features of...
Conference Paper
Mathematical modelling has two issues: first, to create a model of a dynamic system using expert knowledge and second, to discover a model using the measured data. We review the model-driven and the data-driven approaches to the model creation problem and propose a new combined one. It combines the strengths of the classical approaches: the result mod...
Article
An algorithm of the inductive model generation and model selection is proposed to solve the problem of automatic construction of regression models. A regression model is an admissible superposition of smooth functions given by experts. Coherent Bayesian inference is used to estimate model parameters. It introduces hyperparameters, which describe th...
Conference Paper
In the proceedings of PCO 2010, 3rd Global Conference on Power Control and Optimization, February 2-4, 2010, Gold Coast, Queensland, Australia (ISBN: 978-983-44483-1-8)
Article
Full-text available
This paper describes an approach to quantitative analysis of a multivariate dynamic system in phase space. The system is used as a mathematical model for various living systems. The model is used in various applications. One of the related problems is to represent a phase trajectory as a sequence of clusters to classify the system’s state. The algorith...
Article
Full-text available
This paper describes an integral indicator construction algorithm. The integral indicator is a linear combination of the object features. The features are in the linear scale. Outliers in the features are assumed, so the problem of constructing stable integral indicators arises. To construct a stable integral indicator a specially defined subset of t...
Article
Full-text available
The problem of non-linear regression analysis is considered. The algorithm of the inductive model generation is described. The regression model is a superposition of given smooth functions. To estimate the model parameters, a two-level Bayesian inference technique is used. It introduces hyperparameters, which describe the distribution function of...
