About
12 Publications
1,063 Reads
81 Citations
Publications (12)
In this paper we present a correlation-based safe screening technique for building the complete Lasso path. Unlike many other Lasso screening approaches, we do not consider prespecified values of the regularization parameter; instead, we prune variables that cannot be the next best feature to be added to the model. Based on those results we prese...
We study an iterative discrete information production process (IPP) where we can extend ordered normalised vectors by new elements based on a simple affine transformation, while preserving the predefined level of inequality, G, as measured by the Gini index. Then, we derive the family of Lorenz curves of the corresponding vectors and prove that it...
Inequality is an inherent part of our lives: we see it in the distribution of incomes, talents, resources, and citations, amongst many others. Its intensity varies across different environments: from relatively evenly distributed ones, to where a small group of stakeholders controls the majority of the available resources. We would like to understa...
This paper aims to find the reasons why some citation models can predict a set of specific bibliometric indices extremely well. We show why fitting a model that preserves the total sum of a vector can be beneficial in the case of heavy-tailed data that are frequently observed in informetrics and similar disciplines. Based on this observation, we in...
We analyse the usefulness of Jain’s fairness measure and the related Prathap’s bibliometric z-index as proxies when estimating the parameters of the 3DSI (three dimensions of scientific impact) model.
There are many approaches to the modelling of citation vectors of individual authors. Models may serve different purposes, but usually they are evaluated with regard to how well they align with citation distributions in large networks of papers. Here we compare a few leading models in terms of their ability to correctly reproduce the values of selec...
We demonstrate that by using a triple of simple numerical summaries: an author’s productivity, their overall impact, and a single other bibliometric index that aims to capture the shape of the citation distribution, we can reconstruct other popular metrics of bibliometric impact with a sufficient degree of precision. We thus conclude that the use o...
We present an approach to efficiently construct stepwise regression models in a very high-dimensional setting using a multidimensional index. The approach is based on the observation that the collections of available predictor variables often remain relatively stable and many models are built based on the same predictors. Example scenarios include d...
The growing popularity of bibliometric indexes (whose most famous example is the h index by J. E. Hirsch [J. E. Hirsch, Proc. Natl. Acad. Sci. U.S.A. 102, 16569–16572 (2005)]) is opposed by those claiming that one’s scientific impact cannot be reduced to a single number. Some even believe that our complex reality fails to submit to any quantitative...
Observational error in heart rate variability (HRV) may arise from many factors, such as a limited sampling frequency, the QRS complex detection process, preprocessing procedures, and others. In our study, we focused on the first two sources of measurement error. We introduced a model of observational error and suggested universal descriptors for t...
Hirsch's h-index is perhaps the most popular citation-based measure of scientific excellence. In 2013, G. Ionescu and B. Chopard proposed an agent-based model for this index to describe the process by which publications and citations are generated in an abstract scientific community. With such an approach one can simulate a single scientist's activity, an...
The main focus of research in machine learning and statistics is on building more advanced and complex models. However, in practice it is often much more important to use the right variables. One may hope that the recent popularity of open data would allow researchers to easily find relevant variables. However, current linked data methodology is not sui...
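Several of the abstracts above refer to the h-index, the Gini index, and Jain's fairness measure as summaries of a citation vector. As a rough illustration only, the sketch below computes the three quantities from their standard textbook definitions. It is not taken from any of the listed papers; the function names and the sample citation vector are hypothetical, and the Gini index is computed in its sample-normalised form, which may differ from the variant used by the authors.

```python
def h_index(citations):
    """Largest h such that at least h papers have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def gini_index(values):
    """Sample Gini index, normalised so that it lies in [0, 1].

    Assumption: the (n - 1) normalisation is used, so a vector with a
    single non-zero entry yields exactly 1.
    """
    x = sorted(values)
    n = len(x)
    total = sum(x)
    if n < 2 or total == 0:
        return 0.0  # convention chosen here for degenerate inputs
    weighted = sum((2 * i - n - 1) * v for i, v in enumerate(x, start=1))
    return weighted / ((n - 1) * total)


def jain_fairness(values):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2), in [1/n, 1]."""
    n = len(values)
    sum_sq = sum(v * v for v in values)
    if n == 0 or sum_sq == 0:
        return 1.0  # convention chosen here: an all-zero vector is "even"
    return sum(values) ** 2 / (n * sum_sq)


# Hypothetical citation counts for one author's six papers.
citations = [10, 8, 5, 4, 3, 0]
print(h_index(citations), gini_index(citations), jain_fairness(citations))
```

The three numbers describe different aspects of the same vector: the h-index mixes productivity and impact, while the Gini and Jain indices depend only on how evenly the citations are spread across papers.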