Valentin Todorov

United Nations Industrial Development Organization | UNIDO

PhD

About

61 Publications
22,126 Reads
829 Citations
Additional affiliations
January 2011 - present
Technische Universität Wien
January 2007 - present
Instituto Técnico y Cultural

Publications (61)
Article
For the most part the public debate on fossil fuel energy subsidies has been governed by two arguments. From the position of the profit-maximizing firm, the economic rationale has gravitated towards the issue of cost-competitiveness: The reduction of emissions requires a cutback of energy consumption which, when operating through the pricing mechan...
Chapter
Three-way rating data on student satisfaction contain the scores assigned by students to a set of items measuring different aspects of educational quality at different time points. Such data provide information on the magnitude of satisfaction as well as information on how aspects vary with respect to each other and how they contribute to the total...
Article
The importance of manufacturing as a main determinant of economic growth has been explored extensively in the literature considering manufacturing as the engine of economic growth. In our study, a new database covering 117 countries for 1995–2017 containing four indicators, namely manufacturing value added share in GDP, growth of GDP, manufacturing...
Cover Page
Fresh UNIDO-led research in collaboration with the World Bank provides concrete evidence that removing fossil fuel subsidies would not only make environmental sense but would also improve firms’ economic performance. New firm-level data from Oman show that higher fuel prices are not a drag on competitiveness. In fact, reducing subsidies and allowin...
Cover Page
Empirical evidence from the Sultanate indicates that the rise in fuel prices led to improvements in productivity and efficiency as firms switched to digital technology and information and communications technology (ICT) equipment. Tightening environmental standards and kicking the subsidy habit for good is how we can finally create the competitive...
Article
In a number of recent articles Riani, Cerioli, Atkinson and others advocate the technique of monitoring robust estimates computed over a range of key parameter values. Through this approach the diagnostic tools of choice can be tuned in such a way that highly robust estimators which are as efficient as possible are obtained. This approach is applic...
Article
This paper develops and introduces a new evidence-based tool to systematically measure and benchmark the industrial performance of economies with emphasis on their inclusive and green dimensions. By means of international data sources, we build up a composite index, the inclusive and green industrial performance (IGIP) index, which captures differe...
Article
Composite indicators are widely used to determine the ranking of countries, organizations or individuals in terms of overall performance on multiple criteria. Their calculation requires standardization of the individual statistical criteria and aggregation of the standardized indicators. These operations introduce a potential propagation effect of...
Conference Paper
Both the scientific and political communities agree that significant reductions in CO2 emissions are necessary to limit the magnitude and extent of climate change. Energy efficiency is one of the most interesting issues analyzed by economists and policy makers within this debate. Different measures of energy efficiency in manufactu...
Article
The green industrial performance (GIP) index provides a tool to analyse and compare the performance of economies in terms of green manufacturing. The demands and expectations for sustainable development do not seem to be matched by an analytical framework to measure how green the industrial sector of a given economy is. In this articl...
Technical Report
This report presents the methodology that underlies GGGI’s Green Growth Index, which measures the performance of 115 countries in four green growth dimensions: (1) efficient and sustainable resource use, (2) natural capital protection, (3) green economic opportunities, and (4) social inclusion. The Green Growth Index and its dimensions draw on 36 i...
Article
The paper of Andrea Cerioli, Marco Riani, Anthony Atkinson and Aldo Corbellini is a fine review of the practical value of the forward search and the other related robust estimation methods based around monitoring of quantities of interest over a range of consecutive values of the tuning parameters. From a practical standpoint in data analysis the a...
Article
This article develops and introduces the green industrial performance (GIP) index that allows policy-makers and practitioners to analyse the performance of countries in terms of green manufacturing. We derive our data from international data sources, namely UNIDO's industrial statistics database (INDSTAT) and UN COMTRADE, to build up a set of green...
Article
Composite indicators are widely used to determine the ranking of countries, organizations or individuals in terms of overall performance on multiple criteria. Their calculation requires standardization of the individual statistical criteria and aggregation of the standardized indicators. These operations introduce a potential propagation effect of extr...
Article
Compositional tables can be considered a continuous counterpart to the well‐known contingency tables. Their cells, which generally contain positive real numbers rather than just counts, carry relative information about relationships between two factors. Hence, compositional tables can be seen as a generalization of (vector) compositional data. Due...
Article
Compositional tables - a continuous counterpart to the contingency tables - carry relative information about relationships between row and column factors; thus, for their analysis, only ratios between cells of a table are informative. Consequently, the standard Euclidean geometry should be replaced by the Aitchison geometry on the simplex that enab...
Conference Paper
Multidimensional compositional arrays require special analytical tools to be modeled. Specifically, the variation of the data can be captured by linear combinations of a defined number of parameters, capable of describing the complexity of the data. Usually these models are described as generalizations of Principal Component Analysis to higher orde...
Article
The different parts (variables) of a compositional data set cannot be considered independent from each other, since only the ratios between the parts constitute the relevant information to be analysed. Practically, this information can be included in a system of orthonormal coordinates. For the task of regression of one part on other parts, a speci...
Article
The open-source programming language and software environment R is currently one of the most widely used and popular software tools for statistics and data analysis. This contribution provides an overview of important R packages used in official statistics and survey methodology and discusses the usefulness of R in the daily work of a statistical o...
Conference Paper
Multiway data analysis addresses complex data structures represented as multiway data sets where data have more than two modes. The most popular methods for modeling multiway data are CANDECOMP/PARAFAC and TUCKER3. The standard algorithms for computing these models are based on alternating least squares (ALS) and thus are vulnerable to the presence...
Conference Paper
Double counting is inherent to the output concept; it is therefore preferable to use manufacturing value added (MVA) instead to measure manufacturing production. While the issue of double counting in production statistics is successfully addressed by using MVA, commodity exchange in trade data is still measured as output. The relevance of value...
Article
The present work discusses robust multivariate methods specifically designed for high dimensions. Their implementation in R is presented and their application is illustrated on examples. The first group are algorithms for outlier detection, already introduced elsewhere and implemented in other packages. The value added of the new package is that all me...
Article
Compositional tables represent a continuous counterpart to well-known contingency tables. Their cells contain quantitatively expressed relative contributions of a whole, carrying exclusively relative information and are popularly represented in proportions or percentages. The resulting factors, corresponding to rows and columns of the table, can be...
Article
The discordancy measure in terms of the sample L‐moment ratios (L‐CV, L‐skewness, L‐kurtosis) of the at‐site data is widely recommended in the screening process of atypical sites in the regional frequency analysis (RFA). The sample mean and the covariance matrix of the L‐moments ratios, on which the discordancy measure is based, are not robust agai...
Chapter
The main drawback of principal component analysis (PCA) especially for applications in high dimensions is that the extracted components are linear combinations of all input variables. To facilitate the interpretability of PCA various sparse methods have been proposed recently. However all these methods might suffer from the influence of outliers pr...
Article
High-dimensional data often contain many variables that are irrelevant for predicting a response or for an accurate group assignment. The inclusion of such variables in a regression or classification model leads to a loss in performance, even if the contribution of the variables to the model is small. Sparse methods for regression and classificatio...
Article
Data outliers or other data inhomogeneities lead to a violation of the assumptions of traditional statistical estimators and methods. Robust statistics offers tools that can reliably work with contaminated data. Here, outlier detection methods in low and high dimension, as well as important robust estimators and methods for multivariate data are re...
Article
R in the Statistical Office, R as Mediator, Survey Methodology with R
Article
The designations employed, descriptions and classifications of countries, and the presentation of the material in this report do not imply the expression of any opinion whatsoever on the part of the Secretariat of the United Nations Industrial Development Organization (UNIDO) concerning the legal status of any country, territory, city or ar...
Article
General ideas of robust statistics, and specifically robust statistical methods for calibration and dimension reduction are discussed. The emphasis is on analyzing high-dimensional data. The discussed methods are applied using the packages chemometrics and rrcov of the statistical software environment R. It is demonstrated how the functions can be...
Article
Many different methods for statistical data editing can be found in the literature, but only a few of them are based on robust estimates (for example, BACON-EEM, epidemic algorithms (EA) and transformed rank correlation (TRC) methods of Béguin and Hulliger). However, we can show that outlier detection is only reasonable if robust methods are ap...
Article
The Wilks' Lambda statistic (likelihood ratio test, LRT) is a commonly used tool for inference about the mean vectors of several multivariate normal populations. However, it is well known that the Wilks' Lambda statistic, which is based on the classical normal theory estimates of generalized dispersions, is extremely sensitive to the influence of ou...
Article
The term Web 2.0 refers to collaboration on the World Wide Web. Prominent examples of Web 2.0 applications are flickr, facebook or del.icio.us. The 2.0 version identifier plays on the fact that, prior to Web 2.0, the web communicated with its users not in a dialogue but, similarly to print media, in a monologue. Content was published by a single autho...
Article
Taking advantage of the S4 class system of the programming environment R, which facilitates the creation and maintenance of reusable and modular components, an object-oriented framework for robust multivariate analysis was developed. The framework resides in the packages robustbase and rrcov and includes an almost complete set of algorithms for com...
Article
Manufacturing Value Added (MVA) is the key indicator of a country's industrial production. In order to facilitate international comparisons it is published in UNIDO's International Yearbook of Industrial Statistics for a large set of countries. Because of a time-gap of at least one year between the latest year for which data are available and the y...
Conference Paper
Outliers are present in virtually every data set in any application domain, and the identification of outliers has a hundred-year history (see, for example, Barnett and Lewis, 1994). Taking into account the multivariate aspect of the data, the outlyingness of the observations can be measured by the Mahalanobis distance which is based on locat...
Article
The problem of the non-robustness of the classical estimates in the setting of the quadratic and linear discriminant analysis has been addressed by many authors: Todorov et al. [19, 20], Chork and Rousseeuw [1], Hawkins and McLachlan [4], He and Fung [5], Croux and Dehon [2], Hubert and Van Driessen [6]. To obtain high breakdown these methods are b...
Article
A commonly used procedure for reduction of the number of variables in linear discriminant analysis is the stepwise method for variable selection. Although often criticized, when used carefully, this method can be a useful prelude to a further analysis. The contribution of a variable to the discriminatory power of the model is usually measured by th...
Article
Recent advances in Java technology have provoked increasing interest in using Java as a programming language for the development of numerical software. Given its attractive language features, it is worth investigating the possibilities of writing software for robust computing in Java. In this paper the design and implementation of an object-oriented li...
Article
Since linear discriminant analysis (LDA) and multiple linear regression (MR) are numerically equivalent in the two-group case, robust regression can be used to devise a robust discriminant analysis. While M-estimators will be affected by outliers in the classification variables, GM-estimators will resist them. Monte Carlo simulation is used to eval...
Article
The computation of high breakdown point affine equivariant estimates of multivariate location and scatter is a hard problem and usually a heuristic technique is required. The paper presents a simulated annealing scheme for computing the MCD estimator (Rousseeuw and Leroy, Robust Regression and Outlier Detection, Wiley, New York, 1987). The performa...
Chapter
The paper presents a method for selecting the most discriminative variables in the discriminant analysis, using Wilks’ lambda statistic, robustified by means of high breakdown point estimates of multivariate means and covariance matrices. Some examples are given for comparison of the method to the selection of variables based on classical and M-est...
Article
The problem of the non-robustness of the classical estimates in the setting of the quadratic and linear discriminant analysis has been addressed by many authors: Todorov et al. (19, 20), Chork and Rousseeuw (1), Hawkins and McLachlan (4), He and Fung (5), Croux and Dehon (2), Hubert and Van Driessen (6). To obtain high breakdown these methods are b...
Article
Multivariate linear regression, ridge regression and Bayesian regression are considered. A conjugate prior distribution family for the regression parameter matrix and the error covariance matrix is given, namely the normal Wishart distribution. The Bayesian estimators are obtained. It is shown that the ridge regression is a special Bayesian problem...

Projects (2)
Project
To gain insight into applications of machine learning in official statistics, to identify the techniques that have been explored, and to investigate the opportunities for extending the links between official statistics and machine learning in particular and data science in general.