
Juan Aparicio
- Professor (Full) at Miguel Hernández University of Elche
Trying to establish links between Data Envelopment Analysis and Machine Learning techniques.
About
Publications: 216
Reads: 77,744
Citations: 3,405
Publications (216)
Data Envelopment Analysis (DEA) models with weight restrictions (WRs) have proven valuable for benchmarking and target setting. Although the DEA literature has explored the incorporation of managerial preferences and value judgments regarding the relative worth of inputs and outputs, as well as the establishment of targets in benchmarking contexts,...
The Li Test is a nonparametric test which was originally introduced to evaluate the closeness between two unknown distribution functions and was later adapted to compare vectors of technical efficiency scores in production theory. In this paper, we adapt the Li Test to be used for the selection of variables (inputs and outputs) in production proces...
This paper introduces EATBoosting, a novel application of gradient tree boosting within the Data Envelopment Analysis (DEA) framework, designed to address undesirable outputs in printed circuit board (PCB) manufacturing. Recognizing the challenge of balancing desirable and undesirable outputs in efficiency assessments, our approach leverages machin...
Overfitting is a classical statistical issue that occurs when a model fits a particular observed data sample too closely, potentially limiting its generalizability. While Data Envelopment Analysis (DEA) is a powerful non-parametric method for assessing the relative efficiency of decision-making units (DMUs), its reliance on the minimal extrapolatio...
This paper deals with the analysis of the structure of the core of the transportation game (a kind of many-to-many assignment market) by means of the well known assignment game (a kind of one-to-one assignment market). To do this, we associate an allocation game to each transport game that we call the agents game. This game is obtained by means of...
The main objective of this study is to introduce machine learning-type extensions for the measurement of environmental inefficiency based on regression trees under shape constraints. The new methods developed are implemented using a by-production approach that distinguishes two technologies, one related to the generation of pollution and the other...
This paper presents a novel approach to conduct non-parametric estimations of production technologies that adhere to the basic assumptions of production theory axioms, including free disposability in inputs and outputs and convexity. The methodology is rooted in adapting the highly effective machine learning techniques associated with Random Forest...
The minimum distance models have undoubtedly represented a significant advance for the establishment of targets in Data Envelopment Analysis (DEA). These models may help in defining improvement plans that require the least overall effort from the inefficient Decision Making Units (DMUs). Despite the advantages that come with Closest Targets, in som...
Data Envelopment Analysis (DEA) is a well-known nonparametric technique for the measurement of technical efficiency. It does so by building a production possibility set that satisfies certain microeconomic and mathematical axioms, such as free disposability in inputs and outputs and convexity, and determines the most conservative estimate...
This chapter surveys the literature on two types of related contributions. The first group is made up of models devoted to adapting well-known machine learning techniques for estimating production frontiers, satisfying shape constraints (free disposability, convexity, …). The second group consists in approaches that apply frontier estimators to cla...
This paper compares the performance of groups of units by composing indicators of corporate social responsibility (CSR) from an efficiency and productivity perspective, applicable across various industries. From a methodological perspective, our work extends the traditional input-oriented Benefit-of-the-Doubt (BoD) model in the multiplier form, by...
In the realm of assessing scale efficiency (SE), it tends to be computed as a firm‐specific phenomenon rather than something associated with the whole shape of the frontier of the technology under evaluation. This circumstance may lead to inaccuracies in the conclusions drawn regarding the returns to scale (RTS) exhibited across the entire frontier...
We introduce benchmarking analysis based on state-of-the-art machine learning techniques applied to the measurement of efficiency to assess the performance of Higher Education Institutions (HEIs). We rely on Efficiency Analysis Trees (EAT) and its Convexified frontier counterpart (CEAT) to assess the efficiency of 144 private HEIs in Colombia and c...
In estimating productivity change over time, technical change is frequently miscalculated as the geometric average of technological changes between two periods based on firm-specific information in the dataset. However, the frontier shift over time is a global phenomenon linked to relative technological progress or regress across the entire frontie...
boostingDEA is a new package for R that includes functions to estimate production frontiers and make ideal output predictions in the Data Envelopment Analysis (DEA) context. The package implements both standard models from DEA and Free Disposal Hull (FDH) and, for the first time, incorporates boosting techniques. Boosting is a method used in machin...
We introduce a new method for the estimation of production technologies in a multi-input multi-output context, based on OneClass Support Vector Machines with piecewise linear transformation mapping. We compare via a finite-sample simulation study the new technique with Data Envelopment Analysis (DEA) to estimate technical efficiency. The criteria a...
This paper introduces a new methodology for the estimation of production functions satisfying some classical production theory axioms, such as monotonicity and concavity, which is based upon the adaptation of an additive version of the machine learning technique known as Multivariate Adaptive Regression Splines (MARS). The new approach shares the p...
In this paper, we introduce an unsupervised machine learning method for production frontier estimation. This new approach satisfies fundamental properties of microeconomics, such as convexity and free disposability (shape constraints). The new method generalizes Data Envelopment Analysis (DEA) through the adaptation of One-Class Support Vector Mach...
In this paper, we propose and compare new methodologies for ranking the importance of variables in productive processes via an adaptation of OneClass Support Vector Machines. In particular, we adapt two methodologies inspired by the machine learning literature: one involving the random shuffling of values of a variable and another one using t...
This paper contributes by developing new models for assessing dynamic inefficiency that incorporate machine learning techniques. In particular, the new approaches apply decision trees models for the estimation of dynamic production technologies that account for investment adjustment costs. Methodologically, the new models build on the recently deve...
This paper contributes to the literature with a methodology that helps identify the functions that constrain the overall performance of an innovation system, hence providing clear guidelines to policymakers on the direction of their interventions. This methodology relies on the notion of penalty for bottleneck, which is defined as the weakest link...
Considering any graph technical inefficiency measure, we show that the so-called standard or traditional approach for decomposing profit inefficiency relying on Fenchel-Mahler inequalities obtained from duality theory, establishes that profit inefficiency is greater than or equal to the product of technical inefficiency times a positive factor expr...
The traditional approach decomposing profit inefficiency into the sum of its technical and its allocative components identifies first the frontier projection of each firm based on the exogenous choice of a specific technical measure, e.g., based on slacks, directional, etc. However, in real life situations, firms and organizations are interested in...
We introduce two desirable properties concerning the allocative efficiency term in the decomposition of economic efficiency. Since Farrell, economic efficiency, defined in terms of the cost, revenue, profitability, or profit functions, is decomposed into a technical efficiency measure and allocative efficiency. Resorting to duality theory, allocati...
We introduce complementary decompositions of profit change that, relying on the duality between the profit function and the directional distance function, shed light on the different sources of profit growth including measures of technical efficiency, allocative efficiency and technological change. Our decompositions extend the literature on Konüs...
The main purpose of this paper is to find the determinants of productivity growth and its evolution over time for 27 North Atlantic Treaty Organization (NATO) countries over the period 2010-2017, with respect to the Global Peace Index. To this end, a production function was estimated for each year using Data Envelopment Analysis (DEA) methodology w...
Data Envelopment Analysis (DEA) presents the typical characteristics of a data-driven approach with the specific objective of determining technical efficiency and production frontiers in Engineering and Microeconomics. However, by construction, the frontier estimator generated by DEA suffers from overfitting problems; something that contrasts with...
This paper aims to show how to calculate different efficiency measures using a technology estimator defined through the adaptation of the Gradient Tree Boosting algorithm. This adaptation shares some features with the standard non-parametric FDH (Free Disposal Hull) approach, but it overcomes data overfitting problems. Nevertheless, from a computat...
In production theory and engineering, a topic of interest is the determination of technical efficiency of firms from the estimation of a technology. By definition, a technology must satisfy a set of micro-economic postulates. Likewise, a valid estimator of a technology should meet the same set of axioms. In this paper, for the first time, we adapt...
This paper contributes to research on the corporate social responsibility (CSR) field and the inefficiency measurement of firms by proposing a new method for evaluating inefficiency accounting for firms’ CSR activities. The new approach considers the imprecise nature of CSR data through the fuzzy data envelopment analysis (FDEA) method and further...
eat is a new package for R that includes functions to estimate production frontiers and technical efficiency measures through non-parametric techniques based upon regression trees. The package specifically implements the main algorithms associated with a recently introduced methodology for estimating the efficiency of a set of decision-making units...
In the technical efficiency evaluation area, it may happen that many observations obtain a similar relative technical efficiency status, making it difficult to discriminate between them. The determination of super-efficiency has been a way of solving this problem by providing a method to differentiate between the performance of observations. Despit...
The main objective of this paper is to propose a tool for measuring the productivity and performance gaps across a set of Decision Making Units (DMUs) for monitoring their evolution and analyzing their components over time. To do this, we use the approach proposed by Aparicio and Santín (2018), which is grounded on a base-group base-period producti...
Data Envelopment Analysis (DEA) is one of the most used non-parametric techniques for technical efficiency assessment. DEA is exclusively concerned about the minimization of the empirical error, satisfying, at the same time, some shape constraints (convexity and free disposability). Unfortunately, by construction, DEA is a descriptive methodology t...
In this paper, we show that both Free Disposal Hull (FDH) and Data Envelopment Analysis (DEA), which are well-known modern techniques for efficiency measurement, can be seen as particular cases of a more general model based upon Support Vector Regression (SVR) within machine learning. Our approach is based on the adaptation of SVR in a multi-respon...
In this paper, we propose and discuss a new model in order to obtain the weight profiles to be used in the calculation of cross-efficiency scores in Data Envelopment Analysis (DEA). In the standard DEA literature, total flexibility in the selection of weights has been strongly criticized due to the fact that some inputs and/or outputs may have no c...
The results-based budgeting (RBB) framework is a public management strategy in which economic resources are allocated to certain budget programmes, oriented towards delivering specific products and results to the population. The present paper analyzes the regional governments' efficiency in using their economic resources, under the RBB framework, w...
The measurement of technical efficiency is a topic of great interest in microeconomics and engineering. Data Envelopment Analysis (DEA) is one of the existing techniques for measuring technical efficiency. One of the challenges related to DEA is to introduce a "well-defined" efficiency measure. Overall, it means that the technical efficiency me...
The reverse directional distance function, shortly RDDF, is a relatively recent concept introduced by Pastor et al. (2016). It is an apparently simple idea and, at the same time, a fruitful one. Let us start considering an efficiency measure (EM) and a finite sample of firms to be analyzed, FJ. As we have shown throughout most of the previous chapt...
The usual and well-established methods, explained and used in most of the previous chapters, for deriving a specific overall economic inefficiency decomposition associated with a given technical efficiency measure (multiplicative or additive), which we refer to as the traditional approaches, rely on the same “modus operandi”; i.e., they are based o...
The canonical model of perfect competition, resulting in social welfare maximization, assumes all kinds of technical and allocative inefficiencies away. In equilibrium, economic theory establishes that in contestable markets, competition forces draw firms towards profit maximization (Vickers, 1995). This in turn requires that, on one hand, firms ex...
The enhanced Russell graph measure, ERG, (Pastor et al., 1999) was designed as a new global efficiency measure to overcome the computational difficulties of the Russell graph measure of technical efficiency, RG (Färe et al., 1985). Historically, Farrell (1957) implemented the first measure of technical efficiency, while Färe and Lovell (1978), afte...
Over the past decades, one aspect of the measurement of technical efficiency that has attracted the attention of Data Envelopment Analysis (DEA) researchers has been the development of the generalized efficiency measures (GEMs), also called graph measures. The initial motivation for these measures was the design of...
From the beginning of DEA as a well-defined multi-output-multi-input tool for measuring efficiency, a huge number of technical efficiency measures have been introduced in the literature. Each of them implements a different way of gauging the “distance” from a firm in the interior of the technology to its efficient frontier (radially, hyperbolically...
This chapter is concerned with the measurement of profitability efficiency, defined as the ratio of revenue to cost, and its multiplicative decomposition into a productive efficiency measure—including technical and scale efficiencies, corresponding to the generalized distance function introduced by Chavas and Cox (1999), and allocative efficiency....
Data Envelopment Analysis can determine both a technical efficiency score and benchmarking information on how to change inputs and outputs to reach the efficient frontier if the firm under evaluation is technically inefficient. All measures studied in this book resort to the determination of benchmarking information through the calculation of the f...
After the introduction in the Data Envelopment Analysis literature of the radial measures, other approaches and ways of measuring the distance from a firm to the efficient frontier were defined, with the aim of solving certain drawbacks of the radial ones. On the one hand, radial measures are too restrictive in the sense that each assessed unit is...
The birth of the directional distance function as an inefficiency measure was linked to the consumer theory work developed by Luenberger in the early 1990s. Luenberger (1992a) introduced the concept of the benefit function in consumer theory in order to develop group welfare relations and, particularly, considered the Shephard’s input distance func...
In this chapter, we summarize the analytical framework found in the book by presenting the main concepts in an intuitive and accessible way while relying on supporting graphical illustrations to ease comprehension. Arguably, the measurement of economic efficiency dates back to the seminal paper by Farrell (1957), who introduced the definition, deco...
As we showed in Chap. 8, by duality, the directional distance function (DDF) is related to a measure of profit inefficiency that is calculated as the normalized deviation between optimal and actual profit at market prices. However, in the most usual case where the selected directional vector corresponds to the observed values in inputs and outputs...
In this chapter, we present the classic approach to calculate and decompose cost and revenue efficiency based on Shephard’s radial input and output distance functions. These decompositions follow closely the presentation done in Chap. 2 where both economic efficiency measures can be separated into technical and allocative components, i.e., expressi...
Mixed Integer Linear Programs (MILPs) are usually NP-hard mathematical programming problems, for which optimal solutions are difficult to obtain in a reasonable time for large-scale models. Nowadays, metaheuristics are one of the potential tools for solving this type of problem in any context. In this paper, we focus our attention on MILPs in t...
The main objective of this article is the evaluation of the efficiency and its evolution over time of 27 member countries of the North Atlantic Treaty Organization (NATO). We analyse the relationship between defense expenditure and the spending on military personnel as well as how security is perceived by citizens. To this end, a data panel for the...
There is extensive literature focused on the evaluation of efficiency in the education sector, both at the micro level, analyzing the performance of students or schools, and at the macro level, exploring the behavior of regions or countries. The development of this type of study has been driven in recent years by exploiting data available in intern...
The main objective of this article is the evaluation of the efficiency and its evolution over time of 27 member countries of the North Atlantic Treaty Organization (NATO). We analyse the relationship between defense expenditure and the spending on military personnel as well as how security is perceived by citizens. To this end, a data panel for th...
International large-scale assessments (ILSAs) provide several measures as a representation of educational outcomes, the so-called plausible values, which are frequently interpreted as a representation of the ability range of students. In this paper we focus on how this information should be incorporated into the estimation of efficiency measures of...
This paper introduces the methodology necessary to evaluate inefficiency of regulated decision making units that operate under quotas through Data Envelopment Analysis (DEA), accounting for both quotas’ restrictions and negative environmental externalities of production. Three technical inefficiency measures are proposed: inefficiency in the produc...
In microeconomics, a topic of interest is the estimation of production functions. By definition, a production function is a non-decreasing function that envelops all the observations (firms) from above in the input-output space, capturing the extreme behavior of the data. These characteristics are far from the usual ones assumed by machine learning...
Most empirical studies examining the export competitiveness of a country in a target market are undertaken by focusing on supply, only analysing the group of competing countries. In addition, if the target market to be analysed is extensive, like the European Union, it is generally analysed as a whole. This study presents an evaluation of the tomat...
This is the R package created for estimating technical efficiency, improving on FDH and DEA. The new technique, called EAT (Efficiency Analysis Trees), avoids the problem of overfitting associated with the standard non-parametric methodologies. We show how to use the package through some numerical examples. This is related to our objective of creati...
Within the framework of data envelopment analysis (DEA) methodology, the problem of determining the closest targets on the efficient frontier is receiving increased attention from both academics and practitioners. In the literature, the number of approaches to this problem are increasing, most of which are based on the computation of closest target...
Innovation is one of the main determinants of economic development of modern societies. The extant evidence points to increasing territorial disparities in Europe in relation to innovation. Relying on production theory, we examine the nature of these disparities. In particular, we are interested in finding out whether catching up processes are taki...
This article aims to provide a systemic instrument to evaluate the functioning of higher education systems. Although systemic instruments have had a strong impact on the management of public policy systems in fields such as health and innovation, higher education has not been widely discussed in applying this type of instrument. Herein lies the main...
Today’s global competition and rapid development of information technology have led to the creation of massive amounts of data that are, moreover, exponentially increasing day by day. Analysing these large data sets is a key basis of competition and innovation, and supports new waves of productivity growth. This has challenged organisations to find...
The determination of technical efficiency through the previous estimation of a production frontier has been a relevant topic in the literature related to production theory and engineering. Many parametric and nonparametric approaches have been introduced in the last forty years for estimating production frontiers given a data sample. However, few o...
This paper extends the Camanho and Dyson (2006) one-period Malmquist-type index (CDMI) and the recent pseudo-panel Malmquist index (PPMI) by Aparicio et al. (2017) and Aparicio and Santín (2018) to a context where additive efficiency measures are used. In particular, we apply the Luenberger productivity indicator. Unlike the CDMI, the new approach...
In the literature of Economics, Engineering and Operations Research, the estimation of production frontiers is a current hot topic. Many parametric and nonparametric methodologies have been introduced for estimating technical efficiency of a set of units (for example, firms) from the production frontier. However, few of these methodologies are base...
The aim of this article is to identify strategic groups in the private sector of a higher education system. Classification techniques tend to compare private higher education institutions jointly with their public counterparts, and run the risk of underestimating the impact of the market as a source of differentiati...
The increasing demands for accountability have led to the development of a significant research stream related to the assessment of the performance of higher education institutions. In particular, research in this field has focused on evaluating the efficiency and productivity of higher education institutions in relation to their basic missions, wi...
The measurement of technical efficiency attracts considerable interest in the literature. In fact, following on from the seminal works by Koopmans (1951); Debreu (1951); Shephard (1953); Farrell (1957), a substantial amount of literature has been dedicated to methods for estimating production frontiers and measuring the technical efficiency of prod...
This paper is concerned with introducing a series of new concepts under the name of Economic Cross-Efficiency, which is rendered operational through Data Envelopment Analysis (DEA) techniques. To achieve this goal, from a theoretical perspective, we connect two key topics in the efficiency literature that have been unrelated until now: economic eff...
The measurement of technical efficiency is a topic of great interest. Since the beginning, many researchers have developed new approaches to gauge technical efficiency, mainly in the non-parametric area of Data Envelopment Analysis (DEA). However, the first measures in DEA, the well-known radial models, only accounted for radial inefficiency, which...
Metaheuristic and exact methods are among the most common tools to solve Mixed-Integer optimization Problems (MIP). Most of these problems are NP-hard, making it intractable to obtain optimal solutions in a reasonable time when the size of the problem is huge. In this paper, a hybrid parallel optimization algorithm for matheuristics is studie...
The paper presents an innovative empirical application to assess the efficiency of regional tax offices in Spain. The existing evidence about the performance of those administrative units is still limited, thus our aim is to contribute to extend this line of research by incorporating three relevant issues into our empirical analysis. First, we cons...
In this paper, we introduce a new methodology based on regression trees for estimating production frontiers satisfying fundamental postulates of microeconomics, such as free disposability. This new approach, baptized as Efficiency Analysis Trees (EAT), shares some similarities with the Free Disposal Hull (FDH) technique. However, and in contrast to...
This is a presentation of a work on how to estimate production functions by means of machine learning techniques. In particular, we resort to an adaptation of Regression Trees (CART).
This paper introduces a model to construct composite indicators for performance evaluation of decision making units, which is based upon the determination of the least distance from each assessed unit to a frontier estimated by Data Envelopment Analysis. This generates less-demanding targets from a benchmarking point of view. The model also makes i...
There is variation in how long it takes a golfer to win for the first time on the PGA tour. There are also golfers who never win. We investigate the time to first win using a survival function methodology, including shots gained measures of golfing ability, number of events played, experience prior to joining the tour, the competition, as explanato...
This study introduces the measurement of environmental inefficiency from an economic perspective. We develop our proposal using the latest by-production models that consider two separate and parallel technologies: a standard technology generating good outputs, and a polluting technology for the by-production of bad outputs. While research into envi...
This paper proposes a new method to measure economic inefficiency of decision making units based on the calculation of the least distance to the Pareto-efficient frontier in Data Envelopment Analysis (DEA). While all previously published approaches that have dealt with the problem of determining least distances to the efficient frontier are focus o...
This book includes a spectrum of concepts, such as performance, productivity, operations research, econometrics, and data science, for the practically and theoretically important areas of ‘productivity analysis/data envelopment analysis’ and ‘data science/big data’. Data science is defined as the collection of scientific methods, processes, and sys...
Data envelopment analysis (DEA) has been widely applied to empirically measure the technical efficiency of a set of schools for benchmarking their performance. However, the endogeneity issue in the production of education, which plays a central role in education economics, has received little attention in the DEA literature. Under a DEA framework, en...
We begin by providing contextual background for this book, the second in a series. We continue by proclaiming the importance of efficiency and productivity, for businesses, industries and nations. We then summarise the chapters in the book, which consist of an equal number of advances in the analytical foundations of efficiency and productivity mea...
Overall efficiency measures were introduced in the literature for evaluating the economic performance of firms when reference prices are available. These references are usually observed market prices. Recently, Aparicio and Zofío (2019) have shown that the result of applying cross-efficiency methods (Sexton et al., 1986), yielding an aggregate mult...
In this paper we present a comparison of robust efficiency scores for the scenario in which the specification of the inputs/outputs to be included in the Data Envelopment Analysis (DEA) model is modeled with a probability distribution, through the traditional cross efficiency evaluation procedure. We evaluate the ranking obtained from these scores...
This book surveys the state-of-the-art in efficiency and productivity analysis, examining advances in the analytical foundations and empirical applications. The analytical techniques developed in this book for efficiency provide alternative ways of defining optimum outcome sets, typically as a (technical) production frontier or as an (economic) cos...
Average efficiency is popular in the empirical education literature for comparing the aggregate performance of regions or countries using the efficiency results of their disaggregated decision-making units (DMUs) microdata. The most common approach for calculating average efficiency is to use a set of inputs and outputs from a representative sample...
Given a ranking of elements of a set and given a disjoint partition of the same set, the ranking does not generally imply a total order of the partition. In this paper, we introduce the Kendall-t partition ranking, a linear order of the subsets of the partition which follows from the given ranking. We prove that, under certain assumptions, the Kend...
Cross-efficiency has been developed for the evaluation of cross-sectional data in Data Envelopment Analysis. This paper extends cross-efficiency evaluation introducing the concept of cross-productivity. The extension proposed here is aimed at providing a dynamic peer-evaluation of decision making units, based on the standard Luenberger indicator, w...
In recent decades, there has been widespread literature dedicated to evaluating efficiency in the educational sector, both at the micro level, analyzing the performance of students or schools, and at the macro level, exploring the behavior of regions and even countries. These types of studies have been driven by the proliferation of international da...
Tools and practices in ICT for training teachers on ICT integration in education.
A natural multiplicative efficiency measure for the Constant Returns to Scale Proportional Directional Distance Function (pDDF) is derived, relating its associated linear program to that of the well-known output-oriented radial efficiency measurement model. Based on this relationship, a traditional CCD (Caves, Christensen and Diewert) Malmquist ind...
Questions
Question (1)
Do you have some ideas on how to apply DEA or general efficiency measures to help fight COVID? Thanks!