# Emilio Carrizosa

Universidad de Sevilla | US · Facultad de Matemáticas

PhD

President of math-in, the Spanish network for Maths in Industry.
Professor of Stats & OR at University of Seville

## About

Publications: 230

Reads: 32,884

Citations: 2,665

Introduction

Professor of Operations Research and Statistics. Interested in Data Science, viewed from the perspective of a Mathematical Optimization researcher.
Any type of mathematical modeling issue applicable to real-world problems also makes me feel some type of click.

## Publications

Publications (230)

Many applications in data analysis study whether two categorical variables are independent using a function of the entries of their contingency table. Often, the categories of the variables, associated with the rows and columns of the table, are grouped, yielding a less granular representation of the categorical variables. The purpose of this is to...

In this paper, we tailor optimal randomized regression trees to handle multivariate functional data. A compromise between prediction accuracy and sparsity is sought. Whilst fitting the tree model, the detection of a reduced number of intervals that are critical for prediction, as well as the control of their length, is performed. Local and global sp...

Many real-life applications consider nominal categorical predictor variables that have a hierarchical structure, e.g. economic activity data in Official Statistics. In this paper, we focus on linear regression models built in the presence of this type of nominal categorical predictor variables, and study the consolidation of their categories to hav...

In recent years, supervised classification has been used to support or even replace human decisions in high stakes domains. The training of these algorithms uses historical data which might be biased against individuals with certain sensitive attributes. The increasing concern about potential biases has motivated anti-discrimination laws prohibitin...

Counterfactual explanations have become a very popular interpretability tool to understand and explain how complex machine learning models make decisions for individual instances. Most of the research on counterfactual explainability focuses on tabular and image data and much less on models dealing with functional data. In this paper, a counterfact...

In this paper, we model an optimal regression tree through a continuous optimization problem, where a compromise between prediction accuracy and both types of sparsity, namely local and global, is sought. Our approach can accommodate important desirable properties for the regression task, such as cost-sensitivity and fairness. Thanks to the smoothn...

The Naïve Bayes has proven to be a tractable and efficient method for classification in multivariate analysis. However, features are usually correlated, a fact that violates the Naïve Bayes’ assumption of conditional independence, and may deteriorate the method’s performance. Moreover, datasets are often characterized by a large number of features,...

The Naïve Bayes is a tractable and efficient approach for statistical classification. In general classification problems, the consequences of misclassifications may be rather different in different classes, making it crucial to control misclassification rates in the most critical and, in many real-world problems, minority cases, possibly at the expe...

We propose a method to reduce the complexity of Generalized Linear Models in the presence of categorical predictors. The traditional one-hot encoding, where each category is represented by a dummy variable, can be wasteful, difficult to interpret, and prone to overfitting, especially when dealing with high-cardinality categorical predictors. This p...

In this paper, we tackle the problem of enhancing the interpretability of the results of Cluster Analysis. Our goal is to find an explanation for each cluster, such that clusters are characterized as precisely and distinctively as possible, i.e., the explanation is fulfilled by as many as possible individuals of the corresponding cluster, true posi...

In this paper, we make Cluster Analysis more interpretable with a new approach that simultaneously allocates individuals to clusters and gives rule-based explanations to each cluster. The traditional homogeneity metric in clustering, namely the sum of the dissimilarities between individuals in the same cluster, is enriched by considering also, for...

Due to the increasing use of Machine Learning models in high stakes decision making settings, it has become increasingly important to be able to understand how models arrive at decisions. Assuming an already trained Supervised Classification model, an effective class of post-hoc explanations are counterfactual explanations, i.e., a set of actions t...

In this paper, we design a Branch and Bound algorithm based on interval arithmetic to address nonconvex robust optimization problems. This algorithm provides the exact global solution of such difficult problems arising in many real life applications. A code was developed in MatLab and was used to solve some robust nonconvex problems with few variab...

Dynamic optimisation provides complex challenges for optimal solution, but greatly increases applicability when considering time-dependent situations. In this work, a constrained dynamic optimisation problem is analysed and subsequently applied to the resolution of a real-world engineering problem concerning Solar Power Tower plants. We study the...

The Lasso has become a benchmark data analysis procedure, and numerous variants have been proposed in the literature. Although the Lasso formulations are stated so that overall prediction error is optimized, no full control over the accuracy prediction on certain individuals of interest is allowed. In this work we propose a novel version of the Las...

In this paper, our goal is to enhance the interpretability of Generalized Linear Models by identifying the most relevant interactions between categorical predictors. In the presence of categorical predictors, searching for interaction effects can quickly become a highly combinatorial problem when we have many categorical predictors or even a few, b...

Since the seminal paper by Bates and Granger in 1969, a vast number of ensemble methods that combine different base regressors to generate a unique one have been proposed in the literature. The so-obtained regressor method may have better accuracy than its components, but at the same time it may overfit, it may be distorted by base regressors with...

Classification and regression trees, as well as their variants, are off-the-shelf methods in Machine Learning. In this paper, we review recent contributions within the Continuous Optimization and the Mixed-Integer Linear Optimization paradigms to develop novel formulations in this research area. We compare those in terms of the nature of the decisi...

Classification and Regression Trees (CARTs) are off-the-shelf techniques in modern Statistics and Machine Learning. CARTs are traditionally built by means of a greedy procedure, sequentially deciding the splitting predictor variable(s) and the associated threshold. This greedy approach trains trees very fast, but, by its nature, their classificatio...

In this article we consider an aggregate loss model with dependent losses. The loss occurrence process is governed by a two-state Markovian arrival process (MAP 2), a Markov renewal process that allows for (1) correlated inter-loss times, (2) non-exponentially distributed inter-loss times and, (3) overdisperse loss counts. Some quantities of intere...

This paper investigates how the production policy, as well as other factors, affect the facility location-allocation decisions. We focus on a p-median location problem in which one single perishable product is to be produced and shipped to a set of users. The time-correlated demands of the clients are generated by autoregressive processes, and they...

When continuously monitoring processes over time, data is collected along a whole period, from which only certain time instants and certain time intervals may play a crucial role in the data analysis. We develop a method that addresses the problem of selecting a finite and small set of short intervals (or instants) able to capture the information n...

Since the seminal paper by Bates and Granger in 1969, a vast number of ensemble methods that combine different base regressors to generate a unique one have been proposed in the literature. The so-obtained regressor method may have better accuracy than its components, but at the same time it may overfit, it may be distorted by base regressors with...

In this paper, we model an optimal regression tree through a continuous optimization problem, where a compromise between prediction accuracy and both types of sparsity, namely local and global, is sought. Our approach can accommodate important desirable properties for the regression task, such as cost-sensitivity and fairness. Thanks to the smoothnes...

Support vector machines (SVMs) are widely used and constitute one of the best examined and used machine learning models for 2-class classification. Classification in SVM is based on a score procedure, yielding a deterministic classification rule, which can be transformed into a probabilistic rule (as implemented in off-the-shelf SVM libraries), but...

In this paper, we propose a mathematical optimization approach to cluster the rows and/or columns of contingency tables to detect possible statistical dependencies among the observed variables. With this, we obtain a clustered contingency table of smaller size, which is desirable when interpreting the statistical dependence results of the observed...

In this paper, we study linear regression models built on categorical predictor variables that have a hierarchical structure. For such variables, the categories are arranged as a directed tree, where the categories in the leaf nodes give the highest granularity in the representation of the variable. Instead of taking the fully detailed model, the u...

COVID-19 is an infectious disease that was first identified in China in December 2019. Subsequently COVID-19 started to spread broadly, to also arrive in Spain by the end of January 2020. This pandemic triggered confinement measures, in order to reduce the expansion of the virus so as not to saturate the health care system. With the aim of providi...

Identifying key members in a social network is critical to understand the underlying system behavior. Whereas there are several measures designed to discern the most central member, they fail to identify a central set of members and at the same time reveal the spheres of influence of the individuals in such central set. Here, we combine eigenvector...

One of the main challenges researchers face is to identify the most relevant features in a prediction model. As a consequence, many regularized methods seeking sparsity have flourished. Although sparse, their solutions may not be interpretable in presence of spurious coefficients and correlated features. In this paper we aim to enhance interpretabi...

Decision trees are popular Classification and Regression tools and, when small-sized, easy to interpret. Traditionally, a greedy approach has been used to build the trees, yielding a very fast training process; however, controlling sparsity (a proxy for interpretability) is challenging. In recent studies, optimal decision trees, where all decisions...

Exploratory Factor Analysis (EFA) is a widely used statistical technique to discover the structure of latent unobserved variables, called factors, from a set of observed variables. EFA exploits the property of rotation invariance of the factor model to enhance factors' interpretability by building a sparse loading matrix. In this paper, we propose...

Optimising the aiming strategy is crucial for Solar Power Tower plants, in order to maximise the energy generated, whilst also preventing catastrophic damage to receiver components. In this work, a bi-objective optimisation model is developed to find optimal aiming strategies for a Solar Power Tower plant. The primary objective to maximise the radi...

In this paper we propose an optimization model and a solution approach to visualize datasets which are made up of individuals observed along different time periods. These individuals have attached a time-dependent magnitude and a dissimilarity measure, which may vary over time. Difference of convex optimization techniques, namely, the so-called Dif...

We consider a nonlinear version of the Uncapacitated Facility Location Problem (UFLP). The total cost in consideration consists of a fixed cost to open facilities, a travel cost in proportion to the distance between demand and the assigned facility, and an operational cost at each open facility, which is assumed to be a concave nondecreasing functi...

In covering location models, one seeks the location of facilities optimizing the weight of individuals covered, i.e., those at the distance from the facilities below a threshold value. Attractive facilities are wished to be close to the individuals, and thus the covering is to be maximized, while for repulsive facilities the covering is to be minim...

When classification methods are applied to high-dimensional data, selecting a subset of the predictors may lead to an improvement in the predictive ability of the estimated model, in addition to reducing the model complexity. In Functional Data Analysis (FDA), i.e., when data are functions, selecting a subset of predictors corresponds to selecting...

Functional Data Analysis (FDA) is devoted to the study of data which are functions. Support Vector Machine (SVM) is a benchmark tool for classification, in particular, of functional data. SVM is frequently used with a kernel (e.g.: Gaussian) which involves a scalar bandwidth parameter. In this paper, we propose to use kernels with functional bandwi...

Soiling of heliostat surfaces due to local climate has a direct impact on their optical efficiency and therefore a direct impact on the productivity of the Solar Power Tower plant. Cleaning techniques applied are dependent on plant construction and current schedules are normally developed considering heliostat layout patterns, providing sub-optimal...

Decision trees are popular Regression and Classification tools, easy to interpret and with excellent performance. The training process is very fast, since a greedy approach is used to build the tree. In recent studies, optimal decision trees, where all decisions are optimized simultaneously, have shown a better learning performance. In this paper,...

Inclement weather effects have a direct impact on the efficiency of a Solar Power Tower plant and have the potential to damage the receiver by flash heating. An optimised aiming strategy for the heliostat field mitigates the risk of receiver damage and maximises plant efficiency. A stochastic integer programming approach is applied to optimise the...

Classification and Regression Trees (CARTs) are an off-the-shelf technique in modern Statistics and Machine Learning. CARTs are traditionally built by means of a greedy procedure, sequentially deciding the splitting predictor variable(s) and the associated threshold. This greedy approach trains trees very fast, but, by its nature, the classificatio...

Support vector machine (SVM) is a powerful tool in binary classification, known to attain excellent misclassification rates. On the other hand, many real-world classification problems, such as those found in medical diagnosis, churn or fraud prediction, involve misclassification costs which may be different in the different classes. However, it may...

In this article we develop a novel online framework to visualize news data over a time horizon. First, we perform a Natural Language Processing analysis, wherein the words are extracted, and their attributes, namely the importance and the relatedness, are calculated. Second, we present a Mathematical Optimization model for the visualization problem...

Feature Selection is a crucial procedure in Data Science tasks such as Classification, since it identifies the relevant variables, making thus the classification procedures more interpretable, cheaper in terms of measurement and more effective by reducing noise and data overfit. The relevance of features in a classification procedure is linked to t...

In this paper we develop a new framework to visualize datasets which are made up of individuals observed along different time periods. These individuals have attached a time-dependent magnitude and a dissimilarity measure, which may vary over time. A mathematical optimization model is proposed and solved by means of difference of convex optimizatio...

In this paper we develop an online tool to visualize news data over a time horizon. First, we perform a Natural Language Processing analysis, where the words are extracted, and their attributes, namely the importance and the relatedness, are calculated. Second, we present a Mathematical Optimization model for the visualization problem and a numeric...

Poster presented at the SolarPACES 2017 conference in Santiago, Chile.

In this work we present a new methodology for solving an inverse identification problem with application in chemistry, using two approaches in cascade. More precisely, we are interested in the identification of kinetic models and their corresponding parameters in stirred tank reactors, using a set of experimental data and the reactions taking place...

In this paper we address the problem of visualizing a frequency distribution and an adjacency relation attached to a set of individuals. We represent this information using a rectangular map, i.e., a subdivision of a rectangle into rectangular portions so that each portion is associated with one individual, their areas reflect the frequencies, and...