# Rafael Blanquero

Universidad de Sevilla | US · Facultad De Matemáticas

PhD

## About

Publications: 53

Reads: 8,152


Citations: 541


Additional affiliations

January 1996 - present

## Publications

Publications (53)

In this paper, we tailor optimal randomized regression trees to handle multivariate functional data. A compromise between prediction accuracy and sparsity is sought. Whilst fitting the tree model, the detection of a reduced number of intervals that are critical for prediction, as well as the control of their length, is performed. Local and global sp...

In this paper, we model an optimal regression tree through a continuous optimization problem, where a compromise between prediction accuracy and both types of sparsity, namely local and global, is sought. Our approach can accommodate important desirable properties for the regression task, such as cost-sensitivity and fairness. Thanks to the smoothn...

The Naïve Bayes has proven to be a tractable and efficient method for classification in multivariate analysis. However, features are usually correlated, a fact that violates the Naïve Bayes’ assumption of conditional independence, and may deteriorate the method’s performance. Moreover, datasets are often characterized by a large number of features,...

The Naïve Bayes is a tractable and efficient approach for statistical classification. In general classification problems, the consequences of misclassifications may be rather different in different classes, making it crucial to control misclassification rates in the most critical and, in many real-world problems, minority cases, possibly at the expe...

The Lasso has become a benchmark data analysis procedure, and numerous variants have been proposed in the literature. Although the Lasso formulations are stated so that overall prediction error is optimized, they allow no full control over the prediction accuracy on certain individuals of interest. In this work we propose a novel version of the Las...

Classification and Regression Trees (CARTs) are off-the-shelf techniques in modern Statistics and Machine Learning. CARTs are traditionally built by means of a greedy procedure, sequentially deciding the splitting predictor variable(s) and the associated threshold. This greedy approach trains trees very fast, but, by its nature, their classificatio...

When continuously monitoring processes over time, data is collected along a whole period, from which only certain time instants and certain time intervals may play a crucial role in the data analysis. We develop a method that addresses the problem of selecting a finite and small set of short intervals (or instants) able to capture the information n...



Support vector machines (SVMs) are widely used and constitute one of the best examined and used machine learning models for 2-class classification. Classification in SVM is based on a score procedure, yielding a deterministic classification rule, which can be transformed into a probabilistic rule (as implemented in off-the-shelf SVM libraries), but...

Decision trees are popular Classification and Regression tools and, when small-sized, easy to interpret. Traditionally, a greedy approach has been used to build the trees, yielding a very fast training process; however, controlling sparsity (a proxy for interpretability) is challenging. In recent studies, optimal decision trees, where all decisions...


When classification methods are applied to high-dimensional data, selecting a subset of the predictors may lead to an improvement in the predictive ability of the estimated model, in addition to reducing the model complexity. In Functional Data Analysis (FDA), i.e., when data are functions, selecting a subset of predictors corresponds to selecting...

Functional Data Analysis (FDA) is devoted to the study of data which are functions. Support Vector Machine (SVM) is a benchmark tool for classification, in particular, of functional data. SVM is frequently used with a kernel (e.g.: Gaussian) which involves a scalar bandwidth parameter. In this paper, we propose to use kernels with functional bandwi...


Decision trees are popular Regression and Classification tools, easy to interpret and with excellent performance. The training process is very fast, since a greedy approach is used to build the tree. In recent studies, optimal decision trees, where all decisions are optimized simultaneously, have shown a better learning performance. In this paper,...

Classification and Regression Trees (CARTs) are an off-the-shelf technique in modern Statistics and Machine Learning. CARTs are traditionally built by means of a greedy procedure, sequentially deciding the splitting predictor variable(s) and the associated threshold. This greedy approach trains trees very fast, but, by its nature, the classificatio...

Support vector machine (SVM) is a powerful tool in binary classification, known to attain excellent misclassification rates. On the other hand, many real-world classification problems, such as those found in medical diagnosis, churn or fraud prediction, involve misclassification costs which may be different in the different classes. However, it may...

Feature Selection is a crucial procedure in Data Science tasks such as Classification, since it identifies the relevant variables, thus making the classification procedures more interpretable, cheaper in terms of measurement, and more effective by reducing noise and overfitting. The relevance of features in a classification procedure is linked to t...

A fundamental problem in the analysis of chemical reaction networks consists of identifying concentration values, along time or in steady state, which are coherent with the available experimental concentration data. When concentration measurements are incomplete, either because information is missing about the concentration of a species at a part...

Official Statistics call for data by individual age, since a significant number of statistical operations, such as the calculation of demographic indicators, require the use of ungrouped population figures. However, in some countries or regions population data are only available in a grouped form, usually as quinquennial age groups plus a large ope...

Model inference is a challenging problem in the analysis of chemical reaction networks. In order to empirically test which, out of a catalogue of proposed kinetic models, is governing a network of chemical reactions, the user can compare the empirical data obtained in one experiment against the theoretical values suggested by the models under cons...

The p-facility Huff location problem aims at locating facilities in a competitive environment so as to maximize the market share. While it has been deeply studied in the field of continuous location, in this paper we study the p-facility Huff location problem on networks formulated as a Mixed Integer Nonlinear Programming problem that can be solved...

In certain countries population data are available in grouped form only, usually as quinquennial age groups plus a large open-ended range for the elderly. However, official statistics call for data by individual age since many statistical operations, such as the calculation of demographic indicators, require the use of ungrouped population data. In...

Covering problems are well studied in the Operations Research literature under the assumption that both the set of users and the set of potential facilities are finite. In this paper, we address the following variant, which leads to a Mixed Integer Nonlinear Program (MINLP): locations of p facilities are sought along the edges of a network so that...

Where to locate one or several facilities on a network so as to minimize the expected transportation cost from users to their closest facility is a problem well studied in the OR literature under the name of median problem. In the median problem users are usually identified with nodes of the network. In many situations, however, such an assumption is unrealistic, s...

Huff location problems have been extensively analyzed within the field of competitive continuous location. In this work, two Huff location models on networks are addressed, by considering that users go directly to the facility or they visit the facility on their way to a destination. Since the problems are multimodal, a branch and bound algorithm i...

What is the optimum way of describing the age-specific fertility pattern by mathematical functions? We propose a parametric fitting model, based on a mixture of Weibull functions, which performs well for countries where the fertility curve shows a non-traditional pattern. We also consider a simplified version of this model with a reduced number of...
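A minimal sketch of what a mixture of Weibull functions looks like numerically. It assumes the standard two-parameter Weibull density and a plain weighted sum; the function names, the parameter layout, and the idea of feeding in an age offset as `x` are illustrative assumptions, not the parametrization used in the paper.

```python
import math

def weibull_pdf(x, shape, scale):
    """Standard two-parameter Weibull density; treated as 0 for x <= 0 (simplification)."""
    if x <= 0:
        return 0.0
    z = x / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-(z ** shape))

def weibull_mixture(x, weights, shapes, scales):
    """Weighted sum of Weibull densities (hypothetical mixture form)."""
    return sum(w * weibull_pdf(x, k, s)
               for w, k, s in zip(weights, shapes, scales))

# Example: two components evaluated at a hypothetical age offset of 12 years.
value = weibull_mixture(12.0, weights=[0.7, 0.3], shapes=[2.5, 4.0], scales=[12.0, 20.0])
```

A mixture with more than one component can produce the two-humped curves that a single parametric density cannot, which is presumably why it suits non-traditional fertility patterns.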

This paper addresses the design of a statistical model for the dynamic prediction of drought in Andalusia and of its persistence over a 12-month time horizon, based on historical data (1950-2012) of the Standardized Drought Pluviometric Index (IESP) at 243 observatories in Andalusia. A kNN (k-Nearest Neighbors) algorithm is used, ...

A new model for locating a competitive facility in the plane in a robust way is presented and embedded in the literature on robustness in facility location. Its mathematical properties are investigated and new sharp bounds for a deterministic method that guarantees the global optimum are derived and evaluated.

A global optimization procedure is proposed to find a line in the Euclidean three-dimensional space which minimizes the sum of distances to a given finite set of three-dimensional data points. Although we use techniques similar to those for location problems in two dimensions, it is shown that the problem becomes much harder to solve. However, a pro...

We address the following single-facility location problem: a firm is entering into a market by locating one facility in a region of the plane. The demand captured from each user by the facility will be proportional to the user's buying power and inversely proportional to a function of the user-facility distance. Uncertainty exists on the buying powe...
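A minimal sketch of the demand-capture rule stated in this abstract: each user contributes their buying power divided by a function of the user-facility distance. The specific distance function `d**power + eps`, and the names `captured_demand`, `power` and `eps`, are assumptions made for illustration only; the abstract does not specify them.

```python
import math

def captured_demand(facility, users, buying_power, power=2.0, eps=1e-9):
    """Sum over users of w_i / g(d_i), with g(d) = d**power + eps (assumed form)."""
    total = 0.0
    for (ux, uy), w in zip(users, buying_power):
        # Euclidean distance between the user and the candidate facility site
        d = math.hypot(facility[0] - ux, facility[1] - uy)
        total += w / (d ** power + eps)
    return total

# Example: compare two candidate sites for the same users.
users = [(1.0, 0.0), (0.0, 2.0)]
weights = [4.0, 8.0]
site_a = captured_demand((0.0, 0.0), users, weights)
site_b = captured_demand((0.0, 1.5), users, weights)
```

The small `eps` only guards against division by zero when a site coincides with a user; it is not part of the model described above.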

It is well known that, if a vector-valued function can be written as a difference of componentwise convex functions, the norm of such a function inherits this property. In this note we show that, if the norm in use is monotonic in the positive orthant and the functions are non-negative, a sharper decomposition can be obtained.

The Big Triangle Small Triangle method has been shown to be a powerful global optimization procedure to address continuous location problems. In the paper published in J. Global Optim. (37:305–319, 2007), Drezner proposes a rather general and effective approach for constructing the bounds needed. Such bounds are obtained by using the fact that the objec...

We address the problem of locating objects in the plane such as segments, arcs of circumferences, arbitrary convex sets, their complements or their boundaries. Given a set of points, we seek the rotation and translation for such an object optimizing a very general performance measure, which includes as a particular case the classical objectives in...

Several multi-criteria-decision-making methodologies assume the existence of weights associated with the different criteria, reflecting their relative importance. One of the most popular ways to infer such weights is the analytic hierarchy process, which constructs first a matrix of pairwise comparisons, from which weights are derived following one...

One of the practical handicaps for the application of the percolation theory to estimate the percolation threshold of drugs in controlled release systems is the fact that the dissolution studies must be carried out so that only one surface of the tablet is exposed to the dissolution medium. The aim of this work is to estimate the percolation thresh...

In this paper we address the biobjective problem of locating a semiobnoxious facility, that must provide service to a given set of demand points and, at the same time, has some negative effect on given regions in the plane. In the model considered, the location of the new facility is selected in such a way that it gives answer to these contradictin...

In this note we address the problem of finding the GM-estimator for the location parameter of a univariate random variable. When this problem is non-convex but d.c., one can use a standard covering method, which, in the one-dimensional case, has a simple form. In this paper we exploit the structure of the problem in order to obtain d.c. decomposition...

In this paper, we show that a DC representation can be obtained explicitly for the composition of a gauge with a DC mapping, so that the optimization of certain functions involving terms of this kind can be made by using standard DC optimization techniques. Applications to facility location theory and multiple-criteria decision making are presented...

Covering methods constitute a broad class of algorithms for solving multivariate Global Optimization problems. In this note we show that, when the objective f is d.c. and a d.c. decomposition for f is known, the computational burden usually suffered by multivariate covering methods is significantly reduced. With this we extend to the (non-different...

In this paper we address the problem of locating a semiobnoxious facility in the plane, where the transportation costs are measured on a network and the environmental costs are measured through planar distances. Since the facility is not forced to be located in the network, the construction of a link connecting them may be necessary. We derive so...

Here we consider the multiobjective quadratic problem with convex objective functions:

$$
(MQP) \qquad \min_{x \in \mathbb{R}^n} \left( f_1(x), \ldots, f_m(x) \right) \tag{1}
$$

where $f_i(x) = \frac{1}{2} x^t A_i x + b_i^t x$, $i = 1, \ldots, m$, are convex functions, $x, b_i \in \mathbb{R}^n$, and $A_i \in M_{n \times n}$. We present the re...
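A minimal sketch of how the objective vector of (MQP) can be evaluated and how two candidate points can be compared. The function names `quadratic_objectives` and `dominates`, and the use of Pareto dominance as the comparison, are illustrative assumptions; the truncated abstract does not say which solution concept or method the paper uses.

```python
def quadratic_objectives(x, As, bs):
    """Return [f_1(x), ..., f_m(x)] with f_i(x) = 1/2 x^t A_i x + b_i^t x."""
    n = len(x)
    values = []
    for A, b in zip(As, bs):
        quad = sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))
        lin = sum(b[i] * x[i] for i in range(n))
        values.append(0.5 * quad + lin)
    return values

def dominates(fa, fb):
    """True if fa is no worse than fb in every objective and better in at least one."""
    return (all(a <= b for a, b in zip(fa, fb))
            and any(a < b for a, b in zip(fa, fb)))

# Example with two convex objectives in R^2 (data chosen only for illustration).
As = [[[2.0, 0.0], [0.0, 2.0]], [[2.0, 0.0], [0.0, 2.0]]]
bs = [[0.0, 0.0], [-2.0, -2.0]]
f_at_origin = quadratic_objectives([0.0, 0.0], As, bs)
f_at_ones = quadratic_objectives([1.0, 1.0], As, bs)
```

In this example neither point dominates the other, which illustrates why a multiobjective problem generally has a set of efficient solutions rather than a single minimizer.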

INTRODUCTION Descriptive studies of antimicrobial use in the hospital setting, based on quantification in Defined Daily Doses per 100 bed-days (DDD/100 bed-days), are a useful tool for monitoring the use of these drugs. They are therefore of great interest for making comparisons between centers or...

## Projects

Projects (2)

NeEDS (Network of European Data Scientists) provides an integrated modelling and computing environment that facilitates data analysis and data visualization to enhance interaction. NeEDS brings together an excellent interdisciplinary research team that integrates expertise from three relevant academic disciplines, Mathematical Optimization, Visualization and Network Science, and is excellently placed to tackle the challenges. NeEDS develops mathematical models, yielding results which are interpretable, easy-to-visualize, and flexible enough to incorporate user knowledge from complex data. These models require the numerical resolution of computationally demanding Mixed Integer Nonlinear Programming formulations, and for this purpose NeEDS develops innovative mathematical optimization based heuristics.

New mathematical programming problems will be provided to address Supervised Classification problems in which misclassification costs are class-dependent. Exact and heuristic approaches will be designed and tested.