Fig 7 - uploaded by Dolores Romero Morales


# The timeline of building the base regressors in F, solving Problem (1) to obtain the sparse ensemble for a given value of λ, and making the out-of-sample predictions.

Source publication

Since the seminal paper by Bates and Granger in 1969, a vast number of ensemble methods that combine different base regressors into a single one have been proposed in the literature. The resulting ensemble may have better accuracy than its components, but at the same time it may overfit, and it may be distorted by base regressors with...

## Contexts in source publication

**Context 1**

... due to the small amount of data and the lack of observations in some regions. Such cross-validation estimates are used to select the best values of the parameters. With those best values, for each combination of feeding data and methodology, the base regressors f ∈ F are built using information from t ∈ {1, …, T − 4}; see Fig. 7 ...

**Context 2**

... complete procedure for making short-term predictions with our selective sparse ensemble methodology is summarized in Algorithm 1 and can be visualized in Fig. 7. For the tests considered in this section, this grid is wide enough. On one extreme, we have included the trivial value λ = 0, for which the selective sparsity term does not play a role. On the other extreme, with this grid we ensure that λ = λ• is reached, for which, by Proposition 1, the ensemble shows the highest level of ...
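The selection step described above can be illustrated with a toy sketch: for each λ in a grid (including the trivial λ = 0), pick a subset of base regressors and least-squares weights, trading fit against a sparsity penalty. This is not the paper's Problem (1), whose selective-sparsity term and constraints are more involved; it is a generic cardinality-penalized variant, and every name and data value below is invented for illustration.

```python
import itertools
import numpy as np

def sparse_ensemble_weights(P, y, lam):
    """Choose a subset S of base regressors and least-squares weights on S,
    minimizing ||P[:, S] @ w - y||^2 + lam * |S| by brute-force enumeration.
    P is the (n, m) matrix of base-regressor predictions, y the (n,) target."""
    n, m = P.shape
    best_loss, best_w = np.inf, None
    for k in range(1, m + 1):
        for S in itertools.combinations(range(m), k):
            idx = list(S)
            w, *_ = np.linalg.lstsq(P[:, idx], y, rcond=None)
            loss = np.sum((P[:, idx] @ w - y) ** 2) + lam * k
            if loss < best_loss:
                best_w = np.zeros(m)
                best_w[idx] = w
                best_loss = loss
    return best_w

rng = np.random.default_rng(0)
y = rng.normal(size=50)
# three toy base regressors: two informative, one pure noise
P = np.column_stack([y + 0.1 * rng.normal(size=50),
                     y + 0.2 * rng.normal(size=50),
                     rng.normal(size=50)])
for lam in [0.0, 1.0, 10.0]:   # a small grid including the trivial lambda = 0
    w = sparse_ensemble_weights(P, y, lam)
    print(lam, np.flatnonzero(w))
```

Larger λ prices each extra base regressor more heavily, so the support of the weight vector shrinks as λ grows, mirroring the role of the grid's two extremes in the text.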

## Similar publications

Nonconvex constrained optimization problems can be used to model a number of machine learning problems, such as multi-class Neyman-Pearson classification and constrained Markov decision processes. However, such kinds of problems are challenging because both the objective and constraints are possibly nonconvex, so it is difficult to balance the redu...

## Citations

... On the other hand, the government of Andalucía reported the accumulated demand of ICUs. Numerous studies have been published during the pandemic all around the world in which different estimations of the demand were proposed with different statistical methodologies (see, e.g., Benítez-Peña et al. (2021); Garcia-Vicuña et al. (2022); Mahmoudi et al. (2020), among many others). We adopt a simplified estimation of the demand. ...

In this paper we provide a mathematical programming based decision tool to optimally reallocate and share equipment between different units to efficiently equip hospitals in pandemic emergency situations under a lack of resources. The approach is motivated by the COVID-19 pandemic, in which many National Health Systems were not able to satisfy the demand for ventilators, individual protective equipment, or different human resources. Our tool is based on two main principles: (1) part of the stock of equipment at a unit that is not needed (in the near future) can be shared with other units; and (2) extra stock to be shared among the units in a region can be efficiently distributed taking into account the demand of the units. The decisions are taken with the aim of minimizing certain measures of the non-covered demand in a region where units are structured in a given network. The mathematical programming models that we provide are stochastic and multiperiod, with different robust objective functions. Since the proposed models are computationally hard to solve, we provide a divide-and-conquer matheuristic approach. We report the results of applying our approach to the COVID-19 case in different regions of Spain, highlighting some interesting conclusions of our analysis, such as the great increase in treated patients when the proposed redistribution tool is applied.
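As a much-simplified illustration of principles (1) and (2) above, the toy function below redistributes surplus stock in a single period, assigning a pooled surplus to the largest unmet demands first. The actual model in the cited paper is a stochastic, multiperiod mathematical program solved with a matheuristic; the function name, greedy rule, and data here are all invented for illustration.

```python
def greedy_reallocate(stock, demand):
    """Toy one-period sharing rule: units keep what they need, surplus goes
    into a regional pool, and the pool is assigned to the largest unmet
    demands first.  Returns the covered demand per unit."""
    surplus = sum(max(s - d, 0) for s, d in zip(stock, demand))
    covered = [min(s, d) for s, d in zip(stock, demand)]
    deficits = sorted(((d - c, i)
                       for i, (d, c) in enumerate(zip(demand, covered))
                       if d > c),
                      reverse=True)
    for gap, i in deficits:
        ship = min(gap, surplus)   # ship what the pool can still provide
        covered[i] += ship
        surplus -= ship
    return covered

print(greedy_reallocate([10, 2, 0], [4, 6, 5]))  # → [4, 3, 5]
```

Without sharing, the three units would cover only 4 + 2 + 0 = 6 units of demand; pooling the surplus raises the total covered demand to 12, which is the point of the redistribution principle.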

... The approach in [12] was recently taken up in [4] in the context of predicting time series from the COVID-19 pandemic, and an alternative penalizing term was introduced that is similar to the approach used in the LASSO regression model, where, instead of models, variables are selected based on their marginal distribution. ...

Automated model selection is often proposed to users to choose which machine learning model (or method) to apply to a given regression task. In this paper, we show that combining different regression models can yield better results than selecting a single ('best') regression model, and outline an efficient method that obtains an optimally weighted convex linear combination from a heterogeneous set of regression models. More specifically, in this paper, a heuristic weight optimization, used in a preceding conference paper, is replaced by an exact optimization algorithm using convex quadratic programming. We prove convexity of the quadratic programming formulation for the straightforward formulation and for a formulation with weighted data points. The novel weight optimization is not only (more) exact but also more efficient. The methods we develop in this paper are implemented and made available as open source via GitHub. They can be executed on commonly available hardware and offer a transparent and easy-to-interpret interface. The results indicate that the approach outperforms model selection methods on a range of data sets, including data sets with mixed variable types from drug discovery applications.
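The optimally weighted convex combination described above amounts to a convex quadratic program: minimize ||Pw − y||² over the probability simplex, where the columns of P are the base models' predictions. The cited paper solves this exactly via quadratic programming; the sketch below instead uses projected gradient descent with a sort-based simplex projection, purely as an illustrative stand-in (all names are assumptions, not the authors' code).

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the simplex {w : w >= 0, sum(w) = 1}
    (sort-based algorithm)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, v.size + 1) > (css - 1.0))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def convex_combination(P, y, steps=2000):
    """Projected gradient for min_w ||P @ w - y||^2 s.t. w in the simplex."""
    m = P.shape[1]
    w = np.full(m, 1.0 / m)                       # start from the uniform mix
    lr = 1.0 / (2.0 * np.linalg.norm(P, 2) ** 2)  # step from the Lipschitz bound
    for _ in range(steps):
        w = project_simplex(w - lr * 2.0 * P.T @ (P @ w - y))
    return w

y = np.linspace(0.0, 1.0, 30)
preds = np.column_stack([y, y + 0.5, 0.5 * y])   # the first "model" is exact
w = convex_combination(preds, y)
print(np.round(w, 3))
```

Since the objective is convex and the simplex is a convex set, any QP solver would find the same optimum; projected gradient is used here only to keep the sketch dependency-free.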

... Thus, addressing the problem of estimating smooth curves which satisfy, for instance, non-negativity conditions becomes a must to avoid situations like the one shown in Figure 1, which actually appeared in the Spanish media in October 2020. Furthermore, being able to simulate different constrained prediction scenarios by incorporating expert knowledge is a challenge which has not been fully solved by existing short-term prediction approaches [1,2,22]. The methodology proposed in this paper allows the user to constrain the out-of-range predictions to emulate, for example, the evolution of the pandemic under different conditions, such as the growth rate during the second wave doubling or halving that of the first wave. ...

... Estimating f is, in general, challenging and many possibilities exist. In this work, the penalized regression smoothing spline approach, also known as P-splines [20], is used to estimate f in (1). P-splines consist of a basis function approach using splines for regression, together with a penalization term. ...

In an era when the decision-making process is often based on the analysis of complex and evolving data, it is crucial to have systems which allow one to incorporate human knowledge and provide valuable support to the decision maker. In this work, statistical modelling and mathematical optimization paradigms merge to address the problem of estimating smooth curves which satisfy structural properties, both in the observed domain in which data have been gathered and beyond it. We assume that the smooth curve to be estimated is defined through a reduced-rank basis (B-splines) and fitted via a penalized splines approach (P-splines). In order to incorporate requirements about the sign, monotonicity, and curvature in the fitting procedure, a conic programming approach is developed which, for the first time, successfully conveys out-of-range constrained prediction. In summary, the contributions of this paper are fourfold: first, a mathematical optimization formulation for the estimation of non-negative P-splines is proposed; second, previous results are generalized for the first time to the out-of-range prediction framework; third, both approaches, namely non-negative smoothing and out-of-sample prediction, are extended to other shape constraints and to multiple curves fitting; and fourth, the approaches proposed in this paper have been implemented in the open source Python library cpsplines. The methodologies presented in this paper are illustrated using simulated instances and data on the evolution of the COVID-19 pandemic and on mortality rates for different age groups.
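The P-spline idea referenced above (a spline basis regression plus a difference penalty on the coefficients) reduces to a ridge-type linear solve. Real P-splines use a B-spline basis; to stay self-contained, the toy below substitutes Gaussian bumps, so it illustrates only the penalty structure, not cpsplines or the cited paper's conic programming approach. All names and parameter values are assumptions.

```python
import numpy as np

def penalized_smooth(x, y, n_basis=20, lam=1.0):
    """Penalized basis regression in the spirit of P-splines:
    minimize ||B @ a - y||^2 + lam * ||D @ a||^2, where D takes second
    differences of the coefficients.  A Gaussian-bump basis B stands in
    for the B-spline basis of a real P-spline fit."""
    knots = np.linspace(x.min(), x.max(), n_basis)
    width = knots[1] - knots[0]
    B = np.exp(-0.5 * ((x[:, None] - knots[None, :]) / width) ** 2)
    D = np.diff(np.eye(n_basis), n=2, axis=0)      # (n_basis - 2, n_basis)
    a = np.linalg.solve(B.T @ B + lam * D.T @ D, B.T @ y)
    return B @ a

x = np.linspace(0.0, 1.0, 100)
rng = np.random.default_rng(0)
true = np.sin(2.0 * np.pi * x)
noisy = true + 0.3 * rng.normal(size=x.size)
fit = penalized_smooth(x, noisy)
```

Larger `lam` shrinks the second differences of the coefficients toward zero, pushing the fit toward a straight line; `lam = 0` recovers an unpenalized basis regression. Shape constraints such as non-negativity would add inequality constraints to this solve, which is where the conic programming machinery of the paper comes in.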

... selecting the features that have the greatest impact on the model as a whole [2,4,35], but also knowing these locally for the decision made for each individual [24,25,28]. ...

Counterfactual explanations have become a very popular interpretability tool to understand and explain how complex machine learning models make decisions for individual instances. Most of the research on counterfactual explainability focuses on tabular and image data and much less on models dealing with functional data. In this paper, a counterfactual analysis for functional data is addressed, in which the goal is to identify the samples of the dataset that the counterfactual explanation is made of, as well as how they are combined so that the individual instance and its counterfactual are as close as possible. Our methodology can be used with different distance measures for multivariate functional data and is applicable to any score-based classifier. We illustrate our methodology using two real-world datasets, one univariate and one multivariate.

... The ensemble model predicts the outbreak 7 days ahead for hospitalized and ICU patients. United Kingdom: ARIMA model [79], nonlinear autoregressive artificial neural network (ANN) [80]. The challenge of exponential growth should be combated with aggressive interventions. To control the pandemic and its infection at the hospital level, there is a need to adopt rapid control measures. ...

The world is currently overwhelmed with the perils of the outbreak of the coronavirus disease 2019 (COVID-19) pandemic. As of May 18, 2020, there were 4,819,102 confirmed cases, of which there were 316,959 deaths worldwide. The devastating effects of the COVID-19 pandemic on the world economy are more grievous than many natural disasters like earthquakes and tsunamis in history. Understanding the spread pattern of COVID-19 and predicting the disease dynamics have been essential to assist policymakers and health practitioners in the public and private health sector in providing an efficient way of alleviating the effects of the pandemic across continents. Scholars have steadily worked to provide timely information. Nevertheless, there is a lack of information on which insights can be derived from all these endeavors, especially with regard to modeling and prediction techniques. In this study, we used a literature synthesis approach to provide a narrative review of the current research efforts geared toward predicting the spread of COVID-19 across continents. Such information is useful to provide a global perspective of the virus particularly with regard to modeling and prediction techniques and their outcomes. A total of 69 peer-reviewed articles were reviewed. We found that most articles were from Asia (34.8%) and Europe (23.2%), followed by North America (14.5%), and very few emanated from other continents including Africa and Australia (6.8% each), while no study was reported in Antarctica. Most of the modeling and predictions were based on compartmental epidemiologic models and a few used advanced machine learning techniques. While some models have accurately predicted the end of the epidemic in some countries, other predictions strongly deviate from reality. Interestingly, some studies showed that combining artificial intelligence with classical compartmental models provides a better prediction of the disease spread. 
Assumptions made when parameterizing the models may be wrong or may not suit local contexts, which may partly explain the observed deviations from the reality on the ground. Furthermore, the lack of publicly available key data, such as age, gender, comorbidity, and historical medical data of cases and deaths in some continents, could limit researchers in addressing some essential aspects of the virus spread and its consequences.

... interpretability) in Machine Learning, i.e., explaining how models arrive at decisions [1,27,33,46]. This includes selecting the features that impact the model the most as a whole [3,5,25,47], but also locally for the decision made for each individual [32,41]. Given an already trained Supervised Classification model, an effective class of post-hoc explanations are counterfactual explanations, i.e., a set of actions that can be taken by an instance (e.g., increase the salary or decrease the current debt of an individual) such that the Machine Learning model at hand would have classified it in a different class (e.g., the loan request is granted to the individual). ...

Due to the increasing use of Machine Learning models in high stakes decision making settings, it has become increasingly important to be able to understand how models arrive at decisions. Assuming an already trained Supervised Classification model, an effective class of post-hoc explanations are counterfactual explanations, i.e., a set of actions that can be taken by an instance such that the given Machine Learning model would have classified it in a different class. For score-based multiclass classification models, we propose novel Mathematical Optimization formulations to construct the so-called collective counterfactual explanations, i.e., explanations for a group of instances in which we minimize the perturbation in the data (at the individual and group level) to have them labelled by the classifier in a given group. Although the approach is valid for any classification model based on scores, we focus on additive tree models, like random forests or XGBoost. Our approach is capable of generating diverse, sparse, plausible and actionable collective counterfactuals. Real-world data are used to illustrate our method.
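For intuition about the counterfactual explanations discussed above, consider the simplest possible case: a linear score s(x) = wᵀx + b, for which the minimal-norm perturbation that flips the predicted class has a closed form along w. The cited paper handles score-based multiclass models (and collective explanations) via mathematical optimization; this sketch, with invented names and data, covers only the linear binary case.

```python
import numpy as np

def linear_counterfactual(x, w, b, target_sign=1, margin=1e-6):
    """Minimal-L2 perturbation delta such that the linear score
    w @ (x + delta) + b lands on the side given by target_sign.
    For a linear score the optimum moves along w:
    delta = (target_sign * margin - score) / ||w||^2 * w."""
    score = w @ x + b
    if target_sign * score >= margin:
        return np.zeros_like(x)          # already on the desired side
    return ((target_sign * margin - score) / (w @ w)) * w

x0 = np.array([0.0, 0.0])                # instance classified as negative
w, b = np.array([1.0, 1.0]), -1.0        # toy linear scorer
delta = linear_counterfactual(x0, w, b)
print(delta, w @ (x0 + delta) + b)       # lands just across the boundary
```

For tree ensembles such as random forests or XGBoost the score is piecewise constant, so no such closed form exists, which is why the paper formulates the search as a Mathematical Optimization problem with sparsity, plausibility, and actionability constraints.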

... The typical losses, such as the mean squared error or the expected misclassification cost, may not be suitable to measure accuracy for more complex response variables. In terms of sparsity, take, for instance, the case of time-series data, where we have an observation for each time period in the series; the response for this observation is the measurement in that time period, and the features are the measurements in previous time periods, as in Benítez-Peña et al. (2020b) for the short-term predictions of the evolution of COVID-19. In this way, individuals are characterized by p lags, but possibly also by other predictor variables. ...

Classification and regression trees, as well as their variants, are off-the-shelf methods in Machine Learning. In this paper, we review recent contributions within the Continuous Optimization and the Mixed-Integer Linear Optimization paradigms to develop novel formulations in this research area. We compare those in terms of the nature of the decision variables and the constraints required, as well as the optimization algorithms proposed. We illustrate how these powerful formulations enhance the flexibility of tree models, being better suited to incorporate desirable properties such as cost-sensitivity, explainability, and fairness, and to deal with complex data, such as functional data.
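The lag construction described in the context above (features = the p previous measurements, response = the current one) can be made concrete with a small helper; the function name and shapes are illustrative assumptions.

```python
import numpy as np

def lag_matrix(series, p):
    """Build (X, y) for an autoregressive learner: row i of X holds the p
    measurements preceding time i + p, and y[i] is the measurement at i + p."""
    series = np.asarray(series)
    X = np.column_stack([series[k:series.size - p + k] for k in range(p)])
    y = series[p:]
    return X, y

X, y = lag_matrix([1, 2, 3, 4, 5], p=2)
print(X)  # rows: [1 2], [2 3], [3 4]
print(y)  # [3 4 5]
```

Any regressor, a tree ensemble included, can then be trained on (X, y); extra predictor variables would simply be appended as additional columns of X.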

... Short term prediction models of disease spread were created using Random Forest on environmental predictors in [359,591,660]. RF was used to predict cases and deaths in different geographic regions; in Russia in [575], in the United States in [751], in a region of Spain in [74], in Iran in [572], in Morocco in [206], and worldwide in [777]. The authors in [23,150,290] used Random Forest to determine the effectiveness of social distancing and shelter-in-place orders at containing the spread of the virus. ...

The deadly coronavirus disease 2019 (COVID-19) pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has gone out of control globally. Despite much effort by scientists, medical experts, and society in general, the slow progress on drug discovery and antibody therapeutic development, the unknown possible side effects of the existing vaccines, and the high transmission rate of the SARS-CoV-2 remind us of the sad reality that our current understanding of the transmission, infectivity, and evolution of SARS-CoV-2 is unfortunately very limited. The major limitation is the lack of mechanistic understanding of viral-host cell interactions, the viral regulation, protein-protein interactions, including antibody-antigen binding, protein-drug binding, host immune response, etc. This limitation will likely haunt the scientific community for a long time and have a devastating consequence in combating COVID-19 and other pathogens. Notably, compared to long-cycle, high-cost, and safety-demanding molecular-level experiments, theoretical and computational studies are economical, speedy, and easy to perform. There exists a tsunami of literature on molecular modeling, simulation, and prediction of SARS-CoV-2 that has become impossible to cover fully in a review. To provide the reader a quick update about the status of molecular modeling, simulation, and prediction of SARS-CoV-2, we present a comprehensive and systematic methodology-centered narrative in the nick of time. Aspects such as molecular modeling, Monte Carlo (MC) methods, structural bioinformatics, machine learning, deep learning, and mathematical approaches are included in this review. This review will be beneficial to researchers who are looking for ways to contribute to SARS-CoV-2 studies and those who are assessing the current status in the field.

... applied a wide variety of forecasting models, including autoregressive models, random forests, ridge regression, and support vector regression, to provide very short-term forecasts of the cumulative number of confirmed cases in Brazil, and compared the performance of the individual models against an ensemble prediction. Similarly, [30] used an optimization-based ensemble to find the best combination over a family of machine learning predictions and applied this methodology to predict the cumulative number of hospitalized patients in Andalusia. Following a different approach, [31] used neural networks to extract features from time-series data and then used those features to feed standard compartment models for the purpose of describing the aggregated spread of the pandemic. ...

By early May 2020, the number of new COVID-19 infections started to increase rapidly in Chile, threatening the ability of health services to accommodate all incoming cases. Suddenly, ICU capacity planning became a first-order concern, and the health authorities were in urgent need of tools to estimate the demand for urgent care associated with the pandemic. In this article, we describe the approach we followed to provide such demand forecasts, and we show how the use of analytics can provide relevant support for decision making, even with incomplete data and without enough time to fully explore the numerical properties of all available forecasting methods. The solution combines autoregressive, machine learning and epidemiological models to provide a short-term forecast of ICU utilization at the regional level. These forecasts were made publicly available and were actively used to support capacity planning. Our predictions achieved average forecasting errors of 4% and 9% for one- and two-week horizons, respectively, outperforming several other competing forecasting models.

... Extracting knowledge from data is a crucial task in Statistics and Machine Learning, and is at the core of many fields, such as Biomedicine [103,152], Business Analytics [12,130,180], Computational Optimization [6,111,116,122], Criminal Justice [160,197], Cybersecurity [132], Health Care [21,27,55,171,179], Policy Making [9,10,11,113,188], Regulatory Benchmarking [20,70,117]. Mathematical Optimization plays an important role in building such models and interpreting their output [28,43,46,47,48,49,50,60,72,77,163], see [39,52,68,82,87,119,146,149,154] for surveys. ...
