All content in this area was uploaded by Milton Soto-Ferrari on Jan 13, 2023

AGGFORCLUS: A Hybrid Methodology Integrating Forecasting with

Clustering to Assess Mitigation Plans and Contagion Risk in Pandemic

Outbreaks: The COVID-19 Case Study

Milton Soto-Ferrari1*, Alejandro Carrasco-Pena2, Diana Prieto3,4

1. Scott College of Business, Indiana State University, Terre Haute, IN, USA

2. Faculty of Science and Technology, Libera Università di Bolzano, Bolzano, Italy

3. School of Industrial Engineering, Pontificia Universidad Católica de Valparaíso, Valparaíso,

Chile

4. Johns Hopkins Carey Business School, Baltimore, Maryland

Milton Soto-Ferrari, PhD

Scott College of Business. Indiana State University, Terre Haute, IN, USA.

milton.soto-ferrari@indstate.edu

Alejandro Carrasco-Pena, PhD

Faculty of Science and Technology. Libera Università di Bolzano, Bolzano, Italy.

acarrascopena@unibz.it

Diana Prieto, PhD

School of Industrial Engineering, Pontificia Universidad Católica de Valparaíso, Valparaíso,

Chile.

Johns Hopkins Carey Business School, Baltimore, Maryland.

diana.prieto@pucv.cl

Corresponding Author:

Name: Milton Soto-Ferrari, PhD

E-mail: milton.soto-ferrari@indstate.edu

Address: 30 N 7th St, Terre Haute, IN 47809

Telephone/fax number: +1(812) 237-2276

The COVID-19 pandemic showed governments’ unpreparedness as decision-makers hastily

created restrictions and policies to contain its spread. Identifying prospective areas with a

higher contagion risk can reduce mitigation planning uncertainty. This research proposes a

risk assessment metric called AGGFORCLUS that integrates time-series forecasting and

clustering to convey joint information on predicted caseload growth and variability, thereby

providing an educated yet visually simple view of the risk status. AGGFORCLUS is developed in three phases. Phase I forecasts confirmed cases using a mixture of five different forecasting methods. Phase II produces forecasts from the best-performing model for an extended ten-day horizon, including their prediction intervals. In Phase III, we calculate average growth metrics for the predictions and use them to cluster series by their multidimensional average growth. We present the results for various countries framed into a nine-quadrant risk grouping linked to the expected cumulative caseload progress and its uncertainty.

Keywords: COVID-19 Pandemic; Time Series Forecasting; Clustering; Risk Assessment;

Mitigation Plans Strictness

1. Introduction

COVID-19 (CDC, 2020) is an acute respiratory infectious disease transmitted from person to person, which was declared a pandemic by the World Health Organization (WHO) in the first quarter of 2020 (WHO, 2020). While the strategies to reduce its spread nowadays rely mostly on vaccination and self-care (Soto-Ferrari et al., 2021), most nations advocated intense mitigation plans at its inception. Due to the novelty of the virus, mitigation plans, including social distancing and national lockdowns, caused multiple effects that are still reflected in numerous global facets (Le & Nguyen, 2021).

The infection caused significant upheaval worldwide in the economy and social life. The

pandemic showed some countries’ unpreparedness as decision-makers hurriedly created

restrictions and policies to try and contain the spread (Frutos et al., 2021). While most policies

were enforced at the country level and included strategies for mobility restrictions (lockdowns),

physical distancing, hygienic measures, socio-economic limitations, healthcare network

enhancement, heightened means of communication, and international support mechanisms, it was common to find regions within a country that were in dissimilar contagion stages (Johns Hopkins, 2020). Areas with, for instance, a higher population density were expected to have a greater degree of infection spread. Thus, policies and regulations there were anticipated to be more severe, including fines or even jail time for members of the public who did not follow the established procedures or guidelines (Kaur et al., 2021).

The lockdown enforcement was also proportional to the severity of the cases reported in an area and how the contagion of the disease progressed. When decision-makers were trying to agree on the stringency of the policies, some recommendations relied on agent-based simulations and compartmental models (Ferguson et al., 2020). These models require multiple inputs and parameters that were unknown or unavailable during the pandemic outbreak's peak (e.g., COVID-19 epidemiological parameters were still under investigation during the initial surge of cases worldwide), so they had to make a number of assumptions to be functional. While valuable and informative, their results cannot be considered predictions or forecasts, particularly while parameters continue to be estimated. Understanding how outbreaks such as COVID-19 evolve and identifying prospective areas with a higher risk of contagion can reduce the uncertainty of mitigation planning during sudden pandemic outbreaks.

There is a significant body of existing approaches to model and forecast pandemics and

seasonal influenza (Chretien et al., 2014; Prieto et al., 2012; Reich et al., 2019; Soto-Ferrari et al.,

2013) and, nowadays, COVID-19 in particular (Gecili et al., 2021; Gharoie Ahangar et al., 2020; Maleki et al., 2020; Medeiros et al., 2022; Papastefanopoulos et al., 2020; Petropoulos et al., 2020; Rui et al., 2021), with a handful of investigations (Chen et al., 2021; Hale et al., 2021; Violato et al., 2021; Wong et al., 2020) assessing the relationship between total cases, healthcare occupancy,

and death projections with the intensity of the guidelines and the reasonable restrictions that a

region might enforce. However, while the available research offers multiple components of

analysis, which in most cases require the estimation of data not obtainable when the pandemic is

first triggered, our emphasis in this article is to propose a hybrid classification metric called

AGGFORCLUS that integrates time series forecasting and clustering evaluation to classify the

risk of contagion in regions during a pandemic outbreak. The procedure uses only aggregated caseload information, which in most circumstances is the only attribute accessible when the event is initially triggered. The classification proposed in AGGFORCLUS conveys joint information on

the forecasted caseload growth and data variability for each region, thereby providing the decision

maker with an educated, visually simple view of the risk status. The stringency of containment

measures can then be planned contingent upon the inherent deficiencies present in each region’s

data collection and reporting systems.

AGGFORCLUS stands for the approach’s three development phases composed of (I) data

aggregation and performance evaluation (AGG), (II) prospect forecasting (FOR), and (III)

clustering (CLUS). In phase I (AGG), we follow a similar approach to Petropoulos et al. (2020) to

forecast aggregated confirmed cases in the short-term (10 days) in several countries, defining

training and testing sets in the actual progression of the pandemic. However, in our methodology,

we apply a mixture of five different forecasting methods, including a benchmark or baseline

forecast (i.e., Naïve) and various combinations of exponential smoothing and autoregressive

integrated moving average models with bootstrapping or bagged forecasting approaches instead

of a simple time series model as proposed by Petropoulos et al. (2020). We implement (1) bagged exponential smoothing and (2) bagged auto-regressive integrated moving average models for cases where fundamental models (e.g., simple exponential smoothing) do not seem to provide accurate forecasts (Bergmeir et al., 2016; Petropoulos et al., 2018). Forecasts are calculated individually for each method, and each model's performance is determined by calculating the root mean squared error (RMSE), the mean absolute error (MAE), and the mean absolute percentage error (MAPE).
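As an illustration of the three error measures named above, the following is a minimal sketch in Python using their standard textbook definitions. The function names (`rmse`, `mae`, `mape`, `best_method`) are hypothetical and chosen only for this example. Selecting the best method by a single metric is an assumption made here, since this section does not state how the three measures are combined.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error over paired observations."""
    errors = [a - f for a, f in zip(actual, forecast)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mae(actual, forecast):
    """Mean absolute error over paired observations."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (assumes no zero actuals)."""
    ratios = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


def best_method(actual, forecasts_by_method, metric=rmse):
    """Return the name of the method with the lowest error on the test set.

    Assumption: one metric decides the winner; the paper may weigh the
    three measures differently.
    """
    return min(forecasts_by_method,
               key=lambda name: metric(actual, forecasts_by_method[name]))
```

For cumulative case counts the actual values are strictly positive once an outbreak has started, so the division in `mape` is safe in that setting.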

In phase II (FOR), we produce forecast projections for an extended horizon beyond the testing set and matching its size (i.e., ten days ahead), using for each analyzed country the recalibrated method with the lowest error from phase I. The forecasts include their corresponding 95% prediction intervals (Hyndman et al., 2001). Consequently, the forecast for each coming day consists of three values: the point forecast and the 95% lower and upper bounds. Up to this point, the procedure resembles most time-series forecasting models for predicting pandemics available in the literature, each with its distinctive methodology. Further evaluations, including different horizons and forecasts of deaths and healthcare resources, are presented and described in detail in numerous recent studies (Doornik et al., 2020; Gecili et al., 2021; Gharoie Ahangar et al., 2020; Maleki et al., 2020; Medeiros et al., 2022; Papastefanopoulos et al., 2020; Petropoulos et al., 2020; Rui et al., 2021).

The novelty of AGGFORCLUS lies in phase III (CLUS). Here, we extend the assessment and use the three forecast projections to calculate growth metrics (analogous to a slope calculation) for the point forecasts and the prediction intervals. We use these metrics to cluster series by their multidimensional average caseload growth. As a result, the forecasts are grouped according to the expected cumulative caseload growth and the expected deviation in cases. Within this grouped classification, we include a visual cluster segmentation (i.e., risk quadrants) that uses the first and third quartiles of the combined growth metrics as cut points, yielding a region-related contagion risk measure based on the expected progress of cases and their uncertainty. In the implementation presented here, the regions are countries classified into nine quadrants. A country in a higher quadrant has an elevated risk of contagion relative to the caseload volume and uncertainty.
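To make the quartile-based segmentation more concrete, the sketch below shows one possible reading of it in Python. It is not the authors' implementation: the growth formula (average per-day change along a forecast path), the use of the difference between upper-bound and lower-bound growth as the uncertainty dimension, and all function names are assumptions made only for illustration. Each dimension is cut at its first and third quartiles, which gives three bands per dimension and therefore nine cells.

```python
import statistics


def average_growth(path):
    """Average per-day change along a forecast path (a slope-like measure)."""
    return (path[-1] - path[0]) / (len(path) - 1)


def quartile_cuts(values):
    """First and third quartiles of a list of values."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q1, q3


def band(value, q1, q3):
    """0 = below the first quartile, 1 = in between, 2 = above the third."""
    if value < q1:
        return 0
    if value > q3:
        return 2
    return 1


def risk_cells(forecasts):
    """Map each region to a (growth_band, uncertainty_band) cell.

    forecasts: {region: (lower_path, point_path, upper_path)}.
    Assumption: uncertainty is the gap between the growth of the upper
    and lower prediction bounds.
    """
    growth = {r: average_growth(pt) for r, (lo, pt, hi) in forecasts.items()}
    spread = {r: average_growth(hi) - average_growth(lo)
              for r, (lo, pt, hi) in forecasts.items()}
    g1, g3 = quartile_cuts(list(growth.values()))
    s1, s3 = quartile_cuts(list(spread.values()))
    return {r: (band(growth[r], g1, g3), band(spread[r], s1, s3))
            for r in forecasts}
```

Under these assumptions, a region in cell (2, 2) combines fast expected growth with wide uncertainty, which corresponds to the highest-risk corner of the nine-cell grid.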

We implement our study with COVID-19 pandemic data. As the application could be used for any number of series and for further pandemic outbreaks where only case data are available, we propose two forms of analysis: (i) multiple-origin and (ii) rolling-origin evaluations (Tashman, 2000). We consider 30 countries worldwide to detail the implementation phases of the proposed methodology in three rounds using the multiple-origin framework. As part of this analysis, we included the progression of the pandemic throughout various key time windows, specifically (1) when the COVID-19 mitigation plans were first entirely in effect with multiple lockdowns worldwide (April 2020 to May 2020); (2) when the Delta variant arose (Jan 2021 to Feb 2021); and (3) when the Omicron variant advanced (Jan 2022 to Feb 2022) (WHO, 2020). In the rolling-origin assessment, we ran additional experiments over five consecutive rounds, now with 60 countries in an enlarged evaluation starting in May 2020. This exemplifies how the approach would be employed in a real-time exercise when a pandemic is triggered.

We perform a parallel evaluation of our results with the Oxford COVID-19 Government Response Tracker (Hale et al., 2021), which tracks the policy stringency of nations worldwide and records the strictness of country-level guidelines such as lockdowns. From this evaluation, we could consistently justify each country's stringency with the contagion risk provided by AGGFORCLUS. The data used for this study was extracted from the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University (Johns Hopkins, 2020). This source provides confirmed cases, deaths, and recovered cases per country.

The structure of this work is as follows. Section 2 presents a description of current

forecasting developments for the COVID-19 pandemic. Section 3 describes the AGGFORCLUS

components and methodology. Section 4 describes the data and detailed assessment of forecasting

performances with the AGGFORCLUS application results, complementing the discussion of the

subsequent risk classification. Section 5 presents the additional experiments. Finally, section 6

concludes the paper.

2. Literature Review

The following is a temporal review of the prediction models available in the literature about

pandemic outbreaks, specifically those related to COVID-19.

Numerous efforts and prototypes have been developed to predict the spread and impact

since the early stages of the COVID-19 pandemic. Due to the rapid increase in the number of cases

worldwide, the transmission rate of the virus, and the reporting promptness in every country, the

forecasting models provided predictions that showed the evolution of the pandemic primarily in

the short-term for the number of cases, the number of deaths, and the recovered patients

(Papastefanopoulos et al., 2020; Rahimi et al., 2021). Various investigations indicate that a short-term horizon for prediction in the context of the COVID-19 pandemic is in the range of 7 to 12 days (Abbasimehr & Paki, 2021; Borghi et al., 2021; Chimmula & Zhang, 2020; Doornik et al., 2020; Maleki et al., 2020; Medeiros et al., 2022; Petropoulos et al., 2020; Rauf et al., 2021; Zhao et al., 2021), given that this is how governments prepared their short-term planning to respond to the epidemic when cases surged.

Examples of this type of evaluation are reported by Maleki et al. (2020) and Petropoulos

et al. (2020), where autoregressive time series models based on two-piece scale mixture normal

distributions (TP-SMN-AR) and exponential smoothing representations, respectively, were used

to develop a 10-day prediction of the number of reported cases, deaths, and recovered cases worldwide.

Additional localized examples with different time-series approaches are presented in the work of

Salgotra et al. (2020), where a gene expression programming (GEP) approach is used to develop

models that find a relationship between the input and output variables from a hierarchical tree-like

structure to predict the number of cases and number of deaths in India. Furthermore, Chimmula

and Zhang (2020) used a Long Short-Term Memory Network (LSTM) methodology to develop a

model for disease transmission in Canada and estimate when the peak in the number of cases is

reached. Borghi et al. (2021) developed a multilayer perceptron artificial neural network to predict the spread of COVID-19 over the next six days. The model takes data from 30 countries in a 20-day context using four time series: the number of accumulated infected cases, new cases, accumulated deaths, and new deaths of each country. The data were smoothed using a moving average filter with a window size of 3 and normalized by the maximum value.

Moreover, Rauf et al. (2021) used deep learning techniques such as LSTM networks,

Recurrent Neural Network (RNN), and Gated Recurrent Units (GRU) to forecast the number of

COVID-19 incidence cases for a period of 10 days into the future with 90% accuracy for the

countries of Afghanistan, Bangladesh, India, and Pakistan. Abbasimehr & Paki (2021) proposed three hybrid approaches to forecast COVID-19 in 10 countries. These approaches combine LSTM and CNN models with multi-head attention and a Bayesian optimization algorithm to develop short-term forecasts of 10 days into the future. Bayesian optimization was chosen because it finds good hyperparameters for deep learning models in fewer iterations than grid search.

An auto-regressive integrated moving average (ARIMA) model is presented in the work of Tandon et al. (2020) with parameters (p, d, q) = (2, 2, 2), where p is the order of auto-regression, d is the degree of differencing, and q is the order of the moving average. Likewise, Kufel et al. (2020) used an ARIMA application to predict the evolution of COVID-19 cases in selected European countries using the parameters (p, d, q) = (1, 2, 0), which, according to the authors, were appropriate for predicting the pandemic's dynamics over the selected population. Zhao et al. (2021) presented a modeling approach for observed incidence utilizing a Poisson distribution for daily cases and a Gamma distribution for the serial interval. The effective reproduction number was estimated by assuming that it remained constant during a short period (7 to 12 days); future cases were then predicted from their posterior distributions, assuming that the transmission rate stays the same or changes only slightly.

A set of studies focused on evaluating the performance of LSTM models when predicting pandemic outbreaks. Bodapati et al. (2020) showed that LSTM models are among the best dynamic models used to generate sequences in multiple domains, including pandemics' progression. The LSTM method implemented in that research generated time series forecasts with higher accuracy than other methods such as linear and logistic regression and Support Vector Machines (SVM). Masum et al. (2020) indicated that reproducible-LSTM (r-LSTM) models remove the limitations of LSTM models and can produce replicable results by leveraging the z-score outlier detection method, increasing the robustness of the outcome. Shahid et al. (2020) evaluated the performance of Bidirectional Long Short-Term Memory (Bi-LSTM) models to predict pandemic progression. In their assessment, Bi-LSTM outperformed single LSTM, GRU, support vector regression (SVR), and ARIMA models. This group of investigations indicated that traditional statistical models such as ARIMA have some disadvantages: they require complete datasets, work better for univariate and linear relationships, and do not work well for nonlinear data or longer prediction horizons. The same studies also indicated that Multi-Layer Perceptron (MLP), Convolutional Neural Network (CNN), and Recurrent Neural Network (RNN) models require static mapping functions (with fixed inputs and outputs that cannot learn from temporal dependency), are slow, and overfit easily.

Compartmental models, such as susceptible-infectious-removed (SIR), susceptible-

exposed-infectious-removed (SEIR), or susceptible-unquarantined-quarantined-confirmed

(SUQC), have also been used to study and forecast the epidemiological rate of the COVID-19

outbreak and the effectiveness of prevention strategies (Abou-Ismail, 2020; Ramezani, 2021;

Sharov, 2020). Agent-based simulation models have also been used to evaluate post-pandemic

strategies. Li & Giabbanelli (2021) used an agent-based simulation and a logistic growth model to

assess if the vaccination strategy, without non-pharmaceutical intervention, would be enough to

reopen the United States and return to a pre-pandemic life. Mukherjee et al. (2021) also used an

agent-based approach to assess the reopening strategies for educational institutions in the United

States by measuring the average number of susceptible individuals as a fraction of the total

population used for the study.

While most of the studies presented focus on short-term prediction, the work described in

Ramazi et al. (2021) uses a model with a mean absolute percentage error of 9% capable of

predicting COVID-19 death cases in the USA up to 10 weeks into the future. A general learner

called LaFoPaFo (LAst FOld PArtitioning FOrecaster) is proposed in this investigation. The model

uses “last-fold partitioning” to find the best model parameters, the combination of features, and

the history length to produce the forecasting. This approach has a forecast horizon of 5 to 10 weeks.

It considers 11 different features such as the current number of COVID-19 tests, cases, and deaths,

social activity measures, weather-related covariates specific to the USA, and historical values at

the start of the pandemic.

Even though post-COVID-19 reopening strategies were evaluated in the references above, some additional studies considered models that can forecast the number of cases when the pandemic is controlled and government policies are relaxed. The work presented by Medeiros et al. (2022) focuses on these aspects. It proposes a short-term real-time forecasting model based on a penalized LASSO regression with an error correction mechanism and an adaptive rolling-window scheme capable of forecasting the number of cases and deaths in US states where COVID-19 cases escalated later. Doornik et al. (2020) also utilized a short-term forecasting approach

using statistical extrapolations of past and present data that permitted the development of models

using an improved version of the calibrated average of rho and delta methods, called the Cardt

process, which takes the average of two autoregressive models and one moving average to develop

the forecast.

Overall, the implemented models have shown detailed results, indicating the effectiveness of statistical and machine learning approaches for managing contagions and developing enhanced strategies that can provide local and global solutions for future pandemics. Nevertheless, while comprehensive and detailed, these studies have not thoroughly covered the relationship between model output and possible policy interventions and their strictness when the only information available is case counts. Most rely on descriptive dashboards (that might include simple forecasts) or on multiple parameters that are not available and must be estimated, specifically in the starting stages of a pandemic or when new scenarios arise concerning emerging variants.

Our work is situated in this gap, where applying mitigation plans and setting their stringency is framed into a proposed clustering approach linked to expected cumulative caseload growth, variability, and contagion risk. This article focuses on short-term horizon forecasts with aggregated case data. We purposefully aggregate the data as part of our development to remove the inherent nonlinearity of the daily cases. Our forecasting approach is intended for statistical methods where a degree of smoothing in the data is necessary to produce reliable results, implying that techniques such as ARIMA and exponential smoothing remain dependable. Indeed, we could have used LSTM or other designs, as these are reliable across multiple contexts of linear and nonlinear data and prediction horizons, as shown in the investigations reviewed above. However, our objective with AGGFORCLUS is, first and foremost, to facilitate the analysis using statistical forecasting methods. In the following sections, we describe the fundamentals of our development.

3. Data, Methods, and Modelling

AGGFORCLUS combines time series forecasting with clustering classification and

develops a risk arrangement based on aggregated caseload and uncertainty. This section first

describes the AGGFORCLUS fundamentals, including the data information and setup, the

exponential smoothing or state space (ETS) and the auto-regressive integrated moving average

(ARIMA) models, the development of bagged forecasts, the performance configuration, and the

proposed cluster-based risk assessment and how these are arranged in the AGGFORCLUS

framework.

3.1. Data and Model Setup

Our model predicts one variable related to COVID-19: the cumulative number of confirmed

cases. The data used for this study was extracted from the online repository provided by the Center for Systems Science and Engineering at Johns Hopkins University (Johns Hopkins, 2020), which is currently updated daily and is available at: https://github.com/CSSEGISandData/COVID-19.

We first focused on a worldwide context and followed a multiple-origin evaluation process. We

started with 40 data points available (from 2020-04-01 to 2020-05-10) and produced forecasts and

prediction intervals in the short term for the next ten days (2020-05-11 to 2020-05-20). We selected

this critical time window as most countries worldwide during this time were in total lockdown

measures as part of their mitigation plans (WHO, 2020).

We re-ran the analysis using the 41 data points available (from 2021-01-01 to 2021-02-10) and again produced forecasts for the next ten days; we selected this second time window as the COVID-19 Delta variant arose during these dates. This process was repeated for a third window (from 2022-01-01 to 2022-02-10) to account for the Omicron variant (WHO, 2020). We selected these time windows to evaluate how mitigation plans were reassessed given the uncertainty of the pandemic's starting days and, later, the emergence of variants worldwide. Overall, this evaluation produced three rounds of 10-step-ahead non-overlapping forecasts. Our choice of horizon (10 days) is in line with previous studies (Abbasimehr & Paki, 2021; Borghi et al., 2021; Chimmula & Zhang, 2020; Doornik et al., 2020; Maleki et al., 2020; Medeiros et al., 2022; Petropoulos et al., 2020; Rauf et al., 2021; Zhao et al., 2021), which considered the short-term horizon to span a range of 7 to 12 days. The implementation presented in this article is for a 10-day evaluation. Nevertheless, our methodology can accommodate any horizon within the short-term scope.

Using the previously defined time window rounds, we evaluated 30 countries worldwide. We consider countries from each continent to assess the multiplicity of the mitigation plans designed during the pandemic progression. These include the US, Canada, Mexico, Brazil, Peru, Colombia, Chile, and Argentina from North, Central, and South America; France, Germany, Italy, Spain, the UK, Sweden, Denmark, Norway, and Finland from Europe; India, Japan, China, Indonesia, South Korea, Mongolia, and Saudi Arabia from Asia and the Middle East; Nigeria, Morocco, South Africa, and Egypt from Africa; and Australia and New Zealand from Oceania.

After presenting the scope and details of the procedure in these initial rounds, we develop a rolling-origin assessment (Tashman, 2000) using the end of the first data window (2020-05-20) as the initial origin and advancing ten days at a time for an additional five rounds. For the rolling-origin assessment, we expanded the evaluation to include 30 other countries (a total of 60) to demonstrate the implementation when the evaluation is performed in real time (i.e., when the pandemic is triggered and lockdowns are in full extent) and the flexibility of the application when processing a wide assortment of series. The additional 30 countries in this evaluation are the Dominican Republic, Ecuador, Costa Rica, Panama, Uruguay, Bolivia, and Trinidad and Tobago from the Americas; Thailand, Vietnam, the Philippines, Indonesia, Malaysia, Iran, Nepal, and Iraq from Asia; Portugal, Ireland, Iceland, Greece, Poland, Austria, Hungary, and Croatia from Europe; and Algeria, Tunisia, Ethiopia, Kenya, Ivory Coast, Ghana, and Senegal from Africa.
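The rolling-origin schedule described above can be sketched as follows. This is a minimal illustration in Python, assuming that the first origin is 2020-05-20 itself and that each round forecasts the ten days immediately after its origin; the function name `rolling_origins` is hypothetical, and whether the training window expands or slides is not specified in this section, so the sketch only enumerates origins and forecast ranges.

```python
from datetime import date, timedelta


def rolling_origins(first_origin, horizon_days, rounds):
    """List (origin, forecast_start, forecast_end) for each round.

    Assumption: round k uses data up to first_origin + k * horizon_days
    and forecasts the following horizon_days days.
    """
    windows = []
    for k in range(rounds):
        origin = first_origin + timedelta(days=k * horizon_days)
        forecast_start = origin + timedelta(days=1)
        forecast_end = origin + timedelta(days=horizon_days)
        windows.append((origin, forecast_start, forecast_end))
    return windows


if __name__ == "__main__":
    for origin, start, end in rolling_origins(date(2020, 5, 20), 10, 5):
        print(f"origin {origin}: forecast {start} to {end}")
```

Under these assumptions the five rounds cover consecutive, non-overlapping ten-day forecast ranges.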

We provide in our data repository

https://osf.io/crxn7/?view_only=b87da8aa9f1f46a2a8766f0fdf00887d (Folder: Data) the

aggregated data of cases of each round used in this study with their corresponding time series plots

for all the countries. As mentioned, the information about the confirmed cases was collected from Johns Hopkins (2020); the data found in the repository is merely the aggregated case information during the time windows of analysis.

We aim to model the data's overall behavior, avoiding hypotheses about the many variables that are unknown at the onset of cases (such as transmissibility, healthcare resources, or death rates). We resort to the exponential smoothing family of models (Hyndman et al., 2002; Hyndman et al., 2008) and the Auto-Regressive Integrated Moving Average (ARIMA) models, with bagged variations, to capture and extrapolate the levels, trends, and seasonal patterns in the aggregated information. We then focus on a clustering approach to provide a suitable risk grouping classification based on the forecast evaluation. The following sections detail the conceptualization of the complete proposed development.

3.2. Exponential Smoothing

Exponential smoothing models are also known as Error, Trend, and Seasonal (ETS) models. In the ETS context, the components of the exponential smoothing models fall into three categories: the trend component, the seasonal component, and the remainder or error component.

The trend component refers to the direction of the series; the seasonal component refers to the

recurring elements of a series with a certain periodicity; the remainder or error component refers

to the unpredictable elements of the series (Hyndman et al., 2002; Ramos et al., 2015).

Each deterministic exponential smoothing model can be reformulated as two stochastic

ETS models, one that includes additive errors and one that provides for multiplicative errors

(Brown, 1959; Holt, 1957; Winters, 1960). Differentiating between these two alternatives is only

relevant for prediction intervals, not point forecasts. The prediction intervals will differ between models with additive and multiplicative errors (Hyndman & Athanasopoulos, 2018). In the notation ETS(E, T, S), E is the type of error: additive (A) or multiplicative (M). T stands for the modeling options in the trend: non-existent (N), additive (A), multiplicative (M), damped additive (Ad), or damped multiplicative (Md). S stands for the modeling options in the seasonality: non-existent (N), additive (A), or multiplicative (M). For instance, equation 1 summarizes the state space model (A, A, N):

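$$
\begin{aligned}
\hat{y}_{t} &= \ell_{t-1} + b_{t-1}, \qquad \varepsilon_{t} = y_{t} - \hat{y}_{t} \\
\ell_{t} &= \ell_{t-1} + b_{t-1} + \alpha \varepsilon_{t} \\
b_{t} &= b_{t-1} + \beta \varepsilon_{t}
\end{aligned}
\tag{1}
$$

where $y_t$ is the observed value at time t, $\hat{y}_t$ denotes the estimation of the forecast at time t, $\ell_t$ denotes the estimation of the series level at time t, $b_t$ denotes the estimation of the slope (trend) at time t, and $\varepsilon_t$ is the forecast error (in the case of seasonal models, $s_t$ denotes the estimation of seasonality at time t, and $m$ represents the number of seasons in a year). The constants α, β, and γ are the smoothing parameters constrained between 0 and 1 to interpret the data sets as moving averages (Makridakis et al., 1998). We refer to Hyndman & Athanasopoulos (2018) and Hyndman et al. (2002) for the complete scope of the state-space models' formulations.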
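As a minimal sketch of how the ETS(A, A, N) model mentioned above produces point forecasts, the following Python function runs the standard error-correction recursion: the one-step forecast is the previous level plus the previous trend, and both states are updated in proportion to the one-step error. The function name `ets_aan_forecast` and the simple initialization (first observation as level, first difference as trend) are assumptions for illustration; in practice the smoothing parameters and initial states are estimated from the data, and prediction intervals are not covered here.

```python
def ets_aan_forecast(series, alpha, beta, horizon):
    """Point forecasts from an ETS(A, A, N) recursion with fixed parameters.

    Assumptions: alpha and beta are given (not estimated), and the initial
    level and trend come from the first two observations.
    """
    level = series[0]
    trend = series[1] - series[0]
    for y in series[1:]:
        one_step = level + trend        # forecast made before observing y
        error = y - one_step            # one-step-ahead forecast error
        level = one_step + alpha * error
        trend = trend + beta * error
    # The h-step-ahead forecast extrapolates the last level and trend linearly.
    return [level + h * trend for h in range(1, horizon + 1)]
```

For a perfectly linear series the one-step errors are all zero, so the forecasts simply continue the line regardless of the smoothing parameters.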

3.3. Auto-Regressive Integrated Moving Average (ARIMA)

ARIMA is an extensive class of prediction models that can represent autocorrelated and

stochastic seasonal and non-seasonal time series, including autoregressive (AR), moving average

(MA), and mixed AR and MA processes applied to differenced, or integrated (I), series.

The notation in these types of models is formally denoted as ARIMA(p, d, q), where p

represents the number of lagged values to consider for autoregression, d represents the number of

times the series has been differenced to achieve stationarity, and q represents the number of

moving average parameters (Box & Jenkins, 1970). The ARIMA notations are used to calculate a

set of parameters from a combined formulation of autoregressive and moving average models that

are described as:

$$y'_t = c + \phi_1 y'_{t-1} + \dots + \phi_p y'_{t-p} + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q} + \varepsilon_t \qquad (2)$$

where $y'_t$ is the differenced series (it may have been differenced more than once) at time t, $c$ is the average

of the changes between consecutive observations, and $\varepsilon_t$ is white noise. The model can be regarded as a

multiple regression with lagged values of $y'_t$ and lagged errors as predictors. By modifying the parameters

$\phi_1, \dots, \phi_p$ for the autoregressive section and the parameters $\theta_1, \dots, \theta_q$ for the moving average,

ARIMA results in different time-series patterns. The variance of the error term will only change

the scale of the series, not the patterns. An extra parametrization for seasonal components in

ARIMA models is represented as ARIMA(p, d, q)(P, D, Q), where the uppercase letters have the

same meaning as the lowercase letters but refer solely to the seasonal parameters.
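To make the roles of p, d, and q concrete, the sketch below differences a series d times and then computes one-step-ahead predictions from given autoregressive and moving average coefficients. It is only an illustration: the function names are hypothetical, the coefficients are supplied by hand rather than estimated, and pre-sample values and errors are assumed to be zero.

```python
def difference(series, d=1):
    """Apply first differences d times (the 'I' part of ARIMA)."""
    out = list(series)
    for _ in range(d):
        out = [cur - prev for prev, cur in zip(out, out[1:])]
    return out


def arma_one_step(diffed, c, phi, theta):
    """One-step-ahead predictions for an ARMA(p, q) on a differenced series.

    phi holds the p autoregressive coefficients and theta the q moving
    average coefficients. Terms that would need observations or errors
    from before the start of the sample are skipped (treated as zero).
    """
    preds, errors = [], []
    for t, y in enumerate(diffed):
        ar = sum(phi[i] * diffed[t - 1 - i]
                 for i in range(len(phi)) if t - 1 - i >= 0)
        ma = sum(theta[j] * errors[t - 1 - j]
                 for j in range(len(theta)) if t - 1 - j >= 0)
        pred = c + ar + ma
        preds.append(pred)
        errors.append(y - pred)  # residual, reused by later MA terms
    return preds, errors


if __name__ == "__main__":
    cumulative = [1, 3, 6, 10]             # hypothetical cumulative cases
    daily = difference(cumulative, d=1)    # [2, 3, 4]
    print(arma_one_step(daily, c=1.0, phi=[1.0], theta=[]))
```

In practice the coefficients and the orders are estimated from the data, which is the job auto.arima performs in the R forecast package.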

3.4. Bootstrapping and Bagging

To configure the bagged forecast in our development, we follow an approach similar to that

detailed in Bergmeir et al. (2016). Here, the bagging time-series forecasting procedure begins by

decomposing the series to obtain the trend, seasonal, and remainder components (i.e., time series

decomposition). The loess-based scheme decomposes the series into the trend, seasonal, and

remainder (Cleveland et al., 2017). After the decomposition, we bootstrap the remainder

component and add the bootstrapped remainder to the original decomposed components. By

adding up the components, we have created one bootstrapped series with the same trend and

seasonal component as the original but with a remainder component that is alike but not identical.

We generate several bootstrapped series from a single original sequence (i.e., simulating and

adding the remainder component multiple times to the original trend and seasonal components), and each one

is individually forecasted as we fit ETS and ARIMA models independently. An average

aggregated or bagged forecast of the overall simulated series is then calculated, denominated

bagged ETS (B-ETS) and bagged ARIMA (B-ARIMA), denoting the methods used to create the

bootstrapped forecasts. The bagged forecasts attempt to identify random variations that otherwise

might not be possible to recognize with a single original series, thus improving the prediction in

some cases where the fundamental models (i.e., ETS and ARIMA) were unable to produce reliable

forecasts (Petropoulos et al., 2018).

In our application, we first applied a Box-Cox transformation to stabilize the variance and

ensure that the time series components are additive (Box & Cox, 1964). The parameter λ of the

transformation is chosen automatically using the procedure described in Guerrero (1993). We then

bootstrap using moving block bootstrapping (MBB) (Künsch, 1989) and generate the bagged

forecast (Bergmeir et al., 2016). We created 100 bootstrapped series from each original aggregated

case series since the improvements of bagging for time series seem to be minimal after this number

of bootstraps (Cleveland et al., 2017; Petropoulos et al., 2018). The multiple bootstrapped series

allow modeling of the parameters’ uncertainty and the random error term, critical components of

calculating prediction intervals (Petropoulos et al., 2018). A common characteristic of prediction

intervals is that they become wider as observations are predicted further ahead due to the increasing

error uncertainty. To calculate the point and prediction intervals for the series, we use the forecasts

generated from the bootstrapped series and compute the mean point forecasts and the 2.5th and

97.5th quantiles for lower and upper bounds of the bagged 95% prediction intervals.

We use bootstrapping to improve the forecasts’ performance compared to the single

versions of ETS and ARIMA. The bagged approach seeks to identify possible series shifts that are

not likely to be distinguished by solely using the original series. The objective of the decomposition

and replication of the error component is not to try and create duplicates of the original series to

compensate for data (in the end, we only have one bagged forecast) but to identify possible swings

in the error component of the series that are not readily perceptible when forecasting only on the

original data (Petropoulos et al., 2018).

Algorithm 1 below summarizes the entire procedure, where the

COVID-19 aggregated case series are decomposed into the trend and remainder components (step

4). The remainder component is bootstrapped using MBB 100 times. Then, the original trend and

seasonal components (if any) are added to each bootstrapped remainder, resulting in 100 simulated

series from the original series. Likewise, ETS and ARIMA forecasts are fitted and calculated for

each simulated series (steps 5-10). For each ETS and ARIMA, point forecasts and bagged

prediction intervals are estimated for the original series per forecasted period by calculating the

mean and quantiles on the forecasts of the simulated series (steps 11-18). Mean and quantile

forecasts represent each method’s bagged forecasts called B-ETS and B-ARIMA.
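The bagging steps described in this section can be sketched in Python as follows. This is a simplified illustration under several assumptions: the trend and remainder components are taken as already computed (the loess-based decomposition and the Box-Cox transformation are not reproduced), the moving block bootstrap simply truncates the resampled blocks to the series length, and a random walk with drift stands in for the ETS and ARIMA fits. All function names and the block size are hypothetical.

```python
import random
import statistics


def moving_block_bootstrap(remainder, block_size, rng):
    """Resample a remainder series by concatenating random overlapping blocks."""
    n = len(remainder)
    sample = []
    while len(sample) < n:
        start = rng.randrange(n - block_size + 1)
        sample.extend(remainder[start:start + block_size])
    return sample[:n]  # simplified: truncate to the original length


def drift_forecast(series, horizon):
    """Stand-in forecaster (random walk with drift); NOT ETS or ARIMA."""
    slope = (series[-1] - series[0]) / (len(series) - 1)
    return [series[-1] + slope * h for h in range(1, horizon + 1)]


def bagged_forecast(trend, remainder, horizon, n_boot=100, block_size=4,
                    forecaster=drift_forecast, seed=0):
    """Return (mean, 2.5th percentile, 97.5th percentile) per forecast period."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_boot):
        boot = moving_block_bootstrap(remainder, block_size, rng)
        simulated = [t + r for t, r in zip(trend, boot)]  # rebuild one series
        paths.append(forecaster(simulated, horizon))
    result = []
    for h in range(horizon):
        values = [path[h] for path in paths]
        # 39 cut points for n=40: the first is the 2.5th percentile,
        # the last is the 97.5th percentile.
        cuts = statistics.quantiles(values, n=40, method="inclusive")
        result.append((statistics.fmean(values), cuts[0], cuts[-1]))
    return result


if __name__ == "__main__":
    trend = [float(i) for i in range(20)]           # hypothetical trend
    remainder = [(-1.0) ** i for i in range(20)]    # hypothetical remainder
    for mean, low, high in bagged_forecast(trend, remainder, horizon=3):
        print(round(mean, 2), round(low, 2), round(high, 2))
```

Passing a different `forecaster` callable is how an ETS or ARIMA fit would be plugged in; the mean and quantile aggregation stays the same.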

Algorithm 1 Bootstrapped algorithm

3.5. Performance and Error Measurement

We train the forecast models for each round and develop a projection of cases for the

defined testing set (horizon = 10 days). We implement five different models: (1) Naive

(benchmark), (2) ETS, (3) ARIMA, (4) B-ETS, and (5) B-ARIMA. The forecast performance is

determined by calculating the root mean squared error (RMSE), the mean absolute error (MAE),

and the mean absolute percentage error (MAPE):

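$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2} \qquad (3)$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n}\left|y_t - \hat{y}_t\right| \qquad (4)$$

$$\mathrm{MAPE} = \frac{100}{n}\sum_{t=1}^{n}\left|\frac{y_t - \hat{y}_t}{y_t}\right| \qquad (5)$$

where for these formulations, n is the number of observations (sample size); m is the periodicity;

$y_t$ is the actual value of the time series y at time t, and $\hat{y}_t$ is the forecast for the testing set. The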

forecast performance measures are calculated using point forecasts, and for each, we included 95%

prediction intervals for all aggregated case series. The model with the lowest percentage error (MAPE)

among the five is selected as the best representation of the series and is then used to

forecast the ten upcoming days that serve as the prediction in our development. The selected

model parameters are recalibrated with the testing set information before forecasting. A similar

practice of forecast performance evaluation is presented in (Al-qaness et al., 2020; Maleki et al.,

2020; Petropoulos et al., 2020; Soto-Ferrari et al., 2020; Soto-Ferrari et al., 2019).
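The three error measures and the selection of the best model can be sketched in Python as below. The standard textbook definitions of RMSE, MAE, and MAPE are assumed, and the function names and the example numbers are hypothetical.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast))
                     / len(actual))


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - f) / a)
                       for a, f in zip(actual, forecast)) / len(actual)


def select_best(actual, forecasts_by_model):
    """Return the name of the model with the lowest MAPE on the testing set."""
    return min(forecasts_by_model,
               key=lambda name: mape(actual, forecasts_by_model[name]))


if __name__ == "__main__":
    actual = [100.0, 200.0]                      # hypothetical testing set
    candidates = {"Naive": [90.0, 150.0], "ETS": [110.0, 180.0]}
    print(select_best(actual, candidates))
```

MAPE is scale-free, which makes it convenient for comparing countries with very different caseloads, but it is undefined when an actual value is zero.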

3.6. Clustering Procedure

The point forecasts and prediction intervals of our approach represent average, worst-case, and best-case

scenarios of cases. Beyond these estimates, we pursue a further level of analysis to

assess the suitability of policy requirements at critical decision epochs during a pandemic.

To this end, we convert the extended horizon predictions into four average growth metrics (GR)

by first using the following formulation:

(6)

In this formulation, the input is the estimated value of the series (i.e., point forecast, lower,

or upper interval bound) at time t. We calculate the series’ growth for each period t,

starting on day t = 2 of the forecast. After completing this calculation, we average the values to

determine the expected mean growth for point forecasts and prediction intervals. The resulting

measures are denoted as (1) the average growth rate of point forecasts (GR-F), (2) the average

growth rate of the lower prediction interval (GR-L), and (3) the average growth rate of the upper

prediction interval (GR-H). Arithmetically, these formulations are akin to the slope valuation but

in terms of rates. The difference between the average growth of the upper prediction

interval (GR-H) and that of the lower prediction interval (GR-L) denotes the final metric in the

evaluation, defined as (4) the interval growth difference (DIFF-INT).
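The four growth metrics can be sketched in Python as follows. The per-period growth is assumed here to be the percentage change between consecutive forecasted values; this is an assumption for illustration, since the exact form of equation 6 is not reproduced in this text. The function names and the example values are hypothetical.

```python
def average_growth_rate(values):
    """Mean period-over-period growth in percent, starting at the second period.

    Assumes growth_t = 100 * (v_t - v_{t-1}) / v_{t-1}, which requires
    nonzero previous values (true for positive cumulative case counts).
    """
    rates = [100.0 * (cur - prev) / prev
             for prev, cur in zip(values, values[1:])]
    return sum(rates) / len(rates)


def growth_metrics(point, lower, upper):
    """Compute GR-F, GR-L, GR-H, and DIFF-INT from a forecast and its interval."""
    gr_f = average_growth_rate(point)
    gr_l = average_growth_rate(lower)
    gr_h = average_growth_rate(upper)
    return {"GR-F": gr_f, "GR-L": gr_l, "GR-H": gr_h,
            "DIFF-INT": gr_h - gr_l}


if __name__ == "__main__":
    # Hypothetical three-day forecast with its lower and upper bounds.
    metrics = growth_metrics(point=[100.0, 110.0, 121.0],
                             lower=[100.0, 105.0, 110.25],
                             upper=[100.0, 120.0, 144.0])
    print(metrics)
```

A wide DIFF-INT means the upper bound grows much faster than the lower bound, that is, the forecast uncertainty fans out quickly over the horizon.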

Each growth metric is considered a clustering dimension, and this multidimensional

arrangement groups the series into a prospective risk categorization. The clustering aims to

identify the series with a similar probability of spreading the virus based on the expected

caseload growth and interval variability. For this research, we applied the traditional k-means

method for clustering (Lloyd, 1982) because of its simplicity and overall good results; however,

other clustering techniques are equally applicable. We use the gap statistic (Bock H.H, 1985) to

determine the best k given the growth rate information of the countries evaluated.
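For readers unfamiliar with the method, a minimal Python sketch of Lloyd's k-means iteration is given below. It takes the initial centroids as an argument so that the result is deterministic, and it does not implement the gap statistic; in this sketch k is simply the number of initial centroids supplied. The function name and the two-dimensional example points are hypothetical, whereas the study clusters on the four growth metrics.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update steps.

    points and centroids are sequences of equal-length numeric tuples.
    Returns the cluster label of each point and the final centroids.
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(len(centroids)),
                      key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(dim) / len(members)
                                     for dim in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid
        if updated == centroids:     # converged
            break
        centroids = updated
    return labels, centroids


if __name__ == "__main__":
    # Hypothetical (GR-F, DIFF-INT) pairs forming two obvious groups.
    pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    print(kmeans(pts, centroids=[(0.0, 0.0), (10.0, 10.0)]))
```

Because k-means depends on its starting centroids, production implementations typically run several random starts and keep the best solution.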

After completing the cluster arrangement, we calculate the first (Q1) and third (Q3) quartiles

for the case growth (GR-F) and interval variability (DIFF-INT). We use the quartiles to

design nine risk quadrants, each representing the relative measures of low, mid, and high rate

estimates for both expected case growth and variability, as shown in Figure 1.
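The quadrant assignment can be sketched in Python as follows. Two assumptions are made: the numbering (quadrants 1 to 3 for low growth, 4 to 6 for mid growth, 7 to 9 for high growth, each ordered by low, mid, and high variability) is inferred from the descriptions in the text, and the quartiles use linear interpolation between order statistics (the "inclusive" method). With these assumptions, the sketch appears consistent with the W1 quadrants reported in Table 1. The function names are hypothetical.

```python
import statistics


def level(value, q1, q3):
    """Return 0 (low, below Q1), 1 (mid), or 2 (high, above Q3)."""
    if value < q1:
        return 0
    if value > q3:
        return 2
    return 1


def risk_quadrants(gr_f, diff_int):
    """Assign each series a quadrant from 1 to 9 using GR-F and DIFF-INT.

    The cut-off points are recomputed from the supplied values, so they
    change with every run, as the procedure requires.
    """
    f_q1, _, f_q3 = statistics.quantiles(gr_f, n=4, method="inclusive")
    d_q1, _, d_q3 = statistics.quantiles(diff_int, n=4, method="inclusive")
    return [3 * level(f, f_q1, f_q3) + level(d, d_q1, d_q3) + 1
            for f, d in zip(gr_f, diff_int)]


if __name__ == "__main__":
    # Hypothetical growth and variability values for five series.
    growth = [0.1, 1.0, 2.0, 3.0, 5.0]
    spread = [0.2, 3.0, 1.0, 1.5, 4.0]
    print(risk_quadrants(growth, spread))
```

Quadrants three, six, and nine share high variability, which is why they are read together as the high-uncertainty group.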

Figure 1. Risk Quadrants

The proposed quadrant classification locates countries in multi-layered risk estimations

based on the GR-F and DIFF-INT metrics. Each quadrant implies a different extent of contagion

risk dependent on the case growth and the uncertainty denoted in the forecast intervals. A country

in a higher quadrant faces an elevated risk of contagion. The higher the risk, the greater

the need for policy intervention. For instance, quadrant nine (9) implies both high

case growth (GR-F) and high expected variability (DIFF-INT). The quadrant classification also

facilitates interpretation: for example, a country in quadrant three (3) might warrant stricter

policies than one in quadrant two (2) because of the higher projected uncertainty of cases. For

the decision maker, the objective is that, in subsequent iterations of the approach

(rolling-origin evaluation), a quadrant measure lower than in an earlier run might support

starting to relax the policies’ strictness.

The interval variability implies that a country might move to a higher quadrant as the

application is run multiple times during the pandemic progression. If the decision maker

observes that the quadrant measure increases in subsequent evaluations, stricter

requirements are suggested, as the initially identified variability has translated into more

actual cases. If it moves in the opposite direction, and the quadrant classification is reduced under

the current stringency policy, the decision maker might consider relaxing the policies.

The expected variability can also be used to assess each country’s progress in collecting

and reporting quality caseload data. Countries with a wider prediction interval will be placed in

quadrants three, six, and nine, indicating that their forecasts have high uncertainty and stringency

cannot be relaxed unless prediction intervals narrow down in further decision epochs.

3.7. AGGFORCLUS Blueprint

Our development combines, in three sequential phases, the steps described in sections 3.1

through 3.6, as represented in Figure 2. The denomination AGGFORCLUS signifies

the phases of progression in the procedure. The outputs of the implementation consist of

composite tables with the forecasts, the performance, and the average growth metrics. In addition,

AGGFORCLUS renders these values as graphical representations denoting

the clustering grouping. These are directly related to the projected contagion risk with the

corresponding quadrant classification.

Figure 2. AGGFORCLUS Procedure

The overall procedure is developed by implementing numerous libraries, primarily the

forecast package (Hyndman & Khandakar, 2008) in the R statistical software (version 4.01)

through the RStudio Cloud service, where the bld.mbb.bootstrap function generates the

bootstrapped remainder series of the bagged forecasts (Bergmeir et al., 2016; Hyndman &

Athanasopoulos, 2018). In the application’s programming, we follow the algorithmic structure

presented below in Figure 3. We provide the R code reference in our data repository

https://osf.io/crxn7/?view_only=b87da8aa9f1f46a2a8766f0fdf00887d (Folder: R Codification) to

facilitate the evaluation and replication of results.

Figure 3. AGGFORCLUS Algorithmic Form

To complement the assessment, we perform a parallel evaluation of our results with the

Oxford COVID-19 Government Response Tracker (Hale et al., 2021), which tracks the policy

stringency level of each nation and records the strictness of country-level guidelines such as

lockdowns. This tracker aims to compare the response to COVID-19 of governments worldwide

to understand their effectiveness in controlling the pandemic and contribute to global efforts to

stop the spread of the virus. The data used in this tool is extracted from common policy responses

followed by each government worldwide. This information is then used to score the stringency of

the measures adopted and aggregate this score into a Stringency Index. The visualization of this

index is done by employing a heat map on a global chart that ranks the stringency of the countries

with values from 0 (light grey) to 100 (red) for any day starting on January 2, 2020, until the

present day, or utilizing a time series showing the stringency index of all countries over time in a

scatter plot for a time range starting on June 20, 2020.

The stringency drivers used by the Oxford COVID-19 Government Response Tracker to

determine the index are contained in five significant groups summarized as Containment and

Closure (C1-C8), Economic Response (E1-E4), Health Systems (H1-H8), Vaccine Policies (V1-

V3), and Miscellaneous (M1).

4. AGGFORCLUS Implementation

4.1. AGGFORCLUS Phase I

For the multiple origin evaluation, we calculate the forecast models for the 30 countries

(see section 3.1). We consider three independent multi-origin rounds in which the training set

covers 30 days in April 2020 for the first round (W1) and 31 days for the second (W2) and third

(W3) rounds, in January 2021 and January 2022, respectively. The testing set covers the ten following days.

We develop the five forecast methods (i.e., (1) Naïve, (2) ETS, (3) ARIMA, (4) B-ETS, and (5)

B-ARIMA) to determine the best for the series based on the performance and create a

comprehensive report with the entire scope of the evaluation for each country, as presented in

Figure 4.

Figure 4. AGGFORCLUS Phase I (Argentina; panels W1, W2, W3)

Our data do not contain seasonality on an aggregate level and are exponential in nature, so we focus

on non-seasonal models. Petropoulos et al. (2020) consider that an exponential smoothing model

that satisfies this criterion is the non-seasonal multiplicative error and multiplicative trend

exponential smoothing model, usually denoted as ETS(MMN). However, our development fits

multiple forecast models and automates the process using the ets(z, z, z) and auto.arima functions

from the forecast package in R (Hyndman & Khandakar, 2008) to identify the best configuration

according to series information.

We present the analysis performance plots for all countries during the three rounds in our

data repository https://osf.io/crxn7/?view_only=b87da8aa9f1f46a2a8766f0fdf00887d (refer to

folder: Analysis/Phase I). The bagged forecasts do not offer a classification for the characteristic

ETS or ARIMA parameters. The estimates come from the combined aggregated forecasts for each

bootstrapped series as specified in Algorithm 1 (see section 3.4). However, our analysis is not just

of these estimated point projections but also includes the expected uncertainty in the anticipated

prediction intervals.

Table 1 below presents the results for the performance and the model selected for the

country in each round (refer to columns Best Method and MAPE). The overall highest errors in

W1 were 16%, 11%, 7.97%, 7.75%, and 7.45%, respectively for Chile, India, Colombia, Nigeria,

and South Africa. While all other errors were lower than 7% in all windows, we under-forecasted

the confirmed cases when the virus started picking up in Chile and India. These values do not

necessarily mean that our forecasts were positively biased; rather, containment

measures were implemented in these countries to reduce the impact of the pandemic, and such

procedures changed the recognized patterns in the data. These forecast errors might also be

related to the accuracy of the reports because most developing countries had delays in their reporting

during this period. In the pandemic’s starting days, cases multiplied significantly, and data were

not straightforwardly collected in most developing countries; collection improved over time, and

forecasted values were adjusted in the following windows.

A noticeable characteristic of the W1 evaluation is that France and China presented almost

constant aggregated cases during the assessment, implying that there were few new cases during

this window because of the strictness of their lockdown measures. Given that the case

series were almost steady, the Naïve forecasts performed well. The following

AGGFORCLUS phases show how the estimated forecasts are used to calculate the growth metrics

and design the clustering risk assessment approach.

4.2. AGGFORCLUS Phase II

After developing and evaluating the performance of the models, we predict the projected

number of aggregated cases with a horizon of 10 days ahead for all countries in each round (i.e.,

May 11 to May 20, 2020; February 11 to February 20, 2021; and February 11 to February 20,

2022, respectively). The best forecasting model identified (Phase I) is recalibrated and used to

predict point forecasts and 95% prediction intervals. Figure 5 shows partial outputs from the

AGGFORCLUS development in each time window. In these figures, the continuous line shows the calculated

point forecasts, and the shaded area displays the 95% prediction intervals of the aggregated cases.

This calculation allows us to compare the uncertainty levels across different

periods, given the cumulative nature of the data. We refer to

https://osf.io/crxn7/?view_only=b87da8aa9f1f46a2a8766f0fdf00887d (folder:

Analysis/Phase II) for complete output information on all countries analyzed.

Figure 5. AGGFORCLUS Phase II (panels W1, W2, W3)

Several observations arise in this phase. A significant forecast error is associated with

changes in the observed patterns. Concerning the confirmed cases, countries such as Chile in W2

and Egypt in W1 and W3 have a lower interval variation. In countries like China and Denmark,

we observe a progressive decrease in the forecasted uncertainty due to enhanced surveillance

policies and higher test availability. Other countries such as Argentina, Australia, Brazil, Canada,

Chile, Egypt, and Finland had broad uncertainty in W1, which effectively narrowed in W2, but

went wide again in W3. Colombia’s caseload uncertainty seems to increase through W1, W2, and

W3.

The forecasts can inform us of what happened and whether the applied policies and

measures were successful. However, the expected growth of cases can inform us about the

decisions concerning retaining, strengthening, or relaxing such standards. Decision-makers should

consider the interaction with the intervals, which must be echoed in the policy’s relaxation or

further stringency assessments. For instance, an uptrend of cases with more significant expected

variability (i.e., higher width of the interval) should be considered differently than when we have

similar trends but with a lower variation.

4.3. AGGFORCLUS Phase III

In this phase, we calculate the growth metrics (i.e., GR-F, GR-L, GR-H, DIFF-INT; see

section 3.6). These values allow us to develop the k-means clustering approach (Lloyd, 1982) to

arrange countries with comparable growth values that will be used to benchmark stringency levels.

Using the resulting GR-F and DIFF-INT, we calculate the first and third quartiles and position the

countries in the risk quadrant defined by the cases and uncertainty. The risk evaluation cut-off

points (i.e., Q1 and Q3) and the optimal k groups are dynamic and must be re-calculated in each

run. Their layout depends on each round’s resulting growth relative measures.

Figure 6 presents the resulting classification for all rounds in this development. The

plots focus on the association between GR-F and DIFF-INT, as the first

measures the projected increase of cases and the second the uncertainty (average size of the

prediction interval). Risk quadrants are plotted in combination with the cluster arrangement. In the

resulting plot, each country is labeled with its cluster number (countries with

the same number belong to the same cluster). Dashed lines correspond to the

quadrants’ cut-off points, represented by the Q1 and Q3 of the metrics. Each quadrant is assigned a

number, denoting the risk valuation category (refer to Figure 1 for details). This display is our

primary emphasis as it shows the associated risk classification of cases and anticipation for the

analyzed countries in each round. Table 1 below presents all measures for all the rounds, including

the Oxford Stringency Index during the time windows studied. Additionally, we refer to

https://osf.io/crxn7/?view_only=b87da8aa9f1f46a2a8766f0fdf00887d (folder:

Analysis/Phase III) for complete output information on all countries.

Evaluating the results, in W1 (May 11 – May 20, 2020), the countries categorized as the

highest risk are Nigeria, Chile, and South Africa, all from cluster 6; while Chile and Nigeria were

classified in quadrant nine, South Africa was classified in quadrant eight as Chile and Nigeria

denoted a higher variability. While the stringency for Nigeria and South Africa was 84.26%, Chile

at the time was implementing countrywide mitigation restrictions with localized regional

lockdowns, and the stringency was measured as 75.93%. At this time, a potential solution for Chile

would have been to enhance the stringency and impose a total temporary lockdown, as suggested

by previous research (Li, 2022).

Likewise, countries at similar risk are Brazil and Colombia from cluster 5 (both in quadrant

eight), India and Egypt from cluster 8 (quadrant eight and seven respectively), and Saudi Arabia

from cluster 3 (quadrant eight). For these countries, all stringency measures at the time were in an

80+ range, with Brazil as the lowest (81.02%). A recommendation at the time for these countries

would have been to continue to strengthen their mitigation strategies. Interestingly, the country

with the highest stringency level in this window was Peru (96.30%). While Peru’s risk position is

in quadrant five when associated with its cluster arrangement (cluster 5), all the other countries in

this cluster, Mexico, Brazil, and Colombia, are in a higher quadrant (six, eight, and eight,

respectively) implying that if not for this level of stringency, Peru would have been placed with

them.

In W2 (February 11 - February 20, 2021), as the pandemic progressed, the

overall GR-F for countries is significantly lower than in W1. This is

expected, as cases grew significantly more slowly than at the starting point of the

pandemic: during this period, vaccination was available for the population while relatively strong

stringency measures were maintained. In this window, the countries classified at the highest risk

(quadrant nine) are Mongolia (cluster 1), Spain, and Indonesia (cluster 4), with stringency levels of

75%, 81.94%, and 74.54%, respectively.

In W3 (February 11 - February 20, 2022), the GR-F increased compared to W2; this is

because, while vaccination was available for all countries, the stringency for most was also relaxed,

and the Omicron variant had the highest degree of virus transmissibility (WHO, 2020). In this

window, South Korea (cluster 20), Norway (cluster 11), Japan, New Zealand, Denmark (cluster

15), Chile (cluster 8), and Germany (cluster 7) were identified as the ones at higher risk (all placed

in quadrants nine and eight). A distinctive characteristic of this assessment is the large number

of resulting clusters (k = 20), exemplifying the notable differences between the countries’

mitigation plans, reflected in the volume of cases and the variability denoted. South Korea has the

highest risk, given its elevated forecasted GR-F (4.69). During this period, the stringency level for

South Korea was 46.30%; this was not uncommon at the time, since some other countries were

at similar stringency levels. Given the expected rise in cases for South Korea at the time, a

recommendation would have been to increase the strictness of its mitigation plans, as

AGGFORCLUS placed the country at a considerable risk of contagion.

A meaningful perspective provided by AGGFORCLUS is the visualized evolution of

uncertainty concerning caseload growth, which might convey a performance view of each

country’s quality in their caseload data collection and reporting. For example, Mongolia was in

quadrants six and nine for W1, and W2, respectively, and came down to quadrant two in W3,

evidencing the buildup of diagnostic and reporting capabilities in this last stage. Despite their

elevated testing capabilities, Japan was in quadrant six for W1 and W2 but regressed to quadrant

nine in W3, evidencing operational delays of their manual reporting in W1, the immaturity of the

recently installed online system to collect Delta cases in W2, and the relaxation of contact tracing

surveillance policies during Omicron for W3 (Bacchi, 2022; Tokumoto et al., 2021).

5. Additional Experiments: Rolling-Origin Evaluation

For this assessment, we included an additional 30 countries (a total of 60), as detailed in

section 3.1. This approach showcases the use of AGGFORCLUS if the evaluation followed

a real-time exercise when the pandemic is initially triggered. We present the findings for this ten-

day rolling-origin assessment for five additional rounds (i.e., R1 to R5) following the time for the

W1 review (May 20, 2020). Figure 7 and Table 2 comprise the rolling-origin resulting cluster

classification plots and table.

Figure 6. AGGFORCLUS Phase III

Notably, in R1, while most forecasts present errors similar to those

presented in W1, both Nepal (49.97%) and Mongolia (41.75%) have

substantially higher errors. This is because cases once again started to pick up in both

countries, and the previously identified pattern of the cases changed drastically in this period. This

situation illustrates one of the main features of AGGFORCLUS: phase II, when using the

recalibrated model, reflects the perceived variability of cases by increasing the interval size,

which in phase III tends to place these countries in one of the higher-uncertainty quadrants (i.e.,

quadrants three, six, or nine, depending on each country’s case growth), informing the

decision maker of these perceived modifications in the pattern.

Table 1. Overall Results AGGFORCLUS (W1, W2, W3)

Each cell lists the values for W1, W2, and W3, in that order.

| Country | Best Method | MAPE (%) | GR-F | GR-L | GR-H | DIFF-INT | AGGFORCLUS Cluster | AGGFORCLUS Risk Quadrant | Oxford Stringency (%) |
|---|---|---|---|---|---|---|---|---|---|
| Argentina | B-ETS, ETS, B-ETS | 1.4, 0.35, 0.41 | 2.75, 0.36, 0.32 | 1.59, 0.21, -0.3 | 3.98, 0.5, 1.13 | 2.39, 0.29, 1.43 | 9, 7, 14 | 5, 5, 6 | 88.89, 79.17, 49.07 |
| Australia | B-ARIMA, ETS, B-ARIMA | 0.33, 0.01, 0.19 | 0.22, 0.01, 0.92 | -1.2, 0.01, -1.03 | 1.56, 0.02, 2.61 | 2.76, 0.01, 3.65 | 4, 9, 18 | 2, 1, 6 | 69.44, 56.02, 55.56 |
| Brazil | B-ARIMA, ETS, ARIMA | 3.57, 0.34, 0.25 | 4.32, 0.52, 0.47 | 2.58, 0.43, 0.2 | 5.39, 0.61, 0.73 | 2.81, 0.18, 0.53 | 5, 3, 16 | 8, 8, 4 | 81.02, 69.91, 61.57 |
| Canada | B-ARIMA, ETS, B-ETS | 1.74, 0.09, 0.17 | 1.8, 0.32, 0.29 | 1.37, 0.28, 0.1 | 2.09, 0.37, 0.47 | 0.73, 0.1, 0.37 | 2, 8, 3 | 4, 4, 1 | 74.54, 75.46, 76.39 |
| Chile | ARIMA, ARIMA, B-ARIMA | 16.21, 0.35, 0.56 | 4.45, 0.48, 1.26 | 2.78, 0.4, 0.85 | 5.85, 0.56, 1.72 | 3.07, 0.15, 0.88 | 6, 3, 8 | 9, 4, 8 | 75.93, 79.17, 30.09 |
| China | Naïve, ARIMA, ARIMA | 0.06, 0.08, 0.71 | 0, 0.03, 1.2 | -0.06, -0.08, 0.78 | 0.06, 0.14, 1.6 | 0.11, 0.23, 0.82 | 2, 9, 8 | 1, 2, 5 | 81.94, 78.24, 64.35 |
| Colombia | ARIMA, B-ETS, ETS | 7.97, 0.28, 0.28 | 3.91, 0.24, 0.11 | 2.73, -0.07, -0.19 | 4.96, 0.58, 0.4 | 2.23, 0.64, 0.58 | 5, 10, 17 | 8, 5, 2 | 87.04, 81.02, 62.04 |
| Denmark | B-ETS, B-ARIMA, B-ETS | 0.59, 0.02, 1.98 | 0.9, 0.22, 2.05 | -0.79, -0.12, 1.58 | 1.98, 0.58, 2.79 | 2.77, 0.7, 1.21 | 4, 10, 15 | 5, 3, 9 | 68.52, 66.67, 16.67 |
| Egypt | ARIMA, ARIMA, ETS | 6.81, 0.02, 0.07 | 4.77, 0.38, 0.48 | 4.06, 0.2, 0.42 | 5.43, 0.55, 0.54 | 1.37, 0.35, 0.12 | 8, 7, 13 | 7, 5, 4 | 84.26, 54.63, 43.52 |
| Finland | ETS, B-ARIMA, ETS | 0.7, 0.15, 0.67 | 1.01, 0.72, 1.2 | 0.75, 0.58, 0.85 | 1.25, 0.92, 1.53 | 0.5, 0.34, 0.69 | 2, 6, 4 | 4, 8, 5 | 68.52, 52.31, 38.89 |
| France | Naïve, ETS, ETS | 2.97, 0.25, 1.32 | 0, 0.51, 1.28 | -3.18, 0.42, 1.02 | 2.09, 0.6, 1.52 | 5.27, 0.18, 0.5 | 7, 3, 4 | 3, 8, 7 | 87.96, 60.19, 72.22 |
| Germany | B-ARIMA, ETS, ARIMA | 0.88, 0.59, 2.71 | 0.55, 0.34, 1.48 | -0.83, -0.53, 1.14 | 1.94, 1.16, 1.8 | 2.76, 1.69, 0.66 | 4, 2, 7 | 5, 6, 8 | 64.35, 83.33, 48.15 |
| India | ETS, B-ETS, ARIMA | 11.69, 0.02, 0.35 | 4.56, 0.09, 0.13 | 3.41, 0.07, -0.2 | 5.59, 0.11, 0.45 | 2.18, 0.03, 0.65 | 8, 9, 9 | 8, 1, 2 | 81.94, 61.57, 75.46 |
| Indonesia | ETS, ETS, B-ARIMA | 2.57, 0.8, 1.87 | 2.53, 0.71, 0.92 | 1.82, 0.27, 0.59 | 3.18, 1.14, 1.23 | 1.35, 0.88, 0.64 | 9, 4, 2 | 4, 9, 5 | 74.54, 68.06, 68.98 |
| Italy | ARIMA, ARIMA, ARIMA | 0.96, 0.09, 2.53 | 0.36, 0.45, 0.67 | -0.54, 0.33, 0.08 | 1.19, 0.58, 1.22 | 1.74, 0.26, 1.15 | 4, 3, 14 | 2, 5, 5 | 75.00, 74.07, 76.85 |
| Japan | ETS, ETS, B-ARIMA | 1.31, 1.01, 2.88 | 0.43, 0.45, 2.36 | -2.15, -0.46, 1.67 | 2.52, 1.29, 2.95 | 4.68, 1.76, 1.28 | 7, 2, 15 | 3, 6, 9 | 47.22, 49.54, 47.22 |
| Mexico | ARIMA, ARIMA, ARIMA | 1.53, 1.14, 0.54 | 3.68, 0.64, 0.56 | 1.99, 0.41, 0.34 | 5.14, 0.85, 0.76 | 3.15, 0.44, 0.42 | 5, 5, 5 | 6, 8, 4 | 82.41, 68.98, 38.89 |
| Mongolia | B-ETS, ETS, B-ETS | 3.39, 1.22, 1.27 | 0.85, 2.07, 0.15 | -5.14, 1.34, -0.52 | 3.51, 2.74, 0.59 | 8.65, 1.4, 1.11 | 1, 1, 1 | 6, 9, 2 | 75.00, 73.61, 23.60 |
| Morocco | ARIMA, ETS, ETS | 3.72, 0.08, 0.56 | 2.42, 0.1, 0.13 | 1.14, 0.07, -0.23 | 3.56, 0.13, 0.47 | 2.42, 0.06, 0.7 | 9, 9, 9 | 5, 1, 2 | 93.52, 76.85, 65.74 |
| New Zealand | B-ETS, B-ETS, ETS | 0.28, 0.43, 2.12 | 0.17, 0.08, 1.99 | -1.81, 0.03, 1.35 | 1.91, 0.1, 2.6 | 3.72, 0.07, 1.24 | 7, 9, 15 | 3, 1, 9 | 83.33, 22.22, 62.04 |
| Nigeria | ARIMA, ARIMA, ARIMA | 7.75, 0.96, 0.03 | 4.76, 0.9, 0.01 | 2.91, 0.78, -0.11 | 6.3, 1.02, 0.13 | 3.38, 0.24, 0.25 | 6, 6, 6 | 9, 8, 1 | 84.26, 58.33, 37.96 |
| Norway | ARIMA, ARIMA, ETS | 0.34, 0.12, 2.59 | 0.01, 0.28, 2.21 | -1.27, -0.06, 0.67 | 1.14, 0.61, 3.57 | 2.41, 0.66, 2.9 | 4, 10, 11 | 2, 5, 9 | 67.59, 73.15, 25.00 |
| Peru | ETS, ARIMA, ETS | 3.76, 0.68, 0.74 | 3.66, 0.5, 0.38 | 2.43, 0.29, -0.62 | 4.73, 0.69, 1.3 | 2.31, 0.4, 1.92 | 5, 5, 12 | 5, 5, 6 | 96.30, 86.11, 61.11 |
| Saudi Arabia | ETS, B-ETS, B-ARIMA | 3.13, 0.07, 0.19 | 3.97, 0.1, 0.44 | 3.2, 0.08, 0.26 | 4.7, 0.12, 0.61 | 1.5, 0.04, 0.36 | 3, 9, 5 | 8, 1, 4 | 89.81, 57.41, 75.93 |
| South Africa | ARIMA, ARIMA, B-ARIMA | 7.45, 0.56, 0.04 | 4.67, 0.21, 0.08 | 3.16, -0.44, -0.02 | 5.97, 0.82, 0.17 | 2.81, 1.26, 0.2 | 6, 2, 19 | 8, 3, 1 | 84.26, 64.81, 44.44 |
| South Korea | ETS, B-ARIMA, ARIMA | 0.12, 0.15, 6.14 | 0.1, 0.49, 4.69 | 0, 0.27, 3.57 | 0.2, 0.79, 5.71 | 0.2, 0.52, 2.13 | 2, 5, 20 | 1, 5, 9 | 43.52, 63.89, 46.30 |
| Spain | ARIMA, B-ETS, B-ETS | 0.63, 0.87, 3.11 | 0.45, 0.66, 0.51 | -1.11, -0.13, 0.12 | 1.77, 1.11, 0.72 | 2.88, 1.24, 0.6 | 4, 4, 16 | 6, 9, 5 | 81.94, 71.30, 46.76 |
| Sweden | ARIMA, ETS, B-ARIMA | 0.89, 0.46, 3.46 | 1.94, 0.29, 1.13 | 1.77, 0.02, 0.56 | 2.1, 0.55, 1.51 | 0.32, 0.53, 0.95 | 2, 10, 8 | 4, 5, 5 | 64.81, 69.44, 19.44 |
| United Kingdom | ARIMA, ETS, ARIMA | 2.69, 0.27, 1.75 | 1.37, 0.26, 0.68 | 0.93, -0.07, 0.38 | 1.79, 0.59, 0.96 | 0.85, 0.66, 0.58 | 2, 10, 2 | 4, 5, 5 | 79.63, 87.96, 42.13 |
| US | ETS, ETS, B-ARIMA | 0.25, 0.25, 1.11 | 1.19, 0.33, 0.3 | 0.33, 0.28, -0.11 | 1.98, 0.38, 0.6 | 1.65, 0.1, 0.72 | 2, 8, 10 | 5, 4, 2 | 72.69, 68.06, 58.80 |

Over time, the forecast uncertainty, measured by the width of the prediction intervals, decreases for most countries as the series pattern is continuously monitored. For example, Mongolia’s rolling quadrant values across the five rounds are three, nine, six, three, and six, while Japan’s are three, six, five, five, and five. From May 21 until July 9, Mongolia never significantly reduced the uncertainty of its data, whereas Japan moved one quadrant down, which, as recounted, followed its transition from a manual to an online data collection and reporting system (Bacchi, 2022).
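As an illustrative check of the comparison above, the following Python sketch recomputes the prediction-interval width for Mongolia and Japan in each round from the GR-L and GR-H values in Table 2. It assumes that DIFF-INT is simply GR-H minus GR-L, which agrees with the tabulated DIFF-INT values to within rounding; the function and variable names are hypothetical and are not taken from the AGGFORCLUS repository.

```python
# Growth-rate interval bounds per round (R1..R5), copied from Table 2.
GR_L = {
    "Mongolia": [-8.63, -11.02, -2.10, -9.41, -0.85],
    "Japan": [-1.84, -0.99, -0.67, -0.48, -0.19],
}
GR_H = {
    "Mongolia": [4.21, 15.04, 3.66, 4.81, 2.64],
    "Japan": [1.83, 1.41, 1.10, 1.11, 1.22],
}
# DIFF-INT as reported in Table 2, kept only for comparison.
DIFF_INT_REPORTED = {
    "Mongolia": [12.85, 26.06, 5.75, 14.22, 3.49],
    "Japan": [3.67, 2.40, 1.78, 1.59, 1.41],
}


def interval_widths(lower, upper):
    """Interval width per round, ASSUMED here to be GR-H minus GR-L."""
    return [round(h - l, 2) for l, h in zip(lower, upper)]


def is_non_increasing(values):
    """True when every value is no larger than the one before it."""
    return all(b <= a for a, b in zip(values, values[1:]))


widths = {c: interval_widths(GR_L[c], GR_H[c]) for c in GR_L}
for country, w in widths.items():
    print(country, w, "narrowing every round:", is_non_increasing(w))
```

Under this assumption, Japan's interval narrows in every round, whereas Mongolia's widens in R2 and R4 before narrowing again.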

The rolling-origin evaluation also shows that forecast accuracy improves under continuous monitoring: for most series, the mean absolute percentage error (MAPE) on the evaluation sets decreases perceptibly from the first round to the later rounds. At the same time, the average forecast error for confirmed cases in these rounds has been as low as 1% or less at the furthest horizon for most series in the evaluation.
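The rolling-origin procedure and the MAPE score discussed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the drift forecaster is only a stand-in for the ETS, ARIMA, and bagged models used in the study, the case counts are hypothetical, and the function names are ours.

```python
def mape(actual, forecast):
    """Mean absolute percentage error in percent (assumes no zero actuals)."""
    errors = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


def drift_forecast(history, horizon):
    """Stand-in forecaster: extend the average historical step forward."""
    step = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + step * h for h in range(1, horizon + 1)]


def rolling_origin_mape(series, first_origin, horizon, step):
    """Refit at successive origins; score each round on the next `horizon` points."""
    scores = []
    origin = first_origin
    while origin + horizon <= len(series):
        train = series[:origin]
        test = series[origin:origin + horizon]
        scores.append(mape(test, drift_forecast(train, horizon)))
        origin += step
    return scores


# Hypothetical cumulative confirmed cases (not data from the study).
cases = [100, 130, 165, 205, 250, 300, 352, 405,
         458, 510, 561, 611, 660, 708, 755, 801]
round_scores = rolling_origin_mape(cases, first_origin=6, horizon=2, step=2)
print([round(s, 2) for s in round_scores])
```

Each entry of `round_scores` corresponds to one evaluation round, so a downward trend in that list is the pattern the text describes.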

6. Concluding Remarks

Our development, called AGGFORCLUS, not only reports both the caseload mean estimate and the levels of uncertainty (as do most forecasting methods available in the literature) but also uses these values to develop a systematic, time-progressing evaluation of each country’s pandemic risk. When AGGFORCLUS is evaluated alongside the Oxford COVID-19 Government Response Tracker, it provides an expanded view of the strictness that a nation might enforce compared with others in a similar situation (same cluster and quadrant), which extends our results to the measures that a government might anticipate.
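One way to read Table 2 in the manner described above is to compare each country's Oxford stringency with the mean stringency of the other countries classified in the same quadrant. The Python sketch below does this for a subset of round R1 values copied from Table 2. Using the within-quadrant mean as the reference is our illustrative choice, not a step defined by the methodology, and the helper name is hypothetical.

```python
from collections import defaultdict
from statistics import mean

# (AGGFORCLUS quadrant, Oxford stringency %) in round R1, subset of Table 2.
R1 = {
    "Argentina": (8, 90.74),
    "Bolivia": (8, 96.30),
    "Brazil": (8, 81.02),
    "Colombia": (8, 87.04),
    "Canada": (4, 72.69),
    "Iran": (4, 45.37),
    "Poland": (4, 83.33),
    "Sweden": (4, 64.81),
    "US": (4, 72.69),
}


def stringency_gap_by_quadrant(records):
    """Stringency minus the mean stringency of the country's own quadrant."""
    groups = defaultdict(list)
    for _, (quadrant, stringency) in records.items():
        groups[quadrant].append(stringency)
    return {
        country: round(stringency - mean(groups[quadrant]), 2)
        for country, (quadrant, stringency) in records.items()
    }


gaps = stringency_gap_by_quadrant(R1)
for country, gap in sorted(gaps.items(), key=lambda kv: kv[1]):
    print(f"{country:10s} {gap:+.2f}")
```

A negative gap marks a country that was less strict than its quadrant peers in that round, and a positive gap marks one that was stricter.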

Our approach performs best at an aggregate level, since aggregation provides an inherent degree of smoothing; it is therefore suited to extensive regions such as countries, regardless of the condition of their collected data. The quadrant classification proposed in AGGFORCLUS conveys joint information on both the forecasted caseload growth and the data variability (in some cases related to issues with surveillance policies and reporting systems), thereby giving the decision-maker an informed yet visually simple view of the risk status of each country. The forecast clustering and risk classification proposed in AGGFORCLUS can assist decision-makers in setting the strictness of their policies or regulations when developing mitigation plans. However, the model we use to forecast confirmed COVID-19 cases has certain limitations. As a purely univariate model, it does not consider the primary drivers of cases, such as governmental actions. It exclusively extrapolates established patterns in the data, assuming that these patterns are accurate and will continue to hold in the future.
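The smoothing effect of aggregation mentioned at the start of this discussion can be illustrated with a small simulation. The Python sketch below sums hypothetical regional series and compares their relative variability before and after aggregation. It assumes that the regional fluctuations are independent, which real surveillance data will only partly satisfy, so the reduction shown here is an upper bound on what aggregation can deliver.

```python
import random
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Relative variability: population standard deviation over the mean."""
    return pstdev(values) / mean(values)


random.seed(0)  # fixed seed so the illustration is reproducible
n_regions, n_days = 25, 200

# Hypothetical daily counts: each region fluctuates independently around 100.
regions = [
    [random.gauss(100, 20) for _ in range(n_days)]
    for _ in range(n_regions)
]
# Country-level series: sum across regions for each day.
aggregate = [sum(day) for day in zip(*regions)]

regional_cv = mean(coefficient_of_variation(r) for r in regions)
aggregate_cv = coefficient_of_variation(aggregate)
print(f"mean regional CV: {regional_cv:.3f}, aggregate CV: {aggregate_cv:.3f}")
```

With independent regions the aggregate variability falls roughly with the square root of the number of regions, which is why a country-level series is easier to extrapolate than its parts.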

Overall, we believe the methodology is sustainable over time. The model reassessment depends exclusively on the confirmed case time series, in contrast to other decision support models that rely on multiple datasets and simulated parameters that might need to be re-evaluated for each decision epoch.

Figure 7. AGGFORCLUS Rolling-Origin

Figure 7. AGGFORCLUS Rolling-Origin (continued)

Table 2. Overall Results AGGFORCLUS (each cell lists the values for rounds R1, R2, R3, R4, R5)

Country | Best Method | MAPE (%) | GR-F | GR-L | GR-H | DIFF-INT | AGGFORCLUS Cluster | AGGFORCLUS Quadrant | Oxford Stringency (%)
Algeria | B-ETS, B-ARIMA, ETS, B-ETS, B-ETS | 0.24, 0.48, 0.92, 0.14, 2.29 | 2.13, 1.38, 1.04, 0.96, 1.99 | 1.58, 0.89, 0.44, 0.43, 1.6 | 2.66, 1.83, 1.62, 1.34, 2.39 | 1.08, 0.94, 1.18, 0.91, 0.79 | 13, 16, 15, 15, 13 | 5, 4, 5, 4, 4 | 76.85, 76.85, 76.85, 65.74, 65.74
Argentina | B-ARIMA, ETS, ETS, ARIMA, ETS | 7.09, 10.41, 1.32, 4.99, 1.85 | 3.52, 3.92, 3.42, 4.18, 3.17 | 2.63, 2.81, 2.7, 3.31, 2.42 | 4.37, 4.93, 4.1, 4.98, 3.87 | 1.74, 2.12, 1.4, 1.67, 1.44 | 21, 12, 17, 28, 18 | 8, 8, 8, 8, 8 | 90.74, 90.74, 88.89, 88.89, 88.89
Australia | B-ETS, ARIMA, B-ETS, ETS, B-ARIMA | 0.2, 0.11, 0.09, 0.44, 1.18 | 0.01, 0.23, 0.17, 0.03, 1.35 | -1.27, -0.93, -0.78, -1.59, 0.72 | 1.18, 1.27, 1.59, 1.44, 2.19 | 2.45, 2.2, 2.38, 3.04, 1.47 | 6, 11, 6, 25, 8 | 2, 2, 3, 3, 5 | 67.13, 63.43, 60.19, 50.46, 52.31
Austria | B-ETS, B-ETS, B-ETS, ETS, ARIMA | 0.17, 0.09, 0.1, 0.06, 0.23 | 0.33, 0.31, 0.17, 0.18, 0.25 | -0.82, -0.72, -0.98, -0.81, -0.68 | 1.46, 1.67, 1.22, 1.08, 1.11 | 2.28, 2.39, 2.2, 1.89, 1.79 | 9, 11, 7, 9, 25 | 5, 6, 3, 6, 6 | 59.26, 53.7, 50, 50, 50
Bolivia | B-ETS, ETS, B-ETS, ETS, ETS | 7.53, 8.11, 3.6, 8.76, 2.48 | 4.47, 5.15, 2.56, 3.26, 2.28 | 3.37, 4.18, 1.62, 2.57, 1.68 | 5.27, 6.03, 3.26, 3.91, 2.84 | 1.9, 1.85, 1.64, 1.34, 1.17 | 12, 1, 14, 4, 14 | 8, 8, 8, 8, 8 | 96.3, 93.52, 88.89, 88.89, 89.81
Brazil | B-ARIMA, ARIMA, ARIMA, ETS, B-ETS | 5.93, 2.06, 3.28, 1.41, 1.2 | 4.53, 4.32, 3.23, 3.12, 2.28 | 3.3, 3.32, 2.86, 2.39, 1.68 | 5.89, 5.24, 3.58, 3.79, 2.7 | 2.59, 1.92, 0.73, 1.41, 1.02 | 14, 1, 12, 4, 4 | 8, 8, 7, 8, 8 | 81.02, 81.02, 77.31, 77.31, 77.31
Cameroon | ARIMA, B-ETS, ETS, ETS, ARIMA | 9.09, 11, 5.35, 4.28, 4.18 | 1.75, 1.86, 2.65, 2.03, 1.39 | 1.07, -0.58, 1.86, 1.41, 0.28 | 2.34, 3.44, 3.35, 2.58, 2.35 | 1.27, 4.02, 1.49, 1.17, 2.08 | 16, 5, 4, 5, 17 | 5, 6, 8, 5, 6 | 63.89, 63.89, 60.19, 60.19, 60.19
Canada | B-ETS, B-ARIMA, ETS, ARIMA, ETS | 0.12, 0.82, 1.28, 0.22, 0.36 | 1.26, 0.95, 0.44, 0.38, 0.3 | 0.74, 0.6, 0.02, -0.06, -0.06 | 1.73, 1.34, 0.85, 0.81, 0.64 | 0.99, 0.74, 0.84, 0.88, 0.7 | 20, 10, 20, 26, 20 | 4, 4, 4, 4, 4 | 72.69, 70.83, 70.83, 70.83, 68.98
Chile | B-ETS, ARIMA, B-ARIMA, ARIMA, B-ETS | 4.84, 10.1, 1.57, 0.95, 0.52 | 4.21, 4.57, 3.09, 2.05, 1.44 | 2.7, 3.25, 2.1, 1.26, 0.74 | 5.32, 5.72, 3.65, 2.78, 2.16 | 2.62, 2.46, 1.55, 1.52, 1.42 | 14, 1, 22, 22, 8 | 9, 9, 8, 5, 5 | 78.24, 78.24, 78.24, 78.24, 78.24
China | B-ETS, ARIMA, ETS, ARIMA, B-ARIMA | 0.03, 0.11, 0.02, 0.13, 0.02 | 0.02, 0.03, 0.01, 0.04, 0.03 | -0.04, -0.05, -0.03, -0.03, -0.04 | 0.07, 0.11, 0.05, 0.11, 0.1 | 0.11, 0.16, 0.08, 0.14, 0.13 | 2, 9, 9, 2, 6 | 1, 1, 1, 1, 1 | 81.94, 81.94, 78.24, 78.24, 78.24
Colombia | B-ARIMA, ARIMA, B-ETS, ETS, ARIMA | 4.03, 5.01, 0.53, 4.77, 0.93 | 3.29, 4.96, 2.79, 3.93, 3.07 | 2.38, 4.26, 2.25, 3.17, 2.56 | 3.9, 5.62, 3.25, 4.64, 3.56 | 1.52, 1.35, 0.99, 1.47, 1 | 8, 1, 21, 28, 12 | 8, 8, 7, 8, 8 | 87.04, 87.04, 87.04, 87.04, 87.04
Costa Rica | ARIMA, ETS, B-ETS, B-ARIMA, ETS | 0.68, 1.21, 4.08, 9.09, 4.26 | 1.3, 1.86, 2.37, 3.24, 3.62 | 0.32, 1.02, 1.45, 2.34, 2.65 | 2.18, 2.63, 3.21, 4.17, 4.5 | 1.87, 1.61, 1.76, 1.84, 1.86 | 22, 28, 14, 14, 1 | 5, 5, 5, 8, 9 | 72.22, 72.22, 72.22, 72.22, 73.61
Croatia | Naïve, B-ARIMA, Naïve, Naïve, ARIMA | 1.5, 0.16, 0.03, 0.42, 5.42 | 0, 0.03, 0, 0, 1.7 | -0.77, -1.18, -0.64, -0.59, 0.72 | 0.69, 0.8, 0.58, 0.54, 2.6 | 1.46, 1.98, 1.22, 1.13, 1.87 | 7, 7, 26, 10, 17 | 2, 2, 2, 2, 6 | 70.37, 50.93, 50.93, 54.63, 54.63
Denmark | ETS, B-ETS, ARIMA, ETS, B-ETS | 1.13, 0.45, 0.31, 0.19, 0.24 | 0.54, 0.46, 0.25, 0.19, 0.19 | 0.35, -0.48, -0.51, 0.05, -0.36 | 0.72, 1.34, 0.97, 0.33, 0.77 | 0.37, 1.82, 1.48, 0.28, 1.13 | 24, 6, 10, 7, 2 | 4, 5, 5, 4, 5 | 68.52, 60.19, 57.41, 57.41, 57.41
Dominican Republic | ARIMA, ARIMA, ETS, B-ETS, B-ARIMA | 4.43, 0.68, 0.96, 1.92, 4.6 | 2.22, 1.94, 1.66, 1.66, 1.92 | 1.52, 1.4, 1.24, 1.28, 1.47 | 2.87, 2.44, 2.07, 2, 2.35 | 1.35, 1.04, 0.83, 0.72, 0.88 | 13, 29, 29, 29, 21 | 5, 5, 4, 4, 5 | 87.04, 87.04, 87.04, 87.04, 83.33
Ecuador | B-ARIMA, B-ETS, ETS, B-ARIMA, B-ARIMA | 1.17, 0.57, 0.61, 0.36, 0.7 | 1.87, 1.09, 1.13, 1.19, 0.98 | 0.3, -2.03, -0.44, -0.01, 0.12 | 3.04, 3.44, 2.36, 1.85, 1.48 | 2.74, 5.47, 2.8, 1.86, 1.36 | 22, 3, 2, 17, 3 | 6, 6, 6, 5, 5 | 86.11, 86.11, 83.33, 79.63, 79.63
Egypt | B-ETS, ETS, ARIMA, B-ARIMA, B-ARIMA | 3.02, 1.74, 0.5, 1.49, 0.38 | 3.96, 4.58, 3.23, 2.46, 1.83 | 3.12, 3.83, 2.38, 1.92, 1.46 | 4.94, 5.28, 4.01, 2.92, 2.26 | 1.82, 1.45, 1.63, 0.99, 0.79 | 1, 1, 22, 27, 27 | 8, 8, 8, 8, 4 | 84.26, 84.26, 71.3, 71.3, 71.3
Ethiopia | B-ARIMA, B-ETS, B-ARIMA, ETS, B-ETS | 6.97, 21.71, 1.98, 1.55, 4.45 | 4.39, 6.35, 4.66, 2.9, 2.4 | 2.92, 0.02, 3.41, 1.52, -0.08 | 6.07, 11.03, 5.44, 4.11, 4.62 | 3.15, 11.01, 2.04, 2.59, 4.7 | 14, 18, 1, 23, 22 | 9, 9, 9, 9, 9 | 80.56, 80.56, 80.56, 80.56, 80.56
Finland | ETS, ARIMA, ARIMA, B-ARIMA, ARIMA | 1.02, 1.09, 0.88, 0.24, 0.07 | 0.73, 0.37, 0.18, 0.1, 0.11 | 0.5, -0.26, -0.41, -0.36, -0.39 | 0.94, 0.97, 0.73, 0.55, 0.59 | 0.45, 1.23, 1.14, 0.91, 0.98 | 24, 24, 26, 10, 24 | 4, 5, 5, 1, 2 | 62.04, 56.48, 44.44, 35.19, 35.19
France | B-ARIMA, ETS, Naïve, B-ETS, B-ETS | 0.46, 0.92, 0.95, 0.41, 0.25 | 0.96, 0.25, 0, -0.1, -0.09 | -2.6, -4.71, -2.04, -3.35, -2.8 | 3.3, 3.35, 1.53, 2.51, 2.18 | 5.9, 8.06, 3.58, 5.87, 4.98 | 5, 22, 25, 11, 11 | 6, 3, 3, 3, 3 | 76.85, 75, 72.22, 72.22, 51.85
Germany | ETS, B-ETS, B-ETS, ETS, ARIMA | 0.16, 0.13, 0.56, 0.08, 0.17 | 0.44, 0.29, 0.15, 0.4, 0.19 | -1.09, -1.04, -0.74, -0.73, -0.37 | 1.79, 1.54, 1, 1.42, 0.73 | 2.89, 2.58, 1.74, 2.15, 1.09 | 9, 11, 24, 6, 2 | 6, 3, 2, 6, 5 | 59.72, 59.72, 59.72, 63.43, 63.43
Ghana | B-ARIMA, B-ARIMA, B-ETS, ETS, B-ETS | 13.35, 0.92, 2.69, 2.27, 2.31 | 2.33, 1.94, 2.29, 2.13, 2.45 | -1.44, 0.33, 0.76, 1.2, 1.79 | 4.24, 3.5, 3.57, 2.96, 3.1 | 5.69, 3.17, 2.81, 1.76, 1.31 | 5, 5, 28, 22, 14 | 6, 6, 6, 5, 8 | 62.04, 56.48, 56.48, 56.48, 52.78
Greece | ARIMA, B-ETS, ETS, ETS, B-ETS | 0.34, 0.34, 1.61, 0.59, 0.91 | 0.31, 0.09, 0.38, 0.2, 0.36 | -2.85, -3.6, -1.59, -0.44, -1.53 | 2.64, 2.97, 1.97, 0.78, 2.18 | 5.49, 6.58, 3.56, 1.22, 3.71 | 5, 22, 25, 24, 11 | 3, 3, 6, 5, 6 | 68.52, 62.04, 58.33, 44.44, 44.44
Hungary | B-ETS, B-ETS, B-ETS, ETS, ARIMA | 1.93, 0.81, 0.98, 0.61, 0.24 | 0.91, 0.57, 0.29, 0.1, 0.14 | -0.63, -0.36, -0.82, -0.81, -0.66 | 2.06, 1.55, 1.14, 0.93, 0.88 | 2.69, 1.9, 1.96, 1.74, 1.54 | 9, 25, 24, 9, 7 | 6, 5, 6, 2, 3 | 66.67, 66.67, 61.11, 54.63, 54.63
Iceland | ETS, B-ARIMA, Naïve, B-ARIMA, B-ARIMA | 0.04, 0.03, 0.02, 0.07, 0.09 | 0.04, -0.08, 0, 0.13, 0.11 | -1.92, -1.16, -0.56, -0.87, -0.86 | 1.69, 0.83, 0.52, 1.47, 1.4 | 3.61, 1.99, 1.08, 2.34, 2.26 | 3, 7, 26, 6, 9 | 3, 2, 2, 3, 3 | 50, 39.81, 39.81, 39.81, 39.81
India | ARIMA, ARIMA, ARIMA, B-ARIMA, ETS | 1.63, 2.64, 1.46, 2.42, 2.08 | 4.23, 3.84, 2.96, 3.02, 2.83 | 3.36, 3.17, 2.44, 2.66, 2.47 | 5.03, 4.47, 3.46, 3.43, 3.19 | 1.66, 1.3, 1.02, 0.77, 0.72 | 1, 12, 21, 12, 23 | 8, 8, 8, 7, 7 | 81.94, 81.94, 87.5, 87.5, 87.5
Indonesia | B-ETS, ARIMA, B-ETS, ARIMA, ETS | 2.7, 3.25, 0.88, 2.7, 0.44 | 2.49, 2.14, 2.27, 2.25, 1.93 | 1.79, 1.53, 1.71, 1.77, 1.53 | 2.92, 2.71, 2.72, 2.71, 2.3 | 1.13, 1.18, 1.01, 0.94, 0.77 | 13, 13, 13, 13, 13 | 5, 5, 5, 5, 4 | 71.76, 71.76, 68.06, 68.06, 68.06
Iran | B-ARIMA, B-ETS, B-ARIMA, B-ARIMA, B-ETS | 1.68, 0.21, 1.03, 0.69, 0.14 | 1.75, 1.45, 1.13, 1.25, 1.01 | 1.34, 0.97, 0.78, 0.91, 0.61 | 2.18, 1.96, 1.51, 1.52, 1.4 | 0.84, 0.98, 0.72, 0.61, 0.8 | 16, 16, 16, 16, 15 | 4, 4, 4, 4, 4 | 45.37, 44.44, 44.44, 44.44, 41.67
Iraq | ARIMA, ARIMA, B-ARIMA, B-ETS, B-ARIMA | 1.42, 9.36, 12.12, 3.37, 4.26 | 2.63, 5.24, 5.45, 4.33, 3.48 | 1.76, 4.2, 3.96, 1.19, 2.66 | 3.43, 6.18, 6.64, 6.67, 4.14 | 1.66, 1.98, 2.67, 5.48, 1.48 | 18, 1, 18, 18, 1 | 5, 8, 9, 9, 8 | 82.41, 92.59, 92.59, 92.59, 92.59
Ireland | B-ARIMA, B-ETS, B-ETS, B-ARIMA, B-ETS | 0.79, 0.15, 0.22, 0.03, 0.03 | 0.48, 0.09, -0.03, 0.02, 0.03 | -1.28, -1.42, -1.17, -1.54, -0.74 | 1.98, 1.56, 1.05, 1.49, 0.97 | 3.26, 2.98, 2.21, 3.03, 1.71 | 3, 2, 7, 25, 25 | 6, 3, 3, 3, 3 | 83.33, 83.33, 72.22, 72.22, 38.89
Italy | ARIMA, B-ETS, ETS, ETS, B-ETS | 0.25, 0.11, 0.3, 0.08, 0.09 | 0.29, 0.2, 0.1, 0.06, 0.06 | -0.14, -0.09, -0.1, -0.12, -0.1 | 0.7, 0.61, 0.29, 0.24, 0.23 | 0.83, 0.69, 0.39, 0.36, 0.33 | 19, 19, 9, 7, 6 | 1, 1, 1, 1, 1 | 67.59, 67.59, 67.59, 67.59, 67.59
Japan | ETS, B-ETS, ARIMA, ARIMA, ARIMA | 0.42, 0.12, 0.06, 0.26, 0.36 | 0.15, 0.3, 0.25, 0.34, 0.54 | -1.84, -0.99, -0.67, -0.48, -0.19 | 1.83, 1.41, 1.1, 1.11, 1.22 | 3.67, 2.4, 1.78, 1.59, 1.41 | 3, 11, 24, 9, 10 | 3, 6, 5, 5, 5 | 40.74, 34.26, 28.7, 28.7, 25.93
Jordan | ETS, Naïve, B-ARIMA, B-ETS, B-ARIMA | 1.54, 5.8, 1.57, 1.88, 1.35 | 2.7, 0, 1.33, 0.79, 0.75 | 0.99, -0.68, 0.52, -1.11, -0.38 | 4.17, 0.61, 2.25, 2.78, 1.8 | 3.19, 1.29, 1.73, 3.89, 2.18 | 4, 26, 5, 25, 9 | 6, 2, 5, 6, 6 | 77.78, 77.78, 48.15, 48.15, 48.15
Kenya | ETS, ETS, B-ARIMA, B-ETS, B-ARIMA | 3.61, 5.19, 8.11, 1.37, 1.36 | 4.24, 5.57, 3.49, 3.29, 2.95 | 2.58, 3.56, 1.98, 2.47, 2.37 | 5.67, 7.27, 4.58, 4.7, 3.49 | 3.09, 3.71, 2.6, 2.23, 1.13 | 14, 27, 27, 14, 12 | 9, 9, 9, 9, 8 | 88.89, 88.89, 86.11, 86.11, 86.11
Malaysia | B-ETS, B-ETS, B-ARIMA, B-ARIMA, B-ETS | 1.72, 3.16, 1.07, 0.52, 0.21 | 0.44, 0.66, 0.35, 0.18, 0.07 | -0.54, -0.28, -0.67, -0.62, -0.53 | 1.57, 1.6, 1.04, 1.08, 0.9 | 2.12, 1.87, 1.7, 1.69, 1.42 | 9, 25, 24, 9, 7 | 5, 5, 5, 5, 2 | 69.44, 75, 75, 54.63, 50.93
Mexico | B-ARIMA, B-ARIMA, B-ARIMA, ARIMA, ETS | 2.39, 3.98, 1.49, 1.95, 0.42 | 3.59, 3.14, 2.54, 2.38, 1.58 | 2.84, 2.59, 2.02, 2.02, 0.87 | 4.29, 3.53, 3.02, 2.73, 2.24 | 1.45, 0.95, 1.01, 0.71, 1.37 | 21, 8, 8, 8, 8 | 8, 4, 5, 4, 5 | 82.41, 82.41, 72.69, 70.83, 70.83
Mongolia | ARIMA, B-ETS, ARIMA, ETS, B-ARIMA | 41.75, 5.84, 2.03, 0.59, 1.18 | -0.11, 3.96, 1.26, 0, 0.98 | -8.63, -11.02, -2.1, -9.41, -0.85 | 4.21, 15.04, 3.66, 4.81, 2.64 | 12.85, 26.06, 5.75, 14.22, 3.49 | 11, 17, 3, 3, 11 | 3, 9, 6, 3, 6 | 75, 75, 71.3, 71.3, 71.3
Morocco | ETS, B-ETS, B-ARIMA, B-ARIMA, ARIMA | 3.98, 1.49, 0.7, 1.27, 1.24 | 1.26, 0.75, 0.98, 1.83, 1.7 | -0.02, -0.28, 0.12, 0.83, 1.04 | 2.39, 1.87, 1.65, 2.6, 2.3 | 2.41, 2.16, 1.53, 1.77, 1.26 | 22, 25, 15, 22, 8 | 5, 5, 5, 5, 5 | 93.52, 93.52, 93.52, 76.85, 68.52
Nepal | ETS, B-ARIMA, B-ARIMA, ETS, B-ETS | 49.97, 18.78, 10.13, 5.03, 4.72 | 5, 8.03, 5.27, 5.23, 3.18 | 2.19, 5.83, 4.05, 4.43, 2.51 | 7.04, 10.12, 6.38, 5.96, 3.88 | 4.85, 4.3, 2.33, 1.53, 1.38 | 17, 14, 18, 1, 18 | 9, 9, 9, 8, 8 | 92.59, 92.59, 92.59, 92.59, 92.59
New Zealand | Naïve, Naïve, Naïve, ARIMA, ETS | 0.13, 0.07, 0, 0.09, 0.04 | 0, 0, 0, 0.15, 0 | -1.01, -0.91, -0.84, -1.45, -1.76 | 0.87, 0.8, 0.74, 1.56, 1.52 | 1.88, 1.71, 1.58, 3.01, 3.28 | 7, 7, 10, 25, 11 | 2, 2, 2, 3, 3 | 39.81, 37.04, 22.22, 22.22, 22.22
Nigeria | ARIMA, ARIMA, B-ETS, ARIMA, ARIMA | 5.14, 2.55, 0.83, 3.96, 1 | 3.19, 3.3, 2.55, 2.85, 2.13 | 1.89, 2.41, 1.85, 2.22, 1.66 | 4.34, 4.12, 2.99, 3.44, 2.58 | 2.45, 1.72, 1.14, 1.21, 0.93 | 4, 4, 8, 21, 4 | 8, 5, 8, 8, 8 | 84.26, 84.26, 84.26, 80.09, 80.09
Norway | B-ARIMA, ARIMA, ETS, ETS, ETS | 0.17, 0.18, 0.11, 0.08, 0.2 | 0.11, 0.13, 0.17, 0.11, 0.13 | -0.65, -0.78, -0.57, -0.12, -0.49 | 0.84, 0.97, 0.85, 0.33, 0.72 | 1.49, 1.75, 1.41, 0.45, 1.22 | 7, 7, 10, 7, 2 | 2, 2, 2, 1, 2 | 58.33, 58.33, 43.52, 40.74, 40.74
Panama | ARIMA, ETS, ARIMA, B-ETS, ARIMA | 1.49, 3.54, 0.84, 6.44, 0.67 | 1.66, 2.61, 2.24, 2.51, 2.59 | 1.49, 1.82, 1.62, 1.16, 2.23 | 1.82, 3.34, 2.82, 3.76, 2.92 | 0.34, 1.52, 1.2, 2.6, 0.69 | 23, 23, 13, 23, 23 | 4, 5, 5, 9, 7 | 89.81, 89.81, 83.33, 83.33, 83.33
Peru | B-ARIMA, ARIMA, B-ETS, B-ETS, B-ARIMA | 2.35, 2.21, 3.76, 1.54, 0.71 | 3.23, 3.55, 1.83, 1.48, 1.18 | 2.37, 2.7, 0.95, 0.67, 0.51 | 3.97, 4.34, 2.77, 2.22, 1.76 | 1.6, 1.63, 1.83, 1.55, 1.25 | 8, 12, 23, 17, 3 | 8, 8, 6, 5, 5 | 92.59, 89.81, 89.81, 89.81, 89.81
Philippines | ETS, ETS, ARIMA, ARIMA, ARIMA | 1.49, 2.73, 2.25, 0.5, 3.39 | 1.55, 3.36, 2.13, 1.88, 2.02 | 1.37, 2.16, 1.12, 1.05, 1.39 | 1.72, 4.43, 3.04, 2.65, 2.61 | 0.34, 2.27, 1.91, 1.61, 1.22 | 23, 4, 23, 22, 4 | 4, 8, 6, 5, 8 | 96.3, 77.78, 77.78, 83.33, 83.33
Poland | ARIMA, ARIMA, B-ARIMA, ARIMA, ARIMA | 1.03, 0.93, 0.91, 0.31, 0.92 | 1.63, 1.41, 1.24, 1.09, 0.98 | 1.48, 1.28, 1.08, 0.91, 0.81 | 1.78, 1.53, 1.41, 1.27, 1.15 | 0.3, 0.25, 0.33, 0.35, 0.34 | 23, 20, 16, 16, 16 | 4, 4, 4, 4, 4 | 83.33, 64.81, 53.7, 50.93, 50.93
Portugal | ETS, ARIMA, B-ARIMA, B-ARIMA, ARIMA | 0.83, 0.41, 0.35, 0.22, 0.14 | 0.55, 0.8, 0.91, 0.86, 0.8 | 0.23, 0.02, 0.43, 0.43, 0.32 | 0.85, 1.52, 1.31, 1.4, 1.26 | 0.61, 1.5, 0.88, 0.97, 0.94 | 15, 15, 15, 15, 15 | 4, 5, 4, 5, 5 | 65.74, 65.74, 60.65, 59.26, 60.65
Saudi Arabia | ARIMA, B-ETS, ARIMA, B-ARIMA, ARIMA | 2.72, 2.07, 3.21, 2.46, 2.44 | 3.64, 1.82, 2.64, 2.74, 1.92 | 2.86, 1.34, 2.12, 2.28, 1.48 | 4.38, 2.36, 3.14, 3.3, 2.34 | 1.53, 1.02, 1.01, 1.02, 0.87 | 21, 21, 8, 21, 21 | 8, 4, 8, 8, 5 | 89.81, 91.67, 81.94, 81.94, 71.3
Senegal | ETS, ETS, ARIMA, ETS, B-ETS | 8.91, 0.89, 1.24, 1.79, 1.05 | 2.71, 2.27, 1.88, 1.98, 1.55 | 1.42, 1.31, 1.14, 1.4, 1.07 | 3.84, 3.14, 2.57, 2.52, 1.98 | 2.42, 1.84, 1.43, 1.12, 0.92 | 4, 23, 23, 5, 5 | 5, 5, 5, 5, 5 | 72.22, 72.22, 61.11, 61.11, 57.41
South Africa | ARIMA, ARIMA, B-ARIMA, B-ARIMA, B-ARIMA | 5.3, 4.82, 3.47, 5.6, 6.17 | 4.08, 4.35, 3.71, 3.54, 3.7 | 3.24, 3.39, 2.74, 2.89, 3.07 | 4.85, 5.22, 4.4, 4.23, 4.25 | 1.62, 1.83, 1.66, 1.34, 1.17 | 1, 1, 17, 4, 1 | 8, 8, 8, 8, 8 | 84.26, 84.26, 76.85, 76.85, 76.85
South Korea | B-ETS, B-ARIMA, B-ARIMA, B-ARIMA, B-ETS | 0.09, 0.41, 0.11, 0.15, 0.44 | 0.16, 0.39, 0.34, 0.45, 0.3 | -0.12, 0.11, 0.07, 0.22, -0.08 | 0.56, 0.68, 0.6, 0.72, 0.57 | 0.68, 0.57, 0.53, 0.5, 0.65 | 19, 19, 19, 19, 20 | 1, 4, 4, 4, 4 | 39.81, 55.09, 53.24, 53.24, 53.24
Spain | B-ETS, ETS, ETS, B-ETS, ARIMA | 0.36, 0.2, 0.36, 0.14, 0.05 | 0.28, 0.17, 0.11, 0.14, 0.14 | -0.61, -0.19, -0.22, -0.34, -0.72 | 1.13, 0.5, 0.42, 0.82, 0.93 | 1.74, 0.69, 0.64, 1.15, 1.66 | 7, 19, 11, 24, 25 | 2, 1, 1, 2, 3 | 79.17, 68.06, 57.41, 57.41, 41.2
Sweden | B-ARIMA, ARIMA, B-ETS, ARIMA, ARIMA | 0.58, 0.56, 2.55, 2.45, 0.76 | 1.55, 1.34, 1.43, 1.65, 1.58 | 1.37, 1.23, 0.84, 1.32, 1.29 | 1.68, 1.44, 1.82, 1.98, 1.87 | 0.31, 0.21, 0.98, 0.66, 0.59 | 23, 20, 29, 29, 5 | 4, 4, 4, 4, 4 | 64.81, 64.81, 64.81, 59.26, 59.26
Thailand | B-ETS, ARIMA, ETS, B-ETS, ETS | 0.32, 0.21, 0.37, 0.37, 0.09 | -0.02, 0.39, 0.14, 0.13, 0.03 | -1.32, -0.5, -0.76, -0.62, -0.16 | 0.91, 1.22, 0.97, 1, 0.2 | 2.23, 1.72, 1.74, 1.63, 0.36 | 6, 6, 24, 9, 6 | 2, 5, 2, 2, 1 | 75, 75, 62.96, 59.26, 59.26
Trinidad and Tobago | Naïve, Naïve, Naïve, ARIMA, ARIMA | 0, 0.09, 0, 2.92, 0.71 | 0, 0, 0, 0.26, 0.32 | -0.54, -0.49, -0.45, -0.97, -0.82 | 0.5, 0.45, 0.42, 1.35, 1.33 | 1.04, 0.94, 0.87, 2.32, 2.16 | 19, 26, 11, 6, 9 | 1, 1, 1, 6, 6 | 90.74, 87.04, 77.78, 77.78, 56.48
Tunisia | Naïve, ARIMA, ETS, ARIMA, B-ETS | 0.47, 0.51, 0.42, 1.44, 0.7 | 0, 0.49, 0.04, 0.64, 0.17 | -0.9, -1.4, -0.32, -0.46, -0.54 | 0.79, 2.09, 0.39, 1.62, 0.84 | 1.69, 3.49, 0.71, 2.08, 1.38 | 7, 2, 11, 6, 7 | 2, 6, 1, 6, 2 | 83.33, 79.63, 29.63, 29.63, 26.85
United Kingdom | B-ARIMA, ARIMA, ETS, B-ARIMA, B-ETS | 0.37, 1.34, 0.22, 0.06, 0.47 | 1.07, 0.56, 0.42, 0.33, 0.17 | 0.38, 0.19, -0.44, -0.01, -0.24 | 1.52, 0.91, 1.22, 0.66, 0.53 | 1.14, 0.72, 1.66, 0.66, 0.77 | 20, 19, 24, 20, 19 | 5, 4, 5, 4, 1 | 71.3, 69.44, 73.15, 71.3, 71.3
Uruguay | ETS, ARIMA, ETS, Naïve, ARIMA | 0.86, 0.6, 0.85, 0.29, 3.95 | 0.49, 0.91, 0.34, 0, 0.52 | -0.12, 0.38, -0.19, -0.55, -0.2 | 1.06, 1.4, 0.83, 0.51, 1.18 | 1.18, 1.02, 1.02, 1.06, 1.37 | 10, 10, 20, 10, 10 | 5, 4, 5, 2, 5 | 61.11, 61.11, 61.11, 57.41, 57.41
US | ARIMA, B-ARIMA, ARIMA, ARIMA, ETS | 0.36, 0.25, 0.19, 0.29, 1.42 | 1.35, 1.18, 1.06, 1.09, 1.32 | 1.11, 0.96, 0.88, 0.92, 0.89 | 1.58, 1.44, 1.23, 1.25, 1.73 | 0.47, 0.48, 0.35, 0.33, 0.84 | 23, 20, 16, 16, 5 | 4, 4, 4, 4, 4 | 72.69, 72.69, 72.69, 68.98, 68.98
Vietnam | B-ARIMA, Naïve, Naïve, ETS, Naïve | 4.13, 0.64, 0.36, 0.76, 0.79 | 0.64, 0, 0, 0.46, 0 | 0.29, -0.7, -0.64, -0.07, -0.54 | 1.02, 0.63, 0.58, 0.95, 0.5 | 0.73, 1.33, 1.21, 1.02, 1.04 | 15, 26, 26, 26, 26 | 4, 2, 2, 5, 2 | 69.44, 69.44, 69.44, 58.33, 55.56

Compliance with Ethical Standards

Disclosure statement: The authors report there are no competing interests to declare.

Data availability statement: The coding information, supporting results, and analysis can be found in the AGGFORCLUS Repository.

7. References

Abbasimehr, H., & Paki, R. (2021). Prediction of COVID-19 confirmed cases combining deep

learning methods and Bayesian optimization. Chaos, Solitons & Fractals, 142, 110511.

Abou-Ismail, A. (2020). Compartmental models of the COVID-19 pandemic for physicians and

physician-scientists. SN comprehensive clinical medicine, 2(7), 852-858.

Al-qaness, M. A., Ewees, A. A., Fan, H., & Abd El Aziz, M. (2020). Optimization method for

forecasting confirmed cases of COVID-19 in China. Journal of Clinical Medicine, 9(3), 674.

Bacchi, U. (2022, March 18). Pandemic surveillance: Is tracing tech here to stay? The Japan

Times. Retrieved July 26, 2022, from

https://www.japantimes.co.jp/news/2022/03/18/world/covid-contact-tracing-surveillance/

Bergmeir, C., Hyndman, R. J., & Benítez, J. M. (2016). Bagging exponential smoothing methods

using STL decomposition and Box-Cox transformation. International journal of

forecasting, 32(2), 303-312.

Bock, H. H. (1985). On some significance tests in cluster analysis. Journal of classification, 2(1),

77-108.

Bodapati, S., Bandarupally, H., & Trupthi, M. (2020, October). COVID-19 time series forecasting

of daily cases, deaths caused and recovered cases using long short term memory networks. In 2020

IEEE 5th International Conference on Computing Communication and Automation (ICCCA) (pp.

525-530). IEEE.

Borghi, P. H., Zakordonets, O., & Teixeira, J. P. (2021). A COVID-19 time series forecasting

model based on MLP ANN. Procedia Computer Science, 181, 940–947.

Box, G. E. P., & Cox, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical

Society: Series B, 26(2), 211–252.

Box, G. E. P., & Jenkins, G. M. (1970). Time Series Analysis: Forecasting and Control. Holden-Day, San Francisco.

Brown, R. (1959). Statistical forecasting for inventory control. New York: McGraw-Hill.

Centers for Disease Control and Prevention (CDC). (2020, February 15). Human Coronavirus Types. Retrieved on May 16, 2022, from https://www.cdc.gov/coronavirus/types.html

Chen, S., Guo, L., Alghaith, T., Dong, D., Alluhidan, M., Hamza, M. M., Herbst, C. H., et al.

(2021). Effective COVID-19 Control: A Comparative Analysis of the Stringency and Timeliness

of Government Responses in Asia. International Journal of Environmental Research and Public

Health, 18(16), 8686. MDPI AG. Retrieved from http://dx.doi.org/10.3390/ijerph18168686

Chimmula, V. K. R., & Zhang, L. (2020). Time series forecasting of COVID-19 transmission in

Canada using LSTM networks. Chaos, Solitons & Fractals, 135, 109864.

Chretien, J., George, D., Shaman, J., Chitale, R., & McKenzie, F. (2014). Influenza forecasting in human populations: A scoping review. PLoS One, 9(4), e94130.

Cleveland, W. S., Grosse, E., & Shyu, W. M. (2017). Local regression models. In Statistical

Models in S (pp. 309-376). Routledge.

de Bruin, Y. B., Lequarre, A. S., McCourt, J., Clevestig, P., Pigazzani, F., Jeddi, M. Z., ... &

Goulart, M. (2020). Initial impacts of global risk mitigation measures taken during the combatting

of the COVID-19 pandemic. Safety Science, 104773.

Doornik, J. A., Castle, J. L., & Hendry, D. F. (2020). Short-term forecasting of the coronavirus

pandemic. International Journal of Forecasting.

Ferguson, N., Laydon, D., Nedjati Gilani, G., Imai, N., Ainslie, K., Baguelin, M., Bhatia, S., Boonyasiri, A., Cucunuba Perez, Z., Cuomo-Dannenburg, G., & Dighe, A. (2020). Report 9: Impact of non-pharmaceutical interventions (NPIs) to reduce COVID19 mortality and healthcare demand. https://doi.org/10.25561/77482

Frutos, R., Gavotte, L., Serra-Cobo, J., Chen, T., & Devaux, C. (2021). COVID-19 and emerging

infectious diseases: The society is still unprepared for the next pandemic. Environmental

Research, 202, 111676.

Gecili, E., Ziady, A., & Szczesniak, R. D. (2021). Forecasting COVID-19 confirmed cases, deaths,

and recoveries: Revisiting established time series modeling through novel applications for the

USA and Italy. PloS one, 16(1), e0244173.

Gharoie Ahangar, R., Pavur, R., Fathi, M., & Shaik, A. (2020). Estimation and demographic

analysis of COVID-19 infections with respect to weather factors in Europe. Journal of Business

Analytics, 3(2), 93-106.

Guerrero, V. (1993). Time-series analysis supported by power transformations. Journal of

Forecasting, 12, 37–48.

Hale, T., Angrist, N., Goldszmidt, R., Kira, B., Petherick, A., Phillips, T., Webster, S., Cameron-

Blake, E., Hallas, L., Majumdar, S., & Tatlow, H. (2021). A global panel database of pandemic

policies (Oxford COVID-19 Government Response Tracker). Nature Human Behaviour, 5(4),

529–538. https://doi.org/10.1038/s41562-021-01079-8

Hale, T., Hale, A. J., Kira, B., Petherick, A., Phillips, T., Sridhar, D., ... & Angrist, N. (2020).

Global assessment of the relationship between government response measures and COVID-19

deaths. MedRxiv.

Holt, C. (1957). Forecasting trends and seasonal by exponentially weighted averages. International Journal of Forecasting, 20(1), 5–13.

Hyndman, R., & Athanasopoulos, G. (2018). Forecasting: Principles and practice. OTexts.

Hyndman, R., & Khandakar, Y. (2008). Automatic time series forecasting: The forecast package

for R. Journal of Statistical Software, 27(3), 1–22.

Hyndman, R. J., Koehler, A. B., Ord, J. K., & Snyder, R. D. (2001). Prediction intervals for

exponential smoothing state-space models (No. 11/01). Monash University, Department of

Econometrics and Business Statistics.

Hyndman, R. J., Koehler, A. B., Snyder, R. D., & Grose, S. (2002). A state-space framework for

automatic forecasting using exponential smoothing methods. International Journal of

forecasting, 18(3), 439-454.

Johns Hopkins University. (2020, May 8). Coronavirus COVID-19 Global Cases by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University. Retrieved on May 16, 2022, from https://github.com/CSSEGISandData/COVID-19

Kaur, S., Bherwani, H., Gulia, S., Vijay, R., & Kumar, R. (2021). Understanding COVID-19

transmission, health impacts and mitigation: timely social distancing is the key. Environment,

Development and Sustainability, 23(5), 6681-6697.

Kufel, T. (2020). ARIMA-based forecasting of the dynamics of confirmed Covid-19 cases for

selected European countries. Equilibrium. Quarterly Journal of Economics and Economic

Policy, 15(2), 181-204.

Künsch, H. R. (1989). The jackknife and the bootstrap for general stationary observations. Annals

of Statistics, 17(3), 1217–1241.

Le, K., & Nguyen, M. (2021). The psychological consequences of COVID-19

lockdowns. International Review of Applied Economics, 35(2), 147-163.

Li, M. L., Bouardi, H. T., Lami, O. S., Trikalinos, T. A., Trichakis, N., & Bertsimas, D. (2022).

Forecasting COVID-19 and analyzing the effect of government interventions. Operations

Research.

Li, J., & Giabbanelli, P. (2021). Returning to a normal life via COVID-19 vaccines in the United

States: a large-scale Agent-Based simulation study. JMIR medical informatics, 9(4), e27419.

Lloyd, S. (1982). Least squares quantization in PCM. IEEE transactions on information

theory, 28(2), 129-137. CiteSeerX 10.1.1.131.1338. doi:10.1109/TIT.1982.1056489.

Makridakis, S., Wheelwright, S., & Hyndman, R. (1998). Forecasting: Methods and applications (3rd ed.). John Wiley & Sons, New York.

Maleki, M., Mahmoudi, M. R., Wraith, D., & Pho, K. H. (2020). Time series modelling to forecast

the confirmed and recovered cases of COVID-19. Travel medicine and infectious disease, 37,

101742.

Medeiros, M. C., Street, A., Valladão, D., Vasconcelos, G., & Zilberman, E. (2022). Short-term

Covid-19 forecast for latecomers. International Journal of Forecasting, 38(2), 467-488.

Mukherjee, U. K., Bose, S., Ivanov, A., Souyris, S., Seshadri, S., Sridhar, P., ... & Xu, Y. (2021).

Evaluation of reopening strategies for educational institutions during COVID-19 through agent-

based simulation. Scientific Reports, 11(1), 1-24.

Papastefanopoulos, V., Linardatos, P., & Kotsiantis, S. (2020). COVID-19: a comparison of time

series methods to forecast percentage of active cases per population. Applied Sciences, 10(11),

3880.

Petropoulos, F., Hyndman, R. J., & Bergmeir, C. (2018). Exploring the sources of uncertainty:

Why does bagging for time series forecasting work? European Journal of Operational

Research, 268(2), 545-554.

Petropoulos, F., Makridakis, S., & Stylianou, N. (2020). COVID-19: Forecasting confirmed cases

and deaths with a simple time series model. International Journal of Forecasting.

Prieto, D., Das, T. K., Savachkin, A., Uribe, A., Izurieta, R., & Malavade, S. (2012). A systematic review to identify enhancements of pandemic simulation models for operational use at provincial and local levels. BMC Public Health, 12(1), 251.

Rahimi, I., Chen, F., & Gandomi, A. H. (2021). A review on COVID-19 forecasting

models. Neural Computing and Applications, 1-11.

Ramazi, P., Haratian, A., Meghdadi, M., Mari Oriyad, A., Lewis, M. A., Maleki, Z., Vega, R.,

Wang, H., Wishart, D. S., & Greiner, R. (2021). Accurate long-range forecasting of COVID-19

mortality in the USA. Scientific Reports, 11(1), 1–11.

Ramezani, S. B., Amirlatifi, A., & Rahimi, S. (2021). A novel compartmental model to capture

the nonlinear trend of COVID-19. Computers in Biology and Medicine, 134, 104421.

Ramos, P., Santos, N., & Rebelo, R. (2015). Performance of state space and ARIMA models for consumer retail sales forecasting. Robotics and Computer-Integrated Manufacturing, 34, 151–163.

Rauf, H. T., Lali, M., Khan, M. A., Kadry, S., Alolaiyan, H., Razaq, A., & Irfan, R. (2021). Time

series forecasting of COVID-19 transmission in Asia Pacific countries using deep neural networks.

Personal and Ubiquitous Computing, 1–18.

Reich, N. G., Brooks, L. C., Fox, S. J., Kandula, S., McGowan, C. J., Moore, E., Osthus, D., Ray, E. L., Tushar, A., Yamana, T. K., & Biggerstaff, M. (2019). A collaborative, multiyear, multimodel assessment of seasonal influenza forecasting in the United States. Proceedings of the National Academy of Sciences, 116(8), 3146-3154.

Rui, R., Tian, M., Tang, M. L., Ho, G. T. S., & Wu, C. H. (2021). Analysis of the spread of COVID-

19 in the USA with a spatio-temporal multivariate time series model. International Journal of

Environmental Research and Public Health, 18(2), 774.

Salgotra, R., Gandomi, M., & Gandomi, A. H. (2020). Time series analysis and forecast of the

COVID-19 pandemic in India using genetic programming. Chaos, Solitons & Fractals, 138,

109945.

Shahid, F., Zameer, A., & Muneeb, M. (2020). Predictions for COVID-19 with deep learning

models of LSTM, GRU and Bi-LSTM. Chaos, Solitons & Fractals, 140, 110212.

Sharov, K. S. (2020). Creating and applying SIR modified compartmental model for calculation

of COVID-19 lockdown efficiency. Chaos, Solitons & Fractals, 141, 110295.

Soto-Ferrari, M., Chams-Anturi, O., & Escorcia-Caballero, J. P. (2020). A time-series forecasting

performance comparison for neural networks with state space and ARIMA models. In Proceedings

of the 5th NA International Conference on Industrial Engineering and Operations Management.

Soto-Ferrari, M., Chams-Anturi, O., Escorcia-Caballero, J. P., Hussain, N., & Khan, M. (2019).

Evaluation of bottom-up and top-down strategies for aggregated forecasts: state-space models and

ARIMA applications. In International Conference on Computational Logistics (pp. 413-427).

Springer, Cham.

Soto-Ferrari, M., Chams-Anturi, O., Escorcia-Caballero, J. P., Romero-Rodriguez, D., Daza-Escorcia, J., & Ferrari-Padilla, B. (2021). Mortality Incidence for SARS-CoV-2 Non-Survivor Infected in Colombia: A Potential Vaccination Priority Guide Based on Comorbidities. In Proceedings of the International Conference on Industrial Engineering and Operations Management, Sao Paulo, Brazil, April 5-8, 2021.

Soto-Ferrari, M., Holvenstot, P., Prieto, D., de Doncker, E., & Kapenga, J. (2013). Parallel

programming approaches for an agent-based simulation of concurrent pandemic and seasonal

influenza outbreaks. Procedia computer science, 18, 2187-2192.

Tandon, H., Ranjan, P., Chakraborty, T., & Suhag, V. (2020). Coronavirus (COVID-19): ARIMA

based time-series analysis to forecast near future. arXiv preprint arXiv:2004.07859.

Tashman, L. J. (2000). Out-of-sample tests of forecasting accuracy: an analysis and review.

International Journal of Forecasting, 16(4), 437-450.

Tokumoto, A., Akaba, H., Oshitani, H., Jindai, K., Wada, K., Imamura, T., et al. (2021). COVID‐

19 health system response monitor: Japan. New Delhi: World Health Organization Regional Office

for South‐East Asia.

Violato, C., Violato, E. M., & Violato, E. M. (2021). Impact of the stringency of lockdown

measures on covid-19: A theoretical model of a pandemic. PloS one, 16(10), e0258205.

Winters, P. (1960). Forecasting sales by exponentially weighted moving averages. Management Science, 6, 324–342.

Wong, M. C., Huang, J., Teoh, J., & Wong, S. H. (2020). Evaluation on different non-

pharmaceutical interventions during COVID-19 pandemic: An analysis of 139 countries. Journal

of Infection, 81(3), e70-e71.

World Health Organization (WHO). (2020, May 8). Coronavirus disease (COVID-19) outbreak. Retrieved on July 20, 2022, from https://www.who.int/emergencies/diseases/novel-coronavirus-2019

Li, Y., Undurraga, E. A., & Zubizarreta, J. R. (2022). Effectiveness of Localized Lockdowns in the COVID-19 Pandemic. American Journal of Epidemiology, 191(5), 812–824. https://doi.org/10.1093/aje/kwac008

Zhao, H., Merchant, N. N., McNulty, A., Radcliff, T. A., Cote, M. J., Fischer, R. S., Sang, H., &

Ory, M. G. (2021). COVID-19: Short term prediction model using daily incidence data. PloS

One, 16(4), e0250110.