Model Assisted Statistics and Applications 16 (2021) 5–14
DOI 10.3233/MAS-210510
IOS Press
Using the Weibull distribution to model
COVID-19 epidemic data
Vitor Hugo Moreau
Department of Biotechnology, Institute of Health Sciences, Federal University of Bahia, Av. Reitor Miguel Calmon,
sn, Vale do Canela, Salvador, BA, Brazil
Tel.: +55 71 98175 5469; E-mail: vitorhmc@ufba.br
Abstract.
COVID-19 is a severe acute respiratory syndrome caused by the new coronavirus. The COVID-19 outbreak is a Public Health Emergency of International Concern, declared by the WHO, that has killed more than 2 million people worldwide. Since no specific drugs are available and vaccination campaigns are in their initial phase, or have not even begun in some countries, the main way to fight the outbreak worldwide is still based on non-pharmacological strategies, such as the use of protective equipment, social isolation and mass testing. Modeling of the epidemic has gained pivotal importance in guiding health authorities on deciding on and applying those strategies. Here, we present the use of the Weibull distribution to model predictions of the COVID-19 outbreak based on daily new cases and deaths data, by non-linear regression using Metropolis-Markov Chain Monte Carlo simulations. It was possible to predict the evolution of daily new cases and deaths of COVID-19 in many countries, as well as the overall number of cases and deaths in the future. Modeling predictions of the COVID-19 pandemic may be important for evaluating the mitigation procedures of governments and health authorities, since it allows one to extract parameters that may help to guide those decisions and measures, slowing down the spread of the disease.
Keywords: COVID-19, Weibull distribution, modelling, model, death toll
1. Introduction
Since the World Health Organization declared the COVID-19 outbreak a Public Health Emergency of International Concern, on January 30th, 2020, more than 93 million cases and 2 million deaths have been registered worldwide. Many efforts are being made to discover and develop therapeutic strategies against COVID-19. Despite global initiatives to search for treatments and vaccines, the main tools for slowing down the spread of the disease throughout communities are still social isolation, personal hygiene and mass testing.
Before the development of efficient vaccines, many non-pharmacological strategies were proposed to fight COVID-19. Most of them are based on slowing the spread of the virus by self-care measures, such as the use of personal protective equipment, mass testing and restriction of social contact, through patient quarantine, social isolation and lock-down. Despite much discussion, social isolation and mass lock-down measures have been described as successful strategies for slowing the spread of the virus (Anderson et al., 2020; Lau et al., 2020; Mitjà et al., 2020; Saez et al., 2020; Sjödin et al., 2020; Wilder-Smith & Freedman, 2020). Mass testing has been shown to be one of the most effective strategies, since it allows precise tracing of the contact network of each infected person and the application of isolation and quarantine measures. The success of some countries, such as South Korea, in slowing down the spread of the virus has been attributed to mass testing and selective quarantine (Choi, 2020).
Comparison of the evolution of epidemic curves among countries may be of pivotal importance for predicting the effect of the mitigation measures taken. Based on data analysis, it is possible to model the evolution of the disease and to predict the number of infected, recovered and deceased people over days and weeks. Such predictions may be extremely helpful for decision making by health authorities. Many papers have presented predictions of the epidemic evolution by different methods (Ciufolini & Paolozzi, 2020; Gupta et al., 2020; Kim et al., 2020; Li et al., 2020). In this work, we have applied the Weibull distribution to a selected set of data, depicting the number of daily new cases and deaths, from countries that present distinct epidemic patterns. The Weibull distribution is one of the most commonly used parametric lifetime models (Lawless, 2003), mostly for its parsimony, its ability to satisfactorily model data that are commonly encountered in survival analysis and its availability in statistical software packages (Khan, 2018; Lawless, 2003). We believe that data on daily new cases and deaths of COVID-19, as well as from other epidemic outbreaks, may be modeled by the Weibull distribution, resulting in valuable information for supporting mitigation measures taken by governments and health authorities worldwide.
ISSN 1574-1699/$35.00 © 2021 IOS Press. All rights reserved.
2. Materials and methods
Data on the daily number of confirmed new cases and deaths, for every country, were extracted from the Our World in Data project (Roser et al., 2020) as comma-separated values (CSV) files and processed with R (R Core Team, 2013) using RStudio 1.2.5042 (RStudio Team, 2020) for Linux. Data were subset to select country names, record dates and the number of daily new cases and deaths for each country, from the beginning of the pandemic (December 31st, 2019) up to January 7th, 2021. To allow proper statistics, only countries in which more than 3,000 deaths had been registered by the day of data collection were used.
Data on daily new cases and daily new deaths were fitted to a 4-parameter Weibull distribution using Markov Chain Monte Carlo (MCMC) simulation. A modified 4-parameter Weibull equation was used to fit the data to the prediction model, described as:
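\[
f(t) =
\begin{cases}
0, & t \le \gamma \\
\alpha \, \dfrac{\beta}{\eta} \left( \dfrac{t - \gamma}{\eta} \right)^{\beta - 1} e^{-\left( \frac{t - \gamma}{\eta} \right)^{\beta}}, & \text{otherwise}
\end{cases}
\tag{1}
\]
where t is the time; f(t) is the number of new cases or new deaths as a function of t; α is the area under the curve (sum of total cases or deaths); γ is the location parameter; and β and η are the Weibull shape and scale parameters, respectively.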
Some cases in which data calculations required the use of a bimodal Weibull distribution will be presented below. In such cases, a bimodal Weibull distribution, adapted from Eq. (1), was used:
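\[
f(t) =
\begin{cases}
0, & t \le \gamma \\
\alpha \left[ \dfrac{\beta}{\eta} \left( \dfrac{t - \gamma}{\eta} \right)^{\beta - 1} e^{-\left( \frac{t - \gamma}{\eta} \right)^{\beta}} + \dfrac{\beta'}{\eta'} \left( \dfrac{t - \gamma'}{\eta'} \right)^{\beta' - 1} e^{-\left( \frac{t - \gamma'}{\eta'} \right)^{\beta'}} \right], & \text{otherwise}
\end{cases}
\tag{2}
\]
where t, f(t), α, β, γ and η have the same meanings as in Eq. (1); β′, γ′ and η′ are the shape, location and scale parameters of the second mode of the Weibull distribution, respectively.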
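Eqs (1) and (2) can be written as two small functions. The following is a minimal Python sketch, not the code of the weibull4 R package; the function and argument names are hypothetical, and it assumes that the second mode contributes nothing before its own location parameter γ′, a point the text does not state.

```python
import math


def weibull4(t, alpha, beta, eta, gamma):
    """Eq. (1): Weibull density shifted by gamma and scaled by the area alpha."""
    if t <= gamma:
        return 0.0
    z = (t - gamma) / eta
    return alpha * (beta / eta) * z ** (beta - 1.0) * math.exp(-(z ** beta))


def weibull4_bimodal(t, alpha, beta, eta, gamma, beta2, eta2, gamma2):
    """Eq. (2): alpha times the sum of two shifted Weibull densities.

    beta2, eta2 and gamma2 stand for the primed parameters of the second mode.
    """
    if t <= gamma:
        return 0.0
    first = weibull4(t, 1.0, beta, eta, gamma)
    # Assumption: the second term is zero for t <= gamma2.
    second = weibull4(t, 1.0, beta2, eta2, gamma2)
    return alpha * (first + second)
```

With illustrative values such as alpha = 1000, beta = 2.5, eta = 40 and gamma = 50, summing weibull4 over a fine time grid returns approximately alpha, which is the sense in which α is the area under the curve.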
2.1. Markov Chain Monte Carlo Simulations
Markov Chain Monte Carlo (MCMC) simulations were performed using the random walk Metropolis algorithm (Metropolis et al., 1953) within a five-dimensional space to accommodate the β, η, γ and α parameters, as well as the standard deviation (SD). When bimodal distributions were used, the calculations were performed in an eight-dimensional space (α, β, η, γ, β′, η′, γ′ and SD). Prior distributions used in MCMC were normal for η and uniform for β, γ, α and SD. The log-likelihoods were determined over 10,000 iterations, with a burn-in period of 5,000 iterations. Parameters were sampled from a normal proposal distribution centered on the value of the parameter in the preceding iteration. The standard deviations of the proposal distributions were set to 1.5% of the given parameters' values, since this value was reported to give an acceptance ratio of around 0.23 among MCMC iterations. An acceptance ratio of 0.23 has previously been demonstrated to maximize the efficiency of the Metropolis-MCMC algorithm (Roberts et al., 1997).
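The sampling scheme described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the weibull4 implementation: the Gaussian likelihood around the Weibull curve, the hyperparameters of the normal prior on η (centered on the starting value, with an SD of half that value) and all function names are hypothetical choices, since the text does not specify them.

```python
import math
import random


def weibull4(t, alpha, beta, eta, gamma):
    """Eq. (1) of the text."""
    if t <= gamma:
        return 0.0
    z = (t - gamma) / eta
    return alpha * (beta / eta) * z ** (beta - 1.0) * math.exp(-(z ** beta))


def log_posterior(theta, days, counts, eta_mean, eta_sd):
    alpha, beta, eta, gamma, sd = theta
    # Uniform priors on alpha, beta, gamma and SD: only the support is enforced
    # (assumed here to be positive values, with gamma >= 0).
    if alpha <= 0 or beta <= 0 or eta <= 0 or gamma < 0 or sd <= 0:
        return -math.inf
    # Assumed Gaussian likelihood of the counts around the Weibull curve.
    loglik = 0.0
    for t, y in zip(days, counts):
        resid = y - weibull4(t, alpha, beta, eta, gamma)
        loglik += -0.5 * (resid / sd) ** 2 - math.log(sd)
    # Normal prior on the scale parameter eta (hypothetical hyperparameters).
    logprior = -0.5 * ((eta - eta_mean) / eta_sd) ** 2
    return loglik + logprior


def metropolis(days, counts, start, n_iter=10000, burn_in=5000, step=0.015, seed=1):
    """Random walk Metropolis over (alpha, beta, eta, gamma, SD)."""
    rng = random.Random(seed)
    eta_mean, eta_sd = start[2], 0.5 * start[2]
    current = list(start)
    current_lp = log_posterior(current, days, counts, eta_mean, eta_sd)
    chain, accepted = [], 0
    for _ in range(n_iter):
        # Normal proposal centered on the current value, SD = 1.5% of that value.
        # Because the SD depends on the current value, the proposal is only
        # approximately symmetric; the plain Metropolis ratio is used anyway.
        proposal = [rng.gauss(p, step * abs(p)) for p in current]
        proposal_lp = log_posterior(proposal, days, counts, eta_mean, eta_sd)
        # Accept with probability min(1, posterior ratio), on the log scale.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
            accepted += 1
        chain.append(tuple(current))
    kept = chain[burn_in:]  # discard the burn-in period
    means = [sum(s[i] for s in kept) / len(kept) for i in range(len(start))]
    return means, accepted / n_iter
```

The function returns the posterior means after burn-in together with the observed acceptance ratio, which can be compared with the target of about 0.23 mentioned in the text. A parameter whose current value is exactly zero would never move under this proportional proposal, so starting values should be non-zero.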
2.2. Starting parameters for Metropolis-MCMC
Selection of the starting parameters for the Metropolis-MCMC procedures is a key factor for the efficiency of the simulation. These initial values should not be too far from a typical set of parameters (where the posterior density is high), because the Metropolis-MCMC algorithm would need too many iterations to reach convergence if the initial values are far in the tail of the posterior distribution (Korner-Nievergelt et al., 2015). Since the 4-parameter Weibull distribution was specifically used to model COVID-19 epidemic curves, we have used simple rules, based on the analysis of the graphical role of the Weibull parameters (Eq. (1)), as described below:
i. Location parameter (γ): shifts the beginning of the distribution to higher t values. In our data, it represents the time lapse before the first cases/deaths arise, or before the onset of the exponential growth of daily new cases and deaths. The data sets used in this work register COVID-19 daily new cases and deaths since December 31st, 2019. However, most countries registered their first cases and deaths only at later dates. Thus, there is a run of zeroes (or near zeroes) registered before the first case/death. The starting γ value for the Metropolis-MCMC procedures (γ₀) was empirically calculated as the first day on which more than 5% of the actual maximum number of daily new cases or deaths was registered, that is, the magnitude of the vector t_i from t = 0 to the time value at which 5% of the maximum number of cases/deaths was registered, given by:
\[
\gamma_0 = \left\| t_i \right\|_{0}^{f(t_i) = 0.05 \, \max(f(t))}
\tag{3}
\]
where γ₀ is the location parameter at iteration 0 and f(t_i) is the number of daily new cases or deaths as a function of t.
ii. Scale parameter (η₀): the starting value for the Weibull scale parameter was set as the mean of the vector t from its element equal to γ₀ up to t_n, described as:
\[
\eta_0 = \frac{1}{n - \gamma_0} \sum_{t_i = \gamma_0}^{i = n} t_i
\tag{4}
\]
iii. Area and shape parameters: the starting value for the area parameter (α₀) was set to the sum of the number of cases or deaths, and the starting value for the shape parameter (β₀) was set to 2.5 for all countries, since the typical Weibull shape parameter that fits most of the COVID-19 epidemic data ranges from 1 to 5 (data not shown).
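Rules i to iii can be condensed into one helper. The sketch below is a Python illustration with a hypothetical function name; it assumes that days are indexed from 0 in a list of daily counts and that the series contains at least one non-zero value.

```python
def starting_values(counts):
    """Heuristic starting values (alpha0, beta0, eta0, gamma0) from daily counts."""
    n = len(counts)
    threshold = 0.05 * max(counts)
    # Rule i: first day whose count exceeds 5% of the observed maximum.
    gamma0 = next(day for day, y in enumerate(counts) if y > threshold)
    # Rule ii: mean of the day indices from gamma0 to the last day.
    days = range(gamma0, n)
    eta0 = sum(days) / len(days)
    # Rule iii: area is the total count; shape starts at 2.5 for every country.
    alpha0 = float(sum(counts))
    beta0 = 2.5
    return alpha0, beta0, eta0, gamma0
```

For the toy series [0, 0, 0, 1, 5, 20, 60, 100, 80, 40, 10, 2], the threshold is 5, so gamma0 is day 5, eta0 is the mean of days 5 to 11 (8.0) and alpha0 is 318.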
2.3. Weibull4 R package
Fitting procedures described in this paper were implemented in an R language package named “weibull4”, designed to fit epidemiological data, especially for COVID-19, using the 4-parameter Weibull equation (Eq. (1)) by the Metropolis-MCMC algorithm. The weibull4 package is available for download and installation from the Comprehensive R Archive Network (CRAN) repository (R Core Team, 2013).
2.4. Supplementary material
Figure S1, described in the text, is available as Supplementary Material. The R script called “Moreau_weibull_2021”, with the code for every calculation and plot in this paper, is available on the Code Ocean server (codeocean.com).
3. Results
Data analysis can be of outstanding importance during infectious disease outbreaks, mainly when fast decision making is crucial to slow down the spread of the disease. Modeling of the course of the COVID-19 pandemic in highly affected countries is a life-saving demand (Eberhardt et al., 2020; Verma et al., 2020), since it can be used to support and guide decision makers to act quickly and block the spread of the disease. Data on daily new cases and daily new deaths were extracted from the Our World in Data project on Coronavirus (Roser et al., 2020). Data from countries that faced the COVID-19 pandemic earlier, such as Italy, France and Spain, initially formed a well-defined single peak. Such a pattern allowed us to evaluate statistical modeling to properly fit the data. The Weibull distribution was chosen for this goal because of its potential in modeling lifetime events (Lawless, 2003). Such analysis allows us to forecast epidemiological outcomes, such as the death toll and the future number of daily new cases and deaths in the studied countries.
Figure 1 shows plots for the initial single peaks of daily new cases and deaths of COVID-19 registered in Italy, as well as the curve fits calculated for them. As seen, the Weibull distribution can properly fit the evolution of the epidemic peak and can be used to model and forecast the number of daily new cases and deaths. Panel A of Fig. 1 also illustrates the positions of the location parameter (γ) and the mode of the distribution (Mo_f(t)), which may also be called t_max and stands for the time (in days) at which the maximum number of cases or deaths was (or will be) registered, that is, the maximum turning point, given by:
\[
\mathrm{Mo}_{f(t)} =
\begin{cases}
\gamma, & \beta \le 1 \\
\gamma + \eta \left( \dfrac{\beta - 1}{\beta} \right)^{1/\beta}, & \beta > 1
\end{cases}
\tag{5}
\]
where Mo_f(t) is the mode of the distribution of the number of daily new cases or deaths (f(t)). Additionally, Fig. 1 shows the posterior values for the Weibull parameters along the iterations of the Metropolis-MCMC simulations (Panels B, C and D). Convergence is reached before the end of the burn-in period and the posterior distributions converge to the same average values even when distinct starting values are chosen. This could be observed for every parameter (converging lines in Panels B, C and D), suggesting that the Weibull-directed Metropolis-MCMC performed here is a suitable procedure to properly fit the daily new cases and deaths data of COVID-19.
Fig. 1. Panel A: Profile of the first wave of daily new cases (open circles) and deaths (closed circles) of COVID-19 in Italy. Data were fit with a unimodal 4-parameter Weibull distribution (lines). Arrows show the γ parameter, corresponding to the beginning of the exponential growth of new cases or deaths, and the mode of the distribution, corresponding to the average day on which the largest number of cases or deaths is observed. Calculated Weibull shape, scale, location and area parameters are shown in Panels B, C (upper lines), C (lower lines) and D, respectively, as functions of the Metropolis-MCMC iterations. Lines converging to the same average values in each panel correspond to the same parameter calculation, with distinct starting values.
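The mode in Eq. (5) is straightforward to evaluate once β, η and γ have been estimated. The Python sketch below uses hypothetical function names and illustrative parameter values; it also includes Eq. (1) so that the closed form can be compared with a brute-force search for the peak of the curve.

```python
import math


def weibull4(t, alpha, beta, eta, gamma):
    """Eq. (1) of the text."""
    if t <= gamma:
        return 0.0
    z = (t - gamma) / eta
    return alpha * (beta / eta) * z ** (beta - 1.0) * math.exp(-(z ** beta))


def weibull4_mode(beta, eta, gamma):
    """Eq. (5): day on which the fitted curve reaches its maximum (t_max)."""
    if beta <= 1.0:
        return gamma
    return gamma + eta * ((beta - 1.0) / beta) ** (1.0 / beta)


# Illustrative values only: beta = 2, eta = 10 days, gamma = 5 days.
t_max = weibull4_mode(2.0, 10.0, 5.0)
grid = [5.0 + 0.01 * k for k in range(1, 5001)]
t_peak = max(grid, key=lambda t: weibull4(t, 1.0, 2.0, 10.0, 5.0))
```

In this example t_max and the grid maximum t_peak should agree to within the grid spacing; α does not enter Eq. (5) because it only rescales the height of the curve.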
Recently, most countries have entered a second wave of COVID-19 infections. Due to this second wave, such countries began to present a multimodal pattern of daily new cases and deaths. Additionally, some countries have shown more complex patterns, with more than two mixed waves. Such puzzling patterns make it harder, or even impracticable, to perform proper statistical analysis and predictions. Even so, such complex data on COVID-19 daily new cases and deaths can be analyzed if multimodal distributions are used. This is possible by splitting the data by date to perform the analysis with two Weibull distributions, one before and one after the split date. Optional arguments were included in the weibull4 R package (see Materials and methods) in order to allow users to choose the date at which to split the data into two parts, as well as to set a unimodal or bimodal Weibull distribution to be used before and after the split date. More details on the functionality of these arguments are given in the package documentation files (not shown).
Fig. 2. Data on daily new cases and deaths of COVID-19 from nine selected countries used as examples of customized analysis. Upper panels show countries in which two unimodal Weibull distributions were used, with split date on August 1st (Belgium, Canada and Germany), i.e., Mo1 = 1 and Mo2 = 1. Middle panels display countries in which a single bimodal Weibull distribution was used for data fitting (Bolivia, Brazil and Russia), i.e., without splitting the data and Mo1 = 2. Lower panels show countries in which two Weibull distributions were used, one unimodal and one bimodal, with split date on September 1st: Serbia and United States, with a bimodal distribution up to the split date and a unimodal distribution from the split date forward (Mo1 = 2, Mo2 = 1); and United Kingdom, with a unimodal distribution up to the split date and a bimodal distribution from the split date forward (Mo1 = 1, Mo2 = 2).
Examples of the data analysis performed with the Weibull distribution on the countries' COVID-19 daily new cases and deaths data are shown in Fig. 2, in three distinct ways: i. Upper panels display countries in which data were analyzed by two separate unimodal distributions (Belgium, Canada and Germany), with data splitting on August 1st; ii. Middle panels display the analysis using a single bimodal distribution for the whole data set (Bolivia, Brazil and Russia), without a split date; iii. Lower panels show the data analysis using one unimodal distribution and one bimodal distribution, with data splitting on September 1st (Serbia, United Kingdom and United States). Non-linear regressions performed with one or two Weibull distributions, either unimodal or bimodal, appear to fit the COVID-19 daily new cases and deaths data properly. The split date, as well as the number of modes of the Weibull distribution to be used, can be selected according to each country's data. Such parameters may be chosen in order to reach a better fit quality for the daily new cases and deaths data. The split dates used in the calculations for the charts of Fig. 2 were manually set to the deepest valley between two COVID-19 infection waves, both in daily new cases and deaths data.
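The split dates in the paper were chosen manually. As an illustration of the same idea, the Python sketch below locates the deepest valley between two user-supplied peak positions and splits a series there. The helper names, the centered moving average and its 7-day default window are assumptions made for this example and are not part of the weibull4 package.

```python
def moving_average(values, window=7):
    """Centered moving average; the window is truncated at the series ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def deepest_valley(counts, first_peak, second_peak, window=7):
    """Index of the lowest smoothed count between two chosen peak indices."""
    smoothed = moving_average(counts, window)
    segment = smoothed[first_peak:second_peak + 1]
    return first_peak + segment.index(min(segment))


def split_series(counts, split_index):
    """First part runs up to and including the split day; second part follows."""
    return counts[:split_index + 1], counts[split_index + 1:]
```

Each part returned by split_series can then be fitted separately with a unimodal or bimodal distribution, mirroring the Mo1 and Mo2 choices described above.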
The suitability of the model can be further evaluated by the residuals and by the determination coefficient (R²) of the regressions. Figure 3 shows the distribution of the fit residuals of the daily new cases (Panel A) and deaths (Panel B) of COVID-19 in all studied countries. Lines in Panels A and B represent normal distributions with the means and standard deviations (SD) of each respective panel's data. For both cases and deaths data, the residuals are narrowly distributed around zero when compared to a normal distribution with the same mean and SD as the residuals distribution (lines). Panels C and D display the correlation plots between the actual numbers of daily new cases and deaths and the Weibull distribution estimates for all studied countries. As shown, the points are well distributed around the straight line with slope = 1 and intercept = 0, although there seems to be a large number of outliers (Fig. 3, Panels C and D). Additionally, Table 1 shows the R² value of every country's data fit. As displayed, most fits (44 of the 53 countries listed in Table 1) resulted in R² values greater than 0.6.
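The text does not state which definition of R² was used. A common choice for non-linear regression, assumed here, is one minus the ratio of the residual sum of squares to the total sum of squares. The Python sketch below, with hypothetical function names, computes the residuals and this R² from observed and fitted daily counts.

```python
def residuals(observed, fitted):
    """Observed minus fitted value for each day."""
    return [y - f for y, f in zip(observed, fitted)]


def r_squared(observed, fitted):
    """Assumed definition: R2 = 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum(r ** 2 for r in residuals(observed, fitted))
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot
```

Under this definition a perfect fit gives 1, a fit no better than the mean of the data gives 0, and a fit worse than the mean gives a negative value.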
Fig. 3. Distributions of the residuals of the fitted data for daily new cases (Panel A) and deaths (Panel B) of COVID-19, calculated with customized analysis for each country. Split dates, as well as the number of modes in each Weibull distribution used, were as described in Table 1. Normal distributions, calculated with the same means and SD as each respective residuals distribution, are shown as lines. Panels C and D show the residuals plots of daily new cases and deaths for countries versus the estimated fits in custom mode, as described in Table 1. Straight lines were drawn with slope = 1 and intercept = 0, merely to guide the eye.
3.1. Parameter extraction from Weibull Metropolis-MCMC
Modeling of natural processes is of primary importance for forecasting tendencies of similar phenomena in the future, as well as for extracting information from the model that allows one to better understand them. The estimated death toll is one of the parameters that can be extracted from the calculations used to model the COVID-19 data here. Modeling of the COVID-19 curves of daily new cases and deaths allows us to predict both the number of daily new cases and deaths in the future and the overall death toll of COVID-19 in a given country. Table 1 shows the total expected death tolls of COVID-19 for all the studied countries. Data fitting was performed, for each country, using the split dates and the number of modes for the first (Mo1) and the second (Mo2) distributions, as displayed in Table 1 (a value of 1 represents a unimodal Weibull distribution and a value of 2 a bimodal distribution). Modeling of the COVID-19 data can be customized for each country's data, in order to set the split date and the number of modes of the distributions used, and can be reevaluated day by day as new data emerge. Such an analytical model may be a worthwhile tool to evaluate and guide the response of health authorities and governments to COVID-19 and to other epidemics in the future.
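For the unimodal case, the link between the fitted curve and the overall toll follows from integrating Eq. (1): the expected cumulative count up to day t is α(1 − exp(−((t − γ)/η)^β)), which approaches α as t grows. The Python sketch below illustrates this relation with a hypothetical function name and illustrative parameter values; it does not cover the bimodal or split-date fits.

```python
import math


def weibull4_cumulative(t, alpha, beta, eta, gamma):
    """Integral of Eq. (1) from gamma to t: expected cumulative cases or deaths."""
    if t <= gamma:
        return 0.0
    return alpha * (1.0 - math.exp(-(((t - gamma) / eta) ** beta)))


# Illustrative values only: alpha = 1000, beta = 2.5, eta = 40 days, gamma = 10 days.
at_scale = weibull4_cumulative(50.0, 1000.0, 2.5, 40.0, 10.0)    # t = gamma + eta
long_run = weibull4_cumulative(1000.0, 1000.0, 2.5, 40.0, 10.0)  # far in the tail
```

At t = γ + η the cumulative count is α(1 − 1/e), about 63% of the total, whatever the shape parameter, while long_run is essentially α. This is why α can be read as the overall toll implied by a fitted wave.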
4. Discussion
COVID-19 is a global health emergency that is going to change the way in which people, institutions and governments manage and execute their lives and duties. The fact that specific drugs or vaccines for COVID-19 have only been developed recently raises the importance of behavioral strategies, such as social isolation and lock-down (Anderson et al., 2020; Lau et al., 2020; Mitjà et al., 2020; Saez et al., 2020; Sjödin et al., 2020; Wilder-Smith & Freedman, 2020) and mass testing (Choi et al., 2020; Peto, 2020; Salathé et al., 2020), to keep fighting the pandemic. In this scenario, modeling and forecasting the course of the pandemic play an important role by providing information for evaluating the measures taken by governments and health authorities. Parameters extracted from modeling and forecasts may be used to determine better strategies to mitigate the impact of infectious diseases on the population (Verma et al., 2020). With this in mind, we proposed the use of the Weibull distribution to model data on daily new cases and deaths of the COVID-19 pandemic from selected countries. In our previous work, the Weibull distribution was used to model forecasts of COVID-19 data in Brazil (Moreau, 2020). To our knowledge, that was the first time such an approach was used to this end, and the present work is the first report of the use of the Weibull distribution to model COVID-19 data in a systematic worldwide analysis.
Table 1
Estimated overall death tolls calculated from customized analysis, updated to January 7th, 2021, for every country. Overall death tolls represent the integer area under the regression curves shown in Fig. S1 (α parameter of Eqs (1) or (2)). Errors were calculated from the standard deviations of the last 5,000 iterations of the Metropolis-MCMC simulation (see Materials and methods). Determination coefficients (R²) for the data fitting, as well as the split date and the number of modes of the Weibull distribution used to fit the data before (Mo1) and after (Mo2) the split date, are also shown for each country (1 corresponds to the unimodal Weibull distribution, Eq. (1), and 2 to the bimodal one, Eq. (2)). A hyphen marks fits performed without a split date, for which only Mo1 applies.
Country                  Death toll             R²      Split date  Mo1  Mo2
Argentina                68,512 ± 1,505         0.8452  -           2    -
Austria                  8,952 ± 630            0.8576  Jun/01      1    1
Bangladesh               11,902 ± 476           0.7970  -           2    -
Belgium                  24,180 ± 1,228         0.8536  Jul/01      1    2
Bolivia                  15,752 ± 298           0.8185  -           2    -
Bosnia and Herzegovina   4,785 ± 296            0.7477  Jun/01      2    2
Brazil                   358,586 ± 15,536       0.6423  -           2    -
Bulgaria                 10,291 ± 558           0.8125  Oct/01      1    1
Canada                   34,380 ± 3,541         0.8252  Jul/01      1    1
Chile                    20,155 ± 1,202         0.4935  Aug/01      1    2
China                    4,058 ± 478            0.7175  May/01      1    2
Colombia                 73,053 ± 2,281         0.8549  -           2    -
Croatia                  7,812 ± 357            0.9710  Aug/05      2    2
Czechia                  24,690 ± 1,693         0.9267  Jul/15      2    2
Ecuador                  20,980 ± 764           0.0168  -           2    -
Egypt                    23,975 ± 2,644         0.8569  Oct/05      2    1
France                   86,646 ± 5,763         0.6716  Jul/01      1    2
Germany                  213,074 ± 26,463       0.8010  Jul/01      1    1
Greece                   6,906 ± 207            0.9597  Jun/01      1    2
Guatemala                6,178 ± 180            0.5920  -           2    -
Honduras                 5,547 ± 208            0.3671  -           2    -
Hungary                  15,525 ± 855           0.9617  Jul/01      1    1
India                    266,570 ± 8,443        0.8933  -           1    -
Indonesia                53,125 ± 2,335         0.8647  -           2    -
Iran                     58,670 ± 1,270         0.9638  May/01      1    2
Iraq                     30,346 ± 473           0.9101  -           1    -
Israel                   4,968 ± 353            0.8405  Aug/15      2    2
Italy                    118,980 ± 8,500        0.9409  Jul/01      1    2
Japan                    8,304 ± 1,132          0.8586  Jun/01      1    2
Jordan                   8,311 ± 283            0.9510  -           1    -
Mexico                   202,567 ± 7,053        0.6970  -           2    -
Moldova                  5,584 ± 785            0.7228  Jul/01      2    2
Morocco                  13,735 ± 956           0.9134  Jun/05      1    1
Netherlands              17,559 ± 1,134         0.8089  Jul/01      1    2
Pakistan                 20,151 ± 2,266         0.6464  Sep/01      1    1
Panama                   13,142 ± 1,444         0.8265  May/15      1    2
Peru                     51,300 ± 3,463         0.5195  -           2    -
Philippines              12,175 ± 1,055         0.4125  May/20      2    1
Poland                   34,976 ± 5,243         0.7739  Jul/01      2    2
Portugal                 10,622 ± 593           0.9513  Aug/10      2    2
Romania                  18,194 ± 1,023         0.9192  Jun/01      1    2
Russia                   83,294 ± 2,799         0.9478  -           2    -
Saudi Arabia             11,323 ± 515           0.8250  -           1    -
Serbia                   5,205 ± 372            0.9686  Sep/15      2    1
South Africa             118,902 ± 26,752       0.7654  Oct/01      1    1
Spain                    108,474 ± 13,022       0.4896  Jun/15      1    2
Sweden                   15,763 ± 1,193         0.4741  Sep/01      2    1
Switzerland              13,565 ± 1,271         0.7349  Jul/01      1    1
Tunisia                  6,234 ± 626            0.5676  Jun/01      1    1
Turkey                   137,267 ± 31,191       0.9289  Aug/01      2    1
Ukraine                  26,986 ± 3,135         0.8644  Jun/01      1    2
United Kingdom           123,536 ± 4,698        0.8347  Jul/01      1    2
United States            1,209,893 ± 193,330    0.7741  Sep/01      2    1
The Weibull distribution has been shown to fit a single peak of COVID-19 daily new cases and deaths well. Figure 1 displays the daily new cases and deaths data from the first wave of COVID-19 infections in Italy. Italy was chosen because it was one of the countries that displayed a well-defined single peak of new cases and deaths, probably because of the strict lock-downs and wide mass testing measures taken in response to the first wave of infections. This pattern allows us to use the Italian data to evaluate the application of the 4-parameter Weibull distribution to fit the COVID-19 epidemic data. Similar results could be obtained when data from the first peak of daily new cases and deaths were analyzed in other countries that displayed a clear initial single wave of infections, such as Belgium, Canada, China, France, Germany, Netherlands, Portugal, Spain, Switzerland and United Kingdom (data not shown).
Figure 2 shows examples of five distinct customized ways to model the daily new cases and deaths for COVID-19, depending on the pattern of each country's epidemic curve. It was possible to model the data by performing non-linear curve fitting with a single unimodal Weibull distribution (Eq. (1)), with a single bimodal distribution (Eq. (2)) or with two Weibull distributions. Two distributions can be applied to the modeling calculation by splitting the data at a given date. The split date may be set to a day in the deepest valley between the end of one epidemic wave and the beginning of another. Figure 2 presents examples of the customized analysis in which the fitting parameters were set to reach better fit results. Data from Belgium, Canada and Germany (upper panels) were modeled with two unimodal Weibull distributions and a split point at August 1st; data from Bolivia, Brazil and Russia (middle panels) were analyzed with a single bimodal Weibull distribution (Eq. (2)); data from Serbia and United States (lower panels) were analyzed with two Weibull distributions, one bimodal up to September 1st (split date) and one unimodal from September 2nd onward; and, finally, data from United Kingdom (lower panels) were analyzed with one unimodal distribution up to September 1st (split date) and one bimodal distribution from September 2nd forward. The split date, as well as the number of modes of the Weibull distribution to be used before (Mo1) and after (Mo2) the split date, may be chosen in order to reach a better quality of data fitting. Table 1 shows the split date and the number of modes of the distributions used (Mo1 and Mo2), as well as the determination coefficient (R²) for each country's data fit. As also seen, most fitting procedures shown here reached a good fit quality, with R² values over 0.6, indicating that data analysis using the 4-parameter Weibull distribution is a suitable model for COVID-19 data fitting.
Analysis of the results presented did not allow us to determine correlations between the goodness of fit and any key parameter related to the response of the countries' governments to the COVID-19 pandemic, such as the number of deaths per million or the Oxford COVID-19 Government Response Tracker (Hale et al., 2020), for instance (data not shown), though the goodness of fit might be associated with flawed data collection and processing. Well-defined peaks for daily new cases and deaths, with the clear presence of ascending and descending phases, may tend to conform better to the unimodal Weibull distribution, as shown in Fig. 1A. We speculate that fuzzy patterns for the daily new cases and deaths, presented by some countries, might be associated with misguided strategies adopted by such countries to fight the COVID-19 pandemic, which would make the number of daily new transmissions vary strongly, due to undesired spreading of the virus throughout the community. Such mismanagement might lead to what is called “multiple waves” of the disease. For cases in which multiple waves are present, alternative ways to perform the Weibull analysis were presented here (Fig. 2 and Table 1).
Figure S1 (Supplementary Material) shows the customized analysis, performed with a single (no split date) or two Weibull distributions, for every country that registered more than 3,000 deaths up to January 7th, 2021. A multiple-wave pattern can be observed for most, if not all, of the countries (Fig. S1). Split dates, Mo1 and Mo2 used to fit the data in Fig. S1 were as described in Table 1. Although the data analysis presented here was able to deliver feasible models of the COVID-19 pandemic data, it should be taken with caution, due to the possible existence of corrupted data or to the low reliability of the epidemic data collected by some countries' authorities. In addition, although the overall death tolls extracted from the area under the curve (α in Eq. (2)), displayed in Table 1, reflect good estimates of the real number of deaths at the end of the pandemic peaks, they might be biased by the oscillations present in the pattern of epidemic data from some countries and can reach much greater values if new waves of infection emerge.
Overall, the 4-parameter Weibull distribution proved to be a suitable model for the
COVID-19 pandemic when applied to daily new cases and deaths data. Figure 3 summarizes the residuals analysis of
the non-linear regression of the daily new cases and deaths data from every country used in this work. Residuals from
both daily new cases and deaths form narrow distributions around zero. Lines in Panels A and B represent normal
distributions with the mean and SD of the data in each respective panel. It is worth noting that the residuals distributions
are narrower than the normal distributions with the same mean and SD values. This observation indicates
the presence of highly dispersed outliers in the residuals, which inflate the SD. Panels C and D display those outliers more clearly.
Despite the presence of the outliers, data from both new cases and deaths display sharp residuals distributions
around the Weibull distribution fit, which reinforces the use of this method for modeling, forecasting
and extracting parameters from daily new cases and deaths data of COVID-19.
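The comparison between a residuals distribution and a normal curve of the same mean and SD can be sketched in a few lines of Python. The residuals below are synthetic (a narrow core plus a small fraction of widely dispersed outliers) and are only an assumption used to mimic the pattern described for Fig. 3; the function name `residual_summary` is hypothetical.

```python
import math
import random

def residual_summary(residuals):
    """Return mean, SD, excess kurtosis and the fraction of points within one SD."""
    n = len(residuals)
    mean = sum(residuals) / n
    var = sum((r - mean) ** 2 for r in residuals) / n
    sd = math.sqrt(var)
    fourth = sum((r - mean) ** 4 for r in residuals) / n
    excess_kurtosis = fourth / var ** 2 - 3.0  # 0 for a normal distribution
    within_one_sd = sum(1 for r in residuals if abs(r - mean) <= sd) / n
    return mean, sd, excess_kurtosis, within_one_sd

# Synthetic residuals: 95% narrow noise, 5% highly dispersed outliers.
rng = random.Random(0)
residuals = [rng.gauss(0.0, 1.0) for _ in range(950)]
residuals += [rng.gauss(0.0, 10.0) for _ in range(50)]

mean, sd, excess_kurtosis, within_one_sd = residual_summary(residuals)

# Fraction of a normal distribution lying within one SD of its mean.
NORMAL_WITHIN_ONE_SD = math.erf(1.0 / math.sqrt(2.0))

print("excess kurtosis:", round(excess_kurtosis, 2))
print("fraction within one SD:", round(within_one_sd, 3),
      "vs normal:", round(NORMAL_WITHIN_ONE_SD, 3))
```

A positive excess kurtosis, together with a larger-than-normal share of points within one SD, is the numerical counterpart of a histogram that is narrower than the normal curve drawn with the same mean and SD.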
The non-linear regressions used here were performed by a Metropolis-MCMC algorithm built in an R language script and
coded in an R package called “weibull4”. This package may be useful not just for COVID-19, but for
any epidemic data that display a similar spreading pattern. The weibull4 R package can be used for non-linear regression
of daily new cases and deaths data using both unimodal and bimodal 4-parameter Weibull distributions (Eqs (1) and
(2)), with the location parameter (γ), which accommodates the time lapse before the first cases or deaths arise, and
the area parameter (α), which represents the overall number of registered cases or deaths. The weibull4 package is available
at the R CRAN repository (R Core Team, 2013) at https://cran.r-project.org/.
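The general idea of a Metropolis-type non-linear regression can be sketched as follows. This is a minimal Python illustration, not the implementation of the weibull4 R package: the shape/scale parameterization, the Gaussian likelihood with a fixed noise SD, the proposal step sizes, the function names and the synthetic data are all assumptions made for this example.

```python
import math
import random

def weibull4(t, alpha, shape, scale, gamma):
    """Assumed 4-parameter form: Weibull density shifted by gamma, scaled by alpha."""
    x = t - gamma
    if x <= 0:
        return 0.0
    z = x / scale
    return alpha * (shape / scale) * z ** (shape - 1) * math.exp(-(z ** shape))

def sse(params, days, counts):
    """Sum of squared residuals between observed counts and the model curve."""
    return sum((c - weibull4(t, *params)) ** 2 for t, c in zip(days, counts))

def metropolis_fit(days, counts, start, steps, noise_sd, n_iter=5000, seed=1):
    """Random-walk Metropolis over (alpha, shape, scale, gamma), Gaussian likelihood."""
    rng = random.Random(seed)
    current, current_sse = list(start), sse(start, days, counts)
    best, best_sse = list(current), current_sse
    for _ in range(n_iter):
        proposal = [p + rng.gauss(0.0, s) for p, s in zip(current, steps)]
        if min(proposal[:3]) <= 0:  # alpha, shape and scale must stay positive
            continue
        proposal_sse = sse(proposal, days, counts)
        # Log of the likelihood ratio proposal/current for Gaussian noise.
        log_ratio = (current_sse - proposal_sse) / (2.0 * noise_sd ** 2)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current, current_sse = proposal, proposal_sse
            if current_sse < best_sse:
                best, best_sse = list(current), current_sse
    return best, best_sse

# Synthetic daily counts from hypothetical "true" parameters plus Gaussian noise.
true_params = (50000.0, 2.5, 60.0, 10.0)
noise_sd = 20.0
data_rng = random.Random(0)
days = list(range(150))
counts = [weibull4(t, *true_params) + data_rng.gauss(0.0, noise_sd) for t in days]

start = (40000.0, 2.0, 50.0, 5.0)   # deliberately inaccurate initial guess
steps = (500.0, 0.02, 0.5, 0.5)     # proposal SD per parameter (tuning assumption)
start_sse = sse(start, days, counts)
best, best_sse = metropolis_fit(days, counts, start, steps, noise_sd)

print("start SSE:", round(start_sse), "best SSE:", round(best_sse))
print("best (alpha, shape, scale, gamma):", [round(p, 2) for p in best])
```

Proposals that lower the sum of squared residuals are always accepted, while worse ones are accepted with a probability that shrinks as the fit deteriorates; the proposal step sizes strongly affect how quickly the chain explores the parameter space (Roberts et al., 1997).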
Predictions of the COVID-19 epidemic evolution based on daily new cases and deaths are especially useful, because
they can be revised day by day, giving governments and health authorities the opportunity to adjust their
measures as new data arise. Additionally, the 4-parameter Weibull distribution, as well as the weibull4 R package, may
be suitable for analyzing epidemic data from other diseases or, eventually, from future pandemics, since there
seems to be a consensus in the scientific community that we are at imminent risk of them (Osterholm, 2005). We
believe that such predictions would be useful for decision makers in defining strategies to fight epidemic and
pandemic outbreaks, now and in the future.
Acknowledgments
The author would like to thank Dr. Gilson Carvalho for valuable discussions and Dr. Juliana Cortines for the
critical reading of the manuscript.
References
Anderson, R. M., Heesterbeek, H., Klinkenberg, D., & Hollingsworth, T. D. (2020). How will country-based mitigation measures influence the
course of the COVID-19 epidemic? Lancet, 395, 931-934.
Choi, J. Y. (2020). COVID-19 in South Korea. Postgrad Med J, 96, 399-402.
Choi, S., Han, C., Lee, J., Kim, S. I., & Kim, I. B. (2020). Innovative screening tests for COVID-19 in South Korea. Clin Exp Emerg Med, 1-5.
Ciufolini, I., & Paolozzi, A. (2020). Mathematical prediction of the time evolution of the COVID-19 pandemic in Italy by a Gauss error function
and Monte Carlo simulations. Eur Phys J Plus, 135, 355.
Eberhardt, J. N., Breuckmann, N. P., & Eberhardt, C. S. (2020). Multi-Stage Group Testing Improves Efficiency of Large-Scale COVID-19
Screening. J Clin Virol, S1386-6532(20)30124-4.
Gupta, S., Raghuwanshi, G. S., & Chanda, A. (2020). Effect of weather on COVID-19 spread in the US: A prediction model for India in 2020. Sci
Total Environ, 728, 138860.
Hale, T., Angrist, N., Cameron-Blake, E., Hallas, L., Kira, B., Majumdar, S., Petherick, A., Phillips, T., Tatlow, H., & Webster, S. (2020). Variation
in government responses to COVID-19. BSG Work. Pap. Ser. Blavatnik Sch. Gov. Univ. Oxford: Version 8.0.
Khan, S. A. (2018). Exponentiated Weibull regression for time-to-event data. Lifetime Data Anal, 24, 328-354.
Kim, S., Seo, Y. B., & Jung, E. (2020). Prediction of COVID-19 transmission dynamics using a mathematical model considering behavior changes.
Epidemiol Health, 42, e2020026.
Korner-Nievergelt, F., Roth, T., von Felten, S., Guélat, J., Almasi, B., & Korner-Nievergelt, P. (2015). Markov Chain Monte Carlo Simulation. In:
Bayesian Data Anal. Ecol. Using Linear Model. with R, BUGS, STAN. Elsevier, 197-212.
Lau, H., Khosrawipour, V., Kocbach, P., Mikolajczyk, A., Schubert, J., Bania, J., & Khosrawipour, T. (2020). The positive impact of lockdown in
Wuhan on containing the COVID-19 outbreak in China. J Travel Med, 27.
Lawless, J. F. (2003). Basic Concepts and Models 1.1. In: Stat Model Methods Lifetime Data, Second Ed, 1-47.
Li, L., Yang, Z., Dang, Z., Meng, C., Huang, J., Meng, H., Wang, D., Chen, G., Zhang, J., Peng, H., & Shao, Y. (2020). Propagation analysis and
prediction of the COVID-19. Infect Dis Model, 5, 282-292.
Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H., & Teller, E. (1953). Equation of State Calculations by Fast Computing
Machines. J Chem Phys, 21, 1087-1092.
Mitjà, O., Arenas, À., Rodó, X., Tobias, A., Brew, J., & Benlloch, J. M. (2020). Experts’ request to the Spanish Government: Move Spain towards
complete lockdown. Lancet, 395, 1193-1194.
Moreau, V. H. (2020). Forecast predictions for the COVID-19 pandemic in Brazil by statistical modeling using the Weibull distribution for daily
new cases and deaths. Brazilian J Microbiol, 51, 1109-1115.
Osterholm, M. T. (2005). Preparing for the next pandemic. N Engl J Med, 352, 1839-1842.
Peto, J. (2020). Covid-19 mass testing facilities could end the epidemic rapidly. BMJ, m1163.
R Core Team. (2013). R: A language and environment for statistical computing. Vienna, Austria.
Roberts, G. O., Gelman, A., & Gilks, W. R. (1997). Weak convergence and optimal scaling of random walk Metropolis algorithms. Ann Appl
Probab, 7, 110-120.
Roser, M., Ritchie, H., Ortiz-Ospina, E., & Hasel, J. (2020). Coronavirus Pandemic (COVID-19).
RStudio Team. (2020). RStudio: Integrated Development Environment for R. Boston, MA.
Saez, M., Tobias, A., Varga, D., & Barceló, M. A. (2020). Effectiveness of the measures to flatten the epidemic curve of COVID-19. The case of
Spain. Sci Total Environ, 727, 138761.
Salathé, M., Althaus, C. L., Neher, R., Stringhini, S., Hodcroft, E., Fellay, J., Zwahlen, M., Senti, G., Battegay, M., Wilder-Smith, A., Eckerle, I.,
Egger, M., & Low, N. (2020). COVID-19 epidemic in Switzerland: on the importance of testing, contact tracing and isolation. Swiss Med
Wkly.
Sjödin, H., Wilder-Smith, A., Osman, S., Farooq, Z., & Rocklöv, J. (2020). Only strict quarantine measures can curb the coronavirus disease
(COVID-19) outbreak in Italy, 2020. Eurosurveillance, 25, 1-6.
Verma, V., Vishwakarma, R. K., Verma, A., Nath, D. C., & Khan, H. T. A. (2020). Time-to-Death approach in revealing Chronicity and Severity of
COVID-19 across the World. Ed. Kannan Navaneetham. PLoS One, 15, e0233074.
Wilder-Smith, A., & Freedman, D. O. (2020). Isolation, quarantine, social distancing and community containment: pivotal role for old-style public
health measures in the novel coronavirus (2019-nCoV) outbreak. J Travel Med, 27, 1-4.
Supplementary data
The supplementary files are available to download from http://dx.doi.org/10.3233/MAS-210510.
Supplement Fig. 1. Plots of daily new cases (open circles) and deaths (closed circles) for COVID-19 in every country used in this work, analyzed
in customized mode. Split dates and the number of modes in the Weibull distribution used to fit the data before and after the split date are shown
in Table 1. Data were fit (lines) using the weibull4 R package (see Material and Methods). Charts are ordered from the highest to the lowest coefficient
of determination (R²) of the data fit.
AUTHOR COPY
... where ϕ � (κ, σ), κ > 0, and σ > 0. e Weibull model and its different generalized/modified variants have been used by researchers for modeling data in numerous sectors. For example, (i) Ghorbani et al. [1] and Moreau [2] applied it to the medical science phenomena; (ii) Zaindin and Sarhan [3], Lai [4], Almalki and Yuan [5], and Singh [6] used it for reliability engineering applications; and (iii) Ahmad et al. [7] studied its applications in the finance sector. e probability density function (PDF) v(x; ϕ) of the Weibull model is v(x; ϕ) � κσx κ− 1 e − σx κ , · · · · · · x > 0, (2) with hazard function (HF) h(x; ϕ) given by h(x; ϕ) � κσx κ− 1 , · · · · · · x > 0. ...
... For example, (i) Ghorbani et al. [1] and Moreau [2] applied it to the medical science phenomena; (ii) Zaindin and Sarhan [3], Lai [4], Almalki and Yuan [5], and Singh [6] used it for reliability engineering applications; and (iii) Ahmad et al. [7] studied its applications in the finance sector. e probability density function (PDF) v(x; ϕ) of the Weibull model is v(x; ϕ) � κσx κ− 1 e − σx κ , · · · · · · x > 0, (2) with hazard function (HF) h(x; ϕ) given by h(x; ϕ) � κσx κ− 1 , · · · · · · x > 0. ...
Article
Full-text available
In this article, we focused on predictive modeling for real data by means of a new statistical model and applying di erent machine learning algorithms. e importance of statistical methods in various research elds is modeling the real data and predicting the future behavior of data. For modeling and predicting real-life data, a series of statistical models have been introduced and successfully implemented. is study introduces another novel method, namely, a new generalized exponential-X family for generating new distributions. is method is introduced by using the T-X approach with the exponential model. A special case of the new method, namely, a new generalized exponential Weibull model, is introduced. e applicability of the new method is illustrated by means of a real application related to the alumina (Al 2 O 3) data set. Acceptance sampling plans are developed for this distribution using percentiles when the life test is truncated at the pre-assigned time. e minimum sample size needed to make sure that the required lifetime percentile is determined for a speci ed customer's risk and producer's risk simultaneously. e operating characteristic value of the sampling plans is also provided. e plan methodology is illustrated using Al 2 O 3 fracture toughness data. Using the same data set, we implement various machine learning approaches including the support vector machine (SVR), group method of data handling (GMDH), and random forest (RF). To evaluate their forecasting performances, three statistical measures of accuracy, namely, root-mean-square error (RMSE), mean absolute error (MAE), and Akaike information criterion (AIC) are computed.
... Statistical methodologies, particularly those pertaining to lifetime and time-to-event scenarios, are extensively employed across various domains, with a notable emphasis on health-related sectors, we refer to Baghestani et al. [1], Hoseini et al. [2], Moreau [3], and Almongy et al. [4]. ...
Article
Full-text available
In recent years, the modeling of time-to-events has emerged as a highly promising and dynamic research area. This field has witnessed a surge of research studies dedicated to developing novel statistical methodologies aimed at effectively handling time-to-event phenomena. These studies are motivated by the increasing recognition of the importance of time-related factors in various fields such as medicine, epidemiology, finance, and engineering. Researchers have been actively engaged in proposing innovative approaches to address the complexities associated with time-to-event data. The overarching goal is to enhance our understanding of event occurrence and duration, enabling more accurate predictions and informed decision-making. This research encompasses a wide range of topics, including survival analysis, reliability modeling, and event prediction. The motivation behind these research efforts stems from the need to overcome traditional limitations in time-to-event analysis and to explore new avenues for modeling and interpretation. By introducing advanced statistical techniques, researchers seek to capture the intricate dynamics of event processes, considering factors such as censoring, competing risks, and time-varying covariates. The proliferation of research studies in this domain reflects a collective effort to push the boundaries of statistical modeling and analysis, paving the way for more comprehensive and robust methodologies. As researchers continue to delve deeper into the intricacies of time-to-event data, the impact of these advancements extends to diverse applications, ultimately fostering innovation and progress across interdisciplinary fields. This paper adopts and implements a new statistical approach to propose a family of flexible distributions, namely, a new generalized-family of distributions. 
For the newly obtained family, certain mathematical properties such as identifiability, quantile function, th non-central moment, Lorenz curve, incomplete moments, and the expression of the Bonferroni curve are obtained. Furthermore, an extension of the Weibull model is introduced using the newly developed approach, namely, a new generalized Weibull model. The parameters of the new generalized version of the Weibull model are estimated by adopting a well-known estimation approach. Finally, a data set consists of sixty (60) observations representing the times of the survival of some patients infected by the COVID-19 epidemic is analyzed to illustrate the new generalized Weibull model.
... When k = 2, the Gaussian mixture distribution becomes a bimodal distribution. In addition to the Gaussian mixture distribution, several compound distributions proposed in the literature can be considered as a multimodal distribution (Ahmad et al., 2022;Chesneau et al., 2019;Moreau, 2021;Para et al., 2020;Punzo et al., 2018;Vasconcelos et al., 2020). Some less familiar bimodal distributions are the U-shaped and the V-shaped distributions. ...
Article
This paper proposes a new asymmetric V-shaped distribution for fitting continuous data. In this study, some statistical properties, such as the mean, the median, the variance, the survival, and the hazard function of the new distribution are investigated. Furthermore, we also presented how to generate the proposed asymmetric V-shaped distribution based on two random variables that have uniform distributions. Three examples are presented to illustrate the advantages of the asymmetric V-shaped distribution for some simulated and real-life data sets.
... where φ φ φ = (λ , η), λ > 0 and η > 0. This model can be applied to discuss various data in different sectors such as: Medical sciences [10,11], reliability engineering [12,13,14], and finance sector [15]. For more detailed information about the modifications and usefulness of the Weibull distribution [16,17,18,19]. ...
Article
Full-text available
Statistical methodologies have broad applications in sports and other exercise sciences. These methods can be used to predict the winning probability of a team or individual in a match. Due to the applicability of the statistical methods in sports, this paper introduces a new method of obtaining statistical distributions. The new method is called a novel beta power-L family of distributions. Some mathematical characteristics of the new family are obtained. Based on the novel beta power-L family, a special model, namely, a novel beta power Weibull model is studied. Finally, the applicability/usefulness of the novel beta power Weibull distribution is shown by analyzing the time-to-even data taken from different football matches during 1964-2018. The data consist of seventy-eight observations and is representing the waiting time duration of the fastest goal scored ever in the history of football. The fitting results of the novel beta power Weibull distribution are compared with other models. Based on three model selection criteria, it is observed that the proposed novel beta power Weibull model provides a close fit to the waiting time data.
... There are other models that have been used to study COVID-19 disease. They are SEIR model with quarantine and fatality compartment [51], lattice model [52], a recursive model [53], computational model [54], networks based model [55,56], tailored model [57], growth rate model [58], parameter estimation model [59], Weibull distribution model [60], Markovchain model [61], in-homogeneous spatial model [62], fuzzy model [63], hybrid intelligent model [64], case based rate reasoning model [65], forecasting model [66], complex mathematical model [67], time series model [68], and hierarchical epidemic risk model [69]. All these models were used to predict, forecast and understand COVID-19 spread dynamics. ...
Thesis
This thesis is devoted to the mathematical and statistical modeling of epidemic data andIt is divided into two broad parts, which are subdivided into different sections. The modeling of infectious diseases has been a subject of interest to researchers, policy makers, andmedical practitioners, most especially during the recent global COVID-19 pandemic, whichIt has been devastating to the health infrastructure and socio-economic status of many nations.It has affected mobility and interaction among citizens due to the many daily new cases and deaths.Hence, the need to contribute to understanding the mechanisms of virulence and spread using different mathematical and statistical modeling approaches. The first part is dedicated to the mathematical modeling aspect, which consists of the deterministic and discrete approaches to epidemiology modeling, which in this case is mainly focused on the COVID-19 pandemic. The daily reproduction number of the COVID-19 outbreak calculation is approached by discretization using the idea of deconvolution and a unique biphasic pattern is observed that is more prevalent during the contagiousness period across various countries. Furthermore, a discrete model is formulated from Usher’s model in order to calculate the life span loss due to COVID-19 disease and to also explain the role of comorbidities, which are very essential in the disease spread and its dynamics at an individual level. Also, the formulation of Susceptible-Infectious-Geneanewsusceptible-Recovered (SIGR) age-dependent modelling is proposed in order to perform some mathematical analysis and present the role of different epidemiology parameters, most especially vaccination, and finally, a new technique to identify the point of inflection on the smoothed curves of the new infected pandemic cases using the Bernoulli equation is presented. This procedure is important because not all countries have reached the turning point (maximum number of daily cases) in the epidemic curve. 
The approach is used to calculate the transmission rate and the maximum reproduction number for various countries.The statistical modelling of the COVID-19 pandemic using various data analysis models (namely machine and deep learning models) is presented in the second part in order to understand the dynamics of the pandemic in different countries and also predict and forecast the daily new cases and deaths due to the disease alongside some socio-economic parameters. It is observed that the prediction and forecasting are consistent with the disease evolution at different waves in these countries and that there are socio-economic determinants of the disease depending on whether the country is developed or developing. Also, the study of the shapes and peaks of the COVID-19 disease is presented. The peaks of the curves of the daily new cases and deaths are identified using the spectral analysis method, which enables the weekly peak patterns to be visible. Finally, the clustering of different regions in France due to the spread of the disease is modeled using functional data analysis. The study shows clear differences between the periods when vaccination has not been introduced (but only non-pharmaceutical mitigation measures) and when it was introduced. The results presented in this thesis are useful to better understand the modeling of a viral disease, the COVID-19 virus.
Article
This study introduces a Simulation-Driven Mixed Integer Linear Programming (SDMILP) model developed to optimize the scheduling of medical waste treatment during emergencies. Our approach integrates simulation to reflect the complexities of the real world, providing solutions that are adaptable to dynamic conditions. Initially, we formulate the problem using an MILP model to optimize waste allocation and alleviate operational pressures. We then incorporate a simulation mechanism within the MILP framework, which simulates waste generation to address uncertainties in epidemic transmission and the rehabilitation process. Through computational experiments conducted on benchmark instances, we evaluate the model’s performance. The results confirm its efficacy in reducing waste treatment costs, including transportation, fixed expansion costs and temporary overload operating costs at treatment stations, while ensuring equitable load distribution among treatment stations.
Preprint
Full-text available
Background/Objective: Relative proportion of cases in a multi-strain pandemic like the COVID-19 pandemic provides insight on how fast a newly emergent variant dominates the infected population. However, the behavior of relative proportion of emerging variants is an understudied field. We investigated the emerging behavior of dominant COVID-19 variants using nonlinear statistical methods and calculated the time to dominance of each variant. Method: We used a phenomenological approach to model national- and regional-level variant share data from the national genomic surveillance system provided by the Centers for Disease Control and Prevention to determine the best model to describe the emergence of two recent dominant variants of the SARS-CoV-2 virus: XBB.1.5 and JN.1. The proportions were modeled using logistic, Weibull, and generalized additive models. Model performance was evaluated using the Akaike Information Criteria (AIC) and the root mean square error (RMSE). Findings: The Weibull model performed the worst out of all three approaches. The generalized additive model approach slightly outperformed the logistic model based on fit statistics, but lacked in interpretability compared to the logistic model. These models were then used to estimate the time elapsed from emergence to dominance in the infected population, denoted by the time to dominance (TTD). All three models yielded similar TTD estimates. The XBB.1.5 variant was found to dominate the population faster compared to the JN.1 variant, especially in HHS Region 2 (New York) where the XBB.1.5 was believed to emerge. This research expounds on how emerging viral strains transition to dominance, informing public health interventions against future emergent COVID-19 variants and other infectious diseases.
Article
Full-text available
COVID-19 has killed more than 500,000 people worldwide and more than 60,000 in Brazil. Since there are no specific drugs or vaccines, the available tools against COVID-19 are preventive, such as the use of personal protective equipment, social distancing, lockdowns, and mass testing. Such measures are hindered in Brazil due to a restrict budget, low educational level of the population, and misleading attitudes from the federal authorities. Predictions for COVID-19 are of pivotal importance to subsidize and mobilize health authorities’ efforts in applying the necessary preventive strategies. The Weibull distribution was used to model the forecast prediction of COVID-19, in four scenarios, based on the curve of daily new deaths as a function of time. The date in which the number of daily new deaths will fall below the rate of 3 deaths per million — the average level in which some countries start to relax the stay-at-home measures — was estimated. If the daily new deaths curve was bending today (i.e., about 1250 deaths per day), the predicted date would be on July 5. Forecast predictions allowed the estimation of overall death toll at the end of the outbreak. Our results suggest that each additional day that lasts to bend the daily new deaths curve may correspond to additional 1685 deaths at the end of COVID-19 outbreak in Brazil (R2 = 0.9890). Predictions of the outbreak can be used to guide Brazilian health authorities in the decision-making to properly fight COVID-19 pandemic.
Article
Full-text available
BACKGROUND: With its epicenter in Wuhan, China, the COVID-19 outbreak was declared a Public Health Emergency of International Concern by the World Health Organization (WHO). Consequently, many countries have implemented flight restrictions to China. China itself has imposed a lockdown of the population of Wuhan as well as the entire Hubei province. However, whether these two enormous measures have led to significant changes in the spread of COVID-19 cases remains unclear. METHODS: We analyzed the available data on the development of confirmed domestic and international COVID-19 cases before and after lockdown measures. We evaluated the correlation of domestic air traffic to the number of confirmed COVID-19 cases and determined the growth curves of COVID-19 cases within China before and after lockdown as well as after changes in COVID-19 diagnostic criteria. RESULTS: Our findings indicate a significant increase in doubling time from 2 days (95% CI: 1.9-2.6) to 4 days (95% CI: 3.5-4.3), after imposing lockdown. A further increase is detected after changing diagnostic and testing methodology to 19.3 (95% CI: 15.1-26.3), respectively. Moreover, the correlation between domestic air traffic and COVID-19 spread became weaker following lockdown (before lockdown: r = 0.98, P < 0.05 vs after lockdown: r = 0.91, P = NS). CONCLUSIONS: A significantly decreased growth rate and increased doubling time of cases was observed, which is most likely due to Chinese lockdown measures. A more stringent confinement of people in high risk areas seems to have a potential to slow down the spread of COVID-19. © International Society of Travel Medicine 2020. All rights reserved. For Permissions, please e-mail: [email protected]
Article
Full-text available
Background The outbreak of coronavirus disease, 2019 (COVID-19), which started from Wuhan, China, in late 2019, have spread worldwide. A total of 5,91,971 cases and 2,70,90 deaths were registered till 28th March, 2020. We aimed to predict the impact of duration of exposure to COVID-19 on the mortality rates increment. Methods In the present study, data on COVID-19 infected top seven countries viz., Germany, China, France, United Kingdom, Iran, Italy and Spain, and World as a whole, were used for modeling. The analytical procedure of generalized linear model followed by Gompertz link function was used to predict the impact lethal duration of exposure on the mortality rates. Findings Of the selected countries and World as whole, the projection based on 21st March, 2020 cases, suggest that a total (95% Cl) of 76 (65–151) days of exposure in Germany, mortality rate will increase by 5 times to 1%. In countries like France and United Kingdom, our projection suggests that additional exposure of 48 days and 7 days, respectively, will raise the mortality rates to10%. Regarding Iran, Italy and Spain, mortality rate will rise to 10% with an additional 3–10 days of exposure. World’s mortality rates will continue increase by 1% in every three weeks. The predicted interval of lethal duration corresponding to each country has found to be consistent with the mortality rates observed on 28th March, 2020. Conclusion The prediction of lethal duration was found to have apparently effective in predicting mortality, and shows concordance with prevailing rates. In absence of any vaccine against COVID-19 infection, the present study adds information about the quantum of the severity and time elapsed to death will help the Government to take necessary and appropriate steps to control this pandemic.
Article
Full-text available
A novel coronavirus (severe acute respiratory syndrome-CoV-2) that initially originated from Wuhan, China, in December 2019 has already caused a pandemic. While this novel coronavirus disease (covid-19) frequently induces mild diseases, it has also generated severe diseases among certain populations, including older-aged individuals with underlying diseases, such as cardiovascular disease and diabetes. As of 31 March 2020, a total of 9786 confirmed cases with covid-19 have been reported in South Korea. South Korea has the highest diagnostic rate for covid-19, which has been the major contributor in overcoming this outbreak. We are trying to reduce the reproduction number of covid-19 to less than one and eventually succeed in controlling this outbreak using methods such as contact tracing, quarantine, testing, isolation, social distancing and school closure. This report aimed to describe the current situation of covid-19 in South Korea and our response to this outbreak.
Article
Full-text available
In this paper are presented mathematical predictions on the evolution in time of the number of positive cases in Italy of the COVID-19 pandemic based on official data and on the use of a function of the type of a Gauss error function, with four parameters, as a cumulative distribution function. We have analyzed the available data for China and Italy. The evolution in time of the number of cumulative diagnosed positive cases of COVID-19 in China very well approximates a distribution of the type of the error function, that is, the integral of a normal, Gaussian distribution. We have then used such a function to study the potential evolution in time of the number of positive cases in Italy by performing a number of fits of the official data so far available. We then found a statistical prediction for the day in which the peak of the number of daily positive cases in Italy occurs, corresponding to the flex of the fit, that is, to the change in sign of its second derivative (i.e., the change from acceleration to deceleration), as well as of the day in which a substantial attenuation of such number of daily cases is reached. We have also analyzed the predictions of the cumulative number of fatalities in both China and Italy, obtaining consistent results. We have then performed 150 Monte Carlo simulations to have a more robust prediction of the day of the above-mentioned peak and of the day of the substantial decrease in the number of daily positive cases and fatalities. Although official data have been used, those predictions are obtained with a heuristic approach since they are based on a statistical approach and do not take into account either a number of relevant issues (such as number of daily nasopharyngeal swabs, medical, social distancing, virological and epidemiological) or models of contamination diffusion.
Article
Full-text available
Recently, the number of Corona Virus Disease 2019 (COVID-19) cases has increased remarkably in South Korea, so the triage clinics and emergency departments (ED) are expected to be overcrowded with patients with presumed infection. As of March 21st, there was a total of 8,799 confirmed cases of COVID-19 and 102 related deaths in South Korea that was one of the top countries with high incidence rates [1]. This sharp increase in infection is associated with 1) outbreaks in individual provinces, 2) deployment of rapid and aggressive screening tests, 3) dedicated healthcare staffs for virus screening tests, 4) quarantine inspection data transparency and accurate data reporting, and 5) public health lessons from previous Severe Acute Respiratory Syndrome (SARS) and Middle East Respiratory Syndrome (MERS) outbreaks. This commentary introduces innovative screening tests that are currently used in South Korea for COVID-19, e.g., Drive-Through and Walk-Through tests, and compare the advantages and disadvantages of both methods.
Article
Full-text available
Objectives: Since the report of the first confirmed case in Daegu on February 18, 2020, local transmission of COVID-19 in the Republic of Korea has continued. In this study, we aimed to identify the pattern of local transmission of COVID-19 using mathematical modeling and predict the epidemic size and the timing of the end of the spread. Methods: We modeled the COVID-19 outbreak in the Republic of Korea by applying a mathematical model of transmission that factors in behavioral changes. We used the Korea Centers for Disease Control and Prevention data of daily confirmed cases in the country to estimate the nationwide and Daegu/Gyeongbuk area-specific transmission rates as well as behavioral change parameters using a least-squares method. Results: The number of transmissions per infected patient was estimated to be about 10 times higher in the Daegu/Gyeongbuk area than the average of nationwide. Using these estimated parameters, our models predicts that about 13,800 cases will occur nationwide and 11,400 cases in the Daegu/Gyeongbuk area until mid-June. Conclusion: We mathematically demonstrate that the relatively high per-capita rate of transmission and the low rate of changes in behavior have caused a large-scale transmission of COVID-19 in the Daegu/Gyeongbuk area in the Republic of Korea. Since the outbreak is expected to continue until May, nonpharmaceutical interventions that can be sustained over the long term are required.
Article
Background: SARS-CoV-2 test kits are in critical shortage in many countries. This limits large-scale population testing and hinders the effort to identify and isolate infected individuals. Objective: Herein, we developed and evaluated multi-stage group testing schemes that test samples in groups of various pool sizes in multiple stages. Through this approach, groups of negative samples can be eliminated with a single test, avoiding the need for individual testing and achieving considerable savings of resources. Study design: We designed and parameterized various multi-stage testing schemes and compared their efficiency at different prevalence rates using computer simulations. Results: We found that three-stage testing schemes with pool sizes of at most 16 samples can test up to three and seven times as many individuals with the same number of test kits for prevalence rates of around 5% and 1%, respectively. We propose an adaptive approach, where the optimal testing scheme is selected based on the expected prevalence rate. Conclusion: These group testing schemes could lead to a major reduction in the number of testing kits required and help improve large-scale population testing in general and in the context of the current COVID-19 pandemic.
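The saving from multi-stage pooling described above can be illustrated with a short calculation. The following is a minimal sketch, not the authors' simulation: it assumes a perfectly accurate test, independent infections, and a hierarchical scheme in which every positive pool is split into equal sub-pools at the next stage. The function names and the example schemes are hypothetical.

```python
from typing import Sequence


def expected_tests_per_person(prevalence: float, pool_sizes: Sequence[int]) -> float:
    """Expected number of tests per individual for hierarchical pooling.

    pool_sizes lists the pool size at each stage, e.g. (16, 4, 1):
    pools of 16, then sub-pools of 4, then individual tests.
    Assumptions (not taken from the source): perfect test accuracy,
    independent infections, and each size divides the previous one.
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must lie in [0, 1]")
    if not pool_sizes or pool_sizes[-1] != 1:
        raise ValueError("the last stage must test individuals (size 1)")
    for larger, smaller in zip(pool_sizes, pool_sizes[1:]):
        if larger <= smaller or larger % smaller != 0:
            raise ValueError("each pool size must be a proper multiple of the next")

    q = 1.0 - prevalence  # probability that one sample is negative
    # Stage 1: every individual belongs to exactly one pool of the first size.
    tests = 1.0 / pool_sizes[0]
    # Later stages: a sub-pool is tested only if its parent pool was positive,
    # which happens with probability 1 - q ** parent_size.
    for parent, child in zip(pool_sizes, pool_sizes[1:]):
        tests += (1.0 - q ** parent) / child
    return tests


def efficiency(prevalence: float, pool_sizes: Sequence[int]) -> float:
    """Individuals screened per test kit, relative to individual testing."""
    return 1.0 / expected_tests_per_person(prevalence, pool_sizes)


if __name__ == "__main__":
    for p in (0.01, 0.05):
        for scheme in ((16, 1), (16, 4, 1)):
            print(f"prevalence={p:.2f} scheme={scheme} "
                  f"efficiency={efficiency(p, scheme):.2f}x")
```

Under these simplifying assumptions, the three-stage scheme (16, 4, 1) at 1% prevalence screens roughly seven individuals per test kit, the same order as the figure quoted in the abstract. The gain shrinks as prevalence rises, which is why selecting the scheme according to the expected prevalence matters.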
Article
The effect of weather on COVID-19 spread is poorly understood. Recently, a few studies have claimed that warm weather may slow down the global pandemic, which has already affected over 1.6 million people worldwide. Clarification of such relationships in the worst affected country, the US, can be immensely beneficial to understand the role of weather in transmission of the disease in highly populated countries, such as India. We collected the daily data of new cases in 50 US states between Jan 1 and Apr 9, 2020 and also the corresponding weather information (i.e., temperature (T) and absolute humidity (AH)). Distribution modeling of new cases across AH and T helped identify the narrow and vulnerable AH range. We validated the results for 10-day intervals against monthly observations, and also worldwide trends. The results were used to predict Indian regions which would be vulnerable to weather-based spread in upcoming months of 2020. COVID-19 spread in the US is significant for states with 4 < AH < 6 g/m³ and number of new cases > 10,000, irrespective of the chosen time intervals for study parameters. These trends are consistent with worldwide observations, but do not correlate well with India so far, possibly because the total cases reported per interval are < 10,000. The results clarify the relationship between weather parameters and COVID-19 spread. The vulnerable weather parameters will help classify the risky geographic areas in different countries. Specifically, with further reporting of new cases in India, prediction of states with high risk of weather-based spread will be apparent.
Article
After the cases of COVID-19 skyrocketed, showing that it was no longer possible to contain the spread of the disease, the governments of many countries launched mitigation strategies, trying to slow the spread of the epidemic and flatten its curve. The Spanish Government adopted physical distancing measures on March 14, 13 days after the epidemic outbreak started its exponential growth. Our objective in this paper was to evaluate ex-ante (before the flattening of the curve) the effectiveness of the measures adopted by the Spanish Government to mitigate the COVID-19 epidemic. Our hypothesis was that the behavior of the epidemic curve is very similar in all countries. We employed a time series design, using information from January 17 to April 5, 2020 on the new daily COVID-19 cases from Spain, China and Italy. We specified two generalized linear mixed models (GLMM) with a response variable from the Gaussian family (i.e. linear mixed models): one to explain the shape of the epidemic curve of accumulated cases and the other to estimate the effect of the intervention. Just one day after implementing the measures, the variation rate of accumulated cases decreased daily, on average, by 3.059 percentage points (95% credibility interval: −5.371, −0.879). This reduction will be greater as time passes. The reduction in the variation rate of the accumulated cases, on the last day for which we have data, has reached 5.11 percentage points. The measures taken by the Spanish Government on March 14, 2020 to mitigate the epidemic curve of COVID-19 managed to flatten the curve and, although they have not (yet) managed to enter the decrease phase, they are on the way to doing so.
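To make the unit "percentage points" in the abstract above concrete, here is a small sketch. It assumes that the variation rate of accumulated cases means the day-over-day percentage change of the cumulative count; the abstract does not spell out the definition, so this is an interpretation. The function names are hypothetical and the case counts are invented for illustration only.

```python
from typing import List, Sequence


def variation_rate(cumulative: Sequence[float]) -> List[float]:
    """Day-over-day percentage change of a cumulative case series.

    Assumed definition (not confirmed by the source):
    rate_t = 100 * (C_t - C_{t-1}) / C_{t-1}.
    """
    rates = []
    for prev, curr in zip(cumulative, cumulative[1:]):
        if prev <= 0:
            raise ValueError("cumulative counts must be positive")
        rates.append(100.0 * (curr - prev) / prev)
    return rates


def change_in_points(rates: Sequence[float]) -> List[float]:
    """Difference between consecutive rates, in percentage points."""
    return [later - earlier for earlier, later in zip(rates, rates[1:])]


if __name__ == "__main__":
    # Invented cumulative counts, used only to show the arithmetic.
    cases = [100, 150, 180]
    rates = variation_rate(cases)
    print("variation rates (%):", rates)
    print("change (percentage points):", change_in_points(rates))
```

A fall from a 50% daily rate to a 20% daily rate is a change of −30 percentage points, even though cumulative cases are still rising. This is the sense in which a curve can flatten before it enters the decrease phase.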