All content in this area was uploaded by Georg Quaas on Apr 28, 2020
The reproduction number in the classical epidemiological model
Georg Quaas
Universität Leipzig
Grimmaische Straße 12
D-04109 Leipzig
quaas@uni-leipzig.de
April 28, 2020
Abstract:
The German Robert Koch Institute aims to “protect the population from disease and improve
their state of health” (RKI 2017). To this end, it develops concrete, research-based
recommendations for policymakers and makes data available to the expert public. Since
March 4, 2020, it has been publishing the numbers of coronavirus infections reported by
health authorities daily; since March 9, these data have included the numbers of people who
have died of COVID-19; and since March 25, the RKI has reported the estimated numbers of
those who have recovered. The important reproduction number, reported daily since April 7,
has now largely replaced all other criteria used for decision-making. This paper aims to show
that the calculation of this figure is neither theory-based nor particularly reliable.
Nevertheless, there is a simple way to determine its change more or less conservatively and
precisely.
Keywords:
Classic epidemic model, reproduction number, contact rate, COVID-19, mathematics of
highly infectious diseases, public health
JEL-Classification:
C32, C61, I12, I18, J11
1. The theoretical framework
The classical epidemiological model (CEM), compared to econometric models of moderate
size used for forecasting and simulating economic policy measures, is quite clear as well as
plausible in its assumptions. Moreover, some parts of the model have a practical use. On the
basis of daily data, it is characterized by the following equations (Hethcote 2000: 604):
(1) $\Delta S = -\beta I S / N$
(2) $\Delta I = \beta I S / N - \Delta R$
(3) $\Delta R = \gamma I$
Here, $\beta$ and $\gamma$ are parameters to be determined empirically; N is the population size; S is the
number of people susceptible to infection (at the beginning of the disease and without
information about the number of immune people in the population, it is set as equal to N); I
is the number of infected and infectious people (with an initial value close to zero); and R is
the number of people who recover and are assumed to be immune (without information,
this number is also set at zero at the beginning). The exact conceptual characteristics of
these three population groups are not entirely unimportant because the “susceptible” need
not be “healthy” and the “infected” need not be “sick” (Donsimoni et al. 2020). Whether this
model can be applied to COVID-19 depends on whether those who have recovered are
immune, at least for a while.
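The daily mechanics of equations (1) to (3) can be illustrated with a minimal Python sketch. The function name `simulate_cem` and the parameter values (a contact rate of 0.3 per day, a recovery rate of 0.1 per day, and 100 initially infectious persons) are assumptions chosen for illustration, not estimates from this paper; only the population of 83.1 million is taken from the text.

```python
def simulate_cem(beta, gamma, n, i0, days):
    """Iterate equations (1)-(3) of the classical model with a daily time step."""
    s, i, r = n - i0, i0, 0.0
    path = [(s, i, r)]
    for _ in range(days):
        new_infections = beta * i * s / n      # people leaving S, equation (1)
        new_recoveries = gamma * i             # people entering R, equation (3)
        s -= new_infections
        i += new_infections - new_recoveries   # net change of I, equation (2)
        r += new_recoveries
        path.append((s, i, r))
    return path


# Illustrative (assumed) parameters; N as given in the paper
path = simulate_cem(beta=0.3, gamma=0.1, n=83_100_000, i0=100, days=300)
s_end, i_end, r_end = path[-1]
print(round(s_end), round(i_end), round(r_end))
```

Because every person leaving S enters I and every person leaving I enters R, the sum of the three groups stays equal to N throughout the simulated period.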
The logic behind this model becomes clear when defining proportions of the two population
groups with and without active pathogens in the total population:
$i = I/N$ and $s = S/N$.
These proportions can be interpreted as the probability of accidentally encountering an
infectious or a susceptible person in the population. The probability that these persons meet
has the value $i \cdot s$ (Hamer 1906). The frequency of infections is assumed in the model to
increase with the size of the population, resulting (apart from the factor $\beta$) in the middle
term of equation (2), which we abbreviate as H:
(4) $i \cdot s \cdot N = \frac{I S N}{N^2} = \frac{I S}{N} = H$
From a practical perspective, H can be interpreted as the number of possibilities to become
infected in a population of size N: so to speak, the “abstract risk situation.”
The parameter $\beta$ reflects the actual infection process: it states how many people are
infected by one infectious person per time unit (here: per day) on average. This depends on a
variety of factors that are not explicitly included in the model, such as population density,
the number of daily interactions, and common behaviors (hygiene, shaking hands, etc.).
Some of these factors can be influenced in practice, which gives policymakers a means of
intervening in the ongoing process. However, the effect on the parameter $\beta$ is recorded
with a time delay. In the case of the novel coronavirus, the media initially assumed that each
infected person infects three more people during his or her infectious phase. The RKI reports a
basic reproduction number $R_0$ between 2 and 2.5 in its “coronavirus profile.” Initial experience
suggests that the infectious phase lasts for about 10 days, so that, as long as more accurate
estimates are unavailable, the average can be set as $\beta = 0.3$.
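The arithmetic behind this value can be sketched as follows, assuming that the reproduction number is the product of the daily contact rate and the duration of infectivity, so that the contact rate is the reproduction number divided by that duration. The function name `contact_rate_from_r0` is hypothetical.

```python
def contact_rate_from_r0(r0, infectious_days):
    """Daily contact rate implied by a reproduction number and an infectious period."""
    return r0 / infectious_days


INFECTIOUS_DAYS = 10  # duration of the infectious phase assumed in the text

# Media assumption (3) and the range reported by the RKI (2 to 2.5)
for r0 in (3.0, 2.5, 2.0):
    print(r0, round(contact_rate_from_r0(r0, INFECTIOUS_DAYS), 3))
```

Three secondary infections spread over ten days give 0.3 per day; the range reported by the RKI would correspond to contact rates between 0.2 and 0.25 per day.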
There is a fourth variable that does not play an explicit role in the above-formulated classical
model, and that is the number of deaths, D. The model has been interpreted (an der Heiden
& Buchholz 2020: p.1) in such a way that the number of deaths is included in the number of
recoveries. This sounds a bit cynical, but from a statistical perspective, another aspect is
important: If the number of deaths is empirically available, the model should be
supplemented with it. Both the number of deaths and the number of recoveries reduce the
number of infectious cases; therefore, equation (2) can be made more precise as:
(2a) $\Delta I = \beta I S / N - \Delta R - \Delta D$
The number of recoveries is determined by equation (3) if not previously estimated by the
data producer. For the number of newly reported deaths, an analogous assumption applies:
(4) $\Delta D = \delta I$
Here, the parameter $\delta$ relates the current number of newly reported deaths to the
number of persons infected at an earlier point in time. Deaths reduce the population
number. Therefore, in the case of a short-term event with potentially high numbers of
victims, it makes sense to supplement the population figure set constant in the model with
the number of deaths and treat all other variables as time-dependent:
(5) $S(t) + I(t) + R(t) = N - D(t) = N'(t)$
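A minimal sketch of the model supplemented with deaths may help to check the accounting identity (5). The function name `simulate_with_deaths`, the rates (0.3, 0.099 and 0.001 per day) and the choice to keep the initial population N in the infection term are assumptions made for illustration; time lags are omitted.

```python
def simulate_with_deaths(beta, gamma, delta, n, i0, days):
    """Daily iteration of the classical model with a separate death count D."""
    s, i, r, d = n - i0, i0, 0.0, 0.0
    for _ in range(days):
        new_infections = beta * i * s / n   # infection term with constant N (assumption)
        new_recoveries = gamma * i          # equation (3)
        new_deaths = delta * i              # death equation, same form as (3)
        s -= new_infections
        i += new_infections - new_recoveries - new_deaths  # equation (2a)
        r += new_recoveries
        d += new_deaths
    return s, i, r, d


N0 = 83_100_000
s, i, r, d = simulate_with_deaths(beta=0.3, gamma=0.099, delta=0.001,
                                  n=N0, i0=100, days=300)
# Identity (5): the living population equals the initial population minus deaths
print(abs((s + i + r) - (N0 - d)) < 1e-3)
```

The check prints whether S + I + R and N - D agree up to rounding error, which holds by construction because each death is subtracted from I and added to D.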
In order to adapt the model to empirical evidence, it is essential to add time lags to
individual variables according to the characteristics of the underlying disease. This requires
medical information (Schilling et al. 2020). Not all of the characteristics of COVID-19 are
known; for the unknown parameters, assumptions must be made to “feed” the model.
Particularly problematic is the average duration of infectivity of an infected person, which is,
as it seems, difficult to determine at this time. It should also be noted that the accumulated
number of infected persons reported by the RKI (or the number of persons reported as
infected by public health authorities) does not correspond to the number of infectious
persons $I(t)$ in the classical model.
2. Measurement of contact intensity
With the above equations, exponential growth is modeled in the first phase of the epidemic:
The numbers increase slowly at first and then explosively. But it would be
a mistake to assume exponential growth throughout and thus explain new infections. The
more the probabilities change and political measures influence the contact rate $\beta$, the more
the development deviates from exponential growth. Ultimately, the decisive factor for
reversing development is a reduction in the number of people in the population susceptible
to the pathogen either through recovery, death, or vaccination. Fig. 1 shows the course that
would have been expected according to the CEM if no measures had been taken:
Parameters of the model: 0.26; 0.026; N = 83.1 million
The number of COVID-19 victims would be about one million; this can be seen in the figure
by the slight decrease of the population curve (point C).
In the absence of a vaccine, the contact rate $\beta$ should be the focus of pragmatic
considerations, reflecting what can be influenced by policy. It is, by the way, relatively easy
to determine $\beta$ based on the daily number of new infections using equation (1) or (2),
preferably as an average to compensate for statistical variations. Since the average is always
calculated over the last X days, it is a conservative estimate, which, when it comes to
relaxing the restrictions, certainly does not encourage premature action. If the aim is to
remain at a low level of new cases and detect changes as early as possible, a 3-day average is
recommended, which also provides fairly stable measurement results.
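The estimation described above can be sketched in Python as follows. Solving equation (1) for the contact rate gives the number of new infections multiplied by N and divided by the product of I and S; a trailing mean over the last few days then smooths reporting noise. The function names and the small synthetic data set (population of 1,000, reporting errors of plus or minus 10 percent) are hypothetical, and the sketch assumes that I and S are known for each day, which is not directly the case for the reported figures.

```python
def contact_rate_series(new_infections, infectious, susceptible, n):
    """Daily contact rate from equation (1): beta = new infections * N / (I * S)."""
    return [x * n / (i * s)
            for x, i, s in zip(new_infections, infectious, susceptible)]


def trailing_mean(values, window):
    """Average over the last `window` days, available from day `window` onward."""
    return [sum(values[k - window + 1:k + 1]) / window
            for k in range(window - 1, len(values))]


# Synthetic example: true contact rate 0.3, reported counts off by +/-10 %
n = 1000.0
infectious = [10, 12, 14, 17, 20, 24]
susceptible = [990, 987, 983, 979, 974, 968]
report_error = [1.1, 0.9, 1.0, 1.1, 0.9, 1.0]
new_infections = [0.3 * i * s / n * e
                  for i, s, e in zip(infectious, susceptible, report_error)]

raw = contact_rate_series(new_infections, infectious, susceptible, n)
smooth = trailing_mean(raw, window=3)
print([round(b, 2) for b in raw])
print([round(b, 2) for b in smooth])
```

In this constructed example the raw daily values swing between 0.27 and 0.33, while the 3-day average stays at 0.3; with real data the smoothing is of course less perfect.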
The “replacement number” $R(t)$, which is sometimes called the “reproduction number” in
the literature as well as by the RKI, is more descriptive and, thus, easier to communicate to the
public. It represents the number of secondary infections that a typical infected person
produces during the period $T$ of infectivity (Hethcote 2000: 603 f.). By definition,
the following applies:
(6) $R(t) = \beta(t) \cdot T$
In principle, the problem with this figure is that, for COVID-19, the duration $T$ of being
infectious is not yet known exactly and must be replaced by an assumption. However, since
the main factor in the ongoing process is the change in the contact rate, this is not an
obstacle to using either $R(t)$ or $\beta$ as a basis for decision-making.
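Why the uncertainty about the duration of infectivity does not affect the assessment of changes can be shown with a short sketch: if the reproduction number is the contact rate multiplied by a constant duration, the relative change of the reproduction number equals the relative change of the contact rate for any assumed duration. The two contact rates and the three durations below are hypothetical values chosen for illustration.

```python
def reproduction_number(beta, infectious_days):
    """Equation (6): contact rate per day times the duration of infectivity."""
    return beta * infectious_days


beta_before, beta_after = 0.30, 0.08  # hypothetical contact rates

ratios = []
for infectious_days in (8, 10, 12):
    r_before = reproduction_number(beta_before, infectious_days)
    r_after = reproduction_number(beta_after, infectious_days)
    ratios.append(r_after / r_before)
    print(infectious_days, round(r_before, 2), round(r_after, 2),
          round(r_after / r_before, 3))
```

The level of the reproduction number depends on the assumed duration, but the ratio in the last column is the same in every row, namely the ratio of the two contact rates.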
In Germany, policymaking was initially purely pragmatic and based on the available data,
using the number of days in which the number of infected persons doubles. For reasons that
are difficult to understand, a doubling time of 14 days, the value noted on April 15, one day
before the first relaxing of restrictions, was favored. Although criticism of this highly
error-prone number is justified (Kaßmann 2020), from an epidemiological perspective it cannot
be replaced by purely mathematical methods.
3. The reproduction number of the RKI
The RKI began reporting the value for the following variable in its daily situation reports on
April 7, 2020:
The reproduction number is the number of persons in average infected by a case.
This number can only be estimated and not directly extracted from the notification
system. The current estimation is R = 1.3 (1.0-1.6). This is based on the number of
cases with disease onset between 31/03/2020-03/04/2020 and 27/03/2020-
30/03/2020 and an average generation time of 4 days. Cases with more recent
disease onset are not included because their low number would lead to an unstable
estimation. (Daily Situation Report, RKI 4/7/2020)
The RKI’s estimate reflects the state of affairs dating back at least three days. Due to late
notifications by health authorities, the actual number of new infections is correct only after
at least three days. These figures are reported on the “dashboard” recommended by the RKI.
After aggregation, they should correspond to the current figures in the daily situation
reports, but this is not the case.
Reference is made to the methodological principles in the situation report dated April
13. The preparation of the data for analysis is described in detail by an der Heiden and Hamouda
(2020: 10 ff.): With the help of the average delay in the reporting system, the time of the
event of infection is first determined (in some cases entirely anew). Based on this, the
time-dependent reproduction rate according to the RKI is determined by assuming that it
takes an average of four days for one infected person to infect the next (“generation time”).
In another publication, the presumed duration of infectivity is given as $T = 10$ days (an der
Heiden & Buchholz 2020). According to the RKI profile, the virus could still be detected in
some infected persons eight days after the outbreak of the disease. Together with the
average of two days of an infectious prelude during the incubation period, one must,
therefore, expect an average of ten days of infectivity. Based on these facts, the RKI’s
method has to be critically evaluated for theoretical and empirical reasons:
With a constant generation time of 4 days, R is the quotient of the number of new
cases in two consecutive periods of 4 days each. If the number of new cases has
increased in the second time period, R is above 1. If the number of new cases is the
same in both time periods, the reproduction number is 1. This then corresponds to a
linear increase in the number of cases. If, on the other hand, only every second case
infects another person, i.e. R = 0.5, then the number of new infections is halved
within the generational period. (an der Heiden & Hamouda 2020: 13)
The use of the generation time instead of the much longer infectivity period in the
theoretical justification of the method raises doubts. Confronting this method with
the CEM shows that the doubts are justified.
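The quotient described in the quotation above can be written down in a few lines. This is only a sketch of the verbal description, not the RKI's nowcasting procedure; the function name `rki_reproduction_number` and the case counts are hypothetical.

```python
def rki_reproduction_number(new_cases, generation_time=4):
    """Ratio of new cases in the last window to new cases in the window before."""
    if len(new_cases) < 2 * generation_time:
        raise ValueError("need at least two full windows of daily case counts")
    recent = sum(new_cases[-generation_time:])
    previous = sum(new_cases[-2 * generation_time:-generation_time])
    return recent / previous


print(rki_reproduction_number([100] * 8))             # same number in both periods
print(rki_reproduction_number([100] * 4 + [50] * 4))  # new cases halved
print(rki_reproduction_number([100] * 4 + [130] * 4)) # new cases increased
```

The three calls reproduce the cases named in the quotation: equal case numbers give 1, halved case numbers give 0.5, and an increase gives a value above 1 (here 1.3). Note that neither the contact rate nor the duration of infectivity enters the calculation.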
The basis for the calculation of the reproduction figure is the number of newly infected (or
reported as infected) people $\Delta I$, which is part of equation (2). If the numbers of those
recovered and deceased are disregarded, the new cases are summed over two successive periods
(days 1 to 4 and days 5 to 8), and the second sum is divided by the first, this yields the
reproduction number of the RKI:
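(7) $R_{RKI} = \frac{\sum_{i=5}^{8} \Delta I_i}{\sum_{i=1}^{4} \Delta I_i} = \frac{\sum_{i=5}^{8} \beta_i I_i S_i / N}{\sum_{i=1}^{4} \beta_i I_i S_i / N} = \frac{\sum_{i=5}^{8} \beta_i H_i}{\sum_{i=1}^{4} \beta_i H_i} \approx \frac{\beta_{5,\ldots,8}}{\beta_{1,\ldots,4}} \neq \beta(t) \cdot T = R(t)$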
Under the simplifying assumption that the term for the frequencies does not change in the
period in question, the fraction can be reduced, and one can see that the authors do not
calculate the reproduction number $R(t)$ defined by the epidemiological model according to
equation (6), but the (multiplicatively expressed) change in the four-day contact rates.
However, since the frequencies H change over time to a similar extent as the contact rate,
this effect overlays the change in the contact rate, so that, strictly speaking, we learn
nothing about it – and, correspondingly, nothing about the change in the reproduction
number, which expresses exactly the change in the contact rate (since T is constant).
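A small numerical sketch may make the overlay effect concrete. It evaluates the ratio of equation (7) for a contact rate that falls by 20 percent between the two four-day periods, once with constant frequencies H and once with frequencies that grow by 10 percent per day. All numbers and the function name `window_ratio` are hypothetical.

```python
def window_ratio(betas, frequencies):
    """Ratio from equation (7): sum of beta_i * H_i, days 5-8 over days 1-4."""
    terms = [b * h for b, h in zip(betas, frequencies)]
    return sum(terms[4:8]) / sum(terms[0:4])


betas = [0.30] * 4 + [0.24] * 4                   # contact rate falls by 20 %
constant_h = [1000.0] * 8                         # frequencies unchanged
growing_h = [1000.0 * 1.1 ** k for k in range(8)] # frequencies grow 10 % per day

ratio_constant = window_ratio(betas, constant_h)
ratio_growing = window_ratio(betas, growing_h)
print(round(ratio_constant, 3), round(ratio_growing, 3))
```

With constant H the ratio equals the change in the contact rate (0.8). With growing H the same decline in the contact rate yields a ratio of about 1.17, that is, a value above 1 although the contact rate has fallen.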
Mathematically speaking, $R_{RKI}$ is a rough measure of the curvature of the curve formed by
the accumulated infection numbers (see Fig. 1). The curve is convex in the initial phase of
the epidemic (“convex” in economic terminology), i.e., it yields $R_{RKI} > 1$ because the
subsequent differences increase (the area around point A in Fig. 1). At a certain point, which
mathematicians call the “inflection point,” the curve becomes concave because the
subsequent differences are smaller than the previous ones: Now $R_{RKI} < 1$ (point B in
Fig. 1).
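The relation between curvature and the four-day ratio can be checked on a simulated course of the classical model. In the sketch below the contact rate is held constant, so the model's own reproduction number does not change at all, yet the ratio moves from above 1 to below 1 once the daily new infections have passed their peak (the inflection point of the accumulated curve). The parameter values (0.3 and 0.1 per day, 100 initial cases) and the function names are assumptions for illustration.

```python
def daily_new_infections(beta, gamma, n, i0, days):
    """New infections per day from the classical model with a daily time step."""
    s, i = n - i0, i0
    series = []
    for _ in range(days):
        new = beta * i * s / n
        s -= new
        i += new - gamma * i
        series.append(new)
    return series


def four_day_ratio(series, end):
    """Sum of days end-4 .. end-1 divided by the sum of the four days before."""
    return sum(series[end - 4:end]) / sum(series[end - 8:end - 4])


series = daily_new_infections(beta=0.3, gamma=0.1, n=83_100_000, i0=100, days=200)
peak_day = series.index(max(series))  # inflection point of the accumulated curve

ratio_before = four_day_ratio(series, end=30)            # convex phase
ratio_after = four_day_ratio(series, end=peak_day + 30)  # concave phase
print(ratio_before > 1, ratio_after < 1)
```

Both comparisons hold in this run, which illustrates that the ratio describes the shape of the curve rather than the contact rate, which was constant by construction.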
$R_{RKI}$ provides a purely phenomenological characteristic of the epidemic’s course, while the
reproduction number is closer to the mechanism that determines the course. In the RKI
method, the four-day “generation time” merely determines the length of the measurement
interval for averaging errors in the data. If there were no errors, one would have to say that
the shorter the measuring interval, the more precisely the curvature is recorded; the longer
the measuring interval, the more inaccurate.
The curvature parameter $R_{RKI}$ was reported with a value below 1 for the first time on April 16,
2020. At that time, the reproduction number $R(t)$, calculated from a conservative 10-day
average of the contact rate and a very conservative duration of infectivity of $T = 12$ days, was
already at 0.8. Figure 2 compares the two “reproduction numbers” (possible only at point D):
Data source: Daily Situation Reports of the RKI as well as own calculations from CEM.
As the dot-dash line in Fig. 2 shows, the RKI’s measurements have, thus far, been quite
volatile and, surprisingly, have remained constant for four days since April 18. It is not very
likely that the reproduction number $R(t)$ will fluctuate as sharply under the regime of reduced
contacts, and it should not remain constant under the condition of relaxing the restrictions.
The increase in
the RKI's reproduction figures for April 27 reflects a slight fluctuation in infection figures in
the 17th week of the year, i.e. one week before the first nationwide loosening of
restrictions. No value was reported for April 19 due to “technical changes”; the missing value
was inserted in the graph by the mean value between values reported on April 18 and 20.
One could object to the preference expressed in this paper for the true reproduction number
$R(t)$ (according to the CEM) that the RKI tracks the actual course more closely with its
method. However, if one considers that the infection figures are, to a large extent,
modified by the RKI prior to analysis, this is also doubtful. If, from an epidemiological
perspective, it is important to keep an eye on the number of new infections, then the
contact rate $\beta$ and also the number of possibilities H to become infected in a population of
83 million people (the abstract risk situation) should be targeted, even if only the contact
rate can be influenced politically. The theoretically expected number H of possibilities to
become infected can help to explain the number of new infections and provide a realistic
picture of the risk situation.
4. Conclusion
Under the authoritative expert advice of the RKI, German policymakers have succeeded in
decisively reducing the contact rate in a relatively short time and, thereby, containing the
spread of the first wave of the coronavirus epidemic. Furthermore, it will be important to
prevent this rate from rising again, if possible, despite the easing of restrictions. In view of
the high and, likely, increasing economic costs of each additional day of contact restriction, it
would have been advisable to use a theoretically well-thought-out model to calculate the
contact rate and the corresponding reproduction number, the parameters of which reflect
actual, effective circumstances. Moreover, the classical epidemiological model allows more
up-to-date and stable estimates.
References:
an der Heiden, M., U. Buchholz (2020): Modellierung von Beispielszenarien der
SARS-CoV-2-Epidemie 2020 in Deutschland, DOI 10.25646/6571.2
an der Heiden, M., O. Hamouda (2020): Schätzung der aktuellen Entwicklung der
SARS-CoV-2-Epidemie in Deutschland – Nowcasting, Epid. Bull. 2020, Nr. 17, S. 10 – 15, DOI
10.25646/6692
Donsimoni, J. R., R. Glawion, B. Plachter, K. Wälde (2020): Projektion der COVID-19-Epidemie
in Deutschland, Wirtschaftsdienst, 100. Jahrgang, Heft 4, S. 272-276
Hamer, W. H. (1906): Epidemic Disease in England, Lancet, 1, pp. 733-739
Hethcote, H. W. (2000): The Mathematics of Infectious Diseases, SIAM Review, Vol. 42, No.
4. (Dec., 2000), pp. 599-653
Kaßmann, M. (2020): Die Fallstricke der Verdoppelung, URL:
https://www.math.uni-bielefeld.de/kassmann/index.php
Robert-Koch-Institut (2017): Leitbild, URL:
https://www.rki.de/DE/Content/Institut/Leitbild/Leitbild_node.html
RKI: SARS-CoV-2 Steckbrief zur Coronavirus-Krankheit-2019 (COVID-19), URL:
www.rki.de/DE/Content/InfAZ/N/Neuartiges_Coronavirus/Steckbrief.html
RKI: Tägliche Situationsberichte ab 4. März 2020, URL:
https://www.rki.de/DE/Content/InfAZ/N/Neuartiges_Coronavirus/Situationsberichte/Archiv.html
Schilling, J., M. Diercke, D. Altmann, W. Haas, S. Buda (2020): Vorläufige Bewertung der
Krankheitsschwere von COVID-19 in Deutschland basierend auf übermittelten Fällen
gemäß Infektionsschutzgesetz, Epid. Bull. 2020, Nr. 17, S. 3 – 9, DOI 10.25646/6670