Title: The serial interval of COVID-19 from publicly reported confirmed cases
Running Head: The serial interval of COVID-19
Authors: Zhanwei Du1,+, Lin Wang2,+, Xiaoke Xu3, Ye Wu4,5, Benjamin J. Cowling6, and
Lauren Ancel Meyers1,7*
Affiliations:
1. The University of Texas at Austin, Austin, Texas 78712, The United States of
America
2. Institut Pasteur, 28 rue du Dr Roux, Paris 75015, France
3. Dalian Minzu University, Dalian 116600, China
4. Computational Communication Research Center, Beijing Normal University, Zhuhai,
519087, China
5. School of Journalism and Communication, Beijing Normal University, Beijing,
100875, China
6. The University of Hong Kong, Hong Kong SAR, China
7. Santa Fe Institute, Santa Fe, New Mexico, The United States of America
Corresponding author: Lauren Ancel Meyers
Corresponding author email: laurenmeyers@austin.utexas.edu
+ These first authors contributed equally to this article
Abstract
As a novel coronavirus (COVID-19) continues to emerge throughout China and threaten the
globe, its transmission characteristics remain uncertain. Here, we analyze the serial
intervals, defined as the time between the onset of symptoms in an index (infector) case and
the onset of symptoms in a secondary (infectee) case, of 468 infector-infectee pairs with
confirmed COVID-19 reported by health departments in 18 Chinese provinces between
January 21, 2020, and February 8, 2020. The reported serial intervals range from -11 days to
20 days, with a mean of 3.96 days (95% confidence interval: 3.53-4.39), a standard deviation
of 4.75 days (95% confidence interval: 4.46-5.07), and 12.6% of reports indicating
pre-symptomatic transmission.
medRxiv preprint doi: https://doi.org/10.1101/2020.02.19.20025452. The copyright holder for
this preprint (which was not peer-reviewed) is the author/funder, who has granted medRxiv a
license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0
International license.
Keywords: Wuhan, coronavirus, epidemiology, serial interval
A new coronavirus (COVID-19) emerged in Wuhan, China in late 2019 and was declared a
public health emergency of international concern by the World Health Organization (WHO)
on January 30, 2020 (1). As of February 19, 2020, the WHO has reported over 75,204
COVID-19 infections and over 2,009 COVID-19 deaths (2), while key aspects of the
transmission dynamics of COVID-19 remain unclear (3). The serial interval of COVID-19 is
defined as the time between a primary case (infector) developing symptoms and a
secondary case (infectee) developing symptoms (4,5). Robust estimates of the
distribution of COVID-19 serial intervals are a critical input for determining the reproduction
number, which can indicate the extent of interventions required to control an epidemic (6).
However, this quantity cannot be inferred from daily case count data alone (7).
To estimate the serial interval reliably, we compiled data on 468 COVID-19
transmission events reported in mainland China outside of Hubei Province between January
21, 2020, and February 8, 2020. Each report consists of a probable date of symptom onset for
both the infector and infectee as well as the probable locations of infection for both cases.
The data include only confirmed cases that were compiled from online reports from 18
provincial centers for disease control and prevention.
Notably, 59 of the 468 reports indicate that the infectee developed symptoms earlier than the
infector. Thus, pre-symptomatic transmission may be occurring, i.e., infected persons may be
infectious before their symptoms appear. In light of these negative-valued serial intervals, we
assume that COVID-19 serial intervals follow a normal distribution rather than the more
commonly assumed gamma or Weibull distributions that are limited to strictly positive values
(8,9). We estimate a mean serial interval for COVID-19 of 3.96 days [95% CI 3.53-4.39] with a
standard deviation of 4.75 days [95% CI 4.46-5.07], which is considerably lower than reported
mean serial intervals of 8.4 days for SARS (9) and 12.6 days (10) to 14.6 days (11) for MERS.
The mean serial interval is slightly but not significantly longer when the index case is
imported (4.06 days [95% CI 3.55-4.57]) versus locally infected (3.66 days [95% CI 2.84-4.47]).
Combining these findings with published estimates for the early exponential growth rate of
COVID-19 in Wuhan (12,13), we estimate a basic reproduction number (R0) of 1.33 (6),
which is lower than published estimates that assume a mean serial interval exceeding seven
days (13–15).
These estimates reflect reported symptom onset dates for 752 cases from 93 Chinese cities,
who range in age from 1 to 90 years (mean 45.2 years and SD 17.21 years). We note three
key caveats of the analysis. First, the data are restricted to online reports of confirmed cases
and therefore may be biased towards more severe cases in areas with a high-functioning
healthcare and public health infrastructure. Second, the distribution of serial intervals varies
throughout an epidemic, with the time between successive cases contracting around the
epidemic peak (16). To provide intuition, a susceptible person is likely to become infected
more quickly if they are surrounded by two infected people rather than just one. Since our
estimates are based primarily on transmission events reported during the early stages of
outbreaks, we do not explicitly account for such compression and interpret the estimates as
basic serial intervals at the outset of an epidemic. If some of the reported infections occurred
amidst growing clusters of cases, our estimates may instead reflect effective serial intervals
that would be expected during a period of epidemic growth. Finally, rapid isolation of
symptomatic cases in some locations may have prevented longer serial intervals, potentially
biasing our estimate downwards compared to serial intervals that might be observed in an
uncontrolled epidemic.
Given the heterogeneity in type and reliability of these sources, we caution that our findings
should be interpreted as working hypotheses regarding the infectiousness of COVID-19
requiring further validation as more data become available. The potential implications for
COVID-19 control are mixed. While our lower estimates for R0 suggest easier containment,
the large number of reported asymptomatic transmission events is concerning.
Figure. Estimated serial interval distribution for COVID-19 based on 468 reported
transmission events in China between January 21, 2020 and February 8, 2020. Bars
indicate the number of infection events with the specified serial interval and blue lines indicate
fitted normal distributions for (a) all infection events (N = 468) reported across 93 cities of
mainland China by February 8, 2020, and (b) the subset of infection events (N = 122) in which
both the infector and infectee were infected in the reporting city (i.e., the index case was not
an importation from another city). Negative serial intervals (left of the vertical dotted lines)
suggest the possibility of COVID-19 transmission from asymptomatic or mildly
symptomatic cases.
Table S1. Estimated serial interval distributions based on location of index infection. We
assume that the serial intervals follow normal distributions and report the estimated means
and standard deviations for (a) all 468 infector-infectee pairs reported from 93 cities in
mainland China by February 8, 2020, (b) a subset of 122 infection events in which the index
case was infected locally, and (c) a subset of 346 infection events in which the index case was
an importation from another city. The rightmost column provides the proportion of infection
events in which the secondary case developed symptoms prior to the index case.
Group                                    Mean, days [95% CI]    SD, days [95% CI]    Proportion of serial intervals < 0
All (N = 468)                            3.96 [3.53, 4.39]      4.75 [4.46, 5.07]    12.61% (N = 59)
Locally infected index case (N = 122)    3.66 [2.84, 4.47]      4.54 [4.03, 5.20]    14.75% (N = 18)
Imported index case (N = 346)            4.06 [3.55, 4.57]      4.82 [4.48, 5.21]    11.85% (N = 41)
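The statement that the imported and locally infected groups do not differ significantly can be checked from the summary statistics in Table S1 alone. The sketch below is illustrative only: the text does not say which test the authors used, so a large-sample two-sided z-test on the difference in means (unequal variances) is assumed here, and the function name compare_means is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def compare_means(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Large-sample two-sided z-test for a difference in means, from summary statistics."""
    diff = mean_a - mean_b
    # Standard error of the difference, allowing unequal variances.
    se = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    z = diff / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, se, z, p_value


# Table S1: imported index case (N = 346) versus locally infected index case (N = 122).
diff, se, z, p_value = compare_means(4.06, 4.82, 346, 3.66, 4.54, 122)
print(f"difference = {diff:.2f} days, SE = {se:.2f}, z = {z:.2f}, p = {p_value:.2f}")
```

Under this assumed test the difference of 0.40 days is smaller than its standard error, which is consistent with the reported lack of significance.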
Supplementary Appendix
Data
We collected publicly available online data on 6,903 confirmed cases from 271 cities of
mainland China that were available as of February 8, 2020. The data were extracted in
Chinese from the websites of provincial public health departments and translated to English.
We then filtered the data for clearly indicated transmission events consisting of: (i) a known
infector and infectee, (ii) reported locations of infection for both cases, and (iii) reported dates
and locations of symptom onset for both cases. We thereby obtained 468 infector-infectee
pairs from 93 Chinese cities between January 21, 2020 and February 8, 2020 (Figure S1).
The index cases (infectors) for each pair are reported as either importations from the city of
Wuhan (N = 239), importations from cities other than Wuhan (N = 106) or local infections
(N = 122). The cases included 752 unique individuals, with 98 index cases who infected multiple
people and 17 individuals who appear as both infector and infectee. They range in age from 1
to 90 years and include 386 females, 363 males and 3 cases of unreported sex.
Inference Methods
Estimating serial interval distribution
For each pair, we calculated the number of days between the reported symptom onset date for
the infector and the reported symptom onset date for the infectee. Negative values indicate
that the infectee developed symptoms before the infector. We then used the fitdist function in
Matlab (17) to fit a normal distribution to all 468 observations. This function returns unbiased
estimates of the mean and standard deviation, with 95% confidence intervals. We applied the same
procedure to estimate the means and standard deviations with the data stratified by whether
the index case was imported or infected locally.
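The procedure above can be sketched in Python. This is not the authors' MATLAB code: the function names and example onset dates are hypothetical, and the confidence intervals use large-sample approximations (a normal quantile for the mean and the Wilson-Hilferty approximation to the chi-square quantile for the standard deviation), where a statistics package would normally use exact t and chi-square quantiles.

```python
from datetime import date
from math import sqrt
from statistics import NormalDist, mean, stdev


def serial_intervals(pairs):
    """Days from infector onset to infectee onset (negative if the infectee fell ill first)."""
    return [(infectee_onset - infector_onset).days for infector_onset, infectee_onset in pairs]


def chi2_quantile(p, df):
    """Wilson-Hilferty approximation to the chi-square quantile; adequate for large df."""
    z = NormalDist().inv_cdf(p)
    return df * (1 - 2 / (9 * df) + z * sqrt(2 / (9 * df))) ** 3


def normal_cis(mu, sigma, n, level=0.95):
    """Approximate confidence intervals for the mean and SD of a normal sample."""
    tail = (1 - level) / 2
    half_width = NormalDist().inv_cdf(1 - tail) * sigma / sqrt(n)
    mu_ci = (mu - half_width, mu + half_width)
    sigma_ci = (
        sigma * sqrt((n - 1) / chi2_quantile(1 - tail, n - 1)),
        sigma * sqrt((n - 1) / chi2_quantile(tail, n - 1)),
    )
    return mu_ci, sigma_ci


def fit_normal(intervals, level=0.95):
    """Sample mean and SD (n - 1 denominator) with approximate confidence intervals."""
    mu, sigma = mean(intervals), stdev(intervals)
    mu_ci, sigma_ci = normal_cis(mu, sigma, len(intervals), level)
    return mu, mu_ci, sigma, sigma_ci


# Hypothetical (infector onset, infectee onset) dates; not taken from the data set.
example_pairs = [
    (date(2020, 1, 21), date(2020, 1, 25)),
    (date(2020, 1, 23), date(2020, 1, 22)),  # infectee symptomatic first
    (date(2020, 1, 24), date(2020, 1, 31)),
    (date(2020, 1, 26), date(2020, 1, 28)),
]
intervals = serial_intervals(example_pairs)
mu, mu_ci, sigma, sigma_ci = fit_normal(intervals)
print(intervals, mu, round(sigma, 2))

# Plugging in the reported summary statistics for all 468 pairs.
reported_mu_ci, reported_sigma_ci = normal_cis(3.96, 4.75, 468)
print([round(x, 2) for x in reported_mu_ci], [round(x, 2) for x in reported_sigma_ci])
```

With the reported mean, standard deviation and sample size, the approximate interval for the mean comes out at roughly 3.53 to 4.39 days, in line with the reported values, and the interval for the standard deviation is close to the reported one. For small samples such as the four hypothetical pairs, the approximations are rough and exact quantiles would be preferable.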
Estimating the basic reproduction number (R0)
Given an epidemic growth rate r and a normally distributed serial interval with mean (μ) and
standard deviation (σ), the basic reproduction number is given by
$R_0 = e^{r\mu - \frac{1}{2} r^2 \sigma^2}$ (6).
Assuming our point estimates for the mean and standard deviation of the serial interval
distribution (Table S1) and a recently published estimate for the exponential growth rate of
COVID-19 infections in Wuhan of 0.10 per day (13), we estimate an R0 of 1.33.
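The arithmetic behind this estimate can be reproduced in a few lines. The sketch assumes the relationship from reference (6) for a normally distributed interval, R0 = exp(r*mu - 0.5*r^2*sigma^2), where mu and sigma are the mean and standard deviation of the serial interval; the function name is hypothetical.

```python
from math import exp


def basic_reproduction_number(growth_rate, mean_interval, sd_interval):
    """R0 for a normally distributed serial interval: exp(r*mu - 0.5*r^2*sigma^2)."""
    return exp(growth_rate * mean_interval - 0.5 * (growth_rate * sd_interval) ** 2)


# Growth rate of 0.10 per day (13) and the point estimates from Table S1.
r0 = basic_reproduction_number(0.10, 3.96, 4.75)
print(f"R0 = {r0:.2f}")  # prints "R0 = 1.33"
```

The second term in the exponent shows why a wide serial interval distribution lowers R0 for a given growth rate and mean interval.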
Supplementary Analysis
To facilitate interpretation and future analyses, we summarize key characteristics of the
COVID-19 infection report data set.
Age distribution: Of the 737 unique cases in the data set, 1.7%, 3.5%, 54.1%, 26.1% and
14.5% were ages 0-4, 5-17, 18-49, 50-64, and over 65 years, respectively. Across all
transmission events, approximately one third occurred between adults ages 18 to 49, over 99%
had an adult infector (18 or older), and ~92% had an adult infectee (18 or older) (Table S2).
Secondary case distribution: Across the 468 transmission events, there were 301 unique
infectors. The mean number of transmission events per infector is 1.55 (Figure S2) with a
maximum of 16 secondary infections reported from a 40-year-old male in Liaocheng city of
Shandong Province.
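A tally of this kind can be sketched as follows, assuming the transmission events are available as (infector, infectee) identifier pairs. The identifiers below are hypothetical and the function name is illustrative.

```python
from collections import Counter


def secondary_case_counts(pairs):
    """Number of reported transmission events attributed to each unique infector."""
    return Counter(infector for infector, _infectee in pairs)


# Hypothetical identifiers; not taken from the data set.
example_pairs = [("A", "B"), ("A", "C"), ("B", "D"), ("E", "F"), ("A", "G")]
counts = secondary_case_counts(example_pairs)
mean_per_infector = len(example_pairs) / len(counts)
print(dict(counts), round(mean_per_infector, 2), max(counts.values()))

# The reported mean follows from the totals quoted in the text: 468 events, 301 infectors.
reported_mean = 468 / 301
print(round(reported_mean, 2))
```

Note that the mean is taken over individuals who appear as an infector at least once, so it cannot fall below one and does not account for cases with no reported onward transmission.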
Geographic distribution: The 468 transmission events were reported from 93 Chinese cities
in 17 Chinese provinces and Tianjin (Figure S3). There are 22 cities with at least five
infection events and 71 cities with fewer than five infection events in the sample. The
maximum number of reports from a city is 72 for Shenzhen, which reported 339 cumulative
cases as of February 8, 2020.
Table S2. Age distribution for 457 of the 468 infector-infectee pairs. Each value denotes
the number of infector-infectee pairs in the specified age combination, with rows giving the
infector's age group and columns the infectee's. Age was not reported for the remaining 11 pairs.
                     Infectee
Infector     0-4    5-17    18-49    50-64    65+    Total
0-4            0       0        0        0      0        0
5-17           0       0        1        0      1        2
18-49         12      18      154       60     44      288
50-64          1       5       47       49     13      115
65+            0       1       22       10     19       52
Total         13      24      224      119     77      457
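The age-group shares quoted in the Supplementary Analysis can be recomputed from the counts in Table S2. The sketch below assumes that rows are infector age groups and columns are infectee age groups, as the row and column totals imply, and reads the fourth age group as 50-64 to match the age bands used in the text. Variable names are illustrative.

```python
AGE_GROUPS = ["0-4", "5-17", "18-49", "50-64", "65+"]

# Rows: infector age group; columns: infectee age group (counts from Table S2).
PAIR_COUNTS = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 1],
    [12, 18, 154, 60, 44],
    [1, 5, 47, 49, 13],
    [0, 1, 22, 10, 19],
]

total = sum(sum(row) for row in PAIR_COUNTS)
infector_totals = [sum(row) for row in PAIR_COUNTS]
infectee_totals = [sum(column) for column in zip(*PAIR_COUNTS)]

ADULT = slice(2, None)  # age groups 18-49, 50-64 and 65+
adult_infector_share = sum(infector_totals[ADULT]) / total
adult_infectee_share = sum(infectee_totals[ADULT]) / total
both_18_49_share = PAIR_COUNTS[2][2] / total

print(f"adult infector: {adult_infector_share:.1%}")
print(f"adult infectee: {adult_infectee_share:.1%}")
print(f"both aged 18-49: {both_18_49_share:.1%}")
```

Under this reading of the table, 455 of the 457 pairs have an adult infector and 420 have an adult infectee, while 154 pairs involve two adults aged 18 to 49.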
Figure S1. Geographic composition of the infection report data set. The data consist of
468 infector-infectee pairs reported by February 8, 2020 across 93 cities in mainland China.
Colors represent the number of reported events per city, which range from 1 to 72, with an
average of 5.03 (SD 8.54) infection events. The 71 cities with fewer than five events are
colored in shades of blue; the 22 cities with at least five events are colored in shades of orange.
Figure S2. Number of infections per unique index case in the infection report data set.
There are 301 unique infectors across the 468 infector-infectee pairs. The number of
transmission events reported per infector ranges from 1 to 16, with ~55% having only one.
Acknowledgments
We acknowledge the financial support from NIH (U01 GM087719) and the National Natural
Science Foundation of China (61773091).
Author Bio
Dr. Du is a postdoctoral researcher in the Department of Integrative Biology at the University
of Texas at Austin. He develops mathematical models to elucidate the transmission dynamics,
surveillance, and control of infectious diseases.
References
1. WHO | Pneumonia of unknown cause – China. 2020 Jan 30 [cited 2020 Feb 18];
Available from:
https://www.who.int/csr/don/05-january-2020-pneumonia-of-unkown-cause-china/en/
2. World Health Organization. Coronavirus disease 2019 (COVID-19): situation report, 30.
2020; Available from:
https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200219-sitrep-30-covid-19.pdf?sfvrsn=6e50645_2
3. Cowling BJ, Leung GM. Epidemiological research priorities for public health control of
the ongoing global novel coronavirus (2019-nCoV) outbreak. Euro Surveill [Internet].
2020 Feb 13; Available from:
http://dx.doi.org/10.2807/1560-7917.ES.2020.25.6.2000110
4. Giesecke J. Modern infectious disease epidemiology. CRC Press; 2017.
5. Svensson A. A note on generation times in epidemic models. Math Biosci. 2007
Jul;208(1):300–11.
6. Wallinga J, Lipsitch M. How generation intervals shape the relationship between growth
rates and reproductive numbers. Proc Biol Sci. 2007 Feb 22;274(1609):599–604.
7. Vink MA, Bootsma MCJ, Wallinga J. Serial Intervals of Respiratory Infectious
Diseases: A Systematic Review and Analysis [Internet]. Vol. 180, American Journal of
Epidemiology. 2014. p. 865–75. Available from: http://dx.doi.org/10.1093/aje/kwu209
8. Kuk AYC, Ma S. The estimation of SARS incubation distribution from serial interval
data using a convolution likelihood. Stat Med. 2005 Aug 30;24(16):2525–37.
9. Lipsitch M, Cohen T, Cooper B, Robins JM, Ma S, James L, et al. Transmission
dynamics and control of severe acute respiratory syndrome. Science. 2003 Jun
20;300(5627):1966–70.
10. Cowling BJ, Park M, Fang VJ, Wu P, Leung GM, Wu JT. Preliminary epidemiological
assessment of MERS-CoV outbreak in South Korea, May to June 2015 [Internet]. Vol.
20, Eurosurveillance. 2015. Available from:
http://dx.doi.org/10.2807/1560-7917.es2015.20.25.21163
11. Park SH, Kim Y-S, Jung Y, Choi SY, Cho N-H, Jeong HW, et al. Outbreaks of Middle
East Respiratory Syndrome in Two Hospitals Initiated by a Single Patient in Daejeon,
South Korea. Infect Chemother. 2016 Jun;48(2):99–107.
12. Jung S-M, Akhmetzhanov AR, Hayashi K, Linton NM, Yang Y, Yuan B, et al. Real
time estimation of the risk of death from novel coronavirus (2019-nCoV) infection:
Inference using exported cases [Internet]. Available from:
http://dx.doi.org/10.1101/2020.01.29.20019547
13. Li Q, Guan X, Wu P, Wang X, Zhou L, Tong Y, et al. Early Transmission Dynamics in
Wuhan, China, of Novel Coronavirus-Infected Pneumonia. N Engl J Med [Internet].
2020 Jan 29; Available from: http://dx.doi.org/10.1056/NEJMoa2001316
14. Tuite AR, Fisman DN. Reporting, Epidemic Growth, and Reproduction Numbers for the
2019 Novel Coronavirus (2019-nCoV) Epidemic [Internet]. Annals of Internal
Medicine. 2020. Available from: http://dx.doi.org/10.7326/m20-0358
15. Wu JT, Leung K, Leung GM. Nowcasting and forecasting the potential domestic and
international spread of the 2019-nCoV outbreak originating in Wuhan, China: a
modelling study. Lancet [Internet]. 2020 Jan 31; Available from:
http://dx.doi.org/10.1016/S0140-6736(20)30260-9
16. Kenah E, Lipsitch M, Robins JM. Generation interval contraction and epidemic data
analysis. Math Biosci. 2008 May;213(1):71–9.
17. Fit probability distribution object to data - MATLAB fitdist [Internet]. [cited 2020 Feb
19]. Available from: https://www.mathworks.com/help/stats/fitdist.html