Decoding the evolution and transmissions of the novel
pneumonia coronavirus (SARS-CoV-2 / HCoV-19)
using whole genomic data
Wen-Bin Yu1,2,*, Guang-Da Tang3,4, Li Zhang5, Richard T. Corlett1,2
1Center for Integrative Conservation, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Mengla, Yunnan
666303, China
2Center of Conservation Biology, Core Botanical Gardens, Chinese Academy of Sciences, Mengla, Yunnan 666303, China
3Henry Fok College of Biology and Agriculture, Shaoguan University, Shaoguan, Guangdong 512005, China
4College of Forestry and Landscape Architecture, South China Agricultural University, Guangzhou, Guangdong 510642, China
5Chinese Institute for Brain Research, Beijing 102206, China
ABSTRACT
The outbreak of COVID-19 started in mid-December
2019 in Wuhan, China. Up to 29 February 2020,
SARS-CoV-2 (HCoV-19 / 2019-nCoV) had infected
more than 85 000 people in the world. In this study,
we used 93 complete genomes of SARS-CoV-2 from
the GISAID EpiFlu™ database to investigate the
evolution and human-to-human transmissions of
SARS-CoV-2 in the first two months of the outbreak.
We constructed haplotypes of the SARS-CoV-2
genomes, performed phylogenomic analyses and
estimated the potential population size changes of
the virus. The date of population expansion was
calculated based on the expansion parameter tau (τ)
using the formula t=τ/(2u). A total of 120 substitution
sites with 119 codons, including 79 non-synonymous
and 40 synonymous substitutions, were found in
eight coding-regions in the SARS-CoV-2 genomes.
Forty non-synonymous substitutions are potentially
associated with virus adaptation. No recombination events
were detected. The 58 haplotypes (31 found in
samples from China and 31 from outside China)
were identified in 93 viral genomes under study and
could be classified into five groups. By applying the
reported bat coronavirus genome (bat-RaTG13-CoV)
as the outgroup, we found that haplotypes H13 and
H38 might be considered as ancestral haplotypes,
and later H1 was derived from the intermediate
haplotype H3. The population size of the SARS-CoV-2
was estimated to have undergone a recent
expansion on 06 January 2020, and an early
expansion on 08 December 2019. Furthermore,
phyloepidemiologic approaches have recovered
specific directions of human-to-human transmissions
and the potential sources for international infected
cases.
Keywords: COVID-19; HCoV-19; SARS-CoV-2;
Novel pneumonia outbreak; Human-to-human
transmission; Phyloepidemiology
Received: 01 March 2020; Accepted: 27 April 2020; Online: 30 April
2020
Foundation items: This study was supported by grants from Ten
Thousand Talents Program of Yunnan for Top-notch Young Talents,
and the open research project of “Cross-Cooperative Team” of the
Germplasm Bank of Wild Species, Kunming Institute of Botany,
Chinese Academy of Sciences
*Corresponding author, E-mail: yuwenbin@xtbg.ac.cn
DOI: 10.24272/j.issn.2095-8137.2020.022
Open Access
This is an open-access article distributed under the terms of the
Creative Commons Attribution Non-Commercial License
(http://creativecommons.org/licenses/by-nc/4.0/), which permits unrestricted
non-commercial use, distribution, and reproduction in any medium,
provided the original work is properly cited.
Copyright ©2020 Editorial Office of Zoological Research, Kunming
Institute of Zoology, Chinese Academy of Sciences
ZOOLOGICAL RESEARCH
Science Press, Zoological Research 41(3): 247−257, 2020
INTRODUCTION
Betacoronaviruses are characterized by enveloped, positive-sense,
single-stranded RNA, and hosted in animals,
particularly mammals (Cui et al., 2019). Before December
2019, four species/strains of Betacoronavirus, HKU1, MERS-CoV,
OC43, and SARS-CoV, had been reported to cause
severe human diseases (Cui et al., 2019). The fifth
species/strain, a novel betacoronavirus SARS-CoV-2 / HCoV-19
/ 2019-nCoV (Gorbalenya et al., 2020; Jiang et al., 2020)
causing human pneumonia (i.e., COVID-19), was first reported
in Wuhan, Hubei, Central China (Wu et al., 2020a; Zhou et al.,
2020; Zhu et al., 2020). Up to 29 February 2020, SARS-CoV-2
had infected more than 85 000 people in all provinces/regions
of China, and another 59 countries/regions across Africa,
Asia, Europe, North America, Oceania, and South America
(Wikipedia, 2020). Because SARS-CoV-2 can transmit from
human to human (Li et al., 2020), the massive exodus of
people before the Chinese Spring Festival boosted the
infection frequencies, as predicted (Wu et al., 2020b). Daily
confirmed infection cases were more than 2 000 between 30
January and 16 February, 2020, and the highest was more
than 15 100 (Wikipedia, 2020), almost twice the total number
for SARS-CoV (Chan-Yeung & Xu, 2003).
As a member of subgenus Sarbecovirus, SARS-CoV-2 has
been suggested to be of bat origin (Lu et al., 2020b; Zhou et
al., 2020), and may have been transmitted to humans through
non-bat intermediate mammals (e.g., pangolins (Cyranoski,
2020; Lam et al., 2020; Wong et al., 2020; Xiao et al., 2020;
Zhang et al., 2020b)). Medical information for the first 41
infected patients in Wuhan showed that 27 patients were
linked to the Huanan Seafood Wholesale Market (abbreviated
as Huanan Market in the text below) (Huang et al., 2020; Li et
al., 2020), which sold living wild mammals. This suggests a
high possibility that SARS-CoV-2 originated in the market,
then the infected people transmitted it to other people outside
of the market. However, this conclusion has been challenged
because the first identified infected person and 12 others had
no link to the Huanan Market. Some researchers have
therefore argued that the Huanan Market was not the original
and/or only source of SARS-CoV-2 transmission to humans
(Cohen, 2020). The market was closed on 01 January 2020,
making it very difficult to identify the intermediate animal
vectors of SARS-CoV-2. In the absence of information on
potential intermediary reservoirs, the origin and transmission
pattern of SARS-CoV-2 are still unresolved (Wong et al.,
2020).
Since the outbreak of COVID-19 was first identified in
Wuhan in mid-December 2019, the first infected individuals
identified in other provinces and regions of China, and other
countries, during January 2020, have been assumed to have
been infected in Wuhan or through contact with people from
Wuhan (Chan et al., 2020; Holshue et al., 2020; Phan et al.,
2020; Rothe et al., 2020). In this study, we used 93 genomes
of SARS-CoV-2 from the GISAID EpiFlu™ database (Shu &
McCauley, 2017) (access date 12 February 2020) to decode
the evolution and transmissions of SARS-CoV-2 in the first
two months of its spread. Our aims were to: (1) characterize
genomic variations of SARS-CoV-2; (2) infer the evolutionary
relationships of the worldwide samples; and (3) deduce the
transmission history of SARS-CoV-2 within Wuhan and out of
Wuhan to the world.
MATERIALS AND METHODS
To decode the evolutionary history of SARS-CoV-2, we
retrieved 96 complete genomes from GISAID (Supplementary
Table S1, access by 12 February 2020) (Shu & McCauley,
2017). The genome EPI_ISL_402131 (bat-RaTG13-CoV,
hereafter) from GISAID was also included as the outgroup,
because it is the closest sister betacoronavirus available to
SARS-CoV-2 (Zhou et al., 2020). The 97 genome sequences
were aligned using MAFFT (Katoh & Standley, 2013), then the
alignment was manually checked using Geneious (Biomatters,
New Zealand). In the alignment, we found that
EPI_ISL_404253 contains six ambiguous sites at variable
positions and EPI_ISL_407079 and EPI_ISL_408978 have
175 “N” and 1476 “N” bases, respectively, so these three
genomes were excluded in this study. In addition, four
genomes (EPI_ISL_407071, EPI_ISL_407894, EPI_ISL_407896,
and EPI_ISL_409067) have their own private
ambiguous sites, which were conservatively replaced by the
common nucleotide at that position in the alignment. Notably,
EPI_ISL_406592 (H15) and EPI_ISL_406595 (H17) had
excessive amounts of private variable sites, which were
possibly affected by sequencing errors. In the alignment, the
5' untranslated region (UTR) and 3' UTR regions contain
missing and ambiguous sites, so these regions were excluded
in the following analyses.
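The quality-control steps described above (dropping genomes with many "N" bases, then replacing private ambiguous sites with the common nucleotide in that alignment column) can be illustrated with a small Python sketch. This is not the procedure actually run in Geneious; the sequence names, the toy alignment, and the `MAX_N` threshold are hypothetical and chosen only for illustration.

```python
from collections import Counter

MAX_N = 2  # hypothetical cut-off; the study does not state a numeric threshold


def count_n(seq):
    """Count fully ambiguous 'N' bases in one aligned sequence."""
    return seq.upper().count("N")


def fill_private_ambiguities(aligned):
    """Replace ambiguity codes with the most common A/C/G/T in that column.

    `aligned` maps a sequence name to an aligned sequence; all sequences
    are assumed to have the same length. Gaps ('-') are left untouched.
    """
    names = list(aligned)
    length = len(aligned[names[0]])
    out = {name: list(seq.upper()) for name, seq in aligned.items()}
    for i in range(length):
        counts = Counter(out[n][i] for n in names if out[n][i] in "ACGT")
        if not counts:
            continue  # no unambiguous base available in this column
        common = counts.most_common(1)[0][0]
        for n in names:
            if out[n][i] not in "ACGT-":
                out[n][i] = common
    return {n: "".join(chars) for n, chars in out.items()}


# Toy alignment with hypothetical names: g3 carries one private ambiguity (R),
# g4 carries too many N bases and is dropped before the replacement step.
toy_alignment = {
    "g1": "ACGTAC",
    "g2": "ACGTAC",
    "g3": "ACRTAC",
    "g4": "NNNNAC",
}
kept = {n: s for n, s in toy_alignment.items() if count_n(s) <= MAX_N}
cleaned = fill_private_ambiguities(kept)
```

Replacing an ambiguity with the column consensus is conservative in the sense used above: it can only remove apparent variation, never create a new substitution site.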
The alignment was then imported into DnaSP (Rozas et al.,
2017) for haplotype analyses. Population size changes were
estimated based on a constant population size hypothesis
using DnaSP, in combination with neutrality tests (Tajima’s D
and Fu’s Fs). We also used Arlequin (Excoffier & Lischer,
2010) to test the sudden population expansion hypothesis and
to calculate the expansion parameter tau (τ), since the sudden
population expansion was not rejected. We used the formula
t=τ/(2u) (Rogers & Harpending, 1992) to estimate the time since
expansion (in days). In the formula, u is the cumulative
substitution rate per year for the genome sequence, so we
used the formula u=μk to calculate it, where μ is the
substitution rate per site per year, and k is the genome
sequence length (29 358 bp for the coding sequence (CDS)
matrix). The substitution rate was set as 0.92×10⁻³ (95% CI,
0.33×10⁻³–1.46×10⁻³) substitution/site/year based on the most
recent estimation for SARS-CoV-2 (Rambaut, 2020). To adjust
the time, we used a mean value of the expansion time
calculated from the three substitution rates, i.e., 0.33×10⁻³,
0.92×10⁻³, and 1.46×10⁻³ substitution/site/year. In addition,
the expansion date was estimated based on the sampling date
from hospitalized patients. The estimated date should be later
than the “real” date of massive human-to-human transmission
events.
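As a worked illustration of the formula t=τ/(2u) with u=μk, the following Python sketch evaluates the expansion time for the three substitution rates and averages them. The value of τ is the one reported for all 93 genomes in the Results; the conversion factor of 365 days per year is an assumption, since the conversion from years to days is not spelled out in the text.

```python
TAU = 2.887                          # expansion parameter tau (93 genomes, see Results)
K_SITES = 29_358                     # length of the CDS matrix in bp
RATES = (0.33e-3, 0.92e-3, 1.46e-3)  # substitutions/site/year (lower, point, upper)
DAYS_PER_YEAR = 365                  # assumption: simple 365-day year


def expansion_days(tau, mu, k, days_per_year=DAYS_PER_YEAR):
    """Time since expansion in days, t = tau / (2u) with u = mu * k."""
    u = mu * k                       # cumulative substitution rate per genome per year
    return tau / (2 * u) * days_per_year


per_rate_days = [expansion_days(TAU, mu, K_SITES) for mu in RATES]
mean_days = sum(per_rate_days) / len(per_rate_days)

for mu, days in zip(RATES, per_rate_days):
    print(f"mu = {mu:.2e}: {days:.2f} days")
print(f"mean of the three estimates: {mean_days:.2f} days")
```

Under these assumptions the fastest rate gives the shortest time and the slowest rate the longest, and the mean of the three estimates lands within a few hundredths of a day of the 28.72 days reported in the Results.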
Phylogenetic networks of the haplotype coding region matrix
and 120 substitution sites of SARS-CoV-2 (Supplementary
Datasets) were inferred using SplitsTree (Huson & Bryant,
2006). A median-joining network of haplotypes was generated
by the NETWORK program (Bandelt et al., 1999, 2020) with
the reported bat coronavirus (bat-RaTG13-CoV, (Zhou et al.,
2020)) as the outgroup. Transversions were arbitrarily
weighted three times as high as transitions. Hypervariable
sites (if number of mutations ≥5) were weighted as 1, and the
other sites were weighted as 10. Genetically, SARS-CoV-2
and bat-RaTG13-CoV, as well as SARS-CoV, have been
proposed as the same species (Gorbalenya et al., 2020), and
genome sequence identity between bat-RaTG13-CoV and
SARS-CoV-2 was 96.2%. We carefully used three datasets
(i.e., four core substitution sites, 120 substitution sites, and
1 235 substitution sites, Supplementary Datasets) to evaluate
the relationship between bat-RaTG13-CoV and four
associated/central haplotypes of SARS-CoV-2 (H1, H3, H13,
and H38). Phylogenomic analyses of haplotypes were
performed using IQ-TREE (Minh et al., 2020). We conducted
likelihood mapping and SH-like approximate likelihood ratio
tests to assess the phylogenetic information and branch
supports, respectively.
RESULTS AND DISCUSSION
Genomic variations of SARS-CoV-2
Genome size of SARS-CoV-2 varied from 29 782 bp to 29 903
bp. The aligned matrix was 29 910 bp in length, including 140
variable sites. The CDS regions contained 120 substitution
sites (Supplementary Figure S1), which were classified as 58
haplotypes (Supplementary Table S2). Nucleotide diversity
(Pi) was 0.15×10⁻³±0.02×10⁻³ (standard deviation, SD,
hereafter). Haplotype diversity (Hd) was 0.953±0.016 (SD) and
variance of Hd was 0.26×10⁻³.
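For readers unfamiliar with the two summary statistics above, the sketch below computes haplotype diversity and nucleotide diversity on a toy alignment. It uses the common textbook estimators (Hd = n/(n−1)·(1−Σp²) and Pi as mean pairwise differences per site); that DnaSP applies exactly these estimators to this dataset is an assumption, and the toy sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Hd = n/(n-1) * (1 - sum(p_i^2)), with p_i the haplotype frequencies."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


def nucleotide_diversity(seqs):
    """Pi = mean number of pairwise differences, divided by sequence length."""
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total_diffs / len(pairs) / len(seqs[0])


# Hypothetical toy alignment: four genomes, three distinct haplotypes.
toy = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACTTACGT"]
hd = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
n_haplotypes = len(set(toy))
```

Identical sequences collapse into a single haplotype, which is why a sample dominated by one haplotype (such as H1 here) still yields a high Hd when most of the remaining haplotypes are seen only once.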
There were 120 substitution sites found in eight coding
sequence (CDS) regions of SARS-CoV-2 (Figure 1,
Supplementary Table S2), including 79 transitions (65.83%)
and 41 transversions (34.17%). A chi-squared test showed
that the distribution of substitution sites across CDS regions in
the genome was even (χ²=1.958, df=9, P=0.99). Substitution
sites occurring at the 1st to 3rd frame positions were 27
(22.50%), 44 (36.67%), and 49 (40.83%), respectively. The 120
substitution sites were associated with 119 codons, including
79 non-synonymous (66.39%) and 40 synonymous (33.61%)
substitutions. Forty non-synonymous substitutions (50.63%)
changed the biochemical properties of the amino acid (AA),
and are therefore potentially associated with virus adaptation.
The current samplings showed that the H1 haplotype has
been found in 19 patients, but most haplotypes were just
sequenced once, suggesting that the haplotype H1 was
rapidly circulated at an early stage of human-to-human
transmissions (Figure 2, Supplementary Table S1).
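The transition/transversion split reported above follows from the chemical class of the two bases involved in each substitution. A minimal Python sketch of that classification is given here; the list of toy substitutions is hypothetical, while the last two lines simply recompute the reported shares from the counts of 79 and 41 out of 120.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Return 'transition' (purine<->purine or pyrimidine<->pyrimidine)
    or 'transversion' (purine<->pyrimidine) for a single-base change."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        raise ValueError("ref and alt must differ")
    pair = {ref, alt}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


# Hypothetical substitutions, for illustration only.
toy_changes = [("C", "T"), ("A", "G"), ("G", "T"), ("T", "C"), ("A", "C")]
counts = Counter(classify_substitution(r, a) for r, a in toy_changes)

# Shares implied by the counts reported in the text (79 and 41 of 120 sites).
ts_share = round(100 * 79 / 120, 2)
tv_share = round(100 * 41 / 120, 2)
```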
Figure 1 Summary information for 120 substitution sites across eight coding sequence regions in the aligned SARS-CoV-2 genomic
sequences
In comparisons with published genomes of SARS-CoV (Luk
et al., 2019) and MERS-CoV (Cotten et al., 2013), genomic
variations of SARS-CoV-2 are still low, without evident
recombination sites/regions (Rm=2, P=1.0) at this time.
According to the collection dates of the sequenced samples,
haplotypes H1 and H3 were found in two samples at intervals
of more than 30 days, and multiple samples over 20 days
(Figure 2, Supplementary Table S1). Although the incubation
period can be over 24 days, there was only one case of this
out of 1 099 observations (Guan et al., 2020). Estimation of
the substitution rates using 90 genomes of SARS-CoV-2
(Rambaut, 2020) showed that the rate for SARS-CoV-2 was
close to or lower than the rates for MERS-CoV (Cotten et al.,
2014; Dudas et al., 2018) and SARS-CoV (Zhao et al., 2004).
Due to the mild symptoms and low mortality (Yang et al.,
2020; Zhang et al., 2020a), the immune systems of the
infected humans may provide a suitable environment for
propagation of SARS-CoV-2 (Andersen et al., 2020). SARS-CoV-2
is highly infectious (Yang et al., 2020) and is able to
infect humans not only through the mucous membranes of the
nose and mouth, but also through the mucous membranes of the
eyes (Lu et al., 2020a), which may boost regional circulation
and large-scale spread. Some large mutations may have
occurred in Wuhan or other regions, but the strict quarantine
policy over China since 23 January 2020 may have reduced
the circulation and spreading of some mutants.
Of the 93 genomes of SARS-CoV-2, 39 (41.94%) were from
infected patients in 11 countries outside China and encoded
31 haplotypes (Hd=0.987±0.009 (SD), Pi=0.16×10⁻³±
0.01×10⁻³), with 27 nationally/regionally private haplotypes.
The 54 genomes (58.06%) from China also encoded 31
haplotypes (Hd=0.906±0.001 (SD), Pi=0.14×10⁻³±0.03×10⁻³).
A proportion Z-test showed significant differences in haplotype
diversity of samples between China and other countries
(χ²=4.024, df=1, P<0.05). The high haplotype diversity found in
samples from other countries may be because the sampling
dates were mostly after 22 January 2020, while those in China
were before this date (Supplementary Table S1 and Figure
S2). In addition, the low level of radiation exposure on
long-distance international flights (Bottollier-Depois et al., 2000)
may have accelerated mutation rates of SARS-CoV-2 (Shibai
et al., 2017).
Figure 2 Genomic haplotypes of SARS-CoV-2 changes between the collection dates of samples
The confirmed samples from the Huanan Seafood Wholesale Market are indicated using red circles, and a confirmed sample with no link to the
market is indicated using a blue circle.
Population size expansion of SARS-CoV-2
We used a variety of parameters to estimate the population
dynamics of SARS-CoV-2. Constant population size of SARS-CoV-2
was rejected (Ramos-Onsins and Rozas’s R2=0.025,
P<0.001; Raggedness r=0.011, P<0.05) using DnaSP (Rozas
et al., 2017) (also see Supplementary Figure S3), while both
Fu's test (Fs=–67.681.964, P<0.001) and Tajima's D test
(D=–2.701, P<0.001) indicated that the population size of
SARS-CoV-2 was rapidly increasing. Mismatch distribution
analysis using Arlequin (Excoffier & Lischer, 2010) strongly
supported that the population of SARS-CoV-2 underwent
sudden expansion (τ=2.887, Sum of Squared deviation,
SSD=0.541×10⁻³, P=0.88, Harpending's Raggedness index,
R=0.010, P=0.88). The calculated expansion was 28.72 days
(95% confidence interval: 12.29–54.36 days) ago. Of the 93
genomes, the latest one was sampled on 03 February 2020,
so the estimated expansion date was on 06 January 2020
(95% CI: 11 December 2019–22 January 2020), which may be
related to the New Year holiday. Before 06 January 2020, 129
patients were identified as SARS-CoV-2 infected through field
investigations (Li et al., 2020). Of 22 genomes (17.05% of 129
patients) sequenced before 06 January 2020 in Wuhan,
China, 13 haplotypes (22.41% of 58 haplotypes) were
recovered, which were H1 and its derived descendant
haplotypes, and H3 (Figures 2 and 3A). Coincidentally, the
China CDC (Chinese Center for Disease Control and
Prevention) started to activate a Level-2 emergency response
on 06 January 2020 (Li et al., 2020). The China CDC’s
emergency response greatly reduced public activities and
travel, and might have reduced the local circulation and
large-scale spread in the following weeks of January.
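The step from "28.72 days ago" to a calendar date is plain date arithmetic counted back from the latest sampling date. The Python sketch below shows one way to do it; truncating the estimate to whole days is an assumption (the text does not state a rounding rule), adopted here because it reproduces the dates quoted above.

```python
from datetime import date, timedelta

LATEST_SAMPLE = date(2020, 2, 3)  # latest sampling date among the 93 genomes


def expansion_date(latest_sample, days_ago):
    """Calendar date `days_ago` days before the latest sample.

    Assumption: fractional days are truncated to whole days.
    """
    return latest_sample - timedelta(days=int(days_ago))


point_estimate = expansion_date(LATEST_SAMPLE, 28.72)
earliest_bound = expansion_date(LATEST_SAMPLE, 54.36)  # slowest substitution rate
latest_bound = expansion_date(LATEST_SAMPLE, 12.29)    # fastest substitution rate

print(point_estimate, earliest_bound, latest_bound)
```

Because the genomes come from hospitalized patients, any date obtained this way is anchored to sampling rather than infection, which is why the Methods note that the estimate should be later than the real onset of massive transmission.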
Furthermore, mismatch distribution analysis of the 22
genomes before 06 January 2020 also showed a sudden
population expansion of SARS-CoV-2 at an earlier stage of
transmission (τ=2.818, SSD=0.010, P=0.41, R=0.046, P=0.57,
Tajima’s D=–2.241, P<0.001; Fu's Fs=–7.834, P<0.001). This
earlier population expansion time was estimated at 28.38 days
(95% CI: 12.00–54.36 days) before 05 January 2020, which
was the latest sampling date of the 22 genomes. This earlier
expansion date was thus estimated to have occurred on 08
December 2019 (95% CI: 13 November 2019–26 December
2019), when there was only one infected patient officially
reported (Huang et al., 2020; Li et al., 2020). This suggests
that SARS-CoV-2 might have already circulated widely among
humans in Wuhan before December 2019, probably beginning
in mid to late November (Rambaut, 2020).
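The negative Tajima's D values quoted above reflect an excess of rare variants, which is the expected signature of a recently expanding population. The sketch below implements the standard Tajima (1989) statistic on a hypothetical star-like toy sample in which every variant is a singleton; it is an illustration of the statistic only and does not reproduce the DnaSP or Arlequin analyses or their significance tests.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D from equal-length aligned sequences (no gaps assumed)."""
    n = len(seqs)
    # S: number of segregating (variable) alignment columns
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if s == 0:
        raise ValueError("no segregating sites")
    # pi: mean number of pairwise differences (per sequence, not per site)
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical star-like sample: one central haplotype plus four
# descendants that each differ from it by a single private mutation.
star_sample = ["AAAAA", "TAAAA", "ATAAA", "AATAA", "AAATA"]
d_star = tajimas_d(star_sample)
print(f"Tajima's D for the toy sample: {d_star:.3f}")
```

A central haplotype surrounded by one-step satellites, as in the H1 group, is exactly the configuration that pushes D below zero.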
Evolutionary relationships of SARS-CoV-2 haplotypes
Phylogenetic networks showed that the 58 haplotypes were
clustered into two main clades (Figure 4). Clade I included 19
haplotypes and Clade II included 39 haplotypes. The outgroup
bat-RaTG13-CoV was connected to Clade I, supposed to be
an ancestral clade for Clade II. The long branches of H15 and
H17 correspond to an excessive amount of mutations, which
are possibly affected by sequencing errors, but this is still to
be determined. Three different datasets were used to infer
evolutionary networks, which consistently supported H13 and
H38 as the potentially ancestral haplotypes, i.e., the outgroup
bat-RaTG13-CoV could connect to both H13 and H38, or H38
alone, or through a medium vector mv1 (an intermediate host
or the first infected humans) connected to both H13 and H38
by single mutations at positions 18067 (S, synonymous
substitution) and/or 29102 (S), referring to the numbering of
the alignment length 29 910 bp (Figure 5). Five main groups
can be recognized in the network using the dataset of 120
substitution sites (Figure 3A). The H1, H3, and H13 were three
core haplotypes, so that Groups A–C were recognized using
them as the central (i.e., ancestral super-spreader)
haplotypes. Groups D and E were recognized based on two
new super-spreader haplotypes, H56 and a medium vector
mv2, which was a hypothesized (often ancestral) haplotype
not sampled in the current samples. These two groups can be
also treated as subgroups of Group C. Moreover, the SH-like
approximate likelihood ratio test further supported the
phylogenetic relationship retrieved from 58 haplotypes, in which
either H13 or H38 (with H45) appeared in the basal lineages
(Supplementary Figure S4), although it was difficult to
distinguish which one, H13 or H38, originated earlier
(Supplementary Figure S5).
Figure 3 Evolutionary relationship and geographical distribution of 58 haplotypes of SARS-CoV-2 (A, B)
Proposed evolutionary paths (C) of haplotypes and possible transmission and spreading routes (D) are also inferred based on evolutionary
analyses and epidemiologic research. Sample sizes of haplotypes and regions are annotated in the circles.
Figure 4 Phylogenetic networks of 58 haplotypes of SARS-CoV-2 with the outgroup bat-RaTG13-CoV using the whole coding region
matrix (A) and 120 substitution sites (B), referring to all variable sites of coding regions in SARS-CoV-2 haplotypes
The bottom scale bars represent the number of substitutions per site. The open box on branches of bat-RaTG13-CoV (A), H15 (A, B), and H17 (A,
B) indicates that a long branch was clipped.
In the network, four satellite haplotypes and H35 connected
to H13 (Group A), and nine satellite haplotypes and H38+H45
and H50 connected to H3 (Group B). The connections
between the H3 and H1 are two mutations at positions 8789
(S) and 28151 (NS, non-synonymous substitution) (Figure 5);
the latter mutation changed both residues and the biochemical
properties of the AA. This biochemical change may be
associated with the infectivity of SARS-CoV-2. The H1
haplotype, the most abundant, included 19 samples, while 26
satellite haplotypes and H40+(H43 and H47) haplotypes are
directly derived from H1 (Group C). Moreover, five haplotypes
of Group D and four haplotypes of Group E were also derived
from H1.
The Huanan Seafood Wholesale Market boosted human-to-human
transmission at an early stage
Phylogenetic networks showed that bat-RaTG13-CoV was
nested with Group B in Clade I, and Clade II tends to be
derived from Clade I, i.e., H1 and its descendant haplotypes
were new mutants from an ancestral haplotype in Clade I. The
rooted network suggested two potential evolutionary paths of
available haplotypes that can be from H13 through H3 to H1
and H38, or from H38 through H3 to H1 and H13 (Figure 3C).
Both scenarios suggested that H3 might be the ancestral
haplotype of H1. H13 was only recovered from five Shenzhen
(Guangdong Province) samples, including patient 2 of the
familial cluster (Chan et al., 2020). Two derived haplotypes
were also only found in Shenzhen of Guangdong Province
(H14 from the grandson of patient 2), and the other three
haplotypes were found in three samples from Japan and one
sample from Arizona in the United States (Figure 3).
According to an epidemiological study, the Shenzhen family
could have been infected during their visit to Wuhan (Chan et
al., 2020). This suggests that H13 might have originated from
Wuhan. Genetically, haplotypes of Group A have links to only
Wuhan haplotype H3 (only EPI_ISL_406801). It is possible
that H13 was newly derived from H3 (Figure 3C) and did not
spread in Wuhan, or that three repatriated Japanese might be
infected by an unknown source of H13 in Wuhan, China or
somewhere else (The Asahi Shimbun, 2020), or that no
samples have been sequenced yet. H38 has three genomes
from the same patient (Supplementary Table S1), who was
the first identified infected patient in the United States
(Holshue et al., 2020). This patient might have been infected
while visiting his family in Wuhan, China, or was infected in
some other place. The original source of H38 can be
explained as that of H13, which can be also derived from H3
(Figure 3C), and the derived H45 was from a Chongqing
patient who was reported as working in Wuhan and had no
link to the Huanan Market.
The H3 haplotype has only one sample from Wuhan, which
was not linked to the Huanan Market (Lu et al., 2020b), and
the other samples in this group were from outside of Wuhan
(Figure 3A). Notably, all the samples from the Market
belonged to H1 or its derived haplotypes (H2, H8-H12, see
Figure 2 and Supplementary Table S1), indicating that
infections circulated within the market in the short term.
Other researchers have argued that the source of the
coronavirus in the Market was imported from elsewhere,
or at least was not the single source of SARS-CoV-2
(Cohen, 2020). In this study, evolutionary relationships
indicated that H1 and its descendant haplotypes from the
Market should be derived from H3 (Figures 3, 4). H3 mutated
to the H1 by two substitutions, and none of the currently
available Market samples encoded H3, suggesting that H3
might have originated and spread outside of the Market before
an early stage of population expansion. The non-synonymous
mutation from H3 to H1 might have enhanced the
infectiousness of SARS-CoV-2, and a functional
characterization should be performed to confirm this
speculation. It is possible that SARS-CoV-2 in the Market had
been transmitted from other places (Figure 3D), or at least,
that the Market did not host the original source of SARS-CoV-2
(Cohen, 2020). As the first identified infected patients had no
link to the Market (Huang et al., 2020), it is possible that
infected humans transmitted the H1 haplotype of SARS-CoV-2
to workers or sellers in the market, after which it rapidly
circulated there due to its special surroundings. The crowded
market boosted SARS-CoV-2 transmissions to buyers and
spread it to the whole city in early December 2019,
corresponding to the estimated population expansion time.
Due to insufficient sampling from Wuhan in the currently
available samples, it is not clear whether H3 never appeared
in the Market, or H1 was quickly derived from H3 to adapt in
the Market.
Figure 5 The inferred relationships between the outgroup bat-RaTG13-CoV and four associated/central node haplotypes (H1, H3, H13,
and H38) of SARS-CoV-2 using three datasets
The dataset of four core substitution sites is the four variable sites shared among bat-RaTG13-CoV and the four central node haplotypes. The
dataset of 120 substitution sites refers to all variable sites of coding regions in SARS-CoV-2 haplotypes. The dataset of 1 235 substitution sites
refers to all variable sites of coding regions among bat-RaTG13-CoV and SARS-CoV-2 haplotypes.
Regional and worldwide circulation and spread
Of the 54 genomes from patients in China, Chongqing (3
samples), Guangdong (18), Hubei (22), Taiwan (2), and
Zhejiang (4) have more than one sample, and the other five
provinces have one sample each. Hubei (Wuhan) samples dated
from 24 December 2019 to 05 January 2020 encoded 13
haplotypes, belonging to Groups C (H1 and 11 satellite
haplotypes) and B (only H3). These relationships indicated a
rapid transmission and circulation of SARS-CoV-2 in Wuhan
at an early stage of human-to-human transmissions. H1 (no
satellite haplotypes) and H3 are the ancestors of haplotypes
outside of Wuhan/Hubei because most of early confirmed
patients might have history in Wuhan or Hubei. Eighteen
Guangdong samples, collected from 10–23 January 2020,
encoded 15 haplotypes, belonging to Groups A, C, and E,
showing that there were multiple sources imported into
Guangdong. Three haplotypes (H14, H15, and H17) may have
evolved locally, indicating that human-to-human transmissions
happened when SARS-CoV-2 initially spread to Shenzhen in
Guangdong Province (Chan et al., 2020). Two samples from
Taiwan Province, China, encoded H3 and H24 in Groups B
and D, respectively, and three samples from Chongqing
encoded H1, H40, and H45 in Groups B and C, respectively.
There were two sources imported into these two provinces.
Four Zhejiang samples encoded H1 and H24 in Group C,
which might be only imported from the source of the H1
haplotype.
The samples outside China encoded 31 haplotypes
belonging to Groups A–E. Of these, 27 haplotypes are private
by regional samplings, only two samples from Thailand were
the H1 haplotype, one each from Australia and Belgium were
the H3 haplotype, one from the United States was the H19
haplotype, and one from Singapore was the H40 haplotype.
Twelve samples, encoding 10 haplotypes, were from patients
in five countries in Asia. Six haplotypes linked to H1, and two
each linked to H3 and H1, respectively, indicating the 12
patients were infected by different sources. Human-to-human
transmissions may have happened from patients with H53 to
H52 haplotypes in Tokyo, Japan. Five Australian samples,
encoding six haplotypes in Groups B, C, and D, were from
patients of three states. Patients with H3, H25 and H26, and
with H55 were in Groups B and C, respectively, and
human-to-human transmission might have happened from the
patients with H25 to H26, who were in a same tour group in
Queensland (AAP reporters, 2020). The connection between
the patients with H56 and H27 is not clear. One possibility is
that there was an intermediary spreader with H56, who also
transmitted SARS-CoV-2 to other patients in France, the
United States, and Taiwan Province of China. Eight European
samples, encoding seven haplotypes, were from patients in
four countries. The patients in England were reported as a
household transmission from H28 to H29 (Lillie et al., 2020).
The patients in France may have been infected by three
different sources, i.e., H44 was linked to H1, H43 might link to
H40 (in Chongqing, Singapore or somewhere else), and H30
might link to an intermediary spreader with H56. Of the 13
genomes from the United States, three were from the same
patient in Washington encoding the same haplotype H38,
while the other ten samples encoded eight haplotypes,
covering all five groups (Figure 3A, B), so the sources of
infections are complicated. There is no evidence of
human-to-human transmission in the United States from these 11 cases.
To clarify the exact origins of these haplotypes outside China,
we need more epidemiological investigative efforts and more
SARS-CoV-2 genomic data from patients at the early stage of
transmissions.
Phylogenetic approaches provide insights into the
epidemiology of SARS-CoV-2
Epidemiological study of SARS-CoV-2 using traditional
approaches is very difficult, because it was not identified as a
new coronavirus until 29 December, and some infected people
with mild symptoms or without symptoms (Heymann & Shindo,
2020; Rothe et al., 2020; Wu & McGoogan, 2020) may have
been overlooked in late November and early December.
Evolutionary analyses suggested that the source of the H1
haplotype in the Huanan Market was imported from
elsewhere, as has been suggested by other researchers
(Cohen, 2020). The rooted network suggested that H13 and
H38 should be ancestral haplotypes that connected to the
outgroup bat-RaTG13-CoV through a hypothesized
intermediate haplotype (Figure 5). The most common
ancestral haplotype was missed because the currently
available samples do not include the first identified infected
patient and other patients from early December, and because
of the relatively high mutation rate of the viral genome. If there
are any frozen samples from those patients, it would be worth
doing genomic sequencing for phyloepidemiologic study to
help to locate the birthplace of SARS-CoV-2. Meanwhile, we
expect that the H13 and H38 haplotypes might be found in
some samples from infected patients in Wuhan or in other
places across the world if more samples are sequenced in
future. This will be very helpful in the search for the original
sources of SARS-CoV-2, because both H13 and H38 tend to
be ancestral haplotypes.
The evolutionary network of haplotypes can be used to
recover the directions of human-to-human transmissions at
the local scale and spread at the larger scale. The central
haplotype can be considered as the super-spreader
haplotype, and the tip haplotype is the most recent
descendant, similar to the definition and use of mtDNA
haplogroupsintracinghumandemographichistory(Yaoetal.,
2002). The transmission direction can be identified using the
connectioninformationoftipsandbranches.Forexample,the
confirmed patients from the Huanan Market shared the
common ancestral haplotype H1, indicating they might be
infectedfromacommonsource,whomayhavebeenasuper-
spreader in the market. This approach has recovered
potentiallyspecificdirectionsofhuman-to-humantransmission
in the Shenzhen family (H13 → H14), the Queensland tour
group (H25 → H26), the England family (H28 → H29), and the
Japanese (H53 → H52). It is possible that some infections
could link to Wuhan or Hubei directly or indirectly, because the
patients claimed connections to Wuhan or Hubei, but for some
of them it is not clear exactly where they were infected. We
suspect that there were super-spreaders mediating the spread
of SARS-CoV-2 at the early stage of transmissions.
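The reading of tips and central haplotypes described above can be illustrated with a minimal sketch. The haplotype labels (Ha to He) and the links between them are hypothetical and are not the network in Figure 5; the sketch also assumes that "central" can be approximated by the node with the most connections and "tip" by a node with exactly one connection.

```python
from collections import defaultdict

# Hypothetical toy network: undirected links between made-up haplotype labels.
# These are illustrative only and do not reproduce the paper's network.
edges = [("Ha", "Hb"), ("Ha", "Hc"), ("Ha", "Hd"), ("Hd", "He")]

def classify(edges):
    """Return the most connected haplotype and the list of tip haplotypes."""
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    # Tips have exactly one connection (assumed proxy for recent descendants).
    tips = sorted(h for h, d in degree.items() if d == 1)
    # The hub has the most connections (assumed proxy for a central haplotype).
    hub = max(sorted(degree), key=lambda h: degree[h])
    return hub, tips

hub, tips = classify(edges)
print(hub, tips)
```

In this toy case the direction of spread would be read outward from the hub toward the tips, which mirrors how pairs such as H13 → H14 are interpreted in the text.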
Our findings showed that SARS-CoV-2 has not had
legitimate recombination. Thus, the haplotype-based
phyloepidemiologic analyses provide a powerful way to
understand the evolution of SARS-CoV-2 at the very early
stage of transmission when reverse mutations and illegitimate
recombination are rare. In our analysis, recombination is
rejected, but the outgroup bat-RaTG13-CoV is relatively highly
diverged from SARS-CoV-2 haplotypes, which may affect the
phyloepidemiologic analyses. Based on the estimated
mutation rate of current SARS-CoV-2 viruses, the reverse
mutations should be 6×10⁻³ (0.92×10⁻³ × 0.92×10⁻³ per site per
year × 29358 sites × 2/12 year), with a negligible influence on
our result. But our observations leave one important question:
why are ancestral haplotypes, like H13 and H38, less frequent
than H1? It is highly possible that H1 acquired adaptive
mutations, such as the NS at site 28151, from H3 or H13 (and/or
H38), and evolved in an independent circulation after they jumped
into intermediate hosts or were directly transmitted to humans,
which should be investigated in future studies if more early
genome datasets are available. The exact original sources of
H13 and H38 will stay as unsolved mysteries if the early stage
samples were not preserved.
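As a rough check on the reverse-mutation estimate in this paragraph, the product can be evaluated directly. This is only a sketch that multiplies the factors exactly as they are printed in the text (the variable names are ours); it makes no claim about how the authors rounded or derived the quoted figure.

```python
# Factors copied from the text; variable names are illustrative assumptions.
rate = 0.92e-3   # substitutions per site per year
sites = 29358    # number of sites
years = 2 / 12   # two months expressed in years

# Product as written: rate x rate x sites x time
expected_reverse = rate * rate * sites * years
print(f"{expected_reverse:.1e}")
```

Evaluated this way the product is a few times 10⁻³, the same order of magnitude as the value quoted above and far below one expected reverse mutation, which is consistent with the statement that the influence on the result is negligible.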
An early version of our manuscript was posted at ChinaXiv
(DOI: 10.12074/202002.00033) on 19 February 2020. Since
then, there have been many news stories stemming from our
manuscript with a biased interpretation of the results. This was
beyond our expectations. During the review of this manuscript,
there were some reports of analyses of SARS-CoV-2 genomic
variations based on a larger sample size (e.g., Forster et al.,
2020; Tang et al., 2020), which showed a phylogenetic
pattern similar to the one we present here. We expect that more
data-mining of the increasing number of SARS-CoV-2 genomes will
provide updated insights into the origin and transmission of this virus.
SUPPLEMENTARY DATA
Supplementary data to this article can be found online.
COMPETING INTERESTS
The authors declare that they have no competing interests.
AUTHORS’ CONTRIBUTIONS
W.B.Y. conceived the research, analyzed the data, interpreted
the results, and wrote the draft manuscript; W.B.Y. and G.D.T.
collected data. All authors read and approved the final version
of the manuscript.
ACKNOWLEDGEMENTS
We are grateful to scientists and researchers for depositing
whole genomic sequences of Novel Pneumonia Coronavirus
(SARS-CoV-2 / HCoV-19 / 2019-nCoV) at the Global Initiative
on Sharing All Influenza Data (GISAID) EpiFlu™; to the GISAID
database for allowing us to access the sequences for
non-commercial scientific research; and to two reviewers for their
valuable comments and suggestions. This study was
supported by grants from the Ten Thousand Talents Program of
Yunnan for Top-notch Young Talents, and the open research
project of “Cross-Cooperative Team” of the Germplasm Bank
of Wild Species, Kunming Institute of Botany, Chinese
Academy of Sciences.
REFERENCES
AAP Reporters. 2020 (2020-01-30). Coronavirus outbreak: second case
confirmed in Queensland. https://7news.com.au/lifestyle/health-wellbeing/
qld-coronavirus-case-remains-in-isolation-c-671500.
Andersen KG, Rambaut A, Lipkin WI, Holmes EC, Garry RF. 2020. The
proximal origin of SARS-CoV-2. Nature Medicine, 26(4): 450−452.
Bandelt HJ, Forster P, Röhl A. 1999. Median-joining networks for inferring
intraspecific phylogenies. Molecular Biology and Evolution, 16(1): 37−48.
Bandelt HJ, Forster P, Röhl A. 2020 (2020-02-01). Free phylogenetic
network software. https://www.fluxus-engineering.com/sharenet.htm.
Bottollier-Depois JF, Chau Q, Bouisset P, Kerlau G, Plawinski L,
Lebaron-Jacobs L. 2000. Assessing exposure to cosmic radiation during
long-haul flights. Radiation Research, 153(5): 526−532.
Chan JFW, Yuan SF, Kok KH, To KKW, Chu H, Yang J, Xing FF, Liu JL,
Yip CCY, Poon RWS, Tsoi HW, Lo SKF, Chan KH, Poon VKM, Chan WM,
Ip JD, Cai JP, Cheng VCC, Chen H, Hui CKM, Yuen KY. 2020. A familial
cluster of pneumonia associated with the 2019 novel coronavirus indicating
person-to-person transmission: a study of a family cluster. The Lancet,
395(10223): 514−523.
Chan-Yeung M, Xu RH. 2003. SARS: epidemiology. Respirology, 8(S1):
S9−S14.
Cohen J. 2020. Wuhan seafood market may not be source of novel virus
spreading globally. Science, doi: 10.1126/science.abb0611.
Cotten M, Watson SJ, Kellam P, Al-Rabeeah AA, Makhdoom HQ, Assiri A,
Al-Tawfiq JA, Alhakeem RF, Madani H, AlRabiah FA, Hajjar SA, Al-Nassir
WN, Albarrak A, Flemban H, Balkhy HH, Alsubaie S, Palser AL, Gall A,
Bashford-Rogers R, Rambaut A, Zumla AI, Memish ZA. 2013. Transmission
and evolution of the Middle East respiratory syndrome coronavirus in Saudi
Arabia: a descriptive genomic study. The Lancet, 382(9909): 1993−2002.
Cotten M, Watson SJ, Zumla AI, Makhdoom HQ, Palser AL, Ong SH, Al
Rabeeah AA, Alhakeem RF, Assiri A, Al-Tawfiq JA, Albarrak A, Barry M,
Shibl A, Alrabiah FA, Hajjar S, Balkhy HH, Flemban H, Rambaut A, Kellam
P, Memish ZA. 2014. Spread, circulation, and evolution of the Middle East
respiratory syndrome coronavirus. mBio, 5(1): e01062−13.
Cui J, Li F, Shi ZL. 2019. Origin and evolution of pathogenic coronaviruses.
Nature Reviews Microbiology, 17(3): 181−192.
Cyranoski D. 2020. Did pangolins spread the China coronavirus to people?.
Nature, doi: 10.1038/d41586-020-00364-2.
Dudas G, Carvalho LM, Rambaut A, Bedford T. 2018. MERS-CoV spillover
at the camel-human interface. eLife, 7: e31257.
Excoffier L, Lischer HEL. 2010. Arlequin suite ver 3.5: a new series of
programs to perform population genetics analyses under Linux and
Windows. Molecular Ecology Resources, 10(3): 564−567.
Forster P, Forster L, Renfrew C, Forster M. 2020. Phylogenetic network
analysis of SARS-CoV-2 genomes. Proceedings of the National Academy
of Sciences of the United States of America, doi: 10.1073/pnas.2004999117.
Gorbalenya AE, Baker SC, Baric RS, de Groot RJ, Drosten C, Gulyaeva
AA, Haagmans BL, Lauber C, Leontovich AM, Neuman BW, Penzar D,
Perlman S, Poon LLM, Samborskiy DV, Sidorov IA, Sola I, Ziebuhr J,
Coronaviridae Study Group of the International Committee on Taxonomy of
Viruses. 2020. The species severe acute respiratory syndrome-related
coronavirus: classifying 2019-nCoV and naming it SARS-CoV-2. Nature
Microbiology, 5(4): 536−544.
Guan WJ, Ni ZY, Hu Y, Liang WH, Ou CQ, He JX, Liu L, Shan H, Lei CL,
Hui DSC, Du B, Li LJ, Zeng G, Yuen KY, Chen RC, Tang CL, Wang T,
Chen PY, Xiang J, Li SY, Wang JL, Liang ZJ, Peng YX, Wei L, Liu Y, Hu
YH, Peng P, Wang JM, Liu JY, Chen Z, Li G, Zheng ZJ, Qiu SQ, Luo J, Ye
CJ, Zhu SY, Zhong NS. 2020. Clinical characteristics of coronavirus
disease 2019 in China. The New England Journal of Medicine, doi:
10.1056/NEJMoa2002032.
Heymann DL, Shindo N. 2020. COVID-19: what is next for public health?.
The Lancet, 395(10224): 542−545.
Holshue ML, DeBolt C, Lindquist S, Lofy KH, Wiesman J, Bruce H, Spitters
C, Ericson K, Wilkerson S, Tural A, Diaz G, Cohn A, Fox L, Patel A, Gerber
SI, Kim L, Tong SX, Lu XY, Lindstrom S, Pallansch MA, Weldon WC, Biggs
HM, Uyeki TM, Pillai SK. 2020. First case of 2019 novel coronavirus in the
United States. The New England Journal of Medicine, 382(10): 929−936.
Huang CL, Wang YM, Li XW, Ren LL, Zhao JP, Hu Y, Zhang L, Fan GH, Xu
JY, Gu XY, Cheng ZS, Yu T, Xia JA, Wei Y, Wu WJ, Xie XL, Yin W, Li H,
Liu M, Xiao Y, Gao H, Guo L, Xie JG, Wang GF, Jiang RM, Gao ZC, Jin Q,
Wang JW, Cao B. 2020. Clinical features of patients infected with 2019
novel coronavirus in Wuhan, China. The Lancet, 395(10223): 497−506.
Huson DH, Bryant D. 2006. Application of phylogenetic networks in
evolutionary studies. Molecular Biology and Evolution, 23(2): 254−267.
Jiang S, Shi Z, Shu Y, Song J, Gao GF, Tan W, Guo D. 2020. A distinct
name is needed for the new coronavirus. The Lancet, 395(10228): 949.
Katoh K, Standley DM. 2013. MAFFT multiple sequence alignment software
version 7: improvements in performance and usability. Molecular Biology
and Evolution, 30(4): 772−780.
Lam TTY, Shum MHH, Zhu HC, Tong YG, Ni XB, Liao YS, Wei W, Cheung
WYM, Li WJ, Li LF, Leung GM, Holmes EC, Hu YL, Guan Y. 2020.
Identification of 2019-nCoV related coronaviruses in Malayan pangolins in
southern China. bioRxiv, doi: 10.1101/2020.02.13.945485.
Li Q, Guan X, Wu P, Wang X, Zhou L, Tong Y, Ren R, Leung KSM, Lau
EHY, Wong JY, Xing X, Xiang N, Wu Y, Li C, Chen Q, Li D, Liu T, Zhao J,
Liu M, Tu W, Chen C, Jin L, Yang R, Wang Q, Zhou S, Wang R, Liu H, Luo
Y, Liu Y, Shao G, Li H, Tao Z, Yang Y, Deng Z, Liu B, Ma Z, Zhang Y, Shi
G, Lam TTY, Wu JT, Gao GF, Cowling BJ, Yang B, Leung GM, Feng Z.
2020. Early transmission dynamics in Wuhan, China, of novel
coronavirus-infected pneumonia. The New England Journal of Medicine,
382(13): 1199−1207.
Lillie PJ, Samson A, Li A, Adams K, Capstick R, Barlow GD, Easom N,
Hamilton E, Moss PJ, Evans A, Ivan M, PHE Incident Team, Taha Y,
Duncan CJA, Schmid ML, the Airborne HCID Network. 2020. Novel
coronavirus disease (Covid-19): The first two patients in the UK with person
to person transmission. Journal of Infection, 80(5): 578−606.
Lu CW, Liu XF, Jia ZF. 2020a. 2019-nCoV transmission through the ocular
surface must not be ignored. The Lancet, 395(10224): e39.
Lu RJ, Zhao X, Li J, Niu PH, Yang B, Wu HL, Wang WL, Song H, Huang
BY, Zhu N, Bi YH, Ma XJ, Zhan FX, Wang L, Hu T, Zhou H, Hu ZH, Zhou
WM, Zhao L, Chen J, Meng Y, Wang J, Lin Y, Yuan JY, Xie ZH, Ma JM, Liu
WJ, Wang DY, Xu WB, Holmes EC, Gao GF, Wu GZ, Chen WJ, Shi WF,
Tan WJ. 2020b. Genomic characterisation and epidemiology of 2019 novel
coronavirus: implications for virus origins and receptor binding. The Lancet,
395(10224): 565−574.
Luk HKH, Li X, Fung J, Lau SKP, Woo PCY. 2019. Molecular epidemiology,
evolution and phylogeny of SARS coronavirus. Infection, Genetics and
Evolution, 71: 21−30.
Minh BQ, Schmidt H, Chernomor O, Schrempf D, Woodhams M, von
Haeseler A, Lanfear R. 2020. IQ-TREE 2: new models and efficient
methods for phylogenetic inference in the genomic era. Molecular Biology
and Evolution, doi: 10.1093/molbev/msaa015.
Phan LT, Nguyen TV, Luong QC, Nguyen TV, Nguyen HT, Le HQ, Nguyen
TT, Cao TM, Pham QD. 2020. Importation and human-to-human
transmission of a novel coronavirus in Vietnam. The New England Journal
of Medicine, 382(9): 872−874.
Rambaut A. 2020 (2020-02-12). Phylodynamic analysis | 129 genomes | 24
Feb 2020. http://virological.org/t/phylodynamic-analysis-90-genomes-12-feb-2020/356.
Rogers AR, Harpending H. 1992. Population growth makes waves in the
distribution of pairwise genetic differences. Molecular Biology and Evolution,
9(3): 552−569.
Rothe C, Schunk M, Sothmann P, Bretzel G, Froeschl G, Wallrauch C,
Zimmer T, Thiel V, Janke C, Guggemos W, Seilmaier M, Drosten C,
Vollmar P, Zwirglmaier K, Zange S, Wölfel R, Hoelscher M. 2020.
Transmission of 2019-nCoV infection from an asymptomatic contact in
Germany. The New England Journal of Medicine, 382(10): 970−971.
Rozas J, Ferrer-Mata A, Sánchez-DelBarrio JC, Guirao-Rico S, Librado P,
Ramos-Onsins SE, Sánchez-Gracia A. 2017. DnaSP 6: DNA sequence
polymorphism analysis of large datasets. Molecular Biology and Evolution,
34(12): 3299−3302.
Shibai A, Takahashi Y, Ishizawa Y, Motooka D, Nakamura S, Ying BW,
Tsuru S. 2017. Mutation accumulation under UV radiation in Escherichia
coli. Scientific Reports, 7: 14531.
Shu YL, McCauley J. 2017. GISAID: Global initiative on sharing all influenza
data - from vision to reality. Eurosurveillance, 22(13): 30494.
Tang X, Wu C, Li X, Song Y, Yao X, Wu X, Duan Y, Zhang H, Wang Y,
Qian Z, Cui J, Lu J. 2020. On the origin and continuing evolution of
SARS-CoV-2. National Science Review, doi: 10.1093/nsr/nwaa1036.
The Asahi Shimbun. 2020 (2020-02-02). Japan tightens immigration as 3
more infected by coronavirus. http://www.asahi.com/ajw/articles/AJ202002020013.html.
Wikipedia. 2020 (2020-02-29). 2019-20 coronavirus outbreak.
https://en.wikipedia.org/wiki/2019%E2%80%9320_coronavirus_outbreak.
Wong G, Bi YH, Wang QH, Chen XW, Zhang ZG, Yao YG. 2020. Zoonotic
origins of human coronavirus 2019 (HCoV-19 / SARS-CoV-2): why is this
work important?. Zoological Research, 41(3): 213−219.
Wu F, Zhao S, Yu B, Chen YM, Wang W, Song ZG, Hu Y, Tao ZW, Tian
JH, Pei YY, Yuan ML, Zhang YL, Dai FH, Liu Y, Wang QM, Zheng JJ, Xu L,
Holmes EC, Zhang YZ. 2020a. A new coronavirus associated with human
respiratory disease in China. Nature, 579(7798): 265−269.
Wu JT, Leung K, Leung GM. 2020b. Nowcasting and forecasting the
potential domestic and international spread of the 2019-nCoV outbreak
originating in Wuhan, China: a modelling study. The Lancet, 395(10225):
689−697.
Wu ZY, McGoogan JM. 2020. Characteristics of and important lessons from
the coronavirus disease 2019 (COVID-19) outbreak in China: summary of a
report of 72 314 cases from the Chinese Center for Disease Control and
Prevention. JAMA, doi: 10.1001/jama.2020.2648.
Xiao KP, Zhai JQ, Feng YY, Zhou N, Zhang X, Zou JJ, Li N, Guo YQ, Li XB,
Shen XJ, Zhang ZP, Shu FF, Huang WY, Li Y, Zhang ZD, Chen RA, Wu YJ,
Peng SM, Huang M, Xie WJ, Cai QH, Hou FH, Liu YH, Chen W, Xiao LH,
Shen YY. 2020. Isolation and characterization of 2019-nCoV-like
coronavirus from Malayan pangolins. bioRxiv, doi: 10.1101/2020.02.17.951335.
Yang Y, Lu QB, Liu MJ, Wang YX, Zhang AR, Jalali N, Dean N, Longini I,
Halloran ME, Xu B, Zhang XA, Wang LP, Liu W, Fang LQ. 2020.
Epidemiological and clinical features of the 2019 novel coronavirus
outbreak in China. medRxiv, doi: 10.1101/2020.02.10.20021675.
Yao YG, Kong QP, Bandelt HJ, Kivisild T, Zhang YP. 2002.
Phylogeographic differentiation of mitochondrial DNA in Han Chinese.
American Journal of Human Genetics,70(3):635−651.
Zhang RQ, Liu H, Li FY, Zhang B, Liu QL, Li XW, Luo LM. 2020a.
Transmission and epidemiological characteristics of Novel Coronavirus
(2019-nCoV) Pneumonia (NCP): preliminary evidence obtained in
comparison with 2003-SARS. medRxiv, doi: 10.1101/2020.01.30.20019836.
Zhang T, Wu QF, Zhang ZG. 2020b. Pangolin homology associated with
2019-nCoV. Current Biology, 30: 1346−1351.
Zhao ZM, Li HP, Wu XZ, Zhong YX, Zhang KQ, Zhang YP, Boerwinkle E,
Fu YX. 2004. Moderate mutation rate in the SARS coronavirus genome and
its implications. BMC Evolutionary Biology, 4: 21.
Zhou P, Yang XL, Wang XG, Hu B, Zhang L, Zhang W, Si HR, Zhu Y, Li B,
Huang CL, Chen HD, Chen J, Luo Y, Guo H, Jiang RD, Liu MQ, Chen Y,
Shen XR, Wang X, Zheng XS, Zhao K, Chen QJ, Deng F, Liu LL, Yan B,
Zhan FX, Wang YY, Xiao GF, Shi ZL. 2020. A pneumonia outbreak
associated with a new coronavirus of probable bat origin. Nature,
579(7798):270−273.
Zhu N, Zhang DY, Wang WL, Li XW, Yang B, Song JD, Zhao X, Huang BY,
Shi WF, Lu RJ, Niu PH, Zhan FX, Ma XJ, Wang DY, Xu WB, Wu GZ, Gao
GF, Tan WJ. 2020. A novel coronavirus from patients with pneumonia in
China, 2019. The New England Journal of Medicine, 382(8): 727−733.
Zoological Research 41(3): 247−257, 2020