SARS-CoV-2 genomic variations associated with mortality rate of COVID-19

Abstract

The coronavirus disease 2019 (COVID-19) outbreak, caused by SARS-CoV-2, has rapidly expanded into a global pandemic. However, the numbers of infected cases and deaths and the mortality rates related to COVID-19 vary from country to country. Although many studies have been conducted, the reasons for these differences have not been clarified. In this study, we comprehensively investigated 12,343 SARS-CoV-2 genome sequences isolated from patients/individuals in six geographic areas and identified a total of 1234 mutations by comparison with the reference SARS-CoV-2 sequence. Through hierarchical clustering based on the mutant frequencies, we classified the 28 countries into three clusters showing different fatality rates of COVID-19. In correlation analyses, we found that the ORF1ab 4715L and S protein 614G variants, which are in strong linkage disequilibrium, showed significant positive correlations with fatality rates (r = 0.41, P = 0.029 and r = 0.43, P = 0.022, respectively). We found that BCG-vaccination status was significantly associated with the fatality rates as well as with the number of infected cases. In BCG-vaccinated countries, the frequency of the S 614G variant showed a trend of association with a higher fatality rate. We also found that the frequencies of several HLA alleles, including HLA-A*11:01, were significantly associated with the fatality rates, although these factors were also associated with the number of infected cases and were not independent factors affecting the fatality rate in each country. Our findings suggest that SARS-CoV-2 mutations, as well as BCG-vaccination status and a host genetic factor, HLA genotype, might affect the susceptibility to SARS-CoV-2 infection or the severity of COVID-19.
Journal of Human Genetics
https://doi.org/10.1038/s10038-020-0808-9
Yujiro Toyoshima¹, Kensaku Nemoto¹, Saki Matsumoto¹, Yusuke Nakamura¹, Kazuma Kiyotani¹
Received: 9 July 2020 / Revised: 10 July 2020 / Accepted: 12 July 2020
© The Author(s) 2020. This article is published with open access
Introduction
The novel betacoronavirus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which causes coronavirus disease 2019 (COVID-19), was first reported in Wuhan, China in December 2019 [1,2]. Soon after, the virus caused an outbreak in China and spread worldwide. According to the World Health Organization, the current outbreak of COVID-19 has nearly 11.5 million confirmed cases worldwide with more than 530,000 deaths, as of July 6, 2020. The SARS-CoV-2 genome comprises around 30,000 nucleotides organized into specific genes encoding structural proteins and nonstructural proteins (Nsps) [1,2]. Structural proteins include spike (S), envelope (E), membrane (M), and nucleocapsid (N) proteins. The surface S glycoprotein is involved in the interaction with the host's angiotensin-converting enzyme 2 (ACE2) receptor and plays an important role in rapid human-to-human transmission. Nsps, which are generated as cleavage products of the open reading frame 1ab (ORF1ab) viral polyproteins, assemble to facilitate viral replication and transcription. RNA-dependent RNA polymerase, also known as Nsp12, is the key component that regulates viral RNA synthesis with the assistance of Nsp7 and Nsp8 [3]. In addition, five accessory proteins are encoded by the ORF3a, ORF6, ORF7a, ORF8, and ORF10 genes.
SARS-CoV-2 has spread around the world more rapidly than SARS-CoV, which appeared in 2002, and Middle East respiratory syndrome coronavirus (MERS-CoV), which appeared in 2012. Although the estimated fatality rate among confirmed cases of SARS-CoV-2 is 6.6%, which is lower than those of SARS-CoV and MERS-CoV (9.6% and 34.3%, respectively) [4], there is an urgent need for effective treatment based on antivirals and vaccines that reduce the mortality and morbidity rates of COVID-19. However, the causes of the large country-by-country differences in the mortality rates related to COVID-19 are not yet clearly understood. Although many studies have been conducted, the effects of SARS-CoV-2 genetic variations and host genetic factors remain elusive.
* Kazuma Kiyotani, kazuma.kiyotani@jfcr.or.jp
1 Project for Immunogenomics, Cancer Precision Medicine Center, Japanese Foundation for Cancer Research, Tokyo 135-8550, Japan
Supplementary information: The online version of this article (https://doi.org/10.1038/s10038-020-0808-9) contains supplementary material, which is available to authorized users.
In this study, we comprehensively analyzed 12,343 SARS-CoV-2 genome sequences isolated from patients/individuals in six geographic areas (Asia, North America, South America, Europe, Oceania, and Africa) and investigated their correlations with the fatality rates in 28 different countries. We also investigated the associations with BCG-vaccination status as well as with human leukocyte antigen (HLA), an important molecule through which the host immune system recognizes viruses.
Methods
Coronavirus sequences
The full-length nucleotide sequence of the reference SARS-CoV-2 (accession number MN908947) [1] was downloaded from NCBI GenBank. We used a total of 12,343 SARS-CoV-2 sequences isolated in 50 different countries across six geographic areas, comprising 1062 sequences from Asia, 4060 from North America, 99 from South America, 6012 from Europe, 1028 from Oceania, and 82 from Africa, which had been deposited in the Global Initiative on Sharing All Influenza Data (GISAID) as of 7 May 2020 [5]. To analyze mutations by country, we used the data from the 28 of these 50 countries for which more than 30 SARS-CoV-2 sequences were available.
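The country filter described above can be illustrated with a minimal Python sketch. The function name, the toy country labels, and the one-label-per-sequence input format are assumptions made for illustration only; the regional counts are the ones stated in the text, and the sketch merely checks that they add up to the reported total.

```python
from collections import Counter

# Sequence counts per geographic area, as stated in the text.
REGION_COUNTS = {
    "Asia": 1062,
    "North America": 4060,
    "South America": 99,
    "Europe": 6012,
    "Oceania": 1028,
    "Africa": 82,
}


def countries_above_threshold(sequence_countries, min_exclusive=30):
    """Return the countries that contribute more than `min_exclusive` sequences."""
    counts = Counter(sequence_countries)
    return sorted(c for c, n in counts.items() if n > min_exclusive)


# Hypothetical toy input: one country label per deposited sequence.
toy_labels = ["CountryA"] * 31 + ["CountryB"] * 30 + ["CountryC"] * 5
kept = countries_above_threshold(toy_labels)
total_sequences = sum(REGION_COUNTS.values())
print(kept, total_sequences)
```

Note that the threshold is exclusive ("more than 30"), so a country with exactly 30 sequences is dropped in this sketch.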
Mutation analysis
We analyzed mutations of SARS-CoV-2 as described previously [6]. Briefly, we first aligned each of the SARS-CoV-2 sequences to the reference sequence SARS-CoV-2_Wuhan-Hu-1 (accession number MN908947) using BLAT software [7]. After the alignment, we extracted the nucleotide sequences corresponding to individual proteins of SARS-CoV-2, translated them into amino acid sequences, and then compared them with the reference amino acid sequences of SARS-CoV-2_Wuhan-Hu-1 (accession numbers QHD43415-QHD43423, QHI42199).
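The translate-and-compare step can be sketched as follows. This is not the authors' pipeline: it assumes the coding sequences have already been aligned and extracted in frame (the BLAT step is not reproduced), and the function names and the three-codon toy fragment are hypothetical. Only the standard genetic code is used.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nucleotides):
    """Translate an in-frame coding sequence; unknown codons become 'X'."""
    seq = nucleotides.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


def call_mutations(ref_protein, sample_protein, first_position=1):
    """List amino acid substitutions such as 'D614G' (ambiguous 'X' is skipped)."""
    mutations = []
    for offset, (ref_aa, alt_aa) in enumerate(zip(ref_protein, sample_protein)):
        if alt_aa != ref_aa and alt_aa != "X":
            mutations.append(f"{ref_aa}{first_position + offset}{alt_aa}")
    return mutations


# Hypothetical three-codon fragment whose first codon is numbered 614.
ref_fragment = translate("GATCAGGTT")
sample_fragment = translate("GGTCAGGTT")
print(call_mutations(ref_fragment, sample_fragment, first_position=614))
```

The `first_position` argument only controls the numbering of the reported substitution, so a fragment extracted from the middle of a protein can still be labeled with protein coordinates.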
Data acquisition
Data on the numbers of confirmed cases and deaths related to COVID-19 were obtained from Worldometer (https://www.worldometers.info/coronavirus/) on 7 May 2020 (Supplementary Table 1). Data on confirmed cases and deaths in each state of the United States were obtained on 3 July 2020. The fatality rate among infected individuals was calculated from the total infected cases and total deaths in each country. The allelic frequencies of HLA genes were obtained from The Allele Frequency Net Database [8]. Data on BCG-vaccination status in each country were obtained from previous reports [9–11].
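As a worked illustration of the two per-country quantities used in this study, the sketch below computes a fatality rate (deaths divided by confirmed cases, in percent) and confirmed cases per million population. The function names and the input numbers are hypothetical and are not taken from Supplementary Table 1.

```python
def fatality_rate_percent(total_cases, total_deaths):
    """Fatality rate among confirmed cases, expressed as a percentage."""
    if total_cases <= 0:
        raise ValueError("total_cases must be positive")
    return 100.0 * total_deaths / total_cases


def cases_per_million(total_cases, population):
    """Confirmed cases per million inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 1_000_000 * total_cases / population


# Hypothetical country: 2,000 confirmed cases, 130 deaths, 4 million inhabitants.
rate = fatality_rate_percent(2000, 130)
density = cases_per_million(2000, 4_000_000)
print(rate, density)
```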
Statistical analyses
Continuous variables were compared using Student's t test. Fisher's exact test was used to analyze differences in mutation rates of SARS-CoV-2 among the different geographic areas. Hierarchical clustering was performed with the R package stats to identify clusters corresponding to distinct subgroups defined by the selected mutations. Global maps of clusters or mutations were drawn using the R package rworldmap. Pearson's correlation was used to evaluate correlations among mutant frequencies, HLA allele frequencies, and fatality rates. Haploview software was used to analyze and visualize the haplotypes of SARS-CoV-2 mutations [12]. Multiple regression analysis was used to test for an independent contribution of the identified factors to the fatality rates of COVID-19. All statistical analyses were carried out using the R statistical environment version 3.6.1.
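To make the clustering step concrete, here is a small pure-Python sketch of agglomerative hierarchical clustering on per-country mutation-frequency profiles. The distance metric (Euclidean) and linkage rule (complete linkage) are assumptions, since the text only states that the R package stats was used; the country names and frequencies are invented toy values.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two frequency profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def complete_linkage_clusters(profiles, k):
    """Merge the closest clusters (complete linkage) until k clusters remain."""
    clusters = [[name] for name in profiles]

    def linkage(c1, c2):
        # Complete linkage: distance between the two farthest members.
        return max(euclidean(profiles[a], profiles[b]) for a in c1 for b in c2)

    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


# Hypothetical profiles: [S 614G %, N 203K/204R %] for six toy countries.
toy_profiles = {
    "A1": [10, 5], "A2": [15, 3],
    "B1": [85, 45], "B2": [80, 40],
    "C1": [75, 10], "C2": [70, 12],
}
toy_clusters = complete_linkage_clusters(toy_profiles, 3)
print(toy_clusters)
```

In the toy data, the second coordinate is what separates the two high-614G groups, mirroring how a second variant can split countries that share a dominant first variant.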
Results
All replicating viruses, including coronaviruses, continuously accumulate genomic mutations, some of which persist owing to natural selection. These mutations can enhance viral proliferation and infectivity and enable escape from host immune attack. We first investigated mutations in 12,343 SARS-CoV-2 genome sequences isolated from patients/individuals in six different regions: Asia, North America, South America, Europe, Oceania, and Africa. We identified a total of 1234 mutations detected in at least two independent samples, including 131 mutations found at a frequency of more than 10% (Supplementary Table 2). Hierarchical clustering using 16 common amino acid mutations classified the 28 countries into three clusters (Fig. 1a). Cluster 1 includes most of the Asian countries we analyzed, whereas cluster 2 includes European and South American countries, and cluster 3 includes European, North American, Oceanian, African, and a few Asian countries (Fig. 1b). Comparing the mutations among the three clusters, the average frequency of the L variant of ORF1ab P4715L in the countries classified as cluster 1 was 14.7%, which is significantly lower than the 81.3% and 73.2% observed in the countries classified as clusters 2 and 3, respectively (P = 1.3 × 10⁻⁶ and P = 2.5 × 10⁻⁵, respectively; Supplementary Fig. 1A). The ORF1ab 4715L variant was detected at a significantly lower frequency in Asian countries than in the other areas (20.8% vs. 54.9–86.8% in the others, P = 1.1 × 10⁻¹¹⁸; Supplementary Fig. 2). Similarly, the frequency of the G variant of S protein D614G was significantly lower in cluster 1 than in the other two clusters (P = 1.2 × 10⁻⁶ and P = 1.7 × 10⁻⁵ for clusters 2 and 3, respectively; Supplementary Fig. 1B). In cluster 2, the K/R variants of the N protein R203K/G204R mutations were significantly enriched at 43.1%, compared with the other clusters (5.2%, P = 0.00011 for cluster 1 and 11.8%, P = 5.6 × 10⁻⁷ for cluster 3; Supplementary Fig. 1C). In addition, in cluster 1, the L and F variants of N P13L and ORF1ab L3606F were predominantly enriched. The L variant of N P13L was found at 17.8%, which was significantly higher than the 0.2% and 1.4% in clusters 2 and 3, respectively (P = 0.012 and P = 0.0079; Supplementary Fig. 1D). The F variant of ORF1ab L3606F was detected at a frequency of 40.1%, higher than the 10.0% and 7.9% in clusters 2 and 3, respectively (P = 0.0035 and P = 0.00050; Supplementary Fig. 1E).
To further analyze the mutational profile, we performed a haplotype analysis by drawing a linkage disequilibrium (LD) map for SARS-CoV-2 viral genomes (Supplementary Fig. 3). We found that the ORF1ab 4715L and S protein 614G variants were in nearly complete LD (r² = 0.98 and D′ = 1.00). The N protein 203K/204R variants were additionally acquired on the S protein 614G type of virus genome, as indicated by r² = 0.11 and D′ = 0.99. These results indicate that the S protein 614G-N protein 203K/204R haplotype characterizes cluster 2.
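The two LD statistics quoted above behave differently, which is why a variant pair can show D′ close to 1 with a low r². The following sketch computes both from counts of the four two-site haplotypes using the standard definitions; it is not the Haploview implementation, and the function name and the counts are hypothetical.

```python
def ld_metrics(n_both, n_first_only, n_second_only, n_neither):
    """Return (r2, |D'|) for two biallelic sites from haplotype counts."""
    n = n_both + n_first_only + n_second_only + n_neither
    p_a = (n_both + n_first_only) / n   # frequency of the first variant
    p_b = (n_both + n_second_only) / n  # frequency of the second variant
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both sites must be polymorphic")
    d = n_both / n - p_a * p_b          # raw disequilibrium coefficient D
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return r2, abs(d) / d_max


# Hypothetical counts: the first variant never occurs without the second,
# but the second occasionally occurs alone.
r2, d_prime = ld_metrics(n_both=70, n_first_only=0, n_second_only=2, n_neither=28)
print(round(r2, 3), round(d_prime, 3))
```

When one of the four haplotypes is absent, D′ reaches 1 regardless of allele frequencies, whereas r² approaches 1 only if the two variants also have nearly equal frequencies. This is consistent with a variant that arose on the background of another and is still much rarer.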
Fig. 1 Clustering analysis of SARS-CoV-2 among 28 countries. a Heatmap of the frequencies of SARS-CoV-2 mutants. The 28 countries were classified into three clusters based on the mutational signature by hierarchical clustering. The protein sequence based on the SARS-CoV-2_Wuhan-Hu-1 sequence (GenBank accession number MN908947) is used as a reference. Ref, amino acid in the reference SARS-CoV-2 sequence; Mut, amino acid in mutant SARS-CoV-2. The 16 mutations shown are S D614G, ORF1ab P4715L, ORF1ab L3606F, ORF8 L84S, N P13L, ORF1ab T2016K, ORF1ab A4489V, ORF3a G251V, ORF1ab P5828L, ORF1ab Y5865C, ORF1ab V378I, M T175M, N R203K, N G204R, ORF1ab T265I, and ORF3a Q57H. b Global map of the three clusters. c Fatality rates (%) according to the mutation clusters. Horizontal lines represent the means. Student's t test was used to evaluate statistical significance.
We then investigated the association with the fatality rates among confirmed cases in the 28 countries. In the analysis comparing the fatality rates among the countries classified into the three clusters, the average fatality rate of the countries belonging to cluster 2 was 9.3%, which was higher than the averages of 3.0% and 5.8% for the countries belonging to clusters 1 and 3, respectively (P = 0.026 and P = 0.095; Fig. 1c). Among the mutations we analyzed, the frequencies of ORF1ab 4715L-type and S 614G-type viruses showed significant positive correlations with fatality rates (Pearson's correlation coefficient (r) = 0.41, P = 0.029 and r = 0.43, P = 0.022, respectively; Fig. 2a, b). Since clusters 2 and 3 were separated mainly by the frequency of N 203K/204R, we also examined the correlations of this variant and of the S 614G-N 203K/204R haplotype with fatality rates; however, the correlations were not statistically significant (r = 0.31, P = 0.11 and r = 0.27, P = 0.17, respectively; Supplementary Fig. 4A, B).
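The relation between a Pearson correlation coefficient, the number of countries, and the two-sided P value can be sketched in pure Python. The sketch uses the standard t transformation, t = r·sqrt((n − 2)/(1 − r²)) with n − 2 degrees of freedom, and integrates the t density numerically with Simpson's rule; the function names and step count are choices made for this illustration, and the analysis in the paper was run in R.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_norm) / math.sqrt(df * math.pi) * (1 + t * t / df) ** (-(df + 1) / 2)


def two_sided_p_from_r(r, n, steps=20000):
    """Two-sided P value for H0: no correlation, given r and sample size n."""
    if abs(r) >= 1:
        return 0.0
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    h = t / steps  # Simpson's rule on [0, t]; steps must be even
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1 - 2 * area)


# r and n as reported for ORF1ab 4715L across the 28 countries.
p_4715l = two_sided_p_from_r(0.41, 28)
print(round(p_4715l, 3))
```

Because the reported r is rounded to two decimals, the P value recovered this way is only expected to be close to the published one, not identical.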
Fatality rates have been reported to differ among areas or states in the United States [13]. When we compared fatality rates among three areas of the United States (Western, Central, and Eastern), the Eastern area showed a higher fatality rate (6.5%) than the Western area (2.2%, P = 0.010) and the Central area (3.9%, P = 0.10; Fig. 3a). Therefore, we further investigated the correlations of the variants with fatality rates in the 17 states. The frequencies of the ORF1ab 4715L and S protein 614G types tended to show positive correlations with the fatality rates (r = 0.49, P = 0.047 and r = 0.45, P = 0.070, respectively; Fig. 3b, c). Even when the data of the 17 states and the remaining 27 countries were integrated, the correlations remained significant (r = 0.38, P = 0.014 and r = 0.39, P = 0.011, respectively; Supplementary Fig. 5A, B).
Several other factors have been investigated in association with mortality related to COVID-19. Ecological studies have suggested that countries that mandate BCG vaccination for the population have a lower number of infections and reduced mortality from COVID-19, although the association is still controversial and the underlying mechanism has not been clarified [9,14,15]. We classified the 28 countries into two groups according to whether BCG vaccination is part of the routine vaccine schedule. As a result, the mean fatality rate was significantly lower in the 11 BCG-vaccinated countries than in the 17 BCG-non-vaccinated countries (4.1% vs. 8.1%, P = 0.031; Fig. 4a). When we divided the BCG-vaccinated countries into subgroups according to the strain of BCG vaccine, we observed some differences in the fatality rates among the subgroups, but the sample sizes of the subgroups were too small to evaluate statistical significance (Supplementary Fig. 6). We also found that the frequency of the S 614G variant showed a trend of positive correlation with fatality rates in BCG-vaccinated countries (r = 0.54, P = 0.090; Fig. 4b), but no such correlation was observed in BCG-non-vaccinated countries (r = 0.19, P = 0.47; Fig. 4b). In addition, the number of confirmed cases per million population was significantly lower in BCG-vaccinated countries than in BCG-non-vaccinated countries (710 vs. 2912, P = 0.0012; Fig. 4c). These results suggest that BCG vaccination may protect against SARS-CoV-2 infection by potentiating the innate immune response; however, the ORF1ab 4715L-type and S protein 614G-type SARS-CoV-2 variants may escape from this immune response.
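The group comparison above relies on Student's t test. A minimal sketch of the classical two-sample statistic follows, assuming the equal-variance (pooled) form; the text does not state which variant was used, and the function name and sample values are hypothetical rather than the per-country fatality rates.

```python
import math
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Two-sample Student's t statistic (pooled variance) and its degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(sample_a) + (nb - 1) * variance(sample_b)) / df
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# Hypothetical fatality rates (%) for two small groups of countries.
group_vaccinated = [2.0, 4.0, 6.0]
group_non_vaccinated = [6.0, 8.0, 10.0]
t_stat, dof = student_t(group_vaccinated, group_non_vaccinated)
print(round(t_stat, 3), dof)
```

The P value is then obtained from the t distribution with the returned degrees of freedom; a negative statistic here simply means the first group has the lower mean.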
Fig. 2 Correlation analysis of variant frequencies of SARS-CoV-2 ORF1ab 4715L (a) or S 614G (b) with fatality rates of COVID-19 among 28 countries. Each panel shows the variant frequency (%) and the fatality rate (%) with one labeled dot per country (a: r = 0.41, P = 0.029; b: r = 0.43, P = 0.022). Pearson's correlation coefficients (r) were calculated. The color of each dot corresponds to the mutational cluster shown in Fig. 1a.
Host genetic differences, especially in HLA loci, are well known to contribute to individual variation in immune responses to pathogens. Finally, to investigate the association with host immune responses, we searched the peptide epitopes with a high binding affinity to HLA molecules that we previously reported [6] for those involving the two SARS-CoV-2 mutations, ORF1ab P4715L and S D614G. We found that several epitopes that include the position of ORF1ab P4715L or S protein D614G possibly bind to HLA molecules, including HLA-A*02:06, HLA-A*11:01, HLA-B*07:02, and HLA-B*54:01, although the mutated epitopes from variant SARS-CoV-2 were also predicted to bind to HLA molecules with similar affinities (Supplementary Table 3). Using the information from the 21 countries for which allele frequency data are available, we examined the relationship between the allele frequency of HLA-A*11:01 and the fatality rates, and found a significant negative correlation (r = −0.61, P = 0.0031; Fig. 5a). Similarly, negative correlations were observed between the allele frequencies of HLA-A*02:06 or HLA-B*54:01 and the fatality rates (r = −0.39, P = 0.14, N = 16 and r = −0.60, P = 0.017, N = 15; Fig. 5b, c). However, these correlations were no longer statistically significant after adjustment for the frequency of the S 614G variant in multiple regression (P = 0.13 for HLA-A*11:01, P = 0.73 for HLA-A*02:06, and P = 0.45 for HLA-B*54:01). We also found negative correlations between the allele frequencies of these HLAs and the number of confirmed cases per million population (r = −0.43, P = 0.054 for HLA-A*11:01; r = −0.44, P = 0.086 for HLA-A*02:06; and r = −0.52, P = 0.047 for HLA-B*54:01; Fig. 5d–f). Together, these results suggest that differences in HLA allele frequencies may explain the different susceptibilities to SARS-CoV-2 infection among countries, although many other potential confounding factors need to be considered.
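The adjustment step, in which an HLA allele frequency and the S 614G frequency enter the same model, can be illustrated with an ordinary least-squares sketch. This is a pure-Python stand-in for a multiple regression fitted in R: it solves the normal equations by Gaussian elimination and returns coefficients only (no standard errors or P values). The function names and all data values are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix (collinear predictors?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(y, predictors):
    """Least-squares fit; returns [intercept, slope_1, slope_2, ...]."""
    rows = [[1.0] + [col[i] for col in predictors] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data in which the outcome depends only on the S 614G frequency.
hla_freq = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
s614g_freq = [10.0, 30.0, 20.0, 60.0, 50.0, 80.0]
fatality = [2.0 + 0.1 * g for g in s614g_freq]
coefficients = ols(fatality, [hla_freq, s614g_freq])
print([round(c, 6) for c in coefficients])
```

In this toy setup the HLA coefficient comes out at zero once the S 614G frequency is in the model, even though the two predictors are correlated; that is the sense in which a marginal correlation can disappear after adjustment.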
Discussion
The current outbreak of COVID-19 has rapidly spread worldwide. Most patients with COVID-19 exhibit no or mild to moderate symptoms, but ~15% progress to severe pneumonia and about 5% eventually develop acute respiratory distress syndrome, septic shock, and multiple organ failure. The mortality rates related to COVID-19 vary among countries and are generally known to be significantly higher in European and North American countries than in Asian countries. Although several possible explanations for the differences in mortality rates have been proposed, including differences in age distribution, BCG-vaccination status, virus genomic types, and genetic backgrounds, none has been established so far. In this study, we investigated SARS-CoV-2 mutations and found that the frequencies of the S protein 614G variant and its highly linked variant, ORF1ab 4715L, were significantly correlated with fatality rates in the 28 countries and 17 states of the United States.
Fig. 3 Association of variant frequencies of SARS-CoV-2 with fatality rates of COVID-19 among 17 states in the United States (New York, Connecticut, Virginia, New Jersey, Massachusetts, Pennsylvania, Florida, Texas, Louisiana, Wisconsin, Illinois, Ohio, Washington, California, Arizona, Utah, and Oregon). a Fatality rates (%) in three different areas of the United States: Western, Central, and Eastern. Horizontal lines represent the means. Student's t test was used to evaluate statistical significance. b, c Correlation analysis between the frequencies of the SARS-CoV-2 ORF1ab 4715L (b; r = 0.49, P = 0.047) or S 614G (c; r = 0.45, P = 0.070) variants and fatality rates. Pearson's correlation coefficients (r) were calculated.
The D614G spike mutation was detected in Europe in the early phase of the pandemic and has spread widely around the globe, especially in European and North American countries [16–19]. The spike glycoprotein is essential for the interaction with ACE2 expressed on host cells and is important for viral transmission [20,21]. The spike glycoprotein is therefore a major hotspot of amino acid mutations when viruses acquire mutations that enhance cell entry and adaptation to their environment. Structural analyses indicated that the S protein carrying the D614G substitution is located on the surface of the virus and interacts with ACE2. Consistent with our results, a few reports have demonstrated that the S 614G variant was associated with mortality related to COVID-19 [13,22]. ORF1ab P4715L is located in Nsp12, which is important for viral RNA replication. We found significant associations between these mutations and the fatality rates; however, the functional significance of these mutations has not yet been clarified.
Since immune responses through HLA and T cells are important for protection against viral infections and are also known to be involved in the progression of COVID-19, we screened epitopes around the mutations associated with fatality rates (Supplementary Table 3). ORF1ab P4715L is located in the epitope sequences ORF1ab 4713–4721 (FPPTSFGPL), ORF1ab 4713–4722 (FPPTSFGPLV), and ORF1ab 4715–4724 (PTSFGPLVRK), which were predicted to have strong binding affinities of 44, 41, and 45 nM to HLA-B*07:02, HLA-B*54:01, and HLA-A*11:01, respectively. In a computational prediction, the corresponding mutated peptides showed higher binding affinities of 11, 12, and 23 nM. Similarly, S D614G is located in the epitope sequences S 606–615 (NQVAVLYQDV) and S 612–620 (YQDVNCTEV). Both the wild-type and mutated epitopes were predicted to bind to HLA-A*02:06 with similar affinities. The countries in which the proportion of individuals with the HLA-A*11:01, HLA-A*02:06, and HLA-B*54:01 alleles is relatively high showed lower fatality rates as well as lower numbers of confirmed cases (Fig. 5). However, the correlations with fatality rates were no longer significant after adjustment for the frequency of the S protein 614G-type virus in multiple regression analysis. These results suggest that individuals with HLA-A*11:01, HLA-A*02:06, or HLA-B*54:01 might be protected from SARS-CoV-2 infection, although further studies are needed to investigate the effects of other potential confounding factors, such as different phases of the outbreak, the age of the infected population, and management of the pandemic. In SARS-CoV and MERS-CoV, several HLA genotypes have been reported to be associated with susceptibility or resistance, including HLA-B*07:03, HLA-B*46:01, HLA-C*08:01, HLA-C*15:02, HLA-DRB1*03:01, HLA-DRB1*11:01, and HLA-DRB1*12:02 [23–26]. Further studies are required to elucidate whether cytotoxic T lymphocytes targeting these epitopes are present in the peripheral blood of patients, especially those with severe disease, and large-scale case-control association studies are needed to confirm the association of HLA genotype with susceptibility to or disease progression of SARS-CoV-2 infection. Nevertheless, the findings of the current study provide an important insight into the treatment of the current SARS-CoV-2 infection and the prevention of a second SARS-CoV-2 pandemic.
Fig. 4 Association of BCG-vaccination status with fatality rates and infected cases of COVID-19 among 28 countries. a Fatality rates (%) in BCG-vaccinated (BCG+) and BCG-non-vaccinated (BCG−) countries. Horizontal lines represent the means. Student's t test was used to evaluate statistical significance (P = 0.031). b Correlation analysis between the frequencies of the S 614G variant of SARS-CoV-2 and fatality rates in BCG+ countries (Japan, Singapore, Korea, China, India, Thailand, Hungary, Brazil, Taiwan, Portugal, and Congo; r = 0.54, P = 0.090) and BCG− countries (England, Belgium, Netherlands, Spain, Canada, France, Italy, Sweden, USA, Greece, Australia, Germany, Finland, Switzerland, Luxembourg, Iceland, and Denmark; r = 0.19, P = 0.47). Pearson's correlation coefficients (r) were calculated. c Number of infected cases per million population in BCG+ and BCG− countries. Horizontal lines represent the means. Student's t test was used to evaluate statistical significance (P = 0.0012).
In summary, we comprehensively investigated SARS-CoV-2 genome mutations, BCG-vaccination status, and HLA genotypes in 28 different countries and identified significant associations of some virus genome variants with the fatality rates. These results may explain at least part of the differences in SARS-CoV-2 infection and in the mortality rates related to COVID-19 among various countries.
Acknowledgements The super-computing resource was provided by the Human Genome Center, the Institute of Medical Science, the University of Tokyo (http://sc.hgc.jp/shirokane.html).
Compliance with ethical standards
Conflict of interest YN is a stockholder and a scientific advisor of OncoTherapy Science, Inc. KK is a scientific advisor of Cancer Precision Medicine, Inc. This study is unrelated to the activity in these companies.
Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
Fig. 5 Association of HLA allele frequency with fatality rates and infected cases of COVID-19 among countries. a–c Correlation between HLA-A*11:01 (a; r = −0.61, P = 0.0031), HLA-A*02:06 (b; r = −0.39, P = 0.14), and HLA-B*54:01 (c; r = −0.60, P = 0.017) allelic frequencies (%) and fatality rates (%) of COVID-19. The numbers of analyzed countries are 21, 16, and 15 for HLA-A*11:01, HLA-A*02:06, and HLA-B*54:01, respectively. Pearson's correlation coefficient (r) was calculated. d–f Correlation between HLA-A*11:01 (d; r = −0.43, P = 0.054), HLA-A*02:06 (e; r = −0.44, P = 0.086), and HLA-B*54:01 (f; r = −0.52, P = 0.047) allelic frequencies (%) and the number of infected cases of COVID-19 per million population. Pearson's correlation coefficient (r) was calculated.
References
1. Wu F, Zhao S, Yu B, Chen YM, Wang W, Song ZG, et al. A new coronavirus associated with human respiratory disease in China. Nature. 2020;579:265–9.
2. Zhou P, Yang XL, Wang XG, Hu B, Zhang L, Zhang W, et al. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature. 2020;579:270–3.
3. Subissi L, Posthuma CC, Collet A, Zevenhoven-Dobbe JC, Gorbalenya AE, Decroly E, et al. One severe acute respiratory syndrome coronavirus protein complex integrates processive RNA polymerase and exonuclease activities. Proc Natl Acad Sci USA. 2014;111:E3900–9.
4. Wang C, Horby PW, Hayden FG, Gao GF. A novel coronavirus outbreak of global health concern. Lancet. 2020;395:470–3.
5. Shu Y, McCauley J. GISAID: Global initiative on sharing all influenza data – from vision to reality. Eur Surveill. 2017;22:30494.
6. Kiyotani K, Toyoshima Y, Nemoto K, Nakamura Y. Bioinformatic prediction of potential T cell epitopes for SARS-Cov-2. J Hum Genet. 2020;65:569–75.
7. Kent WJ. BLAT – the BLAST-like alignment tool. Genome Res. 2002;12:656–64.
8. Gonzalez-Galarza FF, McCabe A, Santos E, Jones J, Takeshita L, Ortega-Rivera ND, et al. Allele frequency net database (AFND) 2020 update: gold-standard data classification, open access genotype data and new query tools. Nucleic Acids Res. 2020;48:D783–8.
9. Ozdemir C, Kucuksezer UC, Tamay ZU. Is BCG vaccination affecting the spread and severity of COVID-19? Allergy. 2020;75:1824–7.
10. Ritz N, Curtis N. Mapping the global use of different BCG vaccine strains. Tuberculosis. 2009;89:248–51.
11. Zwerling A, Behr MA, Verma A, Brewer TF, Menzies D, Pai M. The BCG World Atlas: a database of global BCG vaccination policies and practices. PLoS Med. 2011;8:e1001012.
12. Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics. 2005;21:263–5.
13. Becerra-Flores M, Cardozo T. SARS-CoV-2 viral spike G614 mutation exhibits higher case fatality rate. Int J Clin Pract. 2020;00:e13525.
14. Gursel M, Gursel I. Is global BCG vaccination-induced trained immunity relevant to the progression of SARS-CoV-2 pandemic? Allergy. 2020;75:1815–9.
15. Hamiel U, Kozer E, Youngster I. SARS-CoV-2 rates in BCG-vaccinated and unvaccinated young adults. JAMA. 2020;323:2340–1.
16. Forster P, Forster L, Renfrew C, Forster M. Phylogenetic network analysis of SARS-CoV-2 genomes. Proc Natl Acad Sci USA. 2020;117:9241–3.
17. Koyama T, Weeraratne D, Snowdon JL, Parida L. Emergence of drift variants that may affect COVID-19 vaccine development and antibody treatment. Pathogens. 2020;9:E324.
18. Gonzalez-Reiche AS, Hernandez MM, Sullivan MJ, Ciferri B, Alshammary H, Obla A, et al. Introductions and early spread of SARS-CoV-2 in the New York City area. Science. 2020;369:297–301.
19. Deng X, Gu W, Federman S, du Plessis L, Pybus OG, Faria N, et al. Genomic surveillance reveals multiple introductions of SARS-CoV-2 into Northern California. Science. 2020. In press.
20. Letko M, Marzi A, Munster V. Functional assessment of cell entry and receptor usage for SARS-CoV-2 and other lineage B betacoronaviruses. Nat Microbiol. 2020;5:562–9.
21. Hoffmann M, Kleine-Weber H, Schroeder S, Kruger N, Herrler T, Erichsen S, et al. SARS-CoV-2 cell entry depends on ACE2 and TMPRSS2 and is blocked by a clinically proven protease inhibitor. Cell. 2020;181:271–80.e8.
22. Eaaswarkhanth M, Al Madhoun A, Al-Mulla F. Could the D614G substitution in the SARS-CoV-2 spike (S) protein be associated with higher COVID-19 mortality? Int J Infect Dis. 2020;96:459–60.
23. Lin M, Tseng HK, Trejaut JA, Lee HL, Loo JH, Chu CC, et al. Association of HLA class I with severe acute respiratory syndrome coronavirus infection. BMC Med Genet. 2003;4:9.
24. Ng MH, Lau KM, Li L, Cheng SH, Chan WY, Hui PK, et al.
Association of human-leukocyte-antigen class I (B*0703) and
class II (DRB1*0301) genotypes with susceptibility and resistance
to the development of severe acute respiratory syndrome. J Infect
Dis. 2004;190:5158.
25. Chen YM, Liang SY, Shih YP, Chen CY, Lee YM, Chang L,
et al. Epidemiological and genetic correlates of severe acute
respiratory syndrome coronavirus infection in the hospital with the
highest nosocomial infection rate in Taiwan in 2003. J Clin
Microbiol. 2006;44:35965.
26. Hajeer AH, Balkhy H, Johani S, Yousef MZ, Arabi Y. Associa-
tion of human leukocyte antigen class II alleles with severe
Middle East respiratory syndrome-coronavirus infection. Ann
Thorac Med. 2016;11:2113.
... Symptomatic cases report a variety of symptoms, including fever, anosmia, cough, and diarrhea; more severe cases are reported with respiratory distress, sepsis, septic shock, and death (Huang et al. 2020). Due to the diversity of symptoms, human factors such as genetics and risk factors play a critical role in the outcome of the disease (LoPresti et al. 2020; Sironi et al. 2020; Toyoshima et al. 2020). These factors tend to be population-specific, so dedicated studies are required in each geographic location. ...
... This situation reminds us that the clinical profiles depend on the viral agent and human host conditions. Human genetics, comorbidities, and risk conditions have been described as the predominant factors in the clinical outcome of COVID-19, as found in several studies (LoPresti et al. 2020; Sironi et al. 2020; Toyoshima et al. 2020; Molina-Mora et al. 2021). ...
The clinical manifestations of COVID-19, caused by SARS-CoV-2, define a large spectrum of symptoms that are mainly dependent on the human host conditions. In Costa Rica, more than 169,000 cases and 2185 deaths were reported during the year 2020, the pre-vaccination period. To describe the clinical presentations at the time of diagnosis of SARS-CoV-2 infection in Costa Rica during the pre-vaccination period, we implemented a symptom-based clustering using machine learning to identify clusters or clinical profiles at the population level among 18,974 records of positive cases. Profiles were compared based on symptoms, risk factors, viral load, and genomic features of the SARS-CoV-2 sequence. A total of 18 symptoms at time of diagnosis of SARS-CoV-2 infection were reported with a frequency >1%, and those were used to identify seven clinical profiles with a specific composition of clinical manifestations. In the comparison between clusters, a lower viral load was found for the asymptomatic group, while the risk factors and the SARS-CoV-2 genomic features were distributed among all the clusters. No other distribution patterns were found for age, sex, vital status, and hospitalization. In conclusion, during the pre-vaccination time in Costa Rica, the symptoms at the time of diagnosis of SARS-CoV-2 infection were described in clinical profiles. The host co-morbidities and the SARS-CoV-2 genotypes are not specific to a particular profile; rather, they are present in all the groups, including asymptomatic cases. In addition, this information can be used for decision-making by the local healthcare institutions (first point of contact with health professionals, case definition, or infrastructure). In further analyses, these results will be compared against the profiles of cases during the vaccination period.
... One of them was detected at position 241 (C > T) of the 5'UTR, possibly suggesting a conformational change in the RNA structure and, consequently, in the infectious function of the virus, while the other two non-synonymous mutations, 14,408 (C > T; P4715L) and 23,403 (A > G; D614G), were found in the ORF1b (nsp12) and S genes (Figure 2.6). Although our results are consistent with studies that have closely investigated the P4715L and D614G variants and shown that they play a key role in viral transmission, most of those studies analyzed the human population (Daniloski et al., 2020; Toyoshima et al., 2020; Yang et al., 2020) and, in fact, it was not known whether these mutations were part of the reservoir in intermediate animal hosts. Our study therefore suggests that the non-synonymous mutations identified in the S gene of SARS-CoV (G > T; A577S) and of SARS-CoV-2 (A > G; D614G) are directly or indirectly associated with the species jump from bats, a pattern that appears to be common among RNA viruses (Boni et al., 2020; Geoghegan et al., 2017). ...
Betacoronaviruses have caused earlier deadly epidemics, including the 2002 SARS-CoV outbreak and the ongoing prevalence of MERS-CoV, which was first detected in 2012. In late 2019, the emergence of the COVID-19 pandemic encouraged scientists around the globe to apply their respective insights to address how SARS-CoV-2 infects humans. The main strategy has been the implementation of standard health surveillance systems to identify, manage and control viral infections caused by these emerging viruses. Even though monitoring the genetic evolution of the virus has been of high significance, it remains unclear to what extent zoonotic transmission across susceptible and non-susceptible animal species is possible, and what role the structural architecture of the Betacoronavirus RNA genome plays in the pathophysiology, mainly for SARS-CoV, MERS-CoV and SARS-CoV-2. To fill this knowledge gap and facilitate the development of effective treatments, a comprehensive study of Betacoronavirus genomes was performed by means of the analysis of 1,252,952 viral sequences reported in databases which have circulated since 2002 from natural reservoirs to intermediate hosts and humans. This study includes two different approaches to represent genomic information, as introduced and discussed in Chapter 2: Sequence analyses. This part of the work represents an evolutionary analysis of horizontal transmission in viral sequences to thoroughly characterize and describe the intra-host variation and transmission routes of Betacoronavirus. The results reveal that amino acid changes within the S protein S1 subunit of SARS-CoV (G > T; A577S), MERS-CoV (C > T; S746R and C > T; N762A) and SARS-CoV-2 (A > G; D614G) with signals of positive selection are pivotal factors underlying the possible jump across the species barrier from bats to intermediate hosts.
Chapter 3: Structural analyses, is a section that explores Betacoronavirus at the structural level as a proposal to discover whether the folding of conserved RNA secondary structures may act as putative loci for processing virus-derived small RNAs, with a potential function associated with pathogenesis in the process of selection. Over 87.58% of these RNA structures indicate that 12 regions carry small RNAs in Betacoronavirus, suggesting the possibility of modulation of transcriptional re-programming of the new host upon infection. The findings of this study provide a collection of significant molecular signatures that contribute to pushing the frontiers of human therapeutics in the context of the current global health crisis. Institutional Repository of Universidad Nacional de Colombia: https://repositorio.unal.edu.co/handle/unal/81628
... Their analysis revealed that East and Southeast Asian countries had a lower fatality than did western countries. However, the analysis did not consider potential confounders such as patient age group, pandemic stage, and pandemic management 11 . ...
The coronavirus disease 2019 (COVID-19) pandemic has affected people globally. In many countries, the situation is getting worse with rapid transmission of COVID-19. The daily incidence is drastically increasing. In this review, we aim to summarize epidemiological information and adverse health outcomes of severe acute respiratory syndrome coronavirus 2 infection. The major outcomes are described based on information from the literature and hospital reports. COVID-19 has affected individuals both physiologically and psychologically. The physiological effects on human health include respiratory distress, pneumonia, cardiac injury, kidney failure, nervous system symptoms, and gastrointestinal symptoms. The psychological effects were observed in patients, health care workers, and the general population in the community. This review could be a source of clinical information on COVID-19 for physicians, researchers, and the general population for further studies to deal with this pandemic.
... Moreover, other viral proteins of SARS-CoV-2 have also been found to undergo variations during the progression of COVID-19 (Toyoshima et al., 2020;Joshi et al., 2021), whereas the roles of viral transmission, pathogenicity, and viral lifecycle of each SARS-CoV-2 viral protein and variant remain not fully understood (Ceraolo and Giorgi, 2020). Among these SARS-CoV-2 viral proteins, the accessory protein ORF8, an immunoglobulin-like protein with highly immunogenic property, remains one of the most hypervariable gene evolving proteins among betacoronaviruses and exhibits the least homology with SARS-CoV viral proteins (Ceraolo and Giorgi, 2020;Lu et al., 2020;Tan et al., 2020;Zinzula, 2021). ...
COVID-19 is currently a global pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Accompanying the rapid spread of the error-prone RNA-based genome, several dominant SARS-CoV-2 variants have been genetically identified. The mutations in the spike protein, which are essential for receptor binding and fusion, have been intensively investigated for their contributions to viral transmission. Nevertheless, the importance of other viral proteins and their mutations in the SARS-CoV-2 lifecycle and transmission remains poorly understood. Here, we report the strong potency of an accessory protein, ORF8, in modulating the level and processing of the spike protein. The expression of ORF8 protein affects not propagation but the expression of spike protein, which may lead to pseudovirions with less spike protein on the surface and therefore less infection potential. At the protein level, ORF8 expression led to downregulation and insufficient S1/S2 cleavage of the spike protein in a dose-dependent manner. ORF8 exhibits a strong interaction with the spike protein, mainly at S1 domains, and mediates its degradation through multiple pathways. The dominant clinically isolated ORF8 variants with reduced protein stability exhibited an increased capacity for viral transmission without compromising their inhibitory effects on HLA-A2. Although increases in spike protein level and spike pseudovirus production were observed with highly transmissible clinical spike variants, there was no significant compromise in ORF8-mediated downregulation. Because ORF8 is important for immune surveillance and might be required for viral fitness in vivo, the alteration of the spike protein might be an optional strategy used by SARS-CoV-2 to promote viral transmission by escaping the inhibitory effects of ORF8. Therefore, our report emphasizes the importance of ORF8 in SARS-CoV-2 spike protein production, maturation, and possible evolution.
... To safeguard our communities against worsening and future epidemics, high COVID-19 immunization rates are critical. Hundreds of millions of vaccination doses will require tremendous planning and implementation [182,183]. Despite the fact that this may be the world's largest single vaccination attempt, best practices and lessons learned in pandemic preparedness, supply chain management, distribution, and clinical practice can help us immunize against SARS-CoV-2 [184][185][186]. To successfully manage vaccine delivery and administration to hundreds of millions of people, deliberate planning and coordination with local and international partners are essential [187][188][189]. Figure 3 shows the packaging and distribution process of the vaccines from production to use. The coronavirus disease vaccine is packaged and distributed in various levels to ensure cold storage throughout the supply chain for a successful vaccination campaign (taken from ) [190]. ...
To prevent the coronavirus disease 2019 (COVID-19) pandemic and aid restoration to pre-pandemic normality, global mass vaccination is urgently needed. Inducing herd immunity through mass vaccination has proven to be a highly effective strategy for preventing the spread of many infectious diseases, which protects the most vulnerable population groups that are unable to develop immunity, such as people with immunodeficiencies or weakened immune systems due to underlying medical or debilitating conditions. In achieving global outreach, the maintenance of the vaccine potency, transportation, and needle waste generation become major issues. Moreover, needle phobia and vaccine hesitancy act as hurdles to successful mass vaccination. The use of dissolvable microneedles for COVID-19 vaccination could act as a major paradigm shift in attaining the desired goal to vaccinate billions in the shortest time possible. In addressing these points, we discuss the potential of the use of dissolvable microneedles for COVID-19 vaccination based on the current literature.
... Deletions occurring in RDRs are most notable in Alpha-originated variants (e.g., S: HV 69-70 and S: Y144 in RDR1 and RDR2, respectively), along with Beta-stemmed variants (e.g., S: LAL 242-244, RDR4) and B.1.36 (e.g., S: I210, RDR3), and lead to resistance to antibody neutralization, lost epitopes, support for evasion of the host's immune response and of vaccines, or a decline in antibody neutralization (10,49 Among the rapidly disseminating emerging variants, which include the alpha, beta, gamma, delta, kappa, eta, iota, epsilon, lambda, mu, and omicron variants, the most well-known D614G mutation (44,46,50) provides a reasonable benefit in terms of infectivity (47,51,52) and improves transmissibility (53), implying a higher fatality and infectivity rate (54)(55)(56). Similarly, the N501Y alteration observed in alpha, beta, gamma, delta, and omicron imparts better ACE2 binding, demonstrating (57-59) the massive increase in ACE2 affinity with a single RBD mutation (57). ...
The emergence of several novel SARS-CoV-2 variants regarded as variants of concern (VOCs) has exacerbated pathogenic and immunologic prominences, as well as reduced diagnostic sensitivity due to phenotype modification-capable mutations. Furthermore, latent and more virulent strains that have arisen as a result of unique mutations with increased evolutionary potential represent a threat to vaccine effectiveness in terms of incoming and existing variants. As a result, resisting natural immunity, which leads to higher reinfection rates, and avoiding vaccination-induced immunization, which leads to a lack of vaccine effectiveness, has become a crucial problem for public health around the world. This study attempts to review the genomic variation and pandemic impact of emerging variations of concern based on clinical characteristics management and immunization effectiveness. The goal of this study is to gain a better understanding of the link between genome level polymorphism, clinical symptom manifestation, and current vaccination in the instance of VOCs.
... Refined clinical management could likewise have resulted in a reduced length of hospitalization in wave 2 compared to wave 1. Third, potential changes in SARS-CoV-2 genomic variations from wave 1 to wave 2 could have influenced the disease severity among patients, as SARS-CoV-2 genomic variations have been associated with the mortality rate of COVID-19 [23]. Finally, the Danish testing strategy may have increased the number of patients with a milder course of disease during wave 2 in the present study. ...
Background Only a few studies have performed comprehensive comparisons between hospitalized patients from different waves of COVID-19. Thus, we aimed to compare the clinical characteristics and laboratory data of patients admitted to the western part of Denmark during the first and second waves of COVID-19 in 2020. Furthermore, we aimed to identify risk factors for critical COVID-19 disease and to describe the available information on the sources of infection. Methods We performed a retrospective study of medical records from 311 consecutive hospitalized patients, 157 patients from wave 1 and 154 patients from wave 2. The period from March 7 to June 30, 2020, was considered wave 1, and the period from July 1st to December 31, 2020, was considered wave 2. Data are presented as the total study population, as a comparison between waves 1 and 2, and as a comparison between patients with and without critical COVID-19 disease (nonsurvivors and patients admitted to the intensive care unit (ICU)). Results Patients admitted during the first COVID-19 wave experienced a more severe course of disease than patients admitted during wave 2. Admissions to the ICU and fatal disease were significantly higher among patients admitted during wave 1 compared to wave 2. The percentage of patients infected at hospital decreased in wave 2 compared to wave 1, whereas more patients were infected at home during wave 2. We found no significant differences in sociodemographics, lifestyle information, or laboratory data in the comparison of patients from waves 1 and 2. However, age, sex, smoking status, comorbidities, fever, and dyspnea were identified as risk factors for critical COVID-19 disease. Furthermore, we observed significantly increased levels of C-reactive protein and creatinine, and lower hemoglobin levels among patients with critical disease. Conclusions At admission, patients were more severely ill during wave 1 than during wave 2, and the outcomes were worse during wave 1. 
We confirmed previously identified risk factors for critical COVID-19 disease. In addition, we found that most COVID-19 infections were acquired at home.
... Two non-synonymous mutations of this quartet, C14408T and A23403G, which respectively correspond to the RdRp protein P4715L and S protein D614G, were found to be key mutations in this study and may impact current treatment options, while D614G was associated with an increase in human-to-human transmission efficiency [46]. Toyoshima et al. [47] also reported that they showed significant positive correlations with fatality rates, which may explain why disease severity increased with infection by these four VOCs [48]. The other two did not belong to the key mutation list in our experiment because one lies in the 5' untranslated region (C241T) and the other is a synonymous mutation with low coding impact in nsp3 (C3037T), which is known as the largest encoded multi-domain protein in CoV genera [49]. In addition, the raw sequencing data used as input for our cloud workflow open the possibility of precisely identifying existing or future SARS-CoV-2 variants, which cannot be done with the existing reverse-transcription polymerase chain reaction (RT-PCR) diagnostic assay. ...
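The correspondence between nucleotide and amino-acid coordinates quoted above (C14408T with P4715L, A23403G with D614G) can be checked with a short calculation. The sketch below assumes the coding-region coordinates of the reference genome NC_045512.2 (ORF1ab as 266..13468 joined with 13468..21555 because of the -1 ribosomal frameshift, and S starting at 21563); these constants and the function names are assumptions introduced for illustration and do not appear in the text.

```python
# Assumed 1-based coordinates from the NC_045512.2 annotation.
ORF1AB_START = 266
ORF1AB_SLIP = 13468   # nucleotide read twice because of the -1 frameshift
S_START = 21563


def residue_in_simple_orf(pos, orf_start):
    """Amino-acid index (1-based) of a genomic position in an unspliced ORF."""
    if pos < orf_start:
        raise ValueError("position lies upstream of the ORF")
    return (pos - orf_start) // 3 + 1


def residue_in_orf1ab(pos):
    """Amino-acid index (1-based) in the ORF1ab polyprotein.

    Positions before the slip site are read in the ORF1a frame. From the
    slip site onward, one extra coding nucleotide is counted because the
    ribosome steps back by one base and reads position 13468 again.
    """
    if pos < ORF1AB_START:
        raise ValueError("position lies upstream of ORF1ab")
    if pos < ORF1AB_SLIP:
        offset = pos - ORF1AB_START
    else:
        offset = (ORF1AB_SLIP - ORF1AB_START + 1) + (pos - ORF1AB_SLIP)
    return offset // 3 + 1


if __name__ == "__main__":
    print("A23403G -> S residue", residue_in_simple_orf(23403, S_START))
    print("C14408T -> ORF1ab residue", residue_in_orf1ab(14408))
```

Under these assumed coordinates, the two positions map to residue 614 of S and residue 4715 of ORF1ab, which agrees with the variant names used in the excerpt.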
Several variants of the novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) are emerging all over the world. Variant surveillance from genome sequencing has become crucial to determine if mutations in these variants are rendering the virus more infectious, potent, or resistant to existing vaccines and therapeutics. Meanwhile, repeatedly analyzing large amounts of raw sequencing data with currently available code-based bioinformatics tools is tremendously challenging during this unprecedented pandemic because experts and computational resources are limited. Therefore, in order to hasten variant surveillance efforts, we developed an installation-free cloud workflow for robust mutation profiling of SARS-CoV-2 variants from multiple Illumina sequencing data. Herein, 55 raw sequencing data sets representing four early SARS-CoV-2 variants of concern (Alpha, Beta, Gamma, and Delta) from an open-access database were used to test our workflow performance. As a result, our workflow could automatically identify mutated sites of the variants along with reliable annotation of the protein-coding genes in a cost-effective and timely manner by harnessing parallel cloud computing in one execution under resource-limited settings. In addition, our workflow can also generate a consensus genome sequence which can be shared with others in public data repositories to support global variant surveillance efforts.
Egypt is the third most densely inhabited African country. Due to the economic burden and healthcare costs of overpopulation, genomic and genetic testing is a huge challenge. However, in the era of precision medicine, Egypt is taking a shift in approach from “one-size-fits all” to more personalized healthcare via advancing the practice of medical genetics and genomics across the country. This shift necessitates concrete knowledge of the Egyptian genome and related diseases to direct effective preventive, diagnostic and counseling services of prevalent genetic diseases in Egypt. Understanding disease molecular mechanisms will enhance the capacity for personalized interventions. From this perspective, we highlight research efforts and available services for rare genetic diseases, communicable diseases including the coronavirus 2019 disease (COVID19), and cancer. The current state of genetic services in Egypt including availability and access to genetic services is described. Drivers for applying genomics in Egypt are illustrated with a SWOT analysis of the current genetic/genomic services. Barriers to genetic service development in Egypt, whether economic, geographic, cultural or educational are discussed as well. The sensitive topic of communicating genomic results and its ethical considerations is also tackled. To understand disease pathogenesis, much can be gained through the advancement and integration of genomic technologies via clinical applications and research efforts in Egypt. Three main pillars of multidisciplinary collaboration for advancing genomics in Egypt are envisaged: resources, infrastructure and training. Finally, we highlight the recent national plan to establish a genome center that will aim to prepare a map of the Egyptian human genome to discover and accurately determine the genetic characteristics of various diseases. The Reference Genome Project for Egyptians and Ancient Egyptians will initialize a new genomics era in Egypt. 
We propose a multidisciplinary governance system in Egypt to support genomic medicine research efforts and integrate into the healthcare system whilst ensuring ethical conduct of data.
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a highly transmissible coronavirus and has caused a pandemic of acute respiratory disease, named ‘coronavirus disease 2019’ (COVID-19). COVID-19 has a deep impact on public health as one of the most serious pandemics in the last century. Tracking SARS-CoV-2 is important for monitoring and assessing its evolution. This is only possible by detecting all mutations in the viral genome through genomic sequencing. Moreover, accurate detection of SARS-CoV-2 and tracking of its mutations are also required for its correct diagnosis. Potential effects of mutations on the prognosis of the disease can be observed. Assignment of epidemiological lineages in an emerging pandemic requires effort. To address this, we collected 1000 SARS-CoV-2 samples from different geographical regions in Turkey and analyzed their genomes comprehensively. To track the virus across Turkey, we focused on 10 distinct cities in different geographic regions. Each SARS-CoV-2 genome was analyzed and named according to the nomenclature system of Nextclade and Pangolin Lineage. Furthermore, the frequency of the variations observed in 10 months was also determined by region. In this way, we observed how the virus mutates and what kind of transmission mechanism it has. The effects of age and disease severity on lineage distribution were other parameters considered. The temporal rates of SARS-CoV-2 variants in Turkey were close to the global trend. This study is one of the most comprehensive whole genome analyses of SARS-CoV-2 and presents a general picture of the distribution of SARS-CoV-2 variations in Turkey in 2021. Author Summary Since the outbreak of the COVID-19 pandemic in 2019, the viral genome of SARS-CoV-2 has been analysed intensively all over the world, both to detect its zoonotic origin and to detect the emerging variants worldwide, together with the variants’ effects on the prognosis and treatment of the infection.
Remarkable COVID-19 studies were also made in Turkey as it was in the rest of the world. To date, indeed, almost all studies on COVID-19 in Turkey either sequenced only a small number of the viral genome or analysed the viral genome which was obtained from online databases. In respect thereof, our study constitutes a milestone regarding both the huge sample size consisting of 1000 viral genomes and the widespread geographic origin of the viral genome samples. Our study provides new insights both into the SARS-CoV-2 landscape of Turkey and the transmission of the emerging viral pathogen and its interaction with its vertebrate host.
New York City (NYC) has emerged as one of the epicenters of the current SARS-CoV-2 pandemic. To identify the early transmission events underlying the rapid spread of the virus in the NYC metropolitan area, we sequenced the virus causing COVID-19 in patients seeking care at the Mount Sinai Health System. Phylogenetic analysis of 84 distinct SARS-CoV-2 genomes indicates multiple, independent but isolated introductions mainly from Europe and other parts of the United States. Moreover, we find evidence for community transmission of SARS-CoV-2 as suggested by clusters of related viruses found in patients living in different neighborhoods of the city.
Increasing number of deaths due to COVID-19 pandemic has raised serious global concerns. Higher testing capacity and ample intensive care availability could explain lower mortality in some countries compared to others. Nevertheless, it is also plausible that the SARS-CoV-2 mutations giving rise to different phylogenetic clades are responsible for the obvious death disparities around the world. Current research literature linking the genetic make-up of SARS-CoV-2 with fatality is lacking. Here, we suggest that this disparity in fatality rates may be attributed to SARS-CoV-2 evolving mutations and urge the international community to begin addressing the phylogenetic clade classification of SARS-CoV-2 in relation to clinical outcomes.
Aim The COVID pandemic is caused by infection with the SARS‐CoV‐2 virus. The major mutation detected to date in the SARS‐CoV‐2 viral envelope spike protein, which is responsible for virus attachment to the host and is also the main target for host antibodies, is a mutation of an aspartate (D) at position 614 found frequently in Chinese strains to a glycine (G). We sought to infer health impact of this mutation. Result Increased case fatality rate correlated strongly with the proportion of viruses bearing G614 on a country by country basis. The amino acid at position 614 occurs at an internal protein interface of the viral spike, and the presence of G at this position was calculated to destabilize a specific conformation of the viral spike, within which the key host receptor binding site is more accessible. Conclusion These results imply that G614 is a more pathogenic strain of SARS‐CoV‐2, which may influence vaccine design. The prevalence of this form of the virus should also be included in epidemiologic models predicting the COVID‐19 health burden and fatality over time in specific regions. Physicians should be aware of this characteristic of the virus to anticipate the clinical course of infection.
To control and prevent the current COVID-19 pandemic, the development of novel vaccines is an urgent issue. In addition, we need to develop tools that can measure/monitor T-cell and B-cell responses to know how our immune system is responding to this deleterious virus. However, little information is currently available about the immune target epitopes of novel coronavirus (SARS-CoV-2) to induce host immune responses. Through a comprehensive bioinformatic screening of potential epitopes derived from the SARS-CoV-2 sequences for HLAs commonly present in the Japanese population, we identified 2013 and 1399 possible peptide epitopes that are likely to have high affinity (<0.5%- and 2%-rank, respectively) to HLA class I and II molecules, respectively, and that may induce CD8⁺ and CD4⁺ T-cell responses. These epitopes were distributed across the structural (spike, envelope, membrane, and nucleocapsid proteins) and the nonstructural proteins (proteins corresponding to six open reading frames); however, we found several regions where high-affinity epitopes were significantly enriched. By comparing the sequences of these predicted T cell epitopes to the other coronaviruses, we identified 781 HLA-class I and 418 HLA-class II epitopes that have high homologies to SARS-CoV. To further select commonly-available epitopes that would be applicable to larger populations, we calculated population coverages based on the allele frequencies of HLA molecules, and found 2 HLA-class I epitopes covering 83.8% of the Japanese population. The findings in the current study provide valuable information to design widely-available vaccine epitopes against SARS-CoV-2 and also provide useful information for monitoring T-cell responses.
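The population-coverage step mentioned in this abstract can be illustrated with a common textbook approximation. The sketch below assumes Hardy-Weinberg proportions within each HLA locus and independence between loci, so the fraction of people carrying at least one of the listed alleles is 1 minus the product over loci of (1 - sum of frequencies) squared. This is not necessarily the method the authors used, and the function name and the allele frequencies are hypothetical.

```python
def population_coverage(freqs_by_locus):
    """Fraction of a population carrying at least one of the given alleles.

    freqs_by_locus maps a locus name (e.g. "HLA-A") to a list of allele
    frequencies at that locus. Assumes Hardy-Weinberg proportions within
    each locus and independence between loci.
    """
    prob_uncovered = 1.0
    for locus, freqs in freqs_by_locus.items():
        total = sum(freqs)
        if not 0.0 <= total <= 1.0:
            raise ValueError(f"allele frequencies at {locus} must sum to [0, 1]")
        # Probability that neither chromosome carries a listed allele.
        prob_uncovered *= (1.0 - total) ** 2
    return 1.0 - prob_uncovered


if __name__ == "__main__":
    # Hypothetical allele frequencies, for illustration only.
    example = {"HLA-A": [0.36, 0.12], "HLA-B": [0.08]}
    print(f"coverage = {population_coverage(example):.3f}")
```

Linkage disequilibrium between HLA loci would violate the independence assumption, so a haplotype-based calculation may give a different figure.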
New coronavirus (SARS-CoV-2) treatments and vaccines are under development to combat COVID-19. Several approaches are being used by scientists for investigation, including (1) various small molecule approaches targeting RNA polymerase, 3C-like protease, and RNA endonuclease; and (2) exploration of antibodies obtained from convalescent plasma from patients who have recovered from COVID-19. The coronavirus genome is highly prone to mutations that lead to genetic drift and escape from immune recognition; thus, it is imperative that sub-strains with different mutations are also accounted for during vaccine development. As the disease has grown to become a pandemic, B-cell and T-cell epitopes predicted from SARS coronavirus have been reported. Using the epitope information along with variants of the virus, we have found several variants that might cause drift. Among such variants, the 23403A>G variant (p.D614G) in a spike protein B-cell epitope is observed frequently in European countries, such as the Netherlands, Switzerland, and France, but seldom observed in China.
In a phylogenetic network analysis of 160 complete human severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genomes, we find three central variants distinguished by amino acid changes, which we have named A, B, and C, with A being the ancestral type according to the bat outgroup coronavirus. The A and C types are found in significant proportions outside East Asia, that is, in Europeans and Americans. In contrast, the B type is the most common type in East Asia, and its ancestral genome appears not to have spread outside East Asia without first mutating into derived B types, pointing to founder effects or immunological or environmental resistance against this type outside Asia. The network faithfully traces routes of infections for documented coronavirus disease 2019 (COVID-19) cases, indicating that phylogenetic networks can likewise be successfully used to help trace undocumented COVID-19 infection sources, which can then be quarantined to prevent recurrent spread of the disease worldwide.
The COVID-19 pandemic caused by the novel coronavirus SARS-CoV-2 has spread globally, with >52,000 cases in California as of May 4, 2020. Here we investigate the genomic epidemiology of SARS-CoV-2 in Northern California from late January to mid-March 2020, using samples from 36 patients spanning 9 counties and the Grand Princess cruise ship. Phylogenetic analyses revealed the cryptic introduction of at least 7 different SARS-CoV-2 lineages into California, including epidemic WA1 strains associated with Washington State, with lack of a predominant lineage and limited transmission between communities. Lineages associated with outbreak clusters in 2 counties were defined by a single base substitution in the viral genome. These findings support contact tracing, social distancing, and travel restrictions to contain SARS-CoV-2 spread in California and other states.