Uncanny similarity of unique inserts in the 2019-nCoV spike protein to HIV-1 gp120 and Gag

Prashant Pradhan$1,2, Ashutosh Kumar Pandey$1, Akhilesh Mishra$1, Parul Gupta1, Praveen
Kumar Tripathi1, Manoj Balakrishnan Menon1, James Gomes1, Perumal Vivekanandan*1 and
Bishwajit Kundu*1
1 Kusuma School of Biological Sciences, Indian Institute of Technology, New Delhi-110016, India.
2 Acharya Narendra Dev College, University of Delhi, New Delhi-110019, India.
$ Equal contribution
* Corresponding authors, email: bkundu@bioschool.iitd.ac.in, vperumal@bioschool.iitd.ac.in
Abstract:
We are currently witnessing a major epidemic caused by the 2019 novel coronavirus (2019-nCoV).
The evolution of 2019-nCoV remains elusive. We found 4 insertions in the spike glycoprotein (S)
which are unique to 2019-nCoV and are not present in other coronaviruses. Importantly, amino
acid residues in all 4 inserts have identity or similarity to those in HIV-1 gp120 or HIV-1 Gag.
Interestingly, although the inserts are discontinuous in the primary amino acid sequence,
3D modelling of the 2019-nCoV spike glycoprotein suggests that they converge to constitute the
receptor binding site. The finding of 4 unique inserts in 2019-nCoV, all of which have
identity/similarity to amino acid residues in key structural proteins of HIV-1, is unlikely to be
fortuitous in nature. This work provides previously unknown insights on 2019-nCoV and sheds
light on the evolution and pathogenicity of this virus, with important implications for its diagnosis.
Introduction
Coronaviruses (CoV) are single-stranded positive-sense RNA viruses that infect animals and
humans. They are classified into 4 genera based on their host specificity: Alphacoronavirus,
Betacoronavirus, Deltacoronavirus and Gammacoronavirus (Snijder et al., 2006). There are seven
known types of CoVs, which include 229E and NL63 (genus Alphacoronavirus) and OC43, HKU1,
MERS and SARS (genus Betacoronavirus). While 229E, NL63, OC43 and HKU1 commonly
infect humans, the SARS and MERS outbreaks in 2002 and 2012, respectively, occurred when the
virus crossed over from animals to humans, causing significant mortality (J. Chan et al., n.d.;
J. F. W. Chan et al., 2015). In December 2019, another coronavirus outbreak was reported from
Wuhan, China, which also involved transmission from animals to humans. This new virus has been
temporarily termed 2019-novel Coronavirus (2019-nCoV) by the World Health Organization (WHO)
(J. F.-W. Chan et al., 2020; Zhu et al., 2020). While there are several hypotheses about the origin
of 2019-nCoV, the source of this ongoing outbreak remains elusive.
The transmission patterns of 2019-nCoV are similar to those documented in previous outbreaks,
including transmission by bodily or aerosol contact with persons infected with the virus.
bioRxiv preprint first posted online Jan. 31, 2020; doi: http://dx.doi.org/10.1101/2020.01.30.927871. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Cases of mild to severe illness, and death from the infection, have been reported from Wuhan. This
outbreak has spread rapidly to distant nations including France, Australia and the USA, among
others. The number of cases within and outside China is increasing steeply. Our current
understanding is limited to the virus genome sequences and modest epidemiological and clinical
data. Comprehensive analysis of the available 2019-nCoV sequences may provide important clues
that help advance our understanding and manage the ongoing outbreak.
The spike glycoprotein (S) of coronaviruses is cleaved into two subunits (S1 and S2). The S1
subunit helps in receptor binding and the S2 subunit facilitates membrane fusion (Bosch et al.,
2003; Li, 2016). The spike glycoproteins of coronaviruses are important determinants of tissue
tropism and host range. In addition, the spike glycoproteins are critical targets for vaccine
development (Du et al., 2013). For this reason, the spike proteins are the most extensively
studied proteins of coronaviruses. We therefore sought to investigate the spike glycoprotein of
2019-nCoV to understand its evolution and its novel sequence and structural features using
computational tools.
Methodology
Retrieval and alignment of nucleic acid and protein sequences
We retrieved all the available coronavirus sequences (n=55) from the NCBI viral genome database
(https://www.ncbi.nlm.nih.gov/), and we used GISAID (Elbe & Buckland-Merrett, 2017)
[https://www.gisaid.org/] to retrieve all available full-length sequences (n=28) of 2019-nCoV
as of 27 Jan 2020. Multiple sequence alignment of all coronavirus genomes was performed
using MUSCLE software (Edgar, 2004) based on the neighbour-joining method. Out of the 55
coronavirus genomes, 32 representative genomes of all categories were used for phylogenetic tree
development using MEGA X software (Kumar et al., 2018). The closest relative was found to be
SARS CoV. The glycoprotein regions of SARS CoV and 2019-nCoV were aligned and visualized
using Multalin software (Corpet, 1988). The identified amino acid and nucleotide sequences were
aligned against the whole viral genome database using BLASTp and BLASTn. The conservation of
the nucleotide and amino acid motifs in the 28 clinical variants of the 2019-nCoV genome was
presented by performing multiple sequence alignment using MEGA X software. The three-dimensional
structure of the 2019-nCoV glycoprotein was generated using the SWISS-MODEL online server
(Biasini et al., 2014), and the structure was marked and visualized using PyMol (DeLano, 2002).
Results
Uncanny similarity of novel inserts in the 2019-nCoV spike protein to HIV-1 gp120 and Gag
Our phylogenetic tree of full-length coronaviruses suggests that 2019-nCoV is closely related to
SARS CoV [Fig.1]. In addition, other recent studies have linked 2019-nCoV to SARS CoV.
We therefore compared the spike glycoprotein sequence of 2019-nCoV to that of SARS
CoV (NCBI Accession number: AY390556.1). On careful examination of the sequence
alignment we found that the 2019-nCoV spike glycoprotein contains 4 insertions [Fig.2]. To
further investigate whether these inserts are present in any other coronavirus, we performed a multiple
sequence alignment of the spike glycoprotein amino acid sequences of all available
coronaviruses (n=55) [refer Table S.File1] in NCBI RefSeq (ncbi.nlm.nih.gov); this includes one
sequence of 2019-nCoV [Fig.S1]. We found that these 4 insertions [inserts 1, 2, 3 and 4] are
unique to 2019-nCoV and are not present in the other coronaviruses analyzed. Another group from
China had documented three insertions comparing fewer spike glycoprotein sequences of
coronaviruses (Zhou et al., 2020).
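The comparison described above amounts to reading insertions off a gapped alignment: columns where the reference sequence has a gap and the query has a residue. The following is a minimal sketch of that idea, not the Multalin or MUSCLE workflow used in this work. The function name `find_insertions` and the toy aligned strings are hypothetical and serve only as an illustration.

```python
from typing import List, Tuple


def find_insertions(aligned_query: str, aligned_ref: str,
                    gap: str = "-") -> List[Tuple[int, str]]:
    """Return (1-based start in the ungapped query, inserted residues) for
    every run of columns where the reference is gapped and the query is not."""
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("aligned sequences must have the same length")
    insertions: List[Tuple[int, str]] = []
    query_pos = 0              # number of query residues consumed so far
    run_start, run = None, []  # current run of inserted residues
    for q, r in zip(aligned_query, aligned_ref):
        if q != gap:
            query_pos += 1
        if r == gap and q != gap:
            if run_start is None:
                run_start = query_pos
            run.append(q)
        elif run:
            # The run of inserted columns has ended; record it.
            insertions.append((run_start, "".join(run)))
            run_start, run = None, []
    if run:
        insertions.append((run_start, "".join(run)))
    return insertions


# Hypothetical toy alignment: the query carries six extra residues.
toy_query = "MKTNGTKRFD"
toy_ref = "MK------FD"
print(find_insertions(toy_query, toy_ref))  # [(3, 'TNGTKR')]
```

Positions are reported in the coordinates of the ungapped query, which is the convention used for the insert positions in Table 1.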
Figure 1: Maximum likelihood genealogy showing the evolution of 2019-nCoV. The evolutionary history
was inferred by using the Maximum Likelihood method and JTT matrix-based model. The tree
with the highest log likelihood (12458.88) is shown. Initial tree(s) for the heuristic search were
obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise
distances estimated using a JTT model, and then selecting the topology with superior log likelihood
value. This analysis involved 5 amino acid sequences. There were a total of 1387 positions in the
final dataset. Evolutionary analyses were conducted in MEGA X.
Figure 2: Multiple sequence alignment between spike proteins of 2019-nCoV and SARS CoV. The
sequences of spike proteins of 2019-nCoV (Wuhan-HU-1, Accession NC_045512) and of SARS
CoV (GZ02, Accession AY390556) were aligned using Multalin software. The sites of difference
are highlighted in boxes.
We then analyzed all available full-length sequences (n=28) of 2019-nCoV in GISAID (Elbe &
Buckland-Merrett, 2017) as of January 27, 2020 for the presence of these inserts. As most of these
sequences are not annotated, we compared the nucleotide sequences of the spike glycoprotein of
all available 2019-nCoV sequences using BLASTp. Interestingly, all 4 insertions were
absolutely (100%) conserved in all the available 2019-nCoV sequences analyzed [Fig.S2, Fig.S3].
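A conservation check of this kind can be sketched as the fraction of isolates that contain each motif as an exact substring. This is only an illustration under stated assumptions: the helper `motif_conservation` and the toy isolate strings are hypothetical, only the ungapped motifs of inserts 1 and 2 from Table 1 are used, and exact substring matching stands in for the BLAST and MEGA X alignments used in this work.

```python
from typing import Dict, Iterable


def motif_conservation(sequences: Iterable[str],
                       motifs: Iterable[str]) -> Dict[str, float]:
    """Fraction of sequences that contain each motif as an exact substring."""
    seqs = list(sequences)
    if not seqs:
        raise ValueError("at least one sequence is required")
    return {m: sum(m in s for s in seqs) / len(seqs) for m in motifs}


# Motifs of inserts 1 and 2 as listed in Table 1.
motifs = ["TNGTKR", "HKNNKS"]

# Hypothetical toy fragments standing in for translated spike sequences.
toy_isolates = [
    "AATNGTKRCCHKNNKSDD",
    "GGTNGTKRLLHKNNKSEE",
    "TNGTKRHKNNKS",
]
print(motif_conservation(toy_isolates, motifs))
# {'TNGTKR': 1.0, 'HKNNKS': 1.0}
```

A value of 1.0 for every motif corresponds to the 100% conservation reported above. Motifs that align with gaps would need an alignment-based comparison instead.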
We then translated the aligned genomes and found that these inserts are present in all Wuhan
2019-nCoV viruses except the 2019-nCoV virus with a bat as host [Fig.S4]. Intrigued by the 4 highly
conserved inserts unique to 2019-nCoV, we wanted to understand their origin. For this purpose,
we performed a local alignment with each 2019-nCoV insert as query against all virus genomes and
considered hits with 100% sequence coverage. Surprisingly, each of the four inserts aligned with
short segments of Human Immunodeficiency Virus-1 (HIV-1) proteins. The amino acid
positions of the inserts in 2019-nCoV and the corresponding residues in HIV-1 gp120 and HIV-1
Gag are shown in Table 1. The first 3 inserts (inserts 1, 2 and 3) aligned to short segments of
amino acid residues in HIV-1 gp120. Insert 4 aligned to HIV-1 Gag. Insert 1 (6 amino acid
residues) and insert 2 (6 amino acid residues) in the spike glycoprotein of 2019-nCoV are 100%
identical to the residues mapped to HIV-1 gp120. Insert 3 (12 amino acid residues) in
2019-nCoV maps to HIV-1 gp120 with gaps [see Table 1]. Insert 4 (8 amino acid residues) maps to
HIV-1 Gag with gaps.
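The notion of a hit with 100% query coverage and a given identity can be illustrated with an ungapped sliding-window scan. This is a deliberately simpler stand-in for BLASTp, which uses seeded, scored local alignment and allows gaps. The function `best_ungapped_match` and the toy subject string are hypothetical; only the query motif TNGTKR comes from Table 1.

```python
from typing import Tuple


def best_ungapped_match(query: str, subject: str) -> Tuple[int, float]:
    """Slide the query along the subject without gaps and return the 1-based
    start and identity fraction of the best window. Every window spans the
    whole query, so query coverage is 100% by construction."""
    if not query or len(subject) < len(query):
        raise ValueError("subject must be at least as long as a non-empty query")
    best_start, best_identity = 0, -1.0
    for start in range(len(subject) - len(query) + 1):
        window = subject[start:start + len(query)]
        matches = sum(a == b for a, b in zip(query, window))
        identity = matches / len(query)
        if identity > best_identity:  # keep the first best window on ties
            best_start, best_identity = start + 1, identity
    return best_start, best_identity


# Hypothetical subject fragment embedding the insert 1 motif.
toy_subject = "AAAATNGTKRCCC"
print(best_ungapped_match("TNGTKR", toy_subject))  # (5, 1.0)
```

Inserts 3 and 4, which map with gaps, cannot be handled by an ungapped scan and would need a gapped local alignment.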
Although the 4 inserts represent discontiguous short stretches of amino acids in the spike
glycoprotein of 2019-nCoV, the fact that all four of them share amino acid identity or similarity
with HIV-1 gp120 and HIV-1 Gag (among all annotated virus proteins) suggests that this is not a
random fortuitous finding. In other words, one may sporadically expect a fortuitous match for a
stretch of 6-12 contiguous amino acid residues in an unrelated protein. However, it is unlikely
that all 4 inserts in the 2019-nCoV spike glycoprotein fortuitously match 2 key structural
proteins of an unrelated virus (HIV-1).
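How often a fortuitous match might be expected depends on the motif length and the size of the database searched. The back-of-the-envelope sketch below is not part of the analysis reported here. It assumes uniform, independent residue frequencies over a 20-letter alphabet and exact ungapped matching. The function name and the database sizes are hypothetical.

```python
def expected_chance_hits(motif_length: int, database_residues: int,
                         alphabet_size: int = 20) -> float:
    """Expected number of exact matches to one fixed motif of the given
    length in a random sequence, assuming uniform independent residues."""
    if motif_length <= 0 or database_residues < 0:
        raise ValueError("motif_length must be positive and database_residues non-negative")
    # Number of start positions at which a full-length window fits.
    windows = max(database_residues - motif_length + 1, 0)
    return windows / alphabet_size ** motif_length


# Hypothetical database sizes, in residues, for the 6 to 12 residue range.
for n in (10**6, 10**8):
    print(n, expected_chance_hits(6, n), expected_chance_hits(12, n))
```

Under these assumptions the expectation for a 6-residue motif grows linearly with database size and reaches about one match at roughly 20^6 = 64 million residues. Real residue frequencies are not uniform, so the figures are only indicative.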
The amino acid residues of inserts 1, 2 and 3 of the 2019-nCoV spike glycoprotein that mapped to
HIV-1 were part of the V4, V5 and V1 domains, respectively, in gp120 [Table 1]. Since the
2019-nCoV inserts mapped to variable regions of HIV-1, they were not ubiquitous in HIV-1 gp120, but
were limited to selected sequences of HIV-1 [refer S.File1], primarily from Asia and Africa.
The HIV-1 Gag protein enables interaction of the virus with the negatively charged host surface
(Murakami, 2008), and a high positive charge on the Gag protein is a key feature for the host-virus
interaction. On analyzing the pI values for each of the 4 inserts in 2019-nCoV and the
corresponding stretches of amino acid residues from HIV-1 proteins, we found that (a) the pI values
were very similar for each pair analyzed and (b) most of these pI values were 10±2 [refer Table 1].
Of note, despite the gaps in inserts 3 and 4, the pI values were comparable. This uniformity in the
pI values for all the 4 inserts merits further investigation.
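The positive charge of a short peptide can be approximated by counting basic and acidic residues. The sketch below is an assumed counting scheme, not necessarily the method used to produce the Total charge column of Table 1: lysine and arginine count as +1, aspartate and glutamate as -1, and histidine and the termini are treated as neutral. Computing pI values would need a pKa-based calculation, which is not attempted here.

```python
POSITIVE = set("KR")  # lysine, arginine
NEGATIVE = set("DE")  # aspartate, glutamate


def net_charge(peptide: str) -> int:
    """Approximate net charge near neutral pH by residue counting.
    Gap characters and all other residues contribute zero."""
    residues = peptide.upper()
    positives = sum(r in POSITIVE for r in residues)
    negatives = sum(r in NEGATIVE for r in residues)
    return positives - negatives


# Ungapped motifs of inserts 1, 2 and 4 from Table 1, and the HIV-1 Gag
# segment aligned to insert 4.
for peptide in ("TNGTKR", "HKNNKS", "QTNSPRRA", "QTNSSILMQRSNFKGPRRA"):
    print(peptide, net_charge(peptide))
# TNGTKR 2
# HKNNKS 2
# QTNSPRRA 2
# QTNSSILMQRSNFKGPRRA 4
```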
As none of these 4 inserts is present in any other coronavirus, the genomic regions encoding these
inserts represent ideal candidates for designing primers that can distinguish 2019-nCoV from other
coronaviruses.
| Motifs | Virus glycoprotein | Positions | Motif alignment | HIV protein and variable region | HIV genome source country/subtype | Total charge | pI value |
|---|---|---|---|---|---|---|---|
| Insert 1 | 2019-nCoV (GP) | 71-76 | TNGTKR | | | 2 | 11 |
| | HIV1 (GP120) | 404-409 | TNGTKR | gp120-V4 | Thailand*/CRF01_AE | 2 | 11 |
| Insert 2 | 2019-nCoV (GP) | 145-150 | HKNNKS | | | 2 | 10 |
| | HIV1 (GP120) | 462-467 | HKNNKS | gp120-V5 | Kenya*/G | 2 | 10 |
| Insert 3 | 2019-nCoV (GP) | 245-256 | RSYL----TPGDSSSG | | | 2 | 10.84 |
| | HIV1 (GP120) | 136-150 | RTYLFNETRGNSSSG | gp120-V1 | India*/C | 1 | 8.75 |
| Insert 4 | 2019-nCoV (Poly P) | 676-684 | QTNS-----------------------PRRA | | | 2 | 12.00 |
| | HIV1 (gag) | 366-384 | QTNSSILMQRSNFKGPRRA | Gag | India*/C | 4 | 12.30 |

Table 1: Aligned sequences of 2019-nCoV and gp120 protein of HIV-1 with their positions
in primary sequence of protein. All the inserts have a high density of positively charged
residues. The deleted fragments in insert 3 and 4 increase the positive charge to surface area
ratio. *please see Supp. Table 1 for accession numbers
The novel inserts are part of the receptor binding site of 2019-nCoV
To gain structural insights and to understand the role of these insertions in the 2019-nCoV
glycoprotein, we modelled its structure based on the available structure of the SARS spike
glycoprotein (PDB: 6ACD.1.A). Comparison of the modelled structure reveals that although inserts
1, 2 and 3 are at non-contiguous locations in the protein primary sequence, they fold to constitute
part of the glycoprotein binding site that recognizes the host receptor (Kirchdoerfer et al., 2016)
(Figure 3). Insert 1 corresponds to the NTD (N-terminal domain) and inserts 2 and 3 correspond to
the CTD (C-terminal domain) of the S1 subunit in the 2019-nCoV spike glycoprotein. Insert 4 is at
the junction of SD1 (sub domain 1) and SD2 (sub domain 2) of the S1 subunit (Ou et al., 2017).
We speculate that these insertions provide additional flexibility to the glycoprotein
binding site by forming a hydrophilic loop in the protein structure that may facilitate or enhance
virus-host interactions.
Figure 3. Modelled homo-trimer spike glycoprotein of the 2019-nCoV virus. The inserts from the HIV
envelope protein are shown as colored beads, present at the binding site of the protein.
Evolutionary Analysis of 2019-nCoV
It has been speculated that 2019-nCoV is a variant of coronavirus derived from an animal source
which was transmitted to humans. Considering the change of host specificity, we decided to
study the sequences of the spike glycoprotein (S protein) of the virus. S proteins are surface
proteins that help the virus in host recognition and attachment. Thus, a change in these proteins
can be reflected as a change in the host specificity of the virus. To identify the alterations in
the S protein gene of 2019-nCoV and their consequences for structural rearrangements, we performed
an in silico analysis of 2019-nCoV with respect to all other viruses. A multiple sequence alignment
between the S protein amino acid sequences of 2019-nCoV, Bat-SARS-Like, SARS-GZ02 and MERS revealed
that the S protein of 2019-nCoV shares its closest ancestry with that of SARS-GZ02 (Figure 1).
Insertions in Spike protein region of 2019-nCoV
Since the S protein of 2019-nCoV shares closest ancestry with SARS GZ02, the sequences coding
for the spike proteins of these two viruses were compared using Multalin software. We found four
new insertions in the protein of 2019-nCoV: “GTNGTKR” (IS1), “HKNNKS” (IS2), “GDSSSG”
(IS3) and “QTNSPRRA” (IS4) (Figure 2). To our surprise, these sequence insertions were not only
absent in the S protein of SARS but were also not observed in any other member of the Coronaviridae
family (Supplementary figure). This is startling, as it is quite unlikely for a virus to have
acquired such unique insertions naturally in a short duration of time.
Insertions share similarity to HIV
The insertions were observed to be present in all the genomic sequences of the 2019-nCoV virus
available from the recent clinical isolates (Supplementary Figure 1). To identify the source of
these insertions in 2019-nCoV, a local alignment was done with BLASTp using these insertions as
queries against all virus genomes. Unexpectedly, all the insertions aligned with Human
Immunodeficiency Virus-1 (HIV-1). Further analysis revealed that the aligned sequences of HIV-1
were derived from the surface glycoprotein gp120 (amino acid sequence positions: 404-409, 462-467,
136-150) and from the Gag protein (amino acids 366-384) (Table 1). The Gag protein of HIV is
involved in host membrane binding, packaging of the virus and the formation of virus-like
particles. gp120 plays a crucial role in recognizing the host cell by binding to the primary
receptor CD4. This binding induces structural rearrangements in gp120, creating a high-affinity
binding site for a chemokine co-receptor such as CXCR4 and/or CCR5.
Discussion
The current outbreak of 2019-nCoV warrants a thorough investigation and understanding of its
ability to infect human beings. Keeping in mind that there has been a clear change in host
preference from previous coronaviruses to this virus, we studied the change in the spike protein
between 2019-nCoV and other viruses. We found four new insertions in the S protein of 2019-nCoV
when compared to its nearest relative, SARS CoV. The genome sequences from the 28 recent clinical
isolates showed that the sequences coding for these insertions are conserved amongst all these
isolates. This indicates that these insertions have been preferentially acquired by 2019-nCoV,
providing it with an additional survival and infectivity advantage. Delving deeper, we found that
these insertions were similar to HIV-1. Our results highlight an astonishing relation between the
gp120 and Gag proteins of HIV and the 2019-nCoV spike glycoprotein. These proteins are critical
for the viruses to identify and latch on to their host cells and for viral assembly (Beniac et al.,
2006). Since surface proteins are responsible for host tropism, changes in these proteins imply a
change in the host specificity of the virus. According to reports from China, there has been a
gain of host specificity in the case of 2019-nCoV, as the virus was originally known to infect
animals and not humans, but after the mutations it has gained tropism to humans as well.
Moving ahead, 3D modelling of the protein structure showed that these insertions are present at
the binding site of 2019-nCoV. Due to the presence of gp120 motifs in the 2019-nCoV spike
glycoprotein at its binding domain, we propose that these motif insertions could have provided an
enhanced affinity towards host cell receptors. Further, this structural change might also have
increased the range of host cells that 2019-nCoV can infect. To the best of our knowledge, the
function of these motifs is still not clear in HIV and needs to be explored. The exchange of
genetic material among viruses is well known, and such a critical exchange highlights the risk and
the need to investigate the relations between seemingly unrelated virus families.
Conclusions
Our analysis of the spike glycoprotein of 2019-nCoV revealed several interesting findings. First,
we identified 4 unique inserts in the 2019-nCoV spike glycoprotein that are not present in any
other coronavirus reported to date. To our surprise, all the 4 inserts in 2019-nCoV mapped to
short segments of amino acids in HIV-1 gp120 and Gag among all annotated virus proteins in
the NCBI database. This uncanny similarity of novel inserts in the 2019-nCoV spike protein to
HIV-1 gp120 and Gag is unlikely to be fortuitous. Further, 3D modelling suggests that at least 3 of
the unique inserts, which are non-contiguous in the primary protein sequence of the 2019-nCoV
spike glycoprotein, converge to constitute the key components of the receptor binding site. Of note,
all the 4 inserts have pI values of around 10, which may facilitate virus-host interactions. Taken
together, our findings suggest unconventional evolution of 2019-nCoV that warrants further
investigation. Our work highlights novel evolutionary aspects of 2019-nCoV and has
implications for the pathogenesis and diagnosis of this virus.
References
Beniac, D. R., Andonov, A., Grudeski, E., & Booth, T. F. (2006). Architecture of the SARS coronavirus
prefusion spike. Nature Structural and Molecular Biology, 13(8), 751-752.
https://doi.org/10.1038/nsmb1123
Biasini, M., Bienert, S., Waterhouse, A., Arnold, K., Studer, G., Schmidt, T., Kiefer, F., Cassarino, T. G.,
Bertoni, M., Bordoli, L., & Schwede, T. (2014). SWISS-MODEL: Modelling protein tertiary and
quaternary structure using evolutionary information. Nucleic Acids Research.
https://doi.org/10.1093/nar/gku340
Bosch, B. J., van der Zee, R., de Haan, C. A. M., & Rottier, P. J. M. (2003). The Coronavirus Spike Protein Is
a Class I Virus Fusion Protein: Structural and Functional Characterization of the Fusion Core
Complex. Journal of Virology, 77(16), 8801-8811.
https://doi.org/10.1128/jvi.77.16.8801-8811.2003
Chan, J. F.-W., Kok, K.-H., Zhu, Z., Chu, H., To, K. K.-W., Yuan, S., & Yuen, K.-Y. (2020). Genomic
characterization of the 2019 novel human-pathogenic coronavirus isolated from a patient with
atypical pneumonia after visiting Wuhan. Emerging Microbes & Infections, 9(1), 221-236.
https://doi.org/10.1080/22221751.2020.1719902
Chan, J. F. W., Lau, S. K. P., To, K. K. W., Cheng, V. C. C., Woo, P. C. Y., & Yuen, K.-Y. (2015). Middle East
Respiratory Syndrome Coronavirus: Another Zoonotic Betacoronavirus Causing SARS-Like Disease.
https://doi.org/10.1128/CMR.00102-14
Chan, J., To, K., Tse, H., Jin, D., & Yuen, K. Y. (n.d.). Interspecies transmission and emergence of
novel viruses: lessons from bats and birds. Trends in Microbiology, 2013. Elsevier.
Corpet, F. (1988). Multiple sequence alignment with hierarchical clustering. Nucleic Acids Research.
https://doi.org/10.1093/nar/16.22.10881
DeLano, W. L. (2002). The PyMOL Molecular Graphics System, Version 1.1. Schrödinger LLC.
https://doi.org/10.1038/hr.2014.17
Du, L., Zhao, G., Kou, Z., Ma, C., Sun, S., Poon, V. K. M., Lu, L., Wang, L., Debnath, A. K., Zheng, B.-J., Zhou,
Y., & Jiang, S. (2013). Identification of a Receptor-Binding Domain in the S Protein of the Novel
Human Coronavirus Middle East Respiratory Syndrome Coronavirus as an Essential Target for
Vaccine Development. Journal of Virology, 87(17), 9939-9942.
https://doi.org/10.1128/jvi.01048-13
Edgar, R. C. (2004). MUSCLE: Multiple sequence alignment with high accuracy and high throughput.
Nucleic Acids Research. https://doi.org/10.1093/nar/gkh340
Elbe, S., & Buckland-Merrett, G. (2017). Data, disease and diplomacy: GISAID’s innovative contribution
to global health. Global Challenges. https://doi.org/10.1002/gch2.1018
Kirchdoerfer, R. N., Cottrell, C. A., Wang, N., Pallesen, J., Yassine, H. M., Turner, H. L., Corbett, K. S.,
Graham, B. S., McLellan, J. S., & Ward, A. B. (2016). Pre-fusion structure of a human coronavirus
spike protein. Nature. https://doi.org/10.1038/nature17200
Kumar, S., Stecher, G., Li, M., Knyaz, C., & Tamura, K. (2018). MEGA X: Molecular evolutionary genetics
analysis across computing platforms. Molecular Biology and Evolution.
https://doi.org/10.1093/molbev/msy096
Li, F. (2016). Structure, Function, and Evolution of Coronavirus Spike Proteins. Annual Review of
Virology, 3(1), 237-261. https://doi.org/10.1146/annurev-virology-110615-042301
Murakami, T. (2008). Roles of the interactions between Env and Gag proteins in the HIV-1 replication
cycle. Microbiology and Immunology, 52(5), 287-295.
https://doi.org/10.1111/j.1348-0421.2008.00008.x
Ou, X., Guan, H., Qin, B., Mu, Z., Wojdyla, J. A., Wang, M., Dominguez, S. R., Qian, Z., & Cui, S. (2017).
Crystal structure of the receptor binding domain of the spike glycoprotein of human
betacoronavirus HKU1. Nature Communications. https://doi.org/10.1038/ncomms15216
Snijder, E. J., van der Meer, Y., Zevenhoven-Dobbe, J., Onderwater, J. J. M., van der Meulen, J., Koerten,
H. K., & Mommaas, A. M. (2006). Ultrastructure and origin of membrane vesicles associated with
the severe acute respiratory syndrome coronavirus replication complex. Journal of Virology,
80(12), 5927-5940. https://doi.org/10.1128/JVI.02501-05
Zhou, P., Yang, X.-L., Wang, X.-G., Hu, B., Zhang, L., Zhang, W., Si, H.-R., Zhu, Y., Li, B., Huang, C.-L., Chen,
H.-D., Chen, J., Luo, Y., Guo, H., Jiang, R.-D., Liu, M.-Q., Chen, Y., Shen, X.-R., Wang, X., … Shi, Z.-L.
(2020). Discovery of a novel coronavirus associated with the recent pneumonia outbreak in
humans and its potential bat origin. BioRxiv. https://doi.org/10.1101/2020.01.22.914952
Zhu, N., Zhang, D., Wang, W., Li, X., Yang, B., Song, J., Zhao, X., Huang, B., Shi, W., Lu, R., Niu, P., Zhan, F.,
Ma, X., Wang, D., Xu, W., Wu, G., Gao, G. F., & Tan, W. (2020). A Novel Coronavirus from Patients
with Pneumonia in China, 2019. New England Journal of Medicine, NEJMoa2001017.
https://doi.org/10.1056/NEJMoa2001017
Fig.S1: Multiple sequence alignment of the glycoproteins of the Coronaviridae family, showing all
four inserts.
Fig.S2: All four inserts are present in the aligned 28 Wuhan 2019-nCoV virus genomes obtained
from GISAID. The gap in the Bat-SARS Like CoV in the last row shows that inserts 1 and 4 are
unique to Wuhan 2019-nCoV.
Fig.S3: Phylogenetic tree of the genomes of 28 clinical isolates of 2019-nCoV, including one from a bat host.
Fig.S4: Genome alignment of the Coronaviridae family. The sequences highlighted in black are the
inserts.
... First, it is optimized for human Angiotensin-converting enzyme 2 receptor (hACE2) binding [50]. Second, the Spike bears 4 HIV-like inserts with a high-density positive charge, very similar to HIV-1 surface proteins gp120 and Gag (as stated by Pradhan and Zhang in papers that were retracted for other reasons [51,52]), absent in other coronaviruses (absent even in SARS). For this reason, SARS-CoV-2 Spike, similar to HIV, binds Top. ...
... First, it is optimized for human Angiotensin-converting enzyme 2 receptor (hACE2) binding [50]. Second, the Spike bears 4 HIV-like inserts with a high-density positive charge, very similar to HIV-1 surface proteins gp120 and Gag (as stated by Pradhan and Zhang in papers that were retracted for other reasons [51,52]), absent in other coronaviruses (absent even in SARS). For this reason, SARS-CoV-2 Spike, similar to HIV, binds CLEC4M (or CD299) and DC-SIGNR (or CD209), facilitating the infection of the immune system [53]. ...
Article
Full-text available
Despite what its name suggests, the effects of the COVID-19 pandemic causative agent “Severe Acute Respiratory Syndrome Coronavirus-2” (SARS-CoV-2) were not always confined, neither temporarily (being long-term rather than acute, referred to as Long COVID) nor spatially (affecting several body systems). Moreover, the in-depth study of this ss(+) RNA virus is defying the established scheme according to which it just had a lytic cycle taking place confined to cell membranes and the cytoplasm, leaving the nucleus basically “untouched”. Cumulative evidence shows that SARS-CoV-2 components disturb the transport of certain proteins through the nuclear pores. Some SARS-CoV-2 structural proteins such as Spike (S) and Nucleocapsid (N), most non-structural proteins (remarkably, Nsp1 and Nsp3), as well as some accessory proteins (ORF3d, ORF6, ORF9a) can reach the nucleoplasm either due to their nuclear localization signals (NLS) or taking a shuttle with other proteins. A percentage of SARS-CoV-2 RNA can also reach the nucleoplasm. Remarkably, controversy has recently been raised by proving that-at least under certain conditions-, SARS-CoV-2 sequences can be retrotranscribed and inserted as DNA in the host genome, giving rise to chimeric genes. In turn, the expression of viral-host chimeric proteins could potentially create neo-antigens, activate autoimmunity and promote a chronic pro-inflammatory state.
... Figure taken from the retracted article by Pradhan et al.[85]: model of the SARS-CoV-2 spike glycoprotein (called 2019-nCoV at the time of publication). The four sequences derived from HIV are shown as colored beads: three sequences from HIV Gp120 that bind DC-SIGN are found at the binding sites of the SARS-CoV-2 spike. ...
... The fourth sequence has homology with the HIV gag protein and concerns the furin site. Yet sequences too short to have biological activity could be precisely the trace left by this research. As for their specificity, it was noticed as early as 31 January 2020, a few weeks after the first publication of the virus sequence[89], in an Indian article[85] retracted two days later by its authors in the face of attacks from institutional scientists, first and foremost Anthony Fauci[75]. This article also showed the existence of these sequences with homology to HIV at the binding sites of the SARS-CoV-2 spike protein. ...
Conference Paper
The presentation addresses the colonization of institutions, the pharmaceutical industry, and biomedical devices by a military-industrial biosecurity complex that has created an intentional collective risk with multiple and unpredictable effects. It follows a thread running from the 1989 Washington Conference to the 2019 covid pandemic to show that the biosecurity choice made by the United States, profit-driven and ruinous, ended in disaster for the world's population.
... Finally, in a notable case of incorrect findings having a potential negative impact when released without peer review, a preprint claiming to find "uncanny similarity" between SARS-CoV-2 and HIV was posted to bioRxiv on 31 January 2020. However, it was withdrawn two days later after other scientists posted public comments identifying errors in its analysis (Pradhan, et al., 2020). ...
Article
Many argue that swift and fundamental interventions in the system of scholarly communication are needed. However, there are substantial disagreements over the short- and long-term benefits of most proposed approaches to changing the practice of science communication, and the lack of systematic, empirically based research in this area makes these controversies difficult to resolve. We argue that experience within public health can be usefully applied to scholarly communication. Starting with the history of DDT (Dichlorodiphenyltrichloroethane) application, we illustrate four ways complex human systems threaten reliable predictions and blunt ad-hoc interventions. We then show how these apply to interventions in scholarly publication – open access based on the article processing charge (APC), and preprints – to yield surprising results. Finally, we offer approaches to help guide the design of future interventions: identifying measures and outcomes, developing infrastructure, incorporating assessment, and contributing to theories of systemic change.
... A paper published on bioRxiv suggested that the COVID-19 virus was genetically engineered because it is similar to HIV. It received widespread public attention, with some citing it as evidence that COVID-19 is a biological weapon (12,14). ...
Article
Introduction: Preprints have become an important tool for meeting the challenges of health communication in the context of COVID-19. They allow scientists to disseminate their results more quickly due to the absence of a peer review process. Preprints have been well-received by scientists, however, there have been concerns about the exposure of wider public audiences to preprints due in part to this lack of peer review. Methods: The aim of this study is to examine the dissemination of preprints on medRxiv and bioRxiv during the COVID-19 pandemic using content analysis and statistical analysis. Results: Our findings show that preprints have played an unprecedented role in disseminating COVID-19-related science results to the public. Discussion: While the overall media coverage of preprints is unsatisfactory, digital native news media performed better than legacy media in reporting preprints, which means that we could make the most of digital native media to improve health communication. This study contributes to understanding how science communication has evolved in response to the COVID-19 pandemic and provides some practical recommendations.
... In the context of SARS-CoV, one of the controversies regarding the natural origin of SARS-CoV-2 is that its S gene has multiple novel sequence insertions. Zhang C. et al. analyzed the report by Pradhan et al. (withdrawn) (Pradhan et al., 2020) on the presence of four unique novel sequences in the SARS-CoV-2 S gene and showed that these four sequence insertions were not related to the receptor-binding domain (RBD) (Zhang C. et al., 2020). A recent study identified S gene novel sequence insertions among several key genomic features that differentiate SARS-CoV-2 from other beta-coronaviruses, particularly SARS-CoV and MERS-CoV (Gussow et al., 2020). ...
Article
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) causes severe pathophysiology in vulnerable older populations and appears to be highly pathogenic and more transmissible than other coronaviruses. The spike (S) protein appears to be a major pathogenic factor that contributes to the unique pathogenesis of SARS-CoV-2. Although the S protein is a surface transmembrane type 1 glycoprotein, it has been predicted to be translocated into the nucleus due to the novel nuclear localization signal (NLS) “PRRARSV,” which is absent from the S protein of other coronaviruses. Indeed, S proteins translocate into the nucleus in SARS-CoV-2-infected cells. S mRNAs also translocate into the nucleus. S mRNA colocalizes with S protein, aiding the nuclear translocation of S mRNA. While nuclear translocation of nucleoprotein (N) has been shown in many coronaviruses, the nuclear translocation of both S mRNA and S protein reveals a novel feature of SARS-CoV-2.
... The standard scientific literature review procedures have been aimed for in this paper and we highlighted results collected using computational biology methods suggesting that there is a direct correlation between four fragments of the spike glycoprotein and certain regions from the gp120 and gag glycoproteins that are found in the HIV genome. Namely, the GTNGTKR, YYHKNNKS, GDSSSG and QTNSPRRA amino acid sequence inserts in the SARS-CoV-2 spike glycoprotein had not been found in previous coronaviruses and they had been determined to be genetically similar with specific regions of the gp120 and gag retroviral proteins (Pradhan et al., 2020). The computational biology methods used to compare the concerned structures were evaluated with care and using an objective approach. ...
Preprint
SARS-CoV-2 represents the most recent species of the Betacoronavirus genus, and it caused the most severe public health crisis since the 1918’s pandemic disease of Influenza A, which resulted in 30-50 million deaths worldwide. With regards to the origins of the virus, the scientific community is now divided into two major groups. The first group believes it fully underwent natural selection and that the first infection incident that was related to the Huanan sea market occurred after a zoonotic process took place between bats and humans, via the pangolin species. The second group believes that, as the virus was being naturally selected and undergoing a process of zoonosis, local laboratory researchers isolated the evolving virus in the P4 section of Wuhan’s Institute of Laboratory and added to it further antigens in order to test the reaction of the immune system in certain bats. As time went by, more and more evidence that the virus underwent a “gain-of-function” research before an accidental or a deliberate lab-leak occurred was presented in scientific literature. Moreover, there are peculiar commonalities between the novel coronavirus and the human immunodeficiency virus at genomic and morbidity levels. It may be suspected that SARS-CoV-2 underwent a specific form of co-evolution with HIV-1, given the existence of similarities between SARS-CoV and HIV-1, and that an alleged gain-of-function research of the coronavirus increased the speed of a possible process of molecular “alikening” with the retrovirus in cause. 
It may also be suspected that genomic evidence with regards to such a matter remains substantial, despite the retraction of a preprint showing genomic similarities, given the lack of author’s reasonable explanation for paper retraction, as well as the existence of a molecular mechanism called LINE-1/HIV-1 Reverse Transcription, which was shown in a number of clinical occasions to insert around 1% of the viral genome - including approximately 1% of the +ssRNA encoding the spike protein - into the DNA of the host cell. It is far less common for other natural RNA viruses that are not retroviral in nature to experience reverse transcription in the human body, and there is virtually no scientific literature covering such an event for other non-retroviral RNA pathogens. Interestingly, there was a widespread antiretroviral drug repurposing for COVID-19, which included HAART, as well as ritonavir, lopinavir, darunavir and nelfinavir (Yu et al., 2021). Severe and prolonged forms of COVID-19 are associated with increased incidences of induced immunodeficiencies, whether short- or long-term (Garmendia et al., 2022). Moreover, both ssRNA viruses implicate the AIP4 protein as an important target for virulence, significantly affecting the quality of the induced interferon system’s activity (Olga Tarasova et al., 2020), and neuropilin-1 (NRP1) and the human Vascular Endothelial Growth Factor Receptor 2 (hVEGFR2) also represent important molecular and therapeutic targets for both SARS-CoV-2-induced COVID-19 and HIV-induced AIDS (Abel et al, 2022). The CXCR4 (Lourda M. et al, 2021) and CCR5 (Agresti N. et al., 2021) receptors on the helper CD4+ and cytotoxic CD8+ T-Lymphocytes were also shown to be significantly dysregulated following a SARS-CoV-2 infection of CD4+ helper and CD8+ cytotoxic T-lymphocytes, and CXCR4 (Daoud et al., 2022) and CCR5 (Patterson B. K. et al., 2021) receptor inhibitors have displayed a substantial level of efficacy against COVID-19 and AIDS. 
Both viruses may directly dysregulate T-lymphocyte activities of prime importance and often cause significant morbidity in the gastrointestinal tract. One in-vitro study indicated that SARS-CoV-2 is capable of infecting T-lymphocytes, and particularly the T cells expressing CD4+ upon their surface, without the involvement of activity by the ACE2 receptor (Shen X.R. et al., 2022). SARS-CoV-2 was also found to substantially dysregulate the activities of the CXCR4 and CCR5 receptors on their cellular surface (Agresti N. et al., 2021). The virus was also shown to be the most virulent in the circulatory system, and not in the respiratory system, and mucosal immunity was shown to be the strongest restricting factor for viral infectious spread in the body. There is a considerable probability that at least a few of the correlations presented above are accompanied by some dependence, and it is even a reduced extent of co-dependence as such that would still pose a substantial threat for the overall long-term safety of newly developed approaches that had the purpose of preventing severe cases of disease. It may be that an indirect and partial SARS-CoV-2 co-evolution with HIV-1 would have a major impact upon the safety and effectiveness of the spike protein-based vaccines, and there may now be substantial reasons to believe that the ongoing mass vaccination should be halted and re-evaluated at least for a precautionary purpose, given the approximately 1 in 800 recently projected ratio of severe adverse event per administered vaccines, and the phenomenon of immune evasion by new viral variants, despite the multiple number of administered vaccine doses.
Article
Background: The quality of COVID-19 preprints should be considered with great care, as their contents can influence public policy. Efforts to improve preprint quality have mostly focused on introducing quick peer review, but surprisingly little has been done to calibrate the public’s evaluation of preprints and their contents. The PRECHECK project aimed to generate a tool to teach and guide scientifically literate non-experts to critically evaluate preprints, on COVID-19 and beyond. Methods: To create a checklist, we applied a four-step procedure consisting of an initial internal review, an external review by a pool of experts (methodologists, meta-researchers/experts on preprints, journal editors, and science journalists), a final internal review, and an implementation stage. For the external review step, experts rated the relevance of each element of the checklist on five-point Likert scales, and provided written feedback. After each internal review round, we applied the checklist on a set of high-quality preprints from an online list of milestone research works on COVID-19 and low-quality preprints, which were eventually retracted, to verify whether the checklist can discriminate between the two categories. Results: At the external review step, 26 of the 54 contacted experts responded. The final checklist contained four elements (Research question, study type, transparency and integrity, and limitations), with ‘superficial’ and ‘deep’ levels for evaluation. When using both levels of evaluation, the checklist was effective at discriminating high- from low-quality preprints. Its usability was confirmed in workshops with our target audience: Bachelors students in Psychology and Medicine, and science journalists. Conclusions: We created a simple, easy-to-use tool for helping scientifically literate non-experts navigate preprints with a critical mind. We believe that our checklist has great potential to help guide decisions about the quality of preprints on COVID-19 in our target audience and that this extends beyond COVID-19.
Preprint
Because the question of the origin of SARS-CoV-2 has not yet been resolved, the author analyzed the main advances in the genetic engineering of viruses that took place before the onset of the COVID-19 pandemic. The first artificial genetically modified viruses could have appeared in nature in the mid-1950s. The technique of nucleic acid hybridization was developed by the late 1960s. In the late 1970s, a method called "reverse genetics" emerged to synthesize RNA and DNA molecules. In the early 1980s, it became possible to combine the genes of different viruses and insert the genes of one virus into the genome of another virus. Since then, vector vaccines have been produced. Currently, modern technologies make it possible to assemble any virus based on a nucleotide sequence available in a virus database or designed by computer as a virtual model. Scientists around the world are invited to answer the call of Neil Harrison and Jeffrey Sachs of Columbia University for a thorough and independent investigation into the origin of SARS-CoV-2. Only a full understanding of the origin of the new virus can minimize the likelihood of a similar pandemic in the future.
Preprint
The COVID-19 pandemic and the recently-emerged highly transmissible SARS-CoV-2 Omicron variants have increased the demands for novel immunising and therapeutic approaches to protect the lives of patients with significant co-morbidities. Following a worldwide campaign of mass vaccination, there is still a significant demand to quell the harmful effects of the novel SARS-CoV-2 variants on people with serious co-morbidities, and there is still a dilemma of how we could prevent potentially catastrophic effects of future pandemics upon the human race. And the concerns intersect at a specific point; a gained evolutionary ability of several viruses over the previous centuries to go undetected during the first stages of infection by means of capping the 5' end of their genetic material, reducing the synthetic rate of Type I and Type III Interferons, temporarily inhibiting the apoptotic pathways of infected cells to facilitate a rapid viral replication, and inhibiting antigenic presentation. Type I and III Interferon-based viral immune evasion may be primarily associated with a delayed clearance of the viral load. Past clinical data also suggests that the SARS-CoV-2 spike glycoprotein is capable of inhibiting the V(D)J antibody gene rearrangement in developing B-lymphocytes, as well as diverse important cellular processes of DNA repair by downregulating the BRCA1 and 53BP1 genes. Furthermore, most traditional methods of vaccination do not particularly boost mucosal immunity and as a result, there is a visible gap that viruses can easily fill in, which implicates a reduced stimulation of a mucosal plasma cell production. Serum plasma antibodies do not cross the nasal epithelium and hence, offer little protection against mucosal inflammation, unlike the antibodies produced by mucosal plasma cells. We acknowledge the existence of a significant challenge to stimulate mucosal immune responses due to the high complexity of its structure-function axis. 
Nevertheless, over the past half century, numerous scientists developed ways of immunisation and early treatment worldwide that generally showed outstanding levels of success and insignificant risks of adverse events. An important example implicates the administration of human interferons I and III into the nasal mucosa to simulate local infection and train the innate immune system to robustly become activated and transmit essential signals before viruses silence it. Recently, it was discovered that specific plants secrete proteins that also stimulate the production of Type I Interferons. It might be that focusing on directly offering the immune system the information about the genetics and protein structure of the pathogen, rather than training its first-line mechanisms to develop faster, excessively increases its specificity, making it reach a level that brings the virus the opportunity to evolve and escape previously-developed host immune mechanisms. Naturally-selected polymorphic viruses had generated long-term evolutionary responses to deeply tackle the ability of the complex human immune system to neutralise viruses during the first stages of cellular infection. It is until the scientific community realises this that we will probably continue to face serious epidemics and pandemics of respiratory diseases over the coming several decades.
Article
A mysterious outbreak of atypical pneumonia in late 2019 was traced to a seafood wholesale market in Wuhan of China. Within a few weeks, a novel coronavirus tentatively named as 2019 novel coronavirus (2019-nCoV) was announced by the World Health Organization. We performed bioinformatics analysis on a virus genome from a patient with 2019-nCoV infection and compared it with other related coronavirus genomes. Overall, the genome of 2019-nCoV has 89% nucleotide identity with bat SARS-like-CoVZXC21 and 82% with that of human SARS-CoV. The phylogenetic trees of their orf1a/b, Spike, Envelope, Membrane and Nucleoprotein also clustered closely with those of the bat, civet and human SARS coronaviruses. However, the external subdomain of Spike’s receptor binding domain of 2019-nCoV shares only 40% amino acid identity with other SARS-related coronaviruses. Remarkably, its orf3b encodes a completely novel short protein. Furthermore, its new orf8 likely encodes a secreted protein with an alpha-helix, following with a beta-sheet(s) containing six strands. Learning from the roles of civet in SARS and camel in MERS, hunting for the animal source of 2019-nCoV and its more ancestral virus would be important for understanding the origin and evolution of this novel lineage B betacoronavirus. These findings provide the basis for starting further studies on the pathogenesis, and optimizing the design of diagnostic, antiviral and vaccination strategies for this emerging infection.
Article
In December 2019, a cluster of patients with pneumonia of unknown cause was linked to a seafood wholesale market in Wuhan, China. A previously unknown betacoronavirus was discovered through the use of unbiased sequencing in samples from patients with pneumonia. Human airway epithelial cells were used to isolate a novel coronavirus, named 2019-nCoV, which formed another clade within the subgenus sarbecovirus, Orthocoronavirinae subfamily. Different from both MERS-CoV and SARS-CoV, 2019-nCoV is the seventh member of the family of coronaviruses that infect humans. Enhanced surveillance and further investigation are ongoing. (Funded by the National Key Research and Development Program of China and the National Major Project for Control and Prevention of Infectious Disease in China.).
Article
The molecular evolutionary genetics analysis (Mega) software implements many analytical methods and tools for phylogenomics and phylomedicine. Here, we report a transformation of Mega to enable cross-platform use on Microsoft Windows and Linux operating systems. Mega X does not require virtualization or emulation software and provides a uniform user experience across platforms. Mega X has additionally been upgraded to use multiple computing cores for many molecular evolutionary analyses. Mega X is available in two interfaces (graphical and command line) and can be downloaded from www.megasoftware.net free of charge.
Article
Human coronavirus (CoV) HKU1 is a pathogen causing acute respiratory illnesses and so far little is known about its biology. HKU1 virus uses its S1 subunit C-terminal domain (CTD) and not the N-terminal domain like other lineage A β-CoVs to bind to its yet unknown human receptor. Here we present the crystal structure of HKU1 CTD at 1.9 Å resolution. The structure consists of three subdomains: core, insertion and subdomain-1 (SD-1). While the structure of the core and SD-1 subdomains of HKU1 are highly similar to those of other β-CoVs, the insertion subdomain adopts a novel fold, which is largely invisible in the cryo-EM structure of the HKU1 S trimer. We identify five residues in the insertion subdomain that are critical for binding of neutralizing antibodies and two residues essential for receptor binding. Our study contributes to a better understanding of entry, immunity and evolution of CoV S proteins.
Article
The international sharing of virus data is critical for protecting populations against lethal infectious disease outbreaks. Scientists must rapidly share information to assess the nature of the threat and develop new medical countermeasures. Governments need the data to trace the extent of the outbreak, initiate public health responses, and coordinate access to medicines and vaccines. Recent outbreaks suggest, however, that the sharing of such data cannot be taken for granted – making the timely international exchange of virus data a vital global challenge. This article undertakes the first analysis of the Global Initiative on Sharing All Influenza Data as an innovative policy effort to promote the international sharing of genetic and associated influenza virus data. Based on more than 20 semi-structured interviews conducted with key informants in the international community, coupled with analysis of a wide range of primary and secondary sources, the article finds that the Global Initiative on Sharing All Influenza Data contributes to global health in at least five ways: (1) collating the most complete repository of high-quality influenza data in the world; (2) facilitating the rapid sharing of potentially pandemic virus information during recent outbreaks; (3) supporting the World Health Organization's biannual seasonal flu vaccine strain selection process; (4) developing informal mechanisms for conflict resolution around the sharing of virus data; and (5) building greater trust with several countries key to global pandemic preparedness.
Article
HKU1 is a human betacoronavirus that causes mild yet prevalent respiratory disease, and is related to the zoonotic SARS and MERS betacoronaviruses, which have high fatality rates and pandemic potential. Cell tropism and host range is determined in part by the coronavirus spike (S) protein, which binds cellular receptors and mediates membrane fusion. As the largest known class I fusion protein, its size and extensive glycosylation have hindered structural studies of the full ectodomain, thus preventing a molecular understanding of its function and limiting development of effective interventions. Here we present the 4.0 Å resolution structure of the trimeric HKU1 S protein determined using single-particle cryo-electron microscopy. In the pre-fusion conformation, the receptor-binding subunits, S1, rest above the fusion-mediating subunits, S2, preventing their conformational rearrangement. Surprisingly, the S1 C-terminal domains are interdigitated and form extensive quaternary interactions that occlude surfaces known in other coronaviruses to bind protein receptors. These features, along with the location of the two protease sites known to be important for coronavirus entry, provide a structural basis to support a model of membrane fusion mediated by progressive S protein destabilization through receptor binding and proteolytic cleavage. These studies should also serve as a foundation for the structure-based design of betacoronavirus vaccine immunogens.
Article
Protein structure homology modelling has become a routine technique to generate 3D models for proteins when experimental structures are not available. Fully automated servers such as SWISS-MODEL with user-friendly web interfaces generate reliable models without the need for complex software packages or downloading large databases. Here, we describe the latest version of the SWISS-MODEL expert system for protein structure modelling. The SWISS-MODEL template library provides annotation of quaternary structure and essential ligands and co-factors to allow for building of complete structural models, including their oligomeric structure. The improved SWISS-MODEL pipeline makes extensive use of model quality estimation for selection of the most suitable templates and provides estimates of the expected accuracy of the resulting models. The accuracy of the models generated by SWISS-MODEL is continuously evaluated by the CAMEO system. The new web site allows users to interactively search for templates, cluster them by sequence similarity, structurally compare alternative templates and select the ones to be used for model building. In cases where multiple alternative template structures are available for a protein of interest, a user-guided template selection step allows building models in different functional states. SWISS-MODEL is available at http://swissmodel.expasy.org/.
Preprint
Since the SARS outbreak 18 years ago, a large number of severe acute respiratory syndrome related coronaviruses (SARSr-CoV) have been discovered in their natural reservoir host, bats. Previous studies indicated that some of those bat SARSr-CoVs have the potential to infect humans. Here we report the identification and characterization of a novel coronavirus (nCoV-2019) which caused an epidemic of acute respiratory syndrome in humans in Wuhan, China. The epidemic, which started on December 12th, 2019, had caused 198 laboratory-confirmed infections with three fatal cases by January 20th, 2020. Full-length genome sequences were obtained from five patients at the early stage of the outbreak. They are almost identical to each other and share 79.5% sequence identity with SARS-CoV. Furthermore, it was found that nCoV-2019 is 96% identical at the whole-genome level to a bat coronavirus. The pairwise protein sequence analysis of seven conserved non-structural proteins shows that this virus belongs to the species SARSr-CoV. The nCoV-2019 virus was then isolated from the bronchoalveolar lavage fluid of a critically ill patient, and could be neutralized by sera from several patients. Importantly, we have confirmed that this novel CoV uses the same cell entry receptor, ACE2, as SARS-CoV.
Article
The coronavirus spike protein is a multifunctional molecular machine that mediates coronavirus entry into host cells. It first binds to a receptor on the host cell surface through its S1 subunit and then fuses viral and host membranes through its S2 subunit. Two domains in S1 from different coronaviruses recognize a variety of host receptors, leading to viral attachment. The spike protein exists in two structurally distinct conformations, prefusion and postfusion. The transition from prefusion to postfusion conformation of the spike protein must be triggered, leading to membrane fusion. This article reviews current knowledge about the structures and functions of coronavirus spike proteins, illustrating how the two S1 domains recognize different receptors and how the spike proteins are regulated to undergo conformational transitions. I further discuss the evolution of these two critical functions of coronavirus spike proteins, receptor recognition and membrane fusion, in the context of the corresponding functions from other viruses and host cells. Expected final online publication date for the Annual Review of Virology Volume 3 is September 29, 2016. Please see http://www.annualreviews.org/catalog/pubdates.aspx for revised estimates.
Article
The source of the severe acute respiratory syndrome (SARS) epidemic was traced to wildlife market civets and ultimately to bats. Subsequent hunting for novel coronaviruses (CoVs) led to the discovery of two additional human and over 40 animal CoVs, including the prototype lineage C betacoronaviruses, Tylonycteris bat CoV HKU4 and Pipistrellus bat CoV HKU5; these are phylogenetically closely related to the Middle East respiratory syndrome (MERS) CoV, which has affected more than 1,000 patients with over 35% fatality since its emergence in 2012. All primary cases of MERS are epidemiologically linked to the Middle East. Some of these patients had contact with camels that shed virus and/or had positive serology. Most secondary cases are related to health care-associated clusters. The disease is especially severe in elderly men with comorbidities. Clinical severity may be related to MERS-CoV's ability to infect a broad range of cells with DPP4 expression, evade the host innate immune response, and induce cytokine dysregulation. Reverse transcription-PCR on respiratory and/or extrapulmonary specimens rapidly establishes diagnosis. Supportive treatment with extracorporeal membrane oxygenation and dialysis is often required in patients with organ failure. Antivirals with potent in vitro activities include neutralizing monoclonal antibodies, antiviral peptides, interferons, mycophenolic acid, and lopinavir. They should be evaluated in suitable animal models before clinical trials. Developing an effective camel MERS-CoV vaccine and implementing appropriate infection control measures may control the continuing epidemic.