Background. Coronaviruses (CoVs) are enveloped positive-strand RNA viruses that have club-like spikes on their surface and a unique replication process. Coronaviruses are categorized as major pathogenic viruses causing a variety of diseases in birds and mammals, including lethal respiratory dysfunctions in humans. Recently, a new strain of coronavirus was identified and named SARS-CoV-2. Multiple cases of SARS-CoV-2 infection are being reported all over the world. SARS-CoV-2 has shown a high death rate; however, no specific treatment is available against it. Methods. In the current study, immunoinformatics approaches were employed to predict antigenic epitopes of SARS-CoV-2 for the development of a coronavirus vaccine. Cytotoxic T-lymphocyte (CTL) and B-cell epitopes were predicted for the SARS-CoV-2 coronavirus protein. Multiple sequence alignment of three genomes (SARS-CoV, MERS-CoV, and SARS-CoV-2) was used for conserved binding domain analysis. Results. The docking complexes of 4 CTL epitopes with antigenic sites were analyzed, followed by binding affinity and binding interaction analyses of the top-ranked predicted peptides with the MHC-I HLA molecule. Molecular docking (Food and Drug Regulatory Authority library) was performed, and four compounds exhibiting the lowest binding energy were identified. The designed epitopes were docked against MHC-I, and the interactions of the selected docked complexes were investigated. In conclusion, four CTL epitopes (GTDLEGNFY, TVNVLAWLY, GSVGFNIDY, and QTFSVLACY) and four FDA-scrutinized compounds were identified as potential peptide vaccine candidates and potential biomolecules against deadly SARS-CoV-2, respectively. A multiepitope vaccine was also designed from different epitopes of coronavirus proteins joined by linkers and led by an adjuvant. Conclusion. Our investigations predicted epitopes and reported molecules that may have the potential to inhibit the SARS-CoV-2 virus.
These findings can be a step towards the development of a peptide-based vaccine or natural compound drug target against SARS-CoV-2.
A variety of human diseases have unknown etiology. A viral origin has been proposed for numerous diseases, which underlines the importance of searching for new viruses. Several difficulties hamper the identification of new viruses; for example, some viruses do not replicate in vitro or fail to produce a recognizable cytopathic effect (CPE). Viruses that are unable to replicate in vitro lead to the failure of virus discovery. The cDNA-amplified fragment length polymorphism (cDNA-AFLP) technique helps to identify new viruses and has contributed to the discovery of a new coronavirus.
Coronaviruses, a genus of the Coronaviridae family, are enveloped viruses with a large plus-strand RNA genome. The genomic RNA is 27-32 kb in size and polyadenylated. There are three serologically distinct groups of coronaviruses. Within each group, viruses are characterized by their genomic sequence and host range. Coronaviruses have been discovered in mice, turkeys, cats, horses, and humans and cause many diseases, including respiratory tract illness and gastroenteritis.
Two human coronaviruses (HCoV-229E and HCoV-OC43) were identified in the mid-1960s and are known to cause the common cold. The more recently identified SARS-CoV can cause a life-threatening pneumonia and is the most pathogenic human coronavirus identified thus far. SARS-CoV probably resides in an animal reservoir and recently initiated an epidemic in humans through zoonotic transmission. SARS-CoV is the first member of a fourth group of coronaviruses.
On 31st of December 2019, the third zoonotic human coronavirus (CoV) of the century emerged in Wuhan (Hubei province, China), where it was diagnosed in multiple patients associated with the Huanan South China Seafood Market. The infection resembles severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV) infections, with symptoms including fever, lung infiltration, and difficulty breathing. After extensive speculation about the causative agent, the identification of a novel CoV was announced by the Chinese Center for Disease Control (CDC) on 19th of January 2020. The novel CoV, SARS-CoV-2, was isolated from a single patient and later corroborated in 16 more patients. Although not yet confirmed to cause the viral pneumonia, SARS-CoV-2 was quickly predicted to be the likely causative agent.
The first sequence of SARS-CoV-2 was submitted after its confirmation. Later, five more sequences of SARS-CoV-2 were deposited in the GISAID database on 11th of January by Chinese institutes (Supplementary 1). Multiple sequence alignment of SARS-CoV, MERS-CoV, and SARS-CoV-2 was carried out, and conserved regions were observed in the DNA as well as the protein sequences. Hundreds of human deaths have been linked with the infection, with significant morbidity in patients older than 50 years. Various clinical symptoms have been highlighted, such as dry cough, leukopenia, fever, and shortness of breath. Patients requiring extracorporeal membrane oxygenation are considered severe cases and need supportive care. SARS-CoV-2 infection in elderly patients is less virulent than SARS-CoV (10% mortality) and MERS-CoV (35% mortality) infection.
The source of SARS-CoV-2 is still unclear, although the initial cases have been associated with the Huanan South China Seafood Market. The early patients present at the market acquired the virus through either human-to-human transmission or a more widespread animal source.
Samples from the affected market tested positive for the novel coronavirus, but no specific animal association has been identified. Codon analyses suggested that snakes might be a possible source of the viral infection, although this assertion has been disputed by others. Researchers are still trying to identify the source of SARS-CoV-2, including possible animal vectors.
Coronaviruses were thought to infect humans and bats most effectively, as both are closely tied to the coronavirus lifecycle. There is evidence that several bat coronaviruses are capable of infecting human cells without intermediate adaptation. Human serology data show an association with bat CoV proteins, pointing to zoonotic transmission of SARS-like bat coronaviruses behind the deadliest outbreaks. MERS-CoV is also a zoonotic virus and has its origin in bats. Zoonotic contact with camels has been evidenced in primary cases of MERS-CoV. These lessons from SARS and MERS highlight the importance of rapidly finding the source of SARS-CoV-2 in order to stem the ongoing outbreak.
1.2. Susceptible Populations
With limited patient data, it is difficult to draw robust conclusions about who may be most susceptible to SARS-CoV-2. Disease severity in SARS-CoV and MERS-CoV infections correlated strongly with host conditions, including biological sex, age, and overall health, and similar findings have been observed in early patients with SARS-CoV-2. SARS-CoV and MERS-CoV infections led to increased severity and death rates in people over the age of 50 years. Poor underlying health conditions, including diabetes, kidney or heart function issues, and hypertension, made patients more susceptible during the MERS-CoV outbreak, while diabetes, smoking, cardiovascular disease, hypertension, and other chronic illnesses have also been observed in the majority of deaths among patients with the novel CoV. Corresponding to findings in animal models, these results indicate that vigilance is essential for these vulnerable patients following SARS-CoV-2 infection.
1.3. Insights from the Sequence
Dr. Zhang’s group at Fudan University and many other groups in China exemplified the dedication and increased capacity of the scientific infrastructure in China by rapidly sequencing the nearly 30,000-nucleotide genome of the novel coronavirus. Whole-genome analyses of SARS-CoV-2 showed ~80% nucleotide identity to the original SARS epidemic virus. Two different bat SARS-like CoVs (ZC45 and ZXC21) shared ~89% identity with the genome of SARS-CoV-2. Phylogenetic analyses indicated recombination between the novel CoV and previously identified bat coronaviruses. A bat CoV sequence (RaTG3) with 92% sequence identity to the novel virus supports a bat origin for SARS-CoV-2.
The SARS-CoV-2 spike protein has roughly 75% amino acid identity with that of SARS-CoV. Narrowing the analysis to the receptor-binding domain (RBD), the SARS-CoV-2 RBD is 73% conserved relative to the spike RBD of epidemic SARS-CoV. The receptor-binding domain of SARS-CoV-2 was capable of binding ACE2 in the context of the SARS-CoV spike protein.
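The identity figures quoted above are ratios of matching positions in an alignment. As a minimal sketch of how such a percentage can be computed from two already-aligned sequences, the following Python function counts matches over columns where neither sequence has a gap. The function name, the gap-handling convention, and the toy sequences are assumptions made for illustration; they are not the tools or data used in the cited analyses, and other conventions (for example, dividing by the full alignment length) give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: both inputs come from the same alignment, so they have
    equal length and column i of one corresponds to column i of the other.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns with a residue in both sequences
    matches = 0   # compared columns where the residues are identical
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns (one convention among several)
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments, made up for illustration (not real viral sequences).
toy_a = "ACGTACGT"
toy_b = "ACGAAC-T"
print(round(percent_identity(toy_a, toy_b), 1))  # 85.7 (6 matches / 7 ungapped columns)
```

Restricting both inputs to the columns of a single domain, such as the RBD, yields a domain-level figure in the same way.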
1.4. Genomic Features and Lifecycle of the Coronavirus
Coronaviruses have unique club-like spikes, and their RNA genome is larger than that of other RNA viruses, which leads to a unique mode of replication. Coronaviruses contain a positive-strand RNA genome of ~30 kb. The significant features of coronavirus genomes include a 5′ capped end, which plays an important role in RNA replication, as the 5′ end has a leader sequence along with a UTR region possessing essential loops. The 3′ end, with its poly-A tail, has structures essential for RNA genome synthesis and replication. These two modifications allow the viral RNA to be translated into the replication (replicase) proteins.
The coronavirus genome comprises several significant regions that support the synthesis and replication of the whole genome (Figure 1).