January 2023
·
2 Reads
January 2023
·
2 Reads
August 2022
·
98 Reads
·
6 Citations
Epitopes are short amino acid sequences that define the antigen signature to which an antibody or T cell receptor binds. In light of the current pandemic, epitope analysis and prediction are paramount to improving serological testing and developing vaccines. In this paper, known epitope sequences from SARS-CoV, SARS-CoV-2, and other Coronaviridae were leveraged to identify additional antigen regions in 62K SARS-CoV-2 genomes. Additionally, we present epitope distribution across SARS-CoV-2 genomes, locate the most commonly found epitopes, and discuss where epitopes are located on proteins and how epitopes can be grouped into classes. The mutation density of different protein regions is presented using a big data approach. It was observed that there are 112 B cell and 279 T cell conserved epitopes between SARS-CoV-2 and SARS-CoV, with more diverse sequences found in Nucleoprotein and Spike glycoprotein.
February 2022
·
29 Reads
Epitopes are short amino acid sequences that define the antigen signature to which an antibody binds. In light of the current pandemic, epitope analysis and prediction are paramount to improving serological testing and developing vaccines. In this paper, we leverage known epitope sequences from SARS-CoV, SARS-CoV-2, and other Coronaviridae to identify additional antigen regions in 62k SARS-CoV-2 genomes. Additionally, we present epitope distribution across SARS-CoV-2 genomes, locate the most commonly found epitopes, and discuss where epitopes are located on proteins and how epitopes can be grouped into classes. We also discuss the mutation density of different regions on proteins using a big data approach. We find that there are many conserved epitopes between SARS-CoV-2 and SARS-CoV, with more diverse sequences found in Nucleoprotein and Spike glycoprotein.
December 2021
·
78 Reads
·
6 Citations
SARS-CoV-2 genomic sequencing efforts have scaled dramatically to address the current global pandemic and aid public health. However, autonomous genome annotation of SARS-CoV-2 genes, proteins, and domains is not readily accomplished by existing methods and results in missing or incorrect sequences. To overcome this limitation, we developed a novel semi-supervised pipeline for automated gene, protein, and functional domain annotation of SARS-CoV-2 genomes that differentiates itself by not relying on the use of a single reference genome and by overcoming atypical genomic traits that challenge traditional bioinformatic methods. We analyzed an initial corpus of 66,000 SARS-CoV-2 genome sequences collected from labs across the world using our method and identified the comprehensive set of known proteins with 98.5% set membership accuracy and 99.1% accuracy in length prediction, compared to proteome references, including Replicase polyprotein 1ab (with its transcriptional slippage site). Compared to other published tools, such as Prokka (base) and VAPiD, we yielded a 6.4- and 1.8-fold increase in protein annotations. Our method generated 13,000,000 gene, protein, and domain sequences—some conserved across time and geography and others representing emerging variants. We observed 3362 non-redundant sequences per protein on average within this corpus and described key D614G and N501Y variants spatiotemporally in the initial genome corpus. For spike glycoprotein domains, we achieved greater than 97.9% sequence identity to references and characterized receptor binding domain variants. We further demonstrated the robustness and extensibility of our method on an additional 4000 variant diverse genomes containing all named variants of concern and interest as of August 2021. 
In this cohort, we successfully identified all keystone spike glycoprotein mutations in our predicted protein sequences with greater than 99% accuracy as well as demonstrating high accuracy of the protein and domain annotations. This work comprehensively presents the molecular targets to refine biomedical interventions for SARS-CoV-2 with a scalable, high-accuracy method to analyze newly sequenced infections as they arise.
December 2021
·
263 Reads
·
33 Citations
npj Science of Food
In this work, we hypothesized that shifts in the food microbiome can be used as an indicator of unexpected contaminants or environmental changes. To test this hypothesis, we sequenced the total RNA of 31 high protein powder (HPP) samples of poultry meal pet food ingredients. We developed a microbiome analysis pipeline employing a key eukaryotic matrix filtering step that improved microbe detection specificity to >99.96% during in silico validation. The pipeline identified 119 microbial genera per HPP sample on average with 65 genera present in all samples. The most abundant of these were Bacteroides, Clostridium, Lactococcus, Aeromonas, and Citrobacter. We also observed shifts in the microbial community corresponding to ingredient composition differences. When comparing culture-based results for Salmonella with total RNA sequencing, we found that Salmonella growth did not correlate with multiple sequence analyses. We conclude that microbiome sequencing is useful to characterize complex food microbial communities, while additional work is required for predicting specific species’ viability from total RNA sequencing.
April 2021
·
115 Reads
·
22 Citations
Rapid tests for active SARS-CoV-2 infections rely on reverse transcription polymerase chain reaction (RT-PCR). RT-PCR uses reverse transcription of RNA into complementary DNA (cDNA) and amplification of specific DNA (primer and probe) targets using polymerase chain reaction (PCR). The technology makes rapid and specific identification of the virus possible based on nucleic acid sequence homology and is much faster than tissue culture or animal cell models. However, the technique can lose sensitivity over time as the virus evolves and the target sequences diverge from the selective primer sequences. Different primer sequences have been adopted in different geographic regions. As we rely on these existing RT-PCR primers to track and manage the spread of the coronavirus, it is imperative to understand how SARS-CoV-2 mutations, over time and geographically, diverge from the primers in use today. In this study, we analyze the performance of the SARS-CoV-2 primers in use today by measuring the number of mismatches between primer sequences and genome targets over time and across geographic regions. We find a growing number of mismatches, increasing by 2% per month, as well as a high specificity of virus based on geographic location.
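The mismatch measurement described in the abstract above can be illustrated with a small sketch. This is not the authors' pipeline: the function names, the toy sequences, and the choice of a gap-free, best-window Hamming distance are all assumptions made for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Count positions where two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_primer_mismatches(primer: str, genome: str) -> int:
    """Smallest mismatch count between the primer and any same-length
    window of the genome (no gaps; an assumed simplification)."""
    k = len(primer)
    if k == 0 or k > len(genome):
        raise ValueError("primer must be non-empty and no longer than the genome")
    return min(hamming(primer, genome[i:i + k])
               for i in range(len(genome) - k + 1))


# Hypothetical toy data, not real SARS-CoV-2 sequence.
primer = "ACGTAC"
genomes_by_month = {
    "2020-01": ["TTACGTACGG", "TTACGTACGG"],
    "2020-02": ["TTACGTACGG", "TTACGAACGG"],
}

# Mean best-window mismatch count per collection month.
for month, genomes in sorted(genomes_by_month.items()):
    mean_mm = sum(min_primer_mismatches(primer, g) for g in genomes) / len(genomes)
    print(month, mean_mm)
```

Grouping the same per-genome counts by sampling location instead of month would give the geographic view. A real analysis would also need to handle reverse-complement primers, probes, and alignment gaps, which this sketch ignores.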
January 2021
·
50 Reads
Rapid tests for active SARS-CoV-2 infections rely on reverse transcription polymerase chain reaction (RT-PCR). RT-PCR uses reverse transcription of RNA into complementary DNA (cDNA) and amplification of specific DNA (primer and probe) targets using polymerase chain reaction (PCR). The technology makes rapid and specific identification of the virus possible based on nucleic acid sequence homology and is much faster than tissue culture or animal cell models. However, the technique can lose sensitivity over time as the virus evolves and the target sequences diverge from the selective primer sequences. Different primer sequences have been adopted in different geographic regions. As we rely on these existing RT-PCR primers to track and manage the spread of the coronavirus, it is imperative to understand how SARS-CoV-2 mutations, over time and geographically, diverge from the primers in use today. In this study, we analyze the performance of the SARS-CoV-2 primers in use today by measuring the number of mismatches between primer sequences and genome targets over time and across geographic regions. We find a growing number of mismatches, increasing by 2% per month, as well as a high specificity of virus based on geographic location.
December 2020
·
33 Reads
·
1 Citation
Rapid tests for active SARS-CoV-2 infections rely on reverse transcription polymerase chain reaction (RT-PCR). RT-PCR uses reverse transcription of RNA into complementary DNA (cDNA) and amplification of specific DNA (primer and probe) targets using polymerase chain reaction (PCR). The technology makes rapid and specific identification of the virus possible based on nucleic acid sequence homology and is much faster than tissue culture or animal cell models. However, the technique can lose sensitivity over time as the virus evolves and the target sequences diverge from the selective primer sequences. As we rely on existing RT-PCR primers to track and manage the spread of the coronavirus as public life re-opens, it is imperative to understand how SARS-CoV-2 mutations, over time and geographically, diverge from the primers in use today. In this study, we analyze the performance of the SARS-CoV-2 primers in use today by measuring the number of mismatches between primer sequences and genome targets over time and across geographic regions. We find a growing number of mismatches, increasing by 2% per month, as well as a high specificity of virus based on geographic location.
May 2020
·
210 Reads
·
3 Citations
In this work, we hypothesized that shifts in the food microbiome can be used as an indicator of unexpected contaminants or environmental changes. To test this hypothesis, we sequenced total RNA of 31 high protein powder (HPP) samples of poultry meal pet food ingredients. We developed a microbiome analysis pipeline employing a key eukaryotic matrix filtering step that improved microbe detection specificity to >99.96% during in silico validation. The pipeline identified 119 microbial genera per HPP sample on average with 65 genera present in all samples. The most abundant of these were Bacteroides, Clostridium, Lactococcus, Aeromonas, and Citrobacter. We also observed shifts in the microbial community corresponding to ingredient composition differences. When comparing culture-based results for Salmonella with total RNA sequencing, we found that Salmonella growth did not correlate with multiple sequence analyses. We conclude that microbiome sequencing is useful to characterize complex food microbial communities, while additional work is required for predicting specific species' viability from total RNA sequencing.
November 2019
·
150 Reads
·
40 Citations
npj Science of Food
Here we propose that using shotgun sequencing to examine food leads to accurate authentication of ingredients and detection of contaminants. To demonstrate this, we developed a bioinformatic pipeline, FASER (Food Authentication from SEquencing Reads), designed to resolve the relative composition of mixtures of eukaryotic species using RNA or DNA sequencing. Our comprehensive database includes >6000 plants and animals that may be present in food. FASER accurately identified eukaryotic species with 0.4% median absolute difference between observed and expected proportions on sequence data from various sources including sausage meat, plants, and fish. FASER was applied to 31 high protein powder raw factory ingredient total RNA samples. The samples mostly contained the expected source ingredient, chicken, while three samples unexpectedly contained pork and beef. Our results demonstrate that DNA/RNA sequencing of food ingredients, combined with a robust analysis, can be used to find contaminants and authenticate food ingredients in a single assay.
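The accuracy figure quoted in the abstract above, a median absolute difference between observed and expected proportions, can be sketched as follows. This is an illustrative calculation only, not FASER code: the function name and the toy mixture are hypothetical, and treating a species missing from one side as a proportion of 0 is an assumption.

```python
from statistics import median
from typing import Mapping


def median_absolute_difference(observed: Mapping[str, float],
                               expected: Mapping[str, float]) -> float:
    """Median of |observed - expected| over every species seen in either
    mapping; a species absent from one side counts as proportion 0.0."""
    species = set(observed) | set(expected)
    diffs = [abs(observed.get(s, 0.0) - expected.get(s, 0.0)) for s in species]
    return median(diffs)


# Hypothetical mixture, with proportions expressed as fractions of reads.
expected = {"chicken": 0.50, "pork": 0.30, "beef": 0.20}
observed = {"chicken": 0.52, "pork": 0.29, "beef": 0.19}

mad = median_absolute_difference(observed, expected)
print(f"median absolute difference: {mad * 100:.1f} percentage points")
```

The median is less sensitive than the mean to a single badly estimated species, which may be one reason to report it for mixtures with many components.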
... Pathways related to the activation of B and T cells. Intracellular antigens generate short peptides in the presence of proteases, which are presented to T-cell receptors (TCRs) on T cells by the MHC. Antigenic peptides on MHC-I molecules were recognized by CD8+ T cells, whereas peptides on MHC-II molecules were recognized by CD4+ T cells (Agarwal et al., 2022). In addition, CD28/CTLA4 on T cells binds to the costimulatory molecules CD86 and CD80 to activate T cells. ...
August 2022
... Coronaviridae epitope data were retrieved from the Immune Epitope Database and Analysis Resource (IEDB) on 20 July 2020 and again in April 2021 [44]. The IBM Research Functional Genomics Platform (FGP) [45] with semi-supervised SARS-CoV-2 genome annotation method [46] was used to identify and retrieve the protein sequences, domain sequences, and genome accessions from January 2020 to April 2021. This includes ancestral lineage as well as sampled genomes spanning eight variants of concern and of interest, as described by Beck et al. [46]. ...
December 2021
... RNA viruses have extremely high mutation rates and new variants emerge, either due to genome recombination/reassortment, selection, or the accumulation of point mutations due to the highly error-prone RNA-dependent RNA polymerase (RdRp) 14 . As the genotypic distribution of the virus shifts as a result of RNA evolution, PCR primers can lose sensitivity, which is reported in human viruses such as SARS-COV-2 15 . A multi-year meta-transcriptomic survey of over 2000 viromes from China during 2016-2019 identified 23 novel viruses from both honey bees and mites 16 , demonstrating one of the many benefits of conducting meta-virome studies. ...
April 2021
... Bacterial cells were enzymatically lysed according to the protocol used by the 100K pathogen project [24], and then RNA was isolated using Trizol LS (Ambion, Austin, TX, USA) according to manufacturer instructions. RNA sequencing libraries were prepared as described previously [25][26][27], with RNA purity and integrity confirmed using TapeStation. The same method was applied to all three swabs (Figure 2). In brief, the oral mucosa lateral to the palatoglossal folds was swabbed using a cytobrush (FLOQSwabs, Coplan, Italy, EU). ...
December 2021
npj Science of Food
... Metagenomics is a powerful tool for characterizing microbial communities, and the translation of "omics" technologies like this to food microbiology will have a significant impact in the food industry and for public health (31,32). The applications of this technology extend far beyond just public health, they can also provide valuable insights about food quality, and there is evidence that the microbiome is likely an important and effective hazard indicator within the food supply chain (33). ...
May 2020
... Beyond this, milk is used as an ingredient to make a variety of products and other foods, with raw milk quality having considerable impacts on finished product quality, safety, and production efficiency. Other studies have aimed to characterize the microbiome of food ingredients in production settings, for example, in high protein powders (5,6), produce (7,8), and fermented foods (9)(10)(11)(12). These studies are useful in demonstrating the potential that metagenomics and metatranscriptomics have in advancing food safety and quality for targeted assessments as well as for improving sensitivity for regular surveillance. ...
November 2019
npj Science of Food
... characteristics between the NAND-flash memory and dynamic random-access memory (DRAM). [1,2] Two types of 3D cross-point memories have been properly documented, including the storage-mapped memory using phase change random-access memory (PCRAM) [3][4][5] or resistive random-access memory (ReRAM) [6][7][8][9][10] and memory-mapped memory using p-spin torque transfer random-access memory (p-STT-MRAM). [11,12] In addition, ReRAM or conductive bridge random-access memory (CBRAM)-based neurons and synapses [13][14][15][16][17][18][19][20][21] have been extensively studied for artificial neural networks in contrast with the complementary metal oxide semiconductor field effect transistor-based neurons and synapses that have a limited ability to achieve a higher neural density. ...
September 2019
... Currently, food safety regulatory agencies including the Food and Drug Administration (FDA), Centers for Disease Control and Prevention (CDC), United States Department of Agriculture (USDA), and European Food Safety Authority (EFSA) are converging on the use of WGS for pathogen detection and outbreak investigation. Large scale WGS of food-associated bacteria was first initiated via the 100 K Pathogen Genome Project 9 with the goal of expanding the diversity of bacterial reference genomes-a crucial need for foodborne illness outbreak investigation, traceability, and microbiome studies 10,11 . However, since WGS relies on culturing a microbial isolate prior to sequencing, there are inherent biases and limitations in its ability to describe the microorganisms and their interactions in a food sample. ...
October 2019
Current Issues in Molecular Biology
... Sequences were assembled using Shovill (v1.0.4) (83), checked for quality, size (4.5-6.5 Mbp genome), completeness (>95% estimate), and contamination (<10% estimate) using CheckM (84), and assessed for approximate genus and species, with a further identity test for possible contamination, using Kraken (85)(86)(87)(88)(89)(90). Sixteen sequences that did not meet quality criteria were removed from downstream analysis. ...
January 2019
... Another recent example, in a bacterial setting, was the cholera outbreak in Haiti, wherein phylogenetic analysis resolved the origin of the pathogen 27 . However, for this analysis to succeed, a substantial genome sequence database of isolates collected across time and geographic location was needed to enable placement in a phylogenetic context 28,29 . As outbreaks are bound to happen in the future, investment in cataloguing the genomic space of pathogens is even more important than previously appreciated, so that populations of appropriate size can be examined systematically, as has been done in bacteria 30,31 . ...
November 2018