De-MetaST-BLAST: A Tool for the Validation of Degenerate Primer Sets and Data Mining of Publicly Available Metagenomes

Department of Microbiology, University of Tennessee, Knoxville, Tennessee, United States of America.
PLoS ONE (Impact Factor: 3.23). 11/2012; 7(11):e50362. DOI: 10.1371/journal.pone.0050362
Source: PubMed


Development and use of primer sets to amplify nucleic acid sequences of interest is fundamental to studies spanning many life science disciplines. As such, the validation of primer sets is essential. Several computer programs have been created to aid in the initial selection of primer sequences that may or may not require multiple nucleotide combinations (i.e., degeneracies). In contrast, validation of primer specificity has remained largely unchanged for several decades, and few programs currently allow the evaluation of primers containing degenerate nucleotide bases. To address this gap, we developed the program De-MetaST, which performs an in silico amplification using user-defined nucleotide sequence dataset(s) and primer sequences that may contain degenerate bases. The program returns an output file that contains the in silico amplicons. When De-MetaST is paired with NCBI's BLAST (De-MetaST-BLAST), the program also returns the top 10 hits from the NCBI nr database for each recovered in silico amplicon. While the original motivation for development of this search tool was degenerate primer validation using the wealth of nucleotide sequences available in environmental metagenome and metatranscriptome databases, it has potential utility in many data mining applications.
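The idea of in silico amplification with degenerate primers can be illustrated with a minimal Python sketch. This is not De-MetaST's actual implementation, which the abstract does not describe; it assumes exact matching of IUPAC degenerate codes, searches only the given strand, and pairs each forward site with the nearest downstream reverse site. The function names and the `max_length` cutoff are hypothetical and chosen for illustration only.

```python
import re

# IUPAC nucleotide codes mapped to the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement of every IUPAC code, used to reverse-complement a primer.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Return the reverse complement, preserving degenerate codes."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_to_regex(primer):
    """Translate a (possibly degenerate) primer into a regex pattern."""
    return "".join(
        IUPAC[base] if len(IUPAC[base]) == 1 else "[" + IUPAC[base] + "]"
        for base in primer.upper()
    )


def in_silico_amplicons(sequence, forward, reverse, max_length=2000):
    """Hypothetical helper: list amplicons found on the given strand.

    Assumption: both primers are written 5'->3', so the reverse primer
    is searched for as its reverse complement on the same strand.
    """
    sequence = sequence.upper()
    # A lookahead lets overlapping forward-primer sites all be reported.
    fwd = re.compile("(?=(" + primer_to_regex(forward) + "))")
    rev = re.compile(primer_to_regex(reverse_complement(reverse)))
    amplicons = []
    for site in fwd.finditer(sequence):
        start = site.start()
        # Nearest reverse-primer site downstream of the forward primer.
        hit = rev.search(sequence, start + len(forward))
        if hit and hit.end() - start <= max_length:
            amplicons.append(sequence[start:hit.end()])
    return amplicons


if __name__ == "__main__":
    template = "TTTACGGATCCAAAAGGGTTCAGTTT"
    # R stands for A or G in both primers.
    print(in_silico_amplicons(template, "ACGGRTCC", "CTGARCCC"))
```

A fuller tool would also search the opposite strand, tolerate mismatches, and stream large metagenome files rather than holding a sequence in memory; those choices are left out here to keep the sketch short.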



Available from: Steven Wilhelm
  • ABSTRACT: The distribution of cyanomyoviruses was estimated using a quantitative PCR (qPCR) approach that targeted the g20 gene as a proxy for phage. Samples were collected spatially during a >3,000 km transect through the Sargasso Sea and temporally during a gyre-constrained phytoplankton bloom within the southern Pacific Ocean. Cyanomyovirus abundances were lower in the Sargasso Sea than in the southern Pacific Ocean, ranging from 2.75 × 10³ to 5.15 × 10⁴ mL⁻¹ and correlating with the abundance of their potential hosts (Prochlorococcus and Synechococcus). Cyanomyovirus abundance in the southern Pacific Ocean (east of New Zealand) followed Synechococcus host populations in the system: this included a decrease in g20 gene copies (from 4.3 × 10⁵ to 9.6 × 10³ mL⁻¹) following the demise of a Synechococcus bloom. When compared with direct counts of viruses, observations suggest that the cyanomyoviruses comprised 0.5 to >25% of the total virus community. We estimated daily lysis rates of 0.2 to 46% of the standing stock of Synechococcus in the Pacific Ocean compared to <1.0% in the Sargasso Sea. In total, our observations confirm this family of viruses is abundant in marine systems and that they are an important source of cyanobacterial mortality. © 2012 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.
    Full-text · Article · Dec 2012 · FEMS Microbiology Ecology
  • ABSTRACT: Serotyping analysis of bacterial pathogens in food products is important for foodborne disease surveillance and outbreak investigations. Traditional immunological techniques are labor-intensive and time-consuming, whereas polymerase chain reaction (PCR)-based techniques are more robust, consistent and rapid. PCR-based methods also provide easier standardization and better reproducibility. Here, we summarize some recent developments and applications of PCR-based serotyping for common foodborne pathogens, and provide a list of available bioinformatics tools for developing PCR-based serotyping assays. Published by Elsevier B.V.
    No preview · Article · Jan 2015 · Journal of Microbiological Methods
  • ABSTRACT: Microorganisms are central players in the turnover of nutrients in soil and drive the decomposition of complex organic materials into simpler forms that can be utilized by other biota. Therefore, microbes strongly drive soil quality and the ecosystem services provided by soils, including plant yield and quality. Thus, it is one of the major goals of soil sciences to describe the most relevant enzymes that are involved in nutrient mobilization and to understand the regulation of expression of the corresponding genes. This task is, however, impeded by the enormous microbial diversity in soils. Indeed, we are far from appreciating the number of species present in 1 g of soil, as well as the major functional traits they carry. Here, even most next-generation sequencing (NGS) approaches fail, as immense sequencing efforts are needed to fully uncover the functional diversity of soils. Thus, even if a gene of interest can be identified by BLAST similarity analysis, the number of reads obtained by NGS is too low for a quantitative assessment of the gene or for a description of its taxonomic diversity. Here we present an integrated approach, which we termed the second-generation full cycle approach, to quantify the abundance and diversity of key enzymes involved in nutrient mobilization. This approach involves the functional annotation of metagenomic data with a relatively low coverage (5 Gbases or less) and the design of highly targeted primer systems to assess the abundance or diversity of enzyme-coding genes that are drivers for a particular transformation step in nutrient turnover.
    No preview · Chapter · Jan 2016