A bioinformatician's guide to metagenomics.

Microbial Ecology Program, DOE Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA, USA.
Microbiology and Molecular Biology Reviews: MMBR (Impact Factor: 12.59). 01/2009; 72(4):557-78. DOI: 10.1128/MMBR.00009-08
Source: PubMed

ABSTRACT As random shotgun metagenomic projects proliferate and become the dominant source of publicly available sequence data, procedures for the best practices in their execution and analysis become increasingly important. Based on our experience at the Joint Genome Institute, we describe the chain of decisions accompanying a metagenomic project from the viewpoint of the bioinformatic analysis step by step. We guide the reader through a standard workflow for a metagenomic project beginning with presequencing considerations such as community composition and sequence data type that will greatly influence downstream analyses. We proceed with recommendations for sampling and data generation including sample and metadata collection, community profiling, construction of shotgun libraries, and sequencing strategies. We then discuss the application of generic sequence processing steps (read preprocessing, assembly, and gene prediction and annotation) to metagenomic data sets in contrast to genome projects. Different types of data analyses particular to metagenomes are then presented, including binning, dominant population analysis, and gene-centric analysis. Finally, data management issues are presented and discussed. We hope that this review will assist bioinformaticians and biologists in making better-informed decisions on their journey during a metagenomic project.

  • ABSTRACT: With the rapid development of high-throughput sequencing technology, biomedical research has entered into the era of big data. It causes problems about storage and analysis of massive biological data which need to be solved by high-performance computing. Therefore, we build the localized high-performance one-stop data analysis platform to provide convenient and efficient computational analysis services for biomedical researchers. We deploy Galaxy and integrate software tools and datasets into Galaxy in computing cluster, build stable web service, FTP service and management database in order to optimize and improve the performance of Galaxy, and use distributed resource management application interface to collaborate Galaxy with Sun Grid Engine for automatically scheduling and assigning computing resources. Currently the platform has been put into trial operation. The peak performance is 10 Teraflops and the capacity of storage is 40TB. The platform provides many functions such as sequence alignment, short sequence mapping, gene annotation, transcriptome analysis, metagenomic analysis and phylogenetic analysis, and approximately 700GB reference databases including human genome, viruses, bacteria, fungi, etc.
    2013 6th International Conference on Biomedical Engineering and Informatics (BMEI); 12/2013
  • ABSTRACT: Were he alive today, would Louis Pasteur still champion culture methods he pioneered over 150 years ago for identifying bacterial pathogens? Or, might he suggest that new molecular techniques may prove a better way forward for quickly detecting the true microbial diversity of wounds? As modern clinicians faced with treating complex patients with diabetic foot infections (DFI), should we still request venerated and familiar culture and sensitivity methods, or is it time to ask for newer molecular tests, such as 16S rRNA gene sequencing? Or, are molecular techniques as yet too experimental, non-specific and expensive for current clinical use? While molecular techniques help us to identify more microorganisms from a DFI, can they tell us ‘who done it?’, that is, which are the causative pathogens and which are merely colonizers? Furthermore, can molecular techniques provide clinically relevant, rapid information on the virulence of wound isolates and their antibiotic sensitivities? We herein review current knowledge on the microbiology of DFI, from standard culture methods to the current era of rapid and comprehensive ‘crime scene investigation’ (CSI) techniques.
    BMC Medicine 12/2015; 13(1). DOI:10.1186/s12916-014-0232-0 · 7.28 Impact Factor
  • ABSTRACT: The relationship between viruses and prokaryotes in the Arctic appears to differ from that at lower latitudes because of low temperature and low abundances of viruses and bacteria, but the impact of viruses on Arctic bacterial genomes is unclear. In the present study, genome fragments of bacteria from the western Arctic Ocean were sequenced to explore the prevalence of virus genes in the genomes of bacteria inhabiting these perennially cold waters. Arctic bacterial genomes were sampled by cloning bacterial environmental DNA into a fosmid vector, and virus DNA within the bacterial genomes was identified using marine viral metagenomes to query the bacterial genomes using the basic local alignment search tool (BLAST). Virus DNA flanked by bacterial DNA was identified as being part of bacterial genomes. Virus DNA was ~3-fold more abundant in Arctic bacterial genomes than in bacterial genomes from Monterey Bay and ~10-fold more abundant than in bacterial genomes from Antarctic waters. Phage terminase genes involved in packaging DNA into phage capsids were the most abundant gene family identified in the Arctic bacterial genomes. Viruses appear to have a larger impact on prokaryotic communities in the Arctic than what might be inferred from the low bacterial and viral abundances in this high latitude ocean.
    Aquatic Microbial Ecology 05/2012; 66(2):107-116. DOI:10.3354/ame01569 · 1.90 Impact Factor
