Comparison of alignment software for genome-wide bisulphite sequence data.

Department of Pathology, Dunedin School of Medicine, University of Otago, 270 Great King Street, Dunedin 9054, New Zealand.
Nucleic Acids Research (Impact Factor: 8.81). 02/2012; 40(10):e79. DOI: 10.1093/nar/gks150
Source: PubMed

ABSTRACT Recent advances in next generation sequencing (NGS) technology now provide the opportunity to rapidly interrogate the methylation status of the genome. However, there are challenges in handling and interpretation of the methylation sequence data because of its large volume and the consequences of bisulphite modification. We sequenced reduced representation human genomes on the Illumina platform and efficiently mapped and visualized the data with different pipelines and software packages. We examined three pipelines for aligning bisulphite converted sequencing reads and compared their performance. We also comment on pre-processing and quality control of Illumina data. This comparison highlights differences in methods for NGS data processing and provides guidance to advance sequence-based methylation data analysis for molecular biologists.

  • ABSTRACT: The rapid development of high-throughput sequencing technologies has enabled epigeneticists to quantify DNA methylation on a massive scale. Progressive increases in sequencing capacity present challenges in processing, analysing and interpreting the large amounts of data; investigating differential methylation between genome-scale data from multiple samples highlights this challenge. We have developed a differential methylation analysis package (DMAP) to generate coverage-filtered reference methylomes and identify differentially methylated regions across multiple samples from reduced representation (RRBS) and whole genome bisulphite sequencing (WGBS) experiments. We introduce a novel fragment-based approach for investigating DNA methylation patterns for RRBS data. Further, DMAP provides the identity of gene and CpG features and distances to the differentially methylated regions in a format that is easily analysed with limited bioinformatics knowledge. The software has been implemented in C and has been written to ensure portability between different platforms. The source code and documentation are freely available as a compressed TAR archive. Two test datasets are also available for download from the website. Test dataset 1 contains reads from chromosome 1 of a patient and a control, which are used for comparative analysis in the current article. Test dataset 2 contains reads from a part of chromosome 21 of three disease and three control samples for testing the operation of DMAP, especially for the analysis of variance (ANOVA). Example commands for the analyses are included. Supplementary information: Supplementary data are available at Bioinformatics online.
    Bioinformatics 03/2014; · 5.47 Impact Factor
  • ABSTRACT: Epigenetic mechanisms are proposed as an important way in which the genome responds to the environment. Epigenetic marks, including DNA methylation and histone modifications, can be triggered by environmental effects and lead to permanent changes in gene expression, affecting the phenotype of an organism. Epigenetic mechanisms have been proposed as key to plasticity, allowing environmental exposure to shape future gene expression. While we are beginning to understand how these mechanisms have roles in human biology and disease, we have little understanding of their roles and impacts on ecology and evolution. In this review, we discuss different types of epigenetic marks, their roles in gene expression and plasticity, methods for assaying epigenetic changes, and point out the future advances we require to understand fully the impact of this field. J. Exp. Zool. (Mol. Dev. Evol.) 9999B: 1–13, 2014. © 2014 Wiley Periodicals, Inc.
    Journal of Experimental Zoology Part B Molecular and Developmental Evolution 04/2014; · 2.12 Impact Factor
  • ABSTRACT: Bisulfite sequencing is the most efficient single nucleotide resolution method for analysis of methylation status at whole genome scale, but improved quality control metrics are needed to better standardize experiments. We describe BisQC, a step-by-step method for multiplexed bisulfite-converted DNA library construction, pooling, spike-in content, and bioinformatics. We demonstrate technical improvements for library preparation and bioinformatic analyses that can be done in standard laboratories. We find that decoupling amplification of bisulfite converted (bis) DNA from the indexing reaction is an advantage, specifically in reducing total PCR cycle number and pre-selecting high quality bis-libraries. We also introduce a progressive PCR method for optimal library amplification and size-selection. At the sequencing stage, we thoroughly test the benefits of pooling non-bis DNA library with bis-libraries and find that BisSeq libraries can be pooled with a high proportion of non-bis DNA libraries with minimal impact on BisSeq output. For informatics analysis, we propose a series of optimization steps including the utilization of the mitochondrial genome as a QC standard, and we assess the validity of using duplicate reads for coverage statistics. We demonstrate several quality control checkpoints at the library preparation, pre-sequencing, post-sequencing, and post-alignment stages, which should prove useful in determining sample and processing quality. We also determine that including a significant portion of non-bisulfite converted DNA with bisulfite converted DNA has a minimal impact on usable bisulfite read output.
    BMC Genomics 04/2014; 15(1):290. · 4.40 Impact Factor
