Profile analysis and prediction of tissue-specific CpG island methylation classes

Department of Molecular Biophysics, DKFZ, German Cancer Research Center, Heidelberg, Germany.
BMC Bioinformatics (Impact Factor: 2.67). 05/2009; 10:116. DOI: 10.1186/1471-2105-10-116
Source: PubMed

ABSTRACT The computational prediction of DNA methylation has become an important topic in recent years because of its role in the epigenetic control of normal and cancer-related processes. While previous prediction approaches focused only on differences between methylated and unmethylated DNA sequences, recent experimental results have shown much more complex patterns of methylation across tissues and time in the human genome. These patterns are only partially described by a binary model of DNA methylation. In this work we propose a novel approach, based on profile analysis of tissue-specific methylation, that uncovers significant differences in the sequences of CpG islands (CGIs) that predispose them to a tissue-specific methylation pattern.
We defined CGI methylation profiles that not only separate constitutively methylated from unmethylated CGIs, but also identify CGIs showing a differential degree of methylation across tissues and cell types or a lack of methylation exclusively in sperm. These profiles are clearly distinguished by a number of CGI attributes, including their evolutionary conservation, their significance, and the evolutionary evidence of prior methylation. Additionally, we assess profile functionality with respect to the different compartments of protein-coding genes and their possible use in the prediction of DNA methylation.
Our approach provides new insights into the biological features that determine whether a CGI has a functional role in the epigenetic control of gene expression and into the features associated with CGI methylation susceptibility. Moreover, we show that the ability to predict CGI methylation depends primarily on the quality of the biological information used and the relationships uncovered between different sources of knowledge. The strategy presented here can predict, besides the constitutively methylated and unmethylated classes, two more tissue-specific methylation classes while preserving the accuracy of leading binary methylation classification methods.
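As a rough illustration of the four profile classes described above, the sketch below assigns a single CGI to a class from per-tissue methylation fractions. The thresholds (0.2 and 0.8), the tissue names, and the function name `classify_cgi` are assumptions made for this example only; the abstract does not state how the profiles were actually derived.

```python
from typing import Dict

# Hypothetical cut-offs chosen for illustration; not taken from the paper.
UNMETH_MAX = 0.2  # at or below this fraction, a tissue counts as unmethylated
METH_MIN = 0.8    # at or above this fraction, a tissue counts as methylated


def classify_cgi(methylation: Dict[str, float], sperm_key: str = "sperm") -> str:
    """Assign one CGI to a methylation profile class.

    `methylation` maps a tissue name to the fraction methylated (0.0 to 1.0).
    """
    if not methylation:
        raise ValueError("no methylation measurements given")

    values = list(methylation.values())
    if all(v <= UNMETH_MAX for v in values):
        return "constitutively unmethylated"
    if all(v >= METH_MIN for v in values):
        return "constitutively methylated"

    # Sperm-specific class: unmethylated in sperm, methylated everywhere else.
    somatic = [v for tissue, v in methylation.items() if tissue != sperm_key]
    sperm_unmethylated = methylation.get(sperm_key, 1.0) <= UNMETH_MAX
    if sperm_unmethylated and somatic and all(v >= METH_MIN for v in somatic):
        return "unmethylated in sperm only"

    # Anything else varies across tissues.
    return "differentially methylated"
```

For example, `classify_cgi({"liver": 0.9, "muscle": 0.85, "sperm": 0.05})` falls into the sperm-specific class under these assumed thresholds, while a CGI with mixed values across somatic tissues is labelled as differentially methylated.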

  • ABSTRACT: Previous studies have examined DNA methylation in different trinucleotide repeat diseases. We have combined these data and used a pattern-searching algorithm to identify motifs in the DNA surrounding aberrantly methylated CpGs found in the DNA of patients with one of three trinucleotide repeat (TNR) expansion diseases: fragile X syndrome (FRAXA), myotonic dystrophy type I (DM1), or Friedreich's ataxia (FRDA). We examined sequences surrounding both the variably methylated (VM) CpGs, which are hypermethylated in patients compared with unaffected controls, and the nonvariably methylated CpGs, which remain either always methylated (AM) or never methylated (NM) in both patients and controls. Using the J48 algorithm in WEKA, we found that two patterns suffice to classify our three regions: CCGG∗, which is found in VM and not in AM regions, and AATT∗, which distinguishes NM from VM + AM using proportional frequency. Furthermore, comparing our software with MEME, we have demonstrated that our software identifies more patterns than MEME in these short DNA sequences. Thus, we present evidence that the DNA sequence surrounding a CpG can influence its susceptibility to de novo methylation in a disease state associated with a trinucleotide repeat.
    Journal of nucleic acids 01/2013; 2013:689798. DOI:10.1155/2013/689798
  • ABSTRACT: DNA methylation is a heritable chemical modification of cytosine and represents one of the most important epigenetic events. Computational prediction of DNA methylation status can be employed to speed up genome-wide methylation profiling and to identify the key features that are correlated with various methylation patterns. Here, we develop CpGIMethPred, a set of support vector machine-based models that predict the methylation status of CpG islands in the human genome under normal conditions. The features for prediction include those that have previously been demonstrated to be effective (CpG island specific attributes, DNA sequence composition patterns, DNA structure patterns, distribution patterns of conserved transcription factor binding sites and conserved elements, and histone methylation status) as well as those that have not been extensively explored but are likely to contribute additional information from a biological point of view (nucleosome positioning propensities, gene functions, and histone acetylation status). Statistical tests are performed to identify the features that are significantly correlated with the methylation status of the CpG islands, and principal component analysis is then performed to decorrelate the selected features. Data from the Human Epigenome Project (HEP) are used to train, validate and test the predictive models. Specifically, the models are trained and validated using the DNA methylation data obtained in CD4 lymphocytes, and are then tested for generalizability using the DNA methylation data obtained in the other 11 normal tissues and cell types.
    Our experiments have shown that (1) an eight-dimensional feature space that is selected via principal component analysis and that combines all categories of information is effective for predicting CpG island methylation status, (2) by incorporating information on nucleosome positioning, gene functions, and histone acetylation, the models can achieve higher specificity and accuracy than existing models while maintaining comparable sensitivity, (3) the histone modification (methylation and acetylation) information contributes significantly to the prediction, and without it the performance of the models deteriorates, and (4) the predictive models generalize well to different tissues and cell types. The developed program CpGIMethPred is freely available at
    BMC Medical Genomics 01/2013; 6(1). DOI:10.1186/1755-8794-6-S1-S13 · 3.91 Impact Factor
  • ABSTRACT: Epigenetic modifications are important in the normal functioning of the cell, from regulating dynamic expression of essential genes and associated proteins to repressing those that are unneeded. Epigenetic changes are essential for development and functioning of the kidney, and aberrant methylation, histone modifications, and expression of microRNA could lead to chronic kidney disease (CKD). Here, epigenetic modifications modulate transforming growth factor β signaling, inflammation, profibrotic genes, and the epithelial-to-mesenchymal transition, promoting renal fibrosis and progression of CKD. Identification of these epigenetic changes is important because they are potentially reversible and may serve as therapeutic targets in the future to prevent subsequent renal fibrosis and CKD. In this review we discuss the different types of epigenetic control, methods to study epigenetic modifications, and how epigenetics promotes progression of CKD.
    Seminars in Nephrology 07/2013; 33(4):363-74. DOI:10.1016/j.semnephrol.2013.05.008 · 2.94 Impact Factor
