ChromHMM: automating chromatin-state discovery and characterization.

[1] Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA. [2] Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, Massachusetts, USA. [3] Department of Biological Chemistry, University of California Los Angeles, Los Angeles, California, USA.
Nature Methods (Impact Factor: 25.95). 02/2012; 9(3):215-6. DOI: 10.1038/nmeth.1906
Source: PubMed
  • ABSTRACT: Genome-wide mapping of chromatin states is essential for defining regulatory elements and inferring their activities in eukaryotic genomes. A number of hidden Markov model (HMM)-based methods have been developed to infer chromatin state maps from genome-wide histone modification data for an individual genome. In order to perform a principled comparison of evolutionarily distant epigenomes, we must consider species-specific biases such as differences in genome size, strength of signal enrichment, and co-occurrence patterns of histone modifications. Here, we present a new Bayesian non-parametric method called hierarchically-linked infinite hidden Markov model (hiHMM) to jointly infer chromatin state maps in multiple genomes (different species, cell types, and developmental stages) using genome-wide histone modification data. This flexible framework provides a new way to learn a consistent definition of chromatin states across multiple genomes, thus facilitating a direct comparison among them. We demonstrate the utility of this method using synthetic data as well as multiple modENCODE ChIP-seq datasets. The hierarchical and Bayesian non-parametric formulation in our approach is an important extension to the current set of methodologies for comparative chromatin landscape analysis. SUPPLEMENTARY INFORMATION: Source codes are available at Chromatin data are available at © The Author (2015). Published by Oxford University Press.
  • ABSTRACT: The current epidemic of obesity and associated diseases calls for swift actions to better understand the mechanisms by which genetics and environmental factors affect metabolic health in humans. Monozygotic (MZ) twin pairs showing discordance for obesity suggest that epigenetic influences represent one such mechanism. We studied genome-wide leukocyte DNA methylation variation in 30 clinically healthy young adult MZ twin pairs discordant for body mass index (BMI; average within-pair BMI difference: 5.4 ± 2.0 kg/m²). There were no differentially methylated cytosine-guanine (CpG) sites between the co-twins discordant for BMI. However, stratification of the twin pairs based on the level of liver fat accumulation revealed two epigenetically highly different groups. Significant DNA methylation differences (n = 1,236 CpG sites (CpGs)) between the co-twins were only observed if the heavier co-twins had excessive liver fat (n = 13 twin pairs). This unhealthy pattern of obesity was coupled with insulin resistance and low-grade inflammation. The differentially methylated CpGs included 23 genes known to be associated with obesity, liver fat, type 2 diabetes mellitus (T2DM) and metabolic syndrome, and potential novel metabolic genes. Differentially methylated CpG sites were overrepresented at promoters, insulators, and heterochromatic and repressed regions. Based on predictions by overlapping histone marks, repressed and weakly transcribed sites were significantly more often hypomethylated, whereas sites with strong enhancers and active promoters were hypermethylated. Further, significant clustering of differentially methylated genes in vitamin, amino acid, fatty acid, sulfur, and renin-angiotensin metabolism pathways was observed. The methylome in leukocytes is altered in obesity associated with metabolic disturbances, and our findings indicate several novel candidate genes and pathways in obesity and obesity-related complications.
    04/2015; 7(1):39. DOI:10.1186/s13148-015-0073-5
  • ABSTRACT: Transcription factors (TFs) and epigenetic modifications play crucial roles in the regulation of gene expression, and correlations between the two types of factors have been discovered. However, methods for quantitatively studying the correlations remain limited. Here, we present a computational approach to systematically investigating how epigenetic changes in chromatin architectures or DNA sequences relate to TF binding. We implemented statistical analyses to illustrate that epigenetic modifications are predictive of TF binding affinities, without the need for sequence information. Intriguingly, by considering genome locations relative to transcription start sites (TSSs) or enhancer midpoints, our analyses show that different locations display various relationship patterns. For instance, H3K4me3, H3K9ac and H3K27ac contribute more in the regions near TSSs, whereas H3K4me1 and H3K79me2 dominate in the regions far from TSSs. DNA methylation plays a relatively more important role close to TSSs than in other regions. In addition, the results show that epigenetic modification models for the predictions of TF binding affinities are cell line-specific. Taken together, our study elucidates highly coordinated, but location- and cell type-specific relationships between epigenetic modifications and binding affinities of TFs.
    Nucleic Acids Research 01/2015; 43(1). DOI:10.1093/nar/gkv255 · 8.81 Impact Factor
