ABSTRACT: We propose the Model-based Analysis of Genome-wide CRISPR/Cas9 Knockout (MAGeCK) method for prioritizing single-guide RNAs, genes and pathways in genome-scale CRISPR/Cas9 knockout screens. MAGeCK performs better than existing methods, identifies both positively and negatively selected genes simultaneously, and reports robust results across different experimental conditions. Using public datasets, MAGeCK identified novel essential genes and pathways, including EGFR in vemurafenib-treated A375 cells harboring a BRAF mutation. MAGeCK also detected cell type-specific essential genes, including BCR and ABL1 in KBM7 cells bearing a BCR-ABL fusion, and IGF1R in HL-60 cells, which depend on the insulin signaling pathway for proliferation.
ABSTRACT: We propose a statistical algorithm, MethylPurify, that uses regions with bisulfite reads showing discordant methylation levels to infer tumor purity from tumor samples alone. MethylPurify can identify differentially methylated regions (DMRs) from individual tumor methylome samples, without genomic variation information or prior knowledge from other datasets. In simulations with mixed bisulfite reads from cancer and normal cell lines, MethylPurify correctly inferred tumor purity and identified over 96% of the DMRs. On patient data, MethylPurify gave satisfactory DMR calls from tumor methylome samples alone, and revealed DMRs that tumor-to-normal comparison may have missed because of tumor heterogeneity.
ABSTRACT: Sequencing of DNase I hypersensitive sites (DNase-seq) is a powerful technique for identifying cis-regulatory elements across the genome. We studied the key experimental parameters to optimize the performance of DNase-seq. Sequencing short fragments of 50-100 base pairs (bp) that accumulate in long internucleosome linker regions was more efficient for identifying transcription factor binding sites than sequencing longer fragments. We also assessed the potential of DNase-seq to predict transcription factor occupancy via generation of nucleotide-resolution transcription factor footprints. In modeling the sequence-specific DNase I cutting bias, we found a strong effect that varied over more than two orders of magnitude. This indicates that the nucleotide-resolution cleavage patterns at many transcription factor binding sites are derived from intrinsic DNase I cleavage bias rather than from specific protein-DNA interactions. In contrast, quantitative comparison of DNase I hypersensitivity between states can predict transcription factor occupancy associated with particular biological perturbations.
ABSTRACT: If trait-associated variants alter regulatory regions, then they should fall within chromatin marks in relevant cell types. However, it is unclear which of the many marks are most useful in defining cell types associated with disease and fine mapping variants. We hypothesized that informative marks are phenotypically cell type specific; that is, SNPs associated with the same trait likely overlap marks in the same cell type. We examined 15 chromatin marks and found that those highlighting active gene regulation were phenotypically cell type specific. Trimethylation of histone H3 at lysine 4 (H3K4me3) was the most phenotypically cell type specific (P < 1 × 10⁻⁶), driven by colocalization of variants and marks rather than gene proximity (P < 0.001). H3K4me3 peaks overlapped with 37 SNPs for plasma low-density lipoprotein concentration in the liver (P < 7 × 10⁻⁵), 31 SNPs for rheumatoid arthritis within CD4⁺ regulatory T cells (P = 1 × 10⁻⁴), 67 SNPs for type 2 diabetes in pancreatic islet cells (P = 0.003) and the liver (P = 0.003), and 14 SNPs for neuropsychiatric disease in neuronal tissues (P = 0.007). We show how cell type-specific H3K4me3 peaks can inform the fine mapping of associated SNPs to identify causal variation.
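The overlap counts quoted in this abstract (for example, 37 SNPs inside liver H3K4me3 peaks) rest on a basic operation: counting how many variant positions fall inside a set of peak intervals. A minimal Python sketch of that counting step follows. The function name, the 0-based half-open coordinate convention, and the input layout are assumptions made for illustration; the abstract does not describe the authors' implementation or how the reported P values were computed, and neither is reproduced here.

```python
from bisect import bisect_right


def count_snps_in_peaks(snps, peaks):
    """Count SNPs that fall inside at least one peak.

    snps  : iterable of (chrom, position) tuples, 0-based positions
    peaks : iterable of (chrom, start, end) half-open intervals [start, end)
    Both layouts are illustrative assumptions, not the paper's format.
    """
    # Group peaks by chromosome.
    by_chrom = {}
    for chrom, start, end in peaks:
        by_chrom.setdefault(chrom, []).append((start, end))

    # Merge overlapping peaks so one binary search per SNP is enough.
    merged = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        out = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= out[-1][1]:
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[chrom] = ([iv[0] for iv in out], [iv[1] for iv in out])

    hits = 0
    for chrom, pos in snps:
        if chrom not in merged:
            continue
        starts, ends = merged[chrom]
        # Index of the last merged peak starting at or before the SNP.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos < ends[i]:
            hits += 1
    return hits
```

Running the same count per cell type and per trait gives the raw table from which cell type specificity could then be assessed; the significance test itself would need the method details from the full paper.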
ABSTRACT: Epigenetic regulators represent a promising new class of therapeutic targets for cancer. Enhancer of zeste homolog 2 (EZH2), a subunit of Polycomb repressive complex 2 (PRC2), silences gene expression via its histone methyltransferase activity. We found that the oncogenic function of EZH2 in cells of castration-resistant prostate cancer is independent of its role as a transcriptional repressor. Instead, it involves the ability of EZH2 to act as a coactivator for critical transcription factors including the androgen receptor. This functional switch is dependent on phosphorylation of EZH2 and requires an intact methyltransferase domain. Hence, targeting the non-PRC2 function of EZH2 may have therapeutic efficacy for treating metastatic, hormone-refractory prostate cancer.
ABSTRACT: Histone modifications play important roles in regulating eukaryotic gene expression and have been used to model expression levels. Here, we present a regression model to systematically infer mRNA stability by comparing transcriptome profiles with ChIP-seq of H3K4me3, H3K27me3 and H3K36me3. The results from multiple human and mouse cell lines show that the inferred unstable mRNAs have significantly longer 3' untranslated regions (UTRs) and more microRNA binding sites within the 3' UTR than the inferred stable mRNAs. Regression residuals derived from RNA-seq, but not from GRO-seq, are highly correlated with the half-lives measured by pulse-labeling experiments, supporting the rationale of our inference. Whereas the functions enriched in the inferred stable and unstable mRNAs are consistent with those from pulse-labeling experiments, we found that the unstable mRNAs have higher cell-type specificity under functional constraint. We conclude that the systematic use of histone modifications can differentiate non-expressed mRNAs from unstable mRNAs, and distinguish stable mRNAs from highly expressed ones. In summary, we present the first computational model of mRNA stability inference that compares transcriptome and epigenome profiles, and provides an alternative strategy for directing experimental measurements.
Nucleic Acids Research 04/2012; 40(14):6414-23. · 8.81 Impact Factor
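The regression idea in the abstract above can be illustrated with a small sketch: fit expression as a linear function of the three histone mark signals, then read the per-gene residual as a stability indicator. The Python below is a minimal illustration under stated assumptions, not the authors' model. The ordinary least-squares form, the function names, and the input layout are all hypothetical, and the sign convention (a transcript more abundant than its chromatin state predicts is treated as relatively stable) is an assumption rather than something the abstract states.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def expression_residuals(marks, expression):
    """Fit expression ~ intercept + marks by least squares; return residuals.

    marks      : per-gene rows, e.g. [H3K4me3, H3K27me3, H3K36me3] signal
    expression : per-gene expression values (e.g. log-scaled RNA-seq signal)
    Both layouts are illustrative assumptions.
    """
    design = [[1.0] + list(row) for row in marks]
    p = len(design[0])
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, expression)) for i in range(p)]
    beta = solve_linear(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in design]
    return [y - f for y, f in zip(expression, fitted)]
```

Under the assumed sign convention, genes could be ranked by residual, with the most negative values as candidate unstable mRNAs. Solving the normal equations directly keeps the sketch dependency-free; for real data with many genes, a numerically more robust least-squares routine would be preferable.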