ABSTRACT: Somatic mutations affecting components of the RNA splicing machinery occur with high frequencies across many tumor types. These mutations give rise to distinct alterations in normal splice site and exon recognition, such as unusual 3' splice site preferences, that likely contribute to tumorigenesis.
We analyzed genome-wide patterns of RNA splicing across 805 matched tumor and normal control samples from 16 distinct cancer types to identify signals of abnormal cancer-associated splicing.
We found that abnormal RNA splicing, typified by widespread intron retention, is common across cancers even in the absence of mutations directly affecting the RNA splicing machinery. Almost all liquid and solid cancer types exhibited frequent retention of both alternative and constitutive introns relative to control normal tissues. The sole exception was breast cancer, where intron retention typified adjacent normal rather than cancer tissue. Different introns were preferentially retained in specific cancer types, although a small subset of introns enriched for genes encoding RNA splicing and export factors exhibited frequent retention across diverse cancers. The extent of intron retention correlated with the presence of IDH1 and IDH2 mutations in acute myeloid leukemia and across molecular subtypes in breast cancer. Many introns that were preferentially retained in primary cancers were present at high levels in the cytoplasmic mRNA pools of cancer cell lines.
Our data indicate that abnormal RNA splicing is a common characteristic of cancers even in the absence of mutational insults to the splicing machinery, and suggest that intron-containing mRNAs contribute to the transcriptional diversity of many cancers.
ABSTRACT: The importance of androgen receptor (AR) signaling is increasingly being recognized in breast cancer, which has elicited clinical trials aimed at assessing the efficacy of androgen deprivation therapy (ADT) for metastatic disease. In prostate cancer, resistance to ADT is frequently associated with the emergence of androgen-independent splice variants of the AR (AR variants, AR-Vs) that lack the ligand-binding domain (LBD) and are constitutively active. Women with breast cancer may be prone to a similar phenomenon. Herein, we show that in addition to the prototypical transcript, the AR gene produces a diverse range of AR-V transcripts in primary breast tumors. The most frequently and highly expressed variant was AR-V7 (exons 1/2/3/CE3), which was detectable at the mRNA level in >50% of all breast cancers and at the protein level in a subset of ERα-negative tumors. Functionally, AR-V7 is a constitutively active and ADT-resistant transcription factor that promotes growth and regulates a transcriptional program distinct from AR in ERα-negative breast cancer cells. Importantly, we provide ex vivo evidence that AR-V7 is upregulated by the AR antagonist enzalutamide in primary breast tumors. These findings have implications for treatment response in the ongoing clinical trials of ADT in breast cancer.
ABSTRACT: There is substantial heterogeneity among primary prostate cancers, evident in the spectrum of molecular abnormalities and its variable clinical course. As part of The Cancer Genome Atlas (TCGA), we present a comprehensive molecular analysis of 333 primary prostate carcinomas. Our results revealed a molecular taxonomy in which 74% of these tumors fell into one of seven subtypes defined by specific gene fusions (ERG, ETV1/4, and FLI1) or mutations (SPOP, FOXA1, and IDH1). Epigenetic profiles showed substantial heterogeneity, including an IDH1 mutant subset with a methylator phenotype. Androgen receptor (AR) activity varied widely and in a subtype-specific manner, with SPOP and FOXA1 mutant tumors having the highest levels of AR-induced transcripts. 25% of the prostate cancers had a presumed actionable lesion in the PI3K or MAPK signaling pathways, and DNA repair genes were inactivated in 19%. Our analysis reveals molecular heterogeneity among primary prostate cancers, as well as potentially actionable molecular defects.
ABSTRACT: Facioscapulohumeral muscular dystrophy (FSHD) is a muscular dystrophy caused by inefficient epigenetic repression of the D4Z4 macrosatellite array and somatic expression of the DUX4 retrogene. DUX4 is a double homeobox transcription factor that is normally expressed in the testis and causes apoptosis and FSHD when mis-expressed in skeletal muscle. The mechanism(s) of DUX4 toxicity in muscle is incompletely understood. We report that DUX4-triggered proteolytic degradation of UPF1, a central component of the nonsense-mediated decay (NMD) machinery, is associated with profound NMD inhibition, resulting in global accumulation of RNAs normally degraded as NMD substrates. DUX4 mRNA is itself degraded by NMD, such that inhibition of NMD by DUX4 protein stabilizes DUX4 mRNA through a double-negative feedback loop in FSHD muscle cells. This feedback loop illustrates an unexpected mode of autoregulatory behavior of a transcription factor, is consistent with 'bursts' of DUX4 expression in FSHD muscle, and has implications for FSHD pathogenesis.
ABSTRACT: Substantial effort is currently devoted to identifying cancer-associated alterations using genomics. Here, we show that standard blood collection procedures rapidly change the transcriptional and posttranscriptional landscapes of hematopoietic cells, resulting in biased activation of specific biological pathways; up-regulation of pseudogenes, antisense RNAs, and unannotated coding isoforms; and RNA surveillance inhibition. Affected genes include common mutational targets and thousands of other genes participating in processes such as chromatin modification, RNA splicing, T- and B-cell activation, and NF-κB signaling. The majority of published leukemic transcriptomes exhibit signals of this incubation-induced dysregulation, explaining up to 40% of differences in gene expression and alternative splicing between leukemias and reference normal transcriptomes. The effects of sample processing are particularly evident in pan-cancer analyses. We provide biomarkers that detect prolonged incubation of individual samples and show that keeping blood on ice markedly reduces changes to the transcriptome. In addition to highlighting the potentially confounding effects of technical artifacts in cancer genomics data, our study emphasizes the need to survey the diversity of normal as well as neoplastic cells when characterizing tumors.
Full-text · Article · Nov 2014 · Proceedings of the National Academy of Sciences
ABSTRACT: Whole-exome sequencing studies have identified common mutations affecting genes encoding components of the RNA splicing machinery in hematological malignancies. Here, we sought to determine how mutations affecting the 3' splice site recognition factor U2AF1 alter its normal role in RNA splicing. We find that U2AF1 mutations influence the similarity of splicing programs in leukemias, but do not give rise to widespread splicing failure. U2AF1 mutations cause differential splicing of hundreds of genes, affecting biological pathways such as DNA methylation (DNMT3B), X chromosome inactivation (H2AFY), the DNA damage response (ATR, FANCA), and apoptosis (CASP8). We show that U2AF1 mutations alter the preferred 3' splice site motif in patients, in cell culture, and in vitro. Mutations affecting the first and second zinc fingers give rise to different alterations in splice site preference and largely distinct downstream splicing programs. These allele-specific effects are consistent with a computationally predicted model of U2AF1 in complex with RNA. Our findings suggest that U2AF1 mutations contribute to pathogenesis by causing quantitative changes in splicing that affect diverse cellular pathways, and give insight into the normal function of U2AF1's zinc finger domains.
ABSTRACT: To identify key regulators of human brain tumor maintenance and initiation, we performed multiple genome-wide RNAi screens in patient-derived glioblastoma multiforme (GBM) stem cells (GSCs). These screens identified the plant homeodomain (PHD)-finger domain protein PHF5A as differentially required for GSC expansion, as compared with untransformed neural stem cells (NSCs) and fibroblasts. Given PHF5A's known involvement in facilitating interactions between the U2 snRNP complex and ATP-dependent helicases, we examined cancer-specific roles in RNA splicing. We found that in GSCs, but not untransformed controls, PHF5A facilitates recognition of exons with unusual C-rich 3' splice sites in thousands of essential genes. PHF5A knockdown in GSCs, but not untransformed NSCs, astrocytes, or fibroblasts, inhibited splicing of these genes, leading to cell cycle arrest and loss of viability. Notably, pharmacologic inhibition of U2 snRNP activity phenocopied PHF5A knockdown in GSCs and also in NSCs or fibroblasts overexpressing MYC. Furthermore, PHF5A inhibition compromised GSC tumor formation in vivo and inhibited growth of established GBM patient-derived xenograft tumors. Our results demonstrate a novel viability requirement for PHF5A to maintain proper exon recognition in brain tumor-initiating cells and may provide new inroads for novel anti-GBM therapeutic strategies.
Preview · Article · May 2013 · Genes & development
ABSTRACT: Recent studies have suggested that plant genomes have undergone potentially rampant horizontal gene transfer (HGT), especially in the mitochondrial genome. Parasitic plants have provided the strongest evidence of HGT, which appears to be facilitated by the intimate physical association between the parasites and their hosts. A recent phylogenomic study demonstrated that in the holoparasite Rafflesia cantleyi (Rafflesiaceae), whose close relatives possess the world's largest flowers, about 2.1% of nuclear gene transcripts were likely acquired from its obligate host. Here, we used next-generation sequencing to obtain the 38 protein-coding and ribosomal RNA genes common to the mitochondrial genomes of angiosperms from R. cantleyi and five additional species, including two of its closest relatives and two host species. Strikingly, our phylogenetic analyses conservatively indicate that 24%-41% of these gene sequences show evidence of HGT in Rafflesiaceae, depending on the species. Most of these transgenic sequences possess intact reading frames and are actively transcribed, indicating that they are potentially functional. Additionally, some of these transgenes maintain synteny with their donor and recipient lineages, suggesting that native genes have likely been displaced via homologous recombination. Our study is the first to comprehensively assess the magnitude of HGT in plants involving a genome (i.e., mitochondria) and a species interaction (i.e., parasitism) where it has been hypothesized to be potentially rampant. Our results establish for the first time that, although the magnitude of HGT involving nuclear genes is appreciable in these parasitic plants, HGT involving mitochondrial genes is substantially higher. This may represent a more general pattern for other parasitic plant clades and perhaps more broadly for angiosperms.
ABSTRACT: Long noncoding RNAs (lncRNAs) are often expressed in a development-specific manner, yet little is known about their roles in lineage commitment. Here, we identified Braveheart (Bvht), a heart-associated lncRNA in mouse. Using multiple embryonic stem cell (ESC) differentiation strategies, we show that Bvht is required for progression of nascent mesoderm toward a cardiac fate. We find that Bvht is necessary for activation of a core cardiovascular gene network and functions upstream of mesoderm posterior 1 (MesP1), a master regulator of a common multipotent cardiovascular progenitor. We also show that Bvht interacts with SUZ12, a component of polycomb-repressive complex 2 (PRC2), during cardiomyocyte differentiation, suggesting that Bvht mediates epigenetic regulation of cardiac commitment. Finally, we demonstrate a role for Bvht in maintaining cardiac fate in neonatal cardiomyocytes. Together, our work provides evidence for a long noncoding RNA with critical roles in the establishment of the cardiovascular lineage during mammalian development.
ABSTRACT: Recent studies have shown that plant genomes have potentially undergone rampant horizontal gene transfer (HGT). In plant parasitic systems HGT appears to be facilitated by the intimate physical association between the parasite and its host. HGT in these systems has been invoked when a DNA sequence obtained from a parasite is placed phylogenetically very near to its host rather than with its closest relatives. Studies of HGT in parasitic plants have relied largely on the fortuitous discovery of gene phylogenies that indicate HGT, and no broad systematic search for HGT has been undertaken in parasitic systems where it is most expected to occur.
We analyzed the transcriptomes of the holoparasite Rafflesia cantleyi Solms-Laubach and its obligate host Tetrastigma rafflesiae Miq. using phylogenomic approaches. Our analyses show that several dozen actively transcribed genes, most of which appear to be encoded in the nuclear genome, are likely of host origin. We also find that hundreds of vertically inherited genes (VGT) in this parasitic plant exhibit codon usage properties that are more similar to its host than to its closest relatives.
Our results establish for the first time a substantive number of HGTs in a plant host-parasite system. The elevated rate of unidirectional host-to-parasite gene transfer raises the possibility that HGTs may provide a fitness benefit to Rafflesia for maintaining these genes. Finally, a similar convergence in codon usage of VGTs has been shown in microbes with high HGT rates, which may help to explain the increase of HGTs in these parasitic plants.
ABSTRACT: Highly overlapping patterns of genome-wide binding of many distinct transcription factors have been observed in worms, insects, and mammals, but the origins and consequences of this overlapping binding remain unclear. While analyzing chromatin immunoprecipitation data sets from 21 sequence-specific transcription factors active in the Drosophila embryo, we found that binding of all factors exhibits a dose-dependent relationship with "TAGteam" sequence motifs bound by the zinc finger protein Vielfaltig, also known as Zelda, a recently discovered activator of the zygotic genome. TAGteam motifs are present and well conserved in highly bound regions, and are associated with transcription factor binding even in the absence of canonical recognition motifs for these factors. Furthermore, levels of binding in promoters and enhancers of zygotically transcribed genes are correlated with RNA polymerase II occupancy and gene expression levels. Our results suggest that Vielfaltig acts as a master regulator of early development by facilitating the genome-wide establishment of overlapping patterns of binding of diverse transcription factors that drive global gene expression.
ABSTRACT: Technical variability in human libraries. Single-end (75 bp) and paired-end (2×50 bp) sequencing of the same human libraries captures sequencing variability.
ABSTRACT: Splice site score difference and maximum splice site score as a function of switch score for different classes of alternative 3′ splice sites. (A) The splice site scores of regulated NAGNAG 3′ splice sites tended to be far more similar to one another than those of unregulated events, suggesting that regulation is easier to achieve when the intrinsic strengths of the sites are evenly matched. (B–C) This trend was much weaker for more distant alternative 3′ splice site events. (D) The 3′ splice site scores of tissue-regulated NAGNAGs also tended to be somewhat weaker than for unregulated NAGNAGs or constitutive 3′ splice sites. This observation suggested that weaker splice sites are more easily regulated, consistent with previous studies of other types of alternative splicing. (E–F) This trend for regulated events to be associated with weaker splice site scores was observed to a much lesser extent for alternative 3′ splice sites separated by longer distances, suggesting that splicing regulatory elements may more readily exert differential effects on more widely spaced 3′ splice sites, making matching of splice site scores less critical for achieving regulation for this class than it is for NAGNAGs. For example, we have previously shown that most exonic splicing silencer (ESS) elements inhibit the intron-proximal site when situated between competing 3′ splice sites, an arrangement that requires separation of the competing sites by sufficient space to accommodate the ESS, and so does not apply to NAGNAGs. “v. low” indicates “very low,” and “CJ” indicates the 3′ splice sites of constitutive junctions.
ABSTRACT: Relative conservation at the −4 position for different classes of NAGNAGs. Plot shows median relative conservation at the −4 position, computed as (phastCons score at −4 position/phastCons score at −3 position). “CJ” indicates the 3′ splice sites of constitutive junctions. Error bars indicate the standard error of the median, estimated by bootstrapping.
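The statistic described in this caption can be illustrated with a minimal Python sketch. It assumes the ratio is taken per site and then summarized by its median, which is one reading of the caption; the function names, the resample count, the handling of zero scores, and the example scores are all hypothetical and are not taken from the original analysis.

```python
import random
import statistics


def relative_conservation(scores_minus4, scores_minus3):
    """Per-site ratio: phastCons at -4 divided by phastCons at -3.

    Assumption: sites whose -3 score is zero are skipped to avoid
    division by zero; the original analysis may have handled them differently.
    """
    return [a / b for a, b in zip(scores_minus4, scores_minus3) if b > 0]


def bootstrap_se_of_median(values, n_resamples=1000, seed=0):
    """Estimate the standard error of the median by resampling with replacement."""
    rng = random.Random(seed)
    medians = [
        statistics.median(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    ]
    # The spread of the resampled medians approximates the standard error.
    return statistics.stdev(medians)


# Hypothetical phastCons scores for five splice sites (illustration only).
minus4 = [0.20, 0.50, 0.90, 0.10, 0.60]
minus3 = [0.80, 1.00, 0.90, 0.50, 0.75]

ratios = relative_conservation(minus4, minus3)
median_ratio = statistics.median(ratios)
standard_error = bootstrap_se_of_median(ratios)
print(f"median relative conservation = {median_ratio:.2f} +/- {standard_error:.2f}")
```

A fixed seed keeps the bootstrap estimate reproducible between runs; with real data the resample count would normally be chosen large enough that the estimate is stable.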