ABSTRACT: Somatic mutations affecting components of the RNA splicing machinery occur with high frequencies across many tumor types. These mutations give rise to distinct alterations in normal splice site and exon recognition, such as unusual 3' splice site preferences, that likely contribute to tumorigenesis.
We analyzed genome-wide patterns of RNA splicing across 805 matched tumor and normal control samples from 16 distinct cancer types to identify signals of abnormal cancer-associated splicing.
We found that abnormal RNA splicing, typified by widespread intron retention, is common across cancers even in the absence of mutations directly affecting the RNA splicing machinery. Almost all liquid and solid cancer types exhibited frequent retention of both alternative and constitutive introns relative to control normal tissues. The sole exception was breast cancer, where intron retention typified adjacent normal rather than cancer tissue. Different introns were preferentially retained in specific cancer types, although a small subset of introns enriched for genes encoding RNA splicing and export factors exhibited frequent retention across diverse cancers. The extent of intron retention correlated with the presence of IDH1 and IDH2 mutations in acute myeloid leukemia and across molecular subtypes in breast cancer. Many introns that were preferentially retained in primary cancers were present at high levels in the cytoplasmic mRNA pools of cancer cell lines.
Our data indicate that abnormal RNA splicing is a common characteristic of cancers even in the absence of mutational insults to the splicing machinery, and suggest that intron-containing mRNAs contribute to the transcriptional diversity of many cancers.
Genome Medicine 12/2015; 7(1). DOI:10.1186/s13073-015-0168-9 · 5.34 Impact Factor
ABSTRACT: The importance of androgen receptor (AR) signaling is increasingly being recognized in breast cancer, which has elicited clinical trials aimed at assessing the efficacy of androgen deprivation therapy (ADT) for metastatic disease. In prostate cancer, resistance to ADT is frequently associated with the emergence of androgen-independent splice variants of the AR (AR variants, AR-Vs) that lack the LBD and are constitutively active. Women with breast cancer may be prone to a similar phenomenon. Herein, we show that in addition to the prototypical transcript, the AR gene produces a diverse range of AR-V transcripts in primary breast tumors. The most frequently and highly expressed variant was AR-V7 (exons 1/2/3/CE3), which was detectable at the mRNA level in >50% of all breast cancers and at the protein level in a subset of ERα-negative tumors. Functionally, AR-V7 is a constitutively active and ADT-resistant transcription factor that promotes growth and regulates a transcriptional program distinct from AR in ERα-negative breast cancer cells. Importantly, we provide ex vivo evidence that AR-V7 is upregulated by the AR antagonist enzalutamide in primary breast tumors. These findings have implications for treatment response in the ongoing clinical trials of ADT in breast cancer.
ABSTRACT: Facioscapulohumeral muscular dystrophy (FSHD) is a muscular dystrophy caused by inefficient epigenetic repression of the D4Z4 macrosatellite array and somatic expression of the DUX4 retrogene. DUX4 is a double homeobox transcription factor that is normally expressed in the testis and causes apoptosis and FSHD when mis-expressed in skeletal muscle. The mechanism(s) of DUX4 toxicity in muscle is incompletely understood. We report that DUX4-triggered proteolytic degradation of UPF1, a central component of the nonsense-mediated decay (NMD) machinery, is associated with profound NMD inhibition, resulting in global accumulation of RNAs normally degraded as NMD substrates. DUX4 mRNA is itself degraded by NMD, such that inhibition of NMD by DUX4 protein stabilizes DUX4 mRNA through a double-negative feedback loop in FSHD muscle cells. This feedback loop illustrates an unexpected mode of autoregulatory behavior of a transcription factor, is consistent with 'bursts' of DUX4 expression in FSHD muscle, and has implications for FSHD pathogenesis.
ABSTRACT: Substantial effort is currently devoted to identifying cancer-associated alterations using genomics. Here, we show that standard blood collection procedures rapidly change the transcriptional and posttranscriptional landscapes of hematopoietic cells, resulting in biased activation of specific biological pathways; up-regulation of pseudogenes, antisense RNAs, and unannotated coding isoforms; and RNA surveillance inhibition. Affected genes include common mutational targets and thousands of other genes participating in processes such as chromatin modification, RNA splicing, T- and B-cell activation, and NF-κB signaling. The majority of published leukemic transcriptomes exhibit signals of this incubation-induced dysregulation, explaining up to 40% of differences in gene expression and alternative splicing between leukemias and reference normal transcriptomes. The effects of sample processing are particularly evident in pan-cancer analyses. We provide biomarkers that detect prolonged incubation of individual samples and show that keeping blood on ice markedly reduces changes to the transcriptome. In addition to highlighting the potentially confounding effects of technical artifacts in cancer genomics data, our study emphasizes the need to survey the diversity of normal as well as neoplastic cells when characterizing tumors.
Proceedings of the National Academy of Sciences 11/2014; 111(47). DOI:10.1073/pnas.1413374111 · 9.67 Impact Factor
ABSTRACT: Whole-exome sequencing studies have identified common mutations affecting genes encoding components of the RNA splicing machinery in hematological malignancies. Here, we sought to determine how mutations affecting the 3' splice site recognition factor U2AF1 alter its normal role in RNA splicing. We find that U2AF1 mutations influence the similarity of splicing programs in leukemias, but do not give rise to widespread splicing failure. U2AF1 mutations cause differential splicing of hundreds of genes, affecting biological pathways such as DNA methylation (DNMT3B), X chromosome inactivation (H2AFY), the DNA damage response (ATR, FANCA), and apoptosis (CASP8). We show that U2AF1 mutations alter the preferred 3' splice site motif in patients, in cell culture, and in vitro. Mutations affecting the first and second zinc fingers give rise to different alterations in splice site preference and largely distinct downstream splicing programs. These allele-specific effects are consistent with a computationally predicted model of U2AF1 in complex with RNA. Our findings suggest that U2AF1 mutations contribute to pathogenesis by causing quantitative changes in splicing that affect diverse cellular pathways, and give insight into the normal function of U2AF1's zinc finger domains.
Genome Research 09/2014; 25(1). DOI:10.1101/gr.181016.114 · 14.63 Impact Factor
ABSTRACT: Recent studies have suggested that plant genomes have undergone potentially rampant horizontal gene transfer (HGT), especially in the mitochondrial genome. Parasitic plants have provided the strongest evidence of HGT, which appears to be facilitated by the intimate physical association between the parasites and their hosts. A recent phylogenomic study demonstrated that in the holoparasite Rafflesia cantleyi (Rafflesiaceae), whose close relatives possess the world's largest flowers, about 2.1% of nuclear gene transcripts were likely acquired from its obligate host. Here, we used next-generation sequencing to obtain the 38 protein-coding and ribosomal RNA genes common to the mitochondrial genomes of angiosperms from R. cantleyi and five additional species, including two of its closest relatives and two host species. Strikingly, our phylogenetic analyses conservatively indicate that 24%-41% of these gene sequences show evidence of HGT in Rafflesiaceae, depending on the species. Most of these transgenic sequences possess intact reading frames and are actively transcribed, indicating that they are potentially functional. Additionally, some of these transgenes maintain synteny with their donor and recipient lineages, suggesting that native genes have likely been displaced via homologous recombination. Our study is the first to comprehensively assess the magnitude of HGT in plants involving a genome (i.e., mitochondria) and a species interaction (i.e., parasitism) where it has been hypothesized to be potentially rampant. Our results establish for the first time that, although the magnitude of HGT involving nuclear genes is appreciable in these parasitic plants, HGT involving mitochondrial genes is substantially higher. This may represent a more general pattern for other parasitic plant clades and perhaps more broadly for angiosperms.
ABSTRACT: Long noncoding RNAs (lncRNAs) are often expressed in a development-specific manner, yet little is known about their roles in lineage commitment. Here, we identified Braveheart (Bvht), a heart-associated lncRNA in mouse. Using multiple embryonic stem cell (ESC) differentiation strategies, we show that Bvht is required for progression of nascent mesoderm toward a cardiac fate. We find that Bvht is necessary for activation of a core cardiovascular gene network and functions upstream of mesoderm posterior 1 (MesP1), a master regulator of a common multipotent cardiovascular progenitor. We also show that Bvht interacts with SUZ12, a component of polycomb-repressive complex 2 (PRC2), during cardiomyocyte differentiation, suggesting that Bvht mediates epigenetic regulation of cardiac commitment. Finally, we demonstrate a role for Bvht in maintaining cardiac fate in neonatal cardiomyocytes. Together, our work provides evidence for a long noncoding RNA with critical roles in the establishment of the cardiovascular lineage during mammalian development.
ABSTRACT: Recent studies have shown that plant genomes have potentially undergone rampant horizontal gene transfer (HGT). In plant parasitic systems HGT appears to be facilitated by the intimate physical association between the parasite and its host. HGT in these systems has been invoked when a DNA sequence obtained from a parasite is placed phylogenetically very near to its host rather than with its closest relatives. Studies of HGT in parasitic plants have relied largely on the fortuitous discovery of gene phylogenies that indicate HGT, and no broad systematic search for HGT has been undertaken in parasitic systems where it is most expected to occur.
We analyzed the transcriptomes of the holoparasite Rafflesia cantleyi Solms-Laubach and its obligate host Tetrastigma rafflesiae Miq. using phylogenomic approaches. Our analyses show that several dozen actively transcribed genes, most of which appear to be encoded in the nuclear genome, are likely of host origin. We also find that hundreds of vertically inherited genes (VGT) in this parasitic plant exhibit codon usage properties that are more similar to its host than to its closest relatives.
Our results establish for the first time a substantive number of HGTs in a plant host-parasite system. The elevated rate of unidirectional host-to-parasite gene transfer raises the possibility that HGTs may provide a fitness benefit to Rafflesia for maintaining these genes. Finally, a similar convergence in codon usage of VGTs has been shown in microbes with high HGT rates, which may help to explain the increase of HGTs in these parasitic plants.
ABSTRACT: Highly overlapping patterns of genome-wide binding of many distinct transcription factors have been observed in worms, insects, and mammals, but the origins and consequences of this overlapping binding remain unclear. While analyzing chromatin immunoprecipitation data sets from 21 sequence-specific transcription factors active in the Drosophila embryo, we found that binding of all factors exhibits a dose-dependent relationship with "TAGteam" sequence motifs bound by the zinc finger protein Vielfaltig, also known as Zelda, a recently discovered activator of the zygotic genome. TAGteam motifs are present and well conserved in highly bound regions, and are associated with transcription factor binding even in the absence of canonical recognition motifs for these factors. Furthermore, levels of binding in promoters and enhancers of zygotically transcribed genes are correlated with RNA polymerase II occupancy and gene expression levels. Our results suggest that Vielfaltig acts as a master regulator of early development by facilitating the genome-wide establishment of overlapping patterns of binding of diverse transcription factors that drive global gene expression.
Genome Research 02/2012; 22(4):656-65. DOI:10.1101/gr.130682.111 · 14.63 Impact Factor
ABSTRACT: Thousands of human genes contain introns ending in NAGNAG (N any nucleotide), where both NAGs can function as 3' splice sites, yielding isoforms that differ by inclusion/exclusion of three bases. However, few models exist for how such splicing might be regulated, and some studies have concluded that NAGNAG splicing is purely stochastic and nonfunctional. Here, we used deep RNA-Seq data from 16 human and eight mouse tissues to analyze the regulation and evolution of NAGNAG splicing. Using both biological and technical replicates to estimate false discovery rates, we estimate that at least 25% of alternatively spliced NAGNAGs undergo tissue-specific regulation in mammals, and alternative splicing of strongly tissue-specific NAGNAGs was 10 times as likely to be conserved between species as was splicing of non-tissue-specific events, implying selective maintenance. Preferential use of the distal NAG was associated with distinct sequence features, including a more distal location of the branch point and presence of a pyrimidine immediately before the first NAG, and alteration of these features in a splicing reporter shifted splicing away from the distal site. Strikingly, alignments of orthologous exons revealed a ∼15-fold increase in the frequency of three base pair gaps at 3' splice sites relative to nearby exon positions in both mammals and in Drosophila. Alternative splicing of NAGNAGs in human was associated with dramatically increased frequency of exon length changes at orthologous exon boundaries in rodents, and a model involving point mutations that create, destroy, or alter NAGNAGs can explain both the increased frequency and biased codon composition of gained/lost sequence observed at the beginnings of exons.
This study shows that NAGNAG alternative splicing generates widespread differences between the proteomes of mammalian tissues, and suggests that the evolutionary trajectories of mammalian proteins are strongly biased by the locations and phases of the introns that interrupt coding sequences.