Reanalysis of RNA-Sequencing Data Reveals Several Additional Fusion Genes with Multiple Isoforms

The Institute of Cancer Research, London, United Kingdom
PLoS ONE (Impact Factor: 3.53). 10/2012; 7(10):e48745. DOI: 10.1371/journal.pone.0048745
Source: PubMed

ABSTRACT RNA-sequencing and tailored bioinformatic methodologies have paved the way for identification of expressed fusion genes from the chaotic genomes of solid tumors. We have recently successfully exploited RNA-sequencing for the discovery of 24 novel fusion genes in breast cancer. Here, we demonstrate the importance of continuous optimization of the bioinformatic methodology for this purpose, and report the discovery and experimental validation of 13 additional fusion genes from the same samples. Integration of copy number profiling with the RNA-sequencing results revealed that the majority of the gene fusions were promoter-donating events that occurred at copy number transition points or involved high-level DNA-amplifications. Sequencing of genomic fusion break points confirmed that DNA-level rearrangements underlie selected fusion transcripts. Furthermore, a significant portion (>60%) of the fusion genes were alternatively spliced. This illustrates the importance of reanalyzing sequencing data as gene definitions change and bioinformatic methods improve, and highlights the previously unforeseen isoform diversity among fusion transcripts.
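
The integration step described in the abstract, checking whether a fusion break point coincides with a copy number transition, can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Segment` record, the `min_delta` threshold of 0.3 on the log2 ratio, the 100 kb `window`, and the example coordinates are all hypothetical values chosen for illustration.

```python
from bisect import bisect_left
from typing import NamedTuple


class Segment(NamedTuple):
    """One segmented copy number interval (hypothetical record layout)."""
    chrom: str
    start: int
    end: int
    log2_ratio: float


def transition_points(segments, min_delta=0.3):
    """Map each chromosome to the sorted positions where copy number changes.

    A transition is recorded at the start of a segment whose log2 ratio
    differs from the preceding segment on the same chromosome by at least
    min_delta (an assumed, illustrative threshold).
    """
    points = {}
    ordered = sorted(segments, key=lambda s: (s.chrom, s.start))
    for left, right in zip(ordered, ordered[1:]):
        same_chrom = left.chrom == right.chrom
        if same_chrom and abs(left.log2_ratio - right.log2_ratio) >= min_delta:
            points.setdefault(left.chrom, []).append(right.start)
    return points


def near_transition(chrom, pos, points, window=100_000):
    """Return True if pos lies within `window` bases of a transition point."""
    positions = points.get(chrom, [])
    i = bisect_left(positions, pos)
    # Only the nearest transition on each side of pos can be within range.
    neighbours = positions[max(i - 1, 0): i + 1]
    return any(abs(pos - p) <= window for p in neighbours)


# Hypothetical segments: a gain starting at 1 Mb on chr17.
example_segments = [
    Segment("chr17", 0, 1_000_000, 0.0),
    Segment("chr17", 1_000_000, 2_000_000, 1.5),
    Segment("chr17", 2_000_000, 3_000_000, 1.4),
]
example_points = transition_points(example_segments)
print(near_transition("chr17", 1_050_000, example_points))
```

In practice the window and threshold would need to reflect the resolution of the copy number platform and the noise level of the segmentation, neither of which is stated here.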

Available from: Sara Kangaspeska, Aug 16, 2015
  • ABSTRACT: Chimeric transcripts, including partial and internal tandem duplications (PTDs, ITDs) and gene fusions, are important in the detection, prognosis, and treatment of human cancers. We describe Barnacle, a production-grade analysis tool that detects such chimeras in de novo assemblies of RNA-seq data, and supports prioritizing them for review and validation by reporting the relative coverage of co-occurring chimeric and wild-type transcripts. We demonstrate applications in large-scale disease studies, by identifying PTDs in MLL, ITDs in FLT3, and reciprocal fusions between PML and RARA, in two deeply sequenced acute myeloid leukemia (AML) RNA-seq datasets. Our analyses of real and simulated data sets show that, with appropriate filter settings, Barnacle makes highly specific predictions for three types of chimeric transcripts that are important in a range of cancers: PTDs, ITDs, and fusions. High specificity makes manual review and validation efficient, which is necessary in large-scale disease studies. Characterizing an extended range of chimera types will help generate insights into progression, treatment, and outcomes for complex diseases.
    BMC Genomics 08/2013; 14(1):550. DOI:10.1186/1471-2164-14-550 · 4.04 Impact Factor
  • ABSTRACT: RNA-Seq provides a powerful approach to carry out ab initio investigation of fusion transcripts representing critical translocation and post-transcriptional events that recode hereditary information. Most existing computational fusion detection tools are challenged by issues of accuracy and by how to handle multiple mappings. We present a novel tool, SOAPfusion, for fusion discovery with paired-end RNA-Seq reads. SOAPfusion is accurate and efficient for fusion discovery, with high sensitivity (≥93%) and a low false positive rate (≤1.36%) even when the coverage is as low as 10X, highlighting its ability to detect fusions efficiently at low sequencing cost. From real data of UHRR samples, SOAPfusion detected a novel NPEPPS-TBC1D3 fusion, which has been validated through RT-PCR followed by Sanger sequencing. SOAPfusion thus proves to be an effective method with precise applicability in the search for fusion transcripts, which is advantageous for accelerating pathological and therapeutic cancer studies. http://soap.genomics.org.cn/SOAPfusion.html CONTACT: smyiu@cs.hku.hk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
    Bioinformatics 10/2013; DOI:10.1093/bioinformatics/btt522 · 4.62 Impact Factor
  • ABSTRACT: Cancer-specific fusion genes are often caused by cytogenetically visible chromosomal rearrangements such as translocations, inversions, deletions or insertions; they can be the targets of molecular therapy, play a key role in the accurate diagnosis and classification of neoplasms, and are of prognostic impact. The identification of novel fusion genes in various neoplasms therefore not only has obvious research importance, but is also potentially of major clinical significance. The "traditional" methodology to detect them began with cytogenetic analysis to find the chromosomal rearrangement, followed by utilization of fluorescence in situ hybridization techniques to find the probe which spans the chromosomal breakpoint, and finally molecular cloning to localize the breakpoint more precisely and identify the genes fused by the chromosomal rearrangement. Although laborious, the above-mentioned sequential approach is robust and reliable, and a number of fusion genes have been cloned by such means. Next generation sequencing (NGS), mainly RNA sequencing (RNA-Seq), has opened up new possibilities to detect fusion genes even when cytogenetic aberrations are cryptic or information about them is unknown. However, NGS suffers from the shortcoming of also identifying as "fusion genes" many technical, biological and, perhaps in particular, clinical "false positives," thus making the assessment of which fusions are important and which are noise extremely difficult. The best way to overcome this risk of information overflow is, whenever reliable cytogenetic information is at hand, to compare karyotyping and sequencing data and concentrate exclusively on those suggested fusion genes that are found in chromosomal breakpoints.
    The International Journal of Biochemistry & Cell Biology 05/2014; 53. DOI:10.1016/j.biocel.2014.05.018 · 4.24 Impact Factor