Levin JZ, Berger MF, Adiconis X, Rogov P, Melnikov A, Fennell T et al. Targeted next-generation sequencing of a cancer transcriptome enhances detection of sequence variants and novel fusion transcripts. Genome Biol 2009, 10:R115

Genome Sequencing and Analysis Program, Broad Institute of MIT and Harvard, 320 Charles Street, Cambridge, MA 02141, USA.
Genome biology (Impact Factor: 10.81). 10/2009; 10(10):R115. DOI: 10.1186/gb-2009-10-10-r115
Source: PubMed


Targeted RNA-Seq combines next-generation sequencing with capture of sequences from a relevant subset of a transcriptome. When tested by capturing sequences from a tumor cDNA library by hybridization to oligonucleotide probes specific for 467 cancer-related genes, this method showed high selectivity, improved mutation detection enabling discovery of novel chimeric transcripts, and provided RNA expression data. Thus, targeted RNA-Seq produces an enhanced view of the molecular state of a set of "high interest" genes.

Available from: Chad Nusbaum, Oct 04, 2015
  • Source
    • "XKR3 is a membrane transporter in the XK/Kell complex of the Kell blood group system, located at chromosome 22q11.1 (51). Levin et al (52) investigated gene fusions in the cDNA Illumina data (Illumina, Inc., San Diego, CA, USA) of K562 (a CML cell line) using targeted RNA sequencing. In addition to the BCR-ABL1 fusion gene, a novel NUP214-XKR3 fusion gene was identified in the cDNA library. "
    ABSTRACT: Nucleoporin 214 (NUP214), previously termed CAN, is required for cell cycle and nucleocytoplasmic transport. The genetic features and clinical implications of five NUP214-associated fusion genes are described in this review. SET-NUP214 was most frequently observed in T-cell acute lymphoblastic leukemia (T-ALL), concomitant with the elevated expression of HOXA cluster genes. Furthermore, the fusion transcript may be regarded as a potential minimal residual disease marker for SET-NUP214-positive patients. Episomal amplifications of NUP214-ABL1 are specific to T-ALL patients. The NUP214-ABL1 gene is observed in ~6% of T-ALL, in children and adults. Targeted tyrosine kinase inhibitors plus standard chemotherapy appear to present a promising treatment strategy. DEK-NUP214 is formed by the fusion of exon 2 of DEK and exon 6 of NUP214. Achieving molecular negativity of DEK-NUP214 is of great importance for individual management. SQSTM1-NUP214 and NUP214-XKR3 were only identified in one T-ALL patient and one cell line, respectively. The NUP214 fusions have significant diagnostic and therapeutic implications for leukemia patients. Additional NUP214-associated fusions require identification in future studies.
    Oncology letters 09/2014; 8(3):959-962. DOI:10.3892/ol.2014.2263 · 1.55 Impact Factor
  • Source
    • "Next-generation DNA sequencing is a powerful tool in biological research [1] and is steadily gaining momentum as costs keep decreasing. Applications vary from genome re-sequencing [2-4] to transcriptome analysis [5-7], metagenomics projects [8-10], and sequencing of ancient genomes [11-16]. All these applications rely on mapping reads to existing reference genomes. "
    ABSTRACT: Modern DNA sequencing methods produce vast amounts of data that often requires mapping to a reference genome. Most existing programs use the number of mismatches between the read and the genome as a measure of quality. This approach is without a statistical foundation and can for some data types result in many wrongly mapped reads. Here we present a probabilistic mapping method based on position-specific scoring matrices, which can take into account not only the quality scores of the reads but also user-specified models of evolution and data-specific biases. We show how evolution, data-specific biases, and sequencing errors are naturally dealt with probabilistically. Our method achieves better results than Bowtie and BWA on simulated and real ancient and PAR-CLIP reads, as well as on simulated reads from the AT-rich organism P. falciparum, when modeling the biases of these data. For simulated Illumina reads, the method has consistently higher sensitivity for both single-end and paired-end data. We also show that our probabilistic approach can limit the problem of random matches from short reads of contamination and that it improves the mapping of real reads from one organism (D. melanogaster) to a related genome (D. simulans). The presented work is an implementation of a novel approach to short read mapping where quality scores, prior mismatch probabilities and mapping qualities are handled in a statistically sound manner. The resulting implementation provides not only a tool for biologists working with low quality and/or biased sequencing data but also a demonstration of the feasibility of using a probability-based alignment method on real and simulated data sets.
    BMC Bioinformatics 04/2014; 15(1):100. DOI:10.1186/1471-2105-15-100 · 2.58 Impact Factor
  • Source
    • "For example, targeted RNA sequencing is an efficient way to reduce the overall sequencing costs to interrogate a small set of transcripts of interest [20]. With this technology, it is possible to increase the read coverage of the low abundance transcripts so they can be assembled, thereby enabling the study of their genomic and transcriptional features, such as RNA-editing, splicing, and potential gene fusions [21,22]. If the goal is to study genes with low expression levels at the genome scale, normalization during RNA-seq library construction may be an effective way to reduce the representation of highly expressed transcripts while enriching for the low abundant transcripts [23]. "
    ABSTRACT: RNA-sequencing (RNA-seq) enables in-depth exploration of transcriptomes, but typical sequencing depth often limits its comprehensiveness. In this study, we generated nearly 3 billion RNA-Seq reads, totaling 341 Gb of sequence, from a Zea mays seedling sample. At this depth, a near complete snapshot of the transcriptome was observed consisting of over 90% of the annotated transcripts, including lowly expressed transcription factors. A novel hybrid strategy combining de novo and reference-based assemblies yielded a transcriptome consisting of 126,708 transcripts with 88% of expressed known genes assembled to full-length. We improved current annotations by adding 4,842 previously unannotated transcript variants and many new features, including 212 maize transcripts, 201 genes, 10 genes with undocumented potential roles in seedlings as well as maize lineage specific gene fusion events. We demonstrated the power of deep sequencing for large transcriptome studies by generating a high quality transcriptome, which provides a rich resource for the research community.
    Scientific Reports 03/2014; 4:4519. DOI:10.1038/srep04519 · 5.58 Impact Factor