RNA-SeQC: RNA-seq metrics for quality control and process optimization

Broad Institute of MIT and Harvard, Cambridge, MA, USA.
Bioinformatics (Impact Factor: 4.62). 04/2012; 28(11):1530-2. DOI: 10.1093/bioinformatics/bts196
Source: PubMed

ABSTRACT Summary: RNA-seq, the application of next-generation sequencing to RNA, provides transcriptome-wide characterization of cellular activity. Assessment of sequencing performance and library quality is critical to the interpretation of RNA-seq data, yet few tools exist to address this issue. We introduce RNA-SeQC, a program which provides key measures of data quality. These metrics include yield, alignment and duplication rates; GC bias, rRNA content, regions of alignment (exon, intron and intragenic), continuity of coverage, 3′/5′ bias and count of detectable transcripts, among others. The software provides multi-sample evaluation of library construction protocols, input materials and other experimental parameters. The modularity of the software enables pipeline integration and the routine monitoring of key measures of data quality such as the number of alignable reads, duplication rates and rRNA contamination. RNA-SeQC allows investigators to make informed decisions about sample inclusion in downstream analysis. In summary, RNA-SeQC provides quality control measures critical to experiment design, process optimization and downstream computational analysis.
Availability and implementation: See www.genepattern.org to run online, or www.broadinstitute.org/rna-seqc/ for a command line tool.
Contact: ddeluca@broadinstitute.org
Supplementary information: Supplementary data are available at Bioinformatics online.

  • ABSTRACT: The human subependymal zone (SEZ) is debatably a source of newly born neurons throughout life and neurogenesis is a multi-step process requiring distinct transcripts during cell proliferation and early neuronal maturation, along with orchestrated changes in gene expression during cell state/fate transitions. Furthermore, it is becoming increasingly clear that the majority of our genome that results in production of non-protein coding RNAs plays vital roles in the evolution, development, adaptation and region-specific function of the human brain. We predicted that some transcripts expressed in the SEZ may be unique to this specialized brain region, and that a comprehensive transcriptomic analysis of this region would aid in defining expression changes during neuronal birth and growth in adult humans. Here, we used deep RNA sequencing of human SEZ tissue during adulthood and aging to characterize the transcriptional landscape with a particular emphasis on long non-coding RNAs (lncRNAs). The data shows predicted age-related changes in mRNAs encoding proliferation, progenitor and inflammatory proteins as well as a unique subset of lncRNAs that are highly expressed in the human SEZ, many of which have unknown functions. Our results suggest the existence of robust proliferative and neuronal differentiation potential in the adult human SEZ and lay the foundation for understanding the involvement of lncRNAs in postnatal neurogenesis and potentially associated neurodevelopmental diseases that emerge after birth.
    Frontiers in Neurology 03/2015; 6. DOI:10.3389/fneur.2015.00045
  • ABSTRACT: Massively parallel strand-specific sequencing of RNA (ssRNA-seq) has emerged as a powerful tool for profiling complex transcriptomes. However, many current methods for ssRNA-seq suffer from the underrepresentation of both the 5' and 3' ends of RNAs, which can be attributed to second-strand cDNA synthesis. The 5' and 3' ends of RNA harbour crucial information for gene regulation; namely, transcription start sites (TSSs) and polyadenylation sites. Here we report a novel ssRNA-seq method that does not involve second-strand cDNA synthesis, as we Directly Ligate sequencing Adaptors to the First-strand cDNA (DLAF). This novel method with fewer enzymatic reactions results in a higher quality of the libraries than the conventional method. Sequencing of DLAF libraries followed by a novel analysis pipeline enables the profiling of both 5' ends and polyadenylation sites at near-base resolution. Therefore, DLAF offers the first genomics tool to obtain the 'full-length' transcriptome with a single library.
    Nature Communications 01/2015; 6:6002. DOI:10.1038/ncomms7002 · 10.74 Impact Factor
  • ABSTRACT: Background: To date, there have been no reports characterizing the genome-wide somatic DNA chromosomal copy-number alteration landscape in metastatic urothelial carcinoma. We sought to characterize the DNA copy-number profile in a cohort of metastatic samples and compare them to a cohort of primary urothelial carcinoma samples in order to identify changes that are associated with progression from primary to metastatic disease. Methods: Using molecular inversion probe array analysis we compared genome-wide chromosomal copy-number alterations between 30 metastatic and 29 primary UC samples. Whole transcriptome RNA-Seq analysis was also performed in primary and matched metastatic samples which was available for 9 patients. Results: Based on a focused analysis of 32 genes in which alterations may be clinically actionable, there were significantly more amplifications/deletions in metastases (8.6% vs 4.5%, p < 0.001). In particular, there was a higher frequency of E2F3 amplification in metastases (30% vs 7%, p = 0.046). Paired primary and metastatic tissue was available for 11 patients and 3 of these had amplifications of potential clinical relevance in metastases that were not in the primary tumor including ERBB2, CDK4, CCND1, E2F3, and AKT1. The transcriptional activity of these amplifications was supported by RNA expression data. Conclusions: The discordance in alterations between primary and metastatic tissue may be of clinical relevance in the era of genomically directed precision cancer medicine. Electronic supplementary material: The online version of this article (doi:10.1186/s12885-015-1192-2) contains supplementary material, which is available to authorized users.
    BMC Cancer 04/2015; 15(1). DOI:10.1186/s12885-015-1192-2 · 3.32 Impact Factor
