ABSTRACT: Gene expression profiling is being widely applied in cancer research to identify biomarkers for clinical endpoint prediction. Since RNA-seq provides a powerful tool for transcriptome-based applications beyond the limitations of microarrays, we sought to systematically evaluate the performance of RNA-seq-based and microarray-based classifiers in this MAQC-III/SEQC study for clinical endpoint prediction using neuroblastoma as a model.
We generate gene expression profiles from 498 primary neuroblastomas using both RNA-seq and 44 k microarrays. Characterization of the neuroblastoma transcriptome by RNA-seq reveals that more than 48,000 genes and 200,000 transcripts are being expressed in this malignancy. We also find that RNA-seq provides much more detailed information on specific transcript expression patterns in clinico-genetic neuroblastoma subgroups than microarrays. To systematically compare the power of RNA-seq and microarray-based models in predicting clinical endpoints, we divide the cohort randomly into training and validation sets and develop 360 predictive models on six clinical endpoints of varying predictability. Evaluation of factors potentially affecting model performances reveals that prediction accuracies are most strongly influenced by the nature of the clinical endpoint, whereas technological platforms (RNA-seq vs. microarrays), RNA-seq data analysis pipelines and feature levels (gene vs. transcript vs. exon-junction level) do not significantly affect performances of the models.
We demonstrate that RNA-seq outperforms microarrays in determining the transcriptomic characteristics of cancer, while RNA-seq and microarray-based models perform similarly in clinical endpoint prediction. Our findings may be valuable to guide future studies on the development of gene expression-based predictive models and their implementation in clinical practice.
ABSTRACT: We present primary results from the Sequencing Quality Control (SEQC) project, coordinated by the US Food and Drug Administration. Examining Illumina HiSeq, Life Technologies SOLiD and Roche 454 platforms at multiple laboratory sites using reference RNA samples with built-in controls, we assess RNA sequencing (RNA-seq) performance for junction discovery and differential expression profiling and compare it to microarray and quantitative PCR (qPCR) data using complementary metrics.
At all sequencing depths, we discover unannotated exon-exon junctions, with >80% validated by qPCR. We find that measurements of relative expression are accurate and reproducible across sites and platforms if specific filters are used.
In contrast, RNA-seq and microarrays do not provide accurate absolute measurements, and gene-specific biases are observed for all examined platforms, including qPCR. Measurement performance depends on the platform and data analysis pipeline, and variation is large for transcript-level profiling. The complete SEQC data sets, comprising >100 billion reads (10Tb), provide unique resources for evaluating RNA-seq analyses for clinical and regulatory settings.
ABSTRACT: Relative to microarrays, RNA-seq has been reported to offer higher precision estimates of transcript abundance, a greater dynamic range, and detection of novel transcripts. However, previous comparisons of the two technologies have not covered dose-response experiments that are relevant to toxicology. Male F344 rats were exposed for 13 weeks to five doses of bromobenzene and liver gene expression was measured using both microarrays and RNA-seq. Multiple normalization methods were evaluated for each technology and gene expression changes were statistically analyzed using both analysis of variance and benchmark dose (BMD). Fold-change values were highly correlated between the two technologies, while the -log p-values showed lower correlation. RNA-seq detected fewer statistically significant genes at lower doses, but more significant genes based on fold-change except when a negative binomial transformation was applied. Overlap in genes significant by both p-value and fold-change was approximately 30-40%. Random sampling of the RNA-seq data showed an equivalent number of differentially expressed genes compared to microarrays at ~5 million reads. Quantitative RT-PCR of differentially expressed genes uniquely identified by each technology showed a high degree of confirmation when both fold-change and p-value were considered. The mean dose-response expression of each gene was highly correlated between technologies, while estimates of sample variability and gene-based BMD values showed lower correlation. Differences in BMD estimates and statistical significance may be due, in part, to differences in the dynamic range of each technology and the degree to which normalization corrects genes at either end of the scale.
ABSTRACT: The use of high-throughput in vitro assays has been proposed to play a significant role in the future of toxicity testing. In this study, rat hepatic metabolic clearance and plasma protein binding were measured for 59 ToxCast Phase I chemicals. Computational in vitro-to-in vivo extrapolation (IVIVE) was used to estimate the daily dose in a rat, called the oral equivalent dose, which would result in steady-state in vivo blood concentrations equivalent to AC(50) or lowest effective concentration (LEC) across more than 600 ToxCast Phase I in vitro assays. Statistical classification analysis was performed using either oral equivalent doses or unadjusted AC(50)/LEC values for the in vitro assays to predict the in vivo effects of the 59 chemicals. Adjusting the in vitro assays for pharmacokinetics did not improve the ability to predict in vivo effects as either a discrete (yes or no) response or as a low effect level (LEL) on a continuous dose scale. Interestingly, a comparison of the in vitro assay with the lowest oral equivalent dose with the in vivo endpoint with the lowest LEL suggested that the lowest oral equivalent dose may provide a conservative estimate of the point-of-departure for a chemical in a dose-response assessment. Further, comparing the oral equivalent doses for the in vitro assays with the in vivo dose range that resulted in adverse effects identified more coincident in vitro assays across chemicals than expected by chance suggesting that the approach may be used to identify potential molecular initiating events leading to adversity.
ABSTRACT: Over the past 5 years, increased attention has been focused on using high-throughput in vitro screening for identifying chemical hazards and prioritizing chemicals for additional in vivo testing. The U.S. Environmental Protection Agency's ToxCast program has generated a significant amount of high-throughput screening data allowing a broad-based assessment of the utility of these assays for predicting in vivo responses. In this study, a comprehensive cross-validation model comparison was performed to evaluate the predictive performance of the more than 600 in vitro assays from the ToxCast phase I screening effort across 60 in vivo endpoints using 84 different statistical classification methods. The predictive performance of the in vitro assays was compared and combined with that from chemical structure descriptors. With the exception of chronic in vivo cholinesterase inhibition, the overall predictive power of both the in vitro assays and the chemical descriptors was relatively low. The predictive power of the in vitro assays was not significantly different from that of the chemical descriptors and aggregating the assays based on genes reduced predictive performance. Prefiltering the in vitro assay data outside the cross-validation loop, as done in some previous studies, significantly biased estimates of model performance. The results suggest that the current ToxCast phase I assays and chemicals have limited applicability for predicting in vivo chemical hazards using standard statistical classification methods. However, if viewed as a survey of potential molecular initiating events and interpreted as risk factors for toxicity, the assays may still be useful for chemical prioritization.
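The point about prefiltering outside the cross-validation loop can be illustrated with a small sketch. Everything here is an assumption made for illustration and is not the study's method: the data are pure Gaussian noise with arbitrary labels, features are ranked by a simple difference in class means, and the classifier is a nearest-centroid rule. Because the data carry no signal, an unbiased accuracy estimate should sit near 0.5; choosing features once on all samples (including the held-out ones) tends to inflate it.

```python
import random

def select_features(X, y, rows, k):
    """Return the k feature indices with the largest absolute
    difference in class means, computed over `rows` only."""
    pos = [i for i in rows if y[i] == 1]
    neg = [i for i in rows if y[i] == 0]

    def score(j):
        mean_pos = sum(X[i][j] for i in pos) / len(pos)
        mean_neg = sum(X[i][j] for i in neg) / len(neg)
        return abs(mean_pos - mean_neg)

    return sorted(range(len(X[0])), key=score, reverse=True)[:k]

def predict(X, y, train, feats, i):
    """Nearest-centroid prediction for sample i, restricted to `feats`."""
    dists = {}
    for label in (0, 1):
        members = [r for r in train if y[r] == label]
        centroid = [sum(X[r][j] for r in members) / len(members) for j in feats]
        dists[label] = sum((X[i][j] - c) ** 2 for j, c in zip(feats, centroid))
    return min(dists, key=dists.get)

def cv_accuracy(X, y, k, n_folds, filter_inside):
    """k-fold cross-validated accuracy. With filter_inside=False the
    features are chosen once on ALL samples (the biased variant)."""
    n = len(X)
    all_rows = list(range(n))
    folds = [list(range(f, n, n_folds)) for f in range(n_folds)]
    global_feats = None if filter_inside else select_features(X, y, all_rows, k)
    correct = 0
    for test in folds:
        train = [i for i in all_rows if i not in test]
        feats = select_features(X, y, train, k) if filter_inside else global_feats
        correct += sum(predict(X, y, train, feats, i) == y[i] for i in test)
    return correct / n

# Hypothetical noise-only data: 40 samples, 2000 features, no real signal.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(40)]
y = [i % 2 for i in range(40)]

honest = cv_accuracy(X, y, k=20, n_folds=5, filter_inside=True)
biased = cv_accuracy(X, y, k=20, n_folds=5, filter_inside=False)
print(f"filter inside CV:  {honest:.2f}")
print(f"filter outside CV: {biased:.2f}")
```

The only difference between the two runs is where `select_features` sees the data, which is the design choice the abstract warns about.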
ABSTRACT: Normalization of gene expression data has been studied for many years and various strategies have been formulated to deal with various types of data. Most normalization algorithms rely on the assumption that the number of up-regulated genes and the number of down-regulated genes are roughly the same. However, the well-known Golden Spike experiment presents a unique situation in which differentially regulated genes are biased toward one direction, thereby challenging the conclusions of previous benchmark studies.
This study proposes two novel approaches, KDL and KDQ, based on kernel density estimation to improve upon the basic idea of invariant set selection. The key concept is to provide various importance scores to data points on the MA plot according to their proximity to the cluster of the null genes under the assumption that null genes are more densely distributed than those that are differentially regulated. The comparison is demonstrated in the Golden Spike experiment as well as with simulation data using the ROC curves and compression rates. KDL and KDQ in combination with GCRMA provided the best performance among all approaches.
This study determined that methods based on invariant sets are better able to resolve the problem of asymmetry. Normalization, either before or after expression summary for probesets, improves performance to a similar degree.
ABSTRACT: Gene expression data from microarrays are being applied to predict preclinical and clinical endpoints, but the reliability of these predictions has not been established. In the MAQC-II project, 36 independent teams analyzed six microarray data sets to generate predictive models for classifying a sample with respect to one of 13 endpoints indicative of lung or liver toxicity in rodents, or of breast cancer, multiple myeloma or neuroblastoma in humans. In total, >30,000 models were built using many combinations of analytical methods. The teams generated predictive models without knowing the biological meaning of some of the endpoints and, to mimic clinical reality, tested the models on data that had not been used for training. We found that model performance depended largely on the endpoint and team proficiency and that different approaches generated models of similar performance. The conclusions and recommendations from MAQC-II should be useful for regulatory agencies, study committees and independent investigators that evaluate methods for global gene expression analysis.
ABSTRACT: Batch effects are present in microarray experiments due to poor experimental design and when data are combined from different studies. To assess the quantity of batch effects, we present a novel hybrid approach known as principal variance component analysis (PVCA). The approach leverages the strengths of two popular data analysis methods, principal component analysis and variance components analysis, and integrates them into a novel algorithm. It can be used as a screening tool to determine the sources of variability, and, using the eigenvalues associated with their corresponding eigenvectors as weights, to quantify the magnitude of each source of variability (including each batch effect) presented as a proportion of total variance. Although PVCA is a generic approach for quantifying the corresponding proportion of variation of each effect, it can be a handy assessment for estimating batch effect before and after batch normalization.
Batch Effects and Noise in Microarray Experiments: Sources and Solutions, 11/2009: pages 141-154; ISBN: 9780470685983
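The eigenvalue-weighting step described in the PVCA abstract can be sketched as a weighted average. This is a minimal illustration, not the published implementation: it assumes the principal components and the per-component variance proportions (normally obtained by fitting a variance components model to each component) are already available, and the function name, effect names, and numbers are all hypothetical.

```python
def pvca_weighted_proportions(eigenvalues, per_pc_proportions):
    """Combine per-component variance proportions into overall
    proportions, weighting each component by its eigenvalue.

    eigenvalues: one eigenvalue per retained principal component.
    per_pc_proportions: one dict per component mapping effect name
        to that effect's share of the component's variance.
    """
    total = sum(eigenvalues)
    effects = per_pc_proportions[0].keys()
    return {
        effect: sum(w * props[effect]
                    for w, props in zip(eigenvalues, per_pc_proportions)) / total
        for effect in effects
    }

# Hypothetical inputs: three retained components and three sources of variability.
eigenvalues = [6.0, 3.0, 1.0]
per_pc = [
    {"batch": 0.5, "treatment": 0.3, "residual": 0.2},
    {"batch": 0.2, "treatment": 0.6, "residual": 0.2},
    {"batch": 0.1, "treatment": 0.1, "residual": 0.8},
]

overall = pvca_weighted_proportions(eigenvalues, per_pc)
for effect, share in overall.items():
    print(f"{effect}: {share:.2f}")
```

Running the same calculation before and after batch correction would show whether the share attributed to `batch` shrinks, which is the use the abstract suggests.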
ABSTRACT: Geminiviruses are small DNA viruses that use plant replication machinery to amplify their genomes. Microarray analysis of the Arabidopsis (Arabidopsis thaliana) transcriptome in response to cabbage leaf curl virus (CaLCuV) infection uncovered 5,365 genes (false discovery rate <0.005) differentially expressed in infected rosette leaves at 12 d postinoculation. Data mining revealed that CaLCuV triggers a pathogen response via the salicylic acid pathway and induces expression of genes involved in programmed cell death, genotoxic stress, and DNA repair. CaLCuV also altered expression of cell cycle-associated genes, preferentially activating genes expressed during S and G2 and inhibiting genes active in G1 and M. A limited set of core cell cycle genes associated with cell cycle reentry, late G1, S, and early G2 had increased RNA levels, while core cell cycle genes linked to early G1 and late G2 had reduced transcripts. Fluorescence-activated cell sorting of nuclei from infected leaves revealed a depletion of the 4C population and an increase in 8C, 16C, and 32C nuclei. Infectivity studies of transgenic Arabidopsis showed that overexpression of CYCD3;1 or E2FB, both of which promote the mitotic cell cycle, strongly impaired CaLCuV infection. In contrast, overexpression of E2FA or E2FC, which can facilitate the endocycle, had no apparent effect. These results showed that geminiviruses and RNA viruses interface with the host pathogen response via a common mechanism, and that geminiviruses modulate plant cell cycle status by differentially impacting the CYCD/retinoblastoma-related protein/E2F regulatory network and facilitating progression into the endocycle.
ABSTRACT: Reproducibility is a fundamental requirement in scientific experiments. Some recent publications have claimed that microarrays are unreliable because lists of differentially expressed genes (DEGs) are not reproducible in similar experiments. Meanwhile, new statistical methods for identifying DEGs continue to appear in the scientific literature. The resultant variety of existing and emerging methods exacerbates confusion and continuing debate in the microarray community on the appropriate choice of methods for identifying reliable DEG lists.
Using the data sets generated by the MicroArray Quality Control (MAQC) project, we investigated the impact on the reproducibility of DEG lists of a few widely used gene selection procedures. We present comprehensive results from inter-site comparisons using the same microarray platform, cross-platform comparisons using multiple microarray platforms, and comparisons between microarray results and those from TaqMan - the widely regarded "standard" gene expression platform. Our results demonstrate that (1) previously reported discordance between DEG lists could simply result from ranking and selecting DEGs solely by statistical significance (P) derived from widely used simple t-tests; (2) when fold change (FC) is used as the ranking criterion with a non-stringent P-value cutoff filtering, the DEG lists become much more reproducible, especially when fewer genes are selected as differentially expressed, as is the case in most microarray studies; and (3) the instability of short DEG lists solely based on P-value ranking is an expected mathematical consequence of the high variability of the t-values; the more stringent the P-value threshold, the less reproducible the DEG list is. These observations are also consistent with results from extensive simulation calculations.
We recommend the use of FC-ranking plus a non-stringent P cutoff as a straightforward and baseline practice in order to generate more reproducible DEG lists. Specifically, the P-value cutoff should not be stringent (too small) and FC should be as large as possible. Our results provide practical guidance to choose the appropriate FC and P-value cutoffs when selecting a given number of DEGs. The FC criterion enhances reproducibility, whereas the P criterion balances sensitivity and specificity.
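The recommended selection rule, FC ranking after a non-stringent P filter, can be sketched in a few lines. This is an illustration under stated assumptions: the gene records, the `log2fc` and `p` field names, and the cutoff of 0.05 are hypothetical choices, not values taken from the study.

```python
def select_degs(genes, p_cutoff=0.05, n_top=2):
    """Filter genes by a non-stringent P cutoff, then rank the
    survivors by absolute log2 fold change and keep the top n_top."""
    passing = [g for g in genes if g["p"] < p_cutoff]
    ranked = sorted(passing, key=lambda g: abs(g["log2fc"]), reverse=True)
    return ranked[:n_top]

# Hypothetical results for four genes.
genes = [
    {"name": "A", "log2fc": 3.0, "p": 0.04},
    {"name": "B", "log2fc": 0.2, "p": 0.0001},
    {"name": "C", "log2fc": -2.5, "p": 0.01},
    {"name": "D", "log2fc": 4.0, "p": 0.30},
]

fc_ranked = [g["name"] for g in select_degs(genes)]
p_ranked = [g["name"] for g in sorted(genes, key=lambda g: g["p"])[:2]]
print("FC ranking with P filter:", fc_ranked)
print("P ranking alone:         ", p_ranked)
```

In this toy input, ranking by P alone promotes gene B despite its small fold change, while the combined rule drops gene D because it fails the P filter and keeps the two large, nominally significant changes.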
ABSTRACT: The mechanisms underlying defence reactions to a pathogen attack, though well studied in crop plants, are poorly understood in conifers. To analyze changes in gene transcript abundance in Pinus sylvestris L. root tissues infected by Heterobasidion annosum (Fr.) Bref. s.l., a cDNA microarray containing 2109 ESTs from P. taeda L. was used. Mixed model statistical analysis identified 179 expressed sequence tags differentially expressed at 1, 5 or 15 days post inoculation. In general, the total number of genes differentially expressed during the infection increased over time. The most abundant group of genes up-regulated upon infection coded for enzymes involved in metabolism (phenylpropanoid pathway) and defence-related proteins with antimicrobial properties. A class III peroxidase responsible for lignin biosynthesis and cell wall thickening had increased transcript abundance at all measurement times. Real-time RT-PCR verified the microarray results with high reproducibility. The similarity of the expression profiling pattern observed in this pathosystem to those documented in crop pathology suggests that angiosperms and gymnosperms use similar genetic programs in responding to invasive growth by microbial pathogens.
Tree Physiology 11/2007; 27(10):1441-58. DOI:10.1093/treephys/27.10.1441 · 3.66 Impact Factor
ABSTRACT: Reproducibility is a fundamental requirement in scientific experiments and clinical contexts. Recent publications raise concerns about the reliability of microarray technology because of the apparent lack of agreement between lists of differentially expressed genes (DEGs). In this study we demonstrate that (1) such discordance may stem from ranking and selecting DEGs solely by statistical significance (P) derived from widely used simple t-tests; (2) when fold change (FC) is used as the ranking criterion, the lists become much more reproducible, especially when fewer genes are selected; and (3) the instability of short DEG lists based on P cutoffs is an expected mathematical consequence of the high variability of the t-values. We recommend the use of FC ranking plus a non-stringent P cutoff as a baseline practice in order to generate more reproducible DEG lists. The FC criterion enhances reproducibility while the P criterion balances sensitivity and specificity.
ABSTRACT: Half of the probes on Affymetrix microarrays contain a single base mismatch (MM) of a known perfect match (PM) target sequence. While putatively designed to detect nonspecific binding, the MM data can also contain true signals and because of this, debates persist concerning how to best combine PM and MM data for statistical modeling purposes. Most current approaches involve either subtracting some function of MM from PM or ignoring MM altogether. Here, we describe a bivariate model that includes both PM and MM based on the mixed linear modelling framework. It directly models the correlation between PM and MM and thereby increases the power of significant gene detection. In this paper, we show that the bivariate mixed model offers moderate gains in power over a comparable univariate model that ignores the MM data. The gains are more prominent when the number of replicates and the array-to-array variability are small. We apply the models to a small experiment on yeast and use the data as a basis for a Monte Carlo simulation.
Journal of Statistical Computation and Simulation 03/2007; 77(3):251-264. DOI:10.1080/10629360600826398 · 0.64 Impact Factor
ABSTRACT: A genome-wide location analysis method has been introduced as a means to simultaneously study protein-DNA binding interactions for a large number of genes on a microarray platform. Identification of interactions between transcription factors (TF) and genes provides insight into the mechanisms that regulate a variety of cellular responses. Drawing proper inferences from the experimental data is key to finding statistically significant TF-gene binding interactions. We describe how the analysis and interpretation of genome-wide location data can be fit into a traditional statistical modeling framework that considers the data across all arrays and formulizes appropriate hypothesis tests. The approach is illustrated with data from a yeast transcription factor binding experiment that illustrates how identified TF-gene interactions can enhance initial exploration of transcriptional regulatory networks. Examples of five kinds of transcriptional regulatory structure are also demonstrated. Some stark differences with previously published results are explored.
Statistical Applications in Genetics and Molecular Biology 02/2007; 3(1):22-22. DOI:10.2202/1544-6115.1045 · 1.13 Impact Factor
ABSTRACT: In the absence of specific high-affinity agonists and antagonists, it has been difficult to define the target genes and biological responses attributable to many of the orphan nuclear receptors (ONRs). Indeed, it appears that many members of this receptor superfamily are not regulated by classical small molecules but rather their activity is controlled by interacting cofactors. Motivated by this finding, we have developed an approach to genetically isolate specific receptor-cofactor pairs in cells, allowing us to define the biological responses attributable to each complex. This is accomplished by using combinatorial peptide phage display to engineer the receptor interacting domain of each cofactor such that it interacts selectively with one nuclear receptor. In this study, we describe the customization of PGC-1alpha and its use to study the biology of the estrogen-related receptor alpha (ERRalpha) in cultured liver cells.
ABSTRACT: Microarray-based expression profiling experiments typically use either a one-color or a two-color design to measure mRNA abundance. The validity of each approach has been amply demonstrated. Here we provide a simultaneous comparison of results from one- and two-color labeling designs, using two independent RNA samples from the Microarray Quality Control (MAQC) project, tested on each of three different microarray platforms. The data were evaluated in terms of reproducibility, specificity, sensitivity and accuracy to determine if the two approaches provide comparable results. For each of the three microarray platforms tested, the results show good agreement with high correlation coefficients and high concordance of differentially expressed gene lists within each platform. Cumulatively, these comparisons indicate that data quality is essentially equivalent between the one- and two-color approaches and strongly suggest that this variable need not be a primary factor in decisions regarding experimental microarray design.
ABSTRACT: External RNA controls (ERCs), although important for microarray assay performance assessment, have yet to be fully implemented in the research community. As part of the MicroArray Quality Control (MAQC) study, two types of ERCs were implemented and evaluated; one was added to the total RNA in the samples before amplification and labeling; the other was added to the copyRNAs (cRNAs) before hybridization. ERC concentration-response curves were used across multiple commercial microarray platforms to identify problematic assays and potential sources of variation in the analytical process. In addition, the behavior of different ERC types was investigated, resulting in several important observations, such as the sample-dependent attributes of performance and the potential of using these control RNAs in a combinatorial fashion. This multiplatform investigation of the behavior and utility of ERCs provides a basis for articulating specific recommendations for their future use in evaluating assay performance across multiple platforms.
ABSTRACT: Over the last decade, the introduction of microarray technology has had a profound impact on gene expression research. The publication of studies with dissimilar or altogether contradictory results, obtained using different microarray platforms to analyze identical RNA samples, has raised concerns about the reliability of this technology. The MicroArray Quality Control (MAQC) project was initiated to address these concerns, as well as other performance and data analysis issues. Expression data on four titration pools from two distinct reference RNA samples were generated at multiple test sites using a variety of microarray-based and alternative technology platforms. Here we describe the experimental design and probe mapping efforts behind the MAQC project. We show intraplatform consistency across test sites as well as a high level of interplatform concordance in terms of genes identified as differentially expressed. This study provides a resource that represents an important first step toward establishing a framework for the use of microarrays in clinical and regulatory settings.