Correction of technical bias in clinical microarray data improves concordance with known biological information

Children's Hospital Informatics Program at the Harvard-MIT Division of Health Sciences and Technology (CHIP@HST), Harvard Medical School, Boston, MA 02115, USA.
Genome Biology (Impact Factor: 10.81). 02/2008; 9(2):R26. DOI: 10.1186/gb-2008-9-2-r26
Source: PubMed


The performance of gene expression microarrays has been well characterized using controlled reference samples, but the performance on clinical samples remains less clear. We identified sources of technical bias affecting many genes in concert, thus causing spurious correlations in clinical data sets and false associations between genes and clinical variables. We developed a method to correct for technical bias in clinical microarray data, which increased concordance with known biological relationships in multiple data sets.
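The abstract's central claim, that a technical factor acting on many genes in concert produces spurious gene-gene correlations, can be illustrated with a small simulation. This is a minimal sketch and not the correction method of the paper, which the abstract does not describe. It assumes simulated, mutually independent genes, a single technical covariate that is known per sample, and an ordinary least-squares fit standing in for the correction. The names `regress_out` and `mean_abs_offdiag_corr` are hypothetical helpers.

```python
import numpy as np

def mean_abs_offdiag_corr(expr):
    """Mean absolute Pearson correlation over all distinct gene pairs.

    expr is a genes x samples array; np.corrcoef treats rows as variables.
    """
    corr = np.corrcoef(expr)
    off_diagonal = ~np.eye(corr.shape[0], dtype=bool)
    return np.abs(corr[off_diagonal]).mean()

def regress_out(expr, covariate):
    """Remove the linear effect of a per-sample covariate from every gene.

    Assumption: the technical factor is known and acts linearly. This is a
    stand-in for illustration, not the paper's method.
    """
    design = np.column_stack([np.ones_like(covariate), covariate])  # samples x 2
    coef, *_ = np.linalg.lstsq(design, expr.T, rcond=None)          # 2 x genes
    return expr - (design @ coef).T

# Simulated data: 20 independent genes measured on 200 samples.
rng = np.random.default_rng(0)
n_genes, n_samples = 20, 200
technical = rng.normal(size=n_samples)            # e.g. an RNA-quality proxy
biology = rng.normal(size=(n_genes, n_samples))   # no true co-expression
observed = biology + 2.0 * technical              # same bias added to every gene

corrected = regress_out(observed, technical)
before = mean_abs_offdiag_corr(observed)
after = mean_abs_offdiag_corr(corrected)
print(f"mean |r| before: {before:.2f}, after: {after:.2f}")
```

Because the simulated genes share nothing except the technical term, any strong correlation in `observed` is spurious by construction, and it should largely vanish once the covariate is regressed out.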

  • Source
    • "Data from 2 independent experiments were averaged and only probes with a log2-ratio above 1 or below -1 were considered. Only probes with log intensity >8 and <14 were taken in account, to avoid non linear effects caused by the noise floor at low intensities or by saturation at high [19]. "
    ABSTRACT: The amygdala is a brain structure considered a key node for the regulation of neuroendocrine stress response. Stress-induced response in amygdala is accomplished through neurotransmitter activation and an alteration of gene expression. MicroRNAs (miRNAs) are important regulators of gene expression in the nervous system and are very well suited effectors of stress response for their ability to reversibly silence specific mRNAs. In order to study how acute stress affects miRNAs expression in amygdala we analyzed the miRNA profile after two hours of mouse restraint, by microarray analysis and reverse transcription real time PCR. We found that miR-135a and miR-124 were negatively regulated. Among in silico predicted targets we identified the mineralocorticoid receptor (MR) as a target of both miR-135a and miR-124. Luciferase experiments and endogenous protein expression analysis upon miRNA upregulation and inhibition allowed us to demonstrate that mir-135a and mir-124 are able to negatively affect the expression of the MR. The increased levels of the amygdala MR protein after two hours of restraint, that we analyzed by western blot, negatively correlate with miR-135a and miR-124 expression. These findings point to a role of miR-135a and miR-124 in acute stress as regulators of the MR, an important effector of early stress response.
    PLoS ONE 09/2013; 8(9):e73385. DOI:10.1371/journal.pone.0073385 · 3.23 Impact Factor
  • Source
    • "In some cases the normalization algorithm itself may be a source of bias [5]. Several technical factors can be inferred post hoc from raw microarray data; e.g. the level of negative control probes can indicate changes in the noise floor, or the width of the distribution of expression values can indicate dynamic range [3]. We do not necessarily expect the bias metrics to be independent of each other. "
    ABSTRACT: Gene expression profiles of clinical cohorts can be used to identify genes that are correlated with a clinical variable of interest such as patient outcome or response to a particular drug. However, expression measurements are susceptible to technical bias caused by variation in extraneous factors such as RNA quality and array hybridization conditions. If such technical bias is correlated with the clinical variable of interest, the likelihood of identifying false positive genes is increased. Here we describe a method to visualize an expression matrix as a projection of all genes onto a plane defined by a clinical variable and a technical nuisance variable. The resulting plot indicates the extent to which each gene is correlated with the clinical variable or the technical variable. We demonstrate this method by applying it to three clinical trial microarray data sets, one of which identified genes that may have been driven by a confounding technical variable. This approach can be used as a quality control step to identify data sets that are likely to yield false positive results.
    PLoS ONE 04/2013; 8(4):e61872. DOI:10.1371/journal.pone.0061872 · 3.23 Impact Factor
  • Source
    • "Thus, probes that are too far from the 3' end of the target are likely to have a lower signal intensity, and for this reason most (but not all) probes are designed by Affymetrix to query their target within 600 bases of the 3' end of the transcript, or within 300 bases for the X3P array. In addition to a weaker signal, probes far from the 3' end of the gene are susceptible to false signal changes resulting from variations in RNA integrity [17]. "
    ABSTRACT: Interpretation of gene expression microarrays requires a mapping from probe set to gene. On many Affymetrix gene expression microarrays, a given gene may be detected by multiple probe sets, which may deliver inconsistent or even contradictory measurements. Therefore, obtaining an unambiguous expression estimate of a pre-specified gene can be a nontrivial but essential task. We developed scoring methods to assess each probe set for specificity, splice isoform coverage, and robustness against transcript degradation. We used these scores to select a single representative probe set for each gene, thus creating a simple one-to-one mapping between gene and probe set. To test this method, we evaluated concordance between protein measurements and gene expression values, and between sets of genes whose expression is known to be correlated. For both test cases, we identified genes that were nominally detected by multiple probe sets, and we found that the probe set chosen by our method showed stronger concordance. This method provides a simple, unambiguous mapping to allow assessment of the expression levels of specific genes of interest.
    BMC Bioinformatics 12/2011; 12(1):474. DOI:10.1186/1471-2105-12-474 · 2.58 Impact Factor
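The second citing abstract above describes projecting all genes onto a plane defined by a clinical variable and a technical nuisance variable. One plausible reading, sketched here under assumptions, is to give each gene two coordinates: its Pearson correlation with the clinical variable and its Pearson correlation with the technical variable. The abstract does not state which statistic that paper uses, so the choice of Pearson correlation, the function names `correlate_rows` and `project_genes`, and the toy data are all illustrative.

```python
import numpy as np

def correlate_rows(expr, covariate):
    """Pearson correlation of each gene (row of expr) with a per-sample covariate.

    Rows with zero variance yield NaN; filter them out beforehand.
    """
    expr = np.asarray(expr, dtype=float)
    covariate = np.asarray(covariate, dtype=float)
    expr_centered = expr - expr.mean(axis=1, keepdims=True)
    cov_centered = covariate - covariate.mean()
    denom = np.sqrt((expr_centered ** 2).sum(axis=1) * (cov_centered ** 2).sum())
    return (expr_centered @ cov_centered) / denom

def project_genes(expr, clinical, technical):
    """Return a genes x 2 array: column 0 is r(clinical), column 1 is r(technical)."""
    return np.column_stack([
        correlate_rows(expr, clinical),
        correlate_rows(expr, technical),
    ])

# Toy data: six samples, with clinical and technical variables uncorrelated.
clinical = np.array([0, 0, 0, 1, 1, 1])    # e.g. responder status
technical = np.array([1, 2, 3, 1, 2, 3])   # e.g. an RNA-quality score
expr = np.array([
    [5, 5, 5, 7, 7, 7],   # gene driven only by the clinical variable
    [2, 4, 6, 2, 4, 6],   # gene driven only by the technical variable
])
coords = project_genes(expr, clinical, technical)
```

Plotting `coords[:, 0]` against `coords[:, 1]` gives one point per gene. Genes lying near the technical axis are candidates for false positives whenever the technical and clinical variables are themselves correlated.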