Normalization of two-channel microarray experiments: A semiparametric approach
ABSTRACT MOTIVATION: An important underlying assumption of any experiment is that the experimental subjects are similar across levels of the treatment variable, so that changes in the response variable can be attributed to exposure to the treatment under study. This assumption is often not valid in the analysis of a microarray experiment due to systematic biases in the measured expression levels related to experimental factors such as spot location (often referred to as a print-tip effect), arrays, dyes, and various interactions of these effects. Thus, normalization is a critical initial step in the analysis of a microarray experiment, where the objective is to balance the individual signal intensity levels across the experimental factors, while maintaining the effect due to the treatment under investigation. RESULTS: Various normalization strategies have been developed including log-median centering, analysis of variance modeling, and local regression smoothing methods for removing linear and/or intensity-dependent systematic effects in two-channel microarray experiments. We describe a method that incorporates many of these into a single strategy, referred to as two-channel fastlo, and is derived from a normalization procedure that was developed for single-channel arrays. The proposed normalization procedure is applied to a two-channel dose-response experiment.
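As an illustration of the simplest strategy named in the abstract, log-median centering, the following is a minimal sketch and not the two-channel fastlo procedure itself. It assumes one array with paired red and green intensities; the function name `log_median_center` and the intensity values are hypothetical and chosen only for the example.

```python
import math
from statistics import median


def log_median_center(red, green):
    """Return log2(red/green) ratios shifted so that their median is zero."""
    # Log-ratio (M-value) of the two channels for each spot
    m = [math.log2(r / g) for r, g in zip(red, green)]
    # Subtract the array-wide median to remove a constant dye bias
    center = median(m)
    return [value - center for value in m]


# Hypothetical intensities for five spots on one array (illustration only)
red = [200.0, 400.0, 800.0, 1600.0, 3200.0]
green = [100.0, 200.0, 400.0, 400.0, 3200.0]
centered = log_median_center(red, green)
```

Centering removes only a constant shift per array. Intensity-dependent or print-tip effects, which the abstract also lists, would need a smoother such as local regression.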
- "In recent years, smoothing methods have been used in a wide range of studies to model a nonlinear exposure–response relationship. These include etiological investigations of air pollution, cancer risk assessment, nutrition, microarray studies, cardiovascular mortality, and occupational exposures. In a study of lung cancer and occupational exposure to silica, the estimated log hazard ratio for silica exposure considered as a function of cumulative exposure referred to hereafter as the exposure–response was observed to increase over categories of exposures."
ABSTRACT: In this paper, we review available methods for determination of the functional form of the relation between a covariate and the log hazard ratio for a Cox model. We pay special attention to the detection of influential observations to the extent that they influence the estimated functional form of the relation between a covariate and the log hazard ratio. Our paper is motivated by a data set from a cohort study of lung cancer and silica exposure, where the nonlinear shape of the estimated log hazard ratio for silica exposure plotted against cumulative exposure and hereafter referred to as the exposure–response curve was greatly affected by whether or not two individuals with the highest exposures were included in the analysis. Formal influence diagnostics did not identify these two individuals but did identify the three highest exposed cases. Removal of these three cases resulted in a biologically plausible exposure–response curve.
Journal of Applied Statistics 01/2015; 42(5). DOI:10.1080/02664763.2014.995607 · 0.42 Impact Factor
- "Microarray expression data were analyzed on the log2 scale. Data quality was assessed via box and whisker plots along with residual and pair-wise MVA plots before and after normalization [29,30]. All arrays were normalized together using fastlo, a non-linear normalization similar to cyclic loess which runs in a fraction of the time."
ABSTRACT: Background: Loss of the endosulfatase HSulf-1 is common in ovarian cancer, upregulates heparin binding growth factor signaling and potentiates tumorigenesis and angiogenesis. However, metabolic differences between isogenic cells with and without HSulf-1 have not been characterized upon HSulf-1 suppression in vitro. Since growth factor signaling is closely tied to metabolic alterations, we determined the extent to which HSulf-1 loss affects cancer cell metabolism. Results: Ingenuity pathway analysis of gene expression in HSulf-1 shRNA-silenced cells (Sh1 and Sh2 cells) compared to non-targeted control shRNA cells (NTC cells) and subsequent Kyoto Encyclopedia of Genes and Genomes (KEGG) database analysis showed altered metabolic pathways, with lipid metabolism among the major pathways altered in Sh1 and Sh2 cells. Untargeted global metabolomic profiling in these isogenic cell lines identified approximately 338 metabolites using GC/MS and LC/MS/MS platforms. Knockdown of HSulf-1 in OV202 cells induced significant changes in 156 metabolites associated with several metabolic pathways including amino acid, lipids, and nucleotides. Loss of HSulf-1 promoted overall fatty acid synthesis, leading to enhanced levels of long chain, branched, and essential fatty acids along with sphingolipids. Furthermore, HSulf-1 loss induced the expression of lipogenic genes including FASN, SREBF1, PPARγ, and PLA2G3 and stimulated lipid droplet accumulation. Conversely, re-expression of HSulf-1 in Sh1 cells reduced the lipid droplet formation. Additionally, HSulf-1 also enhanced CPT1A and fatty acid oxidation and augmented the protein expression of key lipolytic enzymes such as MAGL, DAGLA, HSL, and ASCL1. Overall, these findings suggest that loss of HSulf-1, by concomitantly enhancing fatty acid synthesis and oxidation, confers a lipogenic phenotype leading to the metabolic alterations associated with the progression of ovarian cancer.
Conclusions: Taken together, these findings demonstrate that loss of HSulf-1 potentially contributes to the metabolic alterations associated with the progression of ovarian pathogenesis, specifically impacting the lipogenic phenotype of ovarian cancer cells that can be therapeutically targeted.
08/2014; 2(1):13. DOI:10.1186/2049-3002-2-13
- "Most algorithms assume: 1) only a small portion of the proteins are differentially abundant between groups of interest, 2) the fold change distribution of differentially abundant proteins is symmetric about 1.0, 3) data must be available on a sufficient number of proteins with abundance levels distributed throughout the dynamic range to estimate global biases without over-fitting. For example, quantile [26,27] and cyclic loess normalization [28-30] are examples of normalization algorithms developed for one- and two-color gene expression arrays that make these assumptions. The iterative ANOVA model described in the "Data quality and normalization" section is an example of such a normalization algorithm which can be applied to both labeled and label-free proteomics abundance data."
ABSTRACT: Mass Spectrometry utilizing labeling allows multiple specimens to be subjected to mass spectrometry simultaneously. As a result, between-experiment variability is reduced. Here we describe use of fundamental concepts of statistical experimental design in the labeling framework in order to minimize variability and avoid biases. We demonstrate how to export data in the format that is most efficient for statistical analysis. We demonstrate how to assess the need for normalization, perform normalization, and check whether it worked. We describe how to build a model explaining the observed values and test for differential protein abundance along with descriptive statistics and measures of reliability of the findings. Concepts are illustrated through the use of three case studies utilizing the iTRAQ 4-plex labeling protocol.
BMC Bioinformatics 11/2012; 13 Suppl 16(Suppl 16):S7. DOI:10.1186/1471-2105-13-S16-S7 · 2.58 Impact Factor