Gene selection and clustering for time-course and dose-response microarray experiments using order-restricted inference

Biostatistics Branch, Laboratory of Molecular Carcinogenesis, Research Triangle Park, NC 27709, USA.
Bioinformatics (Impact Factor: 4.62). 06/2003; 19(7):834-41. DOI: 10.1093/bioinformatics/btg093
Source: PubMed

ABSTRACT We propose an algorithm for selecting and clustering genes according to their time-course or dose-response profiles using gene expression data. The proposed algorithm is based on the order-restricted inference methodology developed in statistics. We describe the methodology for time-course experiments although it is applicable to any ordered set of treatments. Candidate temporal profiles are defined in terms of inequalities among mean expression levels at the time points. The proposed algorithm selects genes when they meet a bootstrap-based criterion for statistical significance and assigns each selected gene to the best fitting candidate profile. We illustrate the methodology using data from a cDNA microarray experiment in which a breast cancer cell line was stimulated with estrogen for different time intervals. In this example, our method was able to identify several biologically interesting genes that previous analyses failed to reveal.
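
The select-then-assign idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes only two candidate profiles (monotone increasing and monotone decreasing), uses an unstandardized difference between the last and first order-restricted means as the test statistic, and resamples from the pooled data to approximate the null hypothesis of no change over time. The function names, the `n_boot` and `alpha` defaults, and the example data are all hypothetical.

```python
import numpy as np

def pava_increasing(y, w):
    """Weighted least-squares fit of y under a nondecreasing constraint
    (pool-adjacent-violators algorithm)."""
    blocks = []  # each block holds [mean, weight, number of points]
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Merge backwards while adjacent blocks violate the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, n1 + n2])
    return np.concatenate([np.full(n, m) for m, _, n in blocks])

def profile_stats(groups):
    """Statistic per candidate profile: spread of the order-restricted means.
    Assumption: a simplified, unstandardized statistic chosen for illustration."""
    means = np.array([g.mean() for g in groups])
    sizes = np.array([len(g) for g in groups], dtype=float)
    up = pava_increasing(means, sizes)       # means constrained to be nondecreasing
    down = -pava_increasing(-means, sizes)   # means constrained to be nonincreasing
    return {"increasing": up[-1] - up[0], "decreasing": down[0] - down[-1]}

def select_and_assign(groups, n_boot=2000, alpha=0.05, seed=0):
    """Return (best-fitting profile or None, bootstrap p-value) for one gene.
    `groups` is a list of 1-D arrays, one array of replicates per time point."""
    rng = np.random.default_rng(seed)
    observed = profile_stats(groups)
    best = max(observed, key=observed.get)
    pooled = np.concatenate(groups)
    splits = np.cumsum([len(g) for g in groups])[:-1]
    exceed = 0
    for _ in range(n_boot):
        # Resample from the pooled data, which mimics "no change over time".
        sample = rng.choice(pooled, size=pooled.size, replace=True)
        boot = profile_stats(np.split(sample, splits))
        if max(boot.values()) >= observed[best]:
            exceed += 1
    p_value = (exceed + 1) / (n_boot + 1)
    return (best if p_value <= alpha else None), p_value

# Hypothetical log-expression values for one gene: 4 time points, 4 replicates each.
rising_gene = [np.array([0.0, 0.1, -0.1, 0.05]) + t for t in range(4)]
flat_gene = [np.ones(4) for _ in range(4)]
rising_profile, rising_p = select_and_assign(rising_gene)
flat_profile, flat_p = select_and_assign(flat_gene)
```

In this sketch a gene is selected only when its bootstrap p-value falls at or below `alpha`, and it is then labelled with whichever candidate profile gave the largest statistic. Further candidate profiles, such as an umbrella order with a peak at a given time point, would need their own order-restricted estimators.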


Available from: Shyamal Peddada, Aug 24, 2015
  • Source
    • "In particular, the development of gene-clustering algorithms that also detect temporal profiles is becoming increasingly important. Statistical bootstrap methods have been developed for assigning genes to candidate profiles [1]. "
    ABSTRACT: Statistical evaluation of temporal gene expression profiles plays an important role in particular biological processes and conditions. We introduce a clustering method for this purpose, which is based on the expression patterns but is also influenced by temporal changes. We compare the results of our platform with methods based on expression or the rank of temporal changes. The proposed platform is illustrated with a temporal gene expression dataset comprising primary human chondrocytes and mesenchymal stem cells (MSCs). We derived three clusters in each cell type and compared the content of these classes in terms of temporal changes, which can support biological performance. For statistical evaluation we introduce a validity measure that takes these temporal changes into consideration, and we also perform an enrichment analysis of three central genes in each cluster. Even though we can detect certain statistical similarities, these might be due to different biological processes. Our proposed platform contributes to both the statistical and biological validation of temporal profiles.
    Conference proceedings: ... Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Conference 08/2012; 2012:1238-41. DOI:10.1109/EMBC.2012.6346161
  • Source
    • "Fortunately, classical statistical models and bioinformatics/computational biology methods such as ANOVA, mixed linear models, and decision trees offer a good bioinformatics framework to begin to use microarray data with other associated biological/toxicological data for analysis (Johann et al., 2004; Kerr and Churchill, 2001; Tong et al., 2003; Wolfinger et al., 2001). Newer more sophisticated approaches for analysis of toxicogenomics data sets involved simultaneous comparisons of groups of samples by assessing the means of the data using inequalities (Peddada et al., 2003). Using a class of statistics called order-restricted inference, candidate temporal gene profiles are defined in terms of inequalities among mean expression levels at time or dose points. "
    ABSTRACT: As one reflects back through the past 50 years of scientific research, a significant accomplishment was the advance into the genomic era. Basic research scientists have uncovered the genetic code and the foundation of the most fundamental building blocks for the molecular activity that supports biological structure and function. Accompanying these structural and functional discoveries is the advance of techniques and technologies to probe molecular events, in time, across environmental and chemical exposures, within individuals, and across species. The field of toxicology has kept pace with advances in molecular study, and the past 50 years have seen significant growth and an explosion in understanding of the impact of compounds and the environment on basic cellular and molecular machinery. The advancement of molecular techniques applied in a whole-genomic capacity to the study of toxicant effects, toxicogenomics, is no doubt a significant milestone for toxicological research. Toxicogenomics has also provided an avenue for advancing a joining of multidisciplinary sciences including engineering and informatics in traditional toxicological research. This review will cover the evolution of the field of toxicogenomics in the context of informatics integration, its current promise, and its limitations.
    Toxicological Sciences 03/2011; 120 Suppl 1:S225-37. DOI:10.1093/toxsci/kfq373 · 4.48 Impact Factor
  • Source
    • "Also visually, it appears that there is a decrease in blood chromium levels at the highest dose group. We performed a non-parametric test for an umbrella order in the mean values of blood chromium using the order-restricted inference methodology developed in Peddada et al. (2003, 2005). Thus we tested the null hypothesis of no difference in means across dose groups against the alternative that the mean response increases with dose until 100 mg/L and then drops at 300 mg/L, and found the p-value to be 0.001. "
    ABSTRACT: Nonlinear regression models are commonly used in toxicology and pharmacology. When fitting nonlinear models for such data, one needs to pay attention to error variance structure in the model and the presence of possible outliers or influential observations. In this paper, an M-estimation based procedure is considered in heteroscedastic nonlinear regression models where the standard deviation is modeled by a nonlinear function. The methodology is illustrated using toxicological data.
    12/2010; 72:202-218. DOI:10.1007/s13571-011-0013-0