Determination of the minimum number of microarray experiments for discovery of gene expression patterns

Department of Computer Science, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
BMC Bioinformatics (Impact Factor: 2.58). 02/2006; 7 Suppl 4(Suppl 4):S13. DOI: 10.1186/1471-2105-7-S4-S13
Source: PubMed


One type of DNA microarray experiment is discovery of gene expression patterns for a cell line undergoing a biological process over a series of time points. Two important issues with such an experiment are the number of time points, and the interval between them. In the absence of biological knowledge regarding appropriate values, it is natural to question whether the behaviour of progressively generated data may by itself determine a threshold beyond which further microarray experiments do not contribute to pattern discovery. Additionally, such a threshold implies a minimum number of microarray experiments, which is important given the cost of these experiments.
We have developed a method for determining the minimum number of microarray experiments (i.e. time points) for temporal gene expression, assuming that the span between time points is given and that hierarchical clustering is used for gene expression pattern discovery. The key idea is a similarity measure for two clusterings, expressed as a function of the data for progressive time points. This function is evaluated while the experiments are underway. When it reaches its maximum, the set of experiments has reached a saturated state, and further experiments do not contribute to the discrimination of patterns.
The method has been verified with two previously published gene expression datasets. For both experiments, the number of time points determined with our method is less than in the published experiments. It is noted that the overall approach is applicable to other clustering techniques.
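The stopping idea described above can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or the linkage rule, so everything here is an assumption made for illustration: the Rand index stands in for the similarity measure, average linkage with Euclidean distance stands in for the hierarchical clustering, the dendrogram is cut at a fixed number of clusters `k`, and the function names and toy data are hypothetical. It is not the authors' implementation.

```python
from itertools import combinations
import math


def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hierarchical_clusters(profiles, k):
    """Average-linkage agglomerative clustering, cut at k clusters.

    Returns one cluster label per gene. (Linkage and cut rule are assumptions.)
    """
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > k:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            total = sum(euclidean(profiles[i], profiles[j])
                        for i in clusters[a] for j in clusters[b])
            dist = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or dist < best[0]:
                best = (dist, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]                          # b > a, so index a stays valid
    labels = [0] * len(profiles)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def rand_index(labels_a, labels_b):
    """Fraction of gene pairs on which two clusterings agree (assumed measure)."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum((labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
                for i, j in pairs)
    return agree / len(pairs)


def similarity_curve(data, k, min_points=2):
    """Similarity between the clusterings on the first t-1 and first t time points.

    data[g] holds the expression values of gene g in time order.
    """
    curve = []
    previous = hierarchical_clusters([row[:min_points] for row in data], k)
    for t in range(min_points + 1, len(data[0]) + 1):
        current = hierarchical_clusters([row[:t] for row in data], k)
        curve.append((t, rand_index(previous, current)))
        previous = current
    return curve


def saturation_point(curve, maximum=1.0):
    """First number of time points at which the similarity reaches its maximum."""
    for t, similarity in curve:
        if similarity >= maximum:
            return t
    return None


# Hypothetical toy data: 4 genes measured at 5 time points.
toy = [
    [0.0, 0.0, 5.0, 5.0, 5.0],
    [0.0, 0.1, 5.0, 5.1, 5.0],
    [0.1, 0.0, 0.0, 0.1, 0.0],
    [1.0, 1.0, 0.0, 0.0, 0.0],
]
curve = similarity_curve(toy, k=2)
stop_at = saturation_point(curve)
```

In this toy case the clustering changes when the third time point is added and stays the same afterwards, so the similarity reaches its maximum at the fourth time point. In practice the curve would be recomputed after each new experiment rather than on a complete dataset.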



Available from: PubMed Central · License: CC BY
  • Source
    • "Because microarray experiments are expensive, it is important to determine an appropriate sample number for a microarray experiment. Wu and his colleagues [3] have developed a method to determine the minimum microarray samples such as the minimum time points for the microarray researchers. Their basic idea is to use hierarchical clustering to obtain the gene expression patterns in a microarray experiment."
    ABSTRACT: The first symposium of computations in bioinformatics and bioscience (SCBB06) was held in Hangzhou, China on June 21-22, 2006. Twenty-six peer-reviewed papers were selected for publication in this special issue of BMC Bioinformatics. These papers cover a broad range of topics including bioinformatics theories, algorithms, applications and tool development. The main technical topics include gene expression analysis, sequence analysis, genome analysis, phylogenetic analysis, gene function prediction, molecular interaction and systems biology, genetics and population study, immune strategy, protein structure prediction and proteomics.
    Full-text · Article · Feb 2006 · BMC Bioinformatics
  • Source
    ABSTRACT: The analysis of microarray data is essential for large amounts of gene expression data. In this review we focus on clustering techniques. The biological rationale for this approach is the fact that many co-expressed genes are co-regulated, and identifying co-expressed genes could aid in functional annotation of novel genes, de novo identification of transcription factor binding sites and elucidation of complex biological pathways. Co-expressed genes are usually identified in microarray experiments by clustering techniques. There are many such methods, and the results obtained even for the same datasets may vary considerably depending on the algorithms and metrics for dissimilarity measures used, as well as on user-selectable parameters such as desired number of clusters and initial values. Therefore, biologists who want to interpret microarray data should be aware of the weaknesses and strengths of the clustering methods used. In this review, we survey the basic principles of clustering of DNA microarray data from crisp clustering algorithms such as hierarchical clustering, K-means and self-organizing maps, to complex clustering algorithms like fuzzy clustering.
    Full-text · Article · May 2008 · Molecules and Cells
  • ABSTRACT: Gene expression microarrays have become an important exploratory tool in many screening experiments that aim to discover the genes that change expression in two or more biological conditions and can be used to build molecular profiles for both diagnostic and prognostic use. The still very high costs of microarrays and the difficulty in generating the biological samples are critical issues of microarray-based screening experiments, and the experimental design plays a crucial role in how informative an experiment is going to be. In this chapter, we describe some of the major issues related to the design of either randomized control trials or observational studies and discuss the choice of powerful sample sizes, the selection of informative experimental conditions, and experimental strategies that can minimize confounding. We conclude with a discussion of some of the open problems in the design and analysis of microarray experiments that need further research.
    No preview · Chapter · Dec 2010