Conference Paper

Finding Similar Patterns in Microarray Data.

DOI: 10.1007/11589990_185
Conference: AI 2005: Advances in Artificial Intelligence, 18th Australian Joint Conference on Artificial Intelligence, Sydney, Australia, December 5-9, 2005, Proceedings
Source: DBLP


In this paper we propose a clustering algorithm called s-Cluster for the analysis of gene expression data based on pattern similarity. The algorithm captures tight clusters that exhibit strongly similar expression patterns in microarray data, and it allows a high level of overlap among the discovered clusters without forcing every gene into a group, as other algorithms do. This reflects the biological fact that not all functions are turned on in an experiment, and that many genes are co-expressed in multiple groups in response to different stimuli. Experiments demonstrate that the proposed algorithm successfully groups genes with strongly similar expression patterns and that the clusters it finds are interpretable.



Available from: Xiaodi Huang, Oct 07, 2015
  • ABSTRACT: Clustering is the process of grouping a set of objects into classes of similar objects. Because the hidden patterns in a data set are unknown in advance, the definition of similarity is very subtle. Until recently, similarity measures were typically based on distances, e.g., Euclidean distance and cosine distance. In this paper, we propose a flexible yet powerful clustering model, namely OP-Cluster (Order Preserving Cluster). Under this new model, two objects are similar on a subset of dimensions if the values of these two objects induce the same relative order of those dimensions. Such a cluster might arise when the expression levels of (coregulated) genes rise or fall synchronously in response to a sequence of environmental stimuli. Hence, discovery of OP-Clusters is essential in revealing significant gene regulatory networks. A deterministic algorithm is designed and implemented to discover all the significant OP-Clusters. An extensive set of experiments on several real biological data sets demonstrates its effectiveness and efficiency in detecting co-regulated patterns.
  • ABSTRACT: DNA microarray technologies, together with rapidly increasing genomic sequence information, are leading to an explosion in available gene expression data. Currently there is a great need for efficient methods to analyze and visualize these massive data sets. A self-organizing map (SOM) is an unsupervised neural network learning algorithm that has been successfully used for the analysis and organization of large data files. Here we apply the SOM algorithm to published yeast gene expression data and show that SOM is an excellent tool for the analysis and visualization of gene expression profiles.
    FEBS Letters 06/1999; 451(2):142-6. DOI:10.1016/S0014-5793(99)00524-4 · 3.17 Impact Factor
  • ABSTRACT: Technologies to measure whole-genome mRNA abundances and methods to organize and display such data are emerging as valuable tools for systems-level exploration of transcriptional regulatory networks. For instance, it has been shown that mRNA data from 118 genes, measured at several time points in the developing hindbrain of mice, can be hierarchically clustered into various patterns (or 'waves') whose members tend to participate in common processes. We have previously shown that hierarchical clustering can group together genes whose cis-regulatory elements are bound by the same proteins in vivo. Hierarchical clustering has also been used to organize genes into hierarchical dendrograms on the basis of their expression across multiple growth conditions. The application of Fourier analysis to synchronized yeast mRNA expression data has identified cell-cycle periodic genes, many of which have expected cis-regulatory elements. Here we apply a systematic set of statistical algorithms, based on whole-genome mRNA data, partitional clustering and motif discovery, to identify transcriptional regulatory sub-networks in yeast, without any a priori knowledge of their structure or any assumptions about their dynamics. This approach uncovered new regulons (sets of co-regulated genes) and their putative cis-regulatory elements. We used statistical characterization of known regulons and motifs to derive criteria by which we infer the biological significance of newly discovered regulons and motifs. Our approach holds promise for the rapid elucidation of genetic network architecture in sequenced organisms in which little biology is known.
    Nature Genetics 08/1999; 22(3):281-5. DOI:10.1038/10343 · 29.35 Impact Factor
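The order-preserving similarity described in the OP-Cluster abstract above can be illustrated with a small sketch. This is not the authors' algorithm, only a check of the stated definition for one pair of objects: two objects are similar on a subset of dimensions if their values induce the same relative order of those dimensions. The function names `induced_order` and `order_preserving` and the sample expression values are hypothetical, and ties are broken by the position of a dimension in the input, which is a simplification assumed here.

```python
from typing import Sequence, Tuple


def induced_order(values: Sequence[float], dims: Sequence[int]) -> Tuple[int, ...]:
    """Return the dimensions in `dims`, sorted by this object's value on each.

    Python's sort is stable, so tied values keep their order from `dims`
    (an assumption of this sketch, not something the abstract specifies).
    """
    return tuple(sorted(dims, key=lambda d: values[d]))


def order_preserving(a: Sequence[float], b: Sequence[float], dims: Sequence[int]) -> bool:
    """True if objects `a` and `b` induce the same ordering of `dims`."""
    return induced_order(a, dims) == induced_order(b, dims)


# Hypothetical expression levels of two genes under four conditions.
gene_a = [1.0, 5.0, 3.0, 9.0]
gene_b = [10.0, 40.0, 20.0, 15.0]

# On conditions 0, 1, 2 both genes rank the conditions as 0 < 2 < 1.
print(order_preserving(gene_a, gene_b, [0, 1, 2]))      # True
# Adding condition 3 breaks the shared order.
print(order_preserving(gene_a, gene_b, [0, 1, 2, 3]))   # False
```

Note that the two genes differ greatly in magnitude, so a distance-based measure would place them far apart, yet they agree on the ordering of the first three conditions. Discovering all significant clusters requires a search over subsets of dimensions, which this sketch does not attempt.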