Conference Paper

DHC: A Density-Based Hierarchical Clustering Method for Time Series Gene Expression Data

Dept. of Comput. Sci., State Univ. of New York, Buffalo, NY, USA
DOI: 10.1109/BIBE.2003.1188978
Conference: 3rd IEEE International Symposium on BioInformatics and BioEngineering (BIBE 2003), 10-12 March 2003, Bethesda, MD, USA
Source: DBLP

ABSTRACT: ... patterns in underlying data, have proved to be useful in finding co-expressed genes. Clustering time series gene expression data is an important task in bioinformatics research and biomedical applications. Recently, some clustering methods have been adapted or proposed. However, some concerns still remain, such as the robustness of the mining methods, as well as the quality and the interpretability of the mining results. In this paper, we tackle the problem of effectively clustering time series gene expression data by proposing the algorithm DHC, a density-based, hierarchical clustering method. We use a density-based approach to identify the clusters so that the clustering results are of high quality and robustness. Moreover, the mining result is in the form of a density tree, which uncovers the embedded clusters in a data set. The inner structures, the borders and the outliers of the clusters can be further investigated using the attraction tree, which is an intermediate result of the mining. With these two trees, the internal structure of the data set can be visualized effectively. Our empirical evaluation using some real-world data sets shows that the method is effective, robust and scalable. It matches the ground truth provided by bioinformatics experts very well in the sample data sets.

  • ABSTRACT: This paper proposes a new hierarchical clustering method using genetic algorithms for the analysis of gene expression data. This method is based on the mathematical proof of several results, showing its effectiveness with regard to other clustering methods. Genetic algorithms applied to cluster analysis have shown good results on biological data, and many studies have been carried out in this direction, although most of them focus on partitional clustering methods. The few studies that attempt to use genetic algorithms for building hierarchical clusterings do not include constraints that reduce the complexity of the problem, so these approaches become intractable for large data sets. On the other hand, deterministic hierarchical clustering methods generally face the problem of convergence towards local optima due to their greedy strategy. The method introduced here is an alternative that addresses some of the problems existing methods face. The results of the experiments have shown that our approach can be very effective in cluster analysis of DNA microarray data.
    Expert Systems with Applications 06/2013; 40(7):2575-2591. · 1.85 Impact Factor
  • ABSTRACT: The invention of microarrays has rapidly changed the state of biological and biomedical research. Clustering algorithms play an important role in analyzing microarray data sets, where identifying groups of co-expressed genes is a very difficult task. Here we have posed the problem of clustering the microarray data as a multiobjective clustering problem. A new symmetry-based fuzzy clustering technique is developed to solve this problem. The effectiveness of the proposed technique is demonstrated on five publicly available benchmark data sets. Results are compared with some widely used microarray clustering techniques. Statistical and biological significance tests have also been carried out.
    Computers in Biology and Medicine 11/2013; 43(11):1965-77. · 1.27 Impact Factor
  • ABSTRACT: Blob or granular object recognition is an image processing task with a rich application background, ranging from cell/nuclei segmentation in biology to nanoparticle recognition in physics. In this study, we establish a new and comprehensive framework for granular object recognition. Local density clustering and connected component analysis constitute the first stage. To separate overlapping objects, we further propose a modified watershed approach called the gradient-barrier watershed, which better incorporates intensity gradient information into the geometrical watershed framework. We also revise the marker-finding procedure to incorporate a clustering step on all the markers initially found, potentially grouping multiple markers within the same object. The gradient-barrier watershed is then conducted based on those markers, and the intensity gradient in the image directly guides the water flow during the flooding process. We also propose an important scheme for edge detection and fore/background separation called the intensity moment approach. Experimental results for a wide variety of objects in different disciplines – including cell/nuclei images, biological colony images, and nanoparticle images – demonstrate the effectiveness of the proposed framework.
    Pattern Recognition. 01/2014; 47(6):2266–2279.

