Conference Paper

A Novel Framework for Discovering Robust Cluster Results

DOI: 10.1007/11893318_45 Conference: Discovery Science, 9th International Conference, DS 2006, Barcelona, Spain, October 7-10, 2006, Proceedings
Source: DBLP


We propose a novel method, called heterogeneous clustering ensemble (HCE), to generate robust clustering results that combine
multiple partitions (clusters) derived from various clustering algorithms. The proposed method combines the partitions produced
by these algorithms by means of newly proposed selection and crossover operations of the genetic algorithm (GA) during
the evolutionary process.
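
The abstract does not specify HCE's fitness function or its exact operators, so the following is only a minimal sketch of the general idea of evolving a consensus partition with selection and crossover. Everything in it is an assumption made for illustration: the names `rand_index`, `fitness`, `crossover`, and `evolve` are hypothetical, the fitness is taken to be mean pairwise agreement with the input partitions, selection is simple truncation, and crossover copies one cluster from one parent onto the other.

```python
import random
from itertools import combinations


def rand_index(a, b):
    """Fraction of object pairs on which two label vectors agree."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def fitness(candidate, ensemble):
    """Assumed fitness: mean Rand index against the input partitions."""
    return sum(rand_index(candidate, p) for p in ensemble) / len(ensemble)


def crossover(parent_a, parent_b, rng):
    """Assumed crossover: overlay one cluster of parent_a onto parent_b."""
    cluster = rng.choice(sorted(set(parent_a)))
    new_label = max(parent_b) + 1  # fresh label, cannot collide with parent_b
    return [new_label if la == cluster else lb
            for la, lb in zip(parent_a, parent_b)]


def evolve(ensemble, generations=20, seed=0):
    """Evolve a consensus partition from at least two integer label vectors."""
    rng = random.Random(seed)
    population = [list(p) for p in ensemble]
    size = len(population)
    for _ in range(generations):
        # Selection (assumed): keep the fitter half, at least two, as parents.
        population.sort(key=lambda p: fitness(p, ensemble), reverse=True)
        parents = population[:max(2, size // 2)]
        children = []
        while len(parents) + len(children) < size:
            first, second = rng.sample(parents, 2)
            children.append(crossover(first, second, rng))
        population = parents + children
    return max(population, key=lambda p: fitness(p, ensemble))


# Three partitions of six objects, as if produced by different algorithms.
ensemble = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 2, 2],
    [1, 1, 1, 0, 0, 0],
]
best = evolve(ensemble)
print(best, round(fitness(best, ensemble), 3))
```

Because the parents survive each generation unchanged, the best fitness in the population never decreases, so the returned partition agrees with the ensemble at least as well as the best input partition does.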



Available from: Ju Han Kim, Mar 27, 2015
  • Source
    ABSTRACT: This thesis focuses on further developments in ensemble and constrained clustering. The goal of the proposed methods is to address clustering problems in which the optimal clustering method is unknown. Additionally, pairwise linkage constraints make it possible to bring extra information into the clustering framework. Part I investigates the concept of ensemble clustering. It presents a comprehensive review of the state of the art in ensemble clustering and then discusses the impact of ensemble variability on the final consensus result. Visualization of ensemble variability based on multidimensional scaling is also addressed in this part. Software that performs ensemble clustering using various existing consensus functions is also introduced. A consensus function based on the random walker, originally developed for image segmentation combination, is adapted to the ensemble clustering problem. A lower bound is proposed to explore how well cluster ensemble methods perform in an absolute sense, without using ground truth. Finally, a study evaluating how well general ensemble clustering techniques perform in the context of image segmentation combination closes this part. Part II introduces an ensemble clustering method based on a new formulation of the median partition problem. The performance of this method is assessed in relation to other well-known ensemble clustering methods. Part III addresses the potential of ensemble techniques in the framework of constrained clustering. It presents a comprehensive review of the state of the art in constrained clustering and discusses the impact of considering constraints locally or globally. An experiment comparing both approaches is presented. A new clustering method combining ensemble and constrained clustering is introduced. Constraints are introduced into three consensus functions. This part closes with an experimental evaluation in which constraints are considered at different steps of the clustering ensemble framework. Part IV reviews the imaging protocol known as diffusion tensor imaging and proposes a new fiber segmentation methodology, based on the definition of pairwise linkage constraints, to drive the semi-supervised segmentation process.
    Full-text · Thesis · Jan 2010
  • Source
    ABSTRACT: The combination of multiple clustering results (clustering ensemble) has emerged as an important procedure for improving the quality of clustering solutions. In this paper we propose a new cluster ensemble method based on kernel functions, which introduces the Partition Relevance Analysis step. The goal of this step is to analyze the set of partitions in the cluster ensemble and extract valuable information that can improve the quality of the combination process. In addition, we propose a new similarity measure between partitions and prove that it is a kernel function. A new consensus function is introduced that uses this similarity measure and is based on the idea of finding the median partition. We prove some theoretical results related to this consensus function that support the suitability of our methods. Finally, we conduct numerical experiments to show the behavior of our method on several databases, comparing it with simple clustering algorithms as well as with other cluster ensemble methods.
    Full-text · Article · Aug 2010 · Pattern Recognition
  • Source
    ABSTRACT: The clustering ensemble has emerged as a prominent method for improving the robustness, stability, and accuracy of unsupervised classification solutions. It combines multiple partitions generated by different clustering algorithms into a single clustering solution. Genetic algorithms are known for their strong ability to solve optimization problems, including clustering. To date, significant progress has been made in finding consensus clusterings that yield better results than existing clusterings. This paper presents a survey of genetic algorithms designed for clustering ensembles. It begins with an introduction to clustering ensembles and clustering ensemble algorithms. Subsequently, it describes a number of proposed genetic-guided clustering ensemble algorithms, in particular their genotypes, fitness functions, and genetic operations. Next, the clustering accuracies of the genetic-guided clustering ensemble algorithms are compared. The paper concludes that using genetic algorithms in clustering ensembles improves clustering accuracy, and it addresses open questions for future research.
    Full-text · Article · Apr 2011 · Artificial Intelligence Review