Inducing multi-objective clustering ensembles with genetic programming.

Graduate Program in Applied Informatics, Center of Technological Sciences, University of Fortaleza, Av. Washington Soares, 1321/J30, 60811-905 Fortaleza, CE, Brazil; Federal University of São Carlos, Sorocaba Campus, Rod. João Leme dos Santos, Km 110, Bairro Itinga, 18052-780 Sorocaba, SP, Brazil
Neurocomputing 01/2010; 74:494-498. DOI: 10.1016/j.neucom.2010.09.014
Source: DBLP

ABSTRACT Recent years have witnessed a growing interest in two advanced strategies for the data clustering problem, namely clustering ensembles and multi-objective clustering. In this paper, we present a genetic programming based approach that can be considered a hybrid of these strategies, allowing different hierarchical clustering ensembles to be evolved simultaneously while taking complementary validity indices into account. Results of computational experiments conducted with artificial and real datasets indicate that, in most cases, at least one of the Pareto optimal partitions returned by the proposed approach compares favorably with, or is on par with, the consensual partitions yielded by two well-known clustering ensemble methods in terms of clustering quality, as gauged by the corrected Rand index.
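
The abstract names the corrected Rand index as its quality measure but does not spell it out. As a rough illustration only, the sketch below computes the standard Hubert-Arabie form of that index from two label vectors. The function name `corrected_rand_index` and the handling of degenerate partitions are assumptions made for this example, not details taken from the paper.

```python
from collections import Counter
from math import comb


def corrected_rand_index(labels_a, labels_b):
    """Corrected (adjusted) Rand index between two partitions of the same objects.

    Hypothetical helper for illustration; uses the standard Hubert-Arabie formula.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same number of objects")

    n = len(labels_a)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # assumption: fewer than two objects counts as perfect agreement

    # Contingency table cells and its row/column sums, stored sparsely.
    cells = Counter(zip(labels_a, labels_b))
    rows = Counter(labels_a)
    cols = Counter(labels_b)

    # Numbers of object pairs that are co-clustered in both partitions / in each one.
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / total_pairs  # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2         # upper bound on the agreement

    if max_index == expected:
        # Assumption: both partitions are trivial (one cluster, or all singletons).
        return 1.0
    return (sum_cells - expected) / (max_index - expected)
```

A value of 1 means the two partitions are identical up to a renaming of the cluster labels, values near 0 indicate chance-level agreement, and negative values indicate less agreement than chance would give.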

  • ABSTRACT: In real-world problems we encounter situations where patterns are described by blocks (families) of features, each of which comes with well-defined semantics. For instance, in spatiotemporal data, the spatial coordinates of the objects (say, x–y coordinates) form one block, while the temporal part of the objects forms another collection of features. When clustering objects described by families of features, it is reasonable to expect the blocks to play different roles and contribute differently to the clustering process, while the clustering itself should still reflect the overall structure of the data set. To address this issue, we introduce agreement based fuzzy clustering, a fuzzy clustering with blocks of features. Detailed investigations are carried out for the well-known fuzzy C-means (FCM) algorithm. We propose an extended version of FCM in which a composite distance function is endowed with adjustable weights (parameters) quantifying the impact of each block of features. A global evaluation criterion is used to assess the quality of the obtained results. It is treated as a fitness function in the optimization of the weights through particle swarm optimization (PSO). The behavior of the proposed method is investigated on synthetic and real-world data as well as in a case study.
    Neurocomputing 01/2014; 127:266–280. · 1.63 Impact Factor
  • ABSTRACT: In this paper, a new multi-objective genetic programming (GP) approach with a diversity preserving mechanism and a real number alteration operator is presented and successfully used for Pareto optimal modelling of complex non-linear systems from input–output data. In this study, two different input–output data-sets, one from a non-linear mathematical model and one from an explosive cutting process, are considered separately in three-objective optimisation processes. The conflicting objective functions considered for these Pareto optimisations are the training error (TE), the prediction error (PE), and the length of tree (complexity of the network) (TL) of the GP models. These three-objective optimisations lead to a set of non-dominated GP-type models for both cases, representing the trade-offs among the objective functions. The resulting Pareto fronts therefore expose the trade-offs among the conflicting objectives and provide different non-dominated optimal choices of GP-type models. Moreover, the results show that no significant improvement in TE and PE occurs once the TL of the corresponding GP model exceeds certain values.
    International Journal of Systems Science 06/2014; · 1.31 Impact Factor

