Article

A greedy randomized adaptive search procedure applied to the clustering problem as an initialization process using K-Means as a local search procedure.

Department of Software Engineering, University of Huelva, 21071, La Rabida (Huelva), Spain; Department of Computer Science and A.I, University of Granada, Spain; Department of Computer Science, University of Oviedo, Oviedo, Spain
Journal of Intelligent and Fuzzy Systems (Impact Factor: 0.79). 01/2002; 12:235-242.
Source: DBLP

ABSTRACT We present a new approach to cluster analysis based on a Greedy Randomized Adaptive Search Procedure (GRASP), with the objective of overcoming convergence to a local optimum. It uses a probabilistic greedy Kaufman initialization to generate initial solutions and K-Means as the local search algorithm. Because the approach acts as a new initialization method for K-Means, we compare it with some typical initialization methods: Random, Forgy, MacQueen and Kaufman. Our empirical results suggest that the hybrid GRASP – K-Means with probabilistic greedy Kaufman initialization performs better than the other methods. The new approach obtains high-quality solutions for eight benchmark problems.
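
The loop described in the abstract (randomized greedy construction, then K-Means as local search, keeping the best result) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `alpha` parameter that sets the width of the restricted candidate list, the use of squared Euclidean distance, and the exact Kaufman-style gain formula are all choices made here for illustration and are not taken from the paper.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points (tuples of numbers)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sse(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def kmeans(points, centers, max_iter=100):
    """Plain batch K-Means, used here as the local search step."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            groups[nearest].append(p)
        # Recompute each center as the mean of its group; keep empty groups in place.
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def greedy_randomized_seeds(points, k, alpha, rng):
    """Kaufman-style greedy seeding, randomized via a restricted candidate list.

    Assumes k does not exceed the number of distinct points.
    """
    # First seed: the most centrally located point (smallest total distance).
    seeds = [min(points, key=lambda p: sum(sq_dist(p, q) for q in points))]
    while len(seeds) < k:
        nearest = [min(sq_dist(q, s) for s in seeds) for q in points]
        scored = []
        for cand in points:
            if cand in seeds:
                continue
            # Gain: how much closer the other points would get to a seed.
            gain = sum(max(d - sq_dist(cand, q), 0.0) for q, d in zip(points, nearest))
            scored.append((gain, cand))
        best = max(g for g, _ in scored)
        worst = min(g for g, _ in scored)
        # alpha = 0 is purely greedy; alpha = 1 is purely random (assumed convention).
        threshold = best - alpha * (best - worst)
        rcl = [c for g, c in scored if g >= threshold]
        seeds.append(rng.choice(rcl))
    return seeds


def grasp_kmeans(points, k, iterations=20, alpha=0.3, seed=0):
    """Repeat construction + local search and keep the lowest-cost solution."""
    rng = random.Random(seed)
    best_centers, best_cost = None, float("inf")
    for _ in range(iterations):
        seeds = greedy_randomized_seeds(points, k, alpha, rng)
        centers = kmeans(points, seeds)
        cost = sse(points, centers)
        if cost < best_cost:
            best_centers, best_cost = centers, cost
    return best_centers, best_cost


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(grasp_kmeans(data, k=2, iterations=5))
```

Drawing each seed at random from the best-scoring candidates, rather than always taking the single best one, is what lets repeated iterations start K-Means from different but still reasonable initial solutions.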

  • 2nd International Fuzzy Systems Symposium, (C. Gökçeoğlu, H.C. Aladağ, A. Akgün, Eds.); 11/2011
  • ABSTRACT: This paper presents a greedy randomized adaptive search procedure (GRASP) coupled with path relinking (PR) to solve the problem of clustering n nodes in a graph into p clusters. The objective is to maximize the sum of the edge weights within each cluster such that the sum of the corresponding node weights does not exceed a fixed capacity. In phase I, both a heaviest weight edge (HWE) algorithm and a constrained minimum cut algorithm are used to select seeds for initializing the p clusters. Feasible solutions are obtained with the help of a self-adjusting restricted candidate list that sequentially guides the assignment of the remaining nodes. At each major GRASP iteration, the list length is randomly set based on a probability density function that is updated dynamically to reflect the solution quality realized in past iterations. In phase II, three neighborhoods, each defined by common edge and node swaps, are explored to attain local optimality. The following exploration strategies are investigated: cyclic neighborhood search, variable neighborhood descent, and randomized variable neighborhood descent (RVND). The best solutions found are stored in an elite pool. In a post-processing step, PR is applied to the pool members to cyclically generate paths between each pair. As new solutions are uncovered, a systematic attempt is made to improve a subset of them with local search. Should a better solution be found, it is saved temporarily and placed in the pool after all the pairs are investigated and the bottom member is removed. The procedure ends when no further improvement is possible. Extensive computational testing was done to evaluate the various combinations of construction and local search strategies. For instances with up to 40 nodes and 5 clusters, the reactive GRASP with PR found optimal solutions within a negligible amount of time compared to CPLEX. In general, the HWE algorithm in the construction phase, RVND in the local search phase, and the use of PR provided the best results. The largest instances solved involved 82 nodes and 8 clusters.
    Journal of Heuristics 01/2011; 17:119-152. · 1.47 Impact Factor
  • ABSTRACT: A new algorithm is designed for handling fuzziness while mining large data. A novel cost function, weighted by fuzzy membership, is proposed in the framework of CLARANS. A new scalable approximation to the maximum number of neighbors explored at each node is developed, reducing the computational time for large data while eliminating the need for user-defined (heuristic) parameters in the existing equation. The goodness of the generated clusters is evaluated in terms of the Xie–Beni validity index. Results demonstrate the superiority of the proposed algorithm, over both synthetic and real data sets, in terms of goodness of clustering. It is interesting to note that our algorithm always converges to the globally best values at the optimal number of partitions. Moreover, compared to existing fuzzy algorithms, FCLARANS is able to handle the uncertainty due to the overlapping nature of the various partitions without scanning the whole dataset, searching only a small number of neighbors. This is the main motivation for the fuzzification of the CLARANS algorithm.
    Applied Soft Computing. 04/2013; 13(4):1639–1645.
