Article

Clustering: A neural network approach

Department of Electrical and Computer Engineering, Concordia University, Montreal, Canada, H3G 1M8.
Neural networks: the official journal of the International Neural Network Society (Impact Factor: 2.71). 08/2009; 23(1):89-107. DOI: 10.1016/j.neunet.2009.08.007
Source: DBLP

ABSTRACT

Clustering is a fundamental data analysis method. It is widely used for pattern recognition, feature extraction, vector quantization (VQ), image segmentation, function approximation, and data mining. As an unsupervised classification technique, clustering identifies some inherent structures present in a set of objects based on a similarity measure. Clustering methods can be based on statistical model identification (McLachlan & Basford, 1988) or competitive learning. In this paper, we give a comprehensive overview of competitive learning based clustering methods. Importance is attached to a number of competitive learning based clustering neural networks such as the self-organizing map (SOM), the learning vector quantization (LVQ), the neural gas, and the ART model, and clustering algorithms such as the C-means, mountain/subtractive clustering, and fuzzy C-means (FCM) algorithms. Associated topics such as the under-utilization problem, fuzzy clustering, robust clustering, clustering based on non-Euclidean distance measures, supervised clustering, hierarchical clustering as well as cluster validity are also described. Two examples are given to demonstrate the use of the clustering methods.

Available from: K.-L. Du
    • "In clustering methods, data samples are unlabeled, and the challenge is how to categorize them into meaningful clusters. Being a fundamental data analysis method, clustering is commonly used in many applications, which include pattern recognition, image segmentation, and function approximation [4]. Unlike standard statistical methods, many clustering methods do not depend on assumptions; therefore they are useful in situations where little or no prior knowledge is available [3]. "
    ABSTRACT: When no prior knowledge is available, clustering is a useful technique for categorizing data into meaningful groups or clusters. In this paper, a modified fuzzy min–max (MFMM) clustering neural network is proposed. Its efficacy for tackling power quality monitoring tasks is demonstrated. A literature review on various clustering techniques is first presented. To evaluate the proposed MFMM model, a performance comparison study using benchmark data sets pertaining to clustering problems is conducted. The results obtained are comparable with those reported in the literature. Then, a real-world case study on power quality monitoring tasks is performed. The results are compared with those from the fuzzy c-means and k-means clustering methods. The experimental outcome positively indicates the potential of MFMM in undertaking data clustering tasks and its applicability to the power systems domain.
    Full-text · Article · Mar 2015 · Applied Soft Computing
    • "Although such methods are available and applicable for processing of uncertain data from SHM systems in some cases, these neural network models also have some disadvantages, for instance, local optimal solution and poor extrapolation for a BP network, complex network architecture for a fuzzy neural network structure, and difficult determination of the parameter σ of Gaussian kernel function in a probabilistic neural network [9]. The counter-propagation network (CPN) began to be adopted in pattern classification, function approximation, and statistical analysis not very long ago [10] [11], because it has a relatively simple network structure and does not have an error criterion for convergence owing to its combination of the Kohonen self-organizing map and Grossberg competitive learning network model [12] [13]. Nevertheless, the CPN model often requires higher data storage memory and more runtime than other neural network models [10] [11]."
    ABSTRACT: This paper proposes a revised counter-propagation network (CPN) model by integrating rough set in structural damage detection, applicable for processing redundant and uncertain information as well as assessing structural health states. Firstly, rough set is used in the model to deal with a large volume of data; secondly, a revised training algorithm is developed to improve the capabilities of the CPN model; and lastly, the least input vectors are input to the revised CPN (RCPN) model, hence the rough set-based RCPN is proposed in the paper. To validate the model proposed, numerical experiments are conducted, and, as a result, six damage patterns have been successfully identified in a steel frame. The influence of measurement noise, the network models adopted, and the data preprocessing methods on damage identification is also discussed in the paper. The results show that the proposed model not only has good damage detection capability and noise tolerance, but also significantly reduces the data storage requirement and saves computing time.
    Full-text · Article · Nov 2013 · International Journal of Distributed Sensor Networks
    • "It has better convergence properties as compared to the k-means algorithm. The FCM needs to store U and all ci's, and the alternating estimation of U and ci's causes a computational and storage burden for large-scale data sets [23]."
    ABSTRACT: Nonlinear Hybrid Dynamical Systems (NHDS) are characterized by interacting dynamics of continuous and discrete domains. Applications of such systems have been reported in chemical systems, manufacturing systems, mechanical systems, electrical systems, telecommunication systems, automobile control and computer disk drive control. The nonlinear continuous dynamics in NHDS change due to the occurrence of some unknown discrete events. So, for identification of NHDS, it is required to classify the open loop data according to discrete events. Clustering is the process of organizing objects into groups whose members are similar in some way. A variety of algorithms have recently emerged that meet these requirements and were successfully applied to real-life data mining problems. Fuzzy c-means (FCM) and k-means are commonly used partitional algorithms based on unsupervised learning methods. This paper focuses on the analysis of FCM and k-means partitional clustering methods for the single tank NHDS data classification.
    Full-text · Article · Jun 2013

Questions & Answers about this publication

  • Noha Elprince added an answer in Clustering:
    How can I get the topics in a dynamic text?

    I have a large dataset (repository) of text data. The repository covers many different domains, so I can search it for any domain or general topic.
    When I search the repository for a general topic (e.g., nanotechnology), I would like to get all subtopics related to it. Someone else may search the same repository for another topic and want its subtopics. My goal is to develop a tool that identifies the topics in the data returned by a search query. Each search returns different data, so the text is dynamic.


    I have tried clustering algorithms such as k-means and k-means with Canopy, as well as topic modeling (LDA). Unfortunately, for both flat clustering like k-means and term-based methods like LDA, there is no specific way to set the right K (number of clusters or topics) automatically.

    Does anyone have an idea how to identify topics in dynamic text? In other words, how can I make these clustering algorithms work in a real-world application?

    Thanks in advance.

    Noha Elprince

    Sorry, I misunderstood your question.

    You may like to read about "subtractive clustering", which determines the number of clusters dynamically. I attach a paper that uses it:

    I hope this may help,

    Noha

    • Clustering: A neural network approach
      Full-text · Article · Aug 2009 · Neural networks: the official journal of the International Neural Network Society
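
To make the suggestion in the answer above more concrete, here is a minimal sketch of subtractive clustering in plain Python. It is not taken from the attached paper. The function name `subtractive_clustering`, the parameters `ra`, `rb_factor`, `stop_ratio` and `max_centers`, their default values, and the single-threshold stopping rule are all assumptions made for illustration; published versions of the method typically use a more elaborate accept/reject test. The point it shows is that the number of clusters is not an input: centers are picked one at a time until the remaining density peak becomes small.

```python
import math

def subtractive_clustering(points, ra=0.5, rb_factor=1.5, stop_ratio=0.15, max_centers=50):
    """Return cluster centers (chosen among `points`) using a simplified
    subtractive clustering. All parameter names and defaults are illustrative."""
    n, dim = len(points), len(points[0])

    # Rescale every feature to [0, 1] so that one radius suits all dimensions.
    lows = [min(p[j] for p in points) for j in range(dim)]
    spans = [(max(p[j] for p in points) - lows[j]) or 1.0 for j in range(dim)]
    scaled = [[(p[j] - lows[j]) / spans[j] for j in range(dim)] for p in points]

    alpha = 4.0 / ra ** 2                # width used when measuring density
    beta = 4.0 / (rb_factor * ra) ** 2   # wider neighbourhood used when subtracting

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Potential of a point: large when many other points lie close to it.
    potential = [sum(math.exp(-alpha * sqdist(p, q)) for q in scaled) for p in scaled]

    centers = []
    first_peak = max(potential)
    while len(centers) < max_centers:
        k = max(range(n), key=potential.__getitem__)
        peak = potential[k]
        # Simplified stopping rule (assumption): stop once the best remaining
        # potential is only a small fraction of the first peak.
        if peak < stop_ratio * first_peak:
            break
        centers.append(list(points[k]))
        # Remove the new center's influence so its neighbours are not picked again.
        potential = [p - peak * math.exp(-beta * sqdist(z, scaled[k]))
                     for p, z in zip(potential, scaled)]
    return centers

# Toy data: two tight 3x3 grids of points, far apart from each other.
offsets = [(dx, dy) for dx in (-0.1, 0.0, 0.1) for dy in (-0.1, 0.0, 0.1)]
demo_points = offsets + [(x + 5.0, y + 5.0) for x, y in offsets]
print(subtractive_clustering(demo_points))
```

For the text search scenario in the question, the documents returned by a query would first have to be turned into numeric vectors (for example term-weight vectors), which this sketch does not cover. The pairwise potential computation is quadratic in the number of points, so it suits a per-query result set better than a whole repository.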