
Unsupervised Clustering on Multi-Components Datasets: Applications on Images and Astrophysics Data

09/2008;
Source: OAI

ABSTRACT This paper proposes an original approach to clustering multi-component data sets that also estimates the number of clusters. A minimal spanning tree is built with Prim's algorithm and, under the assumption that the vertices are approximately distributed according to a Poisson distribution, the number of clusters is estimated by thresholding Prim's trajectory. The corresponding cluster centroids are then computed and used to initialize the generalized Lloyd algorithm, also known as K-means, which circumvents its initialization problems. The metrics used to measure similarity between multi-dimensional data points are based on symmetrical divergences. Using these informational divergences together with the proposed method leads to better results than some other clustering methods in astrophysical data processing. An application of the method to multi-spectral imagery, using a satellite view of Paris, is also presented.
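The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the threshold is passed in by hand rather than derived from the Poisson assumption (the abstract does not give that derivation), and the symmetrised Kullback-Leibler divergence is only one possible choice of symmetrical divergence. The sketch records the order in which Prim's algorithm attaches vertices and the weight of each attaching edge (the "trajectory"), starts a new cluster whenever that weight exceeds the threshold, and returns the cluster means that could seed K-means.

```python
import math


def symmetric_kl(p, q, eps=1e-12):
    """Symmetrised Kullback-Leibler divergence between two non-negative vectors.

    Both vectors are normalised to sum to 1; eps (an assumption of this
    sketch) avoids taking the logarithm of zero.
    """
    sp, sq = sum(p), sum(q)
    total = 0.0
    for a, b in zip(p, q):
        a = a / sp + eps
        b = b / sq + eps
        total += (a - b) * math.log(a / b)
    return total


def prim_trajectory(points, dist):
    """Run Prim's algorithm from vertex 0 on the complete graph over points.

    Returns (order, weights): the order in which vertices join the tree and
    the weight of the edge that attached each one (0.0 for the start vertex).
    """
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known edge from the tree to each vertex
    best[0] = 0.0
    order, weights = [], []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        order.append(u)
        weights.append(best[u])
        for v in range(n):
            if not in_tree[v]:
                d = dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return order, weights


def split_by_threshold(order, weights, threshold):
    """Start a new cluster each time the trajectory jumps above threshold."""
    clusters = [[order[0]]]
    for idx, w in zip(order[1:], weights[1:]):
        if w > threshold:
            clusters.append([idx])
        else:
            clusters[-1].append(idx)
    return clusters


def cluster_centroids(points, clusters):
    """Component-wise mean of each cluster, usable as K-means initial centroids."""
    dim = len(points[0])
    return [
        [sum(points[i][k] for i in c) / len(c) for k in range(dim)]
        for c in clusters
    ]


if __name__ == "__main__":
    # Toy two-component data: two groups of nearly identical distributions.
    pts = [[0.9, 0.1], [0.88, 0.12], [0.1, 0.9], [0.12, 0.88]]
    order, weights = prim_trajectory(pts, symmetric_kl)
    clusters = split_by_threshold(order, weights, threshold=1.0)
    print(len(clusters), cluster_centroids(pts, clusters))
```

The number of clusters is simply `len(clusters)`, and the centroids would then be handed to any K-means implementation as its starting points. Because Prim's algorithm always takes the cheapest available edge, a jump above the threshold means no remaining vertex lies within the threshold of the current cluster, so the runs between jumps are well-separated groups under the chosen metric.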

Available from: Pierre Comon, Jul 05, 2015
  • ABSTRACT: We present an original approach to clustering multi‐component datasets, in the context of two applications: taxonomic classification of asteroids and clustering of chemical species on a Mars hyperspectral image. The method is based on an estimation of the number of clusters and an initialization of the cluster centroids, so that the data can be partitioned with a popular clustering algorithm, K‐means. This information is extracted from the construction of a minimal spanning tree (MST), under the assumption that the vertices are approximately distributed according to a Poisson distribution. We also introduce some recent popular clustering algorithms based on spectral graph theory: spectral clustering methods. The metrics used to measure similarity between multidimensional data points are based on symmetrical divergences (e.g., Kullback‐Leibler and Rényi). Using these informational divergences together with the proposed method leads to better results than some other clustering methods on astrophysical data, including popular reflectance spectra surveys (the Eight Color Asteroid Survey and the Small Main Belt Asteroid Spectroscopic Survey II) and hyperspectral images.
    12/2008; 1082(1):165-171. DOI:10.1063/1.3059034
  • ABSTRACT: In this paper, we present a technique to help experts in agricultural monitoring by mining Satellite Image Time Series over cultivated areas. We use frequent sequential patterns, extended to this spatiotemporal context, to extract sets of connected pixels sharing a similar temporal evolution. We show that a pixel connectivity constraint can be partially pushed to prune the search space, in conjunction with a support threshold. Together with a simple maximality constraint, the method reveals meaningful patterns in real data.
    Advances in Data Mining. Applications and Theoretical Aspects - 11th Industrial Conference, ICDM 2011, New York, NY, USA, August 30 - September 3, 2011. Proceedings; 08/2011
  • ABSTRACT: An important aspect of satellite image time series is the simultaneous access to spatial and temporal information. Various tools allow end users to interpret these data without having to browse the whole data set. In this paper, we aim to extract, in an unsupervised way, temporal evolutions at the pixel level and to select those covering at least a minimum surface and having a high connectivity measure. To manage the huge amount of data and the large number of potential temporal evolutions, a new approach based on data-mining techniques is presented. We have developed a frequent sequential pattern extraction method adapted to that spatiotemporal context. A successful application to crop monitoring involving optical data is described. Another application, to crustal deformation monitoring using synthetic aperture radar images, gives an indication of the generic nature of the proposed approach.
    IEEE Transactions on Geoscience and Remote Sensing 04/2011; 49:1417-1430. DOI:10.1109/TGRS.2010.2081372 · 2.93 Impact Factor