Source publication
The supervised self-organizing map consists in associating output vectors with input vectors through a map, after self-organizing it on the basis of both the input and the desired output given together. This paper compares the use of the Euclidean distance and the Mahalanobis distance for this model. The distance comparison is made on a data classification applica...
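To make the distance choice discussed in the abstract concrete, here is a minimal numpy sketch of best-matching-unit search with either metric. It assumes, as is common, that the Mahalanobis matrix is the inverse of the sample covariance; it is an illustration, not the paper's actual implementation.

```python
import numpy as np

def best_matching_unit(x, prototypes, metric="euclidean", inv_cov=None):
    """Return the index of the prototype closest to observation x.

    metric: "euclidean" or "mahalanobis".
    inv_cov: inverse covariance matrix, required for the Mahalanobis metric.
    """
    diffs = prototypes - x                          # (n_prototypes, n_features)
    if metric == "euclidean":
        d2 = np.einsum("ij,ij->i", diffs, diffs)
    elif metric == "mahalanobis":
        d2 = np.einsum("ij,jk,ik->i", diffs, inv_cov, diffs)
    else:
        raise ValueError("unknown metric")
    return int(np.argmin(d2))

# Toy example: features with very different scales, where the two metrics can disagree.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * np.array([1.0, 10.0, 100.0])
prototypes = X[rng.choice(len(X), size=16, replace=False)]
inv_cov = np.linalg.inv(np.cov(X, rowvar=False))

x = X[0]
print(best_matching_unit(x, prototypes, "euclidean"))
print(best_matching_unit(x, prototypes, "mahalanobis", inv_cov))
```

With strongly unequal component variances, the Euclidean search is dominated by the large-scale feature, whereas the Mahalanobis search weights each direction by the data spread, which is the effect the paper compares.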
Context in source publication
Context 1
... Fig. 4 depicts the weight maps of the four class nodes for the LASSO with the Mahalanobis metric, after self-organization (the weight maps of the observation nodes are not represented). In the figure, a weight value is represented by a dot in gray scale: a dark dot means a low value, whereas a light dot means a high value. The prototypes that specialize in one particular ...
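A gray-scale component-plane rendering of this kind can be reproduced for any trained SOM with a short matplotlib plot. The sketch below assumes a hypothetical weight tensor of shape (rows, cols, n_class_components) filled with random values; it is not the figure-generation code of the paper.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical trained SOM weights: a 10x10 map with 4 class components per node.
rng = np.random.default_rng(1)
weights = rng.random((10, 10, 4))

fig, axes = plt.subplots(1, 4, figsize=(12, 3))
for c, ax in enumerate(axes):
    # One component plane per class node: dark = low weight, light = high weight.
    ax.imshow(weights[:, :, c], cmap="gray", vmin=0.0, vmax=1.0)
    ax.set_title(f"class node {c + 1}")
    ax.set_xticks([])
    ax.set_yticks([])
plt.tight_layout()
plt.show()
```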
Similar publications
Natural image segmentation is an important topic in digital image processing, and it could be solved by clustering methods. We present in this paper an SOM-based k-means method (SOM-K) and a further saliency map-enhanced SOM-K method (SOM-KS). In SOM-K, pixel features of intensity and L*u*v* color space are trained with SOM and followed by a k-mean...
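A rough sketch of the SOM-then-k-means idea described above (train a SOM on pixel features, cluster the SOM codebook with k-means, then label each pixel through its best-matching unit), assuming the third-party minisom and scikit-learn packages and random stand-in pixel features rather than the paper's intensity and L*u*v* features:

```python
import numpy as np
from minisom import MiniSom
from sklearn.cluster import KMeans

# Stand-in pixel features (e.g. intensity plus colour channels), one row per pixel.
rng = np.random.default_rng(2)
pixel_features = rng.random((5000, 4))

# 1) Train a SOM on the pixel features.
som = MiniSom(12, 12, pixel_features.shape[1], sigma=1.5, learning_rate=0.5, random_seed=2)
som.train_random(pixel_features, 5000)

# 2) Cluster the SOM codebook vectors with k-means.
codebook = som.get_weights().reshape(-1, pixel_features.shape[1])   # (144, 4)
kmeans = KMeans(n_clusters=5, n_init=10, random_state=2).fit(codebook)

# 3) Assign each pixel the cluster of its best-matching unit (its segment label).
def segment_label(x):
    i, j = som.winner(x)
    return kmeans.labels_[i * 12 + j]

labels = np.array([segment_label(x) for x in pixel_features])
print(np.bincount(labels))
```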
Subspace clustering is the task of identifying clusters in subspaces of the input dimensions of a given dataset. Noisy data in certain attributes cause difficulties for traditional clustering algorithms, because the high discrepancies within them can make objects appear too different to be grouped in the same cluster. This requires methods speciall...
This paper presents document search model based on its visual content. There is used hierarchical clustering algorithm - GHSOM. Description of proposed model is given as learning and searching phase. Also some experiments are described on benchmark image sets (e.g. ICPR, MIRFlickr) and created document set. Paper presents some experiments connected...
This work presents a methodology toward fully automated road centerline extraction that exploits spectral content from high resolution multispectral images. Preliminary detection of candidate road centerline components is performed with Anti-parallel-edge Centerline Extraction (ACE). This is followed by constructing a road vector topology with a fu...
Abstract. The study of complex networks, such as urban and social networks, provides a basis for understanding the spatial organization of society and for strategic planning actions by companies and governments. This work proposes a methodology that reconciles and extends the concept of dominant association between cities with the well-known properties of...
Citations
... The combination of SOMs and nearest-neighbor classification is shown by Ji [25] for the classification of land use based on Landsat satellite data. Additional combinations of unsupervised SOMs and supervised algorithms used for classification are presented in [26][27][28][29]. One example of the application of SOMs to solve nonlinear regression tasks is presented by Hecht et al. [30] in the field of robotics. ...
Machine learning approaches are valuable methods in hyperspectral remote sensing, especially for the classification of land cover or for the regression of physical parameters. While the recording of hyperspectral data has become affordable with innovative technologies, the acquisition of reference data (ground truth) has remained expensive and time-consuming. There is a need for methodological approaches that can handle datasets with significantly more hyperspectral input data than reference data. We introduce the Supervised Self-organizing Maps (SuSi) framework, which can perform unsupervised, supervised and semi-supervised classification as well as regression on high-dimensional data. The methodology of the SuSi framework is presented and compared to other frameworks. Its different parts are evaluated on two hyperspectral datasets. The results of the evaluations can be summarized in four major findings: (1) The supervised and semi-Supervised Self-organizing Maps (SOM) outperform random forest in the regression of soil moisture. (2) In the classification of land cover, the supervised and semi-supervised SOM reveal great potential. (3) The unsupervised SOM is a valuable tool to understand the data. (4) The SuSi framework is versatile, flexible, and easy to use. The SuSi framework is provided as an open-source Python package on GitHub.
... Some supervised and semi-supervised learning methods have been proposed for SOMs. In early studies, the class label was concatenated to the input data [21,22,23]. Kohonen et al. adopted learning vector quantization (LVQ), which updates only the winner node, and then fine-tuned the SOM [24]. ...
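As a concrete picture of the label-concatenation scheme mentioned above, here is a minimal, self-contained numpy sketch (not any of the cited implementations): one-hot labels are appended to the inputs during training, and at prediction time the best-matching unit is found on the input part alone and its label part is read off.

```python
import numpy as np

def train_concat_som(X, y, n_classes, map_shape=(8, 8), n_iter=2000,
                     lr0=0.5, sigma0=2.0, seed=0):
    """Train a SOM on [input, one-hot(label)] vectors (label-concatenation scheme)."""
    rng = np.random.default_rng(seed)
    rows, cols = map_shape
    d_in = X.shape[1]
    onehot = np.eye(n_classes)[y]
    data = np.hstack([X, onehot])                  # augmented training vectors
    W = rng.random((rows, cols, d_in + n_classes))
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)

    for t in range(n_iter):
        v = data[rng.integers(len(data))]
        # Best-matching unit on the full augmented vector during training.
        d2 = np.sum((W - v) ** 2, axis=2)
        bmu = np.unravel_index(np.argmin(d2), d2.shape)
        lr = lr0 * np.exp(-t / n_iter)
        sigma = sigma0 * np.exp(-t / n_iter)
        # Gaussian neighbourhood around the BMU on the map grid.
        g2 = np.sum((grid - np.array(bmu)) ** 2, axis=2)
        h = np.exp(-g2 / (2.0 * sigma ** 2))[:, :, None]
        W += lr * h * (v - W)
    return W

def predict_concat_som(W, X, d_in):
    """Find the BMU using only the input part, then read off its label part."""
    preds = []
    for x in X:
        d2 = np.sum((W[:, :, :d_in] - x) ** 2, axis=2)
        bmu = np.unravel_index(np.argmin(d2), d2.shape)
        preds.append(int(np.argmax(W[bmu][d_in:])))
    return np.array(preds)

# Toy usage with two Gaussian blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(4, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)
W = train_concat_som(X, y, n_classes=2)
print((predict_concat_som(W, X, d_in=2) == y).mean())
```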
The self-organizing map (SOM) is an unsupervised artificial neural network that is widely used in, e.g., data mining and visualization. Supervised and semi-supervised learning methods have been proposed for the SOM. However, their teacher labels do not describe the relationship between the data and the location of nodes. This study proposes a landmark map (LAMA), which is an extension of the SOM that utilizes several landmarks, e.g., pairs of nodes and data points. LAMA is designed to obtain a user-intended nonlinear projection to achieve, e.g., the landmark-oriented data visualization. To reveal the learning properties of LAMA, the Zoo dataset from the UCI Machine Learning Repository and an artificial formant dataset were analyzed. The analysis results of the Zoo dataset indicated that LAMA could provide a new data view such as the landmark-centered data visualization. Furthermore, the artificial formant data analysis revealed that LAMA successfully provided the intended nonlinear projection associating articular movement with vertical and horizontal movement of a computer cursor. Potential applications of LAMA include data mining, recommendation systems, and human-computer interaction.
... The combination of SOMs and nearest-neighbor classification is shown by Ji (2000) for the classification of land use. Additional combinations of unsupervised SOMs and supervised algorithms used for classification are presented by Martinez et al. (2001); Zaccarelli et al. (2003); Zhong et al. (2006); Fessant et al. (2001). One example for the application of SOMs to solve non-linear regression tasks is presented by Hecht et al. (2015) in the field of robotics. ...
In many research fields, the sizes of the existing datasets vary widely. Hence, there is a need for machine learning techniques which are well-suited for these different datasets. One possible technique is the self-organizing map (SOM), a type of artificial neural network which is, so far, weakly represented in the field of machine learning. The SOM's unique characteristic is the neighborhood relationship of the output neurons. This relationship improves the ability of generalization on small datasets. SOMs are mostly applied in unsupervised learning and few studies focus on using SOMs as supervised learning approach. Furthermore, no appropriate SOM package is available with respect to machine learning standards and in the widely used programming language Python. In this paper, we introduce the freely available SUpervised Self-organIzing maps (SUSI) Python package which performs supervised regression and classification. The implementation of SUSI is described with respect to the underlying mathematics. Then, we present first evaluations of the SOM for regression and classification datasets from two different domains of geospatial image analysis. Despite the early stage of its development, the SUSI framework performs well and is characterized by only small performance differences between the training and the test datasets. A comparison of the SUSI framework with existing Python and R packages demonstrates the importance of the SUSI framework. In future work, the SUSI framework will be extended, optimized and upgraded e.g. with tools to better understand and visualize the input data as well as the handling of missing and incomplete data. https://arxiv.org/abs/1903.11114
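A short usage sketch of the package described above, assuming the scikit-learn-style fit/predict interface the SuSi documentation advertises (class and parameter names may differ between versions):

```python
import susi
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Supervised SOM classifier with a small output grid.
som = susi.SOMClassifier(n_rows=10, n_columns=10, random_state=42)
som.fit(X_train, y_train)
print(accuracy_score(y_test, som.predict(X_test)))
```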
... In this paper we investigate two additional improvements. The first one [6], [2] uses the Mahalanobis metric instead of the Euclidean one. The second improvement [13] shows how to use the SOM in a supervised manner. ...
... In our approach, contrary to [6], [2], [3], instead of computing the Mahalanobis matrix as the inverse of the covariance matrix, it is learned in a way that ensures a small distance between points from the same class and a large-margin separation of points from different classes. Several algorithms exist for distance metric learning (DML) [17], [7]. ...
Self-Organising Maps (SOM) are Artificial Neural Networks used in Pattern Recognition tasks. Their major advantage over other architectures is the human readability of the model. However, they often achieve poorer accuracy. The most commonly used metric in SOMs is the Euclidean distance, which is not the best choice for some problems. In this paper, we study the impact of a metric change on the SOM's performance in classification problems. In order to change the metric of the SOM, we applied a distance metric learning method, the so-called 'Large Margin Nearest Neighbour'. It computes a Mahalanobis matrix, which ensures a small distance between nearest-neighbour points from the same class and separation of points belonging to different classes by a large margin. Results are presented on several real data sets, covering for example the recognition of written digits, spoken letters and faces.
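For orientation, the sketch below shows one way to obtain such a learned Mahalanobis matrix with the third-party metric-learn package and to plug it into a distance function. It is an illustration of the general recipe under that assumption, not the authors' code.

```python
import numpy as np
from metric_learn import LMNN
from sklearn.datasets import load_digits

X, y = load_digits(return_X_y=True)
X, y = X[:500], y[:500]   # small subset to keep the example quick

# Learn a Mahalanobis matrix M that pulls same-class neighbours together
# and pushes other classes away by a large margin (LMNN).
lmnn = LMNN()
lmnn.fit(X, y)
M = lmnn.get_mahalanobis_matrix()

def mahalanobis_dist(x, w, M):
    d = x - w
    return float(np.sqrt(max(d @ M @ d, 0.0)))

# This distance would then replace the Euclidean one in the SOM's
# best-matching-unit search and neighbourhood updates.
print(mahalanobis_dist(X[0], X[1], M))
```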
... Only limited research has been reported on comparing two or more SOMs with each other. An analysis of different distance measures for a supervised version of the SOM and its application to the classification of rail defects, for example, is studied in [2]. Quality measures for the evaluation of data distribution across maps trained on multi-modal data are explored in [5], where the effect of multiple modalities is shown by the example of song lyrics and acoustic features for audio files. ...
SOMs have proven to be a very powerful tool for data analysis. However, comparing multiple SOMs trained on the same data set using different parameters or initialisations is still a difficult task. In most cases it is performed only via visual inspection or by utilising one of a range of quality measures to compare vector quantisation or topology preservation characteristics of the maps. Yet, comparing SOMs systematically is both necessary as well as a powerful tool to further analyse data: necessary, because it may help to pick the most suitable SOM out of different training runs; a powerful tool because it allows analysing mapping stabilities across a range of parameter settings. In this paper we present an analytic approach to compare multiple SOMs trained on the same data set. Analysis of output space mapping, supported by a set of visualisations, reveals data co-locations and shifts on pairs of SOMs, considering both different neighbourhood sizes at source and target maps. A similar concept of mutual distances and relationships can be analysed at a cluster level. Finally, comparisons aggregated automatically across several SOMs are strong indicators for the strength and stability of mappings.
... At each iteration, the model parameters are re-estimated to maximize the model log-likelihood, f (X|Φ), until convergence. At each retrieval iteration, the Mahalanobis distance [11] from each Gaussian feature cluster to the existing objects in the database is computed as a measure of similarity (a minimum distance classifier) and the objects in the database are presented to the user as a ranked list in descending order of the cumulative similarity score S(x) where each feature is weighted in direct proportion to the number of its positive samples indicated by the user. The Mahalanobis distance is expressed as: ...
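The excerpt cuts off before the expression itself; for reference, the standard form of the Mahalanobis distance between a feature vector x and a Gaussian cluster with mean vector μ and covariance matrix Σ, which is what a minimum-distance classifier of this kind typically uses, is:

```latex
d_{M}(\mathbf{x}) = \sqrt{(\mathbf{x} - \boldsymbol{\mu})^{\top} \, \Sigma^{-1} \, (\mathbf{x} - \boldsymbol{\mu})}
```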
In this paper we present an interactive, object-based video retrieval system which features a novel query formulation method that is used to iteratively refine an underlying model of the search object. As the user continues query composition and browsing of retrieval results, the system's object modeling process, based on Gaussian probability distributions, becomes incrementally more accurate, leading to better search results. To make the interactive process understandable and easy to use, a custom user interface has been designed and implemented that allows the user to interact with segmented objects in formulating a query, in browsing a search result, and in re-formulating a query by selecting an object in the search result.
... Eventually, only a few codebook vectors lie in areas where the input data is sparse. This method is commonly trained in an unsupervised fashion, though some supervised variants do exist [3-5]. Typically, supervised SOMs concatenate class information to the input vector, an approach which has a number of disadvantages: 1.) ...
Recent developments with self-organizing maps allow the application to graph structured data. This paper proposes a supervised learning technique for self-organizing maps for structured data. The ideas presented in this paper differ from Kohonen’s approach in that a rejection term is introduced. This approach is superior because it is more robust to the variation of the number of different classes in a dataset. It is also more flexible because it is able to efficiently process data with missing or incomplete class information, and hence, includes the unsupervised version as a special case. We demonstrate the capabilities of the proposed model through an application to a relatively large practical data set from the area of image recognition, viz., logo recognition. It is shown that by adding supervised learning to the learning process the discrimination between pattern classes is enhanced, while the computational complexity is similar to that of the unsupervised version.
... Each method has its own properties and generally gives a different perspective on the data, making the choice non-trivial (Meyer, 2002). The most commonly used method for calculating distance in SOM learning is the Euclidean distance measure, which considers each observation dimension with the same significance whatever the observation distribution inside the classes (Fessant, F. et al., 2001). Among all distance measures, some have very similar behaviors in similarity queries. ...
... Past research (Huang, Y. et al., 1998), (Fessant, F. et al., 2001) and (Keeratipranon, N. and Maire, F., 2005) shows that, although the use of different distance measures affects SOM performance in classification tasks, the results obtained are nearly equivalent and not quite promising. For this reason, an enhanced learning methodology for the SOM algorithm, known as multilevel SOM learning, is proposed in order to find out whether it can improve SOM classification results. ...
... This paper (Fessant, F. et al., 2001) focuses on the metric choice for the prototype-to-observation distance estimation required during the self-organization and exploitation phases. The results of this study show that the Mahalanobis distance turns out to be more effective over a large range of data component variances than the more commonly used Euclidean distance, which considers each observation dimension with the same significance whatever the observation distribution inside the classes. ...
Classification is one of the most active research and application areas of neural networks. The self-organizing map (SOM) is a feed-forward neural network approach that uses an unsupervised learning algorithm and has shown a particular ability for solving classification problems in pattern recognition. Classification is the procedure of recognizing classes of patterns that occur in the environment and assigning each pattern to its relevant class. Unlike classical statistical methods, the SOM does not require any prior knowledge about the statistical distribution of the patterns in the environment. In this study, an alternative classification scheme for self-organizing neural networks, known as multilevel learning, is proposed to solve the task of pattern separation. The performance of the standard SOM and the multilevel SOM is evaluated with different distance or dissimilarity measures for retrieving the similarity between patterns. The purpose of this analysis is to evaluate the quality of the map produced by SOM learning using different distance measures to represent a given dataset. Based on the results obtained from both SOM learning methods, predictions can be made for the unknown samples. This study aims to investigate the performance of the standard SOM and the multilevel SOM as supervised pattern recognition methods. The multilevel SOM resembles the self-organizing map (SOM) but has several advantages over the standard SOM. The experiments present a comparison between a standard SOM and a multilevel SOM for pattern classification on five different datasets. The results show that multilevel SOM learning gives a good classification rate; however, the computational time is increased compared to the standard SOM, especially for medium and large-scale datasets.
A self-organizing map (SOM) is an unsupervised artificial neural network that is widely used in, e.g., data mining and visualization. Supervised and semi-supervised learning methods have been proposed for the SOM. However, their teacher labels do not describe the relationship between the data and the location of nodes. This study proposes a landmark map (LAMA), which is an extension of SOMs that utilizes several landmarks, e.g., pairs of nodes and data points. LAMA is designed to obtain a user-intended nonlinear projection to achieve, e.g., the landmark-oriented data visualization. To reveal the learning properties of LAMA, the Zoo dataset from the UCI Machine Learning Repository, the McDonald’s dataset from Kaggle, and an artificial formant dataset were analyzed. The analysis results of the Zoo dataset indicated that LAMA could provide a new data view such as the landmark-centered data visualization. McDonald’s dataset analysis demonstrated menu recommendation examples based on a few designated items. Furthermore, the artificial formant data analysis revealed that LAMA successfully provided the intended nonlinear projection associating articular movement with vertical and horizontal movement of a computer cursor. Potential applications of LAMA include data mining, recommendation systems, and human–computer interaction.
This work concerns detection and classification problems for fault diagnosis. Two approaches are treated. In the first one, called simultaneous detection and classification, the global K-class problem is split into sub-problems. Each sub-problem consists of the detection of one or several classes, and it is solved by a block that links together a pre-processing phase, the choice of the representation space, detection and then decision. The complete resolution of the K-class problem is carried out by a sequential arrangement of the blocks, in accordance with a hierarchic decision tree, or by a parallel decision scheme.
The second approach is the successive detection and classification approach. It consists of a first basic signal-processing stage for alarm generation that indicates the possible existence of a defect. Then, higher-level processing is activated in order to precisely analyze the defect signature. Classification tools (linear classifiers, neural classifiers, support vector machines) are detailed. A section focuses on the margin tuning and generalisation capabilities of the classifiers.
All these methods have been validated on a rail surface defect detection application in a subway context. A real-time prototype allows the simultaneous detection and classification solutions to be tested in running conditions. The correct detection and false alarm rates have been calculated for 4 classes of rail defects. The wavelet transform, inverse filtering and independent component analysis are detailed in particular for the pre-processing phase. A set of on-site labelled measurements allows the solutions of the successive detection and classification approach to be statistically qualified. A hierarchical presentation of the methods is proposed, in terms of generalisation capability, complexity and also the ability to solve the problem with or without optimisation of the representation space.
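As a small illustration of the margin tuning mentioned above (the regularisation constant of a support vector machine traded off against generalisation), here is a generic scikit-learn cross-validation sketch on synthetic data, not the validation pipeline of the rail-defect study:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the labelled defect measurements (4 defect classes).
X, y = make_classification(n_samples=600, n_features=20, n_classes=4,
                           n_informative=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The SVM regularisation constant C controls the softness of the margin;
# cross-validation picks the value with the best generalisation estimate.
pipe = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
grid = GridSearchCV(pipe, {"svc__C": [0.1, 1, 10, 100],
                           "svc__gamma": ["scale", 0.01, 0.1]}, cv=5)
grid.fit(X_train, y_train)
print(grid.best_params_, grid.score(X_test, y_test))
```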