Table 2 - uploaded by John Anzola
Keywords for cluster

Source publication
Conference Paper
An important part of data analysis is data exploration and representation. This article describes Principal Component Analysis (PCA) and cluster analysis with k-means, which are used to represent a set of two-dimensional spatial data and to group similar data, in order to find relationships between the two techniques. Data is extracted from IE...
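As a rough illustration of PCA on two-dimensional data, the following minimal sketch computes the principal axes from the 2x2 sample covariance matrix in closed form. The function names (`pca_2d`, `project`) and the sample points are hypothetical and do not come from the source publication.

```python
import math

def pca_2d(points):
    """Return (mean, (pc1, pc2), (var1, var2)) for a list of (x, y) points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Entries of the sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    # Closed-form eigenvalues of a symmetric 2x2 matrix (largest first).
    half_trace = (sxx + syy) / 2
    root = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    variances = (half_trace + root, half_trace - root)
    # Orientation of the first principal axis; the second is perpendicular.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    pc1 = (math.cos(theta), math.sin(theta))
    pc2 = (-math.sin(theta), math.cos(theta))
    return (mx, my), (pc1, pc2), variances

def project(points, mean, components):
    """Express each centered point in the principal-axis coordinate system."""
    return [
        tuple((p[0] - mean[0]) * c[0] + (p[1] - mean[1]) * c[1] for c in components)
        for p in points
    ]

# Hypothetical sample: points lying exactly on the line y = 2x.
sample = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
mean, components, variances = pca_2d(sample)
scores = project(sample, mean, components)
```

For points that lie on a single line, as in this sample, essentially all of the variance falls on the first component, so the second coordinate of each projected point is close to zero.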

Context in source publication

Context 1
... as the value of k increases, a horizontal division (marked in red) appears at the bottom that contains information of little variability. It corresponds to keywords that are published fewer than 3 times per year; in other words, they group research topics of little impact or that are little explored. Table 2 shows the number of keywords contained in each group for k = 3, 4, 5, 6. Figure 12. ...
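The idea of counting how many items fall into each group for k = 3, 4, 5, 6 can be illustrated with a minimal sketch. It assumes plain Lloyd-style k-means on synthetic two-dimensional points; the data, the seeds, and the helper names (`kmeans`, `cluster_sizes`) are hypothetical and are not taken from the source publication.

```python
import random
from collections import Counter

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd iteration; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels, centers

def cluster_sizes(labels, k):
    """Number of points assigned to each of the k clusters."""
    counts = Counter(labels)
    return [counts.get(j, 0) for j in range(k)]

# Synthetic data (assumption): three Gaussian blobs of 30 points each.
rng = random.Random(42)
points = [
    (rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
    for cx, cy in [(0, 0), (5, 5), (0, 5)]
    for _ in range(30)
]

# One row of group sizes per value of k, analogous in shape to a size table.
sizes_by_k = {k: cluster_sizes(kmeans(points, k)[0], k) for k in (3, 4, 5, 6)}
for k, sizes in sizes_by_k.items():
    print(k, sizes)
```

The sizes in each row always sum to the number of points; how they split across groups depends on the data and on the initialization.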

Citations

... PCA was employed to improve K-means for groundwater quality evaluation and management [69]. These two methods were also used in digital library exploration [7]. In [87], to solve problems in the recruitment and selection of employees, PCA was utilized to reduce the dimensions of evaluation indexes, while K-means was utilized for the hierarchical clustering analysis of data after dimension reduction. ...
Article
The K-means algorithm is a popular clustering method that is sensitive to the initialization of samples and to the choice of the number of clusters. Its performance is considerably affected on high-dimensional datasets. Principal component analysis (PCA) is a linear dimensionality reduction method that is closely related to the K-means algorithm. Dimension reduction allows the initial centers to be selected in a smaller space, which addresses the initialization problem. The present study investigates the reciprocal relationship between K-means and PCA and adopts an innovative approach of creating sub-datasets and applying step-by-step labeling in the hybrid execution of both algorithms to propose two methods, namely K-P and P-K. The clusters obtained from the two proposed methods are highly interpretable. This was verified by the step-by-step labeling results on a human resource dataset. Interpretability was evaluated via the distribution of features of interest (FoI), suggesting improved results for both datasets. In addition to the improvement in qualitative results, the study showed improvements in the sum of squared errors divided by the total number of data points (SSE/N) and in the silhouette score on 10 datasets, compared with eight initialization methods from previous studies. The P-K results and run time were better than the K-P ones.
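The two quantitative measures named in the abstract, SSE/N and the silhouette score, can be sketched from their standard definitions as follows. This is not the K-P or P-K procedure from the study; the function names and the toy data are hypothetical, and the silhouette function assumes at least two clusters.

```python
import math

def sse_per_point(points, labels, centers):
    """Sum of squared distances to the assigned center, divided by N."""
    total = sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))
    return total / len(points)

def mean_silhouette(points, labels):
    """Mean silhouette coefficient (requires at least two clusters)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the point's zero distance to itself is included in the sum).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Toy data (assumption): two tight, well-separated pairs of points.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
toy_labels = [0, 0, 1, 1]
toy_centers = [(0.0, 0.5), (10.0, 0.5)]

sse_n = sse_per_point(toy_points, toy_labels, toy_centers)
silhouette = mean_silhouette(toy_points, toy_labels)
```

A lower SSE/N indicates tighter clusters, while a silhouette value closer to 1 indicates clusters that are both compact and well separated.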