
An equidistance index intuitionistic fuzzy c-means clustering algorithm based on local density and membership degree boundary


Abstract and Figures

The fuzzy c-means (FCM) algorithm is an unsupervised clustering algorithm that effectively expresses complex real-world information by integrating fuzzy parameters. Owing to its simplicity and operability, it is widely used in fields such as image segmentation, text categorization, and pattern recognition. Intuitionistic fuzzy c-means (IFCM) clustering has been shown to perform better than FCM because it captures more of the uncertain information in the dataset. However, the IFCM algorithm has limitations, such as the random initialization of cluster centers and the unrestricted influence of all samples on all cluster centers. Therefore, a novel algorithm named equidistance index IFCM (EI-IFCM) is proposed to address these shortcomings. First, EI-IFCM starts its learning process from better initial cluster centers, which it organizes based on the local density information contributed by the data samples. Second, a membership degree boundary is assigned to the data samples that satisfy the equidistance index, which avoids the unrestricted influence of all samples on all cluster centers during clustering. Finally, the performance of the proposed EI-IFCM is numerically validated on UCI datasets containing healthcare, plant, animal, and geography data. The experimental results indicate that the proposed algorithm is competitive and suitable for fields such as plant clustering, medical classification, and image differentiation. They also indicate that, compared with other efficient clustering algorithms, the proposed algorithm performs better in terms of iterations and precision in these fields.
Applied Intelligence (2024) 54:3205–3221
https://doi.org/10.1007/s10489-024-05297-1
QianxiaMa1· XiaominZhu1· XiangkunZhao1· ButianZhao2· GuanhuaFu3· RuntongZhang2
Accepted: 28 January 2024 / Published online: 27 February 2024
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024
Keywords Equidistance index · Local density · Membership degree boundary · Intuitionistic fuzzy c-means · Equidistance index intuitionistic fuzzy c-means
1 Introduction
As an essential branch of machine learning, clustering analysis aims to gather highly similar data samples into the same group. As an unsupervised learning technique, clustering has been widely used in many fields, such as image segmentation [1], evaluation of credit risk prediction [2], and pattern recognition [3]. Among the various clustering algorithms [4–7], the fuzzy c-means clustering (FCM) proposed by Bellman et al. [8] can integrate the uncertainty of actual datasets by incorporating Zadeh's fuzzy theory [9]. The use of fuzzy information is mostly driven by its ability to describe operations in a manner akin to human logical thinking, which can capture more information about actual problems [10]. FCM generates interaction between different clusters, which can effectively avoid falling into a local optimal solution [11, 12]. However, because of the uncertainty in data collection in practical problems, FCM may experience uncertainty when calculating the membership value of a given sample [13]. In other words, because fuzzy theory captures uncertain information only through membership functions, some fuzzy information can be lost [14]. Therefore, FCM has certain limitations in comprehensively capturing uncertain information [15].
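As a rough illustration of the baseline discussed in this introduction, the following is a minimal sketch of standard FCM with random initialization, which is the behaviour the abstract names as a limitation. It is not the authors' EI-IFCM: it has no local-density initialization, no intuitionistic hesitation term, and no membership degree boundary. The function name `fcm`, the fuzzifier default `m = 2`, the tolerance, and the seed are assumptions made for this example.

```python
import numpy as np

def fcm(X, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Plain fuzzy c-means with random initial memberships (illustrative only)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random membership matrix U (n x c); each row is normalized to sum to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        Um = U ** m
        # Cluster centers: membership-weighted means of all samples.
        V = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every center (n x c).
        D = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2)
        D = np.fmax(D, 1e-12)  # guard against division by zero
        # Membership update: u_ik is proportional to d_ik^(-2 / (m - 1)).
        inv = D ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return V, U

# Usage: two well-separated groups of three points each.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
V, U = fcm(X, c=2)
labels = U.argmax(axis=1)  # hard labels taken from the largest membership
```

Note that every sample contributes to every center through `Um`, however small its membership. This is the "unrestricted influence of all samples on all cluster centers" that the abstract says EI-IFCM restricts.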
To address the inability of fuzzy sets to capture more uncertain information, Atanassov
* Xiaomin Zhu
xmzhu@bjtu.edu.cn
1 School of Mechanical, Electronic and Control Engineering, Beijing Jiaotong University, Beijing 100044, China
2 School of Economics and Management, Beijing Jiaotong University, Beijing 100044, China
3 Rail Transit Department, Tianjin Jinhang Computing Technology Research Institute, Tianjin 300308, China
... To further confirm the clustering performance of the proposed GHFCM algorithm, we compare the proposed GHFCM algorithm with the latest clustering algorithms such as FRCM [26], FALRCM [63], EI_IFCM [64], and GMFCM [37] by testing it on both numeric and image data. ...
Article
To address the shortcomings of fuzzy clustering based on Ruspini partitioning in revealing the relationship between categories, a new generalized harmonic fuzzy partition C-means clustering algorithm is proposed in this paper. First, based on the existing concept of Zadeh’s fuzzy set, the new concept of the generalized harmonic fuzzy set is introduced and some basic operations on generalized harmonic fuzzy sets are given. Second, the axiomatic definitions and specific expressions of fuzzy entropy and similarity for generalized harmonic sets, as applied to pattern analysis and machine intelligence, are provided. Third, the corresponding harmonic fuzzy partition for cluster analysis is defined. Fourth, based on the concept of the harmonic fuzzy partition, we propose a novel generalized harmonic fuzzy partition C-means clustering algorithm. Finally, the competitiveness and advantages of the proposed algorithm are verified by comparison with existing representative fuzzy clustering algorithms.
... Calculating the centroid aims to find a representative value of the fuzzy distribution by dividing the first moment by the total area under the membership function curve (Ma et al., 2024). This centroid reflects a weighted average value, where the weights are determined by the degree of membership. ...
Article
The fermentation process of cassava tape involves complex biochemical reactions and requires careful attention to various factors. Temperature and fermentation time are crucial parameters that significantly affect the final quality of cassava tape. Along with the development of artificial intelligence (AI), the fermentation process can be controlled and monitored more precisely, thereby increasing the efficiency and consistency of the final product. This study aims to determine the Mamdani fuzzy logic approach in the temperature control system and fermentation duration in the cassava tape production process to optimally regulate the alcohol content of cassava tape. The research method is a literature study, testing Mamdani fuzzy logic using Matlab software, and analyzing input variables manually. The results of the study showed that the optimal temperature for cassava tape fermentation was between 25°C - 30°C and the optimal time for cassava tape fermentation was with a "long" duration. From the defuzzification results, the final results showed an alcohol content of 15.9% at a temperature of 29°C and a fermentation time of 80 hours so that the alcohol content of cassava tape was in accordance with the specifications.
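The centroid calculation described in the citing excerpt above (first moment divided by the total area under the membership curve) can be sketched numerically as follows. This is only an illustration under stated assumptions: the membership curve is sampled on a uniformly spaced grid, so the grid spacing cancels and the ratio of integrals reduces to a membership-weighted average. The function name `centroid_defuzzify` and the triangular membership function are hypothetical and are not taken from either paper.

```python
import numpy as np

def centroid_defuzzify(x, mu):
    """Centroid of a sampled membership curve on a uniform grid.

    Approximates (integral of x * mu(x)) / (integral of mu(x)) by
    sum(x * mu) / sum(mu); the uniform spacing cancels in the ratio.
    """
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    total = mu.sum()
    if total == 0.0:
        raise ValueError("membership curve is zero everywhere")
    return float((x * mu).sum() / total)

# Usage: a symmetric triangular membership function peaking at x = 5
# (an assumed shape, chosen so the centroid is easy to check by eye).
x = np.linspace(0.0, 10.0, 101)
mu = np.maximum(0.0, 1.0 - np.abs(x - 5.0) / 2.0)
crisp = centroid_defuzzify(x, mu)
```

Because the weights are the membership degrees, values of `x` with higher membership pull the crisp output toward themselves, which is the "weighted average" reading given in the excerpt.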
... In addition, by considering the golden ratio, uncertainties in the decision-making processes can be overcome more successfully. This situation provides a significant advantage to this model over the previously generated ones (Ma et al., 2024; Bose et al., 2024). (iii) Integration of quantum theory and Spherical fuzzy sets is another factor that increases the originality of the model. Thanks to quantum theory, conditions within different possibilities can be taken into consideration. ...
Article
The purpose of this study is to analyze the investment success of renewable energy generation projects design. A novel model has been constructed for this purpose. At the first stage, collaborative filtering methodology is taken into consideration to complete missing evaluations. After that, M-SWARA based on QUSFSs with golden cut is used to compute the weights of these factors. Finally, the components of the service design are ranked by TOPSIS approach. The main contribution of the paper is that a new methodology (M-SWARA) has been created in this study by making improvements to SWARA. With the help of this new model, causal directions between the indicators can also be examined. Similarly, collaborative filtering methodology is taken into consideration to complete missing evaluations. In this process, the decision makers are allowed to leave the questions they wanted blank. This situation is considered as the superiority of the proposed model compared to many previous models in the literature. The findings indicate that cost is the most significant factor for the success of renewable energy investments because it gets the highest weight (.261). The ranking results also demonstrate that product is the most essential component of the service design of renewable energy investments. Therefore, solving the high-cost problem is of vital importance to increase these investments. First, renewable energy companies can reduce costs with more effective financial management. To carry out this process effectively, a finance department consisting of qualified personnel is needed. Thanks to this team, current situations in the financial markets will be better followed and this will play an important role in reducing costs.
Article
This paper aims to revolutionize teaching and learning practices in higher education by optimizing prediction accuracy through recommendation system methods in applied art education. The study explores advanced machine learning techniques, particularly intuitive fuzzy C-means clustering and user interest collaborative filtering (IFCM-UIR-CF). Quantitative analysis reveals significant performance differences among three methods: FCM-CF, IFCM-CF, and IFCM-UIR-CF. FCM-CF starts with a Mean Absolute Error (MAE) of 0.845, improving to 0.750 with 150 neighbors. IFCM-CF starts at 0.810 and decreases to 0.700. IFCM-UIR-CF outperforms both, starting at 0.800 and reaching a minimum MAE of 0.658. These results show that increasing the number of nearest neighbors enhances prediction accuracy, with IFCM-UIR-CF providing the most accurate recommendations. The study emphasizes the significance of user-commodity relationships and decomposition processes in improving recommendation accuracy, offering insights for future research.
Chapter
Fuzzy clustering algorithms have emerged as powerful tools for various image processing tasks, owing to their ability to handle uncertainties and ambiguities inherent in image data. This chapter provides a comprehensive review of recent advancements in fuzzy clustering algorithms for image processing, focusing on applications such as image classification, texture analysis, segmentation of remote sensing images, and object recognition. Specifically, we discuss the principles and applications of fuzzy clustering in image classification, texture analysis, and segmentation tasks, highlighting the advantages and limitations of popular algorithms such as fuzzy C-means (FCM), spatial fuzzy C-means (SFCM), and intuitionistic fuzzy C-means (IFCM). Furthermore, we present a comparative analysis of these algorithms based on their performance metrics and suitability for different image processing tasks. Finally, we identify open challenges and propose potential future research directions in fuzzy clustering for image processing, including handling high-dimensional data, integration with deep learning techniques, scalability, interpretability, and addressing complex image structures.
Article
The classification and processing of multimedia audio and video teaching resources data is crucial for the development of the next generation of multimedia technology. In music courses, the traditional method of merging, classification, and identification of multimedia audio and video teaching resources uses functional traditional fuzzy C-average class methods, and treats the entire document as a systematic research object. However, this method cannot subdivide documents, let alone handle information about multimedia audio and video teaching resources that are specific to music courses. To address this issue, we propose a method of improving the fuzzy C-average polyet algorithm to classify and identify multimedia audio and video teaching resources with dual subtraction backgrounds. First, we use information entropy as a standard for classification and identification, and leverage the nonlinear mapping ability of neural networks to calculate and blur the weight fuzzy C-average polyet algorithm. This approach solves the issues of inaccurate classification and incompetence of classes. For actual test verification, we used five documents, with five documents in each category and three functional items. The results show that the improved fuzzy C-average polyet algorithm can more effectively identify and classify multimedia audio and video teaching resources in classified music courses. It is less characteristic, more distributed in the distribution of multimedia audio teaching resources in random music courses, and has strong convergence and high application value. Overall, this study demonstrates the effectiveness of the proposed method in classifying and identifying multimedia audio and video teaching resources in music courses. The improved fuzzy C-average polyet algorithm can be used as a valuable tool for researchers and practitioners in the field of multimedia technology.
Article
Segmentation of brain MRI images is a challenging task due to spatially distributed noise and the uncertainty present at the boundaries between soft tissues. In this work, we present a probabilistic intuitionistic fuzzy c-means method with spatial neighborhood information, based on intuitionistic fuzzy set theory, for MRI image segmentation. We investigate two well-known negation functions, namely Sugeno’s negation function and Yager’s negation function, for representing the image in terms of intuitionistic fuzzy sets. The proposed approach leverages intuitionistic fuzzy set theory to address the vagueness and uncertainty present in the data. A spatial neighborhood information term is included in the segmentation process to dampen the effect of noise. The segmentation performance of the proposed method is evaluated in terms of average segmentation accuracy and Dice score. Further, the proposed method is compared with other similar state-of-the-art methods on two publicly available brain MRI datasets, showing significant improvements in segmentation performance in terms of average segmentation accuracy and Dice score. The proposed approach achieves an average segmentation accuracy of 91% in the presence of noise and intensity inhomogeneity on the BrainWeb simulated dataset, outperforming the state-of-the-art methods.
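To make the two negation functions named in this abstract concrete, here is a minimal sketch of how a membership degree can be turned into an intuitionistic fuzzy triple (membership, non-membership, hesitation). It uses the standard textbook forms of the Sugeno and Yager complements. The function names, the parameter values `lam = 2.0` and `alpha = 0.5`, and the way hesitation is derived are assumptions made for illustration, not details taken from the cited paper. For the hesitation degree to stay non-negative, the sketch assumes `lam > 0` and `0 < alpha <= 1`.

```python
import numpy as np

def sugeno_negation(mu, lam=2.0):
    """Sugeno complement: N(mu) = (1 - mu) / (1 + lam * mu), assuming lam > 0."""
    mu = np.asarray(mu, dtype=float)
    return (1.0 - mu) / (1.0 + lam * mu)

def yager_negation(mu, alpha=0.5):
    """Yager complement: N(mu) = (1 - mu**alpha)**(1 / alpha), assuming 0 < alpha <= 1."""
    mu = np.asarray(mu, dtype=float)
    return (1.0 - mu ** alpha) ** (1.0 / alpha)

def hesitation(mu, nu):
    """Hesitation degree of an intuitionistic fuzzy set: pi = 1 - mu - nu."""
    return 1.0 - np.asarray(mu, dtype=float) - np.asarray(nu, dtype=float)

# Usage: membership degrees, e.g. normalized pixel intensities in [0, 1].
mu = np.array([0.0, 0.25, 0.5, 1.0])
nu_sugeno = sugeno_negation(mu)        # non-membership via Sugeno
nu_yager = yager_negation(mu)          # non-membership via Yager
pi_sugeno = hesitation(mu, nu_sugeno)  # what is left over as uncertainty
pi_yager = hesitation(mu, nu_yager)
```

Under these parameter choices both generators return 1 at `mu = 0` and 0 at `mu = 1`, and the non-membership never exceeds `1 - mu`, so the remaining hesitation is the extra uncertainty that an ordinary fuzzy set does not represent.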
Article
Fuzzy K-Means clustering (FKM) is one of the most popular methods to partition data into clusters. Traditional FKM and its extensions perform fuzzy clustering based on original high-dimensional features. However, the presence of noisy and redundant features degrades clustering performance. To avoid this problem, we integrate fuzzy clustering and feature selection into a unified model where the structured sparsity-inducing norm is imposed on the transformation matrix to determine the valuable feature subset adaptively. The clustering task and feature selection process are promoted mutually. To solve this model, an iterative algorithm is developed. Extensive experiments conducted on benchmark data sets demonstrate the effectiveness of our proposed method.
Article
Random vector functional link (RVFL) is a widely used powerful model for solving real-life problems in classification and regression. However, the RVFL is not able to reduce the impact of noisy data, despite its high generalization capability. This paper presents a new intuitionistic RVFL classifier (IFRVFLC) for binary classification with the goals of improving the overall classification capability of the RVFL network and increasing its classification efficiency on noisy data sets. In IFRVFLC, each training sample is associated with an intuitionistic fuzzy number which consists of membership or non-membership frames. The membership degree of a pattern considers the distance from the respective class centre. The degree of non-membership, on the other hand, is determined by the ratio of the heterogeneous point number to the total number of neighbouring points. To check the efficiency of the proposed IFRVFLC model, its classification performance is compared with the support vector machine (SVM), twin SVM, kernel ridge regression, extreme learning machine, intuitionistic fuzzy SVM, intuitionistic fuzzy twin SVM and RVFL networks. The obtained results show the usability of the proposed IFRVFLC model.
Article
Circular intuitionistic fuzzy set (C-IFS) is introduced by Atanassov in 2020 as an extension of intuitionistic fuzzy sets. It is represented by a circle with a radius (r) of each element consist of degrees of membership and non-membership. Several MCDM methods based on distance measures of C-IFS are already proposed in the literature. The primary objective of this study is the development, with the use of the C-IFS, of a new formulation of functions to form a novel C-IFS multi-criteria decision making (MCDM) method. In addition to the existing literature, this study contributes to circular intuitionistic fuzzy sets by proposing some formulations on radius calculation and a new defuzzification function for C-IFS. The optimistic and pessimistic points are also defined on the set to identify a novel score function and an accuracy function with decision-makers attitude ( ). When the perspective of the decision-maker ( ) approaches 1, it means that C-IFS is defuzzified close to its optimistic point, and when the perspective ( ) approaches 0, it is defuzzified close to the pessimistic point of C-IFS. With the use of these functions, a novel C-IFS MCDM method is presented based on criteria weighting and alternative ranking algorithms. This technique is applied to a supplier selection problem for a seamless supply chain network. A sensitivity analysis is also performed to test the effect of parameter changes on the final results. The findings of the study are compared with the results of a classical IFS-MCDM model. Since C-IFS is an extension of IFS, in addition to similar rankings, more precise results are obtained by considering the optimistic and pessimistic points by including the decision-maker attitude in the functions proposed for C-IFS. The study is a pioneer in the C-IFS literature by presenting C-IFS defuzzification function and a new C-IFS MCDM procedure.
Article
The fuzzy c-mean (FCM) clustering algorithm is a typical algorithm using Euclidean distance for data clustering and it is also one of the most popular fuzzy clustering algorithms. However, FCM does not perform well in noisy environments due to its possible constraints. To improve the clustering accuracy of item varieties, an improved fuzzy c-mean (IFCM) clustering algorithm is proposed in this paper. IFCM uses the Euclidean distance function as a new distance measure which can give small weights to noisy data and large weights to compact data. FCM, possibilistic C-means (PCM) clustering, possibilistic fuzzy C-means (PFCM) clustering and IFCM are run to compare their clustering effects on several data samples. The clustering accuracies of IFCM in five datasets IRIS, IRIS3D, IRIS2D, Wine, Meat and Apple achieve 92.7%, 92.0%, 90.7%, 81.5%, 94.2% and 88.0% respectively, which are the highest among the four algorithms. The final simulation results show that IFCM has better robustness, higher clustering accuracy and better clustering centers, and it can successfully cluster item varieties.
Article
Distribution of snow and its melting is a critical factor affecting local weather, avalanche and flood forecasting, livelihood of people residing, and hydropower production. Most of the existing dry and wet snow identification methods were based on expensive quad-pol synthetic aperture radar (SAR) with finite generalizability, while dual-pol SAR with larger coverage, longer time series, and open availability has more advantages. In this study, an unsupervised algorithm for dry and wet snow discrimination, neighborhood-based sparse autoencoder (NSAE)-weighted fuzzy C-means clustering (WFCM), is proposed based on a variety of polarimetric features derived from the H-α decomposition in the dual-pol mode using the C-band Sentinel-1 SAR data. NSAE-WFCM constructs a deep training network using the pixel NSAE to optimize polarimetric parameters and inputs reconstructed features with different weights into feature-WFCM to distinguish dry and wet snow for each underlying surface. Ground observation was carried out during the snow melting period of March 2021 in Altay, China, to validate the dual-pol NSAE-WFCM method with an overall accuracy and a Kappa coefficient of 88.8% and 0.68, respectively. The results show that NSAE-WFCM’s accuracy is similar to that of the quad-pol SAR-based dry and wet snow result (90.0%) and significantly better than that of previously published approaches extended to dual-pol SAR, such as support vector machine (SVM) (76.7%), H-α-Wishart (65.5%), total power-based method (51.7%), and wet snow-based method (43.1%). Therefore, the NSAE-WFCM algorithm improves the ability to classify wet and dry snow based on dual-pol polarimetric features, overcomes the high dependence of existing methods on quad-pol SAR data, and reduces manual interpretation by using unsupervised clustering.