Generalized Fuzzy c-Means Clustering Strategies
Using $L_p$ Norm Distances
Richard J. Hathaway, Member, IEEE, James C. Bezdek, Fellow, IEEE, and Yingkang Hu
Abstract—Fuzzy c-means (FCM) is a useful clustering technique. Recent modifications of FCM using $L_1$ norm distances increase robustness to outliers. Object and relational data versions of FCM clustering are defined for the more general case where the $L_p$ norm ($p \ge 1$) or semi-norm ($0 < p < 1$) is used as the measure of dissimilarity. We give simple (though computationally intensive) alternating optimization schemes for all object data cases of $p > 0$ in order to facilitate the empirical examination of the object data models. Both object and relational approaches are included in a numerical study.
Index Terms—Clustering, fuzzy c-means, $L_p$ norm, outlier.
I. INTRODUCTION
THE fuzzy c-means (FCM) algorithm [1] has successfully
been applied to a wide variety of clustering problems [2].
This approach partitions a set of object data $X = \{x_1, x_2, \ldots, x_n\} \subset \mathbb{R}^s$ into $c$ (fuzzy) clusters based on a computed minimizer of the fuzzy within-group least squares functional

$$J_m(U, V) = \sum_{i=1}^{c} \sum_{k=1}^{n} (u_{ik})^m \, \|x_k - v_i\|^2 \qquad (1)$$

where
$m > 1$ is the fuzzification parameter;
$v_i \in \mathbb{R}^s$ is the prototype (or mean) of the $i$th cluster;
$u_{ik} \in [0, 1]$ is the degree to which datum $x_k$ belongs to the $i$th cluster;
$V = [v_1 \; v_2 \; \cdots \; v_c] \in \mathbb{R}^{s \times c}$ is the matrix of cluster prototypes;
$U = [u_{ik}] \in \mathbb{R}^{c \times n}$ is the partition matrix; and
$\|\cdot\|^2$ is the Euclidean or 2-norm squared.
For later notational convenience, we will array the object data as columns in the object data matrix $X = [x_1 \; x_2 \; \cdots \; x_n] \in \mathbb{R}^{s \times n}$. The partition matrix $U$ is a convenient tool for representing cluster structure in the data $X$;
Manuscript received September 6, 1999; revised May 15, 2000. The work of R. J. Hathaway was supported by the ONR under Grant 00014-96-1-0642 and in part by the Faculty Research Subcommittee of Georgia Southern University. The work of J. C. Bezdek was supported by the ONR under Grant 00014-96-1-0642. The work of Y. Hu was supported in part by the Faculty Research Subcommittee of Georgia Southern University.
R. J. Hathaway and Y. Hu are with the Mathematics and Computer Science
Department, Georgia Southern University, Statesboro, GA 30460 USA (e-mail:
hathaway@ieee.org).
J. C. Bezdek is with the Department of Computer Science, University of West
Florida, Pensacola, FL 32514 USA.
we define the set $M_{fcn}$ of all nondegenerate fuzzy partition matrices for partitioning $n$ data into $c$ clusters as

$$M_{fcn} = \Big\{ U \in \mathbb{R}^{c \times n} : u_{ik} \in [0,1] \;\; \forall\, i, k; \;\; \sum_{i=1}^{c} u_{ik} = 1 \;\; \forall\, k; \;\; \sum_{k=1}^{n} u_{ik} > 0 \;\; \forall\, i \Big\}. \qquad (2)$$
The most popular and effective method of optimizing (1) is the fuzzy c-means algorithm, which alternates between optimizations of $J_m$ over $V$ with $U$ fixed and of $J_m$ over $U$ with $V$ fixed, producing a sequence $\{(U^{(t)}, V^{(t)})\}$. Specifically, the $(t+1)$st value of $V = [v_1 \; \cdots \; v_c]$ is computed using the $t$th value of $U$ in the right-hand side of

$$v_i^{(t+1)} = \frac{\sum_{k=1}^{n} \big(u_{ik}^{(t)}\big)^m x_k}{\sum_{k=1}^{n} \big(u_{ik}^{(t)}\big)^m}, \quad \text{for } 1 \le i \le c. \qquad (3)$$

Then the updated $(t+1)$st value of $V$ is used to calculate the $(t+1)$st value of $U$ via

$$u_{ik} = \frac{(D_{ik})^{1/(1-m)}}{\sum_{j=1}^{c} (D_{jk})^{1/(1-m)}}, \quad \text{where} \qquad (4)$$

$$D_{ik} = \|x_k - v_i\|^2 \quad \text{for } 1 \le i \le c \text{ and } 1 \le k \le n. \qquad (5)$$

The FCM iteration is initialized using some $U^{(0)} \in M_{fcn}$ (or possibly $V^{(0)}$) and continues by alternating the updates in (3) and (4) until the difference, measured in any norm on $\mathbb{R}^{c \times n}$ (or $\mathbb{R}^{s \times c}$), between successive partition matrices $U$ (or matrices $V$) is less than some prescribed tolerance $\varepsilon > 0$.
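To make the alternation in (3)-(5) concrete, here is a minimal numpy sketch. The random initialization, the iteration cap, and all identifiers are our own illustrative choices; the paper prescribes only the updates and the stopping test.

```python
import numpy as np

def fcm(X, c, m=2.0, eps=1e-5, max_iter=300, seed=0):
    """Sketch of FCM alternating optimization per (3)-(5).

    X is s-by-n with the data as columns, as in the paper. Returns the
    terminal partition matrix U (c-by-n) and prototype matrix V (s-by-c).
    """
    s, n = X.shape
    rng = np.random.default_rng(seed)
    U = rng.random((c, n))
    U /= U.sum(axis=0)                      # column sums are 1, so U is in M_fcn
    for _ in range(max_iter):
        W = U ** m                          # (u_ik)^m
        V = (X @ W.T) / W.sum(axis=1)       # prototype update (3)
        # D_ik = ||x_k - v_i||^2 per (5); shape c-by-n
        D = ((X[:, None, :] - V[:, :, None]) ** 2).sum(axis=0)
        D = np.maximum(D, 1e-12)            # guard: prototype coinciding with a datum
        Dm = D ** (1.0 / (1.0 - m))
        U_new = Dm / Dm.sum(axis=0)         # membership update (4)
        if np.abs(U_new - U).max() < eps:   # sup-norm stopping test
            return U_new, V
        U = U_new
    return U, V
```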
While FCM has proven itself to be very useful, the quality of the computed cluster centers $v_1, \ldots, v_c$ can sometimes be degraded due to the effects of outliers in the data set. This occurs because $D_{ik} = \|x_k - v_i\|^2$, the datum-to-prototype dissimilarity term in (1), can place considerable weight on outlying data points, thus pulling cluster prototypes away from the "center" or main distribution of the (nonoutlying) cluster.

There are a number of useful approaches for controlling the harmful effects of outlying data, including the possibilistic clustering approach of Krishnapuram and Keller [3] and the fuzzy noise-clustering approach of Dave [4]. Most important to this note is the work of Kersten [5]–[7] and Miyamoto and Agusta [8], who independently suggested replacing $\|x_k - v_i\|^2$ with the $L_1$ distance $\|x_k - v_i\|_1$ in the FCM functional in order
to increase robustness against outlying data. Earlier work that uses the $L_1$ norm or its square in FCM-based clustering appears in Bobrowski and Bezdek [9] and Jajuga [10]. Bobrowski and Bezdek [9] also gave a method for optimizing $J_m$ when the square of the sup norm is used in (1).
In this note, we examine FCM-based clustering using general $L_p$ norm distances, where the $L_p$ norm of the $s$-dimensional real vector $y$ is defined as $\|y\|_p = \big(\sum_{j=1}^{s} |y_j|^p\big)^{1/p}$. In Section II, we present an object-data strategy for using $L_p$ norms due to Miyamoto and Agusta [11], [12] and Overstreet [13]. Additionally, we describe how a relational data approach can be taken using the non-Euclidean relational fuzzy c-means (NERFCM) algorithm of Hathaway and Bezdek [14]. The object-data approach operates directly, and solely, on the object data matrix $X$. The relational approach clusters $X$ indirectly through the use of derived dissimilarity data $R = [R_{kl}]$, where $R_{kl}$ is some measure of the dissimilarity between $x_k$ and $x_l$. The two strategies will be compared using numerical examples in Section III. The final section summarizes our findings.
II. EXTENSIONS OF FUZZY c-MEANS
A. Object-Data Strategy
This approach is based on a direct modification of the fuzzy c-means functional $J_m$. The generalization of (1) that we consider here was originally proposed by Miyamoto and Agusta [11] and later extended by Overstreet [13]. The objective function is

$$J_{m,p}(U, V) = \sum_{i=1}^{c} \sum_{k=1}^{n} (u_{ik})^m \, \|x_k - v_i\|_p^p. \qquad (6)$$

The optimization of (6) is relatively straightforward, and the choice of $p$ has considerable effect on the influence of outliers and other properties of the representation of the clusters. We see that $J_{m,2}$ is the original FCM functional and $J_{m,1}$ is the more robust functional used by Kersten [5]–[7]. Miyamoto and Agusta [11], [12] consider this model in general for $p \ge 1$, and the range of $p$ is extended to include $0 < p < 1$ in Overstreet [13]. The object-data approach considered in this correspondence is based on iterative minimization of $J_{m,p}$.
The optimization of (6) for general $p$ is more complicated and costly than the optimization of the special case $p = 2$ (FCM). However, as with FCM, optimization can be done by alternating separate optimizations over the $U$ and $V$ variables. The ($U$-variable) minimizer of $J_{m,p}$ over $M_{fcn}$ (for a fixed $V$) is given by (4) using the datum-to-prototype dissimilarities

$$D_{ik} = \|x_k - v_i\|_p^p \quad \text{for } 1 \le i \le c \text{ and } 1 \le k \le n. \qquad (7)$$

Appropriate methods of computing the $V$-variable minimizer of $J_{m,p}$ (for a fixed $U \in M_{fcn}$) depend on the value of $p$, but in all cases this optimization can be decoupled into $cs$ independent univariate minimizations of functions of the form

$$f(y) = \sum_{k=1}^{n} (u_{ik})^m \, |x_{kj} - y|^p \quad \text{for } 1 \le i \le c \text{ and } 1 \le j \le s, \qquad (8)$$

where the minimizer of (8) over $y \in \mathbb{R}$ gives the $j$th coordinate $v_{ij}$ of the prototype $v_i$.
The geometric form of $f$ is nonconvex for $0 < p < 1$, with a cusp at each datum value $x_{kj}$. For $p = 1$, $f$ is convex and piecewise linear, with a corner at each $x_{kj}$. The function is differentiable and strictly convex if $p > 1$. The three types of shapes are illustrated in Fig. 1.

Fig. 1. An illustration of the possible shapes of $f$ in (8).

We choose to minimize $f$ in (8) in the simplest possible way since our emphasis here is on understanding the properties of the clusterings and not on computational efficiency. For $0 < p \le 1$, we note that $f$ takes its minimum value over $\mathbb{R}$ at one (or more) elements of the set $\{x_{1j}, \ldots, x_{nj}\}$. For $0 < p < 1$, the minimizing value of $y$ is simply taken to be the smallest of the $x_{kj}$ values which globally minimize $f$ over $\{x_{1j}, \ldots, x_{nj}\}$. The mean of the smallest and largest globally minimizing values is used for the special case of $p = 1$. For $p > 1$, the computed value of $y$ is taken to be a numerical approximation to the unique zero of

$$f'(y) = p \sum_{k=1}^{n} (u_{ik})^m \, \operatorname{sgn}(y - x_{kj}) \, |x_{kj} - y|^{p-1},$$

obtained here using the method of bisection.
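A minimal Python sketch of this univariate step, under the rules just stated (exhaustive search over the data coordinates for $0 < p \le 1$, with the smallest-minimizer and midpoint conventions, and bisection on $f'$ for $p > 1$), follows. The function name and tolerance are ours; the paper gives no implementation.

```python
import numpy as np

def minimize_f(coords, weights, p, tol=1e-10):
    """Minimize f(y) = sum_k weights[k] * |coords[k] - y|**p over real y, per (8)."""
    coords = np.asarray(coords, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if p <= 1.0:
        # f attains its minimum at one or more of the data coordinates.
        vals = np.array([np.sum(weights * np.abs(coords - y) ** p) for y in coords])
        minimizers = coords[np.isclose(vals, vals.min())]
        if p < 1.0:
            return minimizers.min()                         # smallest global minimizer
        return 0.5 * (minimizers.min() + minimizers.max())  # p = 1: midpoint rule
    # p > 1: f is strictly convex, so bisect on the sign of f'.
    def fprime(y):
        return p * np.sum(weights * np.sign(y - coords) * np.abs(coords - y) ** (p - 1))
    lo, hi = coords.min(), coords.max()   # the minimizer lies within the data range
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if fprime(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```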
We summarize the object-data strategy. It consists of alternating optimizations of $J_{m,p}$ in (6) between the $U$ and $V$ variables. The optimization over the $U$ variable is accomplished using (7) in (4), and the optimization over the $V$ variable is decoupled into $cs$ univariate optimizations of functions of the form (8). The univariate optimizations are essentially done using exhaustive search over $\{x_{1j}, \ldots, x_{nj}\}$ for $0 < p \le 1$ and bisection on $f'$ for $p > 1$. (Exhaustive search, which is necessary for $0 < p < 1$, is prohibitively expensive for sufficiently large data sets, and we therefore acknowledge a practical limitation to the usefulness of this model in some cases. We repeat that the emphasis here is on studying the clustering solutions produced by the various models.) The alternating optimization is continued until successive partitions are within $\varepsilon > 0$ of each other, as measured by the sup norm of $U^{(t+1)} - U^{(t)}$. It is important to understand that the independence of the components of $V$ (for fixed $U$) in (6) allows the minimizing $V$ to be calculated using only $cs$ univariate optimizations, and the result so obtained does optimize $J_{m,p}$ over $V$ for fixed $U$. A sketch of the full procedure appears below.
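The sketch below assembles the full object-data strategy from the pieces above: the $U$ update (4) with the dissimilarities (7), and the $V$ update via $cs$ calls to the `minimize_f` routine from the previous sketch. Initialization and the iteration cap are again our own illustrative choices.

```python
def lp_fcm(X, c, p, m=2.0, eps=1e-5, max_iter=100, seed=0):
    """Sketch of the object-data L_p strategy: alternate (4)+(7) with (8)."""
    s, n = X.shape
    rng = np.random.default_rng(seed)
    U = rng.random((c, n))
    U /= U.sum(axis=0)
    V = np.zeros((s, c))
    for _ in range(max_iter):
        # V update: cs independent univariate minimizations of (8)
        W = U ** m
        for i in range(c):
            for j in range(s):
                V[j, i] = minimize_f(X[j, :], W[i, :], p)
        # U update: (4) with D_ik = ||x_k - v_i||_p^p from (7)
        D = (np.abs(X[:, None, :] - V[:, :, None]) ** p).sum(axis=0)
        D = np.maximum(D, 1e-12)
        Dm = D ** (1.0 / (1.0 - m))
        U_new = Dm / Dm.sum(axis=0)
        if np.abs(U_new - U).max() < eps:   # sup-norm stopping test
            return U_new, V
        U = U_new
    return U, V
```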
B. Relational-Data Strategy
The relational-data strategy uses the non-Euclidean relational fuzzy c-means (NERFCM) algorithm in [14]. In essence, NERFCM is a safeguarded version of RFCM, which is the relational dual of fuzzy c-means. RFCM produces the FCM clustering of $X$ indirectly, using the relational data $R_{kl} = \|x_k - x_l\|^2$, $1 \le k, l \le n$. Given some $n \times n$ matrix of relational data $R$, the NERFCM algorithm iteratively generates a sequence of partition matrices $U^{(t)}$ according to the following steps. The current $U$ matrix is used to calculate the vectors $c_1, \ldots, c_c \in \mathbb{R}^n$ according to

$$c_i = \frac{\big( (u_{i1})^m, (u_{i2})^m, \ldots, (u_{in})^m \big)^T}{\sum_{k=1}^{n} (u_{ik})^m}, \quad \text{for } 1 \le i \le c. \qquad (9)$$

These vectors are then used to calculate new dissimilarities according to

$$D_{ik} = (R c_i)_k - \tfrac{1}{2}\, c_i^T R c_i \quad \text{for } 1 \le i \le c \text{ and } 1 \le k \le n. \qquad (10)$$

If necessary, the dissimilarities are altered to guarantee positivity ([14]) and then they are used in (4) to generate the new $U$ iterate. The iteration is continued until the sup-norm difference in successive $U$ matrices is sufficiently small.

The relational data approach for an $L_p$ extension of FCM consists of applying NERFCM with the $L_p$-based relational data $R_{kl} = \|x_k - x_l\|_p^p$, $1 \le k, l \le n$.
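A bare-bones sketch of this relational iteration follows. For simplicity it omits the positivity safeguard ("spread" adjustment) of [14], so it is really the RFCM core applied to the $L_p$-based relational data rather than full NERFCM; the clipping used below is a crude stand-in for that safeguard, not the method of [14].

```python
def relational_lp_fcm(X, c, p, m=2.0, eps=1e-5, max_iter=100, seed=0):
    """Sketch of the relational strategy: RFCM core on R_kl = ||x_k - x_l||_p^p."""
    s, n = X.shape
    # L_p^p dissimilarity matrix, n-by-n
    R = (np.abs(X[:, :, None] - X[:, None, :]) ** p).sum(axis=0)
    rng = np.random.default_rng(seed)
    U = rng.random((c, n))
    U /= U.sum(axis=0)
    for _ in range(max_iter):
        W = U ** m
        C = W / W.sum(axis=1, keepdims=True)   # row i is c_i^T, per (9)
        RC = C @ R                             # row i holds (R c_i)^T (R symmetric)
        D = RC - 0.5 * np.sum(C * RC, axis=1, keepdims=True)   # (10)
        D = np.maximum(D, 1e-12)               # crude stand-in for the NERFCM safeguard
        Dm = D ** (1.0 / (1.0 - m))
        U_new = Dm / Dm.sum(axis=0)            # membership update (4)
        if np.abs(U_new - U).max() < eps:
            return U_new
        U = U_new
    return U
```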
This approach will produce a terminal partition matrix $U^*$, which attempts to represent the cluster memberships for $x_1, \ldots, x_n$, but it does not directly provide cluster prototypes $v_1, \ldots, v_c$. We recover meaningful prototypes by using the terminal partition $U^*$ with $X$ and solving

$$\underset{V}{\text{minimize}} \;\; J_{m,p}(U^*, V). \qquad (11)$$
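Since (11) is, for fixed $U^*$, exactly the decoupled $V$ step of the object-data strategy, the univariate routine sketched earlier can be reused:

```python
def recover_prototypes(X, U_star, p, m=2.0):
    """Solve (11): coordinatewise minimization of J_{m,p}(U*, V) over V."""
    s, n = X.shape
    c = U_star.shape[0]
    W = U_star ** m
    V = np.zeros((s, c))
    for i in range(c):
        for j in range(s):
            V[j, i] = minimize_f(X[j, :], W[i, :], p)
    return V
```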
We remark that the duality theory (in [14] and [15]) that guarantees that the same solution is found using the object and relational versions of FCM holds only for $p = 2$. At other values of $p$, it is possible that the object data version and its relational derivative yield different $(U, V)$ pairs for the same choices of common algorithmic parameters. One purpose for trying a relational approach for $p \ne 2$ is to discover any general similarities between the object and relational approaches that extend past the limited duality theory. Theoretical convergence of the object data algorithms for $p \ge 1$ is shown in [12]; existing convergence theory ([14]) for the relational approach only covers the case of $p = 2$.
Fig. 2. “Three cluster data” scatterplot and initial prototypes.
III. NUMERICAL EXPERIMENTS
In all experiments we chose $m = 2$ and stopped the iteration as soon as all corresponding elements (i.e., the $u_{ik}$) in a successive pair of $U$ matrices differed in absolute value by less than 0.00001. The first experiment uses the data set in Fig. 2, which consists of three (25 point) radial clusters, centered as indicated, and a varying number of outliers located at the indicated positions. The purpose of this experiment is to investigate the sensitivity to outliers of the object and relational approaches for various values of $p$. The leftmost column of Table I gives the number of outliers included with the three clusters in the sample. The number of outliers is even, and the outliers are evenly divided between the two positions shown in Fig. 2. All iteration sequences are initialized using a hard partition that correctly partitions the three clusters and groups each outlier with its nearest cluster. For a computed set of terminal prototype vectors $v_1$, $v_2$, and $v_3$, we measure the sensitivity to the outliers as the Frobenius norm distance between the true centers $\bar{V}$ and the terminal prototype vectors $V$: $\|\bar{V} - V\|_F$, where the Frobenius norm of a matrix $A = [a_{ij}]$ is defined by $\|A\|_F = \big( \sum_i \sum_j a_{ij}^2 \big)^{1/2}$.
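For reference, this error measure is a one-liner in numpy (the function name is ours):

```python
def frobenius_error(V, V_true):
    """Frobenius norm distance ||V_true - V||_F between prototype matrices."""
    return float(np.sqrt(((V_true - V) ** 2).sum()))
```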
Note the effects of increasing numbers of outliers as we move down the rows of Table I. We see no deterioration in the quality of the terminal prototype vectors for as many as 24 outliers for both the object and relational approaches with $p \le 1$. The last few rows indicate that this resistance to outliers is actually slightly greater for $p < 1$ than for $p = 1$. For any value of $p > 1$, the deviation of the computed prototypes from the true cluster centers steadily increases with the number of outliers. The object and relational data results are quite comparable for $p$ near two and, as predicted by duality theory, they produce identical results for $p = 2$. For $p = 1$, the object and relational prototypes eventually vary from the true centers for sufficiently large numbers of outliers, but fewer outliers are required to cause substantial deviation for the relational data approach. Based on the results of this experiment, the object data approach for $p = 1$ offers the greatest robustness, and efficient implementational approaches for it are discussed in [7].
TABLE I
DEVIATION OF COMPUTED FROM TRUE CLUSTER CENTERS, $\|\bar{V} - V\|_F$
Fig. 3. “Two cluster data” scatterplot and initial prototypes.
The results of the first experiment show that, in at least some cases, the errors produced by the object and relational approaches are of a similar magnitude. Is it also true that the computed prototypes and partition matrices of the two approaches are themselves very similar? The remaining numerical experiments use other artificial two-dimensional data sets that allow us to graphically depict the effect of $p$ on the placement of the terminal prototype vectors. The two data sets (and initial prototype values) are depicted in Figs. 3 and 4 and are, respectively, called the "two cluster data" and "no cluster data." Using identical data values, initializations, and stopping criteria, we calculate $(U_r, V_r)$ and $(U_o, V_o)$ using the relational and object data approaches, respectively. We calculated the Frobenius norm differences in the terminal partitions and prototypes produced by the two approaches as $\|U_r - U_o\|_F$ and $\|V_r - V_o\|_F$. These differences are given for a range of $p$ values using the "two cluster data" and "no cluster data" in Table II.
Fig. 4. “No cluster data” scatterplot and initial prototypes.
TABLE II
FROBENIUS NORM DIFFERENCE IN TERMINAL OBJECT AND RELATIONAL $U$ AND $V$ VALUES
Duality theory for NERFCM ([14], [15]) guarantees that the difference is zero when $p = 2$, but note that it is reasonably small for $p$ near two. Note also how the partition difference $\|U_r - U_o\|_F$ is sometimes much greater than the prototype difference $\|V_r - V_o\|_F$. We also see from this table that the relational and object-based results can be very different for the important case of $p = 1$.
The remaining figures in this section graphically depict the position of the computed terminal prototypes for a range of $p$ values. Fig. 5 shows the results for the "two cluster data" obtained by the two approaches. Note that the outliers have increasing effect as $p$ increases from one to two. As $p$ continues to increase above two, the outlying data has an even more powerful draw on the prototypes, which move ever nearer to the approximate center of the figure. Note that, for the largest values of $p$ tested, the terminal prototypes produced by the object-data approach collapse into coincident clusters. Because the "two cluster data" set has vertical symmetry, so do $U$ and $V$ from either approach.
Fig. 6 shows a similar experiment for the relational-data approach applied to the "no cluster data." While this is not a "clustering" example, we used it to better understand the behavior of the methods. Surprisingly, we observed coincident prototypes at the center of the data for small values of $p$ and nearly coincident prototypes for large values; the prototypes are most different and farthest from the center when $p = 2$. The behavior of the object-data approach on this example was similar in that $p = 2$ gave the most separated prototype values.
IV. DISCUSSION
We described relational and object-data approaches for generating $L_p$ norm extensions of FCM. Also, we examined the behavior of the approaches for various $p$ values using artificial data sets. We believe that the two most useful models are object data based and correspond to $p = 1$ and $p = 2$. For $p = 2$, the fuzzy c-means algorithm in [1] offers the least expensive clustering technique of all, and it works very well in most cases. For cases where noisy data and outliers may degrade FCM results, we recommend the use of the object data model with $p = 1$, optimized using the fuzzy c-medians algorithm described in [7]. The relational data approach is best saved for cases when object data is unavailable or, in special cases for $p = 2$, when the dimension of the feature space is very high but the number of data is small. (In this case, it may be computationally cheaper to form $R$ and operate on it rather than on the original feature data.) We believe the relational approach for $p = 1$ exhibits some robustness properties, but overall we view it as inferior to the object data approach of Kersten [7]. Our experiments always used $m = 2$, but we believe the importance of $p = 1$ and $p = 2$ holds for any choice of the fuzzification constant.
Fig. 5. “Two cluster data” terminal prototypes for the object and relational approaches.
Fig. 6. “No cluster data” terminal prototypes for the relational-data approach.
Choices for $p$ other than one or two lead to models which can provide good clustering results and possibly classifier designs, but the models are more difficult to optimize in the object data case. For $p$ values near two, the results obtained using the object and relational approaches are quite similar. However, it now appears that the existing duality theory stated in [14] is complete; that is, the object and relational approaches have a strict duality relationship only when $p = 2$. As $p$ values increase above one, the attraction of terminal prototype vectors to outliers increases. The empirical migration of the prototypes to the approximate center of the data sets as $p$ increases is interesting to us, but no illuminating result regarding this has been obtained. A partial analysis of the relational case indicates that there is still a strong dependence on the point of initialization, even as $p$ increases without bound. For example, consider clustering $X = \{x_1, \ldots, x_n\}$ into two clusters. It is not hard to show that as $p \to \infty$, the relational based approach is essentially equivalent to that done using NERFCM on a (suitably rescaled) $n \times n$ matrix that is zero except at the entries corresponding to the most widely separated pair of points. Simple numerical experiments with different crisp initializations show that different solutions are possible.
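A quick numeric check of this limiting behavior (the data and the scaling by the largest entry are our own illustrative choices):

```python
import numpy as np

# After scaling by its largest entry, the L_p^p relational matrix
# concentrates on the most widely separated pair of points as p grows.
X = np.array([[0.0, 1.0, 2.0, 10.0],
              [0.0, 0.5, 0.0,  0.0]])          # four points in the plane (ours)
for p in (1, 2, 8, 32):
    R = (np.abs(X[:, :, None] - X[:, None, :]) ** p).sum(axis=0)
    print(p, np.round(R / R.max(), 3))         # tends toward a 0/1 indicator matrix
```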
Because of this, we believe there may not be a nice theoretical result regarding the limiting position of the prototypes as $p \to \infty$. We conclude by posing one last question. Why is $p = 2$ special in the sense demonstrated by Fig. 6, and does this mean that the original FCM is in some sense optimal as a quantization tool?
ACKNOWLEDGMENT
The authors would like to thank the editor, associate editor,
and all reviewers for their helpful suggestions for improving this
manuscript.
REFERENCES
[1] J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algo-
rithms. New York: Plenum, 1981.
[2] J. C. Bezdek, J. M. Keller, R. Krishnapuram, and N. R. Pal, Fuzzy
Models and Algorithms for Pattern Recognition and Image Pro-
cessing. Norwell, MA: Kluwer, 1999.
[3] R. Krishnapuram and J. M. Keller, “A possibilistic approach to clus-
tering,” IEEE Trans. Fuzzy Syst., vol. 1, pp. 98–110, May 1993.
[4] R. N. Dave, “Characterization and detection of noise in clustering,” Pat-
tern Recogn. Lett., vol. 12, pp. 657–664, 1991.
[5] P. R. Kersten, “The fuzzy median and the fuzzy MAD,” in Proc. ISUMA-
NAFIPS, College Park, MD, 1995, pp. 85–88.
[6] P. R. Kersten, “Fuzzy order statistics and their application to fuzzy clustering,” IEEE Trans. Fuzzy Syst., vol. 7, pp. 708–712, Dec. 1999.
[7] P. R. Kersten, “Implementing the fuzzy c-medians clustering algorithm,” in Proc. IEEE Conf. Fuzzy Syst., Barcelona, Spain, 1997, pp. 957–962.
[8] S. Miyamoto and Y. Agusta, “An efficient algorithm for $L_1$ fuzzy c-means and its termination,” Contr. Cybern., vol. 25, pp. 421–436, 1995.
[9] L. Bobrowski and J. Bezdek, “C-means clustering with the $L_1$ and $L_\infty$ norms,” IEEE Trans. Syst., Man, Cybern., vol. 21, pp. 545–554, May/June 1991.
[10] K. Jajuga, “$L_1$-norm based fuzzy clustering,” Fuzzy Sets Syst., vol. 39, pp. 43–50, 1991.
[11] S. Miyamoto and Y. Agusta, “Efficient algorithms for fuzzy c-means
and their termination properties,” in Proc. 5th Conf. Int. Federation
Classification Soc., Kobe, Japan, Mar. 1996, pp. 255–258.
[12] S. Miyamoto and Y. Agusta, “Algorithms for $L_1$ and $L_p$ fuzzy c-means and their convergence,” in Studies in Classification, Data Analysis, and Knowledge Organization: Data Science, Classification, and Related Methods, C. Hayashi, N. Ohsumi, K. Yajima, Y. Tanaka, H. H. Bock, and Y. Baba, Eds. Tokyo, Japan: Springer-Verlag, 1998, pp. 295–302.
[13] D. D. Overstreet, “Generalized fuzzy c-means clustering,” M.S. thesis
in mathematics, Georgia Southern Univ., Statesboro, GA, June 1998.
[14] R. J. Hathaway and J. C. Bezdek, “NERF c-means: Non-Euclidean re-
lational fuzzy clustering,” Pattern Recogn., vol. 27, pp. 429–437, 1994.
[15] R. J. Hathaway and J. C. Bezdek, “Optimization of clustering criteria by reformulation,” IEEE Trans. Fuzzy Syst., vol. 3, pp. 241–245, May 1995.
Richard J. Hathaway (M’93) (“BIG Rivets”) re-
ceived the B.S. degree in applied math from the
University of Georgia, Athens, in 1979, and the
Ph.D. degree in mathematical sciences from Rice
University, Houston, TX, in 1983.
He is currently a Professor in the Mathematics and
Computer Science Department, Georgia Southern
University, Statesboro. His research interests include
pattern recognition and numerical optimization.
James C. Bezdek (M’90–SM’90–F’92) (affection-
ately known as “Jackhammer” by coworkers in the
reviewing construction trades) received the B.S.C.E.
degree from the University of Nevada, Reno, in 1969,
and the Ph.D. degree in applied math from Cornell
University, Ithaca, NY, in 1973.
He is currently a Professor in the Computer
Science Department, University of West Florida,
Pensacola. His interests include woodworking, fine
cigars, optimization, motorcycles, pattern recog-
nition, gardening, fishing, image processing, snow
skiing, computational neural networks, blues music, woodworking, and com-
putational medicine.
Dr. Bezdek is the founding editor of the International Journal of Approximate Reasoning and the IEEE TRANSACTIONS ON FUZZY SYSTEMS. He is a fellow of IFSA.
Yingkang Hu (“Grinder”) received the B.S. degree
in mathematics from Beijing University of Chemical
Engineering, China, in 1982, and the Ph.D. degree in
mathematics from the University of South Carolina,
Columbia, in 1989.
He is a Professor in the Mathematics and
Computer Science Department, Georgia Southern
University, Statesboro. His research interests include
approximation theory, numerical computation, and
pattern recognition.
A concept of ‘Noise Cluster’ is introduced such that noisy data points may be assigned to the noise class. The approach is developed for objective functional type (K-means or fuzzy K-means) algorithms, and its ability to detect ‘good’ clusters amongst noisy data is demonstrated. The approach presented is applicable to a variety of fuzzy clustering algorithms as well as regression analysis.