A survey of hierarchical clustering algorithms
Marjan Kuchaki Rafsanjani
Zahra Asghari Varzaneh
Nasibeh Emami Chukanlo
Abstract
Keywords
2010 Mathematics Subject Classification: Primary 91C20; Secondary 62D05.
1. Introduction
2. Clustering process
Let X = {x1, x2, ..., xN} be the set of data points. An m-clustering of X is a partition of X into m clusters C1, ..., Cm such that every cluster is non-empty (Ci ≠ ∅ for i = 1, ..., m), the clusters together cover the data set (C1 ∪ C2 ∪ ... ∪ Cm = X), and distinct clusters are disjoint (Ci ∩ Cj = ∅ for i ≠ j).
Fig. 1.
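As a minimal, illustrative sketch of the definition above (not code from the paper; the function name is ours), the following Python snippet checks whether a proposed grouping of a data set X satisfies the three conditions of an m-clustering.

```python
from itertools import combinations

def is_m_clustering(X, clusters):
    """Check the three partition conditions of an m-clustering of X:
    every cluster is non-empty, the clusters cover X, and they are
    pairwise disjoint."""
    non_empty = all(len(c) > 0 for c in clusters)
    covers_x = set().union(*clusters) == set(X) if clusters else not X
    disjoint = all(not (a & b) for a, b in combinations(map(set, clusters), 2))
    return non_empty and covers_x and disjoint

# Example: X = {1,...,6} partitioned into m = 3 clusters.
X = {1, 2, 3, 4, 5, 6}
print(is_m_clustering(X, [{1, 2}, {3, 4}, {5, 6}]))    # True
print(is_m_clustering(X, [{1, 2}, {2, 3}, {4, 5, 6}]))  # False: clusters overlap
```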
3. Clustering algorithms
Sequential algorithms: fast, straightforward methods that produce a single clustering, usually by presenting the data to the algorithm once or a few times; the final result generally depends on the order in which the data are presented.
Hierarchical clustering algorithms: instead of a single partition, these methods produce a hierarchy of nested clusterings and are further divided into agglomerative and divisive schemes.
Agglomerative algorithms (bottom-up, merging): start with each data point in its own cluster and, at each step, merge the two closest clusters, producing a sequence of clusterings with a decreasing number of clusters m until a single cluster remains or a stopping criterion is met.
Divisive algorithms (top-down, splitting): start with a single cluster containing all the data points and, at each step, split an existing cluster, producing a sequence of clusterings with an increasing number of clusters m.
Clustering algorithms based on cost function optimization: quantify how "sensible" a clustering is by a cost function J; the number of clusters m is usually fixed in advance, and the algorithm iteratively updates the clustering so as to optimize J.
Other: clustering techniques that do not fit naturally into the previous categories.
Fig. 2.
4. Hierarchical clustering algorithms
Fig. 3.
Fig. 4.
Fig. 5.
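To make the generic agglomerative scheme concrete, the short Python sketch below (our illustration; the data set and parameter choices are arbitrary) builds a dendrogram with SciPy, whose linkage routine implements the repeated merging of the two closest clusters, and then cuts the tree into three clusters.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# A small 2-D data set with three visually separated groups.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(20, 2))
               for c in ([0, 0], [5, 0], [0, 5])])

# Generic agglomerative scheme: start from singletons and repeatedly
# merge the two closest clusters; 'average' is the Ave-link criterion.
Z = linkage(X, method="average")

# Cut the dendrogram so that exactly 3 clusters remain.
labels = fcluster(Z, t=3, criterion="maxclust")
print(np.bincount(labels)[1:])   # roughly [20 20 20]
```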
5. Specific algorithms
5.1. CURE (Clustering Using REpresentatives)
Fig. 6.
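CURE represents each cluster by a fixed number c of well-scattered points that are shrunk towards the cluster centroid by a factor α, and merges the pair of clusters whose representatives are closest. The sketch below illustrates only the representative-selection step; the function name and the default values of c and α are our assumptions, not code from the paper.

```python
import numpy as np

def cure_representatives(points, c=4, alpha=0.5):
    """Pick c well-scattered points of a cluster (farthest-point
    heuristic) and shrink them towards the centroid by a factor alpha,
    in the spirit of CURE. Illustrative sketch, not the original code."""
    points = np.asarray(points, dtype=float)
    centroid = points.mean(axis=0)
    # Start from the point farthest from the centroid, then greedily add
    # the point farthest from the representatives chosen so far.
    reps = [points[np.argmax(np.linalg.norm(points - centroid, axis=1))]]
    while len(reps) < min(c, len(points)):
        dists = np.min(
            [np.linalg.norm(points - r, axis=1) for r in reps], axis=0)
        reps.append(points[np.argmax(dists)])
    reps = np.array(reps)
    # Shrinking towards the centroid dampens the effect of outliers.
    return reps + alpha * (centroid - reps)

cluster = np.random.default_rng(1).normal(size=(50, 2))
print(cure_representatives(cluster, c=4, alpha=0.5))
```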
5.1.1 Disadvantage of CURE
5.2 BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies)
Fig. 7.
The time complexity of BIRCH is O(n), since the CF-tree is built in a single scan of the data set (with optional additional scans to refine the clusters).
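The key data structure behind this is the clustering feature CF = (N, LS, SS): the number of points in a sub-cluster, their linear sum, and the sum of their squared norms. CF vectors are additive, so sub-clusters can be merged and their centroid and radius recomputed without revisiting the raw data. A minimal sketch (class and method names are ours):

```python
import numpy as np

class ClusteringFeature:
    """Sketch of BIRCH's CF triple (N, LS, SS): the number of points,
    their linear sum and their squared-norm sum. CFs are additive, so
    sub-clusters can be merged without touching the raw data."""

    def __init__(self, points):
        pts = np.asarray(points, dtype=float)
        self.n = len(pts)
        self.ls = pts.sum(axis=0)              # linear sum, a vector
        self.ss = float((pts ** 2).sum())      # sum of squared norms

    def merge(self, other):
        merged = object.__new__(ClusteringFeature)
        merged.n = self.n + other.n
        merged.ls = self.ls + other.ls
        merged.ss = self.ss + other.ss
        return merged

    def centroid(self):
        return self.ls / self.n

    def radius(self):
        # Average distance of the points to the centroid.
        return float(np.sqrt(self.ss / self.n -
                             np.dot(self.centroid(), self.centroid())))

a = ClusteringFeature([[0.0, 0.0], [1.0, 0.0]])
b = ClusteringFeature([[0.0, 1.0], [1.0, 1.0]])
print(a.merge(b).centroid(), a.merge(b).radius())
```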
5.2.1 Advantages of BIRCH
5.2.2 Disadvantages of BIRCH
5.3 ROCK (RObust Clustering using linKs)
Fig. 8.
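ROCK is designed for categorical and transaction data: two points are neighbours if their Jaccard similarity is at least a threshold θ, and link(pi, pj) counts their common neighbours; clusters with many cross links are merged. A small sketch of the neighbour and link computation (function names, the toy baskets and the value of θ are ours):

```python
from itertools import combinations

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 1.0

def rock_links(transactions, theta=0.5):
    """For each pair of points, count the number of common neighbours,
    where 'neighbour' means Jaccard similarity >= theta (ROCK's link
    measure, illustrative sketch)."""
    n = len(transactions)
    neighbours = [set() for _ in range(n)]
    for i, j in combinations(range(n), 2):
        if jaccard(transactions[i], transactions[j]) >= theta:
            neighbours[i].add(j)
            neighbours[j].add(i)
    return {(i, j): len(neighbours[i] & neighbours[j])
            for i, j in combinations(range(n), 2)}

baskets = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "c", "d"},
           {"x", "y"}, {"x", "z"}]
print(rock_links(baskets, theta=0.4))
```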
5.4 CHAMELEON
Fig. 9.
The overall time complexity of CHAMELEON is O(n(log² n + m)), where m is the number of sub-clusters produced by the graph-partitioning phase.
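CHAMELEON merges a pair of sub-clusters only if they are both relatively well connected (relative interconnectivity, RI) and relatively close (relative closeness, RC), maximising RI · RC^α. A rough sketch of that merge score, assuming the edge-cut statistics of the k-nearest-neighbour graph have already been computed elsewhere (all argument names, the example numbers and the default α are illustrative):

```python
def chameleon_merge_score(ec_ij, ec_i, ec_j,
                          sbar_ij, sbar_i, sbar_j,
                          size_i, size_j, alpha=2.0):
    """Relative interconnectivity (RI) times relative closeness (RC)
    raised to alpha, the quantity CHAMELEON maximises when choosing
    which pair of sub-clusters to merge.

    ec_*   : sum of edge weights cut when bisecting a cluster, or
             connecting the two clusters (ec_ij).
    sbar_* : average weight of those cut/connecting edges.
    Illustrative sketch; inputs are assumed to come from a k-NN graph
    partitioning step that is not shown here."""
    ri = ec_ij / ((ec_i + ec_j) / 2.0)
    w_i = size_i / (size_i + size_j)
    w_j = size_j / (size_i + size_j)
    rc = sbar_ij / (w_i * sbar_i + w_j * sbar_j)
    return ri * rc ** alpha

print(chameleon_merge_score(ec_ij=8.0, ec_i=10.0, ec_j=12.0,
                            sbar_ij=0.8, sbar_i=1.0, sbar_j=0.9,
                            size_i=40, size_j=60, alpha=2.0))
```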
5.4.1 Disadvantage of CHAMELEON
5.5 Linkage algorithms
Depending on how the distance between two clusters is measured, the main linkage criteria are single linkage (S-link), average linkage (Ave-link) and complete linkage (Com-link).
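The three criteria differ only in which statistic of the pairwise point distances defines the distance between two clusters: the minimum (S-link), the maximum (Com-link) or the average (Ave-link). A small sketch (function names are ours):

```python
import numpy as np

def pairwise_distances(A, B):
    A, B = np.asarray(A, float), np.asarray(B, float)
    return np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)

def single_link(A, B):    # S-link: closest pair of points
    return pairwise_distances(A, B).min()

def complete_link(A, B):  # Com-link: farthest pair of points
    return pairwise_distances(A, B).max()

def average_link(A, B):   # Ave-link: mean over all pairs
    return pairwise_distances(A, B).mean()

A = [[0, 0], [1, 0]]
B = [[4, 0], [5, 0]]
print(single_link(A, B), complete_link(A, B), average_link(A, B))
# 3.0 5.0 4.0
```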
5.5.1 Disadvantages of linkage algorithms
Single linkage (S-link) is sensitive to outliers and suffers from the chaining effect: two well-separated clusters that are connected by a thin chain of points may be merged into one.
5.6 Leaders–Subleaders
For h levels in the hierarchy (h = 2 in Leaders–Subleaders) and d-dimensional data, the time complexity is O(ndh), and the space required is O((L + SL)d), where L and SL denote the numbers of leaders and subleaders.
Fig. 10.
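Leaders–Subleaders builds its two-level hierarchy with the incremental leader algorithm: each pattern is assigned to an existing leader whose distance is within a threshold (the nearest such leader in the original method; the sketch below simply takes the first one), and otherwise becomes a new leader; the same procedure is repeated inside each leader's cluster with a smaller threshold to obtain subleaders. A single-level sketch (function and parameter names are ours):

```python
import numpy as np

def leader_clustering(X, threshold):
    """One pass of the leader algorithm: assign each pattern to the
    first leader within `threshold`, otherwise promote it to a new
    leader. Returns the leaders and the cluster label of each pattern."""
    leaders, labels = [], []
    for x in np.asarray(X, dtype=float):
        for idx, leader in enumerate(leaders):
            if np.linalg.norm(x - leader) <= threshold:
                labels.append(idx)
                break
        else:                       # no leader close enough
            leaders.append(x)
            labels.append(len(leaders) - 1)
    return np.array(leaders), np.array(labels)

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(c, 0.2, size=(30, 2)) for c in ([0, 0], [3, 3])])
leaders, labels = leader_clustering(X, threshold=1.0)
print(len(leaders), np.bincount(labels))
```

Running one such pass per level of the hierarchy is what gives the O(ndh) cost listed in Table 1.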
5.7 Bisecting k-means
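Bisecting k-means (BKMS) is a divisive scheme: all points start in one cluster, and at each step one cluster is selected and split into two with standard k-means until k clusters remain. The sketch below (our illustration, not the paper's code) splits the largest remaining cluster and uses SciPy's kmeans2 for the 2-way split; selecting the cluster with the highest squared error is a common alternative.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def bisecting_kmeans(X, k, seed=0):
    """Divisive sketch of bisecting k-means: repeatedly split the
    largest remaining cluster with 2-means until k clusters are left.
    (Choosing the cluster with the highest SSE is a common variant.)"""
    X = np.asarray(X, dtype=float)
    clusters = [np.arange(len(X))]          # index sets of the clusters
    while len(clusters) < k:
        clusters.sort(key=len)
        largest = clusters.pop()            # split the largest cluster
        _, sub = kmeans2(X[largest], 2, seed=seed, minit="++")
        clusters.append(largest[sub == 0])
        clusters.append(largest[sub == 1])
    labels = np.empty(len(X), dtype=int)
    for lbl, idx in enumerate(clusters):
        labels[idx] = lbl
    return labels

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(c, 0.3, size=(25, 2)) for c in ([0, 0], [4, 0], [0, 4])])
print(np.bincount(bisecting_kmeans(X, k=3)))
```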
6. Comparison of algorithms
Table 1. Comparison of hierarchical clustering algorithms.

| Algorithm | Type (agglomerative/divisive) | For large data sets | Sensitivity to outliers/noise | Model (static/dynamic) | Time complexity | Space complexity |
|---|---|---|---|---|---|---|
| CURE | agglomerative | yes | less sensitive to noise | static | O(n² log n) | O(n) |
| BIRCH | agglomerative | yes | handles noise effectively | static | O(n) | – |
| ROCK | agglomerative | yes | – | static | O(n² + n·m_m·m_a + n² log n) | O(min{n², n·m_m·m_a}) |
| CHAMELEON | agglomerative | – | – | dynamic | O(n(log² n + m)) | – |
| S-link | agglomerative | – | sensitive to outliers | static | O(n² log n) | O(n²) |
| Ave-link | agglomerative | – | – | static | – | – |
| Com-link | agglomerative | – | not strongly affected by outliers | static | O(n³) (obvious algorithm); O(n² log n) (priority queues); O(n log² n) (Euclidean plane); O(n log n + n log²(1/ε)) (ε-approximation) | O(n²); O(n²); O(n); O(n) |
| Leaders–Subleaders | – | yes | – | – | O(ndh), h = 2 | O((L + SL)d) |
| BKMS | divisive | yes | – | – | O(nk) | – |

n: number of data points; m: number of sub-clusters used by CHAMELEON; m_a / m_m: average / maximum number of neighbours of a point (ROCK); d: dimensionality of the data; h: number of levels in the hierarchy; L / SL: number of leaders / subleaders; k: number of clusters; ε: approximation parameter.
7. Conclusion
Acknowledgment.
References.