Two-phase clustering algorithm with density exploring distance measure

Jingjing Ma; Xiangming Jiang; Maoguo Gong

Two-phase clustering algorithm with density exploring distance measure

View Fulltext

Author(s): Jingjing Ma¹ ; Xiangming Jiang¹ ; Maoguo Gong¹
- Affiliations: 1: Key Lab of Intelligent Perception and Image Understanding of Ministry of Education of China, Xidian University , PO Box 224, Xi'an 710071 , People's Republic of China
Source: Volume 3, Issue 1, March 2018, p. 59 – 64
DOI: 10.1049/trit.2018.0006 , Online ISSN 2468-2322

This is an open access article published by the IET, Chinese Association for Artificial Intelligence and Chongqing University of Technology under the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/)

Received 03/01/2018, Accepted 20/02/2018, Revised 11/02/2018, Published 05/03/2018

Here, the authors propose a novel two-phase clustering algorithm with a density exploring distance (DED) measure. In the first phase, the fast global K-means clustering algorithm is used to obtain the cluster number and the prototypes. Then, the prototypes of all these clusters and representatives of points belonging to these clusters are regarded as the input data set of the second phase. Afterwards, all the prototypes are clustered according to a DED measure which makes data points locating in the same structure to possess high similarity with each other. In experimental studies, the authors test the proposed algorithm on seven artificial as well as seven UCI data sets. The results demonstrate that the proposed algorithm is flexible to different data distributions and has a stronger ability in clustering data sets with complex non-convex distribution when compared with the comparison algorithms.

References

1. 1)
  - [7]. Huang, J., Liu, J.: ‘A similarity-based modularization quality measure for software module clustering problems’, Inf. Sci., 2016, 342, pp. 96–110 (doi: 10.1016/j.ins.2016.01.030).
2. 2)
  - [20]. Wang, L., Bo, L., Jiao, L.: ‘A modified k-means clustering with a density-sensitive distance metric’. Proc. of the Int. Conf. on Rough Sets and Knowledge Technology, Chongquing, China, 2006, pp. 544–551.
3. 3)
  - [9]. Meng, F., Li, H., Wu, Q., et al: ‘Globally measuring the similarity of superpixels by binary edge maps for superpixel clustering’, IEEE Trans. Circuits Syst. Video Technol., 2016, doi: 10.1109/TCSVT.2016.2632148.
4. 4)
  - [14]. Cao, F., Liang, J., Li, D., et al: ‘A dissimilarity measure for the k-modes clustering algorithm’, Knowl.-Based Syst., 2012, 26, pp. 120–127 (doi: 10.1016/j.knosys.2011.07.011).
5. 5)
  - [30]. Bousquet, O., Chapelle, O., Hein, M.: ‘Measure based regularization’. Advances in Neural Information Processing Systems, Vancouver, Canada, 2004, pp. 1221–1228.
6. 6)
  - [2]. Hartigan, J.A., Wong, M.A.: ‘Algorithm as 136: a k-means clustering algorithm’, J. R. Stat. Soc. C, Appl. Stat., 1979, 28, (1), pp. 100–108.
7. 7)
  - [6]. Wu, J., Wu, Z., Cao, J., et al: ‘Fuzzy consensus clustering with applications on big data’, IEEE Trans. Fuzzy Syst., 2017, 25, (6), pp. 1430–1445 (doi: 10.1109/TFUZZ.2017.2742463).
8. 8)
  - [8]. Xia, G., Sun, H., Feng, L., et al: ‘Human motion segmentation via robust kernel sparse subspace clustering’, IEEE Trans. Image Process., 2018, 27, (1), pp. 135–150 (doi: 10.1109/TIP.2017.2738562).
9. 9)
  - [26]. Machné, R., Murray, D.B., Stadler, P.F.: ‘Similarity-based segmentation of multi-dimensional signals’, Sci. Rep., 2017, 7, (1), p. 12355 (doi: 10.1038/s41598-017-12401-8).
10. 10)
  - [28]. Zong, Y., Xu, G., Zhang, Y., et al: ‘A robust iterative refinement clustering algorithm with smoothing search space’, Knowl.-Based Syst., 2010, 23, (5), pp. 389–396 (doi: 10.1016/j.knosys.2010.01.012).
11. 11)
  - [18]. Su, M.-C., Chou, C.-H.: ‘A modified version of the k-means algorithm with a distance based on cluster symmetry’, IEEE Trans. Pattern Anal. Mach. Intell., 2001, 23, (6), pp. 674–680 (doi: 10.1109/34.927466).
12. 12)
  - [31]. Blum, A., Chawla, S.: ‘Learning from labeled and unlabeled data using graph mincuts’. Proc. of the 18th Int. Conf. on Machine Learning (ICML), Williamstown, MA, USA, 2001, pp. 19–26.
13. 13)
  - [22]. Likas, A., Vlassis, N., Verbeek, J.J.: ‘The global k-means clustering algorithm’, Pattern Recognit., 2003, 36, (2), pp. 451–461 (doi: 10.1016/S0031-3203(02)00060-2).
14. 14)
  - [3]. Wang, W., Yang, J., Muntz, R., et al: ‘STING: a statistical information grid approach to spatial data mining’. Proc. of the 23rd Int. Conf. on Very Large Data Bases, Athens, Greece, 1997, pp. 186–195.
15. 15)
  - [21]. Gong, M., Jiao, L., Wang, L., et al: ‘Density-sensitive evolutionary clustering’. Proc. of the 11th Pacific-Asia Conf. on Knowledge Discovery and Data Mining, Nanjing, China, 2007, pp. 507–514.
16. 16)
  - [23]. Maulik, U., Bandyopadhyay, S.: ‘Genetic algorithm-based clustering technique’, Pattern Recognit., 2000, 33, (9), pp. 1455–1465 (doi: 10.1016/S0031-3203(99)00137-5).
17. 17)
  - [17]. Lu, Y., Hou, X., Chen, X.: ‘A novel travel-time based similarity measure for hierarchical clustering’, Neurocomputing, 2016, 173, pp. 3–8 (doi: 10.1016/j.neucom.2015.01.090).
18. 18)
  - [10]. Taşdemir, K., Yalçin, B., Yildirim, I.: ‘Approximate spectral clustering with utilized similarity information using geodesic based hybrid distance measures’, Pattern Recognit., 2015, 48, (4), pp. 1465–1477 (doi: 10.1016/j.patcog.2014.10.023).
19. 19)
  - [13]. Son, L.H.: ‘Generalized picture distance measure and applications to picture fuzzy clustering’, Appl. Soft Comput., 2016, 46, (C), pp. 284–295 (doi: 10.1016/j.asoc.2016.05.009).
20. 20)
  - [4]. Agrawal, R., Gehrke, J., Gunopulos, D., et al: ‘Automatic subspace clustering of high dimensional data for data mining applications’. Proc. of ACM-SIGMOND Int. Conf. Management on Data, Seattle, Washington, USA, 1998, pp. 94–105.
21. 21)
  - [29]. Zhu, S., Wang, D., Li, T.: ‘Data clustering with size constraints’, Knowl.-Based Syst., 2010, 23, (8), pp. 883–889 (doi: 10.1016/j.knosys.2010.06.003).
22. 22)
  - [32]. Blake, C.L., Merz, C.J.: ‘UCI repository of machine learning databases’. Technical Report, Department of Information and Computer Science, University of California, Irvine, CA, 1998.
23. 23)
  - [5]. Guha, S., Rastogi, R., Shim, K.: ‘CURE: an efficient clustering algorithm for large databases’. Proc. of ACM-SIGMOND Int. Conf. Management on Data, Seattle, Washington, USA, 1998, pp. 73–84.
24. 24)
  - [24]. Kang, Q., Liu, S., Zhou, M., et al: ‘A weight-incorporated similarity-based clustering ensemble method based on swarm intelligence’, Knowl.-Based Syst., 2016, 104, pp. 156–164 (doi: 10.1016/j.knosys.2016.04.021).
25. 25)
  - [1]. Jain, A., Murty, M., Flynn, P.: ‘Data clustering: a review’, ACM Comput. Surv., 1999, 31, (3), pp. 264–323 (doi: 10.1145/331499.331504).
26. 26)
  - [19]. Charalampidis, D.: ‘A modified k-means algorithm for circular invariant clustering’, IEEE Trans. Pattern Anal. Mach. Intell., 2005, 27, (12), pp. 1856–1865 (doi: 10.1109/TPAMI.2005.230).
27. 27)
  - [25]. Kang, Z., Peng, C., Cheng, Q.: ‘Twin learning for similarity and clustering: a unified kernel approach’. AAAI, San Francisco, California, USA, 2017, pp. 2080–2086.
28. 28)
  - [16]. Ferrari, D.G., De Castro, L.N.: ‘Clustering algorithm selection by meta-learning systems: a new distance-based problem characterization and ranking combination methods’, Inf. Sci., 2015, 301, pp. 181–194 (doi: 10.1016/j.ins.2014.12.044).
29. 29)
  - [15]. Yang, P., Zhu, Q., Huang, B.: ‘Spectral clustering with density sensitive similarity function’, Knowl.-Based Syst., 2011, 24, (5), pp. 621–628 (doi: 10.1016/j.knosys.2011.01.009).
30. 30)
  - [27]. Zhou, D., Bousquet, O., Lal, T.N., et al: ‘Learning with local and global consistency’. Advances in Neural Information Processing Systems, Vancouver, Canada, 2004, pp. 321–328.
31. 31)
  - [11]. Mori, U., Mendiburu, A., Lozano, J.A.: ‘Similarity measure selection for clustering time series databases’, IEEE Trans. Knowl. Data Eng., 2016, 28, (1), pp. 181–195 (doi: 10.1109/TKDE.2015.2462369).
32. 32)
  - [12]. Ye, J.: ‘Single-valued neutrosophic clustering algorithms based on similarity measures’, J. Classif., 2017, 34, (1), pp. 148–162 (doi: 10.1007/s00357-017-9225-y).

Login

Not registered yet?

Share

Tools

Login to add to favourites

Key

Two-phase clustering algorithm with density exploring distance measure

References

Related content