Although there is no shortage of clustering algorithms, existing algorithms are often afflicted by problems of one kind or another. Dominant sets clustering is a graph-theoretic approach to clustering and exhibits significant potential in various applications. However, the authors' work indicates that this approach suffers from two major problems, namely over-segmentation tendency and sensitiveness to distance measures. In order to overcome these two problems, the authors present a density-based enhancement to dominant sets clustering where a cluster merging step is used to fuse adjacent clusters close enough from the original dominant sets clustering. Experiments on various datasets validate the effectiveness of the proposed method.

References

1. 1)
  - 5. Wang, W., Yang, J., Sting, M.R.: ‘A statistical information grid approach to spatial data mining’. Proc. VLDB Conf., Athens, Greece, August 1997, pp. 186–195.
2. 2)
  - 21. Yang, X.W., Liu, H.R., Laecki, L.J.: ‘Contour-based object detection as dominant set computation’, Pattern Recognit., 2012, 45, (5), pp. 1927–1936 (doi: 10.1016/j.patcog.2011.11.010).
3. 3)
  - 1. Bradley, P.S., Mangasarian, O.L.: ‘Clustering via concave minimization’. Proc. Neural Information Processing Systems, Denver, USA, 1997, pp. 368–374.
4. 4)
  - 31. Chang, H., Yeung, D.Y.: ‘Robust path-based spectral clustering’, Pattern Recognit., 2008, 41, (1), pp. 191–203 (doi: 10.1016/j.patcog.2007.04.010).
5. 5)
  - 3. Zhang, T., Ramakrishnan, R., Birch, L.M.: ‘An efficient data clustering method for very large databases’. Proc. ACM SIGMOD Int. Conf. Management of Data, Montreal, Canada, June 1996, pp. 103–114.
6. 6)
  - 6. Shi, J., Malik, J.: ‘Normalized cuts and image segmentation’, IEEE Trans. Pattern Anal. Mach. Intell., 2000, 22, (8), pp. 167–172 (doi: 10.1109/ICIP.1998.723676).
7. 7)
  - 7. Dhillon, I.: ‘Co-clustering documents and words using bipartite spectral graph partitioning’. Proc. Int. Conf. Knowledge Discovery and Data Mining, San Francisco, USA, August 2001, pp. 269–274.
8. 8)
  - 33. Fu, L., Medico, E.: ‘FLAME, a novel fuzzy clustering method for the analysis of DNA microarray data’, BMC Bioinf., 2007, 8, (1), pp. 1–15 (doi: 10.1186/1471-2105-8-3).
9. 9)
  - 23. Hamid, R., Maddi, S., Johnson, A.Y., Bobick, A.F., Essa, I.A., Isbell, C.: ‘A novel sequence representation for unsupervised analysis of human activities’, Artif. Intell., 2009, 173, (14), pp. 1221–1244 (doi: 10.1016/j.artint.2009.05.002).
10. 10)
  - 16. Monti, S., Tamayo, P., Mesirov, J., Golub, T.: ‘Consensus clustering: a resampling based method for class discovery and visualization of gene expression microarray data’, Mach. Learn., 2003, 52, (1–2), pp. 91–118 (doi: 10.1023/A:1023949509487).
11. 11)
  - 12. Fraley, C., Raftery, A.E.: ‘How many clusters? which clustering method? answers via model-based cluster analysis’, Comput. J., 1998, 41, (8), pp. 578–588 (doi: 10.1093/comjnl/41.8.578).
12. 12)
  - 19. Pavan, M., Pelillo, M.: ‘Dominant sets and pairwise clustering’, IEEE Trans. Pattern Anal. Mach. Intell., 2007, 29, (1), pp. 167–172 (doi: 10.1109/TPAMI.2007.250608).
13. 13)
  - 11. Smyth, P.: ‘Clustering with Monte Carlo cross-validation’. Proc. Int. Conf. Knowledge Discovery and Data Mining, Portland, USA, August 1996, pp. 126–133.
14. 14)
  - 32. Jain, A., Law, M.: ‘Data clustering: a user's dilemma’, (LNCS3776), 2005, pp. 1–10.
15. 15)
  - 27. Wu, Z., Leahy, R.: ‘An optimal graph theoretic approach to data clustering: theory and its application to image segmentation’, IEEE Trans. Pattern Anal. Mach. Intell., 1993, 15, (11), pp. 1101–1113 (doi: 10.1109/34.244673).
16. 16)
  - 34. Zahn, C.T.: ‘Graph-theoretical methods for detecting and describing gestalt clusters’, IEEE Trans. Comput., 1971, 100, (1), pp. 68–86 (doi: 10.1109/T-C.1971.223083).
17. 17)
  - 9. Ng, A., Jordan, M., Weiss, Y.: ‘On spectral clustering: analysis and an algorithm’. Proc. Neural Information Processing Systems, Vancouver, Canada, December 2002, pp. 849–856.
18. 18)
  - 18. Evanno, G., Regnaut, S., Goudet, J.: ‘Detecting the number of clusters of individuals using the software structure: a simulation study’, Mol. Ecol., 2005, 14, (8), pp. 2611–2620 (doi: 10.1111/j.1365-294X.2005.02553.x).
19. 19)
  - 24. Wang, M., Ye, Z., Wang, Y., Wang, S.: ‘Dominant sets clustering for image retrieval’, Signal Process., 2008, 88, (11), pp. 2843–2849 (doi: 10.1016/j.sigpro.2008.04.007).
20. 20)
  - 15. Tibshirani, R., Walther, G., Hastie, T.: ‘Estimating the number of clusters in a data set via the gap statistic’, J. Royal Stat. Soc. B, Stat. Methodol., 2001, 63, (2), pp. 411–423 (doi: 10.1111/1467-9868.00293).
21. 21)
  - 29. Rota Bulò, S., Pelillo, M., Bomze, I.M.: ‘Graph-based quadratic optimization: a fast evolutionary approach’, Comput. Vis. Image Underst., 2011, 115, (7), pp. 984–995 (doi: 10.1016/j.cviu.2010.12.004).
22. 22)
  - 2. Arthor, D., Vassilvitskii, S.: ‘K-means ++ : the advantages of careful seeding’. Proc. ACM-SIAM Symp. Discrete Algorithms, New Orleans, USA, January 2007, pp. 1027–1035.
23. 23)
  - 13. Hansen, M.H., Yu, B.: ‘Model selection and the principle of minimum descriptor length’, J. Am. Stat. Assoc, 1998, 96, (454), pp. 746–774 (doi: 10.1198/016214501753168398).
24. 24)
  - 14. Ray, S., Turi, R.H.: ‘Determination of number of clusters in k-means clustering and application in color image segmentation’. Proc. Int. Conf. Advances in Pattern Recognition and Digital Techniques, Calcutta, India, December 1999, pp. 137–143.
25. 25)
  - 17. Tibshirani, R., Walther, G., Botstein, D., Brown, P.: ‘Cluster validation by prediction strength’, J. Computat. Graph. Stat., 2005, 14, (3), pp. 511–528 (doi: 10.1198/106186005X59243).
26. 26)
  - 22. Frommlet, F.: ‘Tag snp selection based on clustering according to dominant sets found using replicator dynamics’, Adv. Data Anal. Classif., 2010, 4, (1), pp. 65–83 (doi: 10.1007/s11634-010-0059-2).
27. 27)
  - 25. Hou, J., Pelillo, M.: ‘A simple feature combination method based on dominant sets’, Pattern Recognit., 2013, 46, (11), pp. 3129–3139 (doi: 10.1016/j.patcog.2013.04.005).
28. 28)
  - 4. Ester, M., Kriegel, H.P., Sander, J., Xu, X.W.: ‘A density-based algorithm for discovering clusters in large spatial databases with noise’. Proc. Int. Conf. Knowledge Discovery and Data Mining, Portland, USA, August 1996, pp. 226–231.
29. 29)
  - 10. Belkin, M., Niyogi, P.: ‘Laplacian eigenmaps for dimensionality reduction and data representation’, Neural Comput., 2003, 15, (6), pp. 1373–1396 (doi: 10.1162/089976603321780317).
30. 30)
  - 20. Torsello, A., Bulo, S.R., Pelillo, M.: ‘Beyond partitions: allowing overlapping groups in pairwise clustering’. Proc. Int. Conf. Pattern Recognit., Tempa, USA, December 2008, pp. 1–4.
31. 31)
  - 8. Ding, C., He, X., Zha, H., Gu, M., Simon, H.: ‘A min-max cut algorithm for graph partitioning and data clustering’. Proc. Int. Conf. Data Mining, San Jose, USA, November 2011, pp. 107–114.
32. 32)
  - 30. Gionis, A., Mannila, H., Tsaparas, P.: ‘Clustering aggregation’, ACM Trans. Knowl. Discov. Data, 2007, 1, (1), pp. 1–30 (doi: 10.1145/1217299.1217303).
33. 33)
  - 26. Brendan, J.F., Delbert, D.: ‘Clustering by passing messages between data points’, Science, 2007, 315, pp. 972–976 (doi: 10.1126/science.1136800).
34. 34)
  - 28. Pavan, M., Pelillo, M.: ‘A graph-theoretic approach to clustering and segmentation’. Proc. IEEE Int. Conf. Computer Vision and Pattern Recognition, Madison, USA, June 2003, pp. 145–152.

A density-based enhancement to dominant sets clustering

References

Related content