access icon free A density-based enhancement to dominant sets clustering

Although there is no shortage of clustering algorithms, existing algorithms are often afflicted by problems of one kind or another. Dominant sets clustering is a graph-theoretic approach to clustering and exhibits significant potential in various applications. However, the authors' work indicates that this approach suffers from two major problems, namely over-segmentation tendency and sensitiveness to distance measures. In order to overcome these two problems, the authors present a density-based enhancement to dominant sets clustering where a cluster merging step is used to fuse adjacent clusters close enough from the original dominant sets clustering. Experiments on various datasets validate the effectiveness of the proposed method.

Inspec keywords: set theory; pattern clustering; graph theory; unsupervised learning

Other keywords: density-based enhancement; distance measure sensitiveness; adjacent cluster fusion; unsupervised learning tools; over-segmentation tendency; cluster merging step; dominant sets clustering; graph-theoretic approach

Subjects: Combinatorial mathematics; Data handling techniques; Knowledge engineering techniques

References

    1. 1)
      • 5. Wang, W., Yang, J., Sting, M.R.: ‘A statistical information grid approach to spatial data mining’. Proc. VLDB Conf., Athens, Greece, August 1997, pp. 186195.
    2. 2)
      • 21. Yang, X.W., Liu, H.R., Laecki, L.J.: ‘Contour-based object detection as dominant set computation’, Pattern Recognit., 2012, 45, (5), pp. 19271936 (doi: 10.1016/j.patcog.2011.11.010).
    3. 3)
      • 1. Bradley, P.S., Mangasarian, O.L.: ‘Clustering via concave minimization’. Proc. Neural Information Processing Systems, Denver, USA, 1997, pp. 368374.
    4. 4)
      • 31. Chang, H., Yeung, D.Y.: ‘Robust path-based spectral clustering’, Pattern Recognit., 2008, 41, (1), pp. 191203 (doi: 10.1016/j.patcog.2007.04.010).
    5. 5)
      • 3. Zhang, T., Ramakrishnan, R., Birch, L.M.: ‘An efficient data clustering method for very large databases’. Proc. ACM SIGMOD Int. Conf. Management of Data, Montreal, Canada, June 1996, pp. 103114.
    6. 6)
      • 6. Shi, J., Malik, J.: ‘Normalized cuts and image segmentation’, IEEE Trans. Pattern Anal. Mach. Intell., 2000, 22, (8), pp. 167172 (doi: 10.1109/ICIP.1998.723676).
    7. 7)
      • 7. Dhillon, I.: ‘Co-clustering documents and words using bipartite spectral graph partitioning’. Proc. Int. Conf. Knowledge Discovery and Data Mining, San Francisco, USA, August 2001, pp. 269274.
    8. 8)
      • 33. Fu, L., Medico, E.: ‘FLAME, a novel fuzzy clustering method for the analysis of DNA microarray data’, BMC Bioinf., 2007, 8, (1), pp. 115 (doi: 10.1186/1471-2105-8-3).
    9. 9)
      • 23. Hamid, R., Maddi, S., Johnson, A.Y., Bobick, A.F., Essa, I.A., Isbell, C.: ‘A novel sequence representation for unsupervised analysis of human activities’, Artif. Intell., 2009, 173, (14), pp. 12211244 (doi: 10.1016/j.artint.2009.05.002).
    10. 10)
      • 16. Monti, S., Tamayo, P., Mesirov, J., Golub, T.: ‘Consensus clustering: a resampling based method for class discovery and visualization of gene expression microarray data’, Mach. Learn., 2003, 52, (1–2), pp. 91118 (doi: 10.1023/A:1023949509487).
    11. 11)
      • 12. Fraley, C., Raftery, A.E.: ‘How many clusters? which clustering method? answers via model-based cluster analysis’, Comput. J., 1998, 41, (8), pp. 578588 (doi: 10.1093/comjnl/41.8.578).
    12. 12)
      • 19. Pavan, M., Pelillo, M.: ‘Dominant sets and pairwise clustering’, IEEE Trans. Pattern Anal. Mach. Intell., 2007, 29, (1), pp. 167172 (doi: 10.1109/TPAMI.2007.250608).
    13. 13)
      • 11. Smyth, P.: ‘Clustering with Monte Carlo cross-validation’. Proc. Int. Conf. Knowledge Discovery and Data Mining, Portland, USA, August 1996, pp. 126133.
    14. 14)
      • 32. Jain, A., Law, M.: ‘Data clustering: a user's dilemma’, (LNCS3776), 2005, pp. 110.
    15. 15)
      • 27. Wu, Z., Leahy, R.: ‘An optimal graph theoretic approach to data clustering: theory and its application to image segmentation’, IEEE Trans. Pattern Anal. Mach. Intell., 1993, 15, (11), pp. 11011113 (doi: 10.1109/34.244673).
    16. 16)
      • 34. Zahn, C.T.: ‘Graph-theoretical methods for detecting and describing gestalt clusters’, IEEE Trans. Comput., 1971, 100, (1), pp. 6886 (doi: 10.1109/T-C.1971.223083).
    17. 17)
      • 9. Ng, A., Jordan, M., Weiss, Y.: ‘On spectral clustering: analysis and an algorithm’. Proc. Neural Information Processing Systems, Vancouver, Canada, December 2002, pp. 849856.
    18. 18)
      • 18. Evanno, G., Regnaut, S., Goudet, J.: ‘Detecting the number of clusters of individuals using the software structure: a simulation study’, Mol. Ecol., 2005, 14, (8), pp. 26112620 (doi: 10.1111/j.1365-294X.2005.02553.x).
    19. 19)
      • 24. Wang, M., Ye, Z., Wang, Y., Wang, S.: ‘Dominant sets clustering for image retrieval’, Signal Process., 2008, 88, (11), pp. 28432849 (doi: 10.1016/j.sigpro.2008.04.007).
    20. 20)
      • 15. Tibshirani, R., Walther, G., Hastie, T.: ‘Estimating the number of clusters in a data set via the gap statistic’, J. Royal Stat. Soc. B, Stat. Methodol., 2001, 63, (2), pp. 411423 (doi: 10.1111/1467-9868.00293).
    21. 21)
      • 29. Rota Bulò, S., Pelillo, M., Bomze, I.M.: ‘Graph-based quadratic optimization: a fast evolutionary approach’, Comput. Vis. Image Underst., 2011, 115, (7), pp. 984995 (doi: 10.1016/j.cviu.2010.12.004).
    22. 22)
      • 2. Arthor, D., Vassilvitskii, S.: ‘K-means ++ : the advantages of careful seeding’. Proc. ACM-SIAM Symp. Discrete Algorithms, New Orleans, USA, January 2007, pp. 10271035.
    23. 23)
      • 13. Hansen, M.H., Yu, B.: ‘Model selection and the principle of minimum descriptor length’, J. Am. Stat. Assoc, 1998, 96, (454), pp. 746774 (doi: 10.1198/016214501753168398).
    24. 24)
      • 14. Ray, S., Turi, R.H.: ‘Determination of number of clusters in k-means clustering and application in color image segmentation’. Proc. Int. Conf. Advances in Pattern Recognition and Digital Techniques, Calcutta, India, December 1999, pp. 137143.
    25. 25)
      • 17. Tibshirani, R., Walther, G., Botstein, D., Brown, P.: ‘Cluster validation by prediction strength’, J. Computat. Graph. Stat., 2005, 14, (3), pp. 511528 (doi: 10.1198/106186005X59243).
    26. 26)
      • 22. Frommlet, F.: ‘Tag snp selection based on clustering according to dominant sets found using replicator dynamics’, Adv. Data Anal. Classif., 2010, 4, (1), pp. 6583 (doi: 10.1007/s11634-010-0059-2).
    27. 27)
      • 25. Hou, J., Pelillo, M.: ‘A simple feature combination method based on dominant sets’, Pattern Recognit., 2013, 46, (11), pp. 31293139 (doi: 10.1016/j.patcog.2013.04.005).
    28. 28)
      • 4. Ester, M., Kriegel, H.P., Sander, J., Xu, X.W.: ‘A density-based algorithm for discovering clusters in large spatial databases with noise’. Proc. Int. Conf. Knowledge Discovery and Data Mining, Portland, USA, August 1996, pp. 226231.
    29. 29)
      • 10. Belkin, M., Niyogi, P.: ‘Laplacian eigenmaps for dimensionality reduction and data representation’, Neural Comput., 2003, 15, (6), pp. 13731396 (doi: 10.1162/089976603321780317).
    30. 30)
      • 20. Torsello, A., Bulo, S.R., Pelillo, M.: ‘Beyond partitions: allowing overlapping groups in pairwise clustering’. Proc. Int. Conf. Pattern Recognit., Tempa, USA, December 2008, pp. 14.
    31. 31)
      • 8. Ding, C., He, X., Zha, H., Gu, M., Simon, H.: ‘A min-max cut algorithm for graph partitioning and data clustering’. Proc. Int. Conf. Data Mining, San Jose, USA, November 2011, pp. 107114.
    32. 32)
      • 30. Gionis, A., Mannila, H., Tsaparas, P.: ‘Clustering aggregation’, ACM Trans. Knowl. Discov. Data, 2007, 1, (1), pp. 130 (doi: 10.1145/1217299.1217303).
    33. 33)
      • 26. Brendan, J.F., Delbert, D.: ‘Clustering by passing messages between data points’, Science, 2007, 315, pp. 972976 (doi: 10.1126/science.1136800).
    34. 34)
      • 28. Pavan, M., Pelillo, M.: ‘A graph-theoretic approach to clustering and segmentation’. Proc. IEEE Int. Conf. Computer Vision and Pattern Recognition, Madison, USA, June 2003, pp. 145152.
http://iet.metastore.ingenta.com/content/journals/10.1049/iet-cvi.2013.0072
Loading

Related content

content/journals/10.1049/iet-cvi.2013.0072
pub_keyword,iet_inspecKeyword,pub_concept
6
6
Loading