Development of a global batch clustering with gradient descent and initial parameters in colour image classification

This study addresses two issues that arise when the K-means algorithm is used for batch clustering in colour image classification. The first is the drifting phenomenon in batch clustering, caused by the stochastic nature of the clustering procedure. The second, well documented in the literature, is that the initial parameters largely determine whether the algorithm converges to a good local solution. A new algorithm is proposed to address both issues. Recent work has shown that the principal component analysis (PCA) solution directly indicates cluster membership in the K-means algorithm. Building on this result, the first part of the proposed algorithm estimates accurate initial parameters for K-means by applying the PCA solution in a hierarchical manner. In addition, a gradient descent approach is used in the global batch clustering to reduce drifting and thereby speed up convergence in the refining stage. All necessary proofs and justifications are provided. An evaluation study shows that the proposed algorithm outperforms the original K-means clustering algorithm under various initial parameter estimation schemes.
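The abstract names its two ingredients without giving detail, so the sketch below is a minimal illustration under stated assumptions, not the authors' implementation. It combines the published PCA–K-means connection (the sign of the projection onto the first principal component gives a near-optimal two-way split of a cluster), applied hierarchically to seed the centroids, with a momentum-damped gradient step in place of the full batch centroid jump during refinement. The function names (pca_split, pca_init, gd_batch_kmeans) and the parameter values (lr, momentum) are illustrative assumptions.

```python
import numpy as np

def pca_split(X):
    # Split points by the sign of their projection onto the first
    # principal axis (the PCA/K-means connection for a 2-way split).
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    proj = Xc @ Vt[0]
    return X[proj < 0], X[proj >= 0]

def pca_init(X, k):
    # Hierarchical PCA seeding: repeatedly bisect the cluster with the
    # largest within-cluster scatter until k clusters remain, then use
    # their means as the initial K-means centroids.
    clusters = [X]
    while len(clusters) < k:
        sse = [((c - c.mean(axis=0)) ** 2).sum() for c in clusters]
        target = clusters.pop(int(np.argmax(sse)))
        left, right = pca_split(target)
        if len(left) == 0 or len(right) == 0:
            clusters.append(target)  # degenerate split (e.g. duplicates): stop early
            break
        clusters.extend([left, right])
    return np.stack([c.mean(axis=0) for c in clusters])

def gd_batch_kmeans(X, centroids, lr=0.5, momentum=0.9, iters=100, tol=1e-6):
    # Batch K-means refinement with a damped (momentum) gradient step
    # instead of the full centroid jump, to suppress drifting.
    velocity = np.zeros_like(centroids)
    for _ in range(iters):
        # Assignment step: nearest centroid for every point.
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # The SSE gradient w.r.t. centroid c_j is -2 * sum_{i in j}(x_i - c_j);
        # the per-cluster mean difference below is that direction, normalised.
        grad = np.zeros_like(centroids)
        for j in range(len(centroids)):
            pts = X[labels == j]
            if len(pts):
                grad[j] = pts.mean(axis=0) - centroids[j]
        velocity = momentum * velocity + lr * grad
        if np.abs(velocity).max() < tol:
            break
        centroids = centroids + velocity
    return centroids, labels

# Hypothetical usage on flattened pixel data:
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))      # e.g. 500 RGB pixels
centroids, labels = gd_batch_kmeans(X, pca_init(X, k=4))
```

Note that with lr = 1 and momentum = 0 the refinement step reduces to the classical batch K-means centroid update, which makes the damping interpretation straightforward to verify.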
