High discriminative SIFT feature and feature pair selection to improve the bag of visual words model

The bag of visual words (BOW) model has been widely applied in image recognition and image classification. However, all scale-invariant feature transform (SIFT) features are typically clustered to construct the visual words, which results in a substantial loss of discriminative power for those words; the corresponding visual phrases further render the generated BOW histogram sparse. In this study, the authors aim to improve classification accuracy by extracting highly discriminative SIFT features and feature pairs. First, highly discriminative SIFT features are extracted using within-class and between-class correlation coefficients. Second, highly discriminative SIFT feature pairs are selected using a minimum spanning tree and its total cost. Next, these features and feature pairs are used to construct a visual word dictionary and a visual phrase dictionary, respectively, which are concatenated into a joint histogram with different weights. Experimental results on the Caltech 101 dataset show that the proposed method achieves higher classification accuracy than state-of-the-art BOW-based methods.
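Two steps of the pipeline above lend themselves to a short sketch: scoring a candidate feature group by the total cost of its minimum spanning tree (a low cost suggests a tight, coherent cluster of features), and concatenating the word and phrase histograms into one weighted descriptor. This is a minimal illustration, not the authors' implementation: the exact correlation-coefficient criterion, MST cost threshold, and weight values are not given in the abstract, and the function names and the weight `alpha` are assumptions.

```python
import numpy as np

def mst_total_cost(dist):
    """Total edge cost of a minimum spanning tree (Prim's algorithm)
    over a symmetric pairwise-distance matrix. In the paper's setting,
    a low total cost would indicate a discriminative feature pair/group."""
    n = len(dist)
    in_tree = [False] * n
    best = [float('inf')] * n  # cheapest edge connecting each node to the tree
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        # pick the cheapest node not yet in the tree
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        total += best[u]
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
    return total

def joint_histogram(word_ids, phrase_ids, n_words, n_phrases, alpha=0.6):
    """Concatenate a visual-word histogram and a visual-phrase histogram
    into one weighted descriptor; alpha (assumed value) weights the
    word part, (1 - alpha) the phrase part."""
    hw = np.bincount(word_ids, minlength=n_words).astype(float)
    hp = np.bincount(phrase_ids, minlength=n_phrases).astype(float)
    # L1-normalise each part so the weights control their relative influence
    if hw.sum() > 0:
        hw /= hw.sum()
    if hp.sum() > 0:
        hp /= hp.sum()
    return np.concatenate([alpha * hw, (1.0 - alpha) * hp])
```

In use, each image's SIFT descriptors would be quantised against the word dictionary to get `word_ids`, selected feature pairs quantised against the phrase dictionary to get `phrase_ids`, and the resulting joint histogram fed to a standard classifier such as an SVM.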
