Effect of fusing features from multiple DCNN architectures in image classification

Automatic image classification has become a necessary task for handling the rapidly growing volume of digital images. The field has produced many algorithms and adopted new techniques. Among them, feature fusion-based image classification methods have traditionally relied on hand-crafted features. However, bottleneck features extracted through pre-trained convolutional neural networks (CNNs) have been shown to improve classification accuracy. This study therefore analyses the effect of fusing such cues from multiple architectures, without relying on any hand-crafted features. First, CNN features are extracted from three different pre-trained models, namely AlexNet, VGG-16, and Inception-V3. Then, a generalised feature space is formed by employing principal component reconstruction and energy-level normalisation, whereby the features from each CNN are mapped into a common subspace and combined using arithmetic rules to construct fused feature vectors (FFVs). This transformation plays a vital role in creating an appearance-invariant representation that captures the complementary information of different high-level features. Finally, a multi-class linear support vector machine is trained on the FFVs. The experimental results demonstrate that such multi-modal CNN feature fusion is well suited to image/object classification tasks, yet it has received surprisingly little attention from the computer vision research community.
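The paragraph above compresses several steps into one pipeline; the sketch below makes the flow concrete. It is a minimal illustration under stated assumptions, not the authors' implementation: taking each network's penultimate layer as the bottleneck features, a 256-D PCA projection standing in for the paper's principal component reconstruction, unit-L2 scaling as the energy-level normalisation, and element-wise averaging as the arithmetic fusion rule are all assumptions.

```python
# Minimal sketch of a multi-CNN feature-fusion pipeline (PyTorch + scikit-learn).
# Assumed, not taken from the paper: penultimate-layer features, 256-D PCA
# subspace, unit-L2 "energy" normalisation, element-wise averaging as fusion.
import numpy as np
import torch
import torchvision.models as models
from sklearn.decomposition import PCA
from sklearn.svm import LinearSVC

def make_extractor(name):
    """Return a pre-trained model with its final classifier layer replaced by
    identity, so a forward pass yields penultimate-layer (bottleneck) features."""
    if name == "alexnet":
        m = models.alexnet(weights="IMAGENET1K_V1")
        m.classifier[6] = torch.nn.Identity()   # 4096-D features
    elif name == "vgg16":
        m = models.vgg16(weights="IMAGENET1K_V1")
        m.classifier[6] = torch.nn.Identity()   # 4096-D features
    elif name == "inception_v3":
        m = models.inception_v3(weights="IMAGENET1K_V1")
        m.fc = torch.nn.Identity()              # 2048-D features
    return m.eval()

@torch.no_grad()
def extract(model, images):
    """images: (N, 3, H, W) float tensor, already resized and normalised to the
    input size each architecture expects (224 for AlexNet/VGG, 299 for Inception)."""
    return model(images).numpy()

def fuse(feature_sets, dim=256):
    """Map each CNN's features into a common PCA subspace, apply energy-level
    (unit-L2) normalisation, then fuse by element-wise averaging into FFVs."""
    mapped = []
    for X in feature_sets:
        Z = PCA(n_components=dim).fit_transform(X)   # common subspace (needs N >= dim)
        Z /= np.linalg.norm(Z, axis=1, keepdims=True) + 1e-12  # energy normalisation
        mapped.append(Z)
    return np.mean(mapped, axis=0)                   # arithmetic fusion -> FFVs

# Usage: X_alex, X_vgg, X_incep are (N, d_i) feature matrices for the same
# N training images; y holds their class labels.
# ffv = fuse([X_alex, X_vgg, X_incep])
# clf = LinearSVC(C=1.0).fit(ffv, y)    # multi-class linear SVM on the FFVs
```

Whatever the exact rule, the key design constraint is visible in fuse(): the three networks produce features of different dimensionality, so each must be projected into a subspace of the same size before any element-wise arithmetic is possible, which is why the subspace mapping precedes fusion.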
