Effect of fusing features from multiple DCNN architectures in image classification

IET Image Processing

Automatic image classification has become a necessary task for handling the rapidly growing use of digital images. It has given rise to many algorithms and adopted new techniques. Among them, feature fusion-based image classification methods have traditionally relied on hand-crafted features. However, it has been shown that the bottleneck features extracted through pre-trained convolutional neural networks (CNNs) can improve classification accuracy. Hence, this study analyses the effect of fusing such cues from multiple architectures without being tied to any hand-crafted features. First, CNN features are extracted from three different pre-trained models, namely AlexNet, VGG-16, and Inception-V3. Then, a generalised feature space is formed by employing principal component reconstruction and energy-level normalisation, where the features from each individual CNN are mapped into a common subspace and embedded using arithmetic rules to construct fused feature vectors (FFVs). This transformation plays a vital role in creating an appearance-invariant representation by capturing the complementary information of different high-level features. Finally, a multi-class linear support vector machine is trained. The experimental results demonstrate that such multi-modal CNN feature fusion is well suited for image/object classification tasks, yet, surprisingly, it has not so far been explored extensively by the computer vision research community.
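
The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so the sketch assumes that the per-network features have already been projected into a common subspace of equal dimension, that "energy-level normalisation" means scaling each vector to unit sum of squares, and that the "arithmetic rules" are element-wise operations such as sum, mean, max, and product. The names `energy_normalise`, `fuse`, and `FUSION_RULES` are hypothetical.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def energy_normalise(v: Sequence[float]) -> Vector:
    """Scale v so that its energy (sum of squares) equals 1.

    Assumption: this is one plausible reading of "energy-level
    normalisation"; an all-zero vector is returned unchanged.
    """
    energy = sum(x * x for x in v)
    if energy == 0.0:
        return [0.0 for _ in v]
    scale = 1.0 / math.sqrt(energy)
    return [x * scale for x in v]


# Hypothetical element-wise fusion rules. The abstract only says
# "arithmetic rules", so this particular set is an assumption.
FUSION_RULES: Dict[str, Callable[[Sequence[float]], float]] = {
    "sum": sum,
    "mean": lambda xs: sum(xs) / len(xs),
    "max": max,
    "product": math.prod,
}


def fuse(features: Sequence[Sequence[float]], rule: str = "sum") -> Vector:
    """Fuse equal-length feature vectors into one fused feature vector.

    Each input vector is energy-normalised first, then the chosen rule
    is applied independently to every coordinate.
    """
    if not features:
        raise ValueError("at least one feature vector is required")
    dim = len(features[0])
    if any(len(f) != dim for f in features):
        raise ValueError("all feature vectors must share one dimension")
    if rule not in FUSION_RULES:
        raise ValueError(f"unknown fusion rule: {rule!r}")
    combine = FUSION_RULES[rule]
    normalised = [energy_normalise(f) for f in features]
    # zip(*...) groups the i-th coordinate of every network together.
    return [combine(column) for column in zip(*normalised)]


# Toy stand-ins for features from three pre-trained networks,
# already mapped to a shared 2-D subspace (illustrative values only).
alexnet_feat = [3.0, 4.0]
vgg16_feat = [0.0, 2.0]
inception_feat = [1.0, 0.0]

ffv_sum = fuse([alexnet_feat, vgg16_feat, inception_feat], rule="sum")
ffv_max = fuse([alexnet_feat, vgg16_feat, inception_feat], rule="max")
```

In a full pipeline, the fused vectors would then be passed to a multi-class linear support vector machine, as the abstract states. Normalising before fusing keeps a network with large activation magnitudes from dominating the fused vector.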

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-ipr.2017.0232