CNN with coarse-to-fine layer for hierarchical classification


Most traditional convolutional neural network (CNN)-based classification models are flat classifiers, which rest on the underlying assumption that all classes are equally difficult to distinguish. However, visual separability between object categories is highly uneven in the real world. Hierarchical classification has recently proven effective for CNNs, and a growing number of attempts have been made to exploit category hierarchies in CNN models. In this study, the authors propose a novel hierarchical CNN architecture, called the coarse-to-fine CNN. It is simple: a proposed coarse-to-fine layer is placed on top of a generic CNN. The coarse-to-fine layer is inspired by the Bayesian equation, whereby the coarse prediction directly influences the fine prediction. Arbitrary CNNs can perform hierarchical classification by adding the proposed layer. Training of a coarse-to-fine CNN is end-to-end and can be optimised by standard stochastic gradient descent. In the test phase, the network outputs multiple hierarchical predictions simultaneously. Experimental results on the benchmark datasets MNIST, CIFAR-10, and CIFAR-100 show clear advantages over the compared baselines.
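The abstract only sketches the mechanism, but a Bayes-style coarse-to-fine combination can be illustrated with a minimal NumPy sketch. This is an assumption about how such a layer might be wired, not the authors' implementation: the fine-class probability is computed as P(coarse parent) × P(fine | coarse parent), with the conditional term obtained by normalising fine logits within each coarse group. The function names and the `fine_to_coarse` mapping are hypothetical.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def coarse_to_fine(coarse_logits, fine_logits, fine_to_coarse):
    """Hypothetical coarse-to-fine combination inspired by Bayes' rule:
    P(fine) = P(coarse(fine)) * P(fine | coarse(fine)).

    coarse_logits:  (batch, n_coarse) scores for coarse categories
    fine_logits:    (batch, n_fine) scores for fine categories
    fine_to_coarse: (n_fine,) index of each fine class's coarse parent
    """
    p_coarse = softmax(coarse_logits)                     # P(coarse)
    p_fine_given = np.zeros_like(fine_logits)
    # Normalise fine logits within each coarse group -> P(fine | coarse)
    for c in np.unique(fine_to_coarse):
        mask = fine_to_coarse == c
        p_fine_given[:, mask] = softmax(fine_logits[:, mask])
    # Broadcast each fine class's parent probability and multiply
    return p_coarse[:, fine_to_coarse] * p_fine_given
```

Because the conditional probabilities sum to one inside each coarse group, the resulting fine distribution is a valid probability distribution, and both the coarse and fine heads remain differentiable for end-to-end SGD training.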


