Trimmed categorical cross-entropy for deep learning with label noise

Electronics Letters

Thank you

Your recommendation has been sent to your librarian.

Deep learning methods are nowadays considered the state-of-the-art approach to many sophisticated problems, such as computer vision, speech understanding or natural language processing. However, their performance relies on the quality of large annotated datasets. If the data are not well annotated and label noise occurs, such data-driven models become less reliable. In this Letter, the authors present a very simple way to make the training process robust to noisy labels. Without changing the network architecture or the learning algorithm, the authors apply a modified error measure that improves network generalisation when the network is trained with label noise. Preliminary results obtained for deep convolutional neural networks, trained with the novel trimmed categorical cross-entropy loss function, revealed improved robustness for several levels of label noise.
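
The abstract does not state the formula for the trimmed loss, so the following is only an illustrative sketch under an assumption: that "trimmed" is meant in the sense of trimmed estimators from robust statistics, where the largest per-sample errors in a batch are discarded before averaging. The function name, the `trim_fraction` parameter and its default value are hypothetical and do not come from the Letter.

```python
import numpy as np


def trimmed_categorical_cross_entropy(probs, labels, trim_fraction=0.2, eps=1e-12):
    """Mean cross-entropy over the samples with the smallest losses.

    ASSUMPTION: trimming drops the `trim_fraction` of samples with the
    largest per-sample loss. This is an illustrative reading of the loss
    named in the abstract, not the authors' published definition.

    probs  : (n_samples, n_classes) predicted class probabilities
    labels : (n_samples,) integer class labels, possibly noisy
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n = probs.shape[0]

    # Standard categorical cross-entropy for each sample.
    true_class_probs = probs[np.arange(n), labels]
    per_sample = -np.log(np.clip(true_class_probs, eps, 1.0))

    # Keep only the smallest losses; at least one sample is always kept.
    n_keep = max(1, int(np.ceil((1.0 - trim_fraction) * n)))
    kept = np.sort(per_sample)[:n_keep]
    return kept.mean()


# Toy batch: the last sample carries a label that contradicts a confident
# prediction, as a mislabelled example might.
probs = [[0.90, 0.10], [0.80, 0.20], [0.10, 0.90], [0.95, 0.05]]
labels = [0, 0, 1, 1]

plain = trimmed_categorical_cross_entropy(probs, labels, trim_fraction=0.0)
trimmed = trimmed_categorical_cross_entropy(probs, labels, trim_fraction=0.25)
print(plain, trimmed)
```

With `trim_fraction=0.0` the function reduces to the ordinary mean categorical cross-entropy. With a positive fraction, the samples whose labels disagree most strongly with the current prediction no longer contribute to the average, which is one way a loss can become less sensitive to mislabelled examples without changing the network architecture or the optimiser.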


