Survey on deep learning methods in human action recognition

This survey addresses one of the central issues in human action recognition: how to create data representations with a high level of abstraction from high-dimensional, noisy video data. Most recent successful studies in this area focus on deep learning, and deep learning methods have gained superiority over other approaches in image recognition. The authors first investigate the role of deep learning in both image and video processing and recognition. Owing to the variety and abundance of deep learning methods, the authors discuss them in a comparative form, presenting an analytical framework that classifies and evaluates these methods against a set of important functional measures. Furthermore, a categorisation of the state-of-the-art deep learning approaches for human action recognition is presented; the authors summarise the most significant related works under each approach and discuss their performance.
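The representation problem the survey describes — pooling information across both space and time in a video volume — is the core idea behind 3D convolutional networks for action recognition. The following is a minimal numpy sketch of that idea, not taken from any surveyed work: a single hand-crafted 3D filter applied to a synthetic video, where a trained network would instead learn many such filters.

```python
import numpy as np

def conv3d_valid(video, kernel):
    """Naive 3D convolution (valid padding) over a (T, H, W) video volume.

    A spatio-temporal filter aggregates evidence across neighbouring
    frames as well as neighbouring pixels, which is what lets 3D CNNs
    capture motion cues that 2D image filters cannot.
    """
    t, h, w = video.shape
    kt, kh, kw = kernel.shape
    out = np.zeros((t - kt + 1, h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            for k in range(out.shape[2]):
                out[i, j, k] = np.sum(video[i:i + kt, j:j + kh, k:k + kw] * kernel)
    return out

# Synthetic 8-frame, 16x16 "video" and a 3x3x3 temporal-difference filter
# that responds to change over time, i.e. motion.
video = np.random.rand(8, 16, 16)
kernel = np.zeros((3, 3, 3))
kernel[0] = -1.0
kernel[2] = 1.0

features = conv3d_valid(video, kernel)
print(features.shape)  # → (6, 14, 14)
```

In practice the filter bank is learned end-to-end and stacked into many layers (as in the C3D-style networks the survey covers); this sketch only shows the shape of the computation.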

Inspec keywords: learning (artificial intelligence); video signal processing; image motion analysis; image recognition; data structures

Other keywords: video processing; data representations; video recognition; deep learning methods; noisy video data; image recognition; human action recognition

Subjects: Video signal processing; File organisation; Image recognition; Knowledge engineering techniques; Computer vision and image processing techniques

DOI: 10.1049/iet-cvi.2016.0355