Action recognition based on motion of oriented magnitude patterns and feature selection

Here, the authors introduce a novel action-recognition system that couples the discriminative motion of oriented magnitude patterns (MOMP) descriptor with simple yet efficient techniques. The descriptor both captures the relations between local gradient distributions in neighbouring regions across consecutive frames and characterises how information changes across different orientations. The proposed system makes two main contributions: (i) the authors post-process the MOMP descriptors with principal component analysis (PCA) followed by vector of locally aggregated descriptors (VLAD) encoding, which de-correlates the features and reduces their dimension to speed up the algorithm; (ii) they then apply feature selection (i.e. statistical dependency, mutual information, and minimal redundancy maximal relevance) to identify the best feature subset, improving accuracy and lowering the computational cost of classification with support vector machine (SVM) techniques. Experimental results on four data sets, Weizmann (98.4%), KTH (96.3%), UCF Sport (82.0%), and HMDB51 (31.5%), demonstrate the efficiency of the authors' algorithm.
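
The pipeline described above maps onto standard components, so a compact sketch can make the data flow concrete. The following is a minimal illustration, not the authors' implementation: random arrays stand in for real MOMP descriptors, scikit-learn's mutual-information ranking stands in for the paper's three selection criteria (mRMR has no scikit-learn implementation), and all parameters (128-D local descriptors, 64 PCA components, a 16-word codebook, 256 selected features) are assumptions chosen only for illustration.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.cluster import KMeans
    from sklearn.feature_selection import SelectKBest, mutual_info_classif
    from sklearn.svm import LinearSVC

    rng = np.random.default_rng(0)

    # Stand-in data: one set of local MOMP-like descriptors per video.
    # (In the paper these would come from the MOMP extraction step.)
    videos = [rng.normal(size=(rng.integers(200, 400), 128)) for _ in range(40)]
    labels = rng.integers(0, 4, size=len(videos))

    # (i) PCA to de-correlate local descriptors and reduce their dimension.
    all_desc = np.vstack(videos)
    pca = PCA(n_components=64, whiten=True).fit(all_desc)

    # VLAD encoding: aggregate residuals against a k-means codebook.
    k = 16
    codebook = KMeans(n_clusters=k, n_init=10, random_state=0).fit(
        pca.transform(all_desc))

    def vlad(desc):
        d = pca.transform(desc)
        assign = codebook.predict(d)
        v = np.zeros((k, d.shape[1]))
        for i in range(k):
            if np.any(assign == i):
                v[i] = (d[assign == i] - codebook.cluster_centers_[i]).sum(axis=0)
        v = v.ravel()
        v = np.sign(v) * np.sqrt(np.abs(v))      # power normalisation
        return v / (np.linalg.norm(v) + 1e-12)   # L2 normalisation

    X = np.array([vlad(d) for d in videos])

    # (ii) Feature selection by mutual information, then SVM classification.
    selector = SelectKBest(mutual_info_classif, k=256).fit(X, labels)
    clf = LinearSVC(C=1.0).fit(selector.transform(X), labels)
    print(clf.score(selector.transform(X), labels))

Mutual information is only one of the three criteria the abstract names; statistical dependency and mRMR would replace the `SelectKBest` scoring step with their own rankings while leaving the rest of the pipeline unchanged.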
