Skeleton-based human activity recognition for elderly monitoring systems
Authors: Youssef Hbali 1; Sara Hbali 1; Lahoucine Ballihi 2; Mohammed Sadgal 1

Affiliations:
1: Computer Systems Engineering Laboratory, Cadi Ayyad University, B.P. 2390, Avenue Prince My Abdellah, Marrakech, Morocco
2: LRIT-CNRST URAC 29, Faculty of Sciences, Mohammed V University in Rabat, Rabat, Morocco
Source: IET Computer Vision, Volume 12, Issue 1, February 2018, pp. 16–26
DOI: 10.1049/iet-cvi.2017.0062; Print ISSN 1751-9632; Online ISSN 1751-9640
Demand for systems that monitor elderly people is increasing significantly in the health-care sector. As the population ages, concerns over patient privacy and the cost of elderly assistance have driven the research community toward computer vision and image processing to design and deploy new systems for monitoring the elderly and turning their homes into smart environments. Exploiting recent advances in low-cost three-dimensional (3D) depth sensors such as the Microsoft Kinect, the authors propose a new skeleton-based approach that describes the spatio-temporal aspects of a human activity sequence using the Minkowski and cosine distances between the 3D joints. The approach was trained and validated on the Microsoft MSR 3D Action and MSR Daily Activity 3D datasets using the Extremely Randomised Trees algorithm. The results are very promising and demonstrate that the trained model can be used to build an elderly-monitoring system from open-source libraries and a low-cost depth sensor.
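The pipeline the abstract describes can be sketched in a few lines: pairwise Minkowski and cosine distances between the 3D joints of each skeleton frame form a feature vector, which is then classified with Extremely Randomised Trees. The sketch below is a minimal illustration under assumed names, not the authors' implementation: it uses scikit-learn's `ExtraTreesClassifier` and SciPy's distance functions, uses random stand-in data in place of the MSR datasets, and treats each sequence as a single frame (the paper aggregates features over the temporal sequence).

```python
import numpy as np
from scipy.spatial.distance import minkowski, cosine
from sklearn.ensemble import ExtraTreesClassifier

def frame_features(joints, p=2):
    """Pairwise Minkowski (order p) and cosine distances between 3D joints."""
    n = len(joints)
    feats = []
    for i in range(n):
        for j in range(i + 1, n):
            feats.append(minkowski(joints[i], joints[j], p))
            feats.append(cosine(joints[i], joints[j]))
    return np.array(feats)

# Stand-in data: 10 "sequences", each reduced to one 20-joint frame
rng = np.random.default_rng(0)
X = np.array([frame_features(rng.normal(size=(20, 3))) for _ in range(10)])
y = rng.integers(0, 2, size=10)  # dummy activity labels

# Extremely Randomised Trees classifier on the distance features
clf = ExtraTreesClassifier(n_estimators=100, random_state=0).fit(X, y)
pred = clf.predict(X[:1])
```

With 20 joints per frame, each feature vector holds 2 × C(20, 2) = 380 distances; a real system would stack such vectors over the frames of a Kinect skeleton sequence before classification.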
Inspec keywords: image sensors; home computing; handicapped aids; image sequences; image recognition
Other keywords: MSR Daily Activity 3D dataset; low-cost depth sensor; skeleton-based approach; computer vision; aging population; open-source libraries; Minkowski distance; image processing; elderly monitoring systems; skeleton-based human activity recognition; Microsoft MSR 3D Action dataset; extremely randomised trees algorithm; human activity sequence; 3D depth sensors; cosine distance; Microsoft Kinect
Subjects: Computer assistance for persons with handicaps; Image sensors; Home computing; Aids for the handicapped; Computer vision and image processing techniques; Image recognition