
Human pose estimation method based on single depth image



Many current depth-image-based human pose estimation methods require a training stage; however, building the training samples demands substantial effort, and many methods perform poorly when parts of the body are occluded. In this study, a novel approach to estimating human pose from a single depth image, called model-based recursive matching (MRM), is introduced. A human skeleton model with customised parameters is created from a T-pose so that it fits different body types. The authors use the depth image and its corresponding 3D point cloud as input. In contrast to previous work, the proposed method avoids a training step and gives an accurate estimate even under human occlusion. The method is demonstrated by comparison with the random-forest-based method offered by Kinect on 20 human poses, with the ground-truth joint coordinates provided by a motion capture system. The results show that the proposed method not only works well on general human poses but also handles occlusion better. The authors' method can also be applied to disabled people and to other creatures.
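The method takes a depth image and its corresponding 3D point cloud as input. A minimal sketch of the standard back-projection step (pinhole camera model; the function name, the Kinect-like intrinsics, and the synthetic depth map are illustrative assumptions, not details from the paper):

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth image (in metres) into a 3D point cloud
    using the pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    # Stack into an (H*W, 3) array and drop invalid (zero-depth) pixels.
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]

# Synthetic 4x4 depth map, every pixel 2 m from the camera.
depth = np.full((4, 4), 2.0)
pts = depth_to_point_cloud(depth, fx=525.0, fy=525.0, cx=2.0, cy=2.0)
print(pts.shape)  # (16, 3)
```

Each valid pixel yields one 3D point; the resulting cloud is what a model-fitting stage such as MRM would match the skeleton model against.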


