Estimation of 3D human hand poses with structured pose prior

IET Computer Vision

Thank you

Your recommendation has been sent to your librarian.

Here, the authors present a multistage estimation model embedding a structured pose prior (SPP), a novel coarse-to-fine framework for real-time 3D hand pose estimation from a single depth image. The authors' main contributions can be summarised as follows: (i) they propose the SPP to enforce constraints on the canonical hand pose rather than on the original hand pose; (ii) they are the first to adopt an under-complete stacked denoising auto-encoder (SDA) to construct the pose prior, mapping the canonical hand pose to a latent representation, and they empirically validate that, when enforcing canonical-pose constraints, an under-complete SDA outperforms an over-complete SDA in improving estimation accuracy; (iii) they propose candidate keypoint patches (CKP) as intermediate data for further hand pose refinement. Experimental evaluation on two publicly available datasets shows that the model is competitive in both accuracy and computation time. In particular, the method ranks first in locating the palm keypoint on both datasets; accurate palm-keypoint localisation matters in many applications, for example guiding a manipulator to grasp objects at specific coordinates from the position of the human palm.
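The idea behind the under-complete denoising auto-encoder can be sketched in a few lines. The following is a minimal single-layer illustration, not the authors' stacked architecture: the 63-dimensional input (assuming 21 hand joints with 3 coordinates each), the 30-unit bottleneck, the learning rate, and the synthetic low-rank "poses" are all illustrative assumptions. A pose vector is corrupted with Gaussian noise, squeezed through a bottleneck narrower than the input (the "under-complete" property), and the network is trained to reconstruct the clean pose, so the latent code is forced to capture only plausible pose structure.

```python
import numpy as np

rng = np.random.default_rng(0)

D_IN, D_LATENT = 63, 30            # under-complete: latent dim < input dim
N, EPOCHS, LR, NOISE = 256, 300, 0.05, 0.1

# Synthetic stand-ins for canonical hand poses: low-rank data, so an
# under-complete code can capture the underlying structure.
basis = rng.normal(size=(10, D_IN))
X = rng.normal(size=(N, 10)) @ basis
X /= X.std()

# Encoder and decoder weights of a single auto-encoder layer.
W_enc = rng.normal(scale=0.1, size=(D_IN, D_LATENT))
b_enc = np.zeros(D_LATENT)
W_dec = rng.normal(scale=0.1, size=(D_LATENT, D_IN))
b_dec = np.zeros(D_IN)

def forward(x):
    h = np.tanh(x @ W_enc + b_enc)     # latent pose representation
    return h, h @ W_dec + b_dec        # reconstruction

_, x_hat0 = forward(X)
mse_init = float(((x_hat0 - X) ** 2).mean())

for _ in range(EPOCHS):
    # Denoising criterion: corrupt the input, reconstruct the CLEAN pose.
    x_noisy = X + NOISE * rng.normal(size=X.shape)
    h, x_hat = forward(x_noisy)
    err = x_hat - X
    # Full-batch gradient descent on the mean-squared reconstruction error.
    g_dec = h.T @ err / N
    g_h = (err @ W_dec.T) * (1.0 - h ** 2)   # tanh derivative
    g_enc = x_noisy.T @ g_h / N
    W_dec -= LR * g_dec
    b_dec -= LR * err.mean(axis=0)
    W_enc -= LR * g_enc
    b_enc -= LR * g_h.mean(axis=0)

_, x_hat = forward(X)
mse_final = float(((x_hat - X) ** 2).mean())
print(f"reconstruction MSE before/after training: {mse_init:.3f} -> {mse_final:.3f}")
```

Stacking several such layers (each trained on the previous layer's code) yields the SDA of the abstract; the reconstruction error of a candidate pose under the trained model can then act as a prior penalising implausible hand configurations.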

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-cvi.2018.5480