DR-Net: denoising and reconstruction network for 3D human pose estimation from monocular RGB videos

J.Y. Chang

Author(s): J.Y. Chang¹
- Affiliations: 1: Department of Electronics and Communications Engineering, Kwangwoon University, Seoul, Republic of Korea
Source: Volume 54, Issue 2, 25 January 2018, pp. 70–72
DOI: 10.1049/el.2017.3830, Print ISSN 0013-5194, Online ISSN 1350-911X

Received 10/10/2017, Published 06/12/2017

A method is presented for accurately estimating 2D and 3D human poses by simultaneously performing 2D pose denoising and 3D pose reconstruction from noisy 2D human pose sequences. The proposed approach globally modifies the input 2D poses that are locally estimated by recent convolutional neural network-based methods. The denoised 2D poses are efficiently converted into 3D poses in a bottom-up manner using a feed-forward network rather than by optimisation, which is frequently used in existing methods. The proposed denoising and reconstruction network is used with existing 2D human pose estimators to provide state-of-the-art 3D human pose estimation results for large-scale real datasets.
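
The data flow described in the abstract, denoising a whole 2D pose sequence and then lifting the denoised poses to 3D with a feed-forward pass, might be sketched as follows. This is only an illustrative sketch with untrained random weights: the class name `DenoiseThenLift`, the layer sizes, the residual correction in the denoiser, and the frame and joint counts are all assumptions and are not taken from the paper.

```python
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for a dense layer with small random weights."""
    scale = (2.0 / n_in) ** 0.5
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def linear(x, layer):
    """Apply a dense layer to a flat input vector."""
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(v):
    return [max(0.0, a) for a in v]


class DenoiseThenLift:
    """Hypothetical two-stage network: 2D sequence denoiser, then 2D-to-3D lifter."""

    def __init__(self, num_frames, num_joints, hidden=32, seed=0):
        rng = random.Random(seed)
        self.dim_2d = num_frames * num_joints * 2  # (x, y) per joint per frame
        self.dim_3d = num_frames * num_joints * 3  # (x, y, z) per joint per frame
        # Denoiser sees the whole sequence at once, so its correction is global.
        self.den_in = init_layer(self.dim_2d, hidden, rng)
        self.den_out = init_layer(hidden, self.dim_2d, rng)
        # Reconstruction stage maps denoised 2D poses to 3D poses.
        self.rec_in = init_layer(self.dim_2d, hidden, rng)
        self.rec_out = init_layer(hidden, self.dim_3d, rng)

    def denoise(self, noisy_2d):
        # Assumed design: predict a correction and add it to the input.
        correction = linear(relu(linear(noisy_2d, self.den_in)), self.den_out)
        return [x + c for x, c in zip(noisy_2d, correction)]

    def reconstruct(self, denoised_2d):
        # A single forward pass; no per-sequence optimisation loop.
        return linear(relu(linear(denoised_2d, self.rec_in)), self.rec_out)

    def forward(self, noisy_2d):
        if len(noisy_2d) != self.dim_2d:
            raise ValueError("expected %d values, got %d" % (self.dim_2d, len(noisy_2d)))
        denoised_2d = self.denoise(noisy_2d)
        return denoised_2d, self.reconstruct(denoised_2d)


# Usage: 5 frames of 15 joints, filled with placeholder noisy 2D detections.
rng = random.Random(1)
noisy = [rng.uniform(-1.0, 1.0) for _ in range(5 * 15 * 2)]
model = DenoiseThenLift(num_frames=5, num_joints=15)
denoised_2d, pose_3d = model.forward(noisy)
print(len(denoised_2d), len(pose_3d))  # prints: 150 225
```

In practice the input would come from an existing 2D pose estimator and both stages would be trained on paired 2D and 3D data; the sketch only shows how the two outputs relate in shape and order of computation.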
