
Multi-view pose estimation with mixtures of parts and adaptive viewpoint selection

We propose a new method for human pose estimation that leverages information from multiple views to impose a strong prior on articulated pose. The novelty of the method lies in the types of coherence it models. Consistency across views is maximised through separate terms in a global energy function: classical geometric information (coherence of the resulting poses) and appearance information, which is modelled as latent variables. Moreover, the adequacy of each view is assessed and its contribution is adjusted accordingly. Experiments on the HumanEva and Utrecht Multi-Person Motion (UMPM) datasets show that the proposed method significantly decreases the estimation error compared to single-view results.
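To make the idea of a multi-view energy concrete, the following is a minimal sketch and not the formulation used in the paper. It assumes two calibrated views related by a known fundamental matrix F, treats the single-view scores as precomputed costs, and uses the point-to-epipolar-line distance as a stand-in for the geometric coherence term. The names epipolar_distance, two_view_energy, w_a, w_b and alpha are hypothetical, and the latent appearance variables mentioned in the abstract are left out entirely.

```python
import numpy as np


def epipolar_distance(x_a, x_b, F):
    """Distance (in pixels) from x_b in view B to the epipolar line of x_a.

    Assumption: F maps homogeneous points of view A to lines in view B.
    """
    xa_h = np.append(x_a, 1.0)  # homogeneous coordinates
    xb_h = np.append(x_b, 1.0)
    line = F @ xa_h             # line a*x + b*y + c = 0 in view B
    return abs(xb_h @ line) / np.hypot(line[0], line[1])


def two_view_energy(pose_a, pose_b, cost_a, cost_b, F,
                    w_a=1.0, w_b=1.0, alpha=1.0):
    """Toy energy to be minimised over candidate pose pairs.

    pose_a, pose_b: sequences of (x, y) joint positions, same joint order.
    cost_a, cost_b: single-view costs of the two poses (lower is better).
    w_a, w_b:       hypothetical per-view adequacy weights.
    alpha:          hypothetical weight of the geometric term.
    """
    geometric = sum(epipolar_distance(pa, pb, F)
                    for pa, pb in zip(pose_a, pose_b))
    return w_a * cost_a + w_b * cost_b + alpha * geometric


# Rectified stereo pair: epipolar lines are horizontal (y' = y).
F_RECTIFIED = np.array([[0.0, 0.0, 0.0],
                        [0.0, 0.0, -1.0],
                        [0.0, 1.0, 0.0]])

pose_a = [(10.0, 20.0), (30.0, 40.0)]
pose_b = [(50.0, 23.0), (70.0, 40.0)]  # first joint is 3 px off its line
energy = two_view_energy(pose_a, pose_b, cost_a=1.0, cost_b=2.0,
                         F=F_RECTIFIED, w_a=0.5, w_b=1.0, alpha=2.0)
```

In this toy setting, lowering w_a reduces the influence of a view judged less reliable, which is one simple way to read the adaptive weighting of views described above; how adequacy is actually assessed is not specified in the abstract.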
