Multi-view pose estimation with mixtures of parts and adaptive viewpoint selection

IET Computer Vision

We propose a new method for human pose estimation that leverages information from multiple views to impose a strong prior on articulated pose. The novelty of the method lies in the types of coherence it models. Consistency across views is maximised through different terms that model classical geometric information (coherence of the resulting poses) as well as appearance information, which is modelled as latent variables in the global energy function. Moreover, the adequacy of each view is assessed and its contribution is adjusted accordingly. Experiments on the HumanEva and Utrecht Multi-Person Motion (UMPM) datasets show that the proposed method significantly decreases the estimation error compared to single-view results.
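
The abstract does not give the energy function, so the following is only a minimal sketch of the general idea of geometric coherence with per-view weighting, not the authors' formulation. It assumes calibrated cameras given as 3x4 projection matrices and one 2D estimate of a joint per view. The function names (`triangulate_weighted`, `reprojection_error`, `geometric_energy`) and the use of weighted linear (DLT) triangulation are illustrative assumptions; the appearance terms and latent variables mentioned above are not modelled here.

```python
import numpy as np


def triangulate_weighted(projections, points_2d, weights):
    """Weighted linear (DLT) triangulation of one joint seen in several views.

    projections: list of 3x4 camera projection matrices (assumed known).
    points_2d:   list of (u, v) joint estimates, one per view.
    weights:     list of non-negative per-view weights (hypothetical
                 stand-in for a view adequacy score).
    """
    rows = []
    for P, (u, v), w in zip(projections, points_2d, weights):
        # Each view contributes two linear constraints on the homogeneous point.
        rows.append(w * (u * P[2] - P[0]))
        rows.append(w * (v * P[2] - P[1]))
    A = np.stack(rows)
    # The right singular vector of the smallest singular value minimises ||A X||.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]


def reprojection_error(P, X, point_2d):
    """Pixel distance between the projection of X and the 2D estimate."""
    x = P @ np.append(X, 1.0)
    return float(np.linalg.norm(x[:2] / x[2] - np.asarray(point_2d)))


def geometric_energy(projections, points_2d, weights):
    """Weighted sum of reprojection errors: low when the views agree."""
    X = triangulate_weighted(projections, points_2d, weights)
    errors = [reprojection_error(P, X, p) for P, p in zip(projections, points_2d)]
    return X, float(np.dot(weights, errors))
```

In this toy setting, a view whose 2D estimate is unreliable (for example because of occlusion) can be given a small weight so that it contributes less to both the triangulated point and the energy. How such weights would be derived is left open here, since the abstract only states that view adequacy is assessed.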

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-cvi.2017.0146