In this study, the authors propose a new method for part-based human pose estimation. The key idea of the authors' method is to improve the localisation accuracy of leaf parts, an issue largely ignored by previous studies, by incorporating both local and non-local contextual information into the model. In particular, they use the local contextual information to reduce or eliminate the influence of noise, while the non-local contextual information helps to improve the detection accuracy of the leaf parts. Since more accurate part localisation usually yields a more reasonable active set of spatial constraints, this potentially enhances the effectiveness of the subsequent optimisation procedure. Furthermore, they keep the basic structure of the tree-based model, thereby retaining its conceptual simplicity and computationally efficient inference. Their experiments on two challenging real-world datasets demonstrate the feasibility and effectiveness of the proposed method.

Part-based pose estimation with local and non-local contextual information
