Action recognition from mutually incoherent pose bases in static image

IET Computer Vision

Action recognition in static images is challenging. The authors propose mutually incoherent pose bases, which are implicit poselet co-occurrences learned by dictionary training to describe body pose. The poselets in a pose basis are not constrained in spatial layout or in number, so a pose basis can describe body pose more flexibly than a k-poselet. In their method, the body pose in an image is represented by a sparse linear combination of pose bases, because pose varies over the course of an action while each image captures only a snapshot from a single viewpoint. In dictionary training, the challenge is to stabilise the sparse representation, which is the input to a support vector machine (SVM) for action recognition, because the original pose signal is ambiguous and the dictionary is an overcomplete matrix. Their solution is to add cumulative coherence as a penalty in the objective function, which induces the pose bases to become mutually incoherent. They evaluate the method on two popular datasets, and the experimental results show that the pose representation achieves encouraging performance in action recognition. Furthermore, they empirically explore the complementary role of the local pose feature alongside deep convolutional neural network features extracted from the holistic image. The experimental results demonstrate a substantial performance improvement from concatenating the two features.
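
To make the coherence penalty more concrete, the following is a minimal sketch of how cumulative coherence (the Babel function from the sparse approximation literature) can be computed for a dictionary. The abstract does not state the exact form of the authors' objective function, so this only illustrates the quantity being penalised; the function name `cumulative_coherence`, the column-wise atom layout, and the parameter `k` are assumptions made for illustration.

```python
import numpy as np


def cumulative_coherence(D, k):
    """Cumulative coherence mu_1(k) of a dictionary D (hypothetical helper).

    Columns of D are treated as atoms (here, pose bases). For each atom,
    the k largest absolute inner products with the other atoms are summed;
    the result is the maximum of these sums over all atoms.
    """
    # Normalise every atom to unit L2 norm so inner products are cosines.
    D = D / np.linalg.norm(D, axis=0, keepdims=True)
    # Absolute Gram matrix; zero the diagonal to exclude self-similarity.
    G = np.abs(D.T @ D)
    np.fill_diagonal(G, 0.0)
    # Sort each row ascending and keep its k largest off-diagonal entries.
    top_k = np.sort(G, axis=1)[:, -k:]
    return float(top_k.sum(axis=1).max())


# An orthonormal dictionary is perfectly incoherent.
orthonormal = np.eye(3)
# A dictionary with a repeated atom is maximally coherent for that pair.
redundant = np.array([[1.0, 1.0, 0.0],
                      [0.0, 0.0, 1.0]])

incoherent_value = cumulative_coherence(orthonormal, k=2)
coherent_value = cumulative_coherence(redundant, k=1)
```

A lower value means the atoms overlap less, which is the property the penalty is intended to encourage; in a training loop this value would be weighted and added to the reconstruction and sparsity terms, with the weighting chosen by the practitioner.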

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-cvi.2017.0233