Multi-group–multi-class domain adaptation for event recognition

In this study, the authors propose a multi-group–multi-class domain adaptation framework to recognise events in consumer videos by leveraging a large number of web videos. The framework extends the multi-class support vector machine by adding a novel data-dependent regulariser, which forces the event classifier to make consistent predictions on consumer videos. To obtain web videos, the authors search for them using several event-related keywords and refer to the videos returned by one keyword search as a group. For better performance, they also represent each video by the average of the convolutional neural network features of its frames. Comprehensive experiments on two real-world consumer video datasets demonstrate the effectiveness of their method for event recognition in consumer videos.
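
The video representation described above (one descriptor per video, obtained by averaging per-frame convolutional neural network features) can be illustrated with a minimal sketch. The function name `average_frame_features` and the toy three-dimensional vectors are assumptions made for illustration only; the abstract does not specify the network, the feature layer, or the feature dimensionality, and this is not the authors' implementation.

```python
from typing import List, Sequence


def average_frame_features(frame_features: Sequence[Sequence[float]]) -> List[float]:
    """Return the element-wise mean of per-frame feature vectors.

    Hypothetical helper: each inner sequence stands in for the CNN feature
    vector of one video frame.
    """
    if not frame_features:
        raise ValueError("need at least one frame feature vector")
    dim = len(frame_features[0])
    if any(len(f) != dim for f in frame_features):
        raise ValueError("all frame feature vectors must have the same length")
    n = len(frame_features)
    # Average each feature dimension across all frames of the video.
    return [sum(f[d] for f in frame_features) / n for d in range(dim)]


# Toy stand-in for the CNN features of three frames (assumed values;
# real CNN features would have far more dimensions).
frames = [[1.0, 0.0, 2.0], [3.0, 2.0, 2.0], [2.0, 4.0, 2.0]]
video_descriptor = average_frame_features(frames)
```

Averaging yields a fixed-length descriptor regardless of how many frames a video has, so videos of different durations can be fed to the same classifier.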

Inspec keywords: neural nets; video signal processing; support vector machines

Other keywords: multiclass SVM; Web videos; event recognition; neural networks; consumer videos; multigroup-multi-class domain adaptation; data-dependent regulariser; video representation

Subjects: Knowledge engineering techniques; Neural computing techniques; Optical, image and video signal processing; Video signal processing

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-cvi.2014.0405