© The Institution of Engineering and Technology
Visual object tracking is a challenging task due to two intractable problems: visual appearance representation and online model updating. Existing approaches often build the appearance model on hand-crafted features with discriminative feature selection, and formulate the tracking learning model as binary classification. However, some issues remain to be addressed. First, sufficient information for online feature selection is often unavailable. Second, these algorithms do not exploit the structural information between the object and the background. In this study, the authors propose a data-driven tracking (DDT) algorithm with an appearance model that exploits prior visual target representation via a binary PCANet. Their speed-up strategy, which applies binary operations to the convolution filters, is efficient for the tracking task with little performance loss. They formulate the learning model as a multi-class task via online LPBoost. The DDT algorithm performs favourably on various challenging sequences when evaluated against state-of-the-art trackers.
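The binary PCANet idea summarised above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes PCANet-style filters are the top principal components of mean-removed image patches, and that the speed-up comes from replacing real-valued filter weights with their signs so that convolution reduces to additions and subtractions. All function names and sizes here are illustrative.

```python
import numpy as np

def learn_pca_filters(patches, num_filters):
    """Learn PCANet-style filters: the leading principal components of
    patch-mean-removed patch vectors, reshaped into 2-D kernels."""
    # patches: (n, k, k) array of image patches
    n, k, _ = patches.shape
    x = patches.reshape(n, -1)
    x = x - x.mean(axis=1, keepdims=True)      # remove per-patch mean
    cov = x.T @ x / n                          # patch covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    top = eigvecs[:, ::-1][:, :num_filters]    # keep the largest components
    return top.T.reshape(num_filters, k, k)

def binarise(filters):
    """Binarise filter weights to +1/-1 (0 stays 0), so that filter
    responses can be computed without floating-point multiplications."""
    return np.sign(filters).astype(np.int8)

# Toy example: learn 4 binary 5x5 filters from random patches.
rng = np.random.default_rng(0)
patches = rng.standard_normal((500, 5, 5))
filters = learn_pca_filters(patches, num_filters=4)
bin_filters = binarise(filters)
```

In a tracker, the binarised filters would be convolved with candidate image regions to produce the appearance features fed to the online classifier; the abstract reports that this binarisation costs little tracking accuracy.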