Extensive exploration of comprehensive vehicle attributes using D-CNN with weighted multi-attribute strategy

As a classical machine learning method, multi-task learning (MTL) has been widely applied in computer vision. Because deep convolutional neural networks (D-CNNs) have a strong capacity for feature representation, the combination of MTL and D-CNNs has recently attracted much attention from researchers. However, this combination has rarely been explored in the field of vehicle analysis. The authors propose a D-CNN enhanced with a weighted multi-attribute strategy for extensive exploration of comprehensive vehicle attributes in surveillance images. Specifically, for recognising the vehicle model and make/manufacturer, several related attributes are incorporated as auxiliary tasks in the training of the D-CNN. Unlike traditional MTL methods, which treat all involved tasks equally, the proposed strategy assigns different weights to the main task and the auxiliary tasks, so that training focuses more on the main task. To the best of their knowledge, this is the first report on combining a D-CNN with weighted MTL for the exploration of comprehensive vehicle attributes. Experiments show that the proposed approach outperforms the state-of-the-art method for vehicle recognition and improves the accuracy by about 10% for the analysis of other vehicle attributes on the recently released public CompCars dataset.

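The abstract describes the weighted multi-attribute strategy only at a high level, so the snippet below is a minimal PyTorch sketch of the general idea rather than the authors' implementation: a shared convolutional trunk feeds one classification head per task, and the loss weights the main task (vehicle model/make) more heavily than the auxiliary attribute tasks. The layer sizes, head names, class counts, and weight values are illustrative assumptions.

```python
import torch
import torch.nn as nn


class WeightedMultiAttributeCNN(nn.Module):
    """Shared convolutional trunk with one classification head per task.

    The main task (e.g. vehicle model) receives a larger loss weight than
    the auxiliary attribute tasks, so training focuses on the main task
    while still benefiting from the shared representation.
    """

    def __init__(self, num_main_classes, aux_class_counts,
                 main_weight=1.0, aux_weight=0.3):
        super().__init__()
        # Illustrative shared feature extractor; the exact D-CNN
        # architecture is not specified in the abstract.
        self.trunk = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.main_head = nn.Linear(64, num_main_classes)
        self.aux_heads = nn.ModuleList(nn.Linear(64, c) for c in aux_class_counts)
        self.main_weight = main_weight
        self.aux_weight = aux_weight
        self.ce = nn.CrossEntropyLoss()

    def forward(self, x):
        features = self.trunk(x)
        return self.main_head(features), [h(features) for h in self.aux_heads]

    def loss(self, main_logits, aux_logits, main_target, aux_targets):
        # Weighted sum: the main task dominates, while the auxiliary
        # attribute tasks act more like regularisers than equal partners.
        total = self.main_weight * self.ce(main_logits, main_target)
        for logits, target in zip(aux_logits, aux_targets):
            total = total + self.aux_weight * self.ce(logits, target)
        return total


# Example usage: a main task with 431 model classes and two auxiliary
# attribute tasks (e.g. door number, vehicle type); all counts are placeholders.
model = WeightedMultiAttributeCNN(num_main_classes=431, aux_class_counts=[5, 12])
images = torch.randn(8, 3, 224, 224)
main_logits, aux_logits = model(images)
loss = model.loss(main_logits, aux_logits,
                  torch.randint(0, 431, (8,)),
                  [torch.randint(0, 5, (8,)), torch.randint(0, 12, (8,))])
loss.backward()
```

In this sketch, choosing aux_weight smaller than main_weight is what distinguishes the weighted strategy from conventional MTL, which would sum all task losses with equal weights.
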
Inspec keywords: learning (artificial intelligence); feedforward neural nets; object recognition

Other keywords: multitask learning; surveillance images; comprehensive vehicle attributes; deep convolutional neural network; make recognition; MTL methods; weighted multiattribute strategy; manufacturer recognition; vehicle model recognition; CompCars vehicle dataset; D-CNN

Subjects: Image recognition; Computer vision and image processing techniques; Neural computing techniques

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-its.2017.0066