Fast single shot multibox detector and its application on vehicle counting system


Real-time detection and counting of multiple vehicle types is a difficult problem. To solve it, this study presents an efficient method based on the Single Shot MultiBox Detector (SSD) to construct a vehicle detection and counting system. The proposed method, named Fast-SSD, first combines a Slim ResNet-34 backbone with the SSD framework. The authors then limit the location prediction at each cell of the feature map and modify the detection network. With an input image size of 300 × 300, Fast-SSD achieves 76.7 mAP on the PASCAL Visual Object Classes 2007 test set and runs at 20.8 FPS on a GTX 650 Ti. Furthermore, the authors obtain the centre point of each vehicle detected by the Fast-SSD model and set virtual loop detectors to specify the detection range. A vehicle is counted when its centre passes through a virtual loop detector. Results show that the vehicle detection accuracy reaches 99.3% and the classification accuracy reaches 98.9%.
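The counting step described in the abstract can be sketched as follows. This is a minimal, hypothetical illustration, not the authors' implementation: it assumes a detector (such as an SSD model) yields per-frame bounding boxes with stable track identifiers, and names such as `VirtualLoop` and `count_vehicles` are invented for this sketch. A vehicle is counted once, per class, the first time its box centre falls inside the virtual loop region.

```python
def box_centre(box):
    """Centre point (cx, cy) of an (x1, y1, x2, y2) bounding box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


class VirtualLoop:
    """Axis-aligned image region acting like an inductive loop detector."""

    def __init__(self, x1, y1, x2, y2):
        self.x1, self.y1, self.x2, self.y2 = x1, y1, x2, y2

    def contains(self, point):
        px, py = point
        return self.x1 <= px <= self.x2 and self.y1 <= py <= self.y2


def count_vehicles(frames, loop):
    """Count vehicles per class as their box centres enter the loop.

    `frames` is a list of per-frame detection lists; each detection is a
    (track_id, class_name, box) tuple. Each track is counted once, the
    first time its centre lies inside the virtual loop.
    """
    counted = set()   # track ids already counted
    counts = {}       # class name -> vehicle count
    for detections in frames:
        for track_id, cls, box in detections:
            if track_id in counted:
                continue
            if loop.contains(box_centre(box)):
                counted.add(track_id)
                counts[cls] = counts.get(cls, 0) + 1
    return counts
```

For example, a horizontal loop band placed across the lanes (`VirtualLoop(0, 200, 640, 240)` for a 640-pixel-wide frame) would count each tracked vehicle once as its centre crosses that band, regardless of how many frames it spends inside.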


