Crowd counting by the dual-branch scale-aware network with ranking loss constraints

Image crowd counting is a challenging problem. This study proposes a new deep learning method that estimates crowd counts in congested scenes. The proposed network has two major components: the first ten layers of VGG16 serve as the backbone, and a dual-branch network (Branch_S and Branch_D) forms the second part. Branch_S extracts low-level information (head blobs) through a shallow fully convolutional network, while Branch_D uses a deep fully convolutional network to extract high-level context features (faces and bodies). Features learnt by the two branches address the scale variation caused by perspective effects and image size differences. The features of different scales extracted by the two branches are fused to generate the predicted density map. Because an original image must contain at least as many persons as any of its sub-images, a ranking loss function that exploits this constraint within an image is proposed. The ranking loss is combined with the Euclidean loss to form the final loss function. The approach is evaluated on three benchmark datasets and achieves better results than the state-of-the-art works.
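
The ranking constraint and the combined loss described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the hinge form of the penalty, the `margin` and `weight` parameters, and the assumption that sub-image counts come from separate predictions on crops are all illustrative choices that the abstract does not specify. Density maps are represented here as plain lists of rows.

```python
# Illustrative sketch only. All names, the hinge form, `margin` and `weight`
# are assumptions; the abstract gives no formula for the ranking loss.

def count(density):
    """Estimated person count: the sum of all density-map values."""
    return sum(sum(row) for row in density)


def euclidean_loss(pred, target):
    """Half the sum of squared pixel-wise differences between two maps."""
    return 0.5 * sum(
        (p - t) ** 2
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    )


def ranking_loss(full_count, sub_counts, margin=0.0):
    """Hinge penalty for every sub-image whose predicted count exceeds
    the count predicted for the full image that contains it."""
    return sum(max(0.0, c - full_count + margin) for c in sub_counts)


def total_loss(pred, target, sub_counts, weight=1.0, margin=0.0):
    """Euclidean loss plus a weighted ranking loss (weight is hypothetical)."""
    rank = ranking_loss(count(pred), sub_counts, margin)
    return euclidean_loss(pred, target) + weight * rank


# Toy 2x2 density maps; sub_counts stand in for counts predicted on crops.
pred = [[0.5, 0.5], [1.0, 0.0]]
target = [[1.0, 0.0], [1.0, 0.0]]
sub_counts = [1.5, 2.5]  # the second crop violates the constraint
loss = total_loss(pred, target, sub_counts)
```

In this toy case only the second crop is penalised, because its predicted count is larger than the count predicted for the full image. A crop whose count is at or below the full-image count contributes nothing to the ranking term.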

Inspec keywords: learning (artificial intelligence); convolutional neural nets; graph theory; feature extraction

Other keywords: shallow fully convolutional network; dual-branch scale-aware network; ranking loss function; Branch_S; deep learning method; congested scene; Branch_D; density map; image crowd counting; image size differences; high-level context features; VGG16; ranking loss constraints; Euclidean loss; deep fully convolutional network

Subjects: Combinatorial mathematics; Knowledge engineering techniques; Computer vision and image processing techniques; Image recognition; Neural computing techniques

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-cvi.2019.0704