Deep convolution network for dense crowd counting

Estimating the total number of people in a crowded scene is a challenging task because of the numerous occlusions and perspective changes present in crowd images. To address this issue, the authors propose a new deep learning framework for accurate and efficient crowd counting. Inspired by the multi-column convolutional neural network (MCNN) and the contextual pyramid convolutional neural network (CP-CNN), the authors combine a two-branch convolutional neural network (CNN) with transposed convolutional layers to generate a high-quality density map. The two-branch CNN used for feature extraction generates a density map that is only a quarter of the size of the original image. A set of transposed convolutional layers and convolutional layers is then combined with the network to compensate for the loss of detail in the density map caused by stacked pooling. Compared with MCNN and CP-CNN, the authors' approach employs fewer branches and a simpler architecture. Experimental results show that their approach achieves an MAE of 80.7 and an MSE of 131.2 on the ShanghaiTech Part A dataset, an MAE of 15.6 and an MSE of 26.8 on the ShanghaiTech Part B dataset, and an average MAE of 7.1 on the WorldExpo'10 dataset.
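The size bookkeeping and the reported metrics can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not state: "a quarter of the size" is read as a quarter along each side, produced by two 2×2 pooling steps; the transposed convolutions are given a kernel of 4, stride of 2 and padding of 1; and MSE is computed as the square root of the mean squared count error, as is common in crowd-counting papers such as MCNN. The function names, the 768×1024 input size and the toy counts are hypothetical.

```python
import math


def pooled_size(size, kernel=2, stride=2):
    """Output length of an unpadded pooling layer (floor mode)."""
    return (size - kernel) // stride + 1


def transposed_conv_size(size, kernel=4, stride=2, padding=1, output_padding=0):
    """Output length of a transposed convolution with dilation 1."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


def mae(pred_counts, true_counts):
    """Mean absolute error between predicted and ground-truth counts."""
    errors = [abs(p - t) for p, t in zip(pred_counts, true_counts)]
    return sum(errors) / len(errors)


def rmse(pred_counts, true_counts):
    """Root of the mean squared count error (assumed meaning of 'MSE')."""
    squared = [(p - t) ** 2 for p, t in zip(pred_counts, true_counts)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical input resolution, chosen only for illustration.
height, width = 768, 1024

# Two assumed 2x2 poolings shrink each side by a factor of four.
h, w = height, width
for _ in range(2):
    h, w = pooled_size(h), pooled_size(w)
quarter = (h, w)

# Two assumed stride-2 transposed convolutions restore the input size.
for _ in range(2):
    h, w = transposed_conv_size(h), transposed_conv_size(w)
restored = (h, w)

# Toy per-image counts (made up, not results from the paper).
predicted = [100, 250, 40]
ground_truth = [110, 240, 45]
toy_mae = mae(predicted, ground_truth)
toy_rmse = rmse(predicted, ground_truth)

print(quarter, restored, round(toy_mae, 3), round(toy_rmse, 3))
```

With these assumed settings each transposed convolution exactly doubles the side length, which is why two of them undo two poolings. Other kernel, stride and padding combinations would need `output_padding` or cropping to match the input size.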


