
Multi-layer fusion techniques using a CNN for multispectral pedestrian detection



In this study, a novel multi-layer fused convolutional neural network (MLF-CNN) is proposed for detecting pedestrians under adverse illumination conditions. Most existing pedestrian detectors fail under adverse illumination such as shadow, overexposure, or night-time. To detect pedestrians under such conditions, the authors apply deep learning to effectively fuse the visible and thermal information in multispectral images. The MLF-CNN consists of a proposal generation stage and a detection stage. In the first stage, they design an MLF region proposal network and use a summation fusion method to integrate two convolutional layers; this combination can detect pedestrians at different scales, even under adverse illumination. Furthermore, instead of extracting features from a single layer, they extract features from three feature maps and match their scales using fused ROI pooling layers. This multi-layer fusion technique significantly reduces the detection miss rate. Extensive evaluations on several challenging datasets demonstrate that the approach achieves state-of-the-art performance. For example, on the KAIST multispectral pedestrian dataset, the method improves detection accuracy by 28.62% over the baseline method and by 11.35% over the well-known faster R-CNN halfway fusion method.
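The summation fusion mentioned in the abstract adds the visible and thermal feature maps element-wise before further processing. The sketch below is an illustrative NumPy toy, not the authors' implementation (the function name, the channel-matching 1×1 projection, and all shapes are assumptions for demonstration only):

```python
import numpy as np

def summation_fusion(visible_feat, thermal_feat, proj=None):
    """Element-wise summation fusion of two feature maps.

    visible_feat, thermal_feat: arrays shaped (channels, height, width).
    proj: optional (out_channels, in_channels) matrix acting as a 1x1
          convolution to match channel counts before summing.
    """
    if proj is not None:
        # Apply the 1x1 projection to the thermal stream:
        # contract over the channel axis, keeping spatial dims.
        thermal_feat = np.einsum("oc,chw->ohw", proj, thermal_feat)
    assert visible_feat.shape == thermal_feat.shape, "shapes must match to sum"
    return visible_feat + thermal_feat

# Toy feature maps with 4 channels on an 8x8 spatial grid.
vis = np.ones((4, 8, 8))
thm = np.full((4, 8, 8), 2.0)
fused = summation_fusion(vis, thm)          # every element is 1 + 2 = 3
assert fused.shape == (4, 8, 8)
```

Summation fusion keeps the channel count unchanged (unlike concatenation, which doubles it), so the layers downstream of the fusion point need no architectural change.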
