ABiFN: Attention-based bi-modal fusion network for object detection at night time

Camera-based object detection in low-light or night-time conditions is a fundamental problem because of insufficient lighting. So far, RGB and thermal images have been fused at mid level so that the features of each modality complement the other. In this work, an attention-based bi-modal fusion network (ABiFN) is proposed that improves object detection in the thermal domain by integrating a channel-wise attention module. The experimental results show that the proposed framework improves the mAP by 4.13 points on the FLIR dataset.
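
The abstract does not spell out the internals of the channel-wise attention module, so the following is only a minimal sketch of the general idea, assuming a squeeze-and-excitation style gate applied to concatenated RGB and thermal feature maps. The function names (`channel_attention`, `fuse_rgb_thermal`), the weight matrices `w1` and `w2`, the tensor sizes and the random inputs are all hypothetical and are not taken from the paper.

```python
import math
import random


def channel_attention(features, w1, w2):
    """Re-weight channels with a squeeze-and-excitation style gate.

    Assumed design, not the paper's exact module.
    features: list of C channels, each an H x W list of lists.
    w1: (C // r) x C bottleneck weights, w2: C x (C // r) expansion weights.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in features]
    # Excite: bottleneck layer with ReLU, then expansion with sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gate = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
            for row in w2]
    # Scale every value of a channel by that channel's gate in (0, 1).
    scaled = [[[g * v for v in row] for row in ch]
              for g, ch in zip(gate, features)]
    return scaled, gate


def fuse_rgb_thermal(rgb_feat, thermal_feat, w1, w2):
    """Mid-level fusion: concatenate along channels, then apply the gate."""
    stacked = rgb_feat + thermal_feat  # list concatenation gives 2C channels
    return channel_attention(stacked, w1, w2)


def random_tensor(rng, *shape):
    """Nested lists of Gaussian noise standing in for backbone features."""
    if len(shape) == 1:
        return [rng.gauss(0.0, 1.0) for _ in range(shape[0])]
    return [random_tensor(rng, *shape[1:]) for _ in range(shape[0])]


rng = random.Random(0)
C, H, W, r = 4, 3, 3, 2                      # hypothetical sizes
rgb_feat = random_tensor(rng, C, H, W)       # stand-in for RGB branch features
thermal_feat = random_tensor(rng, C, H, W)   # stand-in for thermal branch features
w1 = [[0.1 * v for v in row] for row in random_tensor(rng, 2 * C // r, 2 * C)]
w2 = [[0.1 * v for v in row] for row in random_tensor(rng, 2 * C, 2 * C // r)]

fused, gate = fuse_rgb_thermal(rgb_feat, thermal_feat, w1, w2)
print(len(fused), len(fused[0]), len(fused[0][0]))
```

In a trained network `w1` and `w2` would be learned, so the gate could emphasise thermal channels when the RGB features carry little signal at night; here they are random and only illustrate the data flow.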

Inspec keywords: video signal processing; object detection; cameras; sensor fusion; image enhancement; image colour analysis; infrared imaging

Other keywords: night time; camera-based object detection; insufficient lighting; attention-based bi-modal fusion network; channel-wise attention module

Subjects: Optical, image and video signal processing; Computer vision and image processing techniques; Video signal processing

http://iet.metastore.ingenta.com/content/journals/10.1049/el.2020.1952