© The Institution of Engineering and Technology
Camera-based object detection in low-light or night-time conditions is a fundamental problem because of insufficient lighting. Existing approaches apply a mid-level fusion of RGB and thermal images so that the two modalities complement each other's features. In this work, an attention-based bi-modal fusion network is proposed for better object detection in the thermal domain by integrating a channel-wise attention module. Experimental results show that the proposed framework improves the mAP by 4.13 points on the FLIR dataset.
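The idea of fusing RGB and thermal features and reweighting them with channel-wise attention can be illustrated with a small sketch. This is not the network proposed in this work: it assumes a squeeze-and-excitation style gate with a single linear layer, and the function names, toy feature-map shapes, and random weights are all hypothetical stand-ins for learned parameters.

```python
import math
import random


def global_average_pool(features):
    """Squeeze step: reduce each HxW channel to its mean value."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in features]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(features, weights, bias):
    """Rescale each channel by a gate in (0, 1).

    weights (C x C) and bias (length C) are hypothetical parameters;
    in a trained network they would be learned.
    """
    pooled = global_average_pool(features)
    gates = [
        sigmoid(sum(w * p for w, p in zip(w_row, pooled)) + b)
        for w_row, b in zip(weights, bias)
    ]
    scaled = [[[g * v for v in row] for row in ch] for g, ch in zip(gates, features)]
    return scaled, gates


def fuse(rgb_features, thermal_features, weights, bias):
    """Mid-level fusion: concatenate along the channel axis, then reweight."""
    stacked = rgb_features + thermal_features
    return channel_attention(stacked, weights, bias)


# Toy example: 2 RGB channels and 2 thermal channels, each 2x2 (assumed sizes).
random.seed(0)
rgb = [[[random.random() for _ in range(2)] for _ in range(2)] for _ in range(2)]
thermal = [[[random.random() for _ in range(2)] for _ in range(2)] for _ in range(2)]
num_channels = len(rgb) + len(thermal)
weights = [[random.uniform(-1, 1) for _ in range(num_channels)] for _ in range(num_channels)]
bias = [0.0] * num_channels

fused, gates = fuse(rgb, thermal, weights, bias)
```

Each gate scales one whole channel, so a channel judged less informative (for example, an RGB channel at night) can be suppressed relative to the thermal channels without changing the spatial layout of the feature map.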