ADFNet: accumulated decoder features for real-time semantic segmentation
- Author(s): Hyunguk Choi 1 ; Hoyeon Ahn 1 ; Joonmo Kim 2 ; Moongu Jeon 1
Affiliations:
1: School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, 123 Cheomdangwagi-ro, Buk-gu, Gwangju, 61005, Republic of Korea
2: Department of Computer Engineering, Dankook University, 152, Jukjeon-ro, Suji-gu, Yongin-si, Gyeonggi-do, 16890, Republic of Korea
- Source: Volume 14, Issue 8, December 2020, p. 555–563
DOI: 10.1049/iet-cvi.2019.0289, Print ISSN 1751-9632, Online ISSN 1751-9640
Semantic segmentation is a key technology in autonomous driving, and it must run in real time with high performance to protect pedestrians and passengers. To improve the performance of deep neural networks that operate in real time, the authors propose a simple and efficient method called ADFNet, which uses accumulated decoder features. ADFNet relies only on decoder information, without skip connections between the encoder and decoder. They demonstrate that the performance of ADFNet is superior to that of state-of-the-art methods, including the baseline network, on the Cityscapes dataset. Further, they analyse the results obtained with ADFNet using class activation maps and RGB representations of the image segmentation results.
Inspec keywords: image segmentation; image colour analysis; neural nets
Other keywords: class activation maps; autonomous driving; RGB representations; accumulated decoder features; pedestrians; decoder information; encoder; real-time semantic segmentation; deep neural networks; ADFNet; image segmentation
Subjects: Neural computing techniques; Optical, image and video signal processing; Computer vision and image processing techniques
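The abstract does not say how the decoder features are accumulated, so the following is only a minimal sketch of one possible reading, not the authors' implementation. It assumes that each decoder stage yields a single-channel feature map, that stages are ordered from coarse to fine, and that accumulation means upsampling every stage to the finest resolution and summing element-wise. The function names `upsample_nearest` and `accumulate_decoder_features` are hypothetical. No encoder features are used, in line with the abstract's statement that there are no encoder-decoder skip connections.

```python
from typing import List

Grid = List[List[float]]  # one single-channel feature map (assumption)


def upsample_nearest(feat: Grid, factor: int) -> Grid:
    """Nearest-neighbour upsampling of a 2D grid by an integer factor."""
    out: Grid = []
    for row in feat:
        # Repeat every value `factor` times along the width.
        wide = [v for v in row for _ in range(factor)]
        # Repeat the widened row `factor` times along the height.
        out.extend(list(wide) for _ in range(factor))
    return out


def accumulate_decoder_features(stages: List[Grid]) -> Grid:
    """Sum decoder stage outputs at the resolution of the last (finest) stage.

    Assumes square-compatible shapes: the finest height is an integer
    multiple of every stage height, with the same factor for the width.
    """
    target_h = len(stages[-1])
    target_w = len(stages[-1][0])
    acc = [[0.0] * target_w for _ in range(target_h)]
    for feat in stages:
        factor = target_h // len(feat)
        up = upsample_nearest(feat, factor)
        for i, row in enumerate(up):
            for j, v in enumerate(row):
                acc[i][j] += v
    return acc


# Toy decoder outputs at 1x1, 2x2 and 4x4 resolution (made-up values).
stages = [
    [[1.0]],
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0] * 4 for _ in range(4)],
]
fused = accumulate_decoder_features(stages)
for row in fused:
    print(row)
```

In a real network the accumulation would act on multi-channel tensors and would probably use learned upsampling or convolutions. The sketch only shows that coarse decoder information can reach the final prediction without any encoder-to-decoder connection.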