Image semantic segmentation has long been a research hotspot in robotics. Its purpose is to assign a semantic category label to each object by segmenting the different objects in an image. In practical applications, however, robots need to know not only the semantic category of an object but also its position in order to complete more complex visual tasks. Targeting complex indoor environments, this study designs an image semantic segmentation network framework that is combined with target detection. By adding a semantic segmentation branch that runs in parallel with the target detection network, it implements a multi-vision task that combines object classification, detection and semantic segmentation. A new loss function is designed, training is adjusted using the idea of transfer learning, and the method is verified on a self-built indoor scene data set. The experiments show that the method in this study is feasible and effective, and has good robustness.
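The abstract does not state the form of the new loss function. As a minimal sketch only, the following assumes a common multi-task formulation in which the classification, box regression and segmentation losses are combined as a weighted sum. All function names, the choice of smooth L1 and cross-entropy terms, and the weights `w_cls`, `w_box` and `w_seg` are hypothetical and are not taken from the paper.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of one softmax prediction against an integer class label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def smooth_l1(pred_box, true_box):
    """Smooth L1 loss summed over the box coordinates."""
    total = 0.0
    for p, t in zip(pred_box, true_box):
        d = abs(p - t)
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total


def pixel_cross_entropy(pixel_logits, pixel_labels):
    """Mean softmax cross-entropy over all pixels of the segmentation branch."""
    losses = [softmax_cross_entropy(l, y)
              for l, y in zip(pixel_logits, pixel_labels)]
    return sum(losses) / len(losses)


def joint_loss(cls_logits, cls_label, pred_box, true_box,
               pixel_logits, pixel_labels,
               w_cls=1.0, w_box=1.0, w_seg=1.0):
    """Weighted sum of the three task losses (assumed form, not the paper's)."""
    return (w_cls * softmax_cross_entropy(cls_logits, cls_label)
            + w_box * smooth_l1(pred_box, true_box)
            + w_seg * pixel_cross_entropy(pixel_logits, pixel_labels))


if __name__ == "__main__":
    # Toy example: 2 object classes, one box, a 2-pixel segmentation map.
    loss = joint_loss(
        cls_logits=[0.0, 0.0], cls_label=1,
        pred_box=[0.0, 0.0, 1.0, 1.0], true_box=[0.0, 0.0, 1.0, 1.0],
        pixel_logits=[[0.0, 0.0], [0.0, 0.0]], pixel_labels=[0, 1],
    )
    print(loss)
```

In a formulation of this kind, the weights let training balance the detection and segmentation branches, which share a backbone and are optimised together.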

Jointly network image processing: multi-task image semantic segmentation of indoor scene based on CNN
