Real-time keypoints detection for autonomous recovery of the unmanned ground vehicle

The combination of a small unmanned ground vehicle (UGV) and a large unmanned carrier vehicle allows more flexibility in real applications such as rescue in dangerous scenarios. The autonomous recovery system, which guides the small UGV back to the carrier vehicle, is an essential component for a seamless combination of the two vehicles. This study proposes a novel autonomous recovery framework with a low-cost monocular vision system to provide accurate positioning and attitude estimation of the UGV during navigation. First, the authors introduce a lightweight convolutional neural network called UGV-KPNet to detect the keypoints of the small UGV from the images captured by a monocular camera. UGV-KPNet is computationally efficient with a small number of parameters and delivers pixel-level accurate keypoint detection results in real time. Then, the six-degrees-of-freedom (6-DoF) pose is estimated from the detected keypoints to obtain the positioning and attitude of the UGV. In addition, the authors create the first large-scale real-world keypoint data set of the UGV. The experimental results demonstrate that the proposed system achieves state-of-the-art accuracy and speed on UGV keypoint detection, and can further boost 6-DoF pose estimation for the UGV.
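
The abstract does not spell out the second stage of the pipeline, so the following is only a minimal sketch of how 6-DoF pose can be recovered from detected 2D keypoints with a standard perspective-n-point (PnP) solver, here OpenCV's EPnP. The 3D keypoint model, camera intrinsics, and keypoint ordering are hypothetical placeholders, not values from the paper.

```python
import numpy as np
import cv2

# Hypothetical 3D keypoint model of the UGV in its body frame (metres).
# Real coordinates would come from the vehicle's CAD model or calibration.
MODEL_POINTS_3D = np.array([
    [ 0.30,  0.20, 0.00],   # front-left corner
    [ 0.30, -0.20, 0.00],   # front-right corner
    [-0.30, -0.20, 0.00],   # rear-right corner
    [-0.30,  0.20, 0.00],   # rear-left corner
    [ 0.00,  0.00, 0.15],   # antenna tip
], dtype=np.float64)

# Intrinsics of the monocular camera (assumed values; obtain via calibration).
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
DIST_COEFFS = np.zeros(5)  # assume negligible lens distortion

def estimate_pose(keypoints_2d: np.ndarray):
    """Recover the 6-DoF pose of the UGV from detected 2D keypoints.

    keypoints_2d: (N, 2) pixel coordinates, in the same order as
    MODEL_POINTS_3D, e.g. as produced by a keypoint detection network.
    Returns a 3x3 rotation matrix and a 3x1 translation vector mapping
    body-frame points into the camera frame.
    """
    ok, rvec, tvec = cv2.solvePnP(
        MODEL_POINTS_3D, keypoints_2d.astype(np.float64),
        K, DIST_COEFFS, flags=cv2.SOLVEPNP_EPNP)
    if not ok:
        raise RuntimeError("PnP failed; check keypoint ordering/quality")
    R, _ = cv2.Rodrigues(rvec)  # axis-angle -> rotation matrix
    return R, tvec
```

The rotation matrix and translation vector together give the attitude and position of the UGV relative to the camera, which is the information the recovery system needs to guide the vehicle back to the carrier.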

Inspec keywords: feature extraction; neural nets; cameras; pose estimation; robot vision; computer vision; image matching; mobile robots; object detection

Other keywords: large-scale real-world keypoints data set; UGV keypoint detection; pixel-level accurate keypoints detection results; low-cost monocular vision system; novel autonomous recovery framework; unmanned carrier vehicle; attitude estimation; real-time keypoints detection; UGV-KPNet; unmanned ground vehicle; autonomous recovery system

Subjects: Computer vision and image processing techniques; Neural nets; Mobile robots; Image recognition
