Rendezvous planning for multiple autonomous underwater vehicles using a Markov decision process

Multiple autonomous underwater vehicles (AUVs) are a potential alternative to conventional large manned vessels for mine countermeasure (MCM) operations. Online mission planning for a cooperative multi-AUV network often relies on predefined contingencies or reactive methods and does not deliver optimal end-goal performance. The Markov decision process (MDP) is a decision-making framework that yields an optimal solution by accounting for estimates of future decisions rather than taking a myopic view. However, most real-world problems are too complex to be represented directly in this framework. The authors address the complexity problem by abstracting the MCM scenario into a reduced state and action space, while retaining the information that defines the goal and the constraints imposed by the application. Another critical part of the model is the vehicles' ability to communicate, which enables a cooperative mission; for this, the authors use the rendezvous point (RP) method, which schedules meeting points for the vehicles throughout the mission. The resulting model provides an optimal action-selection solution for the multi-AUV MCM problem, and the mission plan is computed in the order of minutes, demonstrating that the model is feasible for real-time applications.
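The key idea is that a sufficiently abstracted MDP can be solved exactly. As a rough illustration of what solving such a model involves, the Python sketch below runs value iteration over a toy state space of (targets remaining, time left) with a choice between a "survey" and a "rendezvous" action. The state space, transition probabilities, and rewards here are invented placeholders for illustration only, not the authors' model.

    # Minimal value-iteration sketch over an abstracted MDP.
    # All quantities below (states, actions, probabilities, rewards)
    # are hypothetical placeholders, not the model from the paper.
    import itertools

    MAX_TARGETS, HORIZON = 3, 4
    STATES = list(itertools.product(range(MAX_TARGETS + 1), range(HORIZON + 1)))
    ACTIONS = ("survey", "rendezvous")
    GAMMA = 0.95  # discount factor

    def transitions(state, action):
        """Return [(probability, next_state, reward), ...] for a toy model."""
        targets, t = state
        if t == 0:  # horizon reached: absorbing state, no further reward
            return [(1.0, state, 0.0)]
        if action == "survey":
            # Surveying may clear one target (p = 0.6) at a small time cost.
            cleared = (max(targets - 1, 0), t - 1)
            return [(0.6, cleared, 1.0), (0.4, (targets, t - 1), -0.1)]
        # Rendezvous spends a time step but yields a coordination bonus
        # while targets remain (stands in for the value of communicating).
        bonus = 0.5 if targets > 0 else 0.0
        return [(1.0, (targets, t - 1), bonus)]

    def value_iteration(eps=1e-6):
        V = {s: 0.0 for s in STATES}
        while True:
            delta = 0.0
            for s in STATES:
                best = max(
                    sum(p * (r + GAMMA * V[s2]) for p, s2, r in transitions(s, a))
                    for a in ACTIONS
                )
                delta = max(delta, abs(best - V[s]))
                V[s] = best
            if delta < eps:
                break
        # Extract the greedy policy from the converged value function.
        policy = {
            s: max(
                ACTIONS,
                key=lambda a: sum(
                    p * (r + GAMMA * V[s2]) for p, s2, r in transitions(s, a)
                ),
            )
            for s in STATES
        }
        return V, policy

    if __name__ == "__main__":
        V, policy = value_iteration()
        print(policy[(MAX_TARGETS, HORIZON)])  # optimal first action in the toy model

Because each transition decrements the remaining time, this toy problem converges in a handful of sweeps; the same exhaustive sweep over states and actions is what makes a reduced state and action space essential for computing plans in minutes rather than hours.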
