Rendezvous planning for multiple autonomous underwater vehicles using a Markov decision process

Multiple autonomous underwater vehicles (AUVs) are a potential alternative to conventional large manned vessels for mine countermeasure (MCM) operations. Online mission planning for a cooperative multi-AUV network often relies on predefined contingencies or reactive methods and does not deliver optimal end-goal performance. The Markov decision process (MDP) is a decision-making framework that yields an optimal solution by accounting for estimates of future decisions rather than taking a myopic view. However, most real-world problems are too complex to be represented directly in this framework. The authors address the complexity problem by abstracting the MCM scenario into a reduced state and action space, while retaining the information that defines the goal and the constraints imposed by the application. Another critical part of the model is the vehicles' ability to communicate, which enables a cooperative mission; for this, the authors use the rendezvous point (RP) method, which schedules meeting points for the vehicles throughout the mission. The resulting model provides an optimal action-selection solution for the multi-AUV MCM problem, and the mission plan is computed in the order of minutes, demonstrating that the model is feasible for real-time applications.
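The key idea is that a sufficiently abstracted MDP can be solved exactly. As a rough illustration of what solving such a model involves, the Python sketch below runs value iteration over a toy state space of (targets remaining, time left) with a choice between a "survey" and a "rendezvous" action. The state space, transition probabilities, and rewards here are invented placeholders for illustration only, not the authors' model.

    # Minimal value-iteration sketch over an abstracted MDP.
    # All quantities below (states, actions, probabilities, rewards)
    # are hypothetical placeholders, not the model from the paper.
    import itertools

    MAX_TARGETS, HORIZON = 3, 4
    STATES = list(itertools.product(range(MAX_TARGETS + 1), range(HORIZON + 1)))
    ACTIONS = ("survey", "rendezvous")
    GAMMA = 0.95  # discount factor

    def transitions(state, action):
        """Return [(probability, next_state, reward), ...] for a toy model."""
        targets, t = state
        if t == 0:  # horizon reached: absorbing state, no further reward
            return [(1.0, state, 0.0)]
        if action == "survey":
            # Surveying may clear one target (p = 0.6) at a small time cost.
            cleared = (max(targets - 1, 0), t - 1)
            return [(0.6, cleared, 1.0), (0.4, (targets, t - 1), -0.1)]
        # Rendezvous spends a time step but yields a coordination bonus
        # while targets remain (stands in for the value of communicating).
        bonus = 0.5 if targets > 0 else 0.0
        return [(1.0, (targets, t - 1), bonus)]

    def value_iteration(eps=1e-6):
        V = {s: 0.0 for s in STATES}
        while True:
            delta = 0.0
            for s in STATES:
                best = max(
                    sum(p * (r + GAMMA * V[s2]) for p, s2, r in transitions(s, a))
                    for a in ACTIONS
                )
                delta = max(delta, abs(best - V[s]))
                V[s] = best
            if delta < eps:
                break
        # Extract the greedy policy from the converged value function.
        policy = {
            s: max(
                ACTIONS,
                key=lambda a: sum(
                    p * (r + GAMMA * V[s2]) for p, s2, r in transitions(s, a)
                ),
            )
            for s in STATES
        }
        return V, policy

    if __name__ == "__main__":
        V, policy = value_iteration()
        print(policy[(MAX_TARGETS, HORIZON)])  # optimal first action in the toy model

Because each transition decrements the remaining time, this toy problem converges in a handful of sweeps; the same exhaustive sweep over states and actions is what makes a reduced state and action space essential for computing plans in minutes rather than hours.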
