
Distributed multi-agent deep reinforcement learning for cooperative multi-robot pursuit (open access)

As a popular research topic in the area of distributed artificial intelligence, the multi-robot pursuit problem is widely used as a testbed for evaluating coordinated and cooperative strategies in multi-robot systems. This study addresses the multi-robot pursuit game using reinforcement learning (RL) techniques. Unlike most existing studies, which apply centralised deep RL methods based on the centralised-learning and decentralised-execution scheme, the authors propose a fully decentralised multi-agent deep RL approach that models each agent as an individual deep RL agent with its own learning system (i.e. individual action-value function, individual learning update process, and individual action output). To realise coordination among agents, limited information about the other agents in the environment is used as input to the learning process. Experimental results show that both the distributed and the centralised approaches can ultimately solve the pursuit-evasion problem in different dimensions, but the learning efficiency and coordination performance of the proposed distributed approach are much better than those of the traditional centralised approach.
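As a rough illustration of the decentralised scheme the abstract describes, the sketch below gives each pursuer its own action-value function, its own update step, and its own action output, with an observation built only from limited information about the evader and teammates. It is not the authors' implementation: a tabular Q-learner stands in for the deep network, and the names `IndependentQAgent` and `local_observation`, the grid coordinates, and the hyperparameter values are all hypothetical.

```python
import random
from collections import defaultdict


class IndependentQAgent:
    """One pursuer with its own Q-table, update rule and action output.

    Assumption: tabular Q-learning replaces the deep network of the paper.
    """

    def __init__(self, n_actions, alpha=0.1, gamma=0.95, epsilon=0.1, seed=None):
        self.n_actions = n_actions
        self.alpha = alpha      # learning rate (hypothetical value)
        self.gamma = gamma      # discount factor (hypothetical value)
        self.epsilon = epsilon  # exploration rate (hypothetical value)
        self.rng = random.Random(seed)
        # Individual action-value function: observation -> list of Q-values.
        self.q = defaultdict(lambda: [0.0] * n_actions)

    def act(self, obs):
        """Epsilon-greedy action chosen from this agent's own Q-values."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        values = self.q[obs]
        return max(range(self.n_actions), key=values.__getitem__)

    def update(self, obs, action, reward, next_obs, done):
        """One-step Q-learning update using only this agent's experience."""
        target = reward if done else reward + self.gamma * max(self.q[next_obs])
        self.q[obs][action] += self.alpha * (target - self.q[obs][action])


def local_observation(own_pos, evader_pos, teammate_positions):
    """Build a hashable observation from relative offsets only."""

    def rel(p):
        return (p[0] - own_pos[0], p[1] - own_pos[1])

    # Sorting makes the observation independent of teammate ordering.
    return (rel(evader_pos), tuple(sorted(rel(p) for p in teammate_positions)))
```

In a training loop under these assumptions, every pursuer would call `act` on its own observation and `update` on its own transition, so no agent reads or writes another agent's table; coordination can arise only through the teammate offsets included in each observation.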

http://iet.metastore.ingenta.com/content/journals/10.1049/joe.2019.1200