Reinforcement learning-based multi-agent system for network traffic signal control

IET Intelligent Transport Systems


A challenging application of artificial intelligence systems involves the scheduling of traffic signals in multi-intersection vehicular networks. This paper introduces a novel use of a multi-agent system and reinforcement learning (RL) framework to obtain an efficient traffic signal control policy. The policy aims to minimise the average delay, congestion and likelihood of intersection cross-blocking. A five-intersection traffic network has been studied in which each intersection is governed by an autonomous intelligent agent. Two types of agents, a central agent and an outbound agent, were employed. The outbound agents schedule traffic signals by following the longest-queue-first (LQF) algorithm, which has been proven to guarantee stability and fairness, and collaborate with the central agent by providing it with local traffic statistics. The central agent learns a value function driven by its local and neighbours' traffic conditions. The methodology proposed here utilises the Q-Learning algorithm with a feedforward neural network for value function approximation. Experimental results clearly demonstrate the advantages of multi-agent RL-based control over LQF-governed isolated single-intersection control, thus paving the way for efficient distributed traffic signal control in complex settings.
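The two control rules named in the abstract can be illustrated with a short sketch. The abstract does not give the state encoding, reward definition, network size or learning parameters, so everything below is an assumption made for illustration: the state is taken to be a vector of queue statistics, the network has a single tanh hidden layer, and the names `lqf_phase`, `QNetwork` and `q_learning_step` are hypothetical rather than taken from the paper.

```python
import numpy as np


def lqf_phase(queue_lengths):
    """Longest-queue-first: pick the signal phase whose queue is longest."""
    return int(np.argmax(queue_lengths))


class QNetwork:
    """Feedforward network (one tanh hidden layer) giving one Q-value per action.

    Layer sizes, initialisation and learning rate are illustrative assumptions.
    """

    def __init__(self, n_inputs, n_hidden, n_actions, lr=0.05, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (n_hidden, n_inputs))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.1, (n_actions, n_hidden))
        self.b2 = np.zeros(n_actions)
        self.lr = lr

    def forward(self, state):
        hidden = np.tanh(self.W1 @ state + self.b1)
        return self.W2 @ hidden + self.b2, hidden

    def q_values(self, state):
        return self.forward(state)[0]

    def update(self, state, action, target):
        """One gradient step on 0.5 * (target - Q(state, action))**2."""
        q, hidden = self.forward(state)
        error = target - q[action]
        # Backpropagate through the chosen action's output unit only.
        grad_hidden = error * self.W2[action] * (1.0 - hidden ** 2)
        self.W2[action] += self.lr * error * hidden
        self.b2[action] += self.lr * error
        self.W1 += self.lr * np.outer(grad_hidden, state)
        self.b1 += self.lr * grad_hidden
        return error


def q_learning_step(net, state, action, reward, next_state, gamma=0.9):
    """Standard Q-learning target: reward + gamma * max_a Q(next_state, a)."""
    target = reward + gamma * np.max(net.q_values(next_state))
    return net.update(state, action, target)


# Hypothetical usage: 8 queue features (local plus neighbours), 4 signal phases.
net = QNetwork(n_inputs=8, n_hidden=16, n_actions=4)
state = np.array([3, 7, 2, 5, 1, 0, 4, 2], dtype=float) / 10.0
next_state = np.array([2, 5, 2, 6, 1, 1, 3, 2], dtype=float) / 10.0
action = int(np.argmax(net.q_values(state)))
td_error = q_learning_step(net, state, action, reward=-1.7, next_state=next_state)
```

In this sketch an outbound agent would call `lqf_phase` directly on its own queues, while the central agent would choose the phase with the largest Q-value and refine the network after each observed transition. A negative reward proportional to delay is one plausible choice, but it is not stated in the abstract.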


