Heuristics-oriented overtaking decision making for autonomous vehicles using reinforcement learning

This study presents a three-lane highway overtaking strategy for an automated vehicle, based on a heuristic planning reinforcement learning algorithm. The proposed decision-making controller focuses on keeping the autonomous vehicle operating safely and efficiently. First, the modelling of the overtaking driving scenario is introduced, and the two reference approaches, the intelligent driver model (IDM) and minimising overall braking induced by lane changes (MOBIL), are formulated. Second, the Dyna-H algorithm, which combines a modified Q-learning algorithm with a heuristic planning policy, is applied to highway overtaking decision-making. Three different heuristic strategies are formulated to improve learning efficiency and to compare performance. The algorithm determines the lane-change and speed-selection actions of the ego vehicle in an environment with uncertainties. Finally, the performance of Dyna-H in the autonomous overtaking scenario is evaluated against the reference approaches and traditional learning methods, and the Dyna-H-enabled decision-making strategies are further validated and analysed on an open-source driving dataset. The results show that the proposed decision-making strategy achieves a superior convergence rate and control performance.
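For context, the two reference approaches are standard models from the car-following and lane-changing literature. The IDM computes the ego vehicle's longitudinal acceleration from its speed $v$, the gap $s$ and the approach rate $\Delta v$ to the leading vehicle:

$$\dot{v} = a\left[1 - \left(\frac{v}{v_0}\right)^{\delta} - \left(\frac{s^*(v, \Delta v)}{s}\right)^2\right], \qquad s^*(v, \Delta v) = s_0 + vT + \frac{v\,\Delta v}{2\sqrt{ab}},$$

where $v_0$ is the desired speed, $s_0$ the minimum gap, $T$ the safe time headway, $a$ the maximum acceleration and $b$ the comfortable deceleration. MOBIL triggers a lane change when the incentive criterion

$$\tilde{a}_c - a_c + p\left[(\tilde{a}_n - a_n) + (\tilde{a}_o - a_o)\right] > \Delta a_{\mathrm{th}}$$

holds together with the safety criterion $\tilde{a}_n \ge -b_{\mathrm{safe}}$, where $a_c$, $a_n$ and $a_o$ are the accelerations of the lane-changing vehicle, the new follower and the old follower (tildes mark post-change values), $p$ is the politeness factor and $\Delta a_{\mathrm{th}}$ the switching threshold.

The paper does not publish code, but the Dyna-H idea can be summarised compactly: a Dyna-style agent learns both a value function and a model from real transitions, and in the planning phase replays the modelled transition that a heuristic ranks as most promising rather than a uniformly sampled one. The sketch below is a minimal illustration under assumed state and action encodings; the action set, the heuristic signature and all hyperparameter values are illustrative, not taken from the paper.

```python
import random
from collections import defaultdict

# Hypothetical action set for the overtaking task: lane changes plus
# discrete speed adjustments (an assumption, not the paper's exact design).
ACTIONS = ["keep_lane", "change_left", "change_right", "accelerate", "decelerate"]

class DynaH:
    """Dyna-style Q-learning whose planning step replays the experience
    that a heuristic h(s, a) ranks highest, instead of sampling uniformly."""

    def __init__(self, heuristic, alpha=0.1, gamma=0.95, epsilon=0.1, n_planning=10):
        self.Q = defaultdict(float)   # Q[(state, action)] -> value
        self.model = {}               # learned deterministic model: (s, a) -> (r, s')
        self.h = heuristic            # heuristic merit of taking a in s
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.n_planning = n_planning

    def act(self, state):
        # epsilon-greedy behaviour policy over the learned Q-values
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.Q[(state, a)])

    def _q_update(self, s, a, r, s_next):
        # one-step Q-learning backup
        target = r + self.gamma * max(self.Q[(s_next, a2)] for a2 in ACTIONS)
        self.Q[(s, a)] += self.alpha * (target - self.Q[(s, a)])

    def learn(self, s, a, r, s_next):
        # 1) direct RL update from the real transition
        self._q_update(s, a, r, s_next)
        # 2) model learning: memorise the observed transition
        self.model[(s, a)] = (r, s_next)
        # 3) heuristic planning: replay stored transitions, choosing in each
        #    sampled state the modelled action the heuristic scores highest
        visited_states = list({k[0] for k in self.model})
        for _ in range(self.n_planning):
            ps = random.choice(visited_states)
            candidates = [a2 for a2 in ACTIONS if (ps, a2) in self.model]
            pa = max(candidates, key=lambda a2: self.h(ps, a2))
            pr, ps_next = self.model[(ps, pa)]
            self._q_update(ps, pa, pr, ps_next)
```

Three different choices of `heuristic` (for instance, favouring the freest lane, the largest headway or the smallest deviation from the desired speed) would correspond to the three heuristic strategies compared in the study; their exact definitions are given in the paper, not in this sketch.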

Inspec keywords: driver information systems; road traffic; traffic engineering computing; road safety; decision making; mobile robots; road vehicles; learning (artificial intelligence)

Other keywords: decision-making controller; modified Q-learning algorithm; highway overtaking decision-making; autonomous vehicle; overtaking driving scenario; automated vehicle; heuristic planning policy; autonomous overtaking scenario; lane change; heuristics-oriented; different heuristic strategies; Dyna-H-enabled decision-making strategies; three-lane highway; decision making; speed selection; traditional learning methods; lane changes; ego vehicle; intelligent driver model; minimise overall braking; heuristic planning reinforcement learning algorithm; learning efficiency; decision-making strategy

Subjects: Other topics in statistics; Traffic engineering computing; Mobile robots; Knowledge engineering techniques
