Using reinforcement learning techniques to solve continuous-time non-linear optimal tracking problem without system dynamics

Optimal tracking of non-linear systems with unknown dynamics is an important and challenging problem. Based on the framework of reinforcement learning (RL) and adaptive dynamic programming, a model-free adaptive optimal tracking algorithm is proposed in this study. By constructing an augmented system from the tracking errors and the reference states, the tracking problem is converted into a regulation problem for the new system. Several RL techniques are synthesised into a novel algorithm that learns the optimal solution online, in real time, without any knowledge of the system dynamics. Continuous adaptation laws are defined using both current observations and past experience. Convergence is guaranteed by Lyapunov analysis. Two simulations, on a linear and a non-linear system, demonstrate the performance of the proposed approach.

Inspec keywords: optimal control; nonlinear control systems; Lyapunov methods; learning (artificial intelligence); dynamic programming; linear systems; continuous time systems

Other keywords: Lyapunov analysis; nonlinear optimal tracking problem; model-free adaptive optimal tracking algorithm; adaptive dynamic programming; reinforcement learning; linear system; continuous-time problem

Subjects: Nonlinear control systems; Learning in AI (theory); Stability in control theory; Linear control systems; Optimal control; Optimisation techniques
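The abstract's two key ideas — recasting tracking as regulation of an augmented state built from the tracking error and the reference, and adapting a value-function approximator online from both current observations and stored past data — can be sketched as follows. This is an illustrative sketch only, not the paper's method: the function names, the basis-function critic, and the normalized-gradient update with experience replay are assumptions, since the abstract does not give the exact adaptation laws.

```python
import numpy as np

def augmented_state(x, x_d):
    """Stack the tracking error e = x - x_d with the reference x_d,
    converting the tracking problem into regulation of z = [e; x_d]."""
    return np.concatenate([x - x_d, x_d])

def critic_update(W, phi_dot, r, past, lr=0.05):
    """One normalized-gradient step on an approximate Bellman residual.

    The critic approximates the value as V(z) ~ W @ phi(z), so along a
    trajectory the residual phi_dot @ W + r should vanish at optimality.
    Stored samples are replayed alongside the current one, mirroring the
    'current observations and past experience' idea in the abstract.

    W        : critic weight vector
    phi_dot  : time derivative of the basis vector at the current state
    r        : instantaneous cost r(z, u)
    past     : iterable of (phi_dot_k, r_k) pairs from stored data
    """
    def step(pd, rk):
        e = pd @ W + rk                        # Bellman-like residual
        return lr * e * pd / (1.0 + pd @ pd)   # normalized gradient
    dW = step(phi_dot, r)
    for pd, rk in past:                        # experience-replay term
        dW = dW + step(pd, rk)
    return W - dW
```

Replaying past data alongside the current sample is one common way such schemes relax the persistent-excitation requirement: the recorded samples keep the regressor information-rich even when the current trajectory is not exciting.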
