Model-free optimal tracking control for discrete-time system with delays using reinforcement Q-learning

A reinforcement Q-learning algorithm is proposed for the optimal tracking control problem of discrete-time systems with unknown dynamics and delays. Traditional reinforcement learning methods require an accurate system model; the Q-learning method avoids this requirement. This matters in practical implementation because all or part of the system model is often difficult or costly to obtain. First, an augmented system composed of the original system and the reference trajectory is constructed, and the corresponding augmented linear quadratic tracking (LQT) Bellman equation is derived. Based on this equation, the reinforcement Q-learning algorithm is then presented. To implement the method, the iteration equations are solved online using the least squares technique.
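The steps named in the abstract (augmented system, LQT Bellman equation, Q-learning, least squares) can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it omits the delays entirely, uses a scalar plant with a constant reference, adds a discount factor, and solves each policy-evaluation step as a batch least-squares problem rather than online. All numerical values and all names (`step`, `stage_cost`, `quad_features`, `q_learning_lqt`) are hypothetical and chosen only for illustration. The learner never reads `T` or `B1` directly; it only sees sampled transitions and stage costs.

```python
import numpy as np

# Hypothetical delay-free plant x_{k+1} = A x_k + B u_k and reference r_{k+1} = F r_k.
A, B, F, C = 0.9, 1.0, 1.0, 1.0
Qw, R, gamma = 1.0, 0.1, 0.8           # assumed tracking weight, input weight, discount

# Augmented system X = [x, r]: X_{k+1} = T X_k + B1 u_k (used only by the simulator).
T = np.array([[A, 0.0], [0.0, F]])
B1 = np.array([[B], [0.0]])
C1 = np.array([[C, -1.0]])             # tracking error e = C x - r = C1 X
Q1 = Qw * C1.T @ C1


def step(X, u):
    """Simulator standing in for the unknown plant."""
    return T @ X + B1 @ u


def stage_cost(X, u):
    """Quadratic tracking cost e^T Qw e + u^T R u."""
    return float(X @ Q1 @ X + R * u @ u)


def quad_features(z):
    """Independent entries of z z^T, so that features @ theta == z^T H z."""
    outer = np.outer(z, z)
    i, j = np.triu_indices(len(z))
    scale = np.where(i == j, 1.0, 2.0)  # off-diagonal terms appear twice
    return scale * outer[i, j]


def theta_to_H(theta, n):
    """Rebuild the symmetric matrix H from its upper-triangular entries."""
    H = np.zeros((n, n))
    H[np.triu_indices(n)] = theta
    return H + np.triu(H, 1).T


def q_learning_lqt(n_iter=15, n_samples=60, seed=0):
    """Policy iteration on the Q-function Q(X, u) = z^T H z with z = [X, u]."""
    rng = np.random.default_rng(seed)
    K = np.zeros((1, 2))                # initial policy u = -K X (admissible here)
    H = None
    for _ in range(n_iter):
        rows, targets = [], []
        for _ in range(n_samples):
            X = rng.normal(size=2)                # random augmented state
            u = -K @ X + rng.normal(size=1)       # policy action plus exploration
            X_next = step(X, u)
            u_next = -K @ X_next                  # next action follows the policy
            z = np.concatenate([X, u])
            z_next = np.concatenate([X_next, u_next])
            # Bellman equation: z^T H z - gamma * z'^T H z' = stage cost
            rows.append(quad_features(z) - gamma * quad_features(z_next))
            targets.append(stage_cost(X, u))
        theta, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
        H = theta_to_H(theta, 3)
        # Policy improvement: minimise z^T H z over u, giving K = H_uu^{-1} H_uX
        K = np.linalg.solve(H[2:, 2:], H[2:, :2])
    return K, H
```

Calling `K, H = q_learning_lqt()` returns a 1-by-2 feedback gain acting on the augmented state `[x, r]` and the learned Q-function matrix. In this noise-free toy setting the gain can be checked against the one obtained by iterating the discounted Riccati recursion on the known model, which is a convenient sanity test before moving to a delayed or higher-order plant.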

Inspec keywords: linear quadratic control; optimal control; continuous time systems; learning (artificial intelligence); iterative methods; Riccati equations; learning systems

Other keywords: accurate system model; optimal tracking control problem; Q-learning method; additional high cost; delays; practical implementation; Q-learning algorithm; discrete-time system; unknown dynamics; corresponding augmented LQT Bellman equation; augmented system; model-free optimal tracking control; traditional reinforcement learning methods

Subjects: Optimisation techniques; Optimal control; Knowledge engineering techniques; Interpolation and function approximation (numerical analysis); Self-adjusting control systems; Learning in AI (theory); Linear algebra (numerical analysis)

http://iet.metastore.ingenta.com/content/journals/10.1049/el.2017.3238