Data-driven design of two-degree-of-freedom controllers using reinforcement learning techniques

Motivated by the successful application of reinforcement learning techniques to feedback control, this study extends them to the design of two-degree-of-freedom controllers in a data-driven setting. Using the residual-generator-based form of the Youla parameterisation, all stabilising controllers are first interpreted as a feedback–feedforward structure with a Kalman filter-based residual generator at its core. The reference tracking problem is then treated from the regulation perspective using Q-learning, recursive least squares methods and the policy iteration algorithm. The design is carried out as a two-stage process that obtains the optimal feedback and feedforward controllers separately. Finally, the effectiveness of the proposed approach is demonstrated by applying it to a laboratory continuous stirred tank heater process.
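To make the combination of Q-learning, least squares and policy iteration named in the abstract more concrete, the following is a minimal sketch of the general idea for a linear-quadratic regulator, not the authors' algorithm. It covers only a state-feedback stage; the Youla parameterisation, the Kalman filter-based residual generator and the feedforward stage are not modelled. Several things are assumptions made for illustration: the plant matrices `A` and `B`, the weights `Q` and `R`, the function names, and the use of a simulated plant in place of measured data. Batch least squares (`np.linalg.lstsq`) is used here in place of recursive least squares to keep the sketch short.

```python
import numpy as np

def quad_features(z):
    """Features phi(z) with z' H z = phi(z) @ theta, theta = upper triangle of H."""
    i, j = np.triu_indices(len(z))
    scale = np.where(i == j, 1.0, 2.0)  # off-diagonal terms appear twice in z' H z
    return scale * z[i] * z[j]

def theta_to_H(theta, n):
    """Rebuild the symmetric matrix H from its upper-triangular entries."""
    H = np.zeros((n, n))
    i, j = np.triu_indices(n)
    H[i, j] = theta
    H[j, i] = theta
    return H

def evaluate_policy(K, A, B, Q, R, rng, n_samples=200):
    """Estimate the Q-function matrix H of the policy u = -K x from one-step data."""
    n, m = B.shape
    rows, targets = [], []
    for _ in range(n_samples):
        x = rng.standard_normal(n)
        u = -K @ x + rng.standard_normal(m)  # policy action plus exploration noise
        x_next = A @ x + B @ u               # assumption: stands in for a measured response
        cost = x @ Q @ x + u @ R @ u
        z = np.concatenate([x, u])
        z_next = np.concatenate([x_next, -K @ x_next])
        # Bellman equation: z' H z - z_next' H z_next = cost
        rows.append(quad_features(z) - quad_features(z_next))
        targets.append(cost)
    theta, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    return theta_to_H(theta, n + m)

def q_policy_iteration(A, B, Q, R, K0, n_iter=10, seed=0):
    """Alternate policy evaluation (least squares) and policy improvement."""
    rng = np.random.default_rng(seed)
    n, m = B.shape
    K = K0
    for _ in range(n_iter):
        H = evaluate_policy(K, A, B, Q, R, rng)
        K = np.linalg.solve(H[n:, n:], H[n:, :n])  # improved gain: K = H_uu^{-1} H_ux
    return K

# Hypothetical stable second-order plant, so K0 = 0 is a stabilising initial policy.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
K = q_policy_iteration(A, B, Q, R, K0=np.zeros((1, 2)))
print("learned feedback gain:", K)
```

The learning loop never uses `A` and `B` directly; they appear only inside the simulated transition, which is where plant measurements would enter in a data-driven setting. Policy iteration of this kind needs a stabilising initial gain, which is why the illustrative plant is chosen to be open-loop stable.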

Inspec keywords: iterative methods; learning (artificial intelligence); feedback; Kalman filters; process control; least squares approximations

Other keywords: reference tracking problem; recursive least squares methods; data-driven environment; optimal feedback controller; two-degree-of-freedom controllers; Kalman filter-based residual generator; feedback-feedforward situation; feedback control; Youla parameterisation; Q learning; policy iteration algorithm; residual generator based form; feedforward controllers; continuous stirred tank heater process; reinforcement learning techniques

Subjects: Interpolation and function approximation (numerical analysis); Signal processing theory; Control in industrial production systems; Control technology and theory (production); Numerical analysis; Industrial processes

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-cta.2014.0156