Motivated by the successful application of reinforcement learning to feedback control, this study extends reinforcement learning techniques to the data-driven design of two-degree-of-freedom controllers. Based on the residual-generator-based form of the Youla parameterisation, all stabilising controllers are first interpreted in a feedback–feedforward structure with a Kalman-filter-based residual generator as the core component. The reference tracking problem is then treated from a regulation perspective using Q-learning, recursive least squares and the policy iteration algorithm. The design is carried out as a two-stage process that obtains the optimal feedback and feedforward controllers separately. Finally, the effectiveness of the proposed approach is demonstrated on a laboratory continuous stirred tank heater process.
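As a rough illustration of the Q-learning and policy iteration ingredients named in the abstract, the following is a minimal sketch for a plain state-feedback linear quadratic regulator. It is not the residual-generator-based two-degree-of-freedom design of the paper. It assumes a fully measured state, uses a hypothetical second-order plant (`A`, `B`) only to generate data, and substitutes batch least squares for the recursive form for brevity. All function names are illustrative.

```python
import numpy as np

# Hypothetical stable plant and cost weights, used only to generate data.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [0.1]])
Qc = np.eye(2)
Rc = np.array([[1.0]])

def quad_features(z):
    # Monomials z_i * z_j for i <= j; off-diagonal terms are doubled so that
    # quad_features(z) @ theta == z' H z when theta is the upper triangle of H.
    n = len(z)
    return np.array([(1.0 if i == j else 2.0) * z[i] * z[j]
                     for i in range(n) for j in range(i, n)])

def theta_to_H(theta, n):
    # Rebuild the symmetric kernel H from its upper-triangular entries.
    H = np.zeros((n, n))
    H[np.triu_indices(n)] = theta
    return H + np.triu(H, 1).T

def evaluate_policy(A, B, Qc, Rc, K, rng, n_samples=200):
    # Policy evaluation: fit H in Q_K(x, u) = [x; u]' H [x; u] from the
    # Bellman equation Q_K(x, u) = cost + Q_K(x_next, -K x_next).
    n, m = B.shape
    Phi, cost = [], []
    for _ in range(n_samples):
        x = rng.standard_normal(n)
        u = -K @ x + rng.standard_normal(m)  # policy action plus exploration
        x_next = A @ x + B @ u               # stands in for a plant measurement
        z = np.concatenate([x, u])
        z_next = np.concatenate([x_next, -K @ x_next])
        Phi.append(quad_features(z) - quad_features(z_next))
        cost.append(x @ Qc @ x + u @ Rc @ u)
    theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(cost), rcond=None)
    return theta_to_H(theta, n + m)

def q_policy_iteration(A, B, Qc, Rc, K0, rng, n_iter=15):
    # Alternate policy evaluation and greedy improvement K = Huu^{-1} Hux.
    n = A.shape[0]
    K = K0
    for _ in range(n_iter):
        H = evaluate_policy(A, B, Qc, Rc, K, rng)
        K = np.linalg.solve(H[n:, n:], H[n:, :n])
    return K

def riccati_gain(A, B, Qc, Rc, iters=2000):
    # Model-based reference gain from the Riccati recursion, for comparison only.
    P = Qc.copy()
    for _ in range(iters):
        G = np.linalg.solve(Rc + B.T @ P @ B, B.T @ P @ A)
        P = Qc + A.T @ P @ A - A.T @ P @ B @ G
    return np.linalg.solve(Rc + B.T @ P @ B, B.T @ P @ A)

K_learned = q_policy_iteration(A, B, Qc, Rc, np.zeros((1, 2)),
                               np.random.default_rng(0))
print("learned gain:", K_learned)
print("Riccati gain:", riccati_gain(A, B, Qc, Rc))
```

The initial gain must be stabilising; here the open-loop plant is stable, so a zero gain is used. The learner touches `A` and `B` only through the sampled transitions, which is the sense in which the sketch is data-driven.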


Data-driven design of two-degree-of-freedom controllers using reinforcement learning techniques
