© The Institution of Engineering and Technology
In this study, a data-driven optimisation solution for operational index control is presented for a class of industrial processes. First, the operational index control problem is formulated as an optimal tracking control problem. Then, an augmented system composed of the device loop dynamics and the operational indices dynamics is constructed on two different time scales. Most existing operational optimisation and control methods rely on a mathematical model of the operational indices dynamics, which is difficult to obtain. In contrast, a reinforcement learning algorithm based on an actor-critic structure is employed here to provide a data-driven optimisation control method that selects optimal process setpoints so that the operational indices track their desired values. This solution does not require complete knowledge of the industrial process dynamics, nor does it require complicated system identification of the operational indices dynamics. The effectiveness of the proposed method is demonstrated by experiments carried out on a hardware-in-the-loop emulation system for a mineral grinding process.
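As a generic illustration of what an optimal tracking formulation with an actor-critic structure typically looks like, the sketch below states a discounted quadratic tracking cost and its Bellman equation. This is not taken from the study itself: the symbols $X_k$ (augmented state), $r_k$ (operational indices), $r^{*}$ (desired values), $w_k$ (process setpoints), the weights $Q$ and $R$, and the discount factor $\gamma$ are all assumptions introduced for illustration.

```latex
% Assumed notation (illustrative only, not the paper's):
%   X_k : augmented state,  r_k : operational indices,  r^* : desired values
%   w_k : process setpoints, Q \succeq 0, R \succ 0, 0 < \gamma \le 1
\begin{align}
  U(X_k, w_k) &= (r_k - r^{*})^{\top} Q \,(r_k - r^{*}) + w_k^{\top} R \, w_k, \\
  V(X_k)      &= \sum_{i=k}^{\infty} \gamma^{\,i-k}\, U(X_i, w_i), \\
  V(X_k)      &= U(X_k, w_k) + \gamma\, V(X_{k+1}), \\
  w_k^{*}     &= \arg\min_{w_k} \bigl[\, U(X_k, w_k) + \gamma\, V^{*}(X_{k+1}) \,\bigr].
\end{align}
```

In an actor-critic scheme of this general kind, a critic approximates the value function $V$ from measured data using the Bellman equation in the third line, while an actor approximates the setpoint policy that minimises the right-hand side of the fourth line.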
http://iet.metastore.ingenta.com/content/journals/10.1049/iet-cta.2015.0798