© The Institution of Engineering and Technology
A novel hybrid Q-learning algorithm is introduced for the design of a linear adaptive optimal regulator for a large-scale interconnected system with event-sampled input and state vectors. Here, time-driven Q-learning, together with the proposed iterative parameter-learning updates, is utilised within the event-sampled instants both to improve the efficiency of the optimal regulator and to obtain a more generalised online Q-learning framework. Network-induced losses arising from the communication network among the subsystems are considered along with the uncertain system dynamics. Stochastic model-free Q-learning and dynamic programming are utilised in the hybrid learning mode for the optimal regulator design. The asymptotic convergence of the system state vector and the boundedness of the parameter vector are demonstrated using Lyapunov analysis. Further, when the regression vector of the Q-function estimator satisfies the persistency of excitation condition, the Q-function parameters converge to the expected target values. The analytical design is evaluated in simulation using numerical examples. The net result is a data-driven, event-sampled adaptive optimal regulator for an uncertain large-scale interconnected system.
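To make the idea of a Q-function estimator with a regression vector more concrete, the following is a minimal sketch of model-free Q-learning for a linear quadratic regulator. It is not the hybrid algorithm of the paper: it assumes a single scalar subsystem, deterministic dynamics, periodic (time-driven) sampling, no network-induced losses, and a batch least-squares policy iteration. All names (`phi`, `solve3`, `evaluate_policy`, `q_learning_gain`) and the plant values `a`, `b`, `q`, `r` are hypothetical choices made for illustration. The learner uses only measured states, inputs and costs; the plant parameters appear solely in the line that simulates the plant response.

```python
import random

def phi(x, u):
    """Quadratic regression vector, so that Q(x, u) = theta . phi(x, u)."""
    return [x * x, 2.0 * x * u, u * u]

def solve3(A, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    n = 3
    for c in range(n):
        p = max(range(c, n), key=lambda k: abs(M[k][c]))
        M[c], M[p] = M[p], M[c]
        for k in range(n):
            if k != c:
                f = M[k][c] / M[c][c]
                M[k] = [mk - f * mc for mk, mc in zip(M[k], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def evaluate_policy(K, a, b, q, r, rng, steps=30):
    """Least-squares estimate of the Q-function parameters for the policy u = -K x."""
    AtA = [[0.0] * 3 for _ in range(3)]
    Atb = [0.0] * 3
    x = 1.0
    for _ in range(steps):
        # Exploration noise keeps the regression vector persistently exciting.
        u = -K * x + rng.uniform(-1.0, 1.0)
        # Simulated plant response; the learner only sees the measured x_next.
        x_next = a * x + b * u
        # Bellman equation: theta . (phi(x,u) - phi(x', -K x')) = stage cost.
        d = [p0 - p1 for p0, p1 in zip(phi(x, u), phi(x_next, -K * x_next))]
        cost = q * x * x + r * u * u
        for i in range(3):
            Atb[i] += d[i] * cost
            for j in range(3):
                AtA[i][j] += d[i] * d[j]
        x = x_next
    return solve3(AtA, Atb)

def q_learning_gain(a=1.1, b=1.0, q=1.0, r=1.0, K0=0.5, iterations=10, seed=0):
    """Alternate policy evaluation and improvement, starting from a stabilising K0."""
    rng = random.Random(seed)
    K = K0
    for _ in range(iterations):
        h_xx, h_xu, h_uu = evaluate_policy(K, a, b, q, r, rng)
        K = h_xu / h_uu  # policy improvement: u = -(h_xu / h_uu) x
    return K

if __name__ == "__main__":
    print(q_learning_gain())
```

For the assumed plant values, the gain obtained from the scalar Riccati equation is about 0.703, which gives a reference against which the learned gain can be checked. An event-sampled variant would update the parameters only at triggering instants instead of at every step, and that is where the persistency of excitation condition on the regression vector becomes harder to guarantee.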
http://iet.metastore.ingenta.com/content/journals/10.1049/iet-cta.2015.0943