Adaptive embedded control of cyber-physical systems using reinforcement learning (open access)

Embedded control parameters of cyber-physical systems (CPS), such as the sampling rate, are typically invariant and designed with a worst-case scenario in mind. In an over-engineered system, control parameters are assigned values that satisfy system-wide performance requirements at the expense of excessive energy and resource overheads. Dynamic and adaptive control parameters can reduce this overhead, but they are complex to design and require in-depth knowledge of the CPS and its operating environment, which is typically unavailable at design time. The authors investigate the application of reinforcement learning (RL) to dynamically adapt high-level system parameters at run time as a function of the system state. RL is an alternative to classical control theory for CPSs that can learn and adapt control properties without requiring an in-depth controller model. Specifically, the authors show that RL can modulate sampling times to save processing power without compromising control quality. They apply a novel statistical cloud-based evaluation framework to study the validity of the approach on the cart-pole balancing control problem as well as the well-known mountain car problem. The results show an improvement in real-world power efficiency of up to 20% compared with an optimal system with fixed controller settings.
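
The abstract does not reproduce the method, but the core idea, an RL agent that selects the controller's sampling period from the current plant state, can be sketched compactly. Below is a minimal, hypothetical illustration using tabular Q-learning, where the actions are candidate sampling periods and the reward trades control error against an energy cost. The ToyPlant, the state discretisation, and all numeric weights are placeholder assumptions for illustration, not the authors' cart-pole model or evaluation framework.

    import random
    from collections import defaultdict

    # Hypothetical sketch: a tabular Q-learning agent selects the sampling
    # period of a control loop. The plant, discretisation, and weights are
    # illustrative placeholders only.

    PERIODS = [0.01, 0.02, 0.05, 0.1]       # candidate sampling periods (s)
    ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1  # learning rate, discount, exploration
    ENERGY_WEIGHT = 1e-3                    # assumed cost per unit sampling rate

    Q = defaultdict(lambda: [0.0] * len(PERIODS))  # state -> action values

    class ToyPlant:
        """Unstable scalar plant dx/dt = x + u with a zero-order-hold P controller."""
        def __init__(self):
            self.x = 1.0

        def reset(self):
            self.x = random.uniform(-1.0, 1.0)
            return self.discretise()

        def discretise(self):
            # Coarse state buckets so tabular Q-learning applies.
            return max(-9, min(9, int(self.x * 10)))

        def step(self, period):
            u = -2.0 * self.x                      # control held for the whole period
            for _ in range(10):                    # Euler integration over the period
                self.x += (self.x + u) * (period / 10)
            return self.discretise(), abs(self.x)  # next state, control error

    def choose_action(state):
        # Epsilon-greedy selection over sampling-period indices.
        if random.random() < EPSILON:
            return random.randrange(len(PERIODS))
        qs = Q[state]
        return qs.index(max(qs))

    def train(episodes=200, steps=100):
        plant = ToyPlant()
        for _ in range(episodes):
            s = plant.reset()
            for _ in range(steps):
                a = choose_action(s)
                s2, err = plant.step(PERIODS[a])
                # Reward trades control error against energy: shorter periods
                # run the controller more often, so they cost more.
                r = -(err + ENERGY_WEIGHT / PERIODS[a])
                # One-step Q-learning update.
                Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) - Q[s][a])
                s = s2

    if __name__ == "__main__":
        train()
        # Print the greedy sampling period learned for each state bucket.
        print({s: PERIODS[qs.index(max(qs))] for s, qs in sorted(Q.items())})

Because the energy term scales inversely with the period, the learned policy can select long sampling intervals when the plant is near equilibrium and fall back to short intervals when the control error grows, which is the mechanism behind the reported power savings.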
