
Adaptive dynamic programming for robust neural control of unknown continuous-time non-linear systems

IET Control Theory & Applications

The design of robust controllers for continuous-time (CT) non-linear systems with completely unknown non-linearities is a challenging task. Because the non-linearities cannot be accurately identified either online or offline, robust controllers are instead designed using adaptive dynamic programming (ADP). In this study, an ADP-based robust neural control scheme is developed for a class of unknown CT non-linear systems. First, the robust non-linear control problem is converted into a non-linear optimal control problem by constructing a value function for the nominal system. An ADP algorithm is then developed to solve this optimal control problem. The algorithm employs actor-critic dual networks to approximate the control policy and the value function, respectively. With this architecture, only system data are needed to update the actor neural network (NN) weights and the critic NN weights simultaneously. Moreover, by using Monte Carlo integration, the persistence-of-excitation assumption is no longer required. The closed-loop system with unknown non-linearities is shown to be asymptotically stable under the obtained optimal control. Finally, two examples are provided to validate the developed method.
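To make the workflow described above more concrete, the following is a minimal illustrative sketch (not the authors' implementation) of how an actor-critic ADP update of this kind could look for an input-affine nominal system dx/dt = f(x) + g(x)u. The basis functions phi(x), the running cost rho(x) + x'Qx + u'Ru (with rho(x) assumed to bound the unknown non-linearity), and the least-squares critic fit over Monte Carlo state samples are all assumptions made for the sake of the example; every name below is hypothetical.

import numpy as np

def phi(x):
    # Critic basis functions (hypothetical choice: quadratic monomials in x1, x2).
    x1, x2 = x
    return np.array([x1**2, x1 * x2, x2**2])

def grad_phi(x):
    # Jacobian of the basis functions with respect to x, shape (3, 2).
    x1, x2 = x
    return np.array([[2.0 * x1, 0.0],
                     [x2,       x1],
                     [0.0,      2.0 * x2]])

def actor(x, Wc, g, R):
    # Approximate optimal policy u = -(1/2) R^{-1} g(x)' dV/dx, with V(x) ~ Wc' phi(x).
    return -0.5 * np.linalg.solve(R, g(x).T @ (grad_phi(x).T @ Wc))

def critic_update(samples, Wc, f, g, Q, R, rho):
    # Least-squares fit of the HJB residual over Monte Carlo state samples;
    # sampling the state space is what replaces a persistently exciting trajectory.
    A, b = [], []
    for x in samples:
        u = actor(x, Wc, g, R)
        xdot = f(x) + g(x) @ u                        # nominal dynamics under the current policy
        A.append(grad_phi(x) @ xdot)                  # time derivative of each basis function
        b.append(-(rho(x) + x @ Q @ x + u @ R @ u))   # negative running cost
    Wc_new, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return Wc_new

# Hypothetical usage: draw fresh state samples, repeat critic_update until Wc converges,
# then apply actor(x, Wc, g, R) as the robust control law for the original uncertain system.

In schemes of this type, fitting the critic weights by least squares over sampled states (rather than along a single excited trajectory) is what removes the persistence-of-excitation requirement mentioned in the abstract; the actor weights follow from the critic through the policy expression.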
