Hardware architecture for high-speed real-time dynamic programming applications

Author(s): B. Matthews and I. Elhanany
DOI: 10.1049/iet-cdt:20070027

Author(s): B. Matthews ¹ and I. Elhanany ¹
- Affiliations: 1: Electrical and Computer Engineering Department, University of Tennessee, Knoxville, USA
Source: Volume 2, Issue 3, May 2008, p. 164 – 171
DOI: 10.1049/iet-cdt:20070027 , Print ISSN 1751-8601, Online ISSN 1751-861X

A novel hardware architecture for performing the core computations required by dynamic programming (DP) techniques is introduced. DP techniques apply to a wide range of applications that require an optimal sequence of decisions. An underlying assumption is that a complete model of the environment is provided, with dynamics governed by a Markov decision process. Existing DP implementations have traditionally been software-based. Here, the authors present a method for exploiting the inherent parallelism in computing both the value function and the optimal policy. This allows the optimal policy to be obtained several orders of magnitude faster than with traditional software implementations, establishing the viability of the approach for demanding real-time applications. The well-known rental car management problem was studied as a benchmark, for which a field-programmable gate array (FPGA)-based implementation was designed. The results highlight the advantages of the proposed approach with respect to execution speed and scalability.
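The parallelism mentioned in the abstract can be illustrated in software with a minimal sketch of synchronous value iteration on a fully known Markov decision process. This is not the authors' hardware design, which the abstract does not detail; it only shows why the computation parallelises: within one sweep, each state's update reads the previous value vector alone, so all states could in principle be updated at the same time. The two-state model and the names `P`, `R`, `bellman_backup` and `value_iteration` are assumptions made for illustration.

```python
def bellman_backup(s, V, P, R, gamma):
    """Return (best value, best action) for state s, reading only the old vector V."""
    best_value, best_action = float("-inf"), None
    for a in range(len(P[s])):
        # Expected return of action a: immediate reward plus discounted next-state value.
        q = R[s][a] + gamma * sum(p * V[s2] for s2, p in enumerate(P[s][a]))
        if q > best_value:
            best_value, best_action = q, a
    return best_value, best_action


def value_iteration(P, R, gamma=0.9, tol=1e-9, max_sweeps=10_000):
    """Synchronous value iteration; returns the value function and a greedy policy."""
    n = len(P)
    V = [0.0] * n
    policy = [0] * n
    for _ in range(max_sweeps):
        # Each backup depends only on the previous V, so the per-state
        # computations are independent and could run concurrently.
        results = [bellman_backup(s, V, P, R, gamma) for s in range(n)]
        new_V = [value for value, _ in results]
        policy = [action for _, action in results]
        delta = max(abs(x - y) for x, y in zip(new_V, V))
        V = new_V
        if delta < tol:
            break
    return V, policy


# Hypothetical toy model: P[s][a][s2] is a transition probability,
# R[s][a] is the expected immediate reward.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # state 0: action 0 stays, action 1 moves to state 1
    [[0.0, 1.0], [1.0, 0.0]],  # state 1: action 0 stays, action 1 moves to state 0
]
R = [
    [0.0, 1.0],
    [2.0, 0.0],
]

V, policy = value_iteration(P, R, gamma=0.9)
print(policy, [round(v, 3) for v in V])
```

In this toy model the greedy policy moves from state 0 to state 1 and then stays there. The list comprehension over states is the part a parallel implementation would distribute across processing elements.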
