Low-power branch target buffer for application-specific embedded processors

IEE Proceedings - Computers and Digital Techniques


A methodology for a low-power branch identification mechanism is presented, enabling the design of extremely power-efficient branch predictors for embedded processors. The proposed technique utilises application-specific information regarding the control-flow structure of the program's major loops. This information is used to completely eliminate the power-hungry branch target buffer (BTB) lookups which normally occur at every execution cycle. Exact application knowledge of the program's control-flow structure obviates the power-expensive BTB operations, thus enabling the use of contemporary branch predictors in high-end, yet power-sensitive, embedded processors. The use of exact application knowledge results not only in the complete elimination of the power-hungry BTB structure but also in perfect identification of branches and their target addresses. A cost-efficient and programmable hardware architecture for capturing the control-flow structure of the program is presented. The hardware complexity of the proposed architecture is carefully analysed in terms of power, performance and area overhead. The proposed technique delivers power reductions in excess of 90% for a set of representative embedded benchmarks.
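The abstract does not describe the hardware itself, so the following is only an illustrative sketch of the general idea, not the architecture proposed in the article. It assumes a small table, filled in ahead of time from the known control flow of one loop, that records for each branch its target and the address of the next branch on either path. The fetch stage then needs only one address comparison per instruction instead of a BTB lookup. All names (`BranchEntry`, `trace`, the addresses, the 4-byte instruction size) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BranchEntry:
    """Hypothetical table entry derived from the loop's known control flow."""
    target: int                    # address fetched next if the branch is taken
    next_if_taken: Optional[int]   # next branch address on the taken path
    next_if_not: Optional[int]     # next branch address on the fall-through path


def trace(entries, start_pc, first_branch, outcomes, step=4):
    """Follow the fetch stream for the given branch outcomes.

    Returns (instructions fetched, list of (branch_pc, target) identified).
    Assumption: every branch in the traced region has a table entry.
    """
    pc, expected = start_pc, first_branch
    fetched = 0
    identified = []
    for taken in outcomes:
        if expected is None:       # no further known branch: stop tracing
            break
        while pc != expected:      # one comparator check per fetch, no BTB access
            fetched += 1
            pc += step
        fetched += 1               # the branch instruction itself
        entry = entries[pc]        # table is read only when a branch is reached
        identified.append((pc, entry.target))
        if taken:
            pc, expected = entry.target, entry.next_if_taken
        else:
            pc, expected = pc + step, entry.next_if_not
    return fetched, identified


# A five-instruction loop at 0x100..0x110 closed by one backward branch.
LOOP = {0x110: BranchEntry(target=0x100, next_if_taken=0x110, next_if_not=None)}

fetched, identified = trace(LOOP, 0x100, 0x110, [True, True, False])
conventional_btb_lookups = fetched   # a conventional BTB is probed on every fetch
table_reads = len(identified)        # the sketch reads its table once per branch

print(conventional_btb_lookups, table_reads)
```

In this toy trace the table is read once per executed branch, while a conventional BTB would be probed on every fetched instruction. The sketch only counts accesses; it says nothing about the actual power, performance or area figures analysed in the article.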

http://iet.metastore.ingenta.com/content/journals/10.1049/ip-cdt_20041101