Selective block buffering TLB system for embedded processors

IEE Proceedings - Computers and Digital Techniques

The authors present a translation lookaside buffer (TLB) system with low power consumption for embedded processors. The proposed TLB is constructed as multiple banks, each with an associated block buffer and a corresponding comparator. Either the block buffer or the main bank is selectively accessed on the basis of two bits in the tag buffer. Dynamic power savings are achieved by reducing the number of entries accessed in parallel, as a result of using the tag buffer as a filtering mechanism. The performance overhead of the proposed TLB is negligible compared with other hierarchical TLB structures. For example, the two-cycle overhead of the proposed TLB is only ∼1%, as compared with 5% overhead for a filter (micro)-TLB and 14% overhead for a banked-TLB with block buffering. The authors show that the average hit ratios of the block buffers and the main banks of the proposed TLB are 94% and 6%, respectively. Dynamic power is reduced by ∼93% with respect to a fully associative TLB, 87% with respect to a filter-TLB and 60% relative to a banked-TLB with block buffering. Therefore, significant power savings are achieved with only a small performance degradation.
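
The access pattern described above can be illustrated with a minimal sketch. It is not the authors' design: the bank count, bank size, bank selection by low-order bits of the virtual page number, single-entry block buffers, FIFO replacement, and all names (`Bank`, `BankedTLB`, `lookup`) are assumptions made for illustration. The two-bit tag-buffer selection is not specified in the abstract, so the sketch simply probes the block buffer first and falls back to the selected bank. It counts entries compared per lookup to show why accessing fewer entries in parallel can reduce dynamic power.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Bank:
    """One TLB bank plus its block buffer (hypothetical layout)."""
    capacity: int
    entries: Dict[int, int] = field(default_factory=dict)  # VPN -> PFN
    block_buffer: Optional[Tuple[int, int]] = None          # last (VPN, PFN) used


class BankedTLB:
    """Toy model: block buffer probed first, then only the selected bank."""

    def __init__(self, num_banks: int = 4, entries_per_bank: int = 8):
        self.banks = [Bank(entries_per_bank) for _ in range(num_banks)]
        self.buffer_hits = 0
        self.bank_hits = 0
        self.misses = 0
        self.comparisons = 0  # entries compared, a rough proxy for dynamic power

    def lookup(self, vpn: int, page_table: Dict[int, int]) -> int:
        # Assumption: the bank is chosen by the low-order bits of the VPN.
        bank = self.banks[vpn % len(self.banks)]

        # Step 1: probe the block buffer, which needs a single comparator.
        self.comparisons += 1
        if bank.block_buffer is not None and bank.block_buffer[0] == vpn:
            self.buffer_hits += 1
            return bank.block_buffer[1]

        # Step 2: search only the selected bank, not the whole TLB.
        self.comparisons += bank.capacity
        if vpn in bank.entries:
            self.bank_hits += 1
            pfn = bank.entries[vpn]
        else:
            # Miss: refill from the page table, evicting the oldest entry (FIFO).
            self.misses += 1
            pfn = page_table[vpn]
            if len(bank.entries) >= bank.capacity:
                bank.entries.pop(next(iter(bank.entries)))
            bank.entries[vpn] = pfn

        # Remember the most recent translation in the block buffer.
        bank.block_buffer = (vpn, pfn)
        return pfn


if __name__ == "__main__":
    page_table = {vpn: vpn + 100 for vpn in range(16)}
    tlb = BankedTLB()
    trace = [0, 0, 0, 1, 1, 0, 4, 0]
    for vpn in trace:
        tlb.lookup(vpn, page_table)
    fully_associative = len(trace) * 4 * 8  # every entry compared on every access
    print(tlb.buffer_hits, tlb.bank_hits, tlb.misses)
    print(tlb.comparisons, "vs", fully_associative)
```

On this short trace the toy model compares 40 entries in total, whereas a 32-entry fully associative TLB would compare 256. The ratio here is a property of the toy trace only and says nothing about the power figures reported in the abstract.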

http://iet.metastore.ingenta.com/content/journals/10.1049/ip-cdt_20045025