Reducing power of functional units in high-performance processors by checking instruction codes and resizing adders

A hardware technique to reduce static and dynamic power consumption in the functional units of 64-bit high-performance processors is presented. The instructions that require an adder have been studied, and it can be concluded that in a large percentage of them one of the two source operands is always narrow and does not require a 64-bit adder. Furthermore, by analysing the executed applications, it is feasible to classify their internal operations according to their bit-width requirements and to select the adder type that each instruction requires. The approach is based on substituting some of the power-hungry 64-bit adders with 32-bit ones, which consume much less power, and on modifying the issue protocol so that as many instructions as possible are sent to these low-power units while incurring negligible performance penalties. Five different configurations of the execution units were tested. Results indicate that this technique can save up to 50% of the power consumed by the adders and up to 21% of the overall power consumption in the execution unit of high-performance architectures. Moreover, the simulations show good results in terms of power efficiency (IPC/W), and they suggest that the technique could help prevent the creation of hot spots in the functional units.
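The issue policy described above can be illustrated with a minimal sketch. It is not the authors' implementation: the opcode names in `NARROW_OPCODES`, the `Adder` record, the `select_adder` function, and the rule of falling back to a free 64-bit adder when no 32-bit adder is available are all assumptions made for illustration only.

```python
from dataclasses import dataclass

# Hypothetical opcode classes assumed to be safe on a 32-bit adder.
# The real classification would come from the instruction-code analysis.
NARROW_OPCODES = {"add_narrow", "branch_offset"}


@dataclass
class Adder:
    width: int          # 32 or 64 bits
    busy: bool = False  # True while the unit is occupied


def select_adder(opcode, adders):
    """Return the index of the adder to issue to, or None to stall.

    Narrow instructions prefer a free 32-bit adder and fall back to a
    free 64-bit one (an assumed policy to limit performance loss).
    Wide instructions can only use a 64-bit adder.
    """
    needs_64 = opcode not in NARROW_OPCODES
    candidates = [
        i for i, adder in enumerate(adders)
        if not adder.busy and (adder.width == 64 or not needs_64)
    ]
    if not candidates:
        return None
    # Pick the narrowest adder that is wide enough.
    return min(candidates, key=lambda i: adders[i].width)


# Usage: one 64-bit and one 32-bit adder, both free.
units = [Adder(64), Adder(32)]
narrow_choice = select_adder("add_narrow", units)  # index of the 32-bit unit
wide_choice = select_adder("add_wide", units)      # index of the 64-bit unit
```

Choosing the narrowest sufficient unit is one simple way to steer as many instructions as possible to the low-power adders; the fallback to a wider unit is what would keep the performance penalty small in this sketch.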


