Data-reuse exploration under an on-chip memory constraint for low-power FPGA-based systems

IET Computers & Digital Techniques


Contemporary FPGA-based reconfigurable systems are widely used to implement data-dominated applications, in which data transfer and storage consume a large proportion of the system energy. Exploiting data reuse can yield significant power savings, but it also requires additional on-chip memory. To aid data-reuse design exploration early in the design cycle, the authors present an optimisation approach that finds a power-optimal design satisfying an on-chip memory constraint on a targeted FPGA-based platform. The data-reuse exploration problem is formulated mathematically and shown to be equivalent to the multiple-choice knapsack problem. For a given application code, the solution determines which array references are buffered on-chip and where in the code the reused data of those array references are loaded into on-chip memory, so that power consumption is minimised for a fixed on-chip memory size. The authors also present an experimentally verified power model that provides the relative power of the different data-reuse design options of an application, which makes design-space exploration fast and efficient. The experimental results demonstrate that the approach finds the most power-efficient design for all the benchmark circuits tested.
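To make the multiple-choice knapsack formulation concrete, the following is a minimal sketch and not the authors' implementation. It assumes each array reference has a small set of data-reuse options, each described by an on-chip memory cost and a relative power figure, with a zero-memory option standing for "no on-chip buffering". The function name `select_reuse_options`, the `(memory, power)` tuple layout, and all numbers are hypothetical. The sketch picks exactly one option per array reference by dynamic programming over the memory budget.

```python
from typing import List, Tuple

# Each option is (on_chip_memory_units, relative_power).
# All values below are hypothetical and are not taken from the paper.
Option = Tuple[int, float]


def select_reuse_options(groups: List[List[Option]], budget: int) -> Tuple[float, List[int]]:
    """Choose exactly one option per array reference so that total power is
    minimal and total on-chip memory does not exceed `budget`."""
    inf = float("inf")
    # best[m] = minimum power for the groups seen so far using exactly m memory units
    best = [inf] * (budget + 1)
    best[0] = 0.0
    # picks[g][m] = option index chosen for group g when m units are used after group g
    picks: List[List[int]] = []

    for options in groups:
        new_best = [inf] * (budget + 1)
        pick = [-1] * (budget + 1)
        for used in range(budget + 1):
            if best[used] == inf:
                continue  # this memory usage is unreachable
            for idx, (mem, power) in enumerate(options):
                total = used + mem
                if total <= budget and best[used] + power < new_best[total]:
                    new_best[total] = best[used] + power
                    pick[total] = idx
        best = new_best
        picks.append(pick)

    end = min(range(budget + 1), key=lambda m: best[m])
    if best[end] == inf:
        raise ValueError("no selection fits the memory budget")

    # Walk backwards through the groups to recover the chosen option indices.
    selection: List[int] = []
    used = end
    for g in range(len(groups) - 1, -1, -1):
        idx = picks[g][used]
        selection.append(idx)
        used -= groups[g][idx][0]
    selection.reverse()
    return best[end], selection


# Two hypothetical array references; option 0 means "no on-chip buffering".
example_groups = [
    [(0, 10.0), (2, 6.0), (4, 3.0)],  # array reference A
    [(0, 8.0), (3, 2.0)],             # array reference B
]
power, chosen = select_reuse_options(example_groups, budget=5)
print(power, chosen)
```

With a budget of 5 memory units, the largest buffers for both references do not fit together (4 + 3 = 7), so the sketch settles on the middle option for A and the buffered option for B. Because only relative power between options matters for the choice, a relative power model of the kind described in the abstract would be enough to drive such a selection.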


