Area- and power-efficient iterative single/double-precision merged floating-point multiplier on FPGA

IET Computers & Digital Techniques

In this study, an area- and power-efficient iterative floating-point (FP) multiplier with a pipelined architecture is designed and implemented on FPGA devices. The proposed multiplier supports both single-precision (SP) and double-precision (DP) operations, and the operation mode can be switched at run time by changing the precision selection signal. The Karatsuba algorithm is applied when mapping the mantissa multiplier in order to reduce the number of digital signal processing (DSP) blocks required. For DP operations, an iterative method is applied, which requires much less hardware than a fully pipelined DP multiplier and thus reduces power consumption. To reduce power consumption further, the logic blocks that are unused in a given operation mode are disabled. Compared to previous work, the proposed multiplier achieves a 33% reduction in DSP blocks, 4.3% fewer look-up tables (LUTs), and 31.2% fewer flip-flops while running at a 4% higher clock frequency on Virtex-5 devices. Compared to the intellectual property core DP multiplier provided by the FPGA vendors, the proposed multiplier requires fewer DSP blocks and achieves lower power consumption. The mapping solutions and implementation results of the proposed multiplier on Xilinx Virtex-7 and Altera Arria-10 devices are also presented. In addition, the results of a direct implementation of the proposed architecture on an STM 90 nm ASIC platform are reported.
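
The saving that the Karatsuba algorithm offers for a mantissa multiplier can be illustrated with a minimal software sketch. The abstract does not give the operand split or the DSP mapping used in the article, so the function name `karatsuba_split` and the split positions below are illustrative assumptions only. The sketch shows the general idea: one level of Karatsuba replaces four half-width products with three, at the cost of a few extra additions and subtractions.

```python
def karatsuba_split(a: int, b: int, k: int) -> int:
    """Multiply two unsigned integers with one level of Karatsuba.

    k is the bit position at which both operands are split. It is an
    illustrative parameter, not a value taken from the article.
    """
    mask = (1 << k) - 1
    a_lo, a_hi = a & mask, a >> k
    b_lo, b_hi = b & mask, b >> k

    # Three sub-products instead of the four a schoolbook split needs.
    p_hi = a_hi * b_hi
    p_lo = a_lo * b_lo
    p_mid = (a_hi + a_lo) * (b_hi + b_lo) - p_hi - p_lo

    # Recombine: a*b = p_hi * 2^(2k) + p_mid * 2^k + p_lo
    return (p_hi << (2 * k)) + (p_mid << k) + p_lo


# 24-bit SP significand (hidden one plus 23 fraction bits), split at bit 12.
sp_a = (1 << 23) | 0x123456
sp_b = (1 << 23) | 0x0ABCDE
assert karatsuba_split(sp_a, sp_b, 12) == sp_a * sp_b

# 53-bit DP significand (hidden one plus 52 fraction bits), split at bit 27.
dp_a = (1 << 52) | 0x123456789ABCD
dp_b = (1 << 52) | 0x0FEDCBA987654
assert karatsuba_split(dp_a, dp_b, 27) == dp_a * dp_b
```

In hardware, each of the three sub-products would map to multiplier resources, which is why removing one of them can lower the DSP block count. How the article actually partitions the significands across DSP blocks is not stated in the abstract.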

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-cdt.2016.0100