
Area- and power-efficient iterative single/double-precision merged floating-point multiplier on FPGA

In this study, an area- and power-efficient iterative floating-point (FP) multiplier with a pipelined architecture is designed and implemented on FPGA devices. The proposed multiplier supports both single-precision (SP) and double-precision (DP) operations. The operation mode can be switched at run time by changing the precision selection signal. The Karatsuba algorithm is applied when mapping the mantissa multiplier in order to reduce the number of digital signal processing (DSP) blocks required. For DP operations, an iterative method is applied, which requires much less hardware than a fully pipelined DP multiplier and thus reduces power consumption. To further reduce power consumption, the logic blocks that are unused in a given operation mode are disabled. Compared to previous work, the proposed multiplier achieves a 33% reduction in DSP blocks, 4.3% fewer look-up tables (LUTs), and 31.2% fewer flip-flops while running at a 4% higher clock frequency on Virtex-5 devices. Compared to the intellectual property core DP multipliers provided by the FPGA vendors, the proposed multiplier requires fewer DSP blocks and achieves lower power consumption. The mapping solutions and implementation results of the proposed multiplier on Xilinx Virtex-7 and Altera Arria-10 devices are also presented. In addition, the results of a direct implementation of the proposed architecture on an STM 90 nm ASIC platform are reported.
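
The DSP saving attributed to the Karatsuba algorithm can be illustrated with a minimal software sketch. It shows one level of Karatsuba splitting on integer significands, which needs three sub-multiplications where a schoolbook split needs four. The function name `karatsuba_split`, the 12-bit split point, and the sample operands are assumptions chosen for illustration; the abstract does not describe the authors' actual DSP mapping, so this is not their design.

```python
def karatsuba_split(a: int, b: int, half_bits: int) -> int:
    """Multiply two non-negative integers with one level of Karatsuba splitting."""
    mask = (1 << half_bits) - 1
    a_hi, a_lo = a >> half_bits, a & mask
    b_hi, b_lo = b >> half_bits, b & mask

    # Three sub-multiplications instead of the four a schoolbook split needs.
    p_hi = a_hi * b_hi
    p_lo = a_lo * b_lo
    p_mid = (a_hi + a_lo) * (b_hi + b_lo) - p_hi - p_lo

    # Recombine the partial products at their bit offsets.
    return (p_hi << (2 * half_bits)) + (p_mid << half_bits) + p_lo


# Hypothetical 24-bit single-precision significands (hidden bit set),
# split at an assumed 12-bit boundary.
x = 0xC00000  # 1.5  * 2**23
y = 0xA00000  # 1.25 * 2**23
product = karatsuba_split(x, y, 12)
```

In hardware, each sub-multiplication would occupy multiplier resources, so trading one multiplication for a few extra additions and subtractions is the general reason the technique can lower the DSP block count.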

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-cdt.2016.0100