Quadruple throughput fixed point quarter precision multiply accumulate circuit design

IET Computers & Digital Techniques

This study proposes an efficient very large scale integration (VLSI) architecture for a quadruple throughput fixed point multiply accumulate circuit (MAC). The proposed n × n-bit MAC can perform one n × n-bit, two n × (n/2)-bit, or four (n/2) × (n/2)-bit MAC operations in parallel. The objective is to improve on the throughput of existing MAC designs. The proposed and existing designs are implemented using a 45 nm TSMC CMOS library, and the results show that the proposed architecture achieves higher throughput than the existing designs. For example, the proposed 32 × 32-bit MAC architecture achieves a 60.4% improvement in throughput over the existing array multiplier-based double throughput MAC.
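
The three operating modes can be illustrated with a small behavioural sketch in Python. It is not the authors' circuit: the abstract does not say how operands are packed or routed, so the function names (`split`, `products`, `mac`), the unsigned-operand assumption, and the choice of which half-words are paired in each mode are all hypothetical. The sketch only shows the arithmetic fact that makes such sharing plausible: an n × n product decomposes into two n × (n/2) products, or into four (n/2) × (n/2) products.

```python
def split(word, n):
    """Split an n-bit unsigned word into (high, low) halves of n/2 bits each."""
    half = n // 2
    mask = (1 << half) - 1
    return (word >> half) & mask, word & mask


def products(a, b, n, mode):
    """Return the independent products for mode 1, 2 or 4.

    Assumption: operands are unsigned and the lane pairing below is
    illustrative only; the paper's actual operand routing is not given
    in the abstract.
    """
    a_hi, a_lo = split(a, n)
    b_hi, b_lo = split(b, n)
    if mode == 1:   # one n x n product
        return [a * b]
    if mode == 2:   # two n x (n/2) products
        return [a * b_lo, a * b_hi]
    if mode == 4:   # four (n/2) x (n/2) products
        return [a_lo * b_lo, a_lo * b_hi, a_hi * b_lo, a_hi * b_hi]
    raise ValueError("mode must be 1, 2 or 4")


def mac(acc, prods):
    """Accumulate each product into its own accumulator lane."""
    return [lane + p for lane, p in zip(acc, prods)]


if __name__ == "__main__":
    n = 32
    a, b = 0x12345678, 0x9ABCDEF0
    acc = mac([0, 0, 0, 0], products(a, b, n, 4))
    print(acc)
```

With n = 32, the four products returned in mode 4 recombine into the full product as `p[0] + ((p[1] + p[2]) << 16) + (p[3] << 32)`, which is why one 32 × 32 partial-product array has enough hardware for four 16 × 16 multiplications.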

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-cdt.2017.0051