Compact and high-throughput parameterisable architectures for memory-based FFT algorithms

Compact and high-throughput parameterisable architectures for memory-based FFT algorithms

For access to this article, please select a purchase option:

Buy article PDF
(plus tax if applicable)
Buy Knowledge Pack
10 articles for $120.00
(plus taxes if applicable)

IET members benefit from discounts to all IET publications and free access to E&T Magazine. If you are an IET member, log in to your account and the discounts will automatically be applied.

Learn more about IET membership 

Recommend Title Publication to library

You must fill out fields marked with: *

Librarian details
Your details
Why are you recommending this title?
Select reason:
IET Circuits, Devices & Systems — Recommend this title to your library

Thank you

Your recommendation has been sent to your librarian.

Designers must carefully choose the best-suited fast Fourier transform (FFT) algorithm among various available techniques for the custom implementation that meets their design requirements, such as throughput, latency, and area. This article, to the best of authors' knowledge, is the first to present a compact and yet high-throughput parameterisable hardware architecture for implementing different FFT algorithms, including radix-2, radix-4, radix-8, mixed-radix, and split-radix algorithms. The designed architectures are fully parameterisable to support a variety of transform lengths and variable word-lengths. The FFT algorithms have been modelled and simulated in double-precision floating-point and fixed-point representations using authors' custom-developed library of numerical operations. The designed FFT architectures are modelled in Verilog hardware description language and their cycle-accurate and bit-true simulation results are verified against their fixed-point simulation models. The characteristics and implementation results of various FFT architectures on a Xilinx Virtex-7 FPGA are presented. Compared to recently published works, authors' memory-based FFT architectures utilise less reconfigurable resources while maintaining comparable or higher operating frequencies. The ASIC implementation results in a standard 45-nm CMOS technology are also presented for the designed memory-based FFT architectures. The execution times of FFTs on a workstation and a graphics processing unit are compared against authors' FPGA implementations.

Inspec keywords: logic design; application specific integrated circuits; signal flow graphs; CMOS memory circuits; fixed point arithmetic; fast Fourier transforms; memory architecture; discrete Fourier transforms; integrated circuit design; hardware description languages; CMOS logic circuits; field programmable gate arrays; reconfigurable architectures; floating point arithmetic

Other keywords: radix-4 algorithm; double-precision floating-point representation; fixed-point representations; author memory-based FFT architectures; ASIC; fixed-point simulation models; numerical operations; radix-8 algorithm; Verilog hardware description language; regular structures; cycle-accurate simulation; radix-2 algorithm; symmetries properties; standard CMOS technology; compact-throughput parameterisable architectures; discrete FFT algorithms; signal flow graphs; split-radix algorithms; size 45 nm; bit-true simulation; mixed-radix algorithm; reconfigurable resources; Xilinx Virtex-7 FPGA; author custom-developed library; high-throughput parameterisable architectures; fast Fourier transform algorithm; synthesisable FFT architectures

Subjects: Digital circuit design, modelling and testing; Storage system design; Logic circuits; Microprocessors and microcomputers; Digital arithmetic methods; Combinatorial mathematics; CMOS integrated circuits; Logic and switching circuits; Integral transforms in numerical analysis; Computer architecture; Integral transforms in numerical analysis; Logic design methods; Semiconductor storage; Combinatorial mathematics; Memory circuits


    1. 1)
      • 1. Cooley, J.W., Tukey, J.W.: ‘An algorithm for the machine calculation of complex Fourier series’, Math. Comput., 1965, 19, (1), pp. 297301.
    2. 2)
      • 2. Duhamel, P., Hollmann, H.: ‘Split radix FFT algorithm’, Electron. Lett., 1984, 20, (4), pp. 1416.
    3. 3)
      • 3. Saenz, J., Raygoza, J., Becerra, E., et al: ‘FPGA design and implementation of radix-2 fast Fourier transform algorithm with 16 and 32 points’. IEEE Int. Autumn Meeting on Power, Electronics and Computing, Ixtapa, Mexico, 2015, pp. 16.
    4. 4)
      • 4. Bansal, M., Nakhate, S.: ‘High speed pipelined 64-point FFT processor based on radix-22 for wireless LAN’. Int. Conf. on Signal Processing and Integrated Networks, Noida, India, 2017, pp. 607612.
    5. 5)
      • 5. Suleiman, I.: ‘FPGA implementation of low power 64-point radix-4 FFT processor for OFDM system’. Int. Conf. on Computers, Communications, & Signal Processing, Kuala Lumpur, Malaysia, 2005, pp. 278281.
    6. 6)
      • 6. ‘Xilinx Incorporation’. Available at, accessed August 2018.
    7. 7)
      • 7. Chen, J., Hu, J., Lee, S., et al: ‘Hardware efficient mixed radix-25/16/9 FFT for LTE systems’, IEEE Trans. Very Large Scale Integr. (VLSI) Syst., 2015, 23, (2), pp. 221228.
    8. 8)
      • 8. Uzun, I.S., Amira, A., Bouridane, A.: ‘FPGA implementations of fast Fourier transforms for real-time signal and image processing’, IEE Proc., Vis. Image Signal Process., 2005, 152, (3), pp. 283296.
    9. 9)
      • 9. Siu, T., Sham, C., Lau, F.: ‘Operating frequency improvement on FPGA implementation of a pipeline large-FFT processor’. Int. Conf. on Advanced Communication Technology, Bongpyeong, South Korea, 2017, pp. 59.
    10. 10)
      • 10. Garrido, M., Sanchez, M., Lopez Vallejo, M.L., et al: ‘A 4096-point radix-4 memory-based FFT using DSP slices’, IEEE Trans. Very Large Scale Integr. (VLSI) Syst., 2017, 25, (1), pp. 375379.
    11. 11)
      • 11. Shih, X., Chou, H.: ‘Reconfigurable VLSI design of a changeable hybrid-radix FFT hardware architecture with 2D-FIFO storing structure for 3GPP LTE systems’, ICT Express, 2018, 4, (3), pp. 144148.
    12. 12)
      • 12. Yu, C., Lee, K., Kuo, C.: ‘Low-complexity twiddle factor generator for FFT processors’. Int. Conf. on Consumer Electronics, Las Vegas, USA, 2017, pp. 350351.
    13. 13)
      • 13. Long, S., Hong, M., Shiue, M.: ‘A low-complexity generalized memory addressing scheme for continous-flow fast Fourier transform’. Int. Conf. on Computer and Communication Systems, Nagoya, Japan, 2018, pp. 492496.
    14. 14)
      • 14. Parhi, K., Messerschmitt, D.G.: ‘Pipeline interleaving and parallelism in recursive digital filters – part I: pipelining using scattered look-ahead and decomposition’, IEEE Trans. Acoust. Speech Signal Process., 1989, 37, (7), pp. 10991117.
    15. 15)
      • 15. Ma, C., Chen, H., Yu, J., et al: ‘A novel conflict-free parallel memory access scheme for FFT constant geometry architectures’, Sci. China Inform. Sci., 2013, 56, (4), pp. 19.
    16. 16)
      • 16. Wang, L., Zhou, X., Sobelman, G., et al: ‘Generic mixed-radix FFT pruning’, IEEE Signal Process. Lett., 2012, 19, (3), pp. 167170.
    17. 17)
      • 17. ‘On-line Encyclopedia of Integer Sequences’. Available at, accessed December 2018.

Related content

This is a required field
Please enter a valid email address