XNORCONV: CNNs accelerator implemented on FPGA using a hybrid CNNs structure and an inter-layer pipeline method

Convolutional neural networks (CNNs) have become a research hotspot because of their high performance in computer vision and pattern recognition. However, owing to the high energy consumption of traditional graphics-processing-unit (GPU)-based CNN implementations, it is difficult to deploy them in portable devices. To address this problem, a hybrid CNN structure (XNORCONV) was proposed and implemented on a field-programmable gate array (FPGA) in this study. Two improvements were applied in XNORCONV. Firstly, the multiplications in the convolutional layer (CONV) were replaced by XNOR operations, saving multipliers and reducing computational complexity. Secondly, an inter-layer pipeline was designed to further accelerate the computation. XNORCONV was implemented on a Xilinx Zynq-7000 xc7z020clg400-1 at a clock frequency of 150 MHz and tested on the MNIST dataset. The experimental results show that XNORCONV classifies each picture from MNIST in , and achieves 98.4% recognition accuracy. Compared with the traditional LeNet-5 on different platforms, XNORCONV reduces the number of multiplications by 85.6% with only 0.4% accuracy loss.
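The core trick summarised above, replacing the multiply-accumulate of a convolution with XNOR and popcount when weights and activations are binarised to ±1, can be sketched as follows. This is an illustrative sketch only, not the paper's actual hardware design; the function name, bit encoding (MSB first, +1 as bit 1, -1 as bit 0), and vector length are assumptions for demonstration.

```python
def xnor_dot(a_bits: int, w_bits: int, n: int) -> int:
    """Dot product of two {-1, +1} vectors packed as n-bit integers.

    With +1 encoded as bit 1 and -1 as bit 0, the product a*b is +1
    exactly when the two bits are equal, so each multiplication
    reduces to an XNOR, and the sum of products becomes
        dot = matches - mismatches = 2 * popcount(XNOR) - n,
    which is why a binarised CONV layer needs no multipliers.
    """
    xnor = ~(a_bits ^ w_bits) & ((1 << n) - 1)  # bitwise XNOR, masked to n bits
    matches = bin(xnor).count("1")              # popcount of agreeing positions
    return 2 * matches - n

# Example: a = [+1,-1,+1,+1], w = [+1,+1,-1,+1]
# products are [+1,-1,-1,+1], so the dot product is 0
a = 0b1011  # MSB first: +1, -1, +1, +1
w = 0b1101  # MSB first: +1, +1, -1, +1
print(xnor_dot(a, w, 4))  # -> 0
```

On an FPGA, the XNOR and popcount map directly onto LUTs, which is what enables the multiplier savings the abstract reports.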

Inspec keywords: convolutional neural nets; logic design; computational complexity; pipeline processing; field programmable gate arrays; multiplying circuits; logic gates; neural chips

Other keywords: Xilinx Zynq-7000 xc7z020clg400-1; clock frequency; XNORCONV; graphic processing units-based CNNs; convolutional neural networks; pattern recognition; field-programmable gate array; energy consumption; XNOR operations; multiplier; computational complexity; convolutional layer; hybrid CNNs structure; FPGA; MNIST dataset; CNNs accelerator; inter-layer pipeline design; computer vision

Subjects: Logic and switching circuits; Logic design methods; Neural net devices; Computational complexity; Digital circuit design, modelling and testing; Neural nets (circuit implementations); Logic elements; Logic circuits

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-ipr.2019.0385