Graphics processing unit implementation and optimisation of a flexible maximum a-posteriori decoder for synchronisation correction

In this paper, the author presents an optimised parallel implementation of a flexible maximum a-posteriori decoder for synchronisation error correcting codes, supporting a very wide range of code sizes and channel conditions. On mid-range GPUs, the author demonstrates decoding speedups of more than two orders of magnitude over a central processing unit implementation of the same optimised algorithm, and more than an order of magnitude over the author's earlier GPU implementation. The main challenge is to maintain high parallelisation efficiency over a wide range of code sizes and channel conditions, and across different execution hardware. The author ensures this with a dynamic strategy for choosing parallel execution parameters at run-time. The author also presents a variant that trades off some decoding speed for a significantly reduced memory requirement, with no loss in the decoder's error correction performance. The increased throughput of this implementation and its ability to work with less memory allow the analysis of larger codes and poorer channel conditions, and make practical use of such codes more feasible.

Inspec keywords: error correction codes; graphics processing units; parallel algorithms; synchronisation; maximum likelihood decoding

Other keywords: synchronisation error correcting codes; flexible maximum a-posteriori decoder optimisation; high parallelisation efficiency; GPUs; channel conditions; code sizes; dynamic strategy; parallel execution parameters; central processing unit; optimised parallel algorithm; decoder error correction performance; graphics processing unit

Subjects: Codes; Microprocessors and microcomputers

http://iet.metastore.ingenta.com/content/journals/10.1049/joe.2014.0049