CUDA memory optimisation strategies for motion estimation

As video processing technologies continue to grow in complexity and image resolution faster than central processing unit (CPU) performance, data-parallel computing methods will become even more important. The high-performance, data-parallel architecture of modern graphics processing units (GPUs) can reduce execution times by orders of magnitude. However, creating an optimal GPU implementation requires not only converting sequential implementations of algorithms into parallel ones but, more importantly, carefully balancing the GPU resources. It also requires an understanding of the bottlenecks and defects caused by memory latency and code computing. The challenge is even greater when an implementation exceeds the GPU resources. In this study, the authors discuss the parallelisation and memory optimisation strategies of a computer vision application for motion estimation using the NVIDIA compute unified device architecture (CUDA). The study addresses optimisation techniques for algorithms that exceed the GPU resources, in either computation or memory, on the CUDA architecture. The proposed implementation shows a substantial improvement in both speed-up (SU) and peak signal-to-noise ratio (PSNR): it is up to 50 times faster than its CPU counterpart and increases the PSNR of the coded test sequence by up to 8 dB.

Inspec keywords: parallel architectures; motion estimation; graphics processing units; computer vision; memory architecture; video signal processing; image resolution

Other keywords: video processing technologies; NVIDIA compute unified device architecture; CUDA memory optimisation strategies; peak signal-to-noise ratio; memory resources; code computing; central processing unit; high-performance data-parallel architecture; graphic processor units; CPU performance; GPUs; motion estimation; PSNR; memory latency; computer vision application; data-parallel computing methods; image resolution

Subjects: Parallel architecture; Optical, image and video signal processing; Video signal processing; Computer vision and image processing techniques; Microprocessor chips; Microprocessors and microcomputers

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-cdt.2017.0149