CUDA memory optimisation strategies for motion estimation

As video processing technologies grow in complexity and image resolution faster than central processing unit (CPU) performance, data-parallel computing methods will become even more important. In fact, the high-performance, data-parallel architecture of modern graphics processing units (GPUs) can reduce execution times by orders of magnitude or more. However, creating an optimal GPU implementation requires not only converting sequential algorithms into parallel ones but, more importantly, carefully balancing GPU resources. It also requires an understanding of the bottlenecks and pitfalls caused by memory latency and code execution. The challenge is even greater when an implementation exceeds the available GPU resources. In this study, the authors discuss the parallelisation and memory optimisation strategies of a computer vision application for motion estimation using the NVIDIA compute unified device architecture (CUDA). They address optimisation techniques for algorithms that exceed the GPU's computation or memory resources under the CUDA architecture. The proposed implementation shows a substantial improvement in both speed-up (SU) and peak signal-to-noise ratio (PSNR). Indeed, the implementation is up to 50 times faster than its CPU counterpart, and it increases the PSNR of the coded test sequences by up to 8 dB.
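The abstract only summarises the memory optimisation strategies; as a hedged illustration of the kind of optimisation it alludes to (not the authors' actual implementation), the sketch below stages the current macroblock in shared memory during full-search block-matching motion estimation, so every candidate displacement reuses the block without repeated global-memory reads. The kernel name sadSearch, the 16x16 block size, and the +/-8 search range are assumptions chosen for illustration; the 64-bit atomicMin requires compute capability 3.5 or higher.

```cuda
// Hypothetical sketch: shared-memory caching for full-search block matching.
// One thread block handles one macroblock; one thread scores one candidate
// displacement. BLOCK and RANGE are assumed values, not from the paper.
#include <cstdio>
#include <cuda_runtime.h>

#define BLOCK 16   // macroblock width/height (assumed)
#define RANGE 8    // search range in pixels, +/- (assumed)

__global__ void sadSearch(const unsigned char* cur, const unsigned char* ref,
                          int width, int height, int2* bestMV)
{
    // Current macroblock cached in shared memory: loaded once from global
    // memory, then reused by all (2*RANGE+1)^2 candidate displacements.
    __shared__ unsigned char curBlk[BLOCK][BLOCK];

    int mbX = blockIdx.x * BLOCK;   // macroblock origin in the frame
    int mbY = blockIdx.y * BLOCK;

    // Cooperative load of the 16x16 current block (first 256 threads).
    int tid = threadIdx.y * blockDim.x + threadIdx.x;
    if (tid < BLOCK * BLOCK) {
        int r = tid / BLOCK, c = tid % BLOCK;
        curBlk[r][c] = cur[(mbY + r) * width + (mbX + c)];
    }
    __syncthreads();

    // Each thread computes the SAD of one candidate displacement (dx, dy).
    int dx = (int)threadIdx.x - RANGE;
    int dy = (int)threadIdx.y - RANGE;
    unsigned int sad = 0xFFFFFFFFu;   // worst score for out-of-frame candidates
    if (mbX + dx >= 0 && mbY + dy >= 0 &&
        mbX + dx + BLOCK <= width && mbY + dy + BLOCK <= height) {
        sad = 0;
        for (int r = 0; r < BLOCK; ++r)
            for (int c = 0; c < BLOCK; ++c)
                sad += abs((int)curBlk[r][c] -
                           (int)ref[(mbY + dy + r) * width + (mbX + dx + c)]);
    }

    // Argmin reduction: pack (SAD, tid) into one 64-bit key so a single
    // shared-memory atomicMin selects the lowest-SAD candidate.
    __shared__ unsigned long long best;
    if (tid == 0) best = 0xFFFFFFFFFFFFFFFFull;
    __syncthreads();
    unsigned long long key = ((unsigned long long)sad << 16) | (unsigned int)tid;
    atomicMin(&best, key);
    __syncthreads();
    if (tid == (int)(best & 0xFFFF)) {
        int mbIdx = blockIdx.y * gridDim.x + blockIdx.x;
        bestMV[mbIdx] = make_int2(dx, dy);   // winning motion vector
    }
}

int main()
{
    const int width = 64, height = 64;   // toy frame size (assumed)
    size_t fsz = (size_t)width * height;
    unsigned char *dCur, *dRef;
    int2 *dMV;
    cudaMalloc(&dCur, fsz);
    cudaMalloc(&dRef, fsz);
    cudaMalloc(&dMV, (width / BLOCK) * (height / BLOCK) * sizeof(int2));
    cudaMemset(dCur, 0, fsz);   // dummy frames; real code would upload
    cudaMemset(dRef, 0, fsz);   // two consecutive video frames here
    dim3 grid(width / BLOCK, height / BLOCK);          // one block per macroblock
    dim3 threads(2 * RANGE + 1, 2 * RANGE + 1);        // one thread per candidate
    sadSearch<<<grid, threads>>>(dCur, dRef, width, height, dMV);
    cudaDeviceSynchronize();
    printf("done: %s\n", cudaGetErrorString(cudaGetLastError()));
    cudaFree(dCur); cudaFree(dRef); cudaFree(dMV);
    return 0;
}
```

The packed-key atomicMin is one way to resolve the winner in a single pass rather than a separate reduction kernel; it works here because the SAD fits well within 48 bits and the thread index within 16.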

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-cdt.2017.0149