Accelerating wavelet lifting on graphics hardware using CUDA

W.J. van der Laan, J.B.T.M. Roerdink, A.C. Jalba

Research output: Contribution to journal › Article › Academic › peer-review


Abstract

The Discrete Wavelet Transform (DWT) has a wide range of applications, from signal processing to video and image compression. We show that this transform, by means of the lifting scheme, can be performed in a memory- and computation-efficient way on modern, programmable GPUs, which can be regarded as massively parallel coprocessors through NVIDIA's CUDA compute paradigm. The three main hardware architectures for the 2D DWT (row-column, line-based, block-based) are shown to be unsuitable for a CUDA implementation. Our CUDA-specific design can be regarded as a hybrid between the row-column and block-based methods. We achieve considerable speedups compared to an optimized CPU implementation and earlier non-CUDA-based GPU DWT methods, both for 2D images and 3D volume data. Additionally, memory usage can be reduced significantly compared to previous GPU DWT methods. The method is scalable and is the fastest GPU implementation among the methods considered. A performance analysis shows that the results of our CUDA-specific design are in close agreement with our theoretical complexity analysis.
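For readers unfamiliar with the lifting scheme mentioned in the abstract, the following is a minimal CPU-side sketch in Python, not the CUDA design of the paper. It assumes the integer LeGall 5/3 wavelet, a single decomposition level, an even-length 1D signal, and symmetric boundary extension; the function names `lift_53_forward` and `lift_53_inverse` are hypothetical. It illustrates why lifting is attractive for parallel hardware: within the predict step, and again within the update step, every output sample can be computed independently of the others.

```python
def lift_53_forward(x):
    """One level of the integer LeGall 5/3 lifting transform (illustrative sketch)."""
    n = len(x)
    if n < 2 or n % 2:
        raise ValueError("expected an even-length signal with at least 2 samples")
    s = list(x[0::2])  # even samples, become the approximation coefficients
    d = list(x[1::2])  # odd samples, become the detail coefficients
    half = n // 2
    # Predict step: subtract from each odd sample the floor-average of its
    # two even neighbours (the right edge is mirrored).
    for i in range(half):
        right = s[i + 1] if i + 1 < half else s[i]
        d[i] -= (s[i] + right) >> 1
    # Update step: add to each even sample a rounded quarter of the sum of
    # its two neighbouring details (the left edge is mirrored).
    for i in range(half):
        left = d[i - 1] if i > 0 else d[i]
        s[i] += (left + d[i] + 2) >> 2
    return s, d


def lift_53_inverse(s, d):
    """Undo lift_53_forward by running the lifting steps in reverse order."""
    s, d = list(s), list(d)
    half = len(s)
    # Undo the update step first, then the predict step.
    for i in range(half):
        left = d[i - 1] if i > 0 else d[i]
        s[i] -= (left + d[i] + 2) >> 2
    for i in range(half):
        right = s[i + 1] if i + 1 < half else s[i]
        d[i] += (s[i] + right) >> 1
    # Interleave even and odd samples back into one signal.
    x = [0] * (2 * half)
    x[0::2] = s
    x[1::2] = d
    return x
```

Because the inverse applies exactly the same integer expressions with opposite signs, reconstruction is lossless. A 2D transform in the row-column style would apply such a 1D pass along the rows and then along the columns; how those passes are organized on the GPU is the subject of the paper and is not reproduced here.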
Original language: English
Pages (from-to): 132-146
Journal: IEEE Transactions on Parallel and Distributed Systems
Volume: 22
Issue number: 1
Publication status: Published - 2011
