Accelerating wavelet lifting on graphics hardware using CUDA

W.J. van der Laan, J.B.T.M. Roerdink, A.C. Jalba

Research output: Contribution to journal, Article, Academic, peer-reviewed

101 citations (Scopus)
1 download (Pure)


The Discrete Wavelet Transform (DWT) has a wide range of applications, from signal processing to video and image compression. We show that this transform, by means of the lifting scheme, can be performed in a memory- and computation-efficient way on modern, programmable GPUs, which can be regarded as massively parallel coprocessors through NVIDIA's CUDA compute paradigm. The three main hardware architectures for the 2D DWT (row-column, line-based, block-based) are shown to be unsuitable for a CUDA implementation. Our CUDA-specific design can be regarded as a hybrid between the row-column and block-based methods. We achieve considerable speedups compared to an optimized CPU implementation and earlier non-CUDA-based GPU DWT methods, both for 2D images and 3D volume data. Additionally, memory usage can be reduced significantly compared to previous GPU DWT methods. The method is scalable and is the fastest GPU implementation among the methods considered. A performance analysis shows that the results of our CUDA-specific design are in close agreement with our theoretical complexity analysis.
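As a rough illustration of the lifting scheme the abstract refers to, the sketch below runs one level of a 1D integer lifting transform on the CPU. It is not the authors' CUDA implementation. The choice of the reversible 5/3 filter, the mirrored boundary handling, and the function names `lift_forward` and `lift_inverse` are all assumptions made for this example. Each level splits the signal into even and odd samples, predicts the odd samples from their even neighbours, and then updates the even samples with the resulting details.

```python
def lift_forward(x):
    """One level of an integer 5/3 lifting transform (illustrative sketch).

    Returns (s, d): the approximation and detail coefficients.
    Assumes an even-length list of ints; edges are mirrored.
    """
    n = len(x)
    if n < 2 or n % 2:
        raise ValueError("expected an even-length signal with at least 2 samples")
    even, odd = x[0::2], x[1::2]  # split step
    half = n // 2

    # Predict step: detail = odd sample minus the floored mean of its even neighbours.
    d = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]  # mirror at the right edge
        d.append(odd[i] - ((even[i] + right) >> 1))

    # Update step: adjust even samples using the neighbouring details.
    s = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[0]  # mirror at the left edge
        s.append(even[i] + ((left + d[i] + 2) >> 2))
    return s, d


def lift_inverse(s, d):
    """Undo lift_forward by reversing the update and predict steps in turn."""
    half = len(s)

    # Undo the update step to recover the even samples.
    even = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[0]
        even.append(s[i] - ((left + d[i] + 2) >> 2))

    # Undo the predict step to recover the odd samples.
    odd = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]
        odd.append(d[i] + ((even[i] + right) >> 1))

    # Merge: interleave even and odd samples back into one signal.
    x = [0] * (2 * half)
    x[0::2] = even
    x[1::2] = odd
    return x
```

Because each lifting step only adds or subtracts a value computed from the other half of the samples, the inverse simply replays the steps backwards with the signs flipped, so `lift_inverse(*lift_forward(x))` returns `x` exactly for integer input. A 2D row-column transform would apply the same 1D step along every row and then along every column.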
Original language: English
Pages (from-to): 132-146
Journal: IEEE Transactions on Parallel and Distributed Systems
Issue number: 1
Status: Published - 2011
