Bones: an automatic skeleton-based C-to-CUDA compiler for GPUs

C. Nugteren, H. Corporaal

Research output: Contribution to journal › Article › Academic › peer-reviewed

21 Citations (Scopus)
4 Downloads (Pure)


The shift toward parallel processor architectures has made programming and code generation increasingly challenging. To address this programmability challenge, this article presents a technique to fully automatically generate efficient and readable code for parallel processors (with a focus on GPUs). This is made possible by combining algorithmic skeletons, traditional compilation, and "algorithmic species," a classification of program code. Compilation starts by automatically annotating C code with class information (the algorithmic species). This code is then fed into the skeleton-based source-to-source compiler Bones to generate CUDA code. To generate efficient code, Bones also performs optimizations including host-accelerator transfer optimization and kernel fusion. This results in a unique approach, integrating a skeleton-based compiler for the first time into an automated flow. The benefits are demonstrated experimentally for PolyBench GPU kernels, showing geometric mean speed-ups of 1.4× and 2.4× compared to ppcg and Par4All, and for five Rodinia GPU benchmarks, showing a gap of only 1.2× compared to hand-optimized code.
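As an illustration of the kernel fusion mentioned in the abstract, the following is a minimal sketch in plain C, not output produced by Bones. The function names and the scale-then-offset computation are hypothetical, chosen only to show the idea: two element-wise loops that communicate through an intermediate array can be merged into one loop, which on a GPU would correspond to one kernel launch instead of two and no intermediate array in device memory.

```c
#include <stddef.h>

/* Hypothetical example, not taken from the article.
 * Unfused form: two element-wise passes linked by an intermediate array.
 * Mapped naively to a GPU, each loop would become its own kernel and
 * tmp would be written to and read back from device memory. */
void scale_then_offset_unfused(const float *in, float *tmp, float *out,
                               size_t n, float scale, float offset)
{
    for (size_t i = 0; i < n; i++)
        tmp[i] = in[i] * scale;      /* first pass: produce tmp */
    for (size_t i = 0; i < n; i++)
        out[i] = tmp[i] + offset;    /* second pass: consume tmp */
}

/* Fused form: both passes touch element i only, so they can share one
 * loop body. The intermediate array disappears, and a GPU mapping would
 * need a single kernel. */
void scale_then_offset_fused(const float *in, float *out,
                             size_t n, float scale, float offset)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * scale + offset;
}
```

Calling both functions on the same input should give matching results for values that are exactly representable in single precision; in general, a compiler may contract the fused expression into a fused multiply-add, so the last bit can differ. Fusion of this kind is only valid when each output element depends on the matching input element alone, which is the sort of access-pattern information a classification such as algorithmic species is meant to expose.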
Original language: English
Pages (from-to): 35-1-35-25
Journal: ACM Transactions on Architecture and Code Optimization
Issue number: 4
Status: Published - 2014

