This paper presents a technique to generate efficient and readable code for parallel processors fully automatically. We base our approach on skeleton-based compilation and ‘algorithmic species’, an algorithm classification of program code. We use a tool to automatically annotate C code with species information where possible. The annotated program code is subsequently fed into the skeleton-based source-to-source compiler ‘Bones’, which generates OpenMP, OpenCL or CUDA code and optimises host-accelerator transfers. This results in a unique approach, integrating a skeleton-based compiler for the first time into an automated flow. We demonstrate the benefits of our approach on the PolyBench suite by showing average speed-ups of 1.4x and 1.6x for GPU code compared to ppcg and Par4All respectively, two state-of-the-art compilers.
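To give a feel for the kind of source-to-source step the abstract describes, the sketch below pairs a simple sequential C loop with a hand-written OpenMP counterpart. It is only an illustration under stated assumptions: the function names `scale_sequential` and `scale_openmp` are hypothetical, the "element to element" wording in the comments only approximates how such a loop might be classified, and the OpenMP version was written by hand rather than produced by Bones.

```c
#include <stddef.h>

/* Hypothetical input kernel: every output element depends on exactly one
 * input element at the same index. A species-style classification would
 * describe this access pattern roughly as "element to element" over the
 * index range 0..n-1 (wording approximated for illustration). */
void scale_sequential(const float *in, float *out, size_t n, float factor) {
    for (size_t i = 0; i < n; i++) {
        out[i] = factor * in[i];
    }
}

/* Hand-written sketch of what a parallel OpenMP target for the same loop
 * could look like. Because iterations are independent, a plain
 * "parallel for" is sufficient. If the compiler is run without OpenMP
 * support, the pragma is ignored and the loop runs sequentially. */
void scale_openmp(const float *in, float *out, size_t n, float factor) {
    long count = (long)n; /* signed loop index for older OpenMP versions */
    #pragma omp parallel for
    for (long i = 0; i < count; i++) {
        out[i] = factor * in[i];
    }
}
```

Both functions compute the same result, so a caller can swap one for the other. Building with an OpenMP flag such as `-fopenmp` on GCC or Clang enables the parallel version.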
|Name||Lecture Notes in Computer Science|
|Conference||Advanced parallel processing technologies: 10th international symposium, APPT 2013|
|Period||27/08/13 → 28/08/13|