Abstract
Algorithmic skeletons can be used to write architecture-independent programs, shielding application developers from the details of a parallel implementation. In this paper, we present a C-like skeleton implementation language, PEPCI, that uses term rewriting and partial evaluation to specify skeletons for parallel C dialects. By using skeletons to control the iteration of kernel functions, we provide a stream programming language that is better tailored to the user as well as the underlying architecture. Skeleton merging allows us to reduce the overheads usually associated with breaking an application into small kernels. We have implemented an example image processing application on a heterogeneous embedded prototype platform consisting of an SIMD processor and an ILP processor, and show that a significant speedup can be achieved without requiring knowledge of data-parallel processing.
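To illustrate the two ideas the abstract names, a skeleton that controls the iteration of a kernel and the merging of skeletons to avoid per-kernel overhead, here is a minimal sketch in plain C. It is not PEPCI syntax and not the paper's implementation: the names `pixel_kernel`, `map_pixels`, `invert`, `threshold`, `pipeline_two_pass`, and `pipeline_merged` are hypothetical, and the merge is written by hand, whereas the abstract describes it as the result of term rewriting and partial evaluation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel type: maps one pixel value to another. */
typedef uint8_t (*pixel_kernel)(uint8_t);

/* Two illustrative kernels (assumptions, not taken from the paper). */
uint8_t invert(uint8_t p)    { return (uint8_t)(255 - p); }
uint8_t threshold(uint8_t p) { return p > 127 ? 255 : 0; }

/* "Map" skeleton: it owns the iteration and applies one kernel per pixel.
 * The kernel author never writes the loop. */
void map_pixels(pixel_kernel k, const uint8_t *in, uint8_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = k(in[i]);
}

/* Unmerged pipeline: two skeleton instances, two passes over the data,
 * and an intermediate buffer between the kernels. */
void pipeline_two_pass(const uint8_t *in, uint8_t *tmp, uint8_t *out, size_t n)
{
    map_pixels(invert, in, tmp, n);
    map_pixels(threshold, tmp, out, n);
}

/* Hand-merged equivalent: a single pass with no intermediate buffer.
 * This only shows the shape of the result a merge would aim for. */
void pipeline_merged(const uint8_t *in, uint8_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = threshold(invert(in[i]));
}
```

Both pipelines compute the same output for any input; the merged form simply avoids the second traversal and the temporary buffer, which is the kind of overhead the abstract attributes to splitting an application into small kernels.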
| Field | Value |
|---|---|
| Original language | English |
| Title of host publication | 20th International Parallel and Distributed Processing Symposium, IPDPS 2006 |
| Place of Publication | Piscataway |
| Publisher | Institute of Electrical and Electronics Engineers |
| Number of pages | 9 |
| ISBN (Print) | 1-4244-0054-6 |
| Publication status | Published - 2006 |
| Event | 20th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2006 - Rhodes Island, Greece. Duration: 25 Apr 2006 → 29 Apr 2006 |
Conference
| Conference | 20th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2006 |
|---|---|
| Country/Territory | Greece |
| City | Rhodes Island |
| Period | 25/04/06 → 29/04/06 |