Abstract
Over the last three years, GPUs have increasingly been used for general-purpose applications rather than only for computer graphics. Programming these GPUs is a major challenge: in current GPUs, the main bottleneck for many applications is not computing power but memory access bandwidth. This paper presents two compile-time optimizations that address the two most important memory access issues. To describe these optimizations, a new notation for the parallel execution of GPU programs is introduced. An implementation of the optimizations shows that performance improvements of up to 40 times are possible.
Original language | English |
---|---|
Title of host publication | Proceedings of the 2010 International Conference on Embedded Computer Systems (SAMOS), 19-22 July 2010, Samos, Greece |
Editors | F.J. Kurdahi, J. Takala |
Place of Publication | Piscataway |
Publisher | Institute of Electrical and Electronics Engineers |
Pages | 200-207 |
ISBN (Print) | 978-1-4244-7937-5 |
DOIs | |
Publication status | Published - 2010 |