Abstract
We describe a parallel implementation of a compressible Lattice Boltzmann code on a multi-GPU cluster based on Nvidia Fermi processors. We analyze how to optimize the algorithm for GP-GPU architectures, describe the implementation choices that we have adopted and compare our performance results with an implementation optimized for latest generation multi-core CPUs. Our program runs at ≈30% of the double-precision peak performance of one GPU and shows almost linear scaling when run on the multi-GPU cluster.
Keywords: Computational fluid-dynamics – Lattice Boltzmann methods – GP-GPU computing
Original language | English |
---|---|
Title of host publication | Parallel Processing and Applied Mathematics : 9th International Conference, PPAM 2011, Torun, Poland, September 11-14, 2011. Revised Selected Papers, Part I |
Editors | R. Wyrzykowski, J. Dongarra, K. Karczewski, J. Wasniewski |
Place of Publication | Berlin |
Publisher | Springer |
Pages | 640-650 |
ISBN (Print) | 978-3-642-31463-6 |
Publication status | Published - 2012 |
Event | 9th International Conference on Parallel Processing and Applied Mathematics, PPAM 2011, Torun, Poland. Duration: 11 Sept 2011 → 14 Sept 2011. Conference number: 9 |
Publication series
Name | Lecture Notes in Computer Science |
---|---|
Volume | 7203 |
ISSN (Print) | 0302-9743 |
Conference
Conference | 9th International Conference on Parallel Processing and Applied Mathematics, PPAM 2011 |
---|---|
Abbreviated title | PPAM 2011 |
Country/Territory | Poland |
City | Torun |
Period | 11/09/11 → 14/09/11 |