Abstract
Real-world weather forecasting applications consist of compound stencil kernels that do not perform well on conventional architectures. This behavior is due to their complex data access patterns, limited data reusability, and low arithmetic intensity. To overcome these issues, we harness the potential of near-memory computing by offloading a horizontal diffusion kernel, which is a compound stencil kernel, from the COSMO weather prediction application to a reconfigurable fabric. We use a heterogeneous system that comprises a CPU and an FPGA with on-chip SRAM memory and on-board DRAM memory. By introducing a memory hierarchy tailored to the targeted application and using a coherent memory model, we move the computation close to the memory, which improves memory efficiency. Our hardware design on the FPGA uses high-level synthesis techniques and results in an accelerator with IBM CAPI 2.0 (Coherent Accelerator Processor Interface) technology. We evaluate it against a tuned software implementation running on an IBM POWER9 host system. The experimental results show that these kernels on an FPGA can outperform a complete 16-core POWER9 node (configured with 64 threads) by 3.3x. Moreover, our solution provides an 18x improvement in the active energy consumption.
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 29th International Conference on Field-Programmable Logic and Applications, FPL 2019 |
| Editors | Ioannis Sourdis, Christos-Savvas Bouganis, Carlos Alvarez, Leonel Antonio Toledo Diaz, Pedro Valero, Xavier Martorell |
| Place of Publication | Piscataway |
| Publisher | Institute of Electrical and Electronics Engineers |
| Pages | 263-269 |
| Number of pages | 7 |
| ISBN (Electronic) | 978-1-7281-4884-7 |
| DOIs | |
| Publication status | Published - Sept 2019 |
| Event | 29th International Conference on Field-Programmable Logic and Applications, FPL 2019 - Barcelona, Spain. Duration: 9 Sept 2019 → 13 Sept 2019. Conference number: 29 |
Conference
| Conference | 29th International Conference on Field-Programmable Logic and Applications, FPL 2019 |
|---|---|
| Abbreviated title | FPL 2019 |
| Country/Territory | Spain |
| City | Barcelona |
| Period | 9/09/19 → 13/09/19 |
Funding
This work was performed in the framework of the Horizon 2020 program for the project “Near-Memory Computing (NeMeCo)”. It is funded by the European Commission under Marie Sklodowska-Curie Innovative Training Networks European Industrial Doctorate (Project ID: 676240). We would also like to thank Martino Dazzi for his valuable remarks. This work was partially supported by the H2020 research and innovation programme under grant agreement No 732631, project OPRECOMP.
Keywords
- CAPI
- Energy-efficiency
- FPGA
- HPC
- Near-memory computing
- Performance