NARMADA: near-memory horizontal diffusion accelerator for scalable stencil computations

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

8 Citations (Scopus)

Abstract

Real-world weather forecasting applications consist of compound stencil kernels that do not perform well on conventional architectures. This behavior is due to their complex data access patterns, limited data reusability, and low arithmetic intensity. To overcome these issues, we harness the potential of near-memory computing by offloading a horizontal diffusion kernel, which is a compound stencil kernel, from the COSMO weather prediction application to a reconfigurable fabric. We use a heterogeneous system that comprises a CPU and an FPGA with on-chip SRAM memory and on-board DRAM memory. By introducing a memory hierarchy tailored to the targeted application and using a coherent memory model, we move the computation close to the memory, which improves memory efficiency. Our hardware design on the FPGA uses high-level synthesis techniques and results in an accelerator with IBM CAPI 2.0 (Coherent Accelerator Processor Interface) technology. We evaluate it against a tuned software implementation running on an IBM POWER9 host system. The experimental results show that these kernels on an FPGA can outperform a complete 16-core POWER9 node (configured with 64 threads) by 3.3x. Moreover, our solution provides an 18x improvement in the active energy consumption.

Original language: English
Title of host publication: Proceedings - 29th International Conference on Field-Programmable Logic and Applications, FPL 2019
Editors: Ioannis Sourdis, Christos-Savvas Bouganis, Carlos Alvarez, Leonel Antonio Toledo Diaz, Pedro Valero, Xavier Martorell
Place of Publication: Piscataway
Publisher: Institute of Electrical and Electronics Engineers
Pages: 263-269
Number of pages: 7
ISBN (Electronic): 978-1-7281-4884-7
Publication status: Published - Sept 2019
Event: 29th International Conference on Field Programmable Logic and Applications, FPL 2019 - Barcelona, Spain
Duration: 9 Sept 2019 – 13 Sept 2019
Conference number: 29

Conference

Conference: 29th International Conference on Field Programmable Logic and Applications, FPL 2019
Abbreviated title: FPL 2019
Country/Territory: Spain
City: Barcelona
Period: 9/09/19 – 13/09/19

Funding

This work was performed in the framework of the Horizon 2020 program for the project “Near-Memory Computing (NeMeCo)”. It is funded by the European Commission under Marie Sklodowska-Curie Innovative Training Networks European Industrial Doctorate (Project ID: 676240). We would also like to thank Martino Dazzi for his valuable remarks. This work was partially supported by the H2020 research and innovation programme under grant agreement No 732631, project OPRECOMP.

Keywords

  • CAPI
  • Energy-efficiency
  • FPGA
  • HPC
  • Near-memory computing
  • Performance
