Performance analysis and optimization of the JOREK code for many-core CPUs

T.B. Fehér, M. Hölzl, G. Latu, G.T.A. Huijsmans

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

This report investigates the performance of the JOREK code on the Intel Knights Landing and Skylake processor architectures. The OpenMP scaling of the matrix construction part of the code was analyzed, and improved synchronization methods were implemented. A new switch was implemented to control the number of threads used for the linear equation solver independently of other parts of the code. The matrix construction subroutine was vectorized, and its data locality was improved. These steps led to a factor-of-two speedup for the matrix construction.
Original language: English
Article number: 1810.04413
Number of pages: 15
Journal: arXiv.org, e-Print Archive, Physics
Volume: 2018
Publication status: Published - 1 Oct 2018

Keywords

  • Computer Science - Performance
