Files

IMUM24_abstracts.pdf (Adobe PDF, 4.45 MB, Open Access)

Details

Authors
De Le Court, M.; Lambrechts, J.; Legat, V.; Hanert, E.
Abstract
SLIM is a high-performance ocean modeling toolkit developed at UCLouvain, based on the Discontinuous Galerkin Finite Element Method (DG-FEM) with a split-explicit time integration scheme. In line with trends in the high-performance computing (HPC) community, the latest version of SLIM has been optimized for parallel execution, with particular emphasis on GPU acceleration. While GPUs are well suited to large-scale simulations, they tend to be less efficient for smaller problem sizes, where the GPU's resources cannot be fully utilized. This inefficiency is compounded when scaling across multiple GPUs because of their inherent latency, which limits strong scaling. Moreover, the split-explicit scheme introduces an additional challenge: the external (2D) mode requires numerous small iterations for every computationally intensive internal (3D) mode iteration. In this work, we address the challenge of efficiently distributing computations across multiple GPUs and present techniques designed to minimize latency and maximize throughput. We then assess the performance and scaling of SLIM3D from a single core on a laptop to hundreds of GPUs on a supercomputer. Based on this analysis, we highlight the conditions necessary to achieve optimal performance and scaling on different types of hardware, including GPUs from NVIDIA and AMD as well as CPUs.
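
The strong-scaling limit described in the abstract can be illustrated with a toy cost model. The sketch below is not taken from SLIM: the function names, the fixed per-iteration latency, the number of 2D sub-iterations, and the per-element costs are all hypothetical values chosen only for illustration. It assumes each 2D or 3D iteration costs a fixed latency plus a term proportional to the number of elements held by one GPU.

```python
def step_time(n_elements, n_gpus, n_sub=30, latency=10e-6,
              cost_2d=1e-9, cost_3d=2e-8):
    """Toy estimate of the wall time of one internal (3D) step.

    All default values are hypothetical. Each iteration is modeled as a
    fixed latency plus work proportional to the elements per GPU.
    """
    local = n_elements / n_gpus                   # elements handled by each GPU
    t_2d = n_sub * (latency + local * cost_2d)    # many small external (2D) iterations
    t_3d = latency + local * cost_3d              # one heavy internal (3D) iteration
    return t_2d + t_3d


def strong_scaling_efficiency(n_elements, n_gpus, **params):
    """Parallel efficiency relative to one GPU for a fixed problem size."""
    t_one = step_time(n_elements, 1, **params)
    t_many = step_time(n_elements, n_gpus, **params)
    return t_one / (n_gpus * t_many)


if __name__ == "__main__":
    # Compare a small and a large mesh on an increasing number of GPUs.
    for n_elements in (1e5, 1e8):
        for n_gpus in (1, 8, 64):
            eff = strong_scaling_efficiency(n_elements, n_gpus)
            print(f"{n_elements:.0e} elements, {n_gpus:3d} GPUs: efficiency {eff:.3f}")
```

In this model the latency term is paid once per iteration regardless of how many GPUs share the work, so the many small 2D iterations dominate once the per-GPU workload becomes small. Efficiency therefore drops as GPUs are added to a fixed problem and recovers as the problem grows.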
Citations

De Le Court, M., Lambrechts, J., Legat, V., & Hanert, E. (2024). Assessing the performances of SLIM3D for multi-GPU simulations. IMUM24 - Book of Abstracts. Published. 21st International Workshop on Multi-scale Unstructured mesh numerical Modeling (IMUM), Louvain-la-Neuve (Belgium). https://hdl.handle.net/2078.5/235293