May 20, 2025 - by CSCS
This webinar provides existing and new users with best practices for executing, debugging and analyzing multi-GPU codes on the CSCS Alps supercomputer.
The first part covers key concepts such as NUMA (Non-Uniform Memory Access), hardware topology, expected FLOPS (floating-point operations per second), and network performance. It also covers practical aspects of device visibility, SLURM configuration, and the use of wrapper scripts to optimize multi-GPU workloads.
The second part introduces parallel debugging (Linaro) and performance measurement (NVIDIA, VI-HPS) tools, and explores how they can help users make the most of the Alps supercomputer.
A video recording of the webinar is available.