Summary of the GPU hackathon at CSC

ENCCS participates in GPU hackathon

The LUMI GPU / Nomad CoE Hackathon was hosted on Sept. 4-6 at CSC – IT Center for Science, a Finnish center of expertise in information technology, for research software developer teams targeting the GPU partition of LUMI with AMD MI250X GPUs. Seven teams with projects in computational materials science and computational fluid dynamics were invited, and the participating teams were mentored by experts from AMD, HPE and EuroHPC Competence Centers.
Two members of ENCCS, Yonglei Wang and Wei Li, served as active mentors at this GPU hackathon.

During the three-day GPU hackathon, Yonglei worked with the QuantumESPRESSO team (Fabrizio Ferrari Ruffino, Ivan Carnimeo, Oscar Baseggio, and Laura Bellentani). QuantumESPRESSO is a suite for first-principles electronic-structure calculations and materials modeling. The work focused on two topics: batched/streamed FFTs (asynchronous execution, data movement) and porting/profiling the Hubbard code (matrices, optimal batch sizes).
For the first topic, we proposed writing the double loop as HIP kernels so that these kernels can be executed on given streams, implemented the relevant models, and then compared the performance of the FFT schemes with and without streamed computation in the CUDA and HIP code.
For the second topic, we worked on the Hubbard code (force and stress): we unified the interfaces for the different offload models (OpenACC vs OpenMP), identified the bottlenecks in the Hubbard code, and found a suitable test case for tracing the performance of different code blocks.
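
As a rough illustration of the batched/streamed idea from the first topic, the sketch below splits a set of independent FFTs into fixed-size batches and assigns the batches to streams in round-robin order. It is a generic, CPU-only Python sketch and not the QuantumESPRESSO implementation; the function name `plan_batches` and its parameters are hypothetical, and no GPU library is called.

```python
def plan_batches(n_ffts, batch_size, n_streams):
    """Return (stream_id, start, stop) triples that cover range(n_ffts).

    Hypothetical helper for illustration only: each triple says that the
    FFTs with indices start..stop-1 would be launched together on the
    stream with the given id.
    """
    if batch_size < 1 or n_streams < 1:
        raise ValueError("batch_size and n_streams must be positive")

    plan = []
    for batch_index, start in enumerate(range(0, n_ffts, batch_size)):
        stop = min(start + batch_size, n_ffts)  # last batch may be shorter
        stream_id = batch_index % n_streams     # round-robin over streams
        plan.append((stream_id, start, stop))
    return plan


if __name__ == "__main__":
    for stream_id, start, stop in plan_batches(10, 4, 2):
        print(f"stream {stream_id}: FFTs {start}..{stop - 1}")
```

In a real GPU code the batch size would be tuned by measurement, since larger batches reduce launch overhead while smaller ones leave more room to overlap data movement with computation.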

Wei Li served as one of the mentors for the FHI-AIMS team. The team consisted of three talented members from TU Dresden and Aalto University. They quickly adapted their CUDA code to run on LUMI-G nodes using hipify. At the beginning the code was even slower than the CPU version, but they soon realized that creating streams and allocating arrays inside the nested loops caused a large overhead. With guidance from the AMD expert, they solved this problem. The three-day hackathon was not long enough for them to implement their final idea of reordering the loops related to tensor operations to make them better suited to GPUs. We wish the FHI-AIMS team all the best.
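
The overhead described above can be sketched by counting setup calls. The example below is a pure-Python stand-in, not the FHI-AIMS code: `FakeDevice` and its methods are hypothetical and only count how often a stream is created or an array is allocated, with a trivial placeholder in place of the real kernel.

```python
class FakeDevice:
    """Hypothetical stand-in for a GPU runtime; it only counts setup calls."""

    def __init__(self):
        self.streams_created = 0
        self.buffers_allocated = 0

    def create_stream(self):
        self.streams_created += 1
        return object()

    def allocate(self, size):
        self.buffers_allocated += 1
        return [0.0] * size

    def launch(self, stream, work, value):
        # Placeholder for the real kernel: write one value and read it back.
        work[0] = value
        return work[0]


def setup_inside_loops(dev, n_outer, n_inner, size):
    """Anti-pattern: a new stream and work array on every inner iteration."""
    total = 0.0
    for i in range(n_outer):
        for j in range(n_inner):
            stream = dev.create_stream()
            work = dev.allocate(size)
            total += dev.launch(stream, work, float(i * n_inner + j))
    return total


def setup_hoisted(dev, n_outer, n_inner, size):
    """Fix: create the stream and work array once, then reuse them."""
    stream = dev.create_stream()
    work = dev.allocate(size)
    total = 0.0
    for i in range(n_outer):
        for j in range(n_inner):
            total += dev.launch(stream, work, float(i * n_inner + j))
    return total


if __name__ == "__main__":
    slow, fast = FakeDevice(), FakeDevice()
    setup_inside_loops(slow, 3, 4, 8)
    setup_hoisted(fast, 3, 4, 8)
    print("inside loops:", slow.streams_created, "streams,",
          slow.buffers_allocated, "allocations")
    print("hoisted:     ", fast.streams_created, "stream,",
          fast.buffers_allocated, "allocation")
```

Both variants compute the same result, but the first performs one stream creation and one allocation per inner iteration, while the second performs each once. On a real GPU these setup calls are comparatively expensive, which is why hoisting them out of the loops matters.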

After the three-day GPU hackathon, both the participating teams and the mentors from ENCCS had significantly improved their skills in applying GPU programming to specific applications.
