George Markomanolis

@gmarkoma

Joined on Feb 16, 2021

  • MPI Ghost Exchange Optimization Examples: Changes Between Example Versions
    This code contains several implementations of the same ghost exchange algorithm at varying stages of optimization:
    - Orig: a CPU-only implementation that uses MPI and serves as the starting point for further optimizations. It is recommended to start here!
    - Ver1: an OpenMP target offload implementation that uses the Managed memory model to port the code to GPUs, using host-allocated memory for MPI communication.
    - Ver2: shows the usage and advantages of roctx ranges to get more easily readable profiling output from Omnitrace.
    - Ver3: under construction; not expected to work at the moment.
    - Ver4: explores heap-allocating communication buffers once on the host.
  • Login to LUMI
    ```
    ssh USERNAME@lumi.csc.fi
    ```
    To simplify the login to LUMI, you can add the following to your .ssh/config file:
    ```
    # LUMI
    Host lumi
        User <USERNAME>
        Hostname lumi.csc.fi
        IdentityFile <HOME_DIRECTORY>/.ssh/id_rsa
        ServerAliveInterval 600
    ```
  • Omnitrace
    Load Omnitrace, then allocate resources with salloc:
    ```
    salloc -N 1 --ntasks=1 --partition=gpu-dev --gpus=1 -A XXX --time=00:35:00
    ```
    Check the various options and their values; a second command prints their descriptions:
    ```
    srun -n 1 --gpus 1 omnitrace-avail --categories omnitrace
    ```
  • Omnitrace
    Load Omnitrace. Use the appropriate modules on Adastra, as well as your project account (if any); the instructions below are generic. Allocate resources with salloc:
    ```
    salloc -N 1 --ntasks=1 --partition=small-g --gpus=1 -A project_465000532 --time=00:35:00
    ```
    Check the various options and their values; a second command prints their descriptions.
  • https://hackmd.io/@gmarkoma/eviden_training
    Get the initial material:
    ```
    cp -r /tmp/HPCTrainingExamples/ .
    ```
    - Logistics
    - Reserve resources and login to the node
    - Omnitrace
  • Get the initial material:
    ```
    cp -r /Shared/HPCTrainingExamples/ .
    ```
    ROCgdb
    Save the following code in a file called, say, saxpy.hip, or use the file from HPCTrainingExamples/HIP/saxpy/:
    ```cpp
    #include <hip/hip_runtime.h>

    __constant__ float a = 1.0f;
    ```
  • You will need to replace the project accounts with the ones for your system.
    ```
    git clone https://github.com/amd/HPCTrainingExamples.git
    ```
    We assume that you have already allocated resources with salloc:
    ```
    cp -r /project/project_462000125/exercises/AMD/HPCTrainingExamples/ .
    salloc -N 1 -p small-g --gpus=1 -t 10:00 -A project_462000125
    ```
  • Logistics
    - Access to LUMI: ssh username@lumi.csc.fi
    - Project account: project_465000524
    - Slides: /project/project_465000524/slides/
    - Working space: /scratch/project_465000524/$USER
    Omnitrace
    Load Omnitrace:
    ```
    module load LUMI/22.08 partition/G rocm/5.3.3
    ```
  • Logistics
    - Access to LUMI: ssh username@lumi.csc.fi
    - Project account: project_465000532
    - Slides: /project/project_465000532/slides/
    - Working space: /scratch/project_465000532/$USER
    Omnitrace
    Load Omnitrace:
    ```
    module load LUMI/22.08 partition/G rocm/5.3.3
    ```
  • Allocate resources and load the module:
    ```
    salloc -N 1 --ntasks=1 --partition=gpu-dev --gpus=1 -A courses01-gpu --time=00:15:00
    module load rocm/5.0.2
    cd $MYSCRATCH
    ```
    Rocprof
    Get the exercise: https://github.com/ROCm-Developer-Tools/HIP-Examples/tree/master/mini-nbody
  • Note: Reservation: small_g (--reservation=small_g)
    Rocprof
    Get the exercise: https://github.com/ROCm-Developer-Tools/HIP-Examples/tree/master/mini-nbody
    Compile and run the code:
    ```
    cd mini-nbody/hip
    ```
  • Rocprof
    Get the exercise: https://github.com/ROCm-Developer-Tools/HIP-Examples/tree/master/mini-nbody
    Compile and run the code:
    ```
    cd mini-nbody/hip
    ```
    You can compile and run everything with
  • We have built Omniperf without GUI support for use in the exercises.
    Load Omniperf:
    ```
    ml rocm/5.3.3
    module load cray-python
    module use /cfs/klemming/home/g/gmarkoma/Public/omniperf/modulefiles/
    module load omniperf
    ```
    Reserve a GPU, compile the exercise, and run Omniperf; observe how many times the code is executed.
  • Reserve a GPU
    Load Omnitrace:
    ```
    ml rocm/5.3.3
    source /cfs/klemming/home/g/gmarkoma/Public/omnitrace/1.7.4/share/omnitrace/setup-env.sh
    ```
    Allocate resources with salloc. Check the various options and their values; the second command also prints their descriptions:
    ```
    srun -n 1 --gpus 1 omnitrace-avail --categories omnitrace
    srun -n 1 --gpus 1 omnitrace-avail --categories omnitrace --brief --description
    ```
  • We assume that you have already allocated resources with salloc:
    ```
    cp -r /projappl/project_465000388/exercises/AMD/HIP-Examples/ .
    salloc -N 1 -p small-g --gpus=1 -t 10:00 -A project_465000388
    module rm rocm
    module load craype-accel-amd-gfx90a
    module load PrgEnv-amd
    module load rocm
    ```
  • We assume that you have already allocated resources with salloc:
    ```
    cp -r /projappl/project_465000320/exercises/AMD/HIP-Examples/ .
    ```
    Basic examples:
    ```
    cd HIP-Examples/vectorAdd
    ```
    Examine the files here: README, Makefile, and vectoradd_hip.cpp. Notice that the Makefile requires HIP_PATH to be set; check with module show rocm or echo $HIP_PATH. The Makefile also builds and runs the code; we will do those steps separately. Check the HIPFLAGS in the Makefile as well.
    ```
    make vectoradd_hip.exe
    ```
  • Reservation: enccs_3
    Rocprof
    Get the exercise:
    ```
    cp -r /global/training/enccs/exercises/HIP-Examples/mini-nbody/ .
    ```
    Compile and run the code
  • Setup
    Download the most suitable version from: https://github.com/AMDResearch/omnitrace/releases
    Create the installation directory:
    ```
    ./omnitrace-1.7.0-opensuse-15.3-ROCm-50200-PAPI-OMPT-Python3.sh --prefix=/opt/omnitrace --exclude-subdir
    ```
    Full documentation: https://amdresearch.github.io/omnitrace/