# Omniperf - ENCCS Hackathon

We have built Omniperf without GUI support for use in the exercises.

* Load Omniperf:

```
ml rocm/5.3.3
module load cray-python
module use /cfs/klemming/home/g/gmarkoma/Public/omniperf/modulefiles/
module load omniperf
```

* Reserve a GPU, compile the exercise, and run it under Omniperf. Observe how many times the code is executed:

```
salloc -N 1 -p gpu-tst -A edu23.enccsgpu -t 00:30:00
git clone https://github.com/AMD/HPCTrainingExamples.git
cd HPCTrainingExamples/HIP/dgemm/
mkdir build
cd build
cmake ..
make
cd bin
srun -n 1 omniperf profile -n dgemm -- ./dgemm -m 8192 -n 8192 -k 8192 -i 1 -r 10 -d 0 -o dgemm.csv
```

* Run `srun -n 1 --gpus 1 omniperf profile -h` to see all the options.

* A workload has now been created in the `workloads` directory with the name `dgemm` (the argument of `-n`), so we can analyze it:

```
srun -n 1 --gpus 1 omniperf analyze -p workloads/dgemm/mi200/ &> dgemm_analyze.txt
```

* If you want only the roofline analysis, execute: `srun -n 1 --gpus 1 omniperf profile -n dgemm --roof-only -- ./dgemm -m 8192 -n 8192 -k 8192 -i 1 -r 10 -d 0 -o dgemm.csv`

* If you want to know the kernel names, add `--kernel-names`, which creates a second PDF with the markers and their corresponding kernel names: `srun -n 1 --gpus 1 omniperf profile -n dgemm --kernel-names --roof-only -- ./dgemm -m 8192 -n 8192 -k 8192 -i 1 -r 10 -d 0 -o dgemm.csv`

The analyze step does not need `srun`, but we use it so that not everybody runs on the login node.
Explore the file `dgemm_analyze.txt`.

* We can select specific IP blocks, for example:

```
srun -n 1 --gpus 1 omniperf analyze -p workloads/dgemm/mi200/ -b 7.1.2
```

Note that you need to know the code of the IP block.

* If you have installed Omniperf on your laptop (no ROCm required for analysis), you can download the data and execute:

```
omniperf analyze -p workloads/dgemm/mi200/ --gui
```

* Open the web page http://IP:8050/ (the IP is displayed in the output).

* Use another code, for example: https://github.com/amd/HPCTrainingExamples/blob/main/HIP/saxpy/saxpy.cpp
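
To repeat the profile and analyze steps for another code, a small helper like the one below may save some typing. It is only a sketch: `omniperf_steps` is a hypothetical function name, `./saxpy` is an assumed path to a binary you have already built, and the `workloads/<name>/mi200/` output directory is assumed to follow the same pattern as for dgemm. The function only prints the commands and executes nothing.

```shell
#!/bin/sh
# Print the profile and analyze commands for a workload, following the
# pattern used for dgemm above. Nothing is executed here.
# Assumption: results land in workloads/<name>/mi200/, as they did for dgemm.
omniperf_steps() {
    name=$1      # workload name, passed to `omniperf profile -n`
    shift        # the remaining arguments are the binary and its options
    echo "srun -n 1 --gpus 1 omniperf profile -n $name -- $*"
    echo "srun -n 1 --gpus 1 omniperf analyze -p workloads/$name/mi200/ > ${name}_analyze.txt 2>&1"
}

# Example: commands for a saxpy binary in the current directory
# (hypothetical path ./saxpy; build it first).
omniperf_steps saxpy ./saxpy
```

Review the printed commands, then paste them into your allocation one at a time. The redirection is written as `> file 2>&1` instead of `&>` so that the line also works in shells other than bash.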