# Omniperf - Performance Analysis Tools for AMD GPUs, CRAY User Group Tutorial 2024

--------------------------------------------------------------

# Omniperf Basic Examples

## vcopy

Set up the environment:

```bash
module load rocm omniperf/2.0.0-a1017
```

Get the sample code:

```bash
wget https://github.com/AMDResearch/omniperf/raw/main/sample/vcopy.cpp
```

Compile:

```bash
hipcc -o vcopy vcopy.cpp
```

Profile with Omniperf (select any device with `ROCR_VISIBLE_DEVICES`):

```bash
ROCR_VISIBLE_DEVICES=4 \
omniperf profile -n vcopy_all -- ./vcopy -n 1048576 -b 256
```

A new directory named `workloads/vcopy_all` will be created.

Analyze the collected profile using the built-in CLI:

```bash
omniperf analyze -p workloads/vcopy_all/MI200/ &> vcopy_analyze.txt
```

View `vcopy_analyze.txt`:

```bash
less vcopy_analyze.txt
```

We can select specific IP blocks to analyze:

```bash
omniperf analyze -p workloads/vcopy_all/MI200/ -b 7.1.2
```

If you've installed Omniperf on your laptop (no ROCm required), you can download `workloads/vcopy_all/MI200/` to your laptop and run Omniperf with the `--gui` option:

```bash
omniperf analyze -p workloads/vcopy_all/MI200/ --gui
```

Open the GUI in your browser (the address below is an example; the host will differ on your machine):

```url
http://172.21.7.117:8050/
```

![image](https://user-images.githubusercontent.com/109979778/225511493-c7cccc21-9cb8-426f-82da-7f790066aa89.png)

Alternatively, you can start the server remotely on a port of your choice:

```
ssh USER@aac1.amd.com -p <PORT> -L <GUI_PORT>:localhost:<GUI_PORT>
omniperf analyze -p workloads/vcopy_all/MI200/ --gui <GUI_PORT>
```

and load it in your browser:

```url
http://localhost:<GUI_PORT>/
```

## dgemm

Let's repeat the exercise with a dgemm example:

```
cd HPCTrainingExamples/HIP/dgemm
rm -rf build
mkdir build && cd build
cmake ..
CPATH=$ROCM_PATH/include/hipblas make
```

Profile and analyze the code:

```
ROCR_VISIBLE_DEVICES=4 \
omniperf profile --roof-only -n dgemm -- ./bin/dgemm -m 8192 -n 8192 -k 8192 -i 1 -r 10 -d 0 -o dgemm.csv
omniperf analyze -p workloads/dgemm/MI200 >& omniperf.out
less omniperf.out
```

Start the GUI as before to visualize the roofline. You'll notice that some tables are not populated, because `--roof-only` collects only the metrics needed to compute the roofline.
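
When reading the roofline plot, it can help to have a rough hand estimate of where the dgemm kernel ought to sit. The sketch below is a back-of-the-envelope calculation, not something Omniperf produces. It assumes that `-m`, `-n`, and `-k` are the GEMM dimensions of `C(m x n) = A(m x k) * B(k x n)`, that each element is an 8-byte double, and that each matrix crosses the memory bus exactly once (an idealized lower bound on traffic). The helper names `gemm_flops` and `gemm_min_bytes` are hypothetical and exist only for this illustration; the `-i` and `-r` options are not accounted for.

```bash
#!/usr/bin/env bash
# Rough arithmetic-intensity estimate for a double-precision GEMM.
# Assumption: m, n, k are the matrix dimensions passed via -m, -n, -k.

# A GEMM performs one multiply and one add per (i, j, l) triple: 2*m*n*k FLOPs.
gemm_flops() {
  local m=$1 n=$2 k=$3
  echo $(( 2 * m * n * k ))
}

# Idealized minimum traffic: read A (m*k) and B (k*n), write C (m*n),
# 8 bytes per double. Real traffic is usually higher.
gemm_min_bytes() {
  local m=$1 n=$2 k=$3
  echo $(( 8 * (m * k + k * n + m * n) ))
}

flops=$(gemm_flops 8192 8192 8192)
bytes=$(gemm_min_bytes 8192 8192 8192)

echo "FLOPs per GEMM:       $flops"
echo "Minimum bytes moved:  $bytes"
# Bash arithmetic is integer-only, so the ratio is truncated.
echo "Arithmetic intensity: $(( flops / bytes )) FLOP/byte (upper bound)"
```

An intensity this high suggests the kernel should appear toward the compute-bound side of the roofline. A measured point far to the left of this estimate would indicate more memory traffic than the idealized model assumes.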