ROCm™ for AI

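# Introduction to ROCm™ for AI

The ROCm™ open software platform continues to evolve to meet the needs of the machine learning (ML) community. With the latest release, ROCm 5.5, developers get turnkey AI framework containers on AMD Infinity Hub and a simplified installation, and can expect shorter kernel launch times and better application performance. From the optimized MIOpen library to the comprehensive MIVisionX set of computer vision and machine intelligence libraries, utilities, and applications, AMD works extensively with the open AI community to promote and extend machine learning and deep learning capabilities and optimizations, helping broaden the range of workloads that can take advantage of accelerated computing.

---

[toc]

---

## I. MIOpen

Link: https://rocmsoftwareplatform.github.io/MIOpen/doc/html/index.html

Advanced Micro Devices, Inc.'s open-source deep learning library.

## II. MIVisionX

The MIVisionX toolkit is a comprehensive set of computer vision and machine intelligence libraries, utilities, and applications bundled into a single toolkit. AMD MIVisionX delivers a highly optimized, conformant open-source implementation of Khronos OpenVX™ and OpenVX™ extensions, along with a convolutional neural network model compiler and optimizer that supports the ONNX and Khronos NNEF™ exchange formats. The toolkit allows rapid building and deployment of optimized computer vision and machine learning inference workloads on a wide range of computer hardware, including small embedded x86 CPUs, APUs, discrete GPUs, and heterogeneous servers.

Link: https://gpuopen-professionalcompute-libraries.github.io/MIVisionX/

## III. AMD OpenVX™

AMD OpenVX™ is a highly optimized open-source implementation of the Khronos OpenVX™ 1.3 computer vision specification. It allows rapid prototyping as well as fast execution on a wide range of computer hardware, including small embedded x86 CPUs and large workstation discrete GPUs.

## IV. ROCm Installation Guide for Linux (Ubuntu 20.04.5)

References:
https://gist.github.com/neolaw84/98876cebe0ad29d95050f11500c508fe
https://docs.amd.com/bundle/ROCm-Installation-Guide-v5.5/page/Introduction_to_ROCm_Installation_Guide_for_Linux.html

:::danger
Currently, the newest PyTorch release supports ROCm only up to 5.4.2.

| System           | User Name      | Password | Target ROCm version |
| ---------------- | -------------- | -------- | ------------------- |
| Ubuntu 20.04.5   | rocm-develop   | rocm     | v5.4.2              |

Check the version numbers before installing.
![](https://hackmd.io/_uploads/H1QPQvwEh.png)
This installation uses Ubuntu 20.04.5.
:::

:rocket: Check the system information

```shell=
uname -a
```

![](https://hackmd.io/_uploads/ryod9wwE2.png)

## 1. Install dependency packages

List the video hardware and the loaded AMD kernel modules:

```shell=
sudo lshw -c video
lsmod | grep amd
```

AMD's official guide does not mention this, but the following packages must be installed first.
It is very important to install libnuma-dev and libncurses5 before anything else:

```shell=
sudo apt-get update
sudo apt-get dist-upgrade
sudo apt-get install libnuma-dev libncurses5
sudo reboot
```

## 2. Add user to groups render and video

Add the current user to the `render` and `video` groups:

```shell=
sudo usermod -a -G render ${USER}
sudo usermod -a -G video ${USER}
```

## 3.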
Download and install

Download the `amdgpu-install` package, install it, and then run it with the required use cases:

```shell=
sudo apt-get update
wget https://repo.radeon.com/amdgpu-install/5.4.2/ubuntu/focal/amdgpu-install_5.4.50402-1_all.deb
sudo apt-get install ./amdgpu-install_5.4.50402-1_all.deb
sudo amdgpu-install --usecase=rocm,rocmdevtools,lrt,hip,hiplibsdk,mllib,mlsdk,dkms
```

## 4. Verify ROCm

```shell=
rocminfo | grep 'Name:'
/opt/rocm-5.4.2/opencl/bin/clinfo
```

![](https://hackmd.io/_uploads/HyZHpdD43.png)
![](https://hackmd.io/_uploads/Sy0HAuPE3.png)

:::info
rocm@rocm-develop:~$ rocminfo | grep 'Name:'
Name: Intel(R) Core(TM) i9-9900X CPU @ 3.50GHz
Marketing Name: Intel(R) Core(TM) i9-9900X CPU @ 3.50GHz
Vendor Name: CPU
Name: gfx906
Marketing Name: AMD Radeon VII
Vendor Name: AMD
Name: amdgcn-amd-amdhsa--gfx906:sramecc+:xnack-

rocm@rocm-develop:~$ /opt/rocm-5.5.0/opencl/bin/clinfo
Number of platforms: 1
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 2.1 AMD-APP (3558.0)
Platform Name: AMD Accelerated Parallel Processing
Platform Vendor: Advanced Micro Devices, Inc.
Platform Extensions: cl_khr_icd cl_amd_event_callback

Platform Name: AMD Accelerated Parallel Processing
Number of devices: 1
Device Type: CL_DEVICE_TYPE_GPU
Vendor ID: 1002h
Board name: AMD Radeon VII
Device Topology: PCI[ B#195, D#0, F#0 ]
Max compute units: 60
Max work items dimensions: 3
Max work items[0]: 1024
Max work items[1]: 1024
Max work items[2]: 1024
Max work group size: 256
Preferred vector width char: 4
Preferred vector width short: 2
Preferred vector width int: 1
Preferred vector width long: 1
Preferred vector width float: 1
Preferred vector width double: 1
Native vector width char: 4
Native vector width short: 2
Native vector width int: 1
Native vector width long: 1
Native vector width float: 1
Native vector width double: 1
Max clock frequency: 1801Mhz
Address bits: 64
Max memory allocation: 14588628168
Image support: Yes
Max number of images read arguments: 128
Max number of images write arguments: 8
Max image 2D width: 16384
Max image 2D height: 16384
Max image 3D width: 16384
Max image 3D height: 16384
Max image 3D depth: 8192
Max samplers within kernel: 26287
Max size of kernel argument: 1024
Alignment (bits) of base address: 1024
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: Yes
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: Yes
Round to +ve and infinity: Yes
IEEE754-2008 fused multiply-add: Yes
Cache type: Read/Write
Cache line size: 64
Cache size: 16384
Global memory size: 17163091968
Constant buffer size: 14588628168
Max number of constant args: 8
Local memory type: Scratchpad
Local memory size: 65536
Max pipe arguments: 16
Max pipe active reservations: 16
Max pipe packet size: 1703726280
Max global variable size: 14588628168
Max global variable preferred total size: 17163091968
Max read/write image args: 64
Max on device events: 1024
Queue on device max size: 8388608
Max on device queues: 1
Queue on device preferred size: 262144
SVM capabilities:
Coarse grain buffer: Yes
Fine grain buffer: Yes
Fine grain system: No
Atomics: No
Preferred platform atomic alignment: 0
Preferred global atomic alignment: 0
Preferred local atomic alignment: 0
Kernel Preferred work group size multiple: 64
Error correction support: 0
Unified memory for Host and Device: 0
Profiling timer resolution: 1
Device endianess: Little
Available: Yes
Compiler available: Yes
Execution capabilities:
Execute OpenCL kernels: Yes
Execute native function: No
Queue on Host properties:
Out-of-Order: No
Profiling : Yes
Queue on Device properties:
Out-of-Order: Yes
Profiling : Yes
Platform ID: 0x7f7955730e50
Name: gfx906:sramecc+:xnack-
Vendor: Advanced Micro Devices, Inc.
Device OpenCL C version: OpenCL C 2.0
Driver version: 3558.0 (HSA1.1,LC)
Profile: FULL_PROFILE
Version: OpenCL 2.0
Extensions: cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program

rocm@rocm-develop:~$
:::

## 5.
Install PyTorch

### a) Install

`pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.4.2`

```shell=
sudo apt install python3-pip
pip uninstall torch torchaudio torchvision spacy -y
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.4.2
pip install cupy # or install from source
pip install spacy
```

### b) Try PyTorch examples

```shell=
sudo apt install git
```

Clone the PyTorch examples repository:

```shell=
git clone https://github.com/pytorch/examples.git
```

Run an individual example: MNIST

```shell=
cd examples/mnist
```

Follow the instructions in README.md, in this case:

```shell=
pip install -r requirements.txt
python3 main.py
```

Run another individual example: ImageNet training. Change into its directory with `cd ../imagenet`.

### Verify PyTorch-ROCm

```shell=
git clone https://github.com/pytorch/pytorch.git
cd pytorch
```

## 6. TensorFlow Installation

Reference link: https://sep5.readthedocs.io/en/latest/Deep_learning/Deep-learning.html

Install these other relevant ROCm packages:

```shell=
sudo apt update
sudo apt install rocm-libs miopen-hip rccl
```

And finally, install TensorFlow itself (via the Python Package Index):

```shell=
sudo apt install wget python3-pip
# Pip install the whl package from PyPI
pip install --user tensorflow-rocm==2.13.0
```

:::info
**This installs the following tensorflow_rocm-2.13.0 version: tensorflow_rocm-2.11.0.540-cp38-cp38-manylinux2014_x86_64.whl**
:::

## 7. Uninstalling ROCm

STEP01: Run the uninstall script:

```shell=
sudo amdgpu-uninstall
```

STEP02: Uninstall ROCm packages

```shell=
sudo apt autoremove rocm-core
```

STEP03: Uninstall Kernel-mode Driver

```shell=
sudo apt autoremove amdgpu-dkms
```

## 8. Monitor AMD GPUs Using Open Source Drivers in Linux

Reference link: https://linuxhint.com/apps-monitor-amd-gpu-linux/

To install Radeontop in Ubuntu, execute the command specified below:

```
sudo apt install radeontop
```

In other Linux distributions, you can install Radeontop from the package manager.
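You can also compile its source code to get executable binary files.

To run Radeontop, use a command in the following format:

```shell=
sudo radeontop -c
```

The screenshot shows AMD GPU utilization while running the PyTorch MNIST training:
![](https://hackmd.io/_uploads/rkfpaKON3.png)

---

ROCm is the brand name for the ROCm open software platform (the software) or the ROCm open platform ecosystem (which includes hardware such as FPGAs or other CPU architectures). In addition to installation through the native package manager, AMD ROCm introduces additional methods for installing and upgrading ROCm. You can now use the upgrade mechanism to upgrade an existing ROCm installation to a specific or the latest ROCm release. In this release, ROCm installation uses the amdgpu-install and amdgpu-uninstall scripts. The amdgpu-install script streamlines the installation process, as described in the guide linked below.

Link: https://docs.amd.com/bundle/ROCm-Installation-Guide-v5.5/page/How_to_Install_ROCm.html#_How_to_Install

You can install ROCm with either of the following methods:

- Installer script method
- Package manager method

For information on installing ROCm on devices with NVIDIA GPUs, refer to the HIP Installation Guide.

![](https://hackmd.io/_uploads/By7b87wEn.png)

### 1. Installer Script Method

The installer script method automates the installation of the AMDGPU and ROCm stack. The installer script handles the complete ROCm installation process, including setting up the repository, cleaning the system, and updating and installing the required drivers and meta-packages. With this approach, the system has more control over the ROCm installation process, so users who are less familiar with standard Linux commands can choose this method to install ROCm. To install AMDGPU and ROCm on a Linux distribution with the installer script method, follow these steps:

1. Meet the prerequisites: make sure the prerequisites are met before downloading and installing the installer with the installer script method.
2. Download and install the installer: make sure you download and install the installer script from the recommended URL.

![](https://hackmd.io/_uploads/Bk6XLQPVh.png)

:::warning
Note: The installer package is updated periodically to resolve known issues and add new features. The links for each Linux distribution always point to the latest available build.
:::

3. Use the installer script on Linux distributions: make sure you run the script to install the use cases.

### 2. Downloading and Installing the Installer Script on Ubuntu

Ubuntu v20.04

To download the amdgpu-install script onto the system, use the following commands.

``` shell
sudo apt-get update
wget https://repo.radeon.com/amdgpu-install/5.5/ubuntu/focal/amdgpu-install_5.5.50500-1_all.deb
sudo apt-get install ./amdgpu-install_5.5.50500-1_all.deb
```

### 3.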
Using the Installer Script for Single-version ROCm Installation

To install the use cases specific to your requirements, use the amdgpu-install installer as follows:

To install a single use case (this experiment uses this single use case):

```
sudo amdgpu-install -y --usecase=rocm
```

To install kernel-mode driver:

```
sudo amdgpu-install --usecase=dkms
```

To install multiple use cases:

```
sudo amdgpu-install --usecase=hiplibsdk,rocm
```

:::info
The list in this section shows only examples of the available ROCm use cases. To print the list, run:
`sudo amdgpu-install --list-usecase`
:::

:::info
If the --usecase option is not present, the default selection is "graphics,opencl,hip".

Available use cases:

- rocm (for users and developers requiring full ROCm stack)
  - OpenCL (ROCr/KFD based) runtime
  - HIP runtimes
  - Machine learning framework
  - All ROCm libraries and applications
  - ROCm Compiler and device libraries
  - ROCr runtime and thunk
- lrt (for users of applications requiring ROCm runtime)
  - ROCm Compiler and device libraries
  - ROCr runtime and thunk
- opencl (for users of applications requiring OpenCL on Vega or later products)
  - ROCr based OpenCL
  - ROCm Language runtime
- openclsdk (for application developers requiring ROCr based OpenCL)
  - ROCr based OpenCL
  - ROCm Language runtime
  - development and SDK files for ROCr based OpenCL
- hip (for users of HIP runtime on AMD products)
  - HIP runtimes
- hiplibsdk (for application developers requiring HIP on AMD products)
  - HIP runtimes
  - ROCm math libraries
  - HIP development libraries
:::

Note: Adding -y as an argument to amdgpu-install skips user prompts (for automation). Example: `amdgpu-install -y --usecase=rocm`

Install the kernel-mode driver.

```
sudo apt install amdgpu-dkms
```

Reboot the system.

```
sudo reboot
```

### 4.
Add ROCm Stack Repository

To add the ROCm repository, use the following steps:

Ubuntu v20.04

```
# Base URL pointing to repositories with the latest packages.
# Assumes the ROCm signing key is already present at the signed-by path below.
echo 'deb [arch=amd64 signed-by=/etc/apt/trusted.gpg.d/rocm-keyring.gpg] https://repo.radeon.com/rocm/apt/debian/ focal main' | sudo tee /etc/apt/sources.list.d/rocm.list
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' | sudo tee /etc/apt/preferences.d/rocm-pin-600
sudo apt-get update
```

### 5. Install ROCm Meta-packages

Install the ROCm meta-packages. Specify the name of the meta-package to install as `<package-name>`, as shown below:

`sudo apt install <package-name>`

This installation example installs the following two packages:

```
sudo apt install rocm-hip-sdk
sudo apt install rocm-hip-sdk rocm-opencl-sdk
```

#### Verifying Kernel-mode Driver Installation

Check the kernel-mode driver installation by typing the command below:

```
dkms status
```

#### e) Verifying ROCm Installation

After completing the ROCm installation, run the following commands on the system to verify that it succeeded. If both commands list your GPU, the installation was successful:

```
/opt/rocm-5.5.0/bin/rocminfo
# and
/opt/rocm-5.5.0/opencl/bin/clinfo
```

![](https://hackmd.io/_uploads/SyPqRQwVh.png)

#### f) Verifying Package Installation

To make sure the packages were installed successfully, use the following command:

```
sudo apt list --installed
```

#### g) Uninstallation Using Uninstall Script

The following command uninstalls all installed ROCm packages:

```
sudo amdgpu-uninstall
```

#### h) Complete Uninstallation of ROCm Packages

If you want to uninstall all installed ROCm packages from a ROCm release, use the following command:

:::info
NOTE: The uninstallation of the rocm-core package removes all ROCm-specific packages from the system.
:::

Ubuntu/Debian

```
# Uninstall all single-version ROCm packages
sudo apt autoremove rocm-core
```

#### i) Uninstall Kernel-mode Driver

```
sudo apt autoremove amdgpu-dkms
```

## V. RadeonVII ROCm

https://docs.amd.com/category/ROCm%E2%84%A2%20v5.x

### 1. ROCm PyTorch and TensorFlow framework compatibility

### 2.
Frameworks Installation

https://docs.amd.com/bundle/ROCm-Deep-Learning-Guide-v5.5/page/Frameworks_Installation.html

The following sections cover the installation of the different frameworks for ROCm and deep learning applications. Figure 5 of the guide shows the sequential flow for using each framework.

Install Docker Engine
https://docs.docker.com/engine/install/ubuntu/

    #==============================================================================
    # install docker
    # 1. Update the apt package index and install packages to allow apt to use a repository over HTTPS:
    sudo apt-get update
    sudo apt-get install \
        ca-certificates \
        curl \
        gnupg

    # 2. Add Docker's official GPG key:
    sudo install -m 0755 -d /etc/apt/keyrings
    curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
    sudo chmod a+r /etc/apt/keyrings/docker.gpg

    # 3. Use the following command to set up the repository:
    echo \
      "deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
      "$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | \
      sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

    # Install Docker Engine
    # Update the apt package index:
    sudo apt-get update
    # Install Docker Engine, containerd, and Docker Compose.
    sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

    # Verify that the Docker Engine installation is successful by running the hello-world image:
    sudo docker run hello-world
    #==============================================================================

### 3. PyTorch Installation

Option 1 (Recommended): Use Docker Image with PyTorch Preinstalled
https://docs.amd.com/bundle/ROCm-Deep-Learning-Guide-v5.5/page/Frameworks_Installation.html

Using Docker gives you portability and access to a prebuilt Docker container that has been rigorously tested within AMD. This can also save compilation time, and the container should perform as tested without running into potential installation issues. To use a Docker image with PyTorch preinstalled, follow these steps:

1. Pull the latest public PyTorch Docker image.

`sudo docker pull rocm/pytorch:latest`

2. Start a Docker container using the downloaded image.
`sudo docker run -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --device=/dev/kfd --device=/dev/dri --group-add video --ipc=host --shm-size 8G rocm/pytorch:latest`

Option 2: Install PyTorch Using Wheels Package

PyTorch supports the ROCm platform by providing tested wheels packages. To access this feature, refer to https://pytorch.org/get-started/locally/ and select the ROCm compute platform.

3. Clone the PyTorch repository.

`cd ~`
`git clone https://github.com/pytorch/pytorch.git`
`cd pytorch`
`git submodule update --init --recursive`

FIX: run `git submodule update --init`.

4. Build PyTorch for ROCm.

NOTE: By default, in rocm/pytorch:latest-base, PyTorch builds for these architectures simultaneously: gfx900, gfx906, gfx908, gfx90a, and gfx1030.

5. To determine your AMD uarch, run:

`rocminfo | grep gfx`

6. In the event you want to compile only for your uarch, use:

`export PYTORCH_ROCM_ARCH=<uarch>`

`<uarch>` is the architecture reported by the rocminfo command. For the Radeon VII used here, rocminfo reports `amdgcn-amd-amdhsa--gfx906:sramecc+:xnack-`, so the architecture is `gfx906`:

`export PYTORCH_ROCM_ARCH=gfx906`

7. Build PyTorch using the following command:

`./.jenkins/pytorch/build.sh`

This first converts the PyTorch sources for HIP compatibility and then builds the PyTorch framework.

8. Alternatively, build PyTorch by issuing the following commands:

`python3 tools/amd_build/build_amd.py`
`USE_ROCM=1 MAX_JOBS=4 python3 setup.py install --user`

### 4. Test the PyTorch Installation

You can use the PyTorch unit tests to validate a PyTorch installation. If you use a prebuilt PyTorch Docker image from AMD ROCm DockerHub or install the official wheels package, these tests have already been run on those configurations. Alternatively, you can run the unit tests manually to fully validate the PyTorch installation. To test the PyTorch installation, follow these steps:

1. Test whether PyTorch is installed and accessible by importing the torch package in Python.

NOTE: Do not run in the PyTorch git folder.

`python3 -c 'import torch' 2> /dev/null && echo 'Success' || echo 'Failure'`

2. Test whether the GPU is accessible from PyTorch. In the PyTorch framework, torch.cuda is the generic mechanism for accessing GPUs; it accesses an AMD GPU only if one is available.

`python3 -c 'import torch; print(torch.cuda.is_available())'`

3.
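Run the unit tests to fully validate the PyTorch installation. Run the following command from the PyTorch home directory:

`BUILD_ENVIRONMENT=${BUILD_ENVIRONMENT:-rocm} ./.jenkins/pytorch/test.sh`

This ensures that the required environment variables are set to skip certain unit tests for ROCm, even when the wheels are installed in a non-controlled environment.

NOTE: Make sure the PyTorch source code corresponds to the PyTorch wheel or to the installation in the Docker image. Incompatible PyTorch source code can produce errors when running the unit tests.

The script first installs some dependencies, such as a torchvision version supported by PyTorch. Torchvision is used in some PyTorch tests to load models. It then runs all the unit tests.

NOTE: Depending on your system configuration, some tests may be skipped. ROCm does not support every PyTorch feature, and the tests that evaluate unsupported features are skipped. Other tests may also be skipped depending on the host memory or the number of available GPUs. If the compilation and installation are correct, no test should fail.

4. Run an individual unit test with the following command:

`PYTORCH_TEST_WITH_ROCM=1 python3 test/test_nn.py --verbose`

`test_nn.py` can be replaced with any other test set.

### 5. Run a Basic PyTorch Example

###### tags: `AMD`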
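The import check and the GPU check from the "Test the PyTorch Installation" steps can be combined into one small script that also runs a tiny tensor computation, which may serve as a first basic example. This is only a sketch and is not taken from the AMD guide: the function name `check_pytorch` and the report keys are hypothetical, and it assumes nothing beyond `torch.cuda.is_available()`, `torch.version.hip`, and a 4x4 matrix multiplication.

```python
"""Minimal sanity check for a PyTorch installation (ROCm or otherwise)."""


def check_pytorch():
    """Return a dict describing whether torch imports and a GPU is usable."""
    report = {
        "installed": False,
        "gpu_available": False,
        "hip_version": None,
        "matmul_ok": None,
    }
    try:
        import torch
    except ImportError:
        # torch is not installed in this environment; report that and stop.
        return report

    report["installed"] = True
    # On ROCm builds torch.version.hip is a version string; otherwise None.
    report["hip_version"] = getattr(torch.version, "hip", None)
    # torch.cuda is the generic GPU interface and is also used for AMD GPUs.
    report["gpu_available"] = bool(torch.cuda.is_available())

    # Run on the GPU when one is visible, otherwise fall back to the CPU.
    device = torch.device("cuda" if report["gpu_available"] else "cpu")
    a = torch.ones(4, 4, device=device)
    identity = torch.eye(4, device=device)
    # Multiplying by the identity matrix must give back the same matrix.
    report["matmul_ok"] = bool(torch.equal(a @ identity, a))
    return report


if __name__ == "__main__":
    for key, value in check_pytorch().items():
        print(f"{key}: {value}")
```

Saved under a hypothetical name such as `check_rocm_torch.py` and run outside the PyTorch git folder, the script prints one line per report key. On a working ROCm setup, `gpu_available` should be `True` and `hip_version` should be set.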