Azure / VM / Parabricks Runtime Environment Setup
===
###### tags: `Parabricks-v3.5`
###### tags: `Genomics`, `NVIDIA`, `Clara`, `Parabricks`, `Secondary Analysis`, `Azure`

<br>

[TOC]

<br>

## Tutorials
### [Azure / Disk](/8f1YasxKSY-Tv6yPdCMh8w)
### [Azure / Virtual Machine (VM) (including GPU)](/bMasy0__T3-lqFnNFklgvw)

<br>
<hr>
<br>

## Hands-on - Environment Preparation
### Check the CUDA version
Query it with the `nvcc` (NVIDIA CUDA Compiler) command:
```bash=
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Tue_Sep_15_19:10:02_PDT_2020
Cuda compilation tools, release 11.1, V11.1.74
Build cuda_11.1.TC455_06.29069683_0
```

If the `nvcc` command is not found, `nvidia-smi` can be used instead.
![](https://i.imgur.com/N5lNi1J.png)

<br>

```bash=
$ nvidia-smi
$ nvidia-smi | head -4
Fri Jun 11 08:39:51 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 465.19.01    Driver Version: 465.19.01    CUDA Version: 11.3     |
|-------------------------------+----------------------+----------------------+
```
![](https://i.imgur.com/MTAw64A.png)
- The CUDA version reported here is `11.3`. `nvidia-smi` shows the highest CUDA version the installed driver supports, which can differ from the toolkit version that `nvcc` reports (`11.1` above).

- References
    - [How to Check CUDA Version Easily](https://varhowto.com/check-cuda-version/#Method_1_%E2%80%94_Use_nvcc_to_check_CUDA_version)

<br>

### Check the memory of a single GPU
```bash=
$ nvidia-smi
```
![](https://i.imgur.com/7ypD7NJ.png)
- `16280MiB` >= 12GB

<br>

### Install Docker
> docker not found. Please check installation of docker.
- Install Docker
    ```bash=
    # Remove any older Docker release, or the one bundled with Ubuntu
    #sudo apt-get remove docker docker-engine docker.io

    # Install the prerequisites needed to fetch Docker
    sudo apt-get update
    sudo apt-get install -y apt-transport-https ca-certificates curl software-properties-common

    # Make the system trust the Docker package repository
    curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
    sudo apt-key fingerprint 0EBFCD88

    sudo add-apt-repository \
       "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
       $(lsb_release -cs) \
       stable"

    sudo apt-get update
    sudo apt-get install -y docker-ce
    ```

- Check that Docker was installed successfully (to be tidied up; this output is from TWCC)
    ```bash=
    $ docker --version
    Docker version 20.10.7, build f0df350
    ```

- Without `sudo`, you get a permission-denied error
    ```bash=
    $ docker ps -a
    Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/json?all=1: dial unix /var/run/docker.sock: connect: permission denied

    $ sudo docker ps -a    # should be OK
    CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES

    # Inspect group membership (only the docker line is shown)
    $ cat /etc/group
    docker:x:998:

    # Add the current user to the docker group
    $ sudo usermod -aG docker $USER

    # Inspect group membership again
    $ cat /etc/group
    docker:x:998:azureuser

    # The change only takes effect after logging in again
    $ logout

    # Log back in
    $ ssh -i parabricks-test_key.pem azureuser@65.52.39.149

    # Test the command once more
    $ docker ps -a
    ```

- [Docker-in-Docker, DiD] Docker cannot be started again inside a container
    ![](https://i.imgur.com/nycZiiA.png)
    Source: TWCC / Development Container

<br>

### Install nvidia-container-toolkit
> #`--runtime nvidia`
> #`--runtime=nvidia`

:::warning
:warning: **nvidia-docker2, nvidia-container-toolkit, `--gpus all`?**
Docker versions:
- Versions earlier than 19.03 require nvidia-docker2 and the `--runtime=nvidia` flag.
- Version 19.03 and later require the nvidia-container-toolkit package and the `--gpus all` flag.

Both options are described in the reference linked below.
- [Reference](https://www.tensorflow.org/install/docker?hl=zh-tw)
![](https://i.imgur.com/L3OSuEi.png)
:::

- Installing the Parabricks package produces the following error message
  ![](https://i.imgur.com/qTatlZX.png)
  > Docker does not have nvidia runtime or native GPU support.
  Please either add nvidia runtime to docker, install the nvidia-container-toolkit for docker >= 19.03, or install nvidia-docker. Exiting...

    ```bash=
    # Add the package repository, update the package index, and install
    distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
    curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
    curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
    sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
    sudo systemctl restart docker
    ```

- [Installing Docker and NVIDIA Container Toolkit on Ubuntu 18.04 (with an external GPU)](https://grady1006.medium.com/1e3c404c517d)
    ```
    # Add the package repository, update the package index, and install
    $ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
    $ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
    $ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
    $ sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
    $ sudo systemctl restart docker
    ```
    - Recent Docker versions support GPUs natively, so the nvidia-docker2 packages are no longer needed
    - The project is now officially named NVIDIA Container Toolkit

- To restart Docker inside docker-in-docker (tested 2021/06/26)
    - `systemctl restart docker` does not work
    - `service docker restart` works

- Test `nvidia-smi`
    ![](https://i.imgur.com/v3ZRp6a.png)

<br>

### Alternatively, install nvidia-docker2
> Choose either nvidia-container-toolkit or nvidia-docker2
> Tested 2021/07/05: `pbrun fq2bam` runs

- [How to install nvidia-docker2](https://ithelp.ithome.com.tw/articles/10205391)
    ```bash=
    # Install nvidia-docker2
    sudo apt-get install -y nvidia-docker2

    # Reload the Docker daemon
    sudo pkill -SIGHUP dockerd
    ```
    - `sudo service docker restart` also works

<br>

### Upload parabricks.tar.gz to the VM
```bash=
$ scp -i parabricks-test_key.pem \
    parabricks.tar.gz \
    azureuser@65.52.39.149:/home/azureuser/
```

<br>

### [Install parabricks.tar.gz](https://docs.nvidia.com/clara/parabricks/v3.5/text/getting_started.html#installation-guide)
```bash=
# Step 1: Unzip the package.
$ tar -xzf parabricks.tar.gz

# Step 2 (Node Lock License): Run the installer.
$ sudo ./parabricks/installer.py
# Accept the default for every prompt (all "yes")

# Step 2 (Flexera License): Run the installer replacing [hostname] with the hostname of the license server.
$ sudo ./parabricks/installer.py --flexera-server [hostname]:7070

# Step 3: verify your installation.
# This should display the parabricks version number:
$ pbrun version
Please visit https://docs.nvidia.com/clara/#parabricks for detailed documentation
pbrun: v3.5.0

$ pbrun --version
pbrun: v3.5.0
```
- For the detailed installation output, see [**installer.py run-through**](https://hackmd.io/dpOMeTmfQtmh51XCRWE8xw)

<br>

### Use the official [sample dataset](https://docs.nvidia.com/clara/parabricks/v3.5/text/getting_started.html#step-4-example-run)
```
$ wget -O parabricks_sample.tar.gz "https://s3.amazonaws.com/parabricks.sample/parabricks_sample.tar.gz"
```
- AWS S3 -> Azure: average speed 46.40MB/s :exclamation::exclamation::exclamation: (that is megabytes, MB)
- 9.24GB took 6m24s

<br>

### Download the dataset from ESC4000
```
$ rsync --progress -zh \
    diatango_lin@10.78.26.241:/mnt/ssdraid/Gene/GATK/everythings-misc.zip .
```
- `--progress` shows a progress bar
- `-z` compresses the file data during transfer
- `-h` prints file sizes in human-readable form

<br>
<hr>
<br>

## Environment Setup (Ubuntu 20.04) - Full Commands
`sudo sh setup_env.sh`
```bash=
# install NVIDIA runtime ( CUDA Toolkit )
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/7fa2af80.pub
sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/ /"
sudo apt-get update
sudo apt-get -y install cuda

# install docker
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl software-properties-common

# Make the system trust the Docker package repository
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo apt-key fingerprint 0EBFCD88

sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) \
   stable"

sudo apt-get update
sudo apt-get install -y docker-ce

# install nvidia-docker (nvidia-container-toolkit)
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
```

```bash=
# Add the current user to the docker group
sudo usermod -aG docker $USER

# Log out (and back in) so the docker group membership takes effect
logout
```
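
After logging back in, a quick sanity check can confirm that the required tools are on the `PATH` and that each GPU meets the `>= 12GB` memory figure noted in the GPU memory check. The following is a minimal sketch and not part of the original setup script: the function names `check_cmd` and `gpu_mem_ok` and the `MIN_GPU_MIB` threshold (12GB taken as 12288 MiB) are assumptions introduced for illustration only.

```bash
# Post-setup sanity check (illustrative sketch; helper names are hypothetical).

# Assumed threshold: 12GB expressed in MiB.
MIN_GPU_MIB=12288

# Succeed if the given memory size (in MiB) meets the minimum.
# Non-numeric input (for example "[N/A]") is treated as a failure.
gpu_mem_ok() {
    [ "$1" -ge "$MIN_GPU_MIB" ] 2>/dev/null
}

# Print "found" or "missing" for a command name.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found"
    else
        echo "missing"
    fi
}

# Report whether each tool used in this note is installed.
for cmd in nvidia-smi docker pbrun; do
    echo "$cmd: $(check_cmd "$cmd")"
done

# Query the total memory of every GPU, but only when nvidia-smi is available.
if [ "$(check_cmd nvidia-smi)" = "found" ]; then
    nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits |
    while read -r mib; do
        if gpu_mem_ok "$mib"; then
            echo "GPU memory ${mib} MiB: OK"
        else
            echo "GPU memory ${mib} MiB: below ${MIN_GPU_MIB} MiB"
        fi
    done
fi
```

The check only reads state and prints a report, so it can be re-run at any point. With the `16280MiB` GPU shown earlier, `gpu_mem_ok 16280` succeeds because 16280 is greater than 12288.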