Azure / VM / Parabricks Runtime Environment Setup
===
###### tags: `Parabricks-v3.5`
###### tags: `Genomics`, `NVIDIA`, `Clara`, `Parabricks`, `Secondary Analysis`, `Azure`
<br>
[TOC]
<br>
## Tutorials
### [Azure / Disk](/8f1YasxKSY-Tv6yPdCMh8w)
### [Azure / Virtual Machine (VM) (including GPU)](/bMasy0__T3-lqFnNFklgvw)
<br>
<hr>
<br>
## Hands-on - Environment Preparation
### Check the CUDA version
Query it with the `nvcc` (NVIDIA CUDA Compiler) command:
```bash=
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Tue_Sep_15_19:10:02_PDT_2020
Cuda compilation tools, release 11.1, V11.1.74
Build cuda_11.1.TC455_06.29069683_0
```
If the `nvcc` command is not found, `nvidia-smi` can be used instead:

<br>
```bash=
$ nvidia-smi
$ nvidia-smi | head -4
Fri Jun 11 08:39:51 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 465.19.01 Driver Version: 465.19.01 CUDA Version: 11.3 |
|-------------------------------+----------------------+----------------------+
```

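- `nvidia-smi` reports CUDA version `11.3`. This is the highest CUDA version the installed driver supports, so it can differ from the toolkit version that `nvcc` reports (`11.1` above).
- References
  - [How to Check CUDA Version Easily](https://varhowto.com/check-cuda-version/#Method_1_%E2%80%94_Use_nvcc_to_check_CUDA_version)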
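
The two version numbers above can be pulled out of the command output with a small filter. The following is a minimal sketch; the function names `cuda_version_from_smi` and `cuda_version_from_nvcc` are hypothetical helpers, not part of any NVIDIA tool, and they only assume the output formats shown in the transcripts above.

```bash
# Hypothetical helper: print the "CUDA Version" field found in
# nvidia-smi header text read from stdin.
cuda_version_from_smi() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: print the toolkit release found in
# `nvcc --version` text read from stdin.
cuda_version_from_nvcc() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p' | head -n 1
}

# Usage on a machine with the driver and toolkit installed:
#   nvidia-smi     | cuda_version_from_smi
#   nvcc --version | cuda_version_from_nvcc
```

Reading from stdin keeps the helpers usable on saved logs as well as on live command output.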
<br>
### Check the memory of a single GPU
```bash=
$ nvidia-smi
```

- `16280MiB` >= 12GB
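
The comparison in the bullet above can also be scripted. This is a sketch under two assumptions: the helper name `gpu_mem_ok` is hypothetical, and the 12GB threshold from the bullet is interpreted as 12 GiB (12288 MiB).

```bash
# Hypothetical helper: succeed if a GPU memory size given in MiB meets a
# minimum given in GiB (default 12, the threshold used in this note).
gpu_mem_ok() (
    mem_mib=$1
    min_gib=${2:-12}
    [ "$mem_mib" -ge $((min_gib * 1024)) ]
)

# Usage on a machine with a GPU (one line of output per GPU):
#   nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits |
#   while read -r mib; do
#       if gpu_mem_ok "$mib"; then echo "OK: ${mib} MiB"; else echo "too small: ${mib} MiB"; fi
#   done
```

The `--query-gpu` form prints only the number, which avoids parsing the full `nvidia-smi` table.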
<br>
### Install Docker
> docker not found. Please check installation of docker.
- Install Docker
```bash=
# Remove any older Docker, or the one bundled with Ubuntu
#sudo apt-get remove docker docker-engine docker.io
# Install the prerequisites needed to fetch Docker
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl software-properties-common
# Make the system trust the Docker repository
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo apt-key fingerprint 0EBFCD88
sudo add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable"
sudo apt-get update
sudo apt-get install -y docker-ce
```
- Check that Docker installed successfully (to be tidied up; this output is from TWCC)
```bash=
$ docker --version
Docker version 20.10.7, build f0df350
```
- Without `sudo`, the command fails with permission denied
```bash=
$ docker ps -a
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/json?all=1: dial unix /var/run/docker.sock: connect: permission denied
$ sudo docker ps -a # should be OK
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
# Check group membership
$ cat /etc/group
docker:x:998:
# Add the current user to the docker group
$ sudo usermod -aG docker $USER
# Check group membership again
$ cat /etc/group
docker:x:998:azureuser
# Log out and back in for the change to take effect
$ logout
# Log back in
$ ssh -i parabricks-test_key.pem azureuser@65.52.39.149
# Run the command again
$ docker ps -a
```
- [Docker-in-Docker, DiD] Docker cannot be started again inside a container

Source: TWCC / development container (開發型容器)
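
The before/after `/etc/group` check in the transcript above can be automated. A minimal sketch, assuming the standard colon-separated `/etc/group` format; the function name `user_in_group` is hypothetical.

```bash
# Hypothetical helper: succeed if USER is listed as a member of GROUP in
# /etc/group-formatted text read from stdin.
user_in_group() (
    group=$1
    user=$2
    awk -F: -v g="$group" -v u="$user" '
        $1 == g {
            n = split($4, members, ",")
            for (i = 1; i <= n; i++) if (members[i] == u) found = 1
        }
        END { exit found ? 0 : 1 }
    '
)

# Usage:
#   user_in_group docker "$USER" < /etc/group && echo "in docker group"
```

This only confirms what is recorded in `/etc/group`. The current login session picks up the new group after logging out and back in, as the transcript notes; `id -nG` shows the groups of the current session.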
<br>
### Install nvidia-container-toolkit
> #`--runtime nvidia` #`--runtime=nvidia`
:::warning
:warning: **nvidia-docker2, nvidia-container-toolkit, `--gpus all`?**
Docker versions:
- Versions earlier than 19.03 require nvidia-docker2 and the `--runtime=nvidia` flag.
- Versions 19.03 and later require the nvidia-container-toolkit package and the `--gpus all` flag. Both options are described in the reference linked below.
- [Reference](https://www.tensorflow.org/install/docker?hl=zh-tw)

:::
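
The version rule in the warning above can be expressed as a small helper that picks the flag from a Docker version string. This is a sketch only; the function name `gpu_flag_for_docker` is hypothetical, and it assumes version strings of the form `MAJOR.MINOR.PATCH` such as `20.10.7`.

```bash
# Hypothetical helper: print the GPU flag that matches a Docker version,
# following the 19.03 cutoff described above.
gpu_flag_for_docker() (
    version=$1                 # e.g. "20.10.7"
    major=${version%%.*}
    rest=${version#*.}
    minor=${rest%%.*}
    minor=${minor#0}           # "03" -> "3", "0" -> ""
    if [ "$major" -gt 19 ] || { [ "$major" -eq 19 ] && [ "${minor:-0}" -ge 3 ]; }; then
        echo "--gpus all"
    else
        echo "--runtime=nvidia"
    fi
)

# Usage (IMAGE is a placeholder for any CUDA-enabled image):
#   ver=$(docker version --format '{{.Server.Version}}')
#   docker run --rm $(gpu_flag_for_docker "$ver") IMAGE nvidia-smi
```

For the Docker 20.10.7 shown earlier in this note, the helper selects `--gpus all`.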
- Without it, installing the Parabricks package fails with the following error message:

> Docker does not have nvidia runtime or native GPU support. Please either add nvidia runtime to docker, install the nvidia-container-toolkit for docker >= 19.03, or install nvidia-docker. Exiting...
```bash=
# Add the package repository, update the package index, and install
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
```
- [Installing Docker and NVIDIA Container Toolkit on Ubuntu 18.04 (with an external GPU)](https://grady1006.medium.com/1e3c404c517d)
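- Newer Docker versions support GPUs natively, so the nvidia-docker2 package is no longer required
- The project is now officially named NVIDIA Container Toolkit
- Inside docker-in-docker, Docker has to be restarted differently (2021/06/26)
  - `systemctl restart docker` does not work
  - `service docker restart` works
- Test with `nvidia-smi`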

<br>
### Alternatively, install nvidia-docker2
> Choose one of nvidia-container-toolkit and nvidia-docker2
> Tested 2021/07/05: `pbrun fq2bam` runs
- [How to install nvidia-docker2](https://ithelp.ithome.com.tw/articles/10205391)
```bash=
# Install nvidia-docker2
sudo apt-get install -y nvidia-docker2
# Reload the Docker daemon
sudo pkill -SIGHUP dockerd
```
- `sudo service docker restart` also works
<br>
### Upload parabricks.tar.gz to the VM
```bash=
$ scp -i parabricks-test_key.pem \
parabricks.tar.gz \
azureuser@65.52.39.149:/home/azureuser/
```
<br>
### [Install parabricks.tar.gz](https://docs.nvidia.com/clara/parabricks/v3.5/text/getting_started.html#installation-guide)
```bash=
# Step 1: Unzip the package.
$ tar -xzf parabricks.tar.gz
# Step 2 (Node Lock License): Run the installer.
$ sudo ./parabricks/installer.py
# Accept the defaults: answer yes to every prompt
# Step 2 (Flexera License): Run the installer replacing [hostname] with the hostname of the license server.
$ sudo ./parabricks/installer.py --flexera-server [hostname]:7070
# Step 3: verify your installation.
# This should display the parabricks version number:
$ pbrun version
Please visit https://docs.nvidia.com/clara/#parabricks for detailed documentation
pbrun: v3.5.0
$ pbrun --version
pbrun: v3.5.0
```
- For the detailed installation process, see the [**installer.py run log**](https://hackmd.io/dpOMeTmfQtmh51XCRWE8xw)
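
For scripted setups, the version check in Step 3 can be reduced to a single comparison. A minimal sketch, assuming the `pbrun: v3.5.0` output format shown above; the function name `pbrun_version` is a hypothetical helper, not a Parabricks command.

```bash
# Hypothetical helper: print the version number from `pbrun version` or
# `pbrun --version` output read from stdin.
pbrun_version() {
    sed -n 's/^pbrun: v\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Usage after installation:
#   [ "$(pbrun --version | pbrun_version)" = "3.5.0" ] && echo "Parabricks 3.5.0 installed"
```

The filter ignores the documentation banner that `pbrun version` prints before the version line.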
<br>
### Use the official [sample dataset](https://docs.nvidia.com/clara/parabricks/v3.5/text/getting_started.html#step-4-example-run)
```
$ wget -O parabricks_sample.tar.gz "https://s3.amazonaws.com/parabricks.sample/parabricks_sample.tar.gz"
```
- AWS S3 -> Azure: average speed 46.40MB/s :exclamation::exclamation::exclamation: (that is MB, not Mb)
- 9.24GB took 6m24s
<br>
### Download the dataset from ESC4000
```
$ rsync --progress -zh \
diatango_lin@10.78.26.241:/mnt/ssdraid/Gene/GATK/everythings-misc.zip .
```
- `--progress` shows transfer progress
- `-z` compresses file data during the transfer
- `-h` prints sizes in a human-readable format
<br>
<hr>
<br>
## Build the environment (Ubuntu 20.04) - Full commands
`sudo sh setup_env.sh`
```bash=
# install Nvidia runtime ( CUDA Toolkit )
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/7fa2af80.pub
sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/ /"
sudo apt-get update
sudo apt-get -y install cuda
# install docker
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl software-properties-common
# Make the system trust the Docker repository
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo apt-key fingerprint 0EBFCD88
sudo add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable"
sudo apt-get update
sudo apt-get install -y docker-ce
# install nvidia-docker (nvidia-container-toolkit)
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
```
```bash=
# Add the current user to the docker group
sudo usermod -aG docker $USER
# Log out so that the docker group membership takes effect on the next login
logout
```
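
After logging back in, it can help to confirm that every tool this note relies on is on the `PATH`. The following is a sketch; the function name `missing_tools` is hypothetical, and the tool list in the usage comment is simply the set of commands used in this note.

```bash
# Hypothetical helper: print each named command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Usage: an empty result means everything was found.
#   missing=$(missing_tools nvidia-smi nvcc docker pbrun)
#   [ -z "$missing" ] && echo "environment looks complete" || echo "missing: $missing"
```

This only checks that the commands exist. It does not verify that the driver, Docker daemon, or Parabricks license actually work.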