AWS / EC2 (VM) / Parabricks Runtime Environment Setup (Using a Docker Container)
===
###### tags: `Parabricks`
###### tags: `Genomics`, `NVIDIA`, `Clara`, `Parabricks`, `Secondary Analysis`, `Azure`
<br>
[TOC]
<br>
## [Flexera License, Docker Container](https://docs.nvidia.com/clara/parabricks/3.7.0/GettingStarted/FlexeraDocker.html#software-installation)
### [Software Installation - v3.7.0](https://docs.nvidia.com/clara/parabricks/3.7.0/GettingStarted/FlexeraDocker.html#software-installation)
```bash
# Unzip the downloaded file. This extracts parabricks.tar.gz from files.zip
$ unzip files.zip
# Extract the installation package.
$ tar -xzf parabricks.tar.gz
# Install the software.
$ sudo ./parabricks/installer.py --flexera-server [hostname]:7070
# Verify your installation.
# This should display the parabricks version number:
$ pbrun version
```
<br>
<hr>
<br>
## Parabricks Runtime Environment Setup
:::info
:bulb: **Follow the note below for the installation**
[[HackMD] Azure / VM / Parabricks Runtime Environment Setup](/52Xj-oMnRXKMjP3JOZ__Yg)
:::
<br>
<hr>
<br>
## Troubleshooting
> Errors encountered when running `installer.py`; these must be resolved before the installation can complete.
>
### 1. `ModuleNotFoundError: No module named 'distutils.dir_util'`
```
$ sudo ./parabricks/installer.py --flexera-server localhost:7070
Traceback (most recent call last):
File "./parabricks/installer.py", line 12, in <module>
from distutils.dir_util import copy_tree
ModuleNotFoundError: No module named 'distutils.dir_util'
```
- [[Solved][Python] ModuleNotFoundError: No module named 'distutils.util'](https://clay-atlas.com/blog/2021/05/24/python-cn-module-not-found-distutils-pip/)
```bash
$ sudo apt install python3-distutils
```
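
Before re-running the installer, it may help to confirm that the import which failed now succeeds. The following is a minimal sketch, assuming that the `python3` on the PATH is the interpreter `installer.py` runs under (not verified here; check the script's shebang line). The function name `check_distutils` is made up for illustration.

```bash
# Hypothetical helper: succeeds when the import performed by installer.py works.
# Assumption: the interpreter given as $1 (default: python3) is the one the installer uses.
check_distutils() {
    "${1:-python3}" -c 'from distutils.dir_util import copy_tree' 2>/dev/null
}

if check_distutils; then
    echo "distutils.dir_util is importable"
else
    echo "distutils.dir_util is missing; try: sudo apt install python3-distutils"
fi
```
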
<br>
### 2. docker not found (remember to install Docker)
```
...
...
Checking curl installation
Checking docker installation
docker not found. Please check installation of docker.
For technical support, updated user guides and other Parabricks documentation can be found at https://docs.nvidia.com/clara/#parabricks
Answers to most FAQ's can be found on the developer forum https://forums.developer.nvidia.com/c/healthcare/Parabricks/290
Customers with paid Parabricks licenses have direct access to support and can contact EnterpriseSupport@nvidia.com
Users of free evaluation licenses can contact parabricks-eval-support@nvidia.com for troubleshooting any questions.
```
- Verify that the `docker` command works
```
$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
```
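
The installer log above shows checks for both curl and docker, so the same two prerequisites can be tested up front. This is only a sketch of such a pre-check, not what `installer.py` does internally; `require_cmds` is a hypothetical helper name.

```bash
# Hypothetical helper: report every listed command that is not on the PATH.
# Returns non-zero if at least one command is missing.
require_cmds() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}

# The two tools named in the installer output.
if require_cmds curl docker; then
    echo "curl and docker found"
fi
```
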
<br>
### 3. Docker does not have nvidia runtime or native GPU support.
```
...
...
Checking curl installation
Checking docker installation
Docker does not have nvidia runtime or native GPU support. Please either add nvidia runtime to docker, install the nvidia-container-toolkit for docker >= 19.03, or install nvidia-docker. Exiting...
For technical support, updated user guides and other Parabricks documentation can be found at https://docs.nvidia.com/clara/#parabricks
Answers to most FAQ's can be found on the developer forum https://forums.developer.nvidia.com/c/healthcare/Parabricks/290
Customers with paid Parabricks licenses have direct access to support and can contact EnterpriseSupport@nvidia.com
Users of free evaluation licenses can contact parabricks-eval-support@nvidia.com for troubleshooting any questions.
/usr/bin/docker
```
- ### [Install nvidia-container-toolkit](/52Xj-oMnRXKMjP3JOZ__Yg#安裝-nvidia-container-toolkit) (the [newer approach](https://hackmd.io/RTrbMzh1Te2ACGAsXN0PeA#--gpus))
- This still seemed to have **no effect**
- I forgot to retest after logging out or rebooting
- ### [Install nvidia-docker2](/52Xj-oMnRXKMjP3JOZ__Yg#亦可安裝-nvidia-docker2) (the [legacy approach](https://hackmd.io/RTrbMzh1Te2ACGAsXN0PeA#--runtimenvidia))
> This resolves the problem
>
```bash=
# Install nvidia-docker2
sudo apt-get install -y nvidia-docker2
# Reload the Docker daemon
sudo pkill -SIGHUP dockerd
```
- `sudo service docker restart` also works
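
After restarting Docker, one way to see whether an `nvidia` runtime is registered is to inspect `docker info`. The sketch below assumes the runtime appears under the name `nvidia` in the `Runtimes` field; it does not cover the `--gpus` path, which can work without a registered runtime. `has_nvidia_runtime` is a hypothetical helper name.

```bash
# Hypothetical helper: reads runtime information on stdin and succeeds
# if a runtime named "nvidia" is listed.
has_nvidia_runtime() {
    grep -q '"nvidia"'
}

if command -v docker >/dev/null 2>&1 &&
   docker info --format '{{json .Runtimes}}' 2>/dev/null | has_nvidia_runtime; then
    echo "nvidia runtime is registered with Docker"
else
    echo "nvidia runtime not found (or Docker is not reachable)"
fi
```
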
<br>
<hr>
<br>
## [log] `installer.py` Run Output
> For comparison with Parabricks v3.5, see: **[[HackMD][log] installer.py run output](https://hackmd.io/dpOMeTmfQtmh51XCRWE8xw)**
```
$ sudo ./parabricks/installer.py --flexera-server localhost:7070
END USER LICENSE AGREEMENT FOR PARABRICKS SOFTWARE
This end user license agreement, including the exhibit attached ("Agreement”) is a legal agreement between you and
NVIDIA Corporation ("NVIDIA") and governs your use of the NVIDIA Parabricks software and materials (“SOFTWARE”).
If you are entering into this Agreement on behalf of a company or other legal entity, you represent that you have the
legal authority to bind the entity to this Agreement, in which case “you” will mean the entity you represent.
If you don’t have the required authority to accept this Agreement, or if you don’t accept all the terms and conditions
of this Agreement, do not download, install or use the SOFTWARE.
You agree to use the SOFTWARE only for purposes that are permitted by (a) this Agreement, and (b) any applicable law,
regulation or generally accepted practices or guidelines in the relevant jurisdictions.
1. License.
2. Limitations.
3. Ownership.
4. No Warranties.
5. Limitations of Liability.
6. Termination.
7. Data Collection.
8. General.
(v. March 18, 2020)
PARABRICKS SOFTWARE SERVICES SUPPLEMENT
1. Scope.
2. Services.
3. Exclusions.
4. Your Responsibilities.
5. Service Fees; Payment Terms.
6. Definitions.
(v. March 18, 2020)
Online link for EULA
https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/healthcare-parabricks-end-user-license-agreement.pdf
The software can be used only with the above End User License Agreement(EULA) stated above. Do you agree to the EULA?
Type yes or no only: yes
Do you want to create a symlink to /usr/bin/pbrun ?
Type yes or no only: yes
Parabricks installation has two options: i) exclusively for ampere GPUs, ii) exclusively for non-ampere GPUs. Do you
want to install exclusively for ampere GPUs?
Type yes or no only: no
====================================
Installing Parabricks
Final Selection:
Install Directory: /opt/parabricks
Install Version: 3.7.0-1
Install Container Type: docker
Install Architecture: x86_64
Container Path: nvcr.io
====================================
Are the above installation parameters correct?
Type yes or no only: yes
Checking curl installation
Checking docker installation
Checking if image is already present
Downloading image
3.7.0-1: Pulling from nvidia/clara/clara-parabricks
284055322776: Pull complete
add8cbaa9fce: Pull complete
a884b5401cbf: Pull complete
dafa3e5bf39d: Pull complete
be7367c7df12: Pull complete
c3d031c6a256: Pull complete
6890df1c1b43: Pull complete
Digest: sha256:04030b682a3e68a2901569fbc97feefe3f12afb5663c79f025b627682ff1891d
Status: Downloaded newer image for nvcr.io/nvidia/clara/clara-parabricks:3.7.0-1
nvcr.io/nvidia/clara/clara-parabricks:3.7.0-1
Installing image
/workspace/parabricks
Sending build context to Docker daemon 5.632kB
Step 1/23 : FROM nvcr.io/nvidia/clara/clara-parabricks:3.7.0-1
Step 2/23 : RUN rm -rf /var/lib/apt/lists/*
Step 3/23 : RUN apt update
Step 4/23 : RUN DEBIAN_FRONTEND=noninteractive apt install --no-install-recommends -y libssl-dev libcurl4-openssl-dev python3.7 default-jre libboost-iostreams-dev libboost-iostreams-dev libboost-filesystem-dev libboost-system-dev libboost-program-options-dev python3-pip python3-setuptools python3.7-dev gcc g++ perl cmake automake tabix samtools bcftools wget libjpeg-dev libpython3.7-dev libbz2-dev liblzma-dev unzip
Step 5/23 : RUN update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.6 1
Step 6/23 : RUN update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.7 2
Step 7/23 : RUN pip3 install wheel Cython
Step 8/23 : RUN pip3 install matplotlib
Step 9/23 : RUN pip3 install pysam
Step 10/23 : RUN pip3 install requests
Step 11/23 : RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
Step 12/23 : RUN bash Miniconda3-latest-Linux-x86_64.sh -b -p /opt/miniconda
Step 13/23 : ENV PATH $PATH:/opt/miniconda/condabin
Step 14/23 : RUN conda config --remove channels defaults
Step 15/23 : RUN eval "$(/opt/miniconda/bin/conda shell.bash hook)" && conda create -y --name manta_strelka
Step 16/23 : COPY installMantaAndStrelka.sh /parabricks
Step 17/23 : RUN /parabricks/installMantaAndStrelka.sh
Step 18/23 : RUN cp /parabricks/starfusion_req/TiedHash.pm /etc/perl/
Step 19/23 : RUN wget https://github.com/brentp/gsort/releases/download/v0.1.4/gsort_linux_amd64 -O /usr/bin/gsort && chmod +x
Step 20/23 : RUN wget https://github.com/arq5x/lumpy-sv/releases/download/0.3.0/lumpy-sv.tar.gz && tar -xvzf lumpy-sv.tar.gz && cd lumpy-sv && make && cd ../ && cp lumpy-sv/bin/* /usr/bin/
Step 21/23 : RUN wget https://github.com/wwylab/MuSE/releases/download/v2.0/MuSE-2.0.ubuntu1804_x86_64.tar.gz && tar -xvzf MuSE-2.0.ubuntu1804_x86_64.tar.gz && chmod +x MuSE && mv MuSE /usr/bin
Step 22/23 : RUN wget https://github.com/fulcrumgenomics/fgbio/releases/download/1.4.0/fgbio-1.4.0.jar && mv fgbio-1.4.0.jar /parabricks/resources/
Step 23/23 : RUN wget https://github.com/broadinstitute/gatk/releases/download/4.2.4.0/gatk-4.2.4.0.zip && unzip -j gatk-4.2.4.0.zip gatk-4.2.4.0/gatk-package-4.2.4.0-local.jar && cp gatk-package-4.2.4.0-local.jar /usr/local/cuda/.pb/binaries/bin/
Successfully built 98da555b40b8
Successfully tagged parabricks/release:3.7.0-1
Image Installation successful.
Copying Scripts
Installation successful
```
- The EULA body is omitted; only its section headings are kept
- The docker build details are omitted; only the `Step X/23` lines are kept
<br>
<hr>
<br>
## Inspect the Installation Directory & Configuration
```bash=
ubuntu@ip-172-31-31-33:/$ tree /opt/parabricks/
/opt/parabricks/
├── BRANCHES.txt
├── Dockerfile
├── README.md
├── __pycache__
│ ├── bed_file_creator.cpython-36.pyc
│ ├── common_err_mesg.cpython-36.pyc
│ ├── pb_compose.cpython-36.pyc
│ ├── pbargs.cpython-36.pyc
│ ├── pbargs_check.cpython-36.pyc
│ ├── pbargs_pipeline_check.cpython-36.pyc
│ ├── pbmaster.cpython-36.pyc
│ ├── pbutils.cpython-36.pyc
│ ├── pbversion.cpython-36.pyc
│ └── toolversion.cpython-36.pyc
├── bed_file_creator.py
├── common_err_mesg.py
├── config.txt
├── docs
│ └── EULA.txt
├── gatkversion.py
├── installMantaAndStrelka.sh
├── pb_compose.py
├── pbargs.py
├── pbargs_check.py
├── pbargs_pipeline_check.py
├── pbmaster.py
├── pbrun
├── pbutils.py
├── pbversion.py
└── toolversion.py
2 directories, 28 files
```
```bash=
ubuntu@ip-172-31-31-33:/$ cat /opt/parabricks/config.txt
nvidia-docker
x86_64
flexera-server:localhost:7070 <---
```
Keyword search:
![](https://i.imgur.com/Ol8Z9C0.png)
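
To read the license server address back out of `config.txt`, a small `sed` filter is enough. This is a sketch assuming the line has the form `flexera-server:<host>:<port>`, as in the listing above; `flexera_server` is a hypothetical helper name.

```bash
# Hypothetical helper: print the host:port recorded on the flexera-server line.
# Assumption: the line looks like "flexera-server:<host>:<port>".
flexera_server() {
    sed -n 's/^flexera-server:\([^[:space:]]*\).*$/\1/p' "$1"
}

config=/opt/parabricks/config.txt
if [ -r "$config" ]; then
    echo "license server: $(flexera_server "$config")"
fi
```
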
<br>
<hr>
<br>
## Quick Test
### 1. [Download the sample](https://docs.nvidia.com/clara/parabricks/v3.5/text/getting_started.html#step-4-example-run)
```
$ wget -O parabricks_sample.tar.gz "https://s3.amazonaws.com/parabricks.sample/parabricks_sample.tar.gz"
```
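
The archive has to be extracted (for example with `tar -xzf parabricks_sample.tar.gz`) before the pipeline can read it; the paths in the run log suggest the files end up under `parabricks_sample/`. Since the `pbrun germline` command in the next step uses relative `Ref/` and `Data/` paths, a quick existence check from that directory can catch a wrong working directory early. This is a sketch only; `check_inputs` is a hypothetical helper name.

```bash
# Hypothetical helper: list every expected input file that is missing.
# Returns non-zero if any file is absent.
check_inputs() {
    rc=0
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing input: $f"
            rc=1
        fi
    done
    return "$rc"
}

# Paths taken from the pbrun germline command in the next step;
# run this from the directory that contains Ref/ and Data/.
if check_inputs \
    Ref/Homo_sapiens_assembly38.fasta \
    Ref/Homo_sapiens_assembly38.known_indels.vcf.gz \
    Data/sample_1.fq.gz \
    Data/sample_2.fq.gz
then
    echo "all inputs present"
fi
```
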
<br>
### 2. [Run pbrun germline](https://docs.nvidia.com/clara/parabricks/v3.5/text/germline_pipeline.html)
```bash=
pbrun germline \
--ref Ref/Homo_sapiens_assembly38.fasta \
--in-fq Data/sample_1.fq.gz Data/sample_2.fq.gz \
--knownSites Ref/Homo_sapiens_assembly38.known_indels.vcf.gz \
--out-bam output.bam \
--out-variants output.vcf \
--out-recal-file report.txt
```
```
Please visit https://docs.nvidia.com/clara/#parabricks for detailed documentation
[Parabricks Options Mesg]: Automatically generating ID prefix
[Parabricks Options Mesg]: Read group created for /workspace/parabricks_sample/Data/sample_1.fq.gz and
/workspace/parabricks_sample/Data/sample_2.fq.gz
[Parabricks Options Mesg]: @RG\tID:HK3TJBCX2.1\tLB:lib1\tPL:bar\tSM:sample\tPU:HK3TJBCX2.1
[Parabricks Options Mesg]: Checking argument compatibility
[Parabricks Options Mesg]: Read group created for /workspace/parabricks_sample/Data/sample_1.fq.gz and
/workspace/parabricks_sample/Data/sample_2.fq.gz
[Parabricks Options Mesg]: @RG\tID:HK3TJBCX2.1\tLB:lib1\tPL:bar\tSM:sample\tPU:HK3TJBCX2.1
[PB Info 2022-Apr-12 04:50:14] Logger not initialized!
[PB Info 2022-Apr-12 04:50:14] ------------------------------------------------------------------------------
[PB Info 2022-Apr-12 04:50:14] || Parabricks accelerated Genomics Pipeline ||
[PB Info 2022-Apr-12 04:50:14] || Version 3.7.0-1 ||
[PB Info 2022-Apr-12 04:50:14] || GPU-BWA mem, Sorting Phase-I ||
[PB Info 2022-Apr-12 04:50:14] || Contact: Parabricks-Support@nvidia.com ||
[PB Info 2022-Apr-12 04:50:14] ------------------------------------------------------------------------------
[PB Warning 2022-Apr-12 04:50:15][FlexeraClient.cpp:606] Error: failed server communication: err 0x74000008 sys 0x7: [1,7df,3,0[74000008,7,110001d0]] Generic communications error.
[1,7df,3,0[75000001,7,300101b5]] General data transfer failure. Couldn't connect to server
For technical support visit https://docs.nvidia.com/clara/parabricks/3.7.0/index.html#how-to-get-help
Exiting...
[PB Warning 2022-Apr-12 04:50:15][FlexeraClient.cpp:606] Error: failed server communication: err 0x74000008 sys 0x7: [1,7df,3,0[74000008,7,110001d0]] Generic communications error.
[1,7df,3,0[75000001,7,300101b5]] General data transfer failure. Couldn't connect to server
Could not run fq2bam as part of germline pipeline
Exiting pbrun ...
```
- Error message
```
[FlexeraClient.cpp:606] Error:
failed server communication: err 0x74000008 sys 0x7
```
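
Because the message ends in "Couldn't connect to server", a first step could be to check whether anything is listening on the address given to `--flexera-server`. The sketch below assumes `bash` and coreutils `timeout` are installed, and `port_open` is a hypothetical helper name. It only tests from the host shell; whether the same address is reachable from inside the Parabricks container is a separate question that this check does not answer.

```bash
# Hypothetical helper: succeed if a TCP connection to host ($1) and port ($2) opens.
# Assumptions: bash (for /dev/tcp) and coreutils `timeout` are available.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# localhost:7070 is the value passed to --flexera-server in this note.
if port_open localhost 7070; then
    echo "license server port is reachable"
else
    echo "cannot connect to localhost:7070"
fi
```
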
<br>
### If `license.bin` is placed under `/opt/parabricks`
```bash=
$ pbrun germline \
> --ref Ref/Homo_sapiens_assembly38.fasta \
> --in-fq Data/sample_1.fq.gz Data/sample_2.fq.gz \
> --knownSites Ref/Homo_sapiens_assembly38.known_indels.vcf.gz \
> --out-bam output.bam \
> --out-variants output.vcf \
> --out-recal-file report.txt
Please visit https://docs.nvidia.com/clara/#parabricks for detailed documentation
[Parabricks Options Mesg]: Automatically generating ID prefix
[Parabricks Options Mesg]: Read group created for /workspace/parabricks_sample/Data/sample_1.fq.gz and
/workspace/parabricks_sample/Data/sample_2.fq.gz
[Parabricks Options Mesg]: @RG\tID:HK3TJBCX2.1\tLB:lib1\tPL:bar\tSM:sample\tPU:HK3TJBCX2.1
[Parabricks Options Mesg]: Checking argument compatibility
[Parabricks Options Mesg]: Read group created for /workspace/parabricks_sample/Data/sample_1.fq.gz and
/workspace/parabricks_sample/Data/sample_2.fq.gz
[Parabricks Options Mesg]: @RG\tID:HK3TJBCX2.1\tLB:lib1\tPL:bar\tSM:sample\tPU:HK3TJBCX2.1
Cannot use separate license.bin in conjunction with flexera licensing. Either remove the separate license.bin or remove the flexera server line from the installation config.txt
For technical support visit https://docs.nvidia.com/clara/parabricks/3.7.0/index.html#how-to-get-help
Exiting...
Could not run fq2bam as part of germline pipeline
Exiting pbrun ...
```
- Error message
> Cannot use separate license.bin in conjunction with flexera licensing. Either remove the separate license.bin or remove the flexera server line from the installation config.txt
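
The conflict described in that message can be detected before launching a run. This is a minimal sketch, assuming the install directory is `/opt/parabricks` and that the Flexera entry in `config.txt` starts with `flexera-server:` as shown earlier; `license_conflict` is a hypothetical helper name.

```bash
# Hypothetical helper: succeed (exit 0) when the directory $1 contains BOTH
# a license.bin file and a flexera-server line in config.txt.
license_conflict() {
    [ -f "$1/license.bin" ] && grep -q '^flexera-server:' "$1/config.txt" 2>/dev/null
}

if license_conflict /opt/parabricks; then
    echo "conflict: remove license.bin or the flexera-server line in config.txt"
else
    echo "no license.bin / flexera-server conflict detected"
fi
```
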