Parabricks Software Download & Installation Guide
===
###### tags: `Parabricks-v3.5`
###### tags: `Genomics`, `NVIDIA`, `Clara`, `Parabricks`, `Secondary Analysis`
> In this note, "Parabricks" refers to the Parabricks Pipeline tool
<br>
[TOC]
<br>
:::warning
:information_source: **[HACKMD] Legacy information for Parabricks 2.5 & 3.1**
- [NVIDIA / Parabricks / Software Download and Installation (v1)](/jbvnEl5YRwWBa3QMGBcHbQ)
:::
<br>
## [Official documentation (v3.5)](https://docs.nvidia.com/clara/parabricks/v3.5/index.html)
:::warning
:information_source: **Entry point**
https://docs.nvidia.com/clara/#parabricks

:::
- [[Overview][PDF] NVIDIA CLARA PARABRICKS PIPELINES FOR GENOMICS ANALYSIS](https://resources.nvidia.com/c/healthcare-genomics-?x=sFVHf4&lx=OhKlSJ&topic=Solution%20Brief)
![](https://i.imgur.com/W96hKlW.png)
![](https://i.imgur.com/lxQsSBs.png)
<br>
<hr>
<br>
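## Downloading the Parabricks installation package
### Step1 - Fill out the application form
- ### https://www.nvidia.com/en-us/docs/nvidia-parabricks-general/

- ### If you apply with a Gmail account, the reply email may end up in the Promotions tab

- ### The license key becomes available within the next 72 hours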

<br>
### Step2 - Once approved, register an NVIDIA account

- → [**SIGN IN TO NVIDIA GPU CLOUD**](http://tmailclick.nvidia.com/track/click/30646027/ngc.nvidia.com?p=eyJzIjoiemk4a0ZzYjBSZDBoR3FkVzRRSVVDY3M0cUdBIiwidiI6MSwicCI6IntcInVcIjozMDY0NjAyNyxcInZcIjoxLFwidXJsXCI6XCJodHRwczpcXFwvXFxcL25nYy5udmlkaWEuY29tXFxcL3NpZ25pblwiLFwiaWRcIjpcIjY3NTExNDU4ZGZjNTQyZDViMDlhMzhmOGY5MDMzNDllXCIsXCJ1cmxfaWRzXCI6W1wiOTMyMjA0ZTc3NjA5ZDY4MzNkN2JjYzQ5MTI4YzI4OTdlMWU2OGFiNFwiXX0ifQ)
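- You must register the NVIDIA account with the same email used for the Step1 application; otherwise the resource is not visible.
In other words, the `external-parabricks-trial-users` permission is bound to that email
- With the permission, the `parabricks_free_trial` resource is visible after registration

- Without the permission, the resource is not visible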

<br>
### Step3 - Download the license key
![](https://i.imgur.com/dwhlOde.png)
- ### [Here](https://ngc.nvidia.com/resources/external-parabricks-trial-users:parabricks_free_trial) is a link to the trial license resource.
- https://ngc.nvidia.com/resources/external-parabricks-trial-users:parabricks_free_trial

This page is where the license key is downloaded
- ### [Here](https://forums.developer.nvidia.com/uploads/short-url/5kvnj7M9EiMH70vInNOHsj3GOiM.pdf) are the download instructions.
- https://forums.developer.nvidia.com/uploads/short-url/5kvnj7M9EiMH70vInNOHsj3GOiM.pdf
This link is a PDF tutorial that explains how to download the license key
- ### There are 3 download methods (see the PDF)
    1. Click the download link directly on the "resource list" entry page ==(does not work)==

    2. Open the "parabricks_free_trial resource" page and download from the File Browser

        - https://ngc.nvidia.com/resources/external-parabricks-trial-users:parabricks_free_trial/files?version=3.7.0-1_2022-05-31#
        - https://ngc.nvidia.com/resources/external-parabricks-trial-users:parabricks_free_trial/files?version=3.7.0-1_2022-08-31
    3. Open the "parabricks_free_trial resource" page and run the CLI command ==(does not work)==
        1. Install the CLI beforehand
        2. Obtain an API key
        3. Run the ngc command (CLI command)

        Note: the CLI command is mainly intended for use in a terminal
- ### The tutorial document can also be downloaded from the forum
https://forums.developer.nvidia.com/t/how-to-download-clara-parabricks-pipelines-trial-version/165217

- ### What the downloaded license key package looks like
```
parabricks.tar.gz
├── [ 793] Dockerfile_post
├── [2.4K] Dockerfile_post_extra_tools
├── [ 32K] EULA.txt
├── [ 24K] installer.py <--- pbrun installer
├── [ 697] installMantaAndStrelka.sh
├── [ 181] license.bin <--- license key
├── [ 698] singularity_post
└── [1.6K] singularity_post_extra_tools
```
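
Before installing, it can help to confirm that the downloaded archive really contains the two files highlighted in the listing above. The following is a minimal sketch using only Python's standard `tarfile` module; the helper name `missing_files` and the choice of `installer.py` and `license.bin` as the required set are assumptions made for illustration, not part of the Parabricks tooling.

```python
import tarfile

# Files highlighted in the listing above; treating them as the required
# minimum is an assumption of this sketch.
REQUIRED = ("installer.py", "license.bin")


def missing_files(archive_path, required=REQUIRED):
    """Return the required file names that are absent from the tar archive."""
    with tarfile.open(archive_path, "r:*") as tar:
        # Compare base names so a leading directory such as "parabricks/"
        # inside the archive does not affect the check.
        present = {name.rsplit("/", 1)[-1] for name in tar.getnames()}
    return sorted(set(required) - present)
```

Calling `missing_files("parabricks.tar.gz")` returns an empty list when both files are present, and the names of any missing files otherwise.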
<br>
<hr>
<br>
## [Supplementary notes] Downloading the license key
### Via the CLI command
:::info
:bulb: About Ubuntu 20.04 (added 2022/03/04)
- [Account > Setup](https://ngc.nvidia.com/setup)
- [CLI](https://ngc.nvidia.com/setup/installers/cli)
- AMD64 Linux: the executable runs
```
$ ngc
usage: ngc [--debug] [--format_type <fmt>] [--version] [-h] {config,diag,registry} ...
ngc: error: too few arguments
```
- ARM64 Linux: the executable does not run
```
$ ngc
bash: ./ngc: cannot execute binary file: Exec format error
```
- [Generate API Key](https://ngc.nvidia.com/setup/api-key)
:::
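
The `Exec format error` above is what a binary built for a different CPU architecture produces. A small sketch like the one below can suggest which installer label to pick before downloading. It relies on the standard `platform.machine()` call; the mapping table and the function name `pick_installer` are illustrative assumptions, and the labels simply mirror the two options named in the note.

```python
import platform

# Mapping from common platform.machine() values to the installer labels
# named in the note above. The set of keys is an assumption of this sketch.
INSTALLER_BY_MACHINE = {
    "x86_64": "AMD64 Linux",
    "amd64": "AMD64 Linux",
    "aarch64": "ARM64 Linux",
    "arm64": "ARM64 Linux",
}


def pick_installer(machine=None):
    """Return the matching installer label, or None for an unknown architecture."""
    machine = (machine or platform.machine()).lower()
    return INSTALLER_BY_MACHINE.get(machine)
```

With no argument, `pick_installer()` inspects the current host; passing a string such as `"aarch64"` checks a specific value.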
- 2021/06/04 - Commands used (did not succeed)
```
$ ngc config set
$ docker login nvcr.io
$ ngc registry resource download-version \
"external-parabricks-trial-users/parabricks_free_trial:v3.5.0_2021-07-31"
```
![](https://i.imgur.com/t7LeNtm.png)
![](https://i.imgur.com/fhnZsle.png)
- 2022/03/04 - Succeeded; this works
```
$ ngc config set
Enter API key [no-apikey]. Choices: [<VALID_APIKEY>, 'no-apikey']: N29******************************************************************************DBk
Enter CLI output format type [ascii]. Choices: [ascii, csv, json]: json
Enter org [no-org]. Choices: ['ea-nvidia-clara-train', 'external-parabricks-trial-users']: 'external-parabricks-trial-users'
{
"error": "Invalid org. Please re-enter."
}
Enter org [no-org]. Choices: ['ea-nvidia-clara-train', 'external-parabricks-trial-users']: external-parabricks-trial-users
Enter team [no-team]. Choices: ['no-team']:
Enter ace [no-ace]. Choices: ['no-ace']:
```
```bash
$ docker login nvcr.io
Username: $oauthtoken
Password:
WARNING! Your password will be stored unencrypted in /home/tj/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store
Login Succeeded
```
```
$ ngc registry resource download-version "external-parabricks-trial-users/parabricks_free_trial:3.7.0-1_2022-05-31"
{
"download_end": "2022-03-04 13:01:21.155012",
"download_start": "2022-03-04 13:01:17.149735",
"download_time": "4s",
"files_downloaded": 1,
"local_path": "/home/tj/Downloads/parabricks_free_trial_v3.7.0-1_2022-05-31",
"size_downloaded": "20.16 KB",
"status": "Completed", <-----
"transfer_id": "parabricks_free_trial_v3.7.0-1_2022-05-31"
}
```
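
When the CLI output format is set to `json` (as in the `ngc config set` session above), the download result can be checked programmatically instead of by eye. This is a minimal sketch that parses text shaped like the output shown above; it does not call `ngc` itself, and the function name `download_completed` is a hypothetical helper. The sample values are copied from the log above.

```python
import json


def download_completed(result_text):
    """Parse the JSON result and return (is_completed, local_path)."""
    result = json.loads(result_text)
    return result.get("status") == "Completed", result.get("local_path")


# Sample result with field values taken from the log above.
sample = """
{
  "download_time": "4s",
  "files_downloaded": 1,
  "local_path": "/home/tj/Downloads/parabricks_free_trial_v3.7.0-1_2022-05-31",
  "size_downloaded": "20.16 KB",
  "status": "Completed",
  "transfer_id": "parabricks_free_trial_v3.7.0-1_2022-05-31"
}
"""

ok, path = download_completed(sample)
print(ok, path)
```

The text passed in must be pure JSON: the command line and any hand-written annotations (such as the arrow in the log above) have to be stripped first.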
<br>
<hr>
<br>
## Installation guide
:::warning
:bulb: Understand the steps first, then run them on the machine
:::
### [Step1 - Prerequisites: check the runtime environment](https://docs.nvidia.com/clara/parabricks/v3.5/text/getting_started.html#step-1-make-sure-installation-requirements-are-met)
- `>=` cuda-10.0
- nvidia-docker
- python3
- curl
- Each GPU needs at least 12GB of memory
- GPU configuration
| GPU num | CPU RAM | CPU threads|
| ------- | ------- | ---------- |
| 1 | 50GB ? | 16 |
| 2 | 100GB | 24 |
| 4 | 196GB | 32 |
| 8 | 392GB | 48 |
- [Via pbrun parameters](https://docs.nvidia.com/clara/parabricks/v3.5/text/germline_pipeline.html)
- ```--num-gpus NUM_GPUS```
Number of GPUs to use for a run. GPUs 0..(NUM_GPUS-1) will be used. If you are using flexera, please include `--gpu-devices` too.
- A single GPU needs at least 16 threads
```
WARNING
The system has 12 threads, however recommended number of threads with 1 GPU is 16.
The run might not finish or might have less than expected performance.
```
- A single CPU can also complete Germline
| Time | Function |
| ---- | -------- |
| 5m 50s | GPU-BWA mem, Sorting Phase-I |
| 0m 10s | Sorting Phase-II |
| 2m 02s | Marking Duplicates, BQSR |
| 6m 45s | GPU-GATK4 HaplotypeCaller |
| **14m 47s** | **Total** |
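
The GPU configuration table and the thread warning above can be turned into a quick pre-flight check. The sketch below copies the table values into a dictionary and compares the recommended thread count with what the host reports through `os.cpu_count()`. The function name `check_threads` is a hypothetical helper, and the 1-GPU RAM figure is carried over with the same uncertainty the table marks with "?".

```python
import os

# GPU count -> (CPU RAM in GB, CPU threads), copied from the table above.
# The 50 GB figure for 1 GPU is marked as uncertain ("50GB ?") in the notes.
RECOMMENDED = {1: (50, 16), 2: (100, 24), 4: (196, 32), 8: (392, 48)}


def check_threads(num_gpus, available_threads=None):
    """Return (is_enough, recommended_threads) for the given GPU count."""
    if num_gpus not in RECOMMENDED:
        raise ValueError(f"no recommendation listed for {num_gpus} GPU(s)")
    if available_threads is None:
        # os.cpu_count() may return None; treat that as zero threads known.
        available_threads = os.cpu_count() or 0
    _, needed = RECOMMENDED[num_gpus]
    return available_threads >= needed, needed
```

For the 12-thread machine in the warning above, `check_threads(1, 12)` reports that the recommendation of 16 threads is not met, which matches the message printed by the tool.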
<br>
### [Step2 - Download the parabricks package](https://docs.nvidia.com/clara/parabricks/v3.5/text/getting_started.html#step-2-downloading-installation-package)
- This is the `parabricks.tar.gz` archive
<br>
### [Step3 - Install the parabricks package](https://docs.nvidia.com/clara/parabricks/v3.5/text/getting_started.html#step-3-install-the-parabricks-suite)
<br>
### [Step4 - Download the test sample and run the tool](https://docs.nvidia.com/clara/parabricks/v3.5/text/getting_started.html#step-4-example-run)
<br>
<hr>
<br>
## Key information
- [What is CLARA PARABRICKS Pipelines?](https://docs.nvidia.com/clara/parabricks/v3.5/text/software_overview.html#what-is-clara-parabricks-pipelines)
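- Parabricks is a software suite that performs secondary analysis
- Main advantages: very high speed and low cost
- For 30x WGS data, a conventional pipeline takes about 30 hours; Parabricks needs only 45 minutes
- [SOFTWARE OVERVIEW](https://docs.nvidia.com/clara/parabricks/v3.5/text/software_overview.html)
- NEW
![](https://i.imgur.com/2md6GPy.png)
- OLD
![](https://i.imgur.com/AyMBNKQ.png)
- Differences
1. The old [RNA] alignment was merged into [ALIGNMENT]
2. In [PREPROCESSING], BWA Mem Alignment was replaced with [MerBam]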
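
The speed claim above (about 30 hours conventionally versus 45 minutes with Parabricks for 30x WGS data) corresponds to a 40x speedup. A tiny sketch of that arithmetic, with `speedup` as a hypothetical helper name:

```python
def speedup(baseline_minutes, accelerated_minutes):
    """Return how many times faster the accelerated run is than the baseline."""
    return baseline_minutes / accelerated_minutes


# 30 hours expressed in minutes, compared with 45 minutes.
print(speedup(30 * 60, 45))  # 40.0
```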
<br>
<hr>
<br>
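## What's new
### The documentation was completely overhauled
- Each tool version now maps to its own set of documentation
- The pipeline diagrams were also redrawn
### Pipeline
- Diagram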
[PDF](https://cf-store.widencdn.net/nvdam/e/d/8/ed89685e-4e84-449f-bd83-bc0ff92e4180.pdf?response-content-disposition=attachment%3B%20filename%3D%22Parabricks%20Pipelines%20Updated%20Data%20Sheet.pdf%22&Expires=1625115735&Signature=JhVZTu8xLcSIBAAyLPkPreDlSLZ4X00NUew70iyL3J3EHp0-~K6yf3rHXz4FZPsBCt5Il05NySRxDpfIPJvW9OAx0YJHr7y9gEwVWD~D0ZrJwmchufm2Ade5vgqA2OgHwBtn3xQQ0INgE2fLdCpZFW3WOeY4K7ZnPveGMAuICDtRva4f2S7ukcxied0ZYzc3KiiNUKvbS3MnZzJ65dsrfvWXeTMbEGYBirasuAz1JKB2KDUlpR2Rx7ky7rckBDBhZtY8A8awqCaMue3CGMDrZa5dDbIgK~apRYdr4Y9x8Z~U-T80fwfls1JQkAwfDRV4~oU4sbupXI~3G0z7wMjlrg__&Key-Pair-Id=APKAJD5XONOBVWWOA65A) ([source](https://resources.nvidia.com/c/healthcare-genomics-?x=sFVHf4&lx=OhKlSJ&topic=Solution%20Brief))
- Added [DEEPVARIANT GERMLINE PIPELINE](https://docs.nvidia.com/clara/parabricks/v3.5/text/deep_pipeline.html#rst-deep)
- However, no pipeline command is provided
- Previously it was assembled from two standalone tools, fq2bam + deepvariant
- **GERMLINE** vs. **DEEPVARIANT GERMLINE**
- GERMLINE
![](https://i.imgur.com/fQucTgf.png)
- DEEPVARIANT GERMLINE
![](https://i.imgur.com/iftYYS0.png)
- Added [RNA PIPELINE](https://docs.nvidia.com/clara/parabricks/v3.5/text/rna_pipeline.html#rst-rna)
- **GERMLINE** vs. **RNAseq**
- GERMLINE
![](https://i.imgur.com/fQucTgf.png)
- RNAseq
![](https://i.imgur.com/aAThMud.png)
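- References
- [BWA or STAR for RNAseq?](https://www.biostars.org/p/330942/)
> Q: Why not just use BWA instead of mapping to the reference with STAR?
> A: BWA is not splice-aware, so it only fits cases such as bacteria, which have no introns
- [比对软件STAR的简单使用](https://www.bioinfo-scrounger.com/archives/288/) (A simple guide to using the STAR aligner)
- [【生信小课堂】如何直观的理解RNA-seq分析流程](https://zhuanlan.zhihu.com/p/139773946) (An intuitive look at the RNA-seq analysis workflow) :+1:
- [生物信息学流程:mRNA Analysis Pipeline](https://www.jianshu.com/p/b0d3a24e349e) (Bioinformatics workflow: mRNA Analysis Pipeline)
- [最新版针对RNA-seq数据的GATK找变异流程](https://cloud.tencent.com/developer/article/1536221) (The latest GATK variant-calling workflow for RNA-seq data)
- [STAR-Fusion 寻找融合基因](https://www.sohu.com/a/396207045_652735) (Finding fusion genes with STAR-Fusion)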
<br>
## References
- [TWCC Parabricks Quickstart & Tutorial](https://hackmd.io/@kmo/twcc_hpc_parabricks)