NVIDIA / Parabricks
===
###### tags: `Parabricks-v3.1`
###### tags: `Genomics`, `NVIDIA`, `Clara`, `Parabricks`, `Secondary Analysis`
<br>
[TOC]
<br>
## ==[Parabricks Software](https://www.nvidia.com/zh-tw/healthcare/clara-parabricks/)==
> - Intro
> - https://www.nvidia.com/zh-tw/healthcare/clara-parabricks/
> - https://www.nvidia.com/en-us/healthcare/clara-parabricks/
> - For developers
> - https://developer.nvidia.com/clara-parabricks
> - https://www.nvidia.com/en-us/docs/parabricks/quickstart-guide/
- ### What it is
- A computational framework
supporting genomics applications from DNA to RNA
- Put another way:
[a genome analysis toolkit, built on the Genome Analysis Toolkit (GATK)](https://udn.com/news/story/7086/4431596)
- ### What it contains
- [A complete portfolio of off-the-shelf solutions](https://developer.nvidia.com/clara-parabricks)
(GPU-accelerated GATK)
- A toolkit
- GPU-accelerated libraries
- Medaka, Racon, Raven, Reticulatus, and Unicycler
- ### Purpose
- Meet the computing needs of genomics labs
- Benefits delivered by GPUs: see the [product page](https://www.nvidia.com/zh-tw/healthcare/clara-parabricks/)
- ### [Pricing](https://www.nvidia.com/zh-tw/healthcare/clara-parabricks/)
- [One-month free trial](https://www.nvidia.com/en-us/docs/nvidia-parabricks-general/)

- [Request a quote](https://www.nvidia.com/en-us/docs/nvidia-parabricks-buy-now/)
- ### Pipeline adopters

- ### Video introductions
- [NVIDIA Parabricks: GPU-Accelerated GATK](https://www.youtube.com/watch?v=r5iWLqguRLk) (length: 5m3s)
- [Accelerating genomics analysis with NVIDIA Parabricks and Clara Genomics](https://www.bilibili.com/video/BV1Cf4y117up)
- ### Developer documentation
- [NVIDIA Clara Parabricks](https://developer.nvidia.com/clara-parabricks) (entry point)
- Clara Parabricks Toolkit
- [github](https://github.com/clara-parabricks)
- Clara Parabricks Pipelines
- [NVIDIA Clara Parabricks Pipelines 3.0.0](https://www.nvidia.com/en-us/docs/parabricks/quickstart-guide/)
- [[PDF] Parabricks Product Sheet](https://resources.nvidia.com/c/healthcare-genomics-?x=sFVHf4&lx=OhKlSJ&topic=Solution%20Brief)
- [NVIDIA Parabricks forum](http://go.nvidianews.com/Bv060O0NM000dnvF0gZ6a9E)
<br>
<hr>
<br>
## ==Clara Parabricks Toolkit==
### [System architecture diagram (nvidia-docker)](https://github.com/NVIDIA/nvidia-docker)
![](https://i.imgur.com/VqSTdO1.png)
<br>
### [Parabricks toolkit architecture diagram](https://developer.nvidia.com/clara-parabricks)

- ### Libraries
- CUDA Mapper
- CUDA Aligner
- CUDA POA
- POA: [Partial Order Alignment](https://simpsonlab.github.io/2015/05/01/understanding-poa/)

- ### Example applications provided by NVIDIA
- Atac-Seq Deep Learning Denoising (Github: [AtacWorks](https://github.com/clara-parabricks/AtacWorks))
> a deep learning application to improve coverage track denoising and peak calling from low-coverage or low-quality ATAC-Seq data. Outputs are in the standard file format.
- RNA-Seq Analytics (Github: [rapids-single-cell-examples](https://github.com/clara-parabricks/rapids-single-cell-examples))
> an interactive notebook for single cell RNA-Seq data.
- DL Variant Caller (Github: [DL4VC](https://github.com/clara-parabricks/DL4VC))
> a deep learning based variant caller that outputs in the standard file format.
- Long Read Mapping
> based on the minimap2 software, this application maps long sequencing read data and outputs in the standard file format.
- ### Third-party applications
<br>
## ==Clara Parabricks Pipelines 3.0.0==
> GPU-accelerated version of GATK:
### [Germline Pipeline](https://www.nvidia.com/en-us/docs/parabricks/germline/)

- Pipeline command
```
$ pbrun germline --ref Homo_sapiens_assembly38.fasta ...
```
- Equivalent to running fq2bam + applybqsr + haplotypecaller in sequence
- Standalone tools
- [fq2bam](https://www.nvidia.com/en-us/docs/parabricks/fastq-and-bam-processing/fq2bam/)
```
$ pbrun fq2bam --ref Ref/Homo_sapiens_assembly38.fasta \
--in-fq S1_1.fastq.gz S1_2.fastq.gz \
--knownSites Ref/Homo_sapiens_assembly38.known_indels.vcf.gz \
--out-bam mark_dups_gpu.bam \
--out-recal-file recal_gpu.txt \
--tmp-dir /raid/myrun
```
- [bqsr](https://www.nvidia.com/en-us/docs/parabricks/fastq-and-bam-processing/bqsr/)
```
$ pbrun bqsr --ref Ref/Homo_sapiens_assembly38.fasta \
--in-bam S1.bam \
--knownSites Ref/Homo_sapiens_assembly38.known_indels.vcf.gz \
--out-recal-file recal_gpu.txt
```
- [applybqsr](https://www.nvidia.com/en-us/docs/parabricks/fastq-and-bam-processing/applybqsr/)
```
$ pbrun applybqsr --ref Ref/Homo_sapiens_assembly38.fasta \
--in-bam S1.bam \
--in-recal-file S1_report.txt \
--out-bam S1_updated.bam
```
- [haplotypecaller](https://www.nvidia.com/en-us/docs/parabricks/variant-callers/haplotypecaller/)
```
$ pbrun haplotypecaller --ref Ref/Homo_sapiens_assembly38.fasta \
--in-bam mark_dups_gpu.bam \
--in-recal-file recal_gpu.txt \
--out-variants result.vcf
```
- etc.
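
The standalone tools above can be chained by hand. The following is a minimal sketch, not NVIDIA's implementation of `pbrun germline`: it reuses only the flags and file names from the fq2bam and haplotypecaller examples above, and it hands the recalibration file to haplotypecaller through `--in-recal-file` instead of running a separate applybqsr step. Whether that reproduces the germline pipeline exactly is an assumption. The `run` helper and the `DRY_RUN` switch are hypothetical additions for illustration; with the default `DRY_RUN=1` the script only prints the commands, so it can be read through on a machine without Parabricks.

```shell
# Sketch: run fq2bam and haplotypecaller back to back (flags copied from the examples above).
set -eu

REF="Ref/Homo_sapiens_assembly38.fasta"
KNOWN_SITES="Ref/Homo_sapiens_assembly38.known_indels.vcf.gz"
OUT_BAM="mark_dups_gpu.bam"
RECAL="recal_gpu.txt"

# Hypothetical switch: 1 prints each command, 0 executes it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print or execute the command passed as arguments.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

germline_steps() {
  # Step 1: align, sort, mark duplicates, and write the recalibration table.
  run pbrun fq2bam --ref "$REF" \
    --in-fq S1_1.fastq.gz S1_2.fastq.gz \
    --knownSites "$KNOWN_SITES" \
    --out-bam "$OUT_BAM" \
    --out-recal-file "$RECAL" \
    --tmp-dir /raid/myrun
  # Step 2: call variants from the BAM, passing in the recalibration table.
  run pbrun haplotypecaller --ref "$REF" \
    --in-bam "$OUT_BAM" \
    --in-recal-file "$RECAL" \
    --out-variants result.vcf
}

germline_steps
```

Setting `DRY_RUN=0` executes the two commands for real, assuming `pbrun` is installed and the input files exist.
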
### [Human Par Pipeline](https://www.nvidia.com/en-us/docs/parabricks/human-par-pipeline/)
### [Somatic Pipeline](https://www.nvidia.com/en-us/docs/parabricks/somatic/)

### [Population Studies](https://www.nvidia.com/en-us/docs/parabricks/population-studies/)
![](https://i.imgur.com/oB9DXg5.png)
<br>
## ==Parabricks Server Deployment White Paper==
- [Accelerating Next Generation Sequencing Secondary Analysis with Dell EMC Isilon and NVIDIA Parabricks](https://www.dellemc.com/en-in/collaterals/unauth/white-papers/solutions/parabricks-isilon-nvidia-wp.pdf)
- Hardware and software
- NVIDIA Parabricks applications
- NVIDIA DGX-1 system
- Dell EMC Isilon network-attached storage (NAS)
- Throughput
- Keeps up with the daily output of an Illumina NovaSeq 6000
- Or 40x whole-genome sequencing data for 24 individuals per day
- **Test sample**
- [NA12878 Sample (NIST RM 8398)](https://www.ebi.ac.uk/ena/browser/view/ERR194147)
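
To put the throughput figure in concrete terms, the 24-genomes-per-day rate can be turned into a time budget per sample. This is simple arithmetic under the assumption, made here for illustration only, that samples are processed one after another with no overlap; `minutes_per_genome` is a helper name invented for this sketch.

```shell
# Minutes available per genome when N genomes must finish within 24 hours,
# assuming samples run strictly one after another.
minutes_per_genome() {
  echo $((24 * 60 / $1))
}

echo "budget per genome: $(minutes_per_genome 24) min"
# prints: budget per genome: 60 min
```
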
<br>
## Installing Parabricks
- ### For detailed download and installation steps, see:
[[HackMD] NVIDIA / Parabricks / Software download and installation](/jbvnEl5YRwWBa3QMGBcHbQ)
- ### For detailed run results, see:
[[HackMD] NVIDIA / Parabricks / docker inspect …](/dicZf7wIT_6P_cqASd2EDQ)
- ### Emulating the v3 version of pbrun
[NVIDIA / Parabricks / germline_v2.sh](/Gj6m21XiTiWMq5ZyACNhBQ)
- ### pbrun and docker command mapping
[NVIDIA / Parabricks / pbrun and docker command mapping](/ATvhDQ9DRNOvTn6HtaBoYg)
- ### Details inside the container
[NVIDIA / Parabricks / Details inside the container](/NHdIBUiOSyGeNdJiAh82mw)
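
Because the toolkit runs on top of nvidia-docker, a quick preflight check before installing can save time. The sketch below only tests whether `docker` and `nvidia-smi` are on the `PATH`; treating these two commands as the prerequisites is an assumption of this example, and `check_tool` is a hypothetical helper. The linked installation notes remain the reference for the actual steps.

```shell
# Report whether a command is on PATH; returns non-zero when it is missing.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Count how many of the assumed prerequisites are absent.
missing=0
for tool in docker nvidia-smi; do
  check_tool "$tool" || missing=$((missing + 1))
done
echo "missing prerequisites: $missing"
```
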
<br>
<hr>
<br>
## ==References (official)==
- [NVIDIA / Healthcare and Life Sciences](https://www.nvidia.com/zh-tw/healthcare/)
- [NVIDIA Clara Parabricks: end-to-end genome sequencing analysis (intro)](https://www.nvidia.com/zh-tw/healthcare/clara-parabricks/)
- [NVIDIA Clara Parabricks (for developer)](https://developer.nvidia.com/clara-parabricks)
- Clara Parabricks Toolkit
- Clara Parabricks Pipelines
- [NVIDIA Clara Parabricks Pipelines 3.0.0](https://www.nvidia.com/en-us/docs/parabricks/)
<br>
## ==References (unofficial)==
- [[PDF] Accelerating Next Generation Sequencing Secondary Analysis with Dell EMC Isilon and NVIDIA Parabricks](https://www.dellemc.com/en-in/collaterals/unauth/white-papers/solutions/parabricks-isilon-nvidia-wp.pdf)
> Software: Parabricks Application Suite (p10)
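- 2020/08/21 - [NVIDIA aims to speed up virus research with GPUs, opening more computing resources to help fight the pandemic](https://udn.com/news/story/7086/4431596)
> Parabricks is built on the well-known Genome Analysis Toolkit and uses GPU acceleration to analyze genome sequencing data up to 50 times faster. With GPU acceleration, whole-genome variant calling for a human sample on a single server drops from several days to under an hour.
- [Nvidia unveils its next-generation AI supercomputer, the DGX A100, with eight new 7 nm GPU accelerators and per-system performance doubled to as much as 5 PetaFLOPS](https://www.ithome.com.tw/news/137617)
> Its compute performance is double that of the previous-generation DGX-2, and its price is only half as much, about NT$6 million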
- POA
- [Understanding Partial Order Alignment for Multiple Sequence Alignment](https://simpsonlab.github.io/2015/05/01/understanding-poa/)
- ATAC-seq
- [ATAC-seq highlights non-coding DNA, mapping the chromatin landscape of human cancers](https://geneonline.news/index.php/2018/10/29/atac-seq-mapping-the-chromatin-landscape-of-human-cancers/)
- Videos
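- [Accelerating genomics analysis with NVIDIA Parabricks and Clara Genomics](https://www.bilibili.com/video/BV1Cf4y117up)

- [Clara Genomics Analysis SDK goes open source](https://itw01.com/URGNKEW.html)
> Case demo: GPU acceleration of Racon
> The module is a GPU-accelerated implementation of the Partial Order Alignment (POA) algorithm, used for multiple sequence alignment.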
<br>
## ==Comparison: DRAGEN==
- [Illumina DRAGEN Bio-IT Platform](https://www.illumina.com/products/by-type/informatics-products/dragen-bio-it-platform.html)
> Processes an entire human genome at 30× coverage in about 25 minutes
>
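
As a rough cross-check against the white paper figure above (24 genomes per day at 40x), the quoted DRAGEN runtime can be converted to a daily count. This is back-of-the-envelope arithmetic only: it assumes runs are scheduled back to back with no I/O gaps, and the two numbers are not like for like, since both the coverage (30× vs 40x) and the hardware differ. `genomes_per_day` is a helper name made up for this sketch.

```shell
# Genomes per day for a given per-genome runtime in minutes
# (integer division, runs assumed back to back with no gaps).
genomes_per_day() {
  echo $((24 * 60 / $1))
}

echo "at 25 min per genome: $(genomes_per_day 25) genomes/day"
# prints: at 25 min per genome: 57 genomes/day
```
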