Singularity
===
###### tags: `TWCC`, `Singularity`, `Slurm`, `Docker`, `container`
<br>
[TOC]
<br>
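## TWCC platform
1. **Log in to the platform**
- Login command
```
$ ssh <host_account>@ln01.twcc.ai
```
- Login password
Host password plus the OTP (one-time password) code
<br>
2. **Pull an image and test-run it**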
- ubuntu
```
$ singularity pull docker://ubuntu:20.04
$ ls ubuntu_20.04.sif
$ singularity exec ubuntu_20.04.sif bash
Singularity> cat /etc/os-release
```
- parabricks
```
$ singularity pull docker://nvcr.io/nvidia/clara/clara-parabricks:4.0.0-1
$ ls clara-parabricks_4.0.0-1.sif
$ singularity exec --nv clara-parabricks_4.0.0-1.sif bash
Singularity> pbrun -h
```
- `--nv`
- [Singularity GPU Containers Options](https://modinst.lu.lv/wp-content/uploads/2021/03/Singularity_seminars_Aleksandrs_Gutcaits.pdf)
GPU use: If your host system has an NVIDIA GPU card and a driver installed, you can leverage the card with the `--nv` option
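
The two pulls above show the default naming pattern (`ubuntu:20.04` became `ubuntu_20.04.sif`), and `--nv` only makes sense on a host with an NVIDIA driver. The following is a minimal sketch of two helpers built on those observations. `sif_name_for` and `nv_flag` are hypothetical names, the naming rule assumes the URI carries an explicit tag, and using the presence of `nvidia-smi` as a stand-in for "driver installed" is an assumption.

```bash
#!/bin/bash
# Derive the SIF file name seen in the pulls above: take the last path
# component of the URI and replace ":" with "_", then append ".sif".
# Assumes the URI includes an explicit tag (e.g. ":20.04").
sif_name_for() {
    local uri="$1"
    local base="${uri##*/}"     # drop everything up to the last "/"
    printf '%s\n' "${base//:/_}.sif"
}

# Print "--nv" only when nvidia-smi is on PATH (assumed proxy for an
# installed NVIDIA driver); print nothing otherwise.
nv_flag() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        printf '%s\n' '--nv'
    fi
}

# Possible use (nv_flag is left unquoted so an empty result adds no argument):
# singularity exec $(nv_flag) "$(sif_name_for docker://ubuntu:20.04)" bash
```
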
<br>
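3. **Test a Singularity batch job**
```
$ sbatch --gpus-per-node=1 --account=ENT110209 run_parabricks.sh
```
- If either of the following options is omitted, `sbatch` prints the corresponding error message; add the missing option and resubmit.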
- `--gpus-per-node=<num>`
```
$ sbatch run_parabricks.sh
sbatch: error: Missing assigned gpus, try to use --gpus-per-node=<num>
sbatch: error: Batch job submission failed: Unspecified error
```
- `--account=<project_id>`
```
$ sbatch --gpus-per-node=1 run_parabricks.sh
sbatch: error: Missing assigned project, try to use --account=<project_id>
sbatch: error: Please check the wallet information below :)
sbatch: error: ----------------------------------- wallet info -----------------------------------
sbatch: error: PROJECT_ID: EN******09, PROJECT_NAME: O20-*************專案, SU_BALANCE: 3540725.447
sbatch: error: PROJECT_ID: EN******43, PROJECT_NAME: P**-************-POC, SU_BALANCE: 0
sbatch: error: PROJECT_ID: GO******65, PROJECT_NAME: NC**-*************-1, SU_BALANCE: 263439.4348
sbatch: error: PROJECT_ID: GO******41, PROJECT_NAME: OneAI***************, SU_BALANCE: ******
sbatch: error: -----------------------------------------------------------------------------------
sbatch: error: Batch job submission failed: Unspecified error
```
- Related files:

```bash
$ ls
clara-parabricks_4.0.0-1.sif run_parabricks.sh
$ cat run_parabricks.sh
#!/bin/bash
nvidia-smi
singularity exec clara-parabricks_4.0.0-1.sif pbrun version
```
- Result after the job finishes
```
$ ls
clara-parabricks_4.0.0-1.sif run_parabricks.sh slurm-424037.out
```
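
Slurm also reads options from `#SBATCH` comment lines at the top of a batch script, so the two options that `sbatch` asked for above could be kept in the script instead of on the command line. The sketch below uses a hypothetical helper, `write_job_script`, to write such a script. The function name, its arguments, and the output file name are illustrative only, and it is an unverified assumption that TWCC's submission check treats `#SBATCH` directives the same way as command-line flags.

```bash
#!/bin/bash
# Write a batch script whose #SBATCH header carries --account and
# --gpus-per-node. Arguments: project ID, GPU count, output file.
write_job_script() {
    local project_id="$1" gpus="$2" out="$3"
    cat > "$out" <<EOF
#!/bin/bash
#SBATCH --account=${project_id}
#SBATCH --gpus-per-node=${gpus}

nvidia-smi
singularity exec clara-parabricks_4.0.0-1.sif pbrun version
EOF
}

# Possible use (substitute your own project ID; the file name is made up):
# write_job_script "<project_id>" 1 run_parabricks_sbatch.sh
# sbatch run_parabricks_sbatch.sh
```
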

<br>
4. **Test a Parabricks batch job**
```
$ sbatch --gpus-per-node=1 --account=EN******09 run_parabricks.sh
sbatch: INFO: It is recommended to specify `--nodes` and `--ntasks-per-node` together
Submitted batch job 424658
```
- Preprocessing: download the dataset
```
$ wget -O parabricks_sample.tar.gz "https://s3.amazonaws.com/parabricks.sample/parabricks_sample.tar.gz"
$ tar -xvzf parabricks_sample.tar.gz
```
- `run_parabricks.sh`
```bash
$ cat run_parabricks.sh
#!/bin/bash
nvidia-smi
singularity exec clara-parabricks_4.0.0-1.sif pbrun version
cd workspace/parabricks/
ls -ls
singularity exec --nv ../../clara-parabricks_4.0.0-1.sif pbrun germline \
--ref parabricks_sample/Ref/Homo_sapiens_assembly38.fasta \
--in-fq parabricks_sample/Data/sample_1.fq.gz parabricks_sample/Data/sample_2.fq.gz \
--knownSites parabricks_sample/Ref/Homo_sapiens_assembly38.known_indels.vcf.gz \
--out-bam output.bam \
--out-variants output.vcf \
--out-recal-file report.txt \
--x3
```
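
The script above changes directory and then refers to the image and the sample data through relative paths. A minimal sketch of a pre-flight check follows, assuming one wants the job to stop early with a clear message when any of those paths is missing. `require_files` is a hypothetical helper name, not part of Slurm, Singularity, or Parabricks.

```bash
#!/bin/bash
# Report every path that does not exist on stderr and return non-zero
# if at least one is missing; return 0 when all paths exist.
require_files() {
    local missing=0 f
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Possible use in run_parabricks.sh, right after the cd:
# require_files \
#     ../../clara-parabricks_4.0.0-1.sif \
#     parabricks_sample/Ref/Homo_sapiens_assembly38.fasta \
#     parabricks_sample/Data/sample_1.fq.gz \
#     parabricks_sample/Data/sample_2.fq.gz \
#     parabricks_sample/Ref/Homo_sapiens_assembly38.known_indels.vcf.gz \
#     || exit 1
```
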
- [Check Slurm job status](https://man.twcc.ai/@twccdocs/doc-twnia2-main-zh/https%3A%2F%2Fman.twcc.ai%2F%40twccdocs%2Fguide-twnia2-job-state-zh)
:::spoiler `$ scontrol show job 424658`
```
JobId=424658 JobName=run_parabricks.sh
UserId=tjtsai29863(15444) GroupId=ENT110209(57477) MCS_label=N/A
Priority=10410272 Nice=0 Account=ent110209 QOS=normal
JobState=FAILED Reason=NonZeroExitCode Dependency=(null)
Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=255:0
RunTime=00:00:43 TimeLimit=2-00:00:00 TimeMin=N/A
SubmitTime=2022-11-02T14:46:44 EligibleTime=2022-11-02T14:46:44
AccrueTime=2022-11-02T14:46:44
StartTime=2022-11-02T14:46:45 EndTime=2022-11-02T14:47:28 Deadline=N/A
PreemptTime=None SuspendTime=None SecsPreSuspend=0
LastSchedEval=2022-11-02T14:46:45
Partition=gp2d AllocNode:Sid=ln01-twnia2:158147
ReqNodeList=(null) ExcNodeList=(null)
NodeList=gn1221
BatchHost=gn1221
NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
TRES=cpu=1,mem=90G,node=1,billing=1,gres/gpu=1
Socks/Node=* NtasksPerN:B:S:C=1:0:*:* CoreSpec=*
MinCPUsNode=1 MinMemoryNode=90G MinTmpDiskNode=0
Features=(null) DelayBoot=00:00:00
OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
Command=/home/tjtsai29863/run_parabricks.sh
WorkDir=/home/tjtsai29863
StdErr=/home/tjtsai29863/slurm-424658.out
StdIn=/dev/null
StdOut=/home/tjtsai29863/slurm-424658.out
Power=
TresPerNode=gpu:1
```
:::
:::spoiler `$ squeue -u tjtsai29863`
```
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
424659 gp2d run_para tjtsai29 R 1:08:49 1 gn1221
$ squeue # show all jobs
...
```
:::
:::spoiler `$ sacct -j 424658`
```
JobID JobName Partition Account AllocCPUS State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
424658 run_parab+ gp2d ent110209 1 RUNNING 0:0
424658.exte+ extern ent110209 1 RUNNING 0:0
```
:::
```
$ sinfo                # list partitions and node states
$ scancel <job_id>     # cancel a job by its ID
```
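
`scontrol show job` prints whitespace-separated `KEY=value` pairs, so a single field such as `JobState` or `ExitCode` can be extracted for use in scripts. A minimal sketch, where `scontrol_field` is a hypothetical helper name and values that themselves contain spaces are not handled:

```bash
#!/bin/bash
# Read `scontrol show job` output on stdin and print the value of the
# first field whose key matches $1 exactly (e.g. JobState, ExitCode).
scontrol_field() {
    awk -v key="$1" '{
        for (i = 1; i <= NF; i++) {
            n = index($i, "=")
            if (n > 0 && substr($i, 1, n - 1) == key) {
                print substr($i, n + 1)
                exit
            }
        }
    }'
}

# Possible use on the login node:
# scontrol show job 424658 | scontrol_field JobState
```

Applied to the `scontrol` output recorded above, the `JobState` field yields `FAILED` and `ExitCode` yields `255:0`.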
<br>
## References
- [[Third-party HackMD] Singularity](https://hackmd.io/@chialu/H1xZf5PQu)
- [[HackMD] Building a custom image](https://hackmd.io/@praexisio/BJdzfcEGY/%2F%40praexisio%2Fbuild)
- [[twcc][doc] Look up the host account, reset the password, and obtain the OTP code](https://man.twcc.ai/@twccdocs/guide-service-hostname-pwd-otp-zh)
- [[twcc][doc] TWCC Parabricks Quickstart & Tutorial](https://man.twcc.ai/@Ldk_QYrOR2yo3m8Cb1549A/rkQGosieK)
- [NCHC Life Science Cloud (國網生科雲)](https://docs.google.com/presentation/d/1dSoj1Eygj8I1BjGzvrtzm9Rs1hIDTYmWPJRM1m5VBik/edit#slide=id.p1)
- [Singularity GPU Containers Options](https://modinst.lu.lv/wp-content/uploads/2021/03/Singularity_seminars_Aleksandrs_Gutcaits.pdf)
- [[twcc][doc] HowTo: Create a TWNIA2 container](https://man.twcc.ai/@twccdocs/doc-twnia2-main-zh/https%3A%2F%2Fman.twcc.ai%2F%40twccdocs%2Fhowto-twnia2-create-sglrt-container-zh)
- [[iservice][doc] Singularity user guide](https://iservice.nchc.org.tw/download_file.php?f=sgE9EJyhGrMjM-DYM8dUpuMOGtiNXL_iBZOhRBnjhcRAwvLHNUwaIP1Mr6OmC46g7LczlfgqVv_VG9rYwvk2zw)