Singularity
===
###### tags: `TWCC`, `Singularity`, `Slurm`, `Docker`, `container`

<br>

[TOC]

<br>

## TWCC Platform

1. **Log in to the platform**
    - Login command
      ```
      $ ssh <host_account>@ln01.twcc.ai
      ```
    - Login password
      Host password followed by the OTP code

<br>

2. **Download an image and try it out**
    - ubuntu
      ```
      $ singularity pull docker://ubuntu:20.04
      $ ls
      ubuntu_20.04.sif
      $ singularity exec ubuntu_20.04.sif bash
      Singularity> cat /etc/os-release
      ```
    - parabricks
      ```
      $ singularity pull docker://nvcr.io/nvidia/clara/clara-parabricks:4.0.0-1
      $ ls
      clara-parabricks_4.0.0-1.sif
      $ singularity exec --nv clara-parabricks_4.0.0-1.sif bash
      Singularity> pbrun -h
      ```
        - `--nv`
            - [Singularity GPU Containers Options](https://modinst.lu.lv/wp-content/uploads/2021/03/Singularity_seminars_Aleksandrs_Gutcaits.pdf)
              GPU use: If your host system has an NVIDIA GPU card and a driver installed, you can leverage the card with the `--nv` option

<br>

3. **Test a Singularity batch job**
    ```
    $ sbatch --gpus-per-node=1 --account=ENT110209 run_parabricks.sh
    ```
    - If either of the parameters below is omitted, `sbatch` prints the corresponding error message; add the missing parameter and resubmit.
        - `--gpus-per-node=<num>`
          ```
          $ sbatch run_parabricks.sh
          sbatch: error: Missing assigned gpus, try to use --gpus-per-node=<num>
          sbatch: error: Batch job submission failed: Unspecified error
          ```
        - `--account=<project_id>`
          ```
          $ sbatch --gpus-per-node=1 run_parabricks.sh
          sbatch: error: Missing assigned project, try to use --account=<project_id>
          sbatch: error: Please check the wallet information below :)
          sbatch: error: ----------------------------------- wallet info -----------------------------------
          sbatch: error: PROJECT_ID: EN******09, PROJECT_NAME: O20-*************專案, SU_BALANCE: 3540725.447
          sbatch: error: PROJECT_ID: EN******43, PROJECT_NAME: P**-************-POC, SU_BALANCE: 0
          sbatch: error: PROJECT_ID: GO******65, PROJECT_NAME: NC**-*************-1, SU_BALANCE: 263439.4348
          sbatch: error: PROJECT_ID: GO******41, PROJECT_NAME: OneAI***************, SU_BALANCE: ******
          sbatch: error: -----------------------------------------------------------------------------------
          sbatch: error: Batch job submission failed: Unspecified error
          ```
    - Related files:
      ![](https://i.imgur.com/3D25oSG.png)
      ```bash
      $ ls
      clara-parabricks_4.0.0-1.sif  run_parabricks.sh

      $ cat run_parabricks.sh
      #!/bin/bash
      nvidia-smi
      singularity exec clara-parabricks_4.0.0-1.sif pbrun version
      ```
    - Result after the job finishes
      ```
      $ ls
      clara-parabricks_4.0.0-1.sif  run_parabricks.sh  slurm-424037.out
      ```
      ![](https://i.imgur.com/6kbsi8N.png)

<br>

4. **Test a Parabricks batch job**
    ```
    $ sbatch --gpus-per-node=1 --account=EN******09 run_parabricks.sh
    sbatch: INFO: It is recommended to specify `--nodes` and `--ntasks-per-node` together
    Submitted batch job 424658
    ```
    - Preparation: download the dataset
      ```
      $ wget -O parabricks_sample.tar.gz "https://s3.amazonaws.com/parabricks.sample/parabricks_sample.tar.gz"
      $ tar -xvzf parabricks_sample.tar.gz
      ```
    - `run_parabricks.sh`
      ```bash
      $ cat run_parabricks.sh
      #!/bin/bash
      nvidia-smi
      singularity exec clara-parabricks_4.0.0-1.sif pbrun version

      cd workspace/parabricks/
      ls -ls
      singularity exec --nv ../../clara-parabricks_4.0.0-1.sif pbrun germline \
          --ref parabricks_sample/Ref/Homo_sapiens_assembly38.fasta \
          --in-fq parabricks_sample/Data/sample_1.fq.gz parabricks_sample/Data/sample_2.fq.gz \
          --knownSites parabricks_sample/Ref/Homo_sapiens_assembly38.known_indels.vcf.gz \
          --out-bam output.bam \
          --out-variants output.vcf \
          --out-recal-file report.txt \
          --x3
      ```
    - [Check the Slurm job status](https://man.twcc.ai/@twccdocs/doc-twnia2-main-zh/https%3A%2F%2Fman.twcc.ai%2F%40twccdocs%2Fguide-twnia2-job-state-zh)

      :::spoiler `$ scontrol show job 424658`
      ```
      JobId=424658 JobName=run_parabricks.sh
         UserId=tjtsai29863(15444) GroupId=ENT110209(57477) MCS_label=N/A
         Priority=10410272 Nice=0 Account=ent110209 QOS=normal
         JobState=FAILED Reason=NonZeroExitCode Dependency=(null)
         Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=255:0
         RunTime=00:00:43 TimeLimit=2-00:00:00 TimeMin=N/A
         SubmitTime=2022-11-02T14:46:44 EligibleTime=2022-11-02T14:46:44
         AccrueTime=2022-11-02T14:46:44
         StartTime=2022-11-02T14:46:45 EndTime=2022-11-02T14:47:28 Deadline=N/A
         PreemptTime=None SuspendTime=None SecsPreSuspend=0
         LastSchedEval=2022-11-02T14:46:45
         Partition=gp2d AllocNode:Sid=ln01-twnia2:158147
         ReqNodeList=(null) ExcNodeList=(null)
         NodeList=gn1221
         BatchHost=gn1221
         NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
         TRES=cpu=1,mem=90G,node=1,billing=1,gres/gpu=1
         Socks/Node=* NtasksPerN:B:S:C=1:0:*:* CoreSpec=*
         MinCPUsNode=1 MinMemoryNode=90G MinTmpDiskNode=0
         Features=(null) DelayBoot=00:00:00
         OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
         Command=/home/tjtsai29863/run_parabricks.sh
         WorkDir=/home/tjtsai29863
         StdErr=/home/tjtsai29863/slurm-424658.out
         StdIn=/dev/null
         StdOut=/home/tjtsai29863/slurm-424658.out
         Power=
         TresPerNode=gpu:1
      ```
      :::

      :::spoiler `$ squeue -u tjtsai29863`
      ```
        JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
       424659      gp2d run_para tjtsai29  R    1:08:49      1 gn1221

      $ squeue # show all jobs
      ...
      ```
      :::

      :::spoiler `$ sacct -j 424658`
      ```
             JobID    JobName  Partition    Account  AllocCPUS      State ExitCode
      ------------ ---------- ---------- ---------- ---------- ---------- --------
      424658       run_parab+       gp2d  ent110209          1    RUNNING      0:0
      424658.exte+     extern             ent110209          1    RUNNING      0:0
      ```
      :::

      ```
      $ sinfo           # show partition and node status
      $ scancel job_id  # cancel a job
      ```

<br>

## References
- [[Others' HackMD] Singularity](https://hackmd.io/@chialu/H1xZf5PQu)
  ![](https://i.imgur.com/KdAkAt6.png)
- [[HackMD] Building a custom image](https://hackmd.io/@praexisio/BJdzfcEGY/%2F%40praexisio%2Fbuild)
- [[twcc][doc] Look up the host account, reset the password, and get an OTP code](https://man.twcc.ai/@twccdocs/guide-service-hostname-pwd-otp-zh)
- [[twcc][doc] TWCC Parabricks Quickstart & Tutorial](https://man.twcc.ai/@Ldk_QYrOR2yo3m8Cb1549A/rkQGosieK)
- [NCHC Biotechnology Cloud (國網生科雲)](https://docs.google.com/presentation/d/1dSoj1Eygj8I1BjGzvrtzm9Rs1hIDTYmWPJRM1m5VBik/edit#slide=id.p1)
- [Singularity GPU Containers Options](https://modinst.lu.lv/wp-content/uploads/2021/03/Singularity_seminars_Aleksandrs_Gutcaits.pdf)
- [[twcc][doc] HowTo: Create a TWNIA2 container](https://man.twcc.ai/@twccdocs/doc-twnia2-main-zh/https%3A%2F%2Fman.twcc.ai%2F%40twccdocs%2Fhowto-twnia2-create-sglrt-container-zh)
- [[iservice][doc] Singularity User Guide](https://iservice.nchc.org.tw/download_file.php?f=sgE9EJyhGrMjM-DYM8dUpuMOGtiNXL_iBZOhRBnjhcRAwvLHNUwaIP1Mr6OmC46g7LczlfgqVv_VG9rYwvk2zw)
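
<br>

A supplement to the batch steps in the TWCC section: Slurm also reads submission options from `#SBATCH` comment lines at the top of a job script, so `--gpus-per-node` and `--account` do not have to be typed on every `sbatch` call. The sketch below is only an illustration. The helper name `write_job_script` and the output file name `run_parabricks_sbatch.sh` are hypothetical, and it assumes that the TWCC submission checks accept options given as directives the same way they accept command-line flags, which has not been verified here.

```shell
#!/bin/bash
# Hypothetical helper (not part of TWCC or Slurm): writes a job script whose
# submission options are embedded as #SBATCH directives.
write_job_script() {
    local account="$1"                          # project ID, same value as --account
    local out="${2:-run_parabricks_sbatch.sh}"  # hypothetical output file name

    # Unquoted heredoc delimiter so that ${account} is expanded while writing.
    cat > "$out" <<EOF
#!/bin/bash
#SBATCH --account=${account}
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --gpus-per-node=1

nvidia-smi
singularity exec clara-parabricks_4.0.0-1.sif pbrun version
EOF
}
```

Usage would look like `write_job_script <project_id>` followed by `sbatch run_parabricks_sbatch.sh`. Setting `--nodes` and `--ntasks-per-node` together follows the INFO message that `sbatch` printed in the Parabricks batch test.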