# Human Brain Atlas Processing Tutorial [using SLURM to run ANTs]
# Queueing the ANTs Multivariate Template Script using SLURM on an HPC
## Table of Contents
[TOC]
## About
This guide will go through the steps used to queue the ANTs Multivariate Template Script on an HPC using SLURM.
This guide assumes that you have 1) an input template, and 2) the files you wish to align to make a non-linear template. If not, you will need to make these files using either this guide (input template) [link coming soon] or this guide (without an input template) [link coming soon].
Otherwise, if you don't have your own dataset, you can download the demo dataset linked below (which this guide uses throughout). Download the `demo-slurm` folder as a zip file:
| Link to sample dataset used in this guide | https://osf.io/jrh5v/files/?view_only=2d48452b19cf4fb68d892072be41e575 |
| ----------------- |:--------------------------------------------------------------------- |
Note: This guide covers how to run this code on [MASSIVE](https://docs.massive.org.au/M3/m3users.html) - an HPC located in Australia. Some details may not apply to the HPC you're using, but most steps should in principle be the same or similar.
For more information about MASSIVE/SLURM, see these [Google Slides](https://docs.google.com/presentation/d/1dbDDgE7kIAJwb8ne1ExL52vASOFj1okODEzrjqvLSw4/edit?usp=sharing).
Here are some useful resources related to SLURM:
1. Quick start guide: https://slurm.schedmd.com/quickstart.html
2. Cheat Sheet: https://slurm.schedmd.com/pdfs/summary.pdf
Any queries can be sent to Zoey Isherwood (zoey.isherwood@gmail.com) or Mark Schira (mark.schira@gmail.com)
## List of software packages needed
| Software/Programs | Website |
| ----------------- |:--------------------------------------------------------------------- |
| ANTs |[http://stnava.github.io/ANTs/](http://stnava.github.io/ANTs/) |
| ITK-SNAP |[http://www.itksnap.org/pmwiki/pmwiki.php?n=Downloads.SNAP3](http://www.itksnap.org/pmwiki/pmwiki.php?n=Downloads.SNAP3) |
| FileZilla (optional) | [https://filezilla-project.org/download.php?type=client](https://filezilla-project.org/download.php?type=client) |
## List of scripts used to process data
**NOTE:** if you download the dataset, all the necessary code is included in the zip folder.
The only scripts really needed for this guide are the ones that come with ANTs, plus the custom scripts listed below. The custom scripts contain the code block(s) shown in this guide.
| Scripts | Website |
| ----------------- |:--------------------------------------------------------------------- |
| `hba-sample-dataset-slurm.sh` |https://osf.io/6nbzh/?view_only=2d48452b19cf4fb68d892072be41e575 |
| `hba-sample-dataset-slurm-iteration-01.sh` |https://osf.io/c5284/?view_only=2d48452b19cf4fb68d892072be41e575 |
| `hba-sample-dataset-slurm-iteration-02.sh` |https://osf.io/ucy2a/?view_only=2d48452b19cf4fb68d892072be41e575 |
## 1. Transferring data to the HPC
You can transfer your data in a variety of ways (e.g. via the command line, or connecting to the HPC with your file browser and dragging and dropping the files); a command-line sketch is included at the end of this section. I recommend using FileZilla for ease of use. For this you will need the IP address or hostname of the HPC you're transferring to, and you also need login credentials to access the HPC.
1. Open FileZilla, and click on the `Site Manager` button (circled in the figure below). Then enter the relevant information to connect to your HPC (see arrows in the figure below and fill out each section). You need the `Host` name, `Protocol` (SFTP - SSH in the example below), `User`, and `Password`. Once you have filled everything out, click `Ok` and you will be connected to the HPC.
![](https://i.imgur.com/tj9xQI6.png)
2. Navigate to where your data folder is located on your local machine on the left hand panel, then navigate to where you want to transfer your data on the HPC on the right hand panel. Now you can drag and drop the data folder from the left panel to the right panel to transfer the data to the HPC.
![](https://i.imgur.com/O95BfSa.png)
3. Now your data is on the HPC and you can get the code ready to run...
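If you prefer to transfer data from the command line instead of FileZilla, here is a minimal sketch using `rsync` (or `scp`) over SSH. It assumes the `m3.massive.org.au` login address used later in this guide and that the data folder sits in your current local directory; swap in your own username and paths.
```bash=1
# copy the dataset folder to the project directory on the HPC (adjust username and paths to your own)
rsync -avh --progress ./hba-sample-dataset-slurm user@m3.massive.org.au:/projects/vr61/
# or, equivalently, with scp
scp -r ./hba-sample-dataset-slurm user@m3.massive.org.au:/projects/vr61/
```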
## 2. Writing a SLURM-compatible script
SLURM is a job scheduling system that manages the queueing of jobs on an HPC. If everyone ran their jobs all at once on an HPC it'd probably crash... so one way to circumvent this problem is by scheduling jobs. Much like a queue in real life, it's first come, first served: the sooner you queue your job, the sooner it'll start...
All you need to do to make your shell script work with the SLURM queueing system is add a few lines of code at the beginning of the script to indicate the parameters you want used on the HPC.
The code block below includes both 1) the SLURM code necessary to queue the shell script and 2) the ANTs Multivariate Template code to create a non-linear template of the files you transferred in Step 1.
```bash=1
#!/bin/bash
# hba-sample-dataset-slurm.sh
#SBATCH --job-name=hba-sample-dataset-slurm
# indicate your assigned account name so usage can be billed/hours counted:
#SBATCH --account=vr61
# indicate the HPC partition you want to use and the maximum memory (in MB) you want allocated
#SBATCH --partition=m3j
#SBATCH --mem=342000
# Request CPU resources for a serial job: number of tasks and CPUs per task
#SBATCH --ntasks=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=36
# Set your maximum walltime (time limit), format: days-hours:minutes:seconds. This was set to the max allowed time on MASSIVE.
#SBATCH --time=7-00:00:00
# Set the filepath and filename for output log (stdout)
#SBATCH --output=/projects/vr61/logs/hba-sample-dataset-slurm-%u%j.out
# Set the filepath and filename for error log (stderr)
#SBATCH --error=/projects/vr61/logs/hba-sample-dataset-slurm-%u%j.err
# send e-mail to this email address to notify of job updates (started, ended, error, failed):
#SBATCH --mail-user=johnsmith@example.com
#SBATCH --mail-type=ALL
# beginning of code:
module purge
module load ants/20190910 # indicate the ANTs version you want using the module command
FILEDIR=/projects/vr61/hba-sample-dataset-slurm # where your files are located.
TEMPLATE=${FILEDIR}/0p40-sub-01_t1_ACPC.nii.gz
IMAGES=(${FILEDIR}/n4corr-msk_raw-sub-01_ses-01_run-01_acq-mp2rage-wip944_UNI_DEN_defaced.nii.gz
${FILEDIR}/n4corr-msk_raw-sub-01_ses-01_run-02_acq-mp2rage-wip944_UNI_DEN_defaced.nii.gz
${FILEDIR}/n4corr-msk_raw-sub-01_ses-01_run-03_acq-mp2rage-wip944_UNI_DEN_defaced.nii.gz
${FILEDIR}/n4corr-msk_raw-sub-01_ses-01_run-04_acq-mp2rage-wip944_UNI_DEN_defaced.nii.gz)
DIMS=3
GRADIENT=0.1
NUM_CORES=18 #12
NUM_MODS=1
N4BIASCORRECT=0
STEPSIZES=20x15x5 #from Lüsebrink, F., Sciarra, A., Mattern, H., Yakupov, R. & Speck, O. T1-weighted in vivo human whole brain MRI dataset with an ultrahigh isotropic resolution of 250 μm. Scientific Data 4, 170032, https://doi.org/10.1038/sdata.2017.32 (2017).
ITERATIONS=2 #4 from ants paper
#ITERATION="01"
###############################
#
# Set number of threads
#
###############################
ORIGINALNUMBEROFTHREADS=${ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS}
ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS=$NUM_CORES
export ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS
########### ants
outputPath=${FILEDIR}
cd $outputPath
# outputPath=${outputPath}/TemplateMultivariateBSplineSyN_${STEPSIZES}_iteration_${ITERATION}
outputPath=${outputPath}/TemplateMultivariateBSplineSyN_${STEPSIZES}
mkdir -p $outputPath
# note: -y below is left at its default setting; an inline comment after a trailing "\" breaks the line continuation, so it is noted here instead
antsMultivariateTemplateConstruction.sh \
-d $DIMS \
-k $NUM_MODS \
-r 0 \
-c 0 \
-m $STEPSIZES \
-n $N4BIASCORRECT \
-s CC \
-t GR \
-i $ITERATIONS \
-g $GRADIENT \
-b 1 \
-o ${outputPath}/T_ \
-y 1 \
-z $TEMPLATE \
${IMAGES[@]}
###############################
#
# Restore original number of threads
#
###############################
ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS=$ORIGINALNUMBEROFTHREADS
export ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS
```
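One small thing worth doing before you submit: the `--output` and `--error` lines above write logs to `/projects/vr61/logs/`. SLURM will not create that folder for you, and if it is missing the job can fail or the logs can silently go missing, so it is worth creating it once beforehand (assuming you keep the log paths above):
```bash=1
# one-off: create the log directory referenced by --output/--error above
mkdir -p /projects/vr61/logs
```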
## 3. Running a SLURM-compatible script on an HPC
To run your script through SLURM, you first need to save it somewhere on the HPC. You can do so by transferring your script (e.g. the code block above, `hba-sample-dataset-slurm.sh`) using FileZilla. Alternatively, you can copy/paste your script and save it directly onto the HPC via SSH. Follow the steps in the code block below to do this:
```bash=1
# ssh onto your hpc
ssh user@m3.massive.org.au
# enter your password and hit enter
# go to the directory where your data is located (or wherever you want to save your script)
cd /projects/vr61/hba-sample-dataset-slurm
# create/open a file with the name of your script using the nano command
nano hba-sample-dataset-slurm.sh
# now a blank screen should pop up on your terminal window.
# now copy/paste your script (e.g. the code block above in Step 2.) into this window
# after copy/pasting your script, press ctrl + O on your keyboard. then hit enter to save the script
# now press ctrl + X to exit the nano window in your terminal
# now your script is saved!
```
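One optional sanity check after saving: if the script was copy/pasted from a Windows machine it may contain Windows-style (CRLF) line endings, which can stop the `#!/bin/bash` and `#SBATCH` lines from being read properly. You can check for and strip them with standard tools:
```bash=1
# "with CRLF line terminators" in the output indicates Windows line endings
file hba-sample-dataset-slurm.sh
# strip trailing carriage returns in place (only needed if CRLF endings were reported)
sed -i 's/\r$//' hba-sample-dataset-slurm.sh
```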
Now you're ready to run your script. To queue it, you need to SSH onto the HPC, then use `sbatch` to run your script. See the code block below for more detail:
```bash=1
ssh user@m3.massive.org.au
#enter password and hit enter
# now queue your script by using the sbatch command, along with the full path to your script
sbatch /projects/vr61/hba-sample-dataset-slurm/hba-sample-dataset-slurm.sh
# If you set your SLURM parameters correctly in your script, your job will be added to the queue. If not, you'll get an error message.
```
If you ran the code block above, your script should be in the queue ready to go!
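A couple of optional extras here: a successful `sbatch` call normally prints a line like `Submitted batch job <jobid>` - keep that job number handy for the commands below. And if you just want to check that your `#SBATCH` settings are valid without actually queueing anything, `sbatch` has a `--test-only` flag:
```bash=1
# validate the script and print an estimated start time without submitting the job
sbatch --test-only /projects/vr61/hba-sample-dataset-slurm/hba-sample-dataset-slurm.sh
```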
---
If you want to check the status of your queued script, you can use SLURM commands such as the following, as long as you're SSH'd onto the HPC via a terminal.
- `show_job`: this command will show you all the jobs that you have queued using your account.
- `scancel`: if you want to cancel a job that is currently running/queued, use this command followed by the job number (get the job number from the `show_job` command).
- Dependencies: if you want one job to start only after another has completed, you can set up dependencies. To queue a job with a dependency, run the following command:
```bash=1
# queue a script that will only start once job ${JOBNUMBER} has completed successfully
sbatch --dependency=afterok:${JOBNUMBER} /path/to/your-script.sh
```
- `squeue`: will show you all the jobs in the queue from yourself and other users on MASSIVE. If you type `squeue -u ${USERNAME}` it'll just show your jobs (similar to `show_job`).
- `ssh` into nodes: when your job gets assigned to a node, you gain permission to ssh into that node by running something like `ssh m3j004`. Once there, you can run commands like `htop` or `top` to see how much RAM and CPU your script is using.
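Putting a couple of these together, a typical check-and-cancel sequence looks something like the sketch below (the job number shown is just an example; use the one reported when you submitted):
```bash=1
# list only your own jobs in the queue
squeue -u ${USER}
# cancel a specific job by its job number (example number only)
scancel 1234567
```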
## EXTRA: Running ANTs iterations separately
Running multiple iterations of the ANTs Multivariate Template code can sometimes take over a week. At least for us, we could only request 1 week of walltime on the HPC... As such, we could only run 1 iteration at a time.
I've listed some scripts below which break the ANTs Multivariate Template process up into two iterations (and which can easily be modified to tack on extra iterations). Just like in Step 2 above, to run these scripts you have to transfer them over to your HPC using your preferred method (e.g. FileZilla, via an SSH terminal), then run them using the `sbatch` command.
You can do fancy things here with SLURM queueing, where you queue both jobs at the same time but make the 2nd iteration dependent on the first - e.g. the 2nd iteration will only start when the 1st is complete. I won't go into that in detail here as I found it to be more trouble than it was worth, but here is some extra documentation that goes over how to do that (also see the dot points listed in Step 3): https://slurm.schedmd.com/job_array.html#dependencies.
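If you do want to try the dependency route, a minimal sketch is below: `sbatch --parsable` prints just the job ID of the first submission, which you can then pass to `--dependency=afterok:` for the second. The paths assume the iteration scripts were saved alongside the data as in Step 3; adjust them to wherever you put the scripts.
```bash=1
# submit iteration 1 and capture its job ID
JOBID=$(sbatch --parsable /projects/vr61/hba-sample-dataset-slurm/hba-sample-dataset-slurm-iteration-01.sh)
# queue iteration 2, to start only once iteration 1 has completed successfully
sbatch --dependency=afterok:${JOBID} /projects/vr61/hba-sample-dataset-slurm/hba-sample-dataset-slurm-iteration-02.sh
```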
The simplest way is to run Iteration 1 first, wait for it to finish, quality check it, and if it looks fine you can move onto running Iteration 2.
ITERATION 1 SCRIPT:
```bash=1
#!/bin/bash
# hba-sample-dataset-slurm-iteration-01.sh
#SBATCH --job-name=hba-sample-dataset-slurm-iteration-01
# indicate your assigned account name so usage can be billed/hours counted:
#SBATCH --account=vr61
# indicate the HPC partition you want to use and the maximum memory (in MB) you want allocated
#SBATCH --partition=m3j
#SBATCH --mem=342000
# Request CPU resources for a serial job: number of tasks and CPUs per task
#SBATCH --ntasks=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=36
# Set your maximum walltime (time limit), format: days-hours:minutes:seconds. This was set to the max allowed time on MASSIVE.
#SBATCH --time=7-00:00:00
# Set the filepath and filename for output log (stdout)
#SBATCH --output=/projects/vr61/logs/hba-sample-dataset-slurm-iteration-01-%u%j.out
# Set the filepath and filename for error log (stderr)
#SBATCH --error=/projects/vr61/logs/hba-sample-dataset-slurm-iteration-01-%u%j.err
# send e-mail to this email address to notify of job updates (started, ended, error, failed):
#SBATCH --mail-user=johnsmith@example.com
#SBATCH --mail-type=ALL
# beginning of code:
module purge
module load ants/20190910 # indicate the ANTs version you want using the module command
FILEDIR=/projects/vr61/hba-sample-dataset-slurm # where your files are located.
TEMPLATE=${FILEDIR}/0p40-sub-01_t1_ACPC.nii.gz
IMAGES=(${FILEDIR}/n4corr-msk_raw-sub-01_ses-01_run-01_acq-mp2rage-wip944_UNI_DEN_defaced.nii.gz
${FILEDIR}/n4corr-msk_raw-sub-01_ses-01_run-02_acq-mp2rage-wip944_UNI_DEN_defaced.nii.gz
${FILEDIR}/n4corr-msk_raw-sub-01_ses-01_run-03_acq-mp2rage-wip944_UNI_DEN_defaced.nii.gz
${FILEDIR}/n4corr-msk_raw-sub-01_ses-01_run-04_acq-mp2rage-wip944_UNI_DEN_defaced.nii.gz)
DIMS=3
GRADIENT=0.1
NUM_CORES=18 #12
NUM_MODS=1
N4BIASCORRECT=0
STEPSIZES=20x15x5 #from Lüsebrink, F., Sciarra, A., Mattern, H., Yakupov, R. & Speck, O. T1-weighted in vivo human whole brain MRI dataset with an ultrahigh isotropic resolution of 250 μm. Scientific Data 4, 170032, https://doi.org/10.1038/sdata.2017.32 (2017).
ITERATIONS=1 #4 from ants paper
ITERATION="01"
###############################
#
# Set number of threads
#
###############################
ORIGINALNUMBEROFTHREADS=${ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS}
ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS=$NUM_CORES
export ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS
########### ants
outputPath=${FILEDIR}
cd $outputPath
outputPath=${outputPath}/TemplateMultivariateBSplineSyN_${STEPSIZES}_iteration_${ITERATION}
#outputPath=${outputPath}/TemplateMultivariateBSplineSyN_${STEPSIZES}
mkdir -p $outputPath
# note: -y below is left at its default setting; an inline comment after a trailing "\" breaks the line continuation, so it is noted here instead
antsMultivariateTemplateConstruction.sh \
-d $DIMS \
-k $NUM_MODS \
-r 0 \
-c 0 \
-m $STEPSIZES \
-n $N4BIASCORRECT \
-s CC \
-t GR \
-i $ITERATIONS \
-g $GRADIENT \
-b 1 \
-o ${outputPath}/T_ \
-y 1 \
-z $TEMPLATE \
${IMAGES[@]}
###############################
#
# Restore original number of threads
#
###############################
ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS=$ORIGINALNUMBEROFTHREADS
export ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS
```
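Before launching Iteration 2, it's worth quality checking the template that Iteration 1 produced (`T_template0.nii.gz` inside the `TemplateMultivariateBSplineSyN_20x15x5_iteration_01` folder, which is the file the Iteration 2 script points its `TEMPLATE` variable at). One way to do this, assuming you have ITK-SNAP installed locally, is to copy it back to your machine and open it:
```bash=1
# run on your local machine: copy the iteration 1 template back, then open it in ITK-SNAP
scp user@m3.massive.org.au:/projects/vr61/hba-sample-dataset-slurm/TemplateMultivariateBSplineSyN_20x15x5_iteration_01/T_template0.nii.gz .
itksnap -g T_template0.nii.gz
```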
ITERATION 2 SCRIPT:
```bash=1
#!/bin/bash
# hba-sample-dataset-slurm-iteration-02.sh
#SBATCH --job-name=hba-sample-dataset-slurm-iteration-02
# indicate your assigned account name so usage can be billed/hours counted:
#SBATCH --account=vr61
# indicate the HPC partition you want to use and the maximum memory (in MB) you want allocated
#SBATCH --partition=m3j
#SBATCH --mem=342000
# Request CPU resources for a serial job: number of tasks and CPUs per task
#SBATCH --ntasks=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=36
# Set your maximum walltime (time limit), format: days-hours:minutes:seconds. This was set to the max allowed time on MASSIVE.
#SBATCH --time=7-00:00:00
# Set the filepath and filename for output log (stdout)
#SBATCH --output=/projects/vr61/logs/hba-sample-dataset-slurm-iteration-02-%u%j.out
# Set the filepath and filename for error log (stderr)
#SBATCH --error=/projects/vr61/logs/hba-sample-dataset-slurm-iteration-02-%u%j.err
# send e-mail to this email address to notify of job updates (started, ended, error, failed):
#SBATCH --mail-user=johnsmith@example.com
#SBATCH --mail-type=ALL
# beginning of code:
module purge
module load ants/20190910 # indicate the ANTs version you want using the module command
FILEDIR=/projects/vr61/hba-sample-dataset-slurm # where your files are located.
TEMPLATE=${FILEDIR}/TemplateMultivariateBSplineSyN_20x15x5_iteration_01/T_template0.nii.gz #use template made from last iteration
IMAGES=(${FILEDIR}/n4corr-msk_raw-sub-01_ses-01_run-01_acq-mp2rage-wip944_UNI_DEN_defaced.nii.gz
${FILEDIR}/n4corr-msk_raw-sub-01_ses-01_run-02_acq-mp2rage-wip944_UNI_DEN_defaced.nii.gz
${FILEDIR}/n4corr-msk_raw-sub-01_ses-01_run-03_acq-mp2rage-wip944_UNI_DEN_defaced.nii.gz
${FILEDIR}/n4corr-msk_raw-sub-01_ses-01_run-04_acq-mp2rage-wip944_UNI_DEN_defaced.nii.gz)
DIMS=3
GRADIENT=0.1
NUM_CORES=18 #12
NUM_MODS=1
N4BIASCORRECT=0
STEPSIZES=20x15x5 #from Lüsebrink, F., Sciarra, A., Mattern, H., Yakupov, R. & Speck, O. T1-weighted in vivo human whole brain MRI dataset with an ultrahigh isotropic resolution of 250 μm. Scientific Data 4, 170032, https://doi.org/10.1038/sdata.2017.32 (2017).
ITERATIONS=1 #4 from ants paper
ITERATION="02"
###############################
#
# Set number of threads
#
###############################
ORIGINALNUMBEROFTHREADS=${ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS}
ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS=$NUM_CORES
export ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS
########### ants
outputPath=${FILEDIR}
cd $outputPath
outputPath=${outputPath}/TemplateMultivariateBSplineSyN_${STEPSIZES}_iteration_${ITERATION}
#outputPath=${outputPath}/TemplateMultivariateBSplineSyN_${STEPSIZES}
mkdir -p $outputPath
# note: -y below is left at its default setting; an inline comment after a trailing "\" breaks the line continuation, so it is noted here instead
antsMultivariateTemplateConstruction.sh \
-d $DIMS \
-k $NUM_MODS \
-r 0 \
-c 0 \
-m $STEPSIZES \
-n $N4BIASCORRECT \
-s CC \
-t GR \
-i $ITERATIONS \
-g $GRADIENT \
-b 1 \
-o ${outputPath}/T_ \
-y 1 \
-z $TEMPLATE \
${IMAGES[@]}
###############################
#
# Restore original number of threads
#
###############################
ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS=$ORIGINALNUMBEROFTHREADS
export ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS
```
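If you want to tack on a third iteration, the pattern is the same: copy the Iteration 2 script, bump the job name, log filenames, and `ITERATION` variable from `02` to `03`, and point `TEMPLATE` at the Iteration 2 output template. As a sketch, these are the only lines that would change:
```bash=1
#SBATCH --job-name=hba-sample-dataset-slurm-iteration-03
#SBATCH --output=/projects/vr61/logs/hba-sample-dataset-slurm-iteration-03-%u%j.out
#SBATCH --error=/projects/vr61/logs/hba-sample-dataset-slurm-iteration-03-%u%j.err
# use the template made by iteration 2 as the starting template for iteration 3
TEMPLATE=${FILEDIR}/TemplateMultivariateBSplineSyN_20x15x5_iteration_02/T_template0.nii.gz
ITERATION="03"
```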