# Running fmriprep or mriqc on a cluster
## Connect to the cluster
Open your terminal and type:
```bash
ssh <username>@graham.computecanada.ca # Graham login node
```
### Add your ssh key
SSH key pairs let you log in without typing your password every time.
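A minimal sketch of setting one up, assuming OpenSSH on your local machine (the email in `-C` is just a label for the key):
```bash
# on your LOCAL machine: generate a key pair (accept the default path, choose a passphrase)
ssh-keygen -t ed25519 -C "your_email@example.com"
# copy the public key to the cluster; later logins use the key instead of the password
ssh-copy-id <username>@graham.computecanada.ca
```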
## Modules
There are pre-installed software modules that you need to load before you can use them. To list all available modules:
``module avail``
To load a module (you can put this in your `.bashrc` if you need it in every session):
``module load <module_name>``
Example: check whether git is available, then load it:
```bash
module avail git
module load apps/git/2.13.0
```
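Compute Canada clusters use the Lmod module system, so two related commands are also available:
```bash
module spider git   # search the whole module tree, including versions hidden behind dependencies
module list         # show the modules currently loaded in your session
```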
## Transfer files
### Use scp to copy individual files (add `-r` for directories)
```bash
# local machine -> cluster
scp <filename> <username>@graham.computecanada.ca:<PATH/TO/FILE>
# cluster -> local machine
scp <username>@graham.computecanada.ca:<PATH/TO/FILE> <LocalPath>
```
### Use rsync to sync files or directories
```bash
# local machine -> cluster (-a recurses and preserves permissions/times, -v is verbose, -z compresses in transit)
rsync -avz <LocalPath/filename> <username>@graham.computecanada.ca:<PATH/TO/FILE>
# cluster -> local machine
rsync -avz <username>@graham.computecanada.ca:<PATH/TO/FILE> <LocalPath>
```
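For a large BIDS dataset it is worth previewing the transfer before running it for real; a sketch with hypothetical paths:
```bash
# --dry-run lists what would be transferred without copying anything
rsync -avz --dry-run ~/data/raw/ <username>@graham.computecanada.ca:projects/def-flepore/<username>/raw/
```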
### Use datalad
Install DataLad on the cluster:
```bash
module load git-annex python/3
virtualenv ~/venv_datalad
source ~/venv_datalad/bin/activate
pip install datalad
```
See: https://cbs-discourse.uwo.ca/t/installing-datalad-on-compute-canada/23
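To check that the installation worked (with the virtualenv activated):
```bash
source ~/venv_datalad/bin/activate
datalad --version   # should print the installed DataLad version
```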
## Running jobs
### Create job script
Here is an example of a simple bash script:
```bash
#!/bin/bash
# maximum walltime (HH:MM:SS) and the allocation account to charge
#SBATCH --time=00:05:00
#SBATCH --account=def-flepore
echo 'Hello, world!'
sleep 20
```
### Submit job
From a terminal on the cluster:
```bash
sbatch <name of the file>
```
Example (`sbatch` replies with the job ID):
```bash
sbatch simple.sh
Submitted batch job 65869853
```
### Check job status
Use `squeue`, or the Compute Canada shortcut `sq` (an alias for `squeue -u $USER` that lists only your own jobs):
```bash
sq
JOBID USER ACCOUNT NAME ST TIME_LEFT NODES CPUS TRES_PER_N MIN_MEM NODELIST (REASON)
65869853 mmaclean def-flepore_cpu simple.sh PD 5:00 1 1 N/A 256M (Priority)
```
To get an email when your job starts, ends, or fails, add the following at the top of your script (`--mail-type=ALL` alone subsumes the individual event types):
```bash
#SBATCH --mail-user=michele.maclean@umontreal.ca
#SBATCH --mail-type=BEGIN
#SBATCH --mail-type=END
#SBATCH --mail-type=FAIL
#SBATCH --mail-type=REQUEUE
#SBATCH --mail-type=ALL
```
### Cancel job
```bash
scancel <jobid>
# example:
scancel 65869853
```
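To cancel every job you have queued or running at once:
```bash
scancel -u $USER   # cancels ALL of your jobs; use with care
```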
### Where does the output go
By default the output goes to a file named `slurm-<jobid>.out` (e.g. `slurm-65869853.out`) in the directory from which the job was submitted. Having the job ID in the file name is convenient for troubleshooting.
Any files your script itself creates are written to the locations specified in your bash script.
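To send the log somewhere else or rename it, SLURM's `--output` option accepts a filename pattern; a minimal sketch:
```bash
# %x expands to the job name and %j to the job ID, e.g. fmriprep-65869853.out
#SBATCH --output=%x-%j.out
```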
## Tips
1. Run your jobs from the `$SCRATCH` filesystem; it is much faster than `$HOME` for heavy I/O.
1. Keep a working directory so that if a job crashes you don't have to start over from the beginning.
1. Check your usage and the space left on the cluster with `diskusage_report`.
1. Delete files from scratch once jobs are complete if you need the space; fmriprep output is quite heavy (see the sketch after this list).
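A minimal sketch of checking and reclaiming scratch space (the working-directory path is hypothetical):
```bash
diskusage_report               # per-filesystem quota and usage summary
du -sh $SCRATCH/work-fmriprep  # size of one working directory
rm -rf $SCRATCH/work-fmriprep  # remove it once the job has finished successfully
```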
## Running Singularity on a cluster
Download the fmriprep and mriqc containers from the ReproNim containers repository: https://github.com/ReproNim/containers
1. Create a directory:
```bash
mkdir parallel_analysis
```
2. Install the containers dataset from ReproNim:
```bash
cd parallel_analysis
datalad install https://github.com/ReproNim/containers.git
```
3. Retrieve the container you want, e.g., fmriprep:
```bash
datalad get containers/images/bids/bids-fmriprep--21.0.1.sing
```
You might need to unlock the container before you can use it:
```bash
datalad unlock containers/images/bids/bids-fmriprep--21.0.1.sing
```
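As a quick sanity check that the image runs at all, you can ask it for its version; the ReproNim images pass arguments through to the tool inside (assuming the same singularity module used in the job scripts below):
```bash
module load singularity/3.8
singularity run containers/images/bids/bids-fmriprep--21.0.1.sing --version
```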
### Run fmriprep on the cluster
Make sure your FreeSurfer license file is on the cluster; fmriprep requires it (passed with `--fs-license-file` below).
Here is an example script:
```bash
#!/bin/bash
#-------------------------------------------
#SBATCH -J fmriprep
#SBATCH --account=def-flepore
#SBATCH --time=15:00:00
# resources: 1 task, 4 CPUs, 8 GB per CPU = 32 GB total
#SBATCH -n 1
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=8G
#SBATCH --mail-user=michele.maclean@umontreal.ca
#SBATCH --mail-type=BEGIN
#SBATCH --mail-type=END
#SBATCH --mail-type=FAIL
#SBATCH --mail-type=REQUEUE
#SBATCH --mail-type=ALL
# ------------------------------------------
# activate the DataLad virtualenv and load the required modules
source ~/venv_datalad/bin/activate
module load git-annex/8.20200810
module load freesurfer/5.3.0
module load singularity/3.8
cd
# --cleanenv keeps host environment variables out of the container;
# -B binds host directories to paths visible inside the container
singularity run --cleanenv \
-B /home/mmaclean/scratch:/scratch \
-B /home/mmaclean/projects/def-flepore/mmaclean:/mmaclean \
/home/mmaclean/projects/def-flepore/mmaclean/parallel_analysis/containers/images/bids/bids-fmriprep--21.0.1.sing \
/mmaclean/raw /mmaclean/fmriprep-output \
participant --participant-label CTL01 \
--work-dir /scratch/work-fmriprep \
--fs-license-file /mmaclean/license/freesurfer.txt \
--output-spaces MNI152NLin2009cAsym T1w \
--skip_bids_validation --notrack --stop-on-first-crash
```
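To process several subjects in parallel, one common pattern is to replace the hard-coded label with a script argument (i.e. `--participant-label "$1"`) and submit one job per subject. A sketch, where `fmriprep_job.sh` is a hypothetical name for the modified script:
```bash
# each subject gets its own SLURM job
for sub in CTL01 CTL02 CTL03; do
  sbatch fmriprep_job.sh "$sub"
done
```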
### Run mriqc on the cluster
Here is an example script:
```bash
#!/bin/bash
#-------------------------------------------
#SBATCH -J mriqc
#SBATCH --account=def-flepore
#SBATCH --time=5:00:00
# resources: 1 task, 8 CPUs, 10 GB per CPU = 80 GB total
#SBATCH -n 1
#SBATCH --cpus-per-task=8
#SBATCH --mem-per-cpu=10G
#SBATCH --mail-user=michele.maclean@umontreal.ca
#SBATCH --mail-type=BEGIN
#SBATCH --mail-type=END
#SBATCH --mail-type=FAIL
#SBATCH --mail-type=REQUEUE
#SBATCH --mail-type=ALL
# ------------------------------------------
# activate the DataLad virtualenv and load the required modules
source ~/venv_datalad/bin/activate
module load git-annex/8.20200810
module load freesurfer/5.3.0
module load singularity/3.8
cd
# same bind mounts as the fmriprep script above
singularity run --cleanenv \
-B /home/mmaclean/scratch:/scratch \
-B /home/mmaclean/projects/def-flepore/mmaclean:/mmaclean \
/home/mmaclean/projects/def-flepore/mmaclean/parallel_analysis/containers/images/bids/bids-mriqc--0.16.1.sing \
/mmaclean/raw /mmaclean/mriqc \
participant --participant-label CTL01 CTL02 CTL03 \
-w /scratch/work-mriqc \
--no-sub
```
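After all the participant-level jobs finish, mriqc can also aggregate the individual results into a group report; a sketch reusing the same container and bind mounts (run it inside a job script the same way):
```bash
singularity run --cleanenv \
-B /home/mmaclean/scratch:/scratch \
-B /home/mmaclean/projects/def-flepore/mmaclean:/mmaclean \
/home/mmaclean/projects/def-flepore/mmaclean/parallel_analysis/containers/images/bids/bids-mriqc--0.16.1.sing \
/mmaclean/raw /mmaclean/mriqc \
group -w /scratch/work-mriqc --no-sub
```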