bil-support@psc.edu
/bil/data/hackathon/2022_GYBS/
and is also available over HTTPS here.
/bil/data/hackathon/2022_GYBS/output/
workshop.brainimagelibrary.org
with 2.5TB of memory, 56 cores, and an RTX8000 GPU with 4608 cores and 48GB of memory.

workshop2.brainimagelibrary.org
with 1.5TB of memory, 144 cores (hyperthreaded), and 2 NVIDIA V100 GPUs, each with 5120 cores and 32GB of memory, coupled with NVLink.

You can connect to the workshop and workshop2 VMs over ssh or using X2Go. X2Go is available on the workshop VM (not workshop2).

Connecting to the workshop VM using X2Go

Instructions on how to install X2Go can be found here.
Connecting to the workshop VM using a Terminal

Detailed instructions on how to connect using Terminal can be found here.
Useful Tips and Tricks
Environment Modules provide a convenient way to dynamically change the users' environment through modulefiles.
module avail                  #to list all available modules
module load <module-name>     #to load module <module-name>
module unload <module-name>   #to unload module <module-name>
These are the options you will most likely be using. Full documentation on how to use LMOD can be found here.
module avail

------------------------- /bil/modulefiles -------------------------
ANARI-SDK/anari-sdk       gnu_parallel/20210522
ITK/5.2.1                 gotop/3.3.0
ImageMagick/7.1.0         graphviz/2.44.0
ImageMagick/7.1.0-2 (D)   htslib/1.9
R/3.5.1                   ilastik/1.3.3
R/3.6.3 (D)               imagej-fiji/1.52p
Rust/1.58.1               itksnap/3.8.0
Scala/2.13.5              java/jdk8u201
TeraStitcher/1.10.18      java/jdk8u211
VisRTX/0.2.0              java/jdk8u241 (D)
anaconda/3.2019.7         julia/1.0.5
anaconda3/4.9.2           knime/4.3.2
anaconda3/4.11.0 (D)      lazygit/0.22.9
aspera/3.9.6              matlab/2019a
aws-cli/2.4.17            matlab/2021a (D)
bcftools/1.9              md5deep/4.4
bioformats/6.0.1          ncdu/1.16
bioformats/6.1.1          nextflow/21.10.6
--More--
Matlab 2021a
module load matlab/2021a
matlab -nodesktop -nosplash
MATLAB is selecting SOFTWARE OPENGL rendering.

                        < M A T L A B (R) >
              Copyright 1984-2021 The MathWorks, Inc.
         R2021a Update 5 (9.10.0.1739362) 64-bit (glnxa64)
                          August 9, 2021

To get started, type doc.
For product information, visit www.mathworks.com.

>>
Every user needs to request access to Matlab. To request access, click here.
Anaconda3
module load anaconda3
ipython
Python 3.9.7 (default, Sep 16 2021, 13:09:58)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.29.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]:
Useful Tips and Tricks
When building scripts, add as many calls to LMOD as needed.
For example,
#!/bin/bash
module load bioformats/6.8.0
module load bioformats2raw/0.3.0
module load raw2ometiff/0.3.0
...
loads Bio-Formats as well as some other Glencoe tools.
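Regardless of which modules a batch script loads, it is worth enabling bash's strict mode near the top so a failed module load or command stops the job early instead of silently continuing. A general-purpose sketch (not BIL-specific advice from the slides):

```shell
#!/bin/bash
# Abort on any error, on use of unset variables, and on failures
# inside pipelines, so a bad step does not silently continue.
set -euo pipefail

status="strict mode enabled"
echo "$status"
```

With strict mode on, a typo in a module name or a missing input file fails the job immediately, which makes the Slurm output file much easier to debug.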
LMOD is used to load software in the workshop VM and the L-nodes.

Slurm is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters.
sinfo    #view information about Slurm nodes and partitions
squeue   #view information about jobs located in the Slurm scheduling queue
scontrol #view or modify Slurm configuration and state
sbatch   #submit a batch script to Slurm
The commands above are the most common ones you will use during this hackathon. For full documentation about SLURM, click here.
sinfo - Example 1

sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
compute*     up 2-00:00:00      1  drain l008
compute*     up 2-00:00:00      7   idle l[001-007]
As a participant of this hackathon, you should have access to the partition compute
using the reservation hackathon
.
squeue - Example 1

Use squeue -u $(whoami) to list your jobs and their status.

squeue -u $(whoami)
  JOBID PARTITION     NAME     USER ST   TIME  NODES NODELIST(REASON)
  14243   compute script.s icaoberg  R  15:34
sbatch - Example 1

Consider the following file named script.sh

cat script.sh
#!/bin/bash
module load anaconda3
pip install --user cowsay
cowsay "Hello, World"
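One caveat with `pip install --user`: on Linux it places console scripts under `~/.local/bin`, which is not always on the PATH inside a batch job. A defensive line you can add near the top of such a script (an assumption about your shell setup; adjust as needed):

```shell
# Make user-level pip installs (e.g. the cowsay executable) visible
# to the rest of the script.
export PATH="$HOME/.local/bin:$PATH"
```

If `cowsay` is reported as "command not found" in your Slurm output file, this is the most likely cause.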
sbatch is used to submit jobs to the scheduler. For more info on sbatch, click here.
sbatch - Example 1 (cont.)

Remember to use the reservation hackathon when submitting a job to the scheduler.
sbatch -p compute -A tra220018p --reservation=hackathon script.sh
Submitted batch job 82721
For more info on sbatch, click here.
sbatch - Example 1 (cont.)

If you do not specify an output filename, the scheduler will create a file automatically. In this example, slurm-82721.out:
cat slurm-82721.out
 ______________
< Hello, World >
 --------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
For more info on sbatch, click here.
sbatch - Example 2

sbatch -p compute -N1 script.sh            #number of nodes - please avoid using!
sbatch -p compute -n1 script.sh #number of cores
sbatch -p compute --mem=64Gb script.sh #memory
sbatch -p compute -N1 -n10 --mem=128Gb script.sh #combine as needed
For more info on sbatch, click here.
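The same resources can also be requested inside the script itself with #SBATCH directives, which sbatch reads from comment lines at the top of the file. A sketch (the values are illustrative, not recommendations):

```shell
#!/bin/bash
#SBATCH -p compute                  # partition
#SBATCH -A tra220018p               # account
#SBATCH --reservation=hackathon     # reservation for this event
#SBATCH -n 10                       # number of cores
#SBATCH --mem=128Gb                 # memory
#SBATCH -o myjob-%j.out             # output file; %j expands to the job id

msg="Running on $(hostname)"
echo "$msg"
```

A script like this can then be submitted with a plain `sbatch script.sh`; options given on the command line override the directives in the file.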
scancel - Example 1

scancel -u $(whoami) #cancel all my jobs
scancel -u <username> #cancel username's jobs
scancel 1234 #cancel job 1234
For more info on scancel, click here.
LMOD is used to load software in the workshop VM and the L-nodes.
SLURM is used to submit jobs to the scheduler managing the large-memory nodes.

interact
The interact command is an in-house script for starting interactive sessions.
> interact -h
Usage: interact [OPTIONS]
-d Turn on debugging information
--debug
--noconfig Do not process config files
-gpu Allocate 1 gpu in the GPU-shared partition
--gpu
--gres=<list> Specifies a comma delimited list of generic
consumable resources. e.g.: --gres=gpu:1
--mem=<MB> Real memory required per node in MegaBytes
...
interact (cont.) Useful Tips and Tricks

interact is a wrapper built in house. Use interact and avoid using salloc or srun on BIL hardware.

interact -A tra220018p -p compute -R hackathon -n <number-of-cores> --mem=<memory>

where tra220018p is the account and hackathon is the reservation for this event.
LMOD is used to load software in the workshop VM and the L-nodes.
SLURM is used to submit jobs to the scheduler managing the large-memory nodes.
interact is used to start interactive sessions on the large-memory nodes.

Singularity's command line interface, singularity, allows you to build and interact with containers.
Singularity definition file
Singularity image file

The Singularity definition file includes instructions similar to those used to install the software on your local system. Generally speaking, just follow the developers' instructions.
Bootstrap: docker
From: debian:stretch

%environment
    export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/

%post
    apt update
    apt install -y libblosc1 wget unzip openjdk-8-jdk
    cd /opt/
    wget -nc https://github.com/glencoesoftware/bioformats2raw/releases/download/v0.3.0/bioformats2raw-0.3.0.zip
    unzip bioformats2raw-0.3.0.zip && rm -f bioformats2raw-0.3.0.zip
    ln -s /opt/bioformats2raw-0.3.0/bin/bioformats2raw /usr/local/bin/bioformats2raw
    apt remove -y wget unzip
    apt clean

%runscript
    /usr/local/bin/bioformats2raw
If you want to see the repository with all the recipes, then click here.
Bootstrap: docker
From: debian:stretch

Bootstrap determines the bootstrap agent that will be used to create the base operating system you want to use. For example:

Bootstrap: docker
From: ubuntu:18.04
%environment
    export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/

The %environment section lets the user define environment variables needed by the containerized app to run properly, in this case JAVA_HOME.

%post
    apt update
    apt install -y libblosc1 wget unzip openjdk-8-jdk
    cd /opt/
    wget -nc https://github.com/glencoesoftware/bioformats2raw/releases/download/v0.3.0/bioformats2raw-0.3.0.zip
    unzip bioformats2raw-0.3.0.zip && rm -f bioformats2raw-0.3.0.zip
    ln -s /opt/bioformats2raw-0.3.0/bin/bioformats2raw /usr/local/bin/bioformats2raw
    apt remove -y wget unzip
    apt clean
The %post section is where you can install and configure your container. Here we install bioformats2raw in /opt and then soft-link it to /usr/local/bin/.

%runscript
    /usr/local/bin/bioformats2raw
The %runscript section defines the binary/script to be run when the container is invoked, similar to the ENTRYPOINT section in a Dockerfile.

For simplicity, this Singularity definition file is available on GitHub:

git clone git@github.com:pscedu/singularity-bioformats2raw.git
cd singularity-bioformats2raw/3.0.0/
singularity build --remote bioformats2raw.sif Singularity
Running the previous command should build the image remotely.
singularity build --remote bioformats2raw.sif Singularity
INFO:    Remote "cloud.sylabs.io" added.
INFO:    Access Token Verified!
INFO:    Token stored in /root/.singularity/remote.yaml
INFO:    Remote "cloud.sylabs.io" now in use.
INFO:    Starting build...
Getting image source signatures
Copying blob sha256:0030cc4ce25ce472fe488839def15ec8f2227bb916461b518cf534073c019a86
Copying config sha256:d8d0f98475c05ca0009ed1c2c4bad86f243ef7f80788fad7f9d6dc0c9ca58d03
Writing manifest to image destination
...
INFO:    Adding labels
INFO:    Creating SIF file...
INFO:    Build complete: /tmp/image-2973854208
WARNING: Skipping container verification
INFO:    Uploading 377360384 bytes
INFO:    Build complete: bioformats2raw.sif
If successful, then the command will build and download the image from SyLabs.io.
ls -lta *.sif
-rwxr-xr-x 1 icaoberg pscstaff 377360384 Apr 2 02:10 bioformats2raw.sif
To test this simple container, let's invoke the --help option.
singularity exec -B /bil/ bioformats2raw.sif bioformats2raw --help
It works!
singularity exec -B /bil/ bioformats2raw.sif bioformats2raw --help
Missing required parameters: '<inputPath>', '<outputLocation>'
Usage: <main class> [-p] [--no-hcs] [--[no-]nested] [--no-ome-meta-export]
                    [--no-root-group] [--overwrite] [--use-existing-resolutions]
                    [--version] [--debug[=<logLevel>]]
                    [--extra-readers[=<extraReaders>[,<extraReaders>...]]]...
                    [--options[=<readerOptions>[,<readerOptions>...]]]...
                    [-s[=<seriesList>[,<seriesList>...]]]...
...
  -w, --tile_width=<tileWidth>     Maximum tile width to read (default: 1024)
  -z, --chunk_depth=<chunkDepth>   Maximum chunk depth to read (default: 1)
  -p, --progress                   Print progress bars during conversion
      --version                    Print version information and exit
# use this command to pull images from DockerHub and convert them
# to Singularity image files
singularity pull docker://openmicroscopy/bioformats2raw:0.4.0
Be careful when downloading or pulling random containers from the cloud. Only do so from trusted organizations, e.g. PSC, or trusted or official collaborators/companies.
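One lightweight safeguard, regardless of where an image comes from, is to record a checksum right after pulling it and verify the checksum before each later use. A sketch with sha256sum (the .sif here is a stand-in file created for illustration, not a real image):

```shell
# Stand-in for a freshly pulled image (illustrative only).
printf 'pretend image bytes' > example.sif

# Record the checksum once, right after the pull...
sha256sum example.sif > example.sif.sha256

# ...and verify it before every later use; a non-zero exit status
# means the file was corrupted or replaced.
sha256sum -c example.sif.sha256
```

This does not tell you the image is trustworthy, only that it has not changed since you first vetted it.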
# Version 0.4.0 is released!
singularity exec -B /bil bioformats2raw_0.4.0.sif /opt/bioformats2raw/bin/bioformats2raw --help
Missing required parameters: '<inputPath>', '<outputLocation>'
Usage: <main class> [-p] [--no-hcs] [--[no-]nested] [--no-ome-meta-export]
                    [--no-root-group] [--overwrite] [--use-existing-resolutions]
                    [--version] [--debug ...

Note that in this container you have to call bioformats2raw by its full path, while in mine I didn't have to.

Consider the file example.sh
#!/bin/bash
shopt -s expand_aliases
alias bioformats2raw='singularity exec -B /bil bioformats2raw.sif bioformats2raw'

FILE=/bil/data/84/c1/84c11fe5e4550ca0/SW170711-02A/SW170711-02A_4_06.tif
OUTPUT=SW170711-02A_4_06.zarr

bioformats2raw $FILE $OUTPUT --resolutions 6 --tile_width 128 --tile_height 128
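A side note on the `shopt -s expand_aliases` line above: bash only expands aliases in interactive shells by default, so a script that defines aliases must turn expansion on explicitly. A tiny self-contained illustration (the alias name and message are made up for the demo):

```shell
#!/bin/bash
shopt -s expand_aliases            # without this, 'greet' below is not found
alias greet='msg="hello from an alias"'

greet                              # expands because alias expansion is on
echo "$msg"
```

Dropping the shopt line makes the script fail with "greet: command not found", which is exactly why example.sh sets it before aliasing bioformats2raw.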
# remember to use the account and reservation
sbatch -p compute -A tra220018p --reservation=hackathon -n 2 --mem=16Gb example.sh
Submitted batch job 82803
Let's take a look at the status of my job
squeue -u icaoberg
  JOBID PARTITION     NAME     USER ST   TIME  NODES NODELIST(REASON)
  82803   compute example. icaoberg  R   0:03      1 l005
Let's look at the output
cat slurm-82803.out
2022-04-02 06:44:10,549 [main] WARN loci.formats.Memoizer - skipping memo: directory not writeable - /bil/data/84/c1/84c11fe5e4550ca0/SW170711-02A
2022-04-02 06:44:10,836 [main] WARN loci.formats.Memoizer - skipping memo: directory not writeable - /bil/data/84/c1/84c11fe5e4550ca0/SW170711-02A
...
and wait…
Of course! You can use as many Singularity image files as needed. Consider this example,
# copy raw2ometiff Singularity container to current working directory
cp /bil/data/hackathon/2022_GYBS/src/PSC/icaoberg/singularity/singularity-raw2ometiff-3.0.0.sif raw2ometiff.sif
Now I have 2 Singularity images in my current working directory
ls -lta *.sif
-rwxr-xr-x 1 icaoberg pscstaff 258281472 Apr  2 02:48 raw2ometiff.sif
-rwxr-xr-x 1 icaoberg pscstaff 377360384 Apr  2 02:10 bioformats2raw.sif
one for raw2ometiff
and another for bioformats2raw
.
Consider the updated file, example.sh
#!/bin/bash
shopt -s expand_aliases
alias bioformats2raw='singularity exec -B /bil bioformats2raw.sif bioformats2raw'
alias raw2ometiff='singularity exec -B /bil raw2ometiff.sif raw2ometiff'
FILE=/bil/data/84/c1/84c11fe5e4550ca0/SW170711-02A/SW170711-02A_4_06.tif
OUTPUT=SW170711-02A_4_06.zarr
bioformats2raw $FILE $OUTPUT --resolutions 6 --tile_width 128 --tile_height 128
OUTPUT_IMAGE=SW170711-02A_4_06.ome.tiff
raw2ometiff $OUTPUT $OUTPUT_IMAGE
You can submit the script example.sh
using the command
sbatch -p compute -A tra220018p --reservation=hackathon -n 2 --mem=16Gb example.sh
and wait…
When the script is done, you should find an OME-TIFF on disk
file SW170711-02A_4_06.ome.tiff
SW170711-02A_4_06.ome.tiff: Big TIFF image data, big-endian

du -h SW170711-02A_4_06.ome.tiff
723M	SW170711-02A_4_06.ome.tiff
Useful Tips and Tricks
circos - Example 1

Bootstrap: docker
From: perl:5.32.1
%environment
export LANGUAGE=en_US.UTF-8
export LC_ALL=C
%post
export DEBIAN_FRONTEND=noninteractive
apt update && apt-get install -y locales libipc-run3-perl libgd-dev
locale-gen en_US.UTF-8
cpan install Math::Round
cpan install Font::TTF::Font
cpan install Config::General
cpan install Clone
cpan install GD::Polyline
cpan install Math::Bezier
cpan install GD
cpan install List::MoreUtils
cpan install Params::Validate
cpan install Readonly
cpan install Math::VecStat
cpan install Statistics::Basic
cpan install Set::IntSpan
cpan install Regexp::Common
cpan install SVG
cpan install Text::Format
cd /opt
wget http://circos.ca/distribution/circos-0.69-9.tgz
tar -xvf circos-0.69-9.tgz && rm -f circos-0.69-9.tgz
ln -s $(pwd)/circos-0.69-9/bin/circos /usr/local/bin/circos
mc - Example 2

Bootstrap: docker
From: alpine:edge
%post
apk update
apk add mc
cwltool - Example 3

Bootstrap: docker
From: debian:buster
%post
apt update
apt install -y python3 python3-pip nodejs
pip3 install cwltool==3.1.20220210171524 cwlref-runner
spades - Example 4

Bootstrap: docker
From: quay.io/biocontainers/spades:3.15.3--h95f258a_0
%labels
MAINTAINER icaoberg
EMAIL icaoberg@psc.edu
SUPPORT help@psc.edu
REPOSITORY http://github.com/pscedu/singularity-spades
COPYRIGHT Copyright © 2021 Pittsburgh Supercomputing Center. All Rights Reserved.
VERSION 3.15.3
visidata - Example 5

Bootstrap: docker
From: python:3.8-alpine
%environment
export TERM="xterm-256color"
%post
apk update
apk add git
pip install requests python-dateutil wcwidth tabulate
mkdir -p /opt/visidata
git clone https://github.com/saulpw/visidata.git
cd visidata
sh -c 'yes | pip install -vvv .'
rm -rfv visidata
Want to see a list of curated Singularity definition files maintained by the Pittsburgh Supercomputing Center? Click here.
LMOD is used to load software in the workshop VM and the L-nodes.
SLURM is used to submit jobs to the scheduler managing the large-memory nodes.
interact is used to start interactive sessions on the large-memory nodes.
Singularity allows you to create and run containers that package up pieces of software in a way that is portable and reproducible.

interact -A tra220018p -p compute -R hackathon -n 2 --mem=16Gb
module load anaconda3
pip install --user cwltool cwlref-runner
pip install --user cowsay #needed for this exercise
The files needed for this exercise can be found here.
cowsay
cowsay "Hello, World\!"
_______________
< Hello, World! >
---------------
\ ^__^
\ (oo)\_______
(__)\ )\/\
||----w |
|| ||
cowsay (cont.)

For a basic workflow that uses cowsay, create the file cowsay.cwl
#!/usr/bin/env cwl-runner
cwlVersion: v1.2
class: CommandLineTool
baseCommand: cowsay
inputs:
message:
type: string
inputBinding:
position: 1
outputs: []
CWL documents are written either in YAML or JSON. For example, we can create the input file message.cwl
message: Hello world!
cowsay (cont.)

cwltool cowsay.cwl message.cwl
INFO /bil/packages/anaconda3/4.11.0/bin/cwltool 3.1.20220210171524
INFO Resolved 'cowsay.cwl' to 'file:///bil/users/icaoberg/code/singularity-cowsay/3.04/cowsay.cwl'
INFO [job cowsay.cwl] /tmp/l7knmpt3$ cowsay \
'Hello world!'
 ______________
< Hello world! >
 --------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
INFO [job cowsay.cwl] completed success
{}
INFO Final process status is success
cowsay on Docker

This step cannot run on Brain Image Library hardware since we do not support Docker.
Consider the following Dockerfile
FROM ubuntu:latest
RUN apt-get update && apt-get install -y cowsay --no-install-recommends && rm -rf /var/lib/apt/lists/*
ENV PATH $PATH:/usr/games
CMD ["cowsay"]
cowsay on Docker (cont.)

The file above creates a Docker image with the cowsay binary. It can be built and pushed using the commands
docker build -t icaoberg/cowsay .
docker push icaoberg/cowsay
cowsay on Docker (cont.)

This is a dummy example, but technically there now exists a container with cowsay on my account in DockerHub. Now I can recycle the CWL workflow from before and have it pull the container from DockerHub by adding the lines
hints:
DockerRequirement:
dockerPull: icaoberg/cowsay
cowsay on Docker (cont.)

The file cowsay.cwl can be updated to include the previous lines
#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: CommandLineTool
requirements:
SubworkflowFeatureRequirement: {}
DockerRequirement:
dockerPull: icaoberg/cowsay
baseCommand: cowsay
inputs:
message:
type: string
inputBinding:
position: 1
outputs: []
cowsay on uDocker

Just like with Singularity, you cannot build Docker containers on BIL hardware. However, if the Docker containers were built properly, you can use uDocker to execute containerized Docker apps on BIL.
module load anaconda3
pip install --user uDocker
cwltool --user-space-docker-cmd=udocker --debug cowsay.cwl message.yml
cowsay on uDocker (cont.)

 ******************************************************************************
* *
* STARTING b8530cb4-5887-3c41-9a73-71cf189a90f4 *
* *
******************************************************************************
executing: cowsay
______________
< Hello world! >
--------------
\ ^__^
\ (oo)\_______
(__)\ )\/\
||----w |
|| ||
INFO [job cowsay2.cwl] Max memory used: 17MiB
INFO [job cowsay2.cwl] completed success
DEBUG [job cowsay2.cwl] outputs {}
DEBUG [job cowsay2.cwl] Removing input staging directory /tmp/m5ppaxkq
DEBUG [job cowsay2.cwl] Removing temporary directory /tmp/npq32ztj
DEBUG Removing intermediate output directory /tmp/u5867sg7
cowsay on Singularity

The main issue is that most HPC clusters do not support Docker and prefer Singularity or Apptainer. However, if the Docker image in DockerHub has a proper entrypoint, then you can simply use the --singularity option to ask cwltool to convert the Docker image to Singularity.

If the Docker image does not have a proper entrypoint, this step might fail if you are not aware of how the image was built.
Using the option
cwltool --singularity cowsay2.cwl message.cwl
will run the workflow.
LMOD is used to load software in the workshop VM and the L-nodes.
SLURM is used to submit jobs to the scheduler managing the large-memory nodes.
interact is used to start interactive sessions on the large-memory nodes.
Singularity allows you to create and run containers that package up pieces of software in a way that is portable and reproducible.
cwltool can be used to create complex workflows for data processing.
As many as needed. You can expose as many parameters as your tool accepts, and you can set default values as well. Consider cowsay; it takes other input parameters
cowsay(6) Games Manual cowsay(6)
NAME
cowsay/cowthink - configurable speaking/thinking cow (and a bit more)
SYNOPSIS
       cowsay [-e eye_string] [-f cowfile] [-h] [-l] [-n] [-T tongue_string]
              [-W column] [-bdgpstwy]
#!/usr/bin/env cwl-runner
cwlVersion: v1.2
class: CommandLineTool
requirements:
SubworkflowFeatureRequirement: {}
DockerRequirement:
dockerPull: icaoberg/cowsay
baseCommand: "cowsay"
inputs:
message:
type: string
inputBinding:
position: 2
format:
type: string
inputBinding:
position: 1
prefix: -f
default: "flaming-sheep"
outputs: []
cat message2.yml
message: Hello world!
format: flaming-sheep
cwltool --singularity cowsay3.cwl message2.yml
______________
< Hello world! >
--------------
\ . . .
\ . . . ` ,
\ .; . : .' : : : .
\ i..`: i` i.i.,i i .
\ `,--.|i |i|ii|ii|i:
UooU\.'@@@@@@`.||'
\__/(@@@@@@@@@@)'
(@@@@@@@@)
`YY~~~~YY'
|| ||
INFO [job cowsay3.cwl] completed success
{}
INFO Final process status is success
cwlVersion: v1.2
class: CommandLineTool
requirements:
SubworkflowFeatureRequirement: {}
DockerRequirement:
dockerPull: icaoberg/bioformats2raw:0.4.0
dockerOutputDirectory: /opt/bioformats2raw
inputs:
inputImage:
type: File
inputBinding:
position: 1
outputDirectory:
type: Directory
inputBinding:
position: 2
default: zarr
resolutions:
type: int
inputBinding:
position: 3
prefix: --resolutions
default: 6
tile_width:
type: int
inputBinding:
position: 4
prefix: --tile_width
default: 128
tile_height:
type: int
inputBinding:
position: 5
prefix: --tile_height
default: 128
outputs:
zarr_image:
type: Directory
outputBinding:
glob: $(inputs.outputDirectory)
baseCommand: ['bioformats2raw']
stdout: bioformats2raw.out