# Elevating Scientific Computing with Singularity Containers
Please use this guide to follow along with this hands-on experience.
Before we begin, it is important to charge the proper allocation
```
change_primary_group cis230059p
```
and clone the exercises
```
cd /ocean/projects/cis230059p/$(whoami)
git clone https://github.com/pscedu/workflow-examples.git
```
## Part 1: Elevating Scientific Computing with Singularity Containers
### Exercise 0. Enabling the remote builder on Sylabs.io.
To enable the remote builder on Sylabs.io we need to follow these steps
* Create an account on Sylabs.io. Click `Login`
![](https://hackmd.io/_uploads/SJ5njFOxp.png)
and `Sign up`
<img src="https://hackmd.io/_uploads/ry_SFTKgT.png" width="50%"/>
* Click `Access Tokens` on the left menu
![](https://hackmd.io/_uploads/ryTN3Ydxp.png)
* Click `Create a New Access Token`
<img src="https://hackmd.io/_uploads/Hk70tpFla.png" width="100%" />
* Add a label and click `Create Access Token`
![](https://hackmd.io/_uploads/rkor5pKe6.png)
* Click `Copy token to Clipboard`
![](https://i.imgur.com/ztbbC8B.png)
* Log in to Bridges-2 and run the command
```
interact -p RM-shared --mem=20000Mb -n 10 --account cis230059p --time "02:00:00" --reservation workshop
```
to start an interactive session, then
```
singularity remote login
```
* Paste the token and press `Enter`.
![](https://hackmd.io/_uploads/B1uRq6YgT.png)
:::warning
:warning: If you are constantly building containers remotely, make sure to erase them from your account to avoid running out of space.
:::
#### `singularity pull`
The `singularity pull` command is used to download container images from a specified source, making it an essential tool for acquiring pre-built containers or custom images. This command simplifies the process of obtaining and using Singularity containers by pulling them directly onto your local system, ready for deployment in high-performance computing environments or other computing platforms.
Run the command
```
singularity pull shub://vsoch/hello-world
```
#### `singularity inspect`
The `singularity inspect` command allows users to gain valuable insights into the contents and metadata of a Singularity container image. By using this command, you can explore details such as environment variables, labels, and run scripts within the container. `singularity inspect` is a powerful tool for understanding the characteristics of a container image before execution, aiding in the configuration and optimization of your workflows.
Run the command
```
singularity inspect hello-world_latest.sif
```
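`singularity inspect` also accepts flags to show specific sections of the metadata; a few useful ones are sketched below:

```shell
# Show only the runscript embedded in the image
singularity inspect --runscript hello-world_latest.sif

# Show the definition file used to build the image, if available
singularity inspect --deffile hello-world_latest.sif

# Show the labels as JSON for easier parsing in scripts
singularity inspect --json hello-world_latest.sif
```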
#### `singularity run`
The `singularity run` command is the gateway to executing applications within a Singularity container. By employing this command, users can seamlessly run programs encapsulated in a container without the need to enter the container environment explicitly. It simplifies the execution process, making Singularity containers easily integrable into various computing workflows, ensuring a straightforward and efficient experience for running applications in diverse computing environments.
Run the command
```
singularity run hello-world_latest.sif
```
#### `singularity cache clean`
The `singularity cache clean` command serves as a housekeeping tool for managing the Singularity cache. By using this command, users can efficiently clear and manage cached container images, freeing up storage space. This is particularly useful for maintaining a streamlined and organized environment, ensuring that only the necessary container images are retained locally. The `singularity cache clean` command helps optimize disk usage and contributes to a more efficient management of Singularity container resources.
```
singularity cache clean
```
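Before cleaning, it can be useful to see what is actually cached; a minimal sketch:

```shell
# List cached images and their sizes
singularity cache list

# Remove all cached items without prompting for confirmation
singularity cache clean --force
```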
#### `singularity pull` from DockerHub
The `singularity pull` command not only allows users to download Singularity container images but also extends its capabilities to pull images directly from DockerHub. This versatile command streamlines the acquisition of container images, whether they are Singularity-specific or sourced from DockerHub. By using `singularity pull`, users can seamlessly fetch pre-built containers or customized images from DockerHub, making it a powerful tool for accessing a wide range of containerized applications and environments.
For example, to build a Singularity container with a barebones copy of [Ubuntu 22.04](https://hub.docker.com/layers/library/ubuntu/22.04/images/sha256-965fbcae990b0467ed5657caceaec165018ef44a4d2d46c7cdea80a9dff0d1ea?context=explore) run
```
singularity pull docker://ubuntu:22.04
```
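To confirm the pulled image really is Ubuntu 22.04, we can run a command inside it without shelling in:

```shell
# Print the OS identification from inside the container
singularity exec ubuntu_22.04.sif cat /etc/os-release
```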
Some of the benefits of building Singularity containers from Docker images include
1. **Portability:** DockerHub is a widely used and central registry for Docker images. By converting Docker images to Singularity containers, you enhance the portability of your applications, making them accessible across diverse high-performance computing (HPC) environments.
2. **Compatibility:** Singularity enables seamless integration with Docker images, ensuring compatibility with the extensive library of applications available on DockerHub. This compatibility simplifies the deployment of a wide range of software without the need for modification.
3. **Ease of Access:** Leveraging Singularity to build containers from DockerHub images provides a straightforward and convenient method to access a rich ecosystem of pre-built software stacks. This streamlines the process of obtaining and using applications from the DockerHub repository.
4. **Reproducibility:** A Singularity image is a single immutable SIF file, making it easy to archive, share, and rerun the exact same software environment later.
5. **Security and Isolation:** Singularity maintains the security and isolation principles of Docker containers. However, Singularity containers offer additional benefits such as running in user space without requiring escalated privileges, contributing to a more secure execution environment.
6. **HPC Integration:** Singularity was designed for HPC systems and integrates smoothly with batch schedulers and shared filesystems, without requiring a privileged daemon.
7. **Community Support:** Building from DockerHub images lets users benefit from both the Docker and Singularity communities, including their documentation and issue trackers.
By amalgamating the strengths of Singularity with the extensive offerings of DockerHub, users can harness a robust and flexible containerization solution that caters to diverse computational needs.
#### Exploring the container using `singularity shell`
The `singularity shell` command opens an interactive shell within a Singularity container, providing users with direct access to the containerized environment. This command is instrumental for exploring and debugging the contents of the container, allowing users to interactively test and troubleshoot applications and dependencies encapsulated in the Singularity image.
Run the command
```
singularity shell ubuntu_22.04.sif
```
to explore the container.
For example
```
singularity pull docker://esolang/gnuplot
singularity shell gnuplot_latest.sif
```
containerizes `gnuplot`, hence it is available when we shell into the container
```
Singularity> which gnuplot
/usr/bin/gnuplot
```
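Instead of shelling in, we can also invoke the containerized tool directly with `singularity exec`; for example, a one-liner that prints the gnuplot version:

```shell
# Run gnuplot inside the container without an interactive shell
singularity exec gnuplot_latest.sif gnuplot --version
```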
### Exercise 1. lazygit.
#### `singularity pull`
In the preceding illustration, an image was acquired from Singularity Hub. Images can also be pulled from other registries, and the output file can be given an explicit name. For example, to download the lazygit image from DockerHub and save it as `lazygit.sif`, execute the following command:
```
singularity pull lazygit.sif docker://docker.io/icaoberg/lazygit:0.34
```
[LazyGit](https://github.com/jesseduffield/lazygit) is a simple terminal UI for git commands.
![](https://raw.githubusercontent.com/jesseduffield/lazygit/assets/demo/commit_and_push-compressed.gif)
#### `singularity exec`
Again, we can shell into the container
```
singularity shell lazygit.sif
Singularity> which lazygit
/usr/local/bin/lazygit
```
However, if the tool is on the container's `$PATH`, then we can access it with the command
```
singularity exec lazygit.sif lazygit
```
alternatively
```
singularity exec lazygit.sif /usr/local/bin/lazygit
```
<img src="https://hackmd.io/_uploads/ByMMu1qxT.png" width="75%" />
Singularity containers are very useful for containerizing tools used in software development. To see a list of vetted containers built by PSC, click [here](https://github.com/pscedu/singularity).
### Exercise 1a. Figlet.
You can use Figlet to pretty print strings. For example
```
echo "foobar" | ./figlet_latest.sif
__ _
/ _| ___ ___ | |__ __ _ _ __
| |_ / _ \ / _ \| '_ \ / _` | '__|
| _| (_) | (_) | |_) | (_| | |
|_| \___/ \___/|_.__/ \__,_|_|
```
Build or pull a Singularity container from this [repository](https://hub.docker.com/r/hairyhenderson/figlet) and attempt to recreate the command above.
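One possible approach is sketched below; the image name follows the repository linked above, and the output filename is what `singularity pull` generates by default:

```shell
# Pull the figlet image from DockerHub and run it with piped input
singularity pull docker://hairyhenderson/figlet
echo "foobar" | ./figlet_latest.sif
```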
### Exercise 2. Singularity recipes.
Consider the following Singularity recipe
```
Bootstrap: docker
From: hello-world:latest
%labels
MAINTAINER icaoberg
EMAIL icaoberg@psc.edu
SUPPORT help@psc.edu
%post
```
This recipe pulls a `Hello, world!` example from DockerHub. To build the container, run the command
```
IMAGE=singularity-hello-world.sif
DEFINITION=Singularity
singularity build --remote $IMAGE $DEFINITION
```
which builds the container remotely and downloads it locally
```
INFO: Starting build...
INFO: Setting maximum build duration to 1h0m0s
INFO: Remote "cloud.sylabs.io" added.
INFO: Access Token Verified!
INFO: Token stored in /root/.singularity/remote.yaml
INFO: Remote "cloud.sylabs.io" now in use.
INFO: Starting build...
Getting image source signatures
Copying blob sha256:719385e32844401d57ecfd3eacab360bf551a1491c05b85806ed8f1b08d792f6
Copying config sha256:0dcea989af054c9b5ab290a0c3ecc3f97947894f575fd08a93d3e048a157022a
Writing manifest to image destination
Storing signatures
2023/10/03 19:31:12 info unpack layer: sha256:719385e32844401d57ecfd3eacab360bf551a1491c05b85806ed8f1b08d792f6
INFO: Adding labels
INFO: Creating SIF file...
INFO: Build complete: /tmp/image-1800109761
INFO: Performing post-build operations
INFO: Format for SBOM is not set or file exceeds maximum size for SBOM generation.
INFO: Calculating SIF image checksum
INFO: Uploading image to library...
WARNING: Skipping container verification
INFO: Uploading 45056 bytes
INFO: Image uploaded successfully.
INFO: Build complete: singularity-hello-world.sif
```
Now the file `singularity-hello-world.sif` exists in the current directory.
Since the container has an entry point, we can simply run
```
./singularity-hello-world.sif
Hello from Docker!
This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
(amd64)
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.
To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash
Share images, automate workflows, and more with a free Docker ID:
https://hub.docker.com/
For more examples and ideas, visit:
https://docs.docker.com/get-started/
```
### Exercise 3. More lazygit.
Consider the recipe
```
Bootstrap: docker
From: debian:latest
%labels
MAINTAINER icaoberg
EMAIL icaoberg@psc.edu
SUPPORT help@psc.edu
REPOSITORY http://github.com/pscedu/singularity-lazygit
COPYRIGHT Copyright © 2022-2023 Pittsburgh Supercomputing Center. All Rights Reserved.
VERSION 0.31.4
%post
apt update
apt install -y git wget
wget -nc https://github.com/jesseduffield/lazygit/releases/download/v0.31.4/lazygit_0.31.4_Linux_x86_64.tar.gz
tar -xvf lazygit_0.31.4_Linux_x86_64.tar.gz && rm -f lazygit_0.31.4_Linux_x86_64.tar.gz
mv lazygit /usr/local/bin
rm -f README.md LICENSE lazygit_0.31.4_Linux_x86_64.tar.gz
apt remove -y wget
apt clean
%runscript
/usr/local/bin/lazygit
```
Use this recipe to build a container named `singularity-lazygit-0.31.4.sif`.
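As in the previous exercise, the build can be done remotely; a minimal sketch, assuming the recipe is saved in a file named `Singularity`:

```shell
IMAGE=singularity-lazygit-0.31.4.sif
DEFINITION=Singularity
singularity build --remote $IMAGE $DEFINITION
```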
Notice this recipe has an entry-point. How would you start `lazygit`?
### Exercise 4. ImageMagick.
Consider the following `Dockerfile`
```
# Dockerfile for ImageMagick
FROM debian:latest
LABEL MAINTAINER="icaoberg" \
EMAIL="icaoberg@psc.edu" \
SUPPORT="help@psc.edu" \
REPOSITORY="http://github.com/pscedu/singularity-imagemagick" \
COPYRIGHT="Copyright © 2023 Pittsburgh Supercomputing Center. All Rights Reserved." \
VERSION="7.1.1-15"
RUN apt-get update && \
apt-get -y upgrade && \
apt-get install -y imagemagick libtiff-tools && \
sed -i 's|<policy domain="resource" name="width" value="16KP"/>|<policy domain="resource" name="width" value="128KP"/>|g' /etc/ImageMagick-6/policy.xml && \
sed -i 's|<policy domain="resource" name="height" value="16KP"/>|<policy domain="resource" name="height" value="128KP"/>|g' /etc/ImageMagick-6/policy.xml && \
sed -i 's|<policy domain="resource" name="memory" value="256MiB"/>|<policy domain="resource" name="memory" value="32GiB"/>|g' /etc/ImageMagick-6/policy.xml && \
sed -i 's|<policy domain="resource" name="map" value="512MiB"/>|<policy domain="resource" name="map" value="12GiB"/>|g' /etc/ImageMagick-6/policy.xml && \
sed -i 's|<!-- <policy domain="resource" name="temporary-path" value="/tmp"/> -->|<policy domain="resource" name="temporary-path" value="/tmp"/>|g' /etc/ImageMagick-6/policy.xml && \
sed -i 's|<policy domain="resource" name="disk" value="1GiB"/>|<policy domain="resource" name="disk" value="12GiB"/>|g' /etc/ImageMagick-6/policy.xml
CMD ["/bin/bash"]
```
Convert the file above into a Singularity recipe, attempt to build it remotely, and name the resulting image `singularity-imagemagick-7.1.1-15.sif`.
Bonus: push the Singularity container to your Sylabs account.
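As a starting point, the main `Dockerfile` directives map onto Singularity recipe sections roughly as follows (a partial sketch, not the full solution; the `%post` section still needs the `apt-get` and `sed` commands from the `Dockerfile` above):

```
Bootstrap: docker
From: debian:latest

%labels
    MAINTAINER icaoberg
    VERSION 7.1.1-15

%post
    # the RUN commands from the Dockerfile go here

%runscript
    # CMD ["/bin/bash"] maps to the runscript
    exec /bin/bash "$@"
```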
## Part 2: Streamlining Reproducible Data Analysis using Workflow Management Systems and Singularity Containers
## Dive into Nextflow
Through hands-on exploration and interactive sessions, attendees will discover how Nextflow simplifies the orchestration of complex data analysis pipelines.
### Installing Nextflow on Bridges-2
Nextflow is already available on Bridges-2
```
module avail nextflow/21.10.6
```
If you wish to install a newer version of Nextflow (and we will for this workshop), then run the following commands
```
cd /ocean/projects/cis230059p/$(whoami)
if [ -d sdkman ]; then rm -rf sdkman; fi
mkdir sdkman && ln -s $(pwd)/sdkman $HOME/.sdkman
curl -s "https://get.sdkman.io" | bash
source "$HOME/.sdkman/bin/sdkman-init.sh"
sdk install java 17.0.6-amzn
```
```
if [ ! -d ~/bin ]; then mkdir ~/bin; fi
cd ~/bin && curl -s https://get.nextflow.io | bash
chmod +x ~/bin/nextflow
export PATH=$PATH:~/bin
```
### Exercise. Hello World!
For the purpose of this workshop, we will mostly be using workflows from nf-core. First, verify your setup with Nextflow's classic hello pipeline by running in the terminal
```
module load nextflow
nextflow run hello
```
### Exercise. Reverse.
Consider the following Nextflow definition file
```
nextflow.enable.dsl=1
params.in = "genome_tree.fasta"
sequences = file(params.in)
SPLIT = (System.properties['os.name'] == 'Mac OS X' ? 'gcsplit' : 'csplit')
process splitSequences {
input:
file 'input.fa' from sequences
output:
file 'seq_*' into records
"""
$SPLIT input.fa '%^>%' '/^>/' '{*}' -f seq_
"""
}
process reverse {
input:
file x from records
output:
stdout result
"""
cat $x | rev
"""
}
result.subscribe { println it }
```
What do you think this is doing?
To figure out what it does, run the commands
```
cd /ocean/projects/cis230059p/$(whoami)/workflow-examples/nextflow/example1
bash ./script.sh
```
What output do you see?
Inspect the script
```
#!/bin/bash
#SBATCH -p RM-small
export NXF_DEFAULT_DSL=1
if [ ! -f genome_tree.fasta ]; then
cp -v /ocean/datasets/community/genomics/checkm/20210915/genome_tree/genome_tree_reduced.refpkg/genome_tree.fasta .
fi
module load nextflow
nextflow workflow
```
Notice that you can also submit it to the scheduler by typing
```
sbatch ./script.sh
```
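After submitting, you can monitor the job with standard Slurm commands:

```shell
# Show your queued and running jobs
squeue -u $(whoami)

# Show accounting details once the job has started
sacct -u $(whoami) --format=JobID,JobName,State,Elapsed
```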
### Exercise. bamtofastq.
Consider the following config file
```
singularity {
enabled = true
}
process {
executor = 'slurm'
queue = 'RM'
}
```
This file is needed if we wish to use a specific partition on Bridges-2, as well as to enable the use of containers (though this particular pipeline does not use any).
Change directory to
```
cd /ocean/projects/cis230059p/$(whoami)/workflow-examples/nextflow/nfcore/bamtofasta
```
Now look at this script
```
#!/bin/bash
#SBATCH -p RM-shared
#SBATCH -n 10
#SBATCH --mem=20000M
module load anaconda3
export SDKMAN_DIR="$HOME/.sdkman"
[[ -s "$HOME/.sdkman/bin/sdkman-init.sh" ]] && source "$HOME/.sdkman/bin/sdkman-init.sh"
export PATH=~/bin:~/.local/bin/:$PATH
export NXF_SINGULARITY_CACHEDIR=./containers
if [ ! -d ./containers ]; then mkdir ./containers; fi
nextflow run nf-core/bamtofastq -r 2.0.0 -profile test --outdir ./results
```
There are several tools needed by this pipeline. Hint: FastQC and samtools are some of the tools that would be missing without the containers.
If you have doubts about the missing dependencies, refer to the documentation [here](https://nf-co.re/bamtofastq/2.0.0/docs/usage).
### Exercise. rnaseq.
Change directory to
```
cd /ocean/projects/cis230059p/$(whoami)/workflow-examples/nextflow/nfcore/rnaseq
```
and consider the script
```
#!/bin/bash
#SBATCH -p RM-shared
if [ -f ~/bin/nextflow ]; then
export PATH=$PATH:~/bin/
fi
export NXF_SINGULARITY_CACHEDIR=./containers
if [ ! -d ./containers ]; then mkdir ./containers; fi
nextflow run nf-core/rnaseq -profile test --outdir results -c psc.config
```
This exercise requires
* the latest version of Nextflow (not the version available on Lmod)
* the creation of a folder to host the containers
:::warning
Do not submit this script until the end of the experience as we might run out of allocation and space. This workflow will generate at least 150 GB of data/results.
:::
### Exercise. More...
Explore the other exercises in
```
cd /ocean/projects/cis230059p/$(whoami)/workflow-examples/nextflow/nfcore
```
:::warning
Do not submit these scripts until the end of the experience as we might run out of allocation and space.
:::
## Dive into Snakemake
Through practical exercises and interactive discussions, attendees will learn about Snakemake's potential in simplifying the creation and execution of data analysis pipelines.
### Installing Snakemake on Bridges-2
```
module load anaconda3
pip install snakemake --user -q
```
### Exercise. Weather.
Change directory to
```
cd /ocean/projects/cis230059p/$(whoami)/workflow-examples/snakemake/weather
```
Consider the Snakefile
```
rule wttr:
output:
touch("output.txt")
shell:
"curl wttr.in > {output}"
```
Notice the notation of this Snakefile; despite its YAML-like appearance, it uses Python-based syntax.
What is this workflow doing?
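To find out, run the workflow from this directory (Snakemake requires the number of cores to be specified):

```shell
# Execute the Snakefile in the current directory using one core;
# the weather report will be written to output.txt
snakemake --cores 1
```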
Visit this [site](https://github.com/chubin/wttr.in) and update the Snakefile to pull the weather information from another city.
### Exercise. fortune
Change directory to
```
cd /ocean/projects/cis230059p/$(whoami)/workflow-examples/snakemake/fortune
```
Consider the following Snakefile
```
# Rule 1: Generate fortune message
rule fortune:
shell:
"fortune fortunes"
```
* What changes do you need to make to this Snakefile to have the output written to disk?
* What changes do you need to make the Snakefile and script to have the output of `fortune` piped into `cowsay`?
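A minimal sketch of one possible answer to the first question, assuming `fortune` is installed (the output filename is an assumption):

```
# Rule 1: Generate fortune message and write it to disk
rule fortune:
    output:
        "fortune.txt"
    shell:
        "fortune fortunes > {output}"
```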
### Exercise. More...
At PSC the Biomed group uses Snakemake to perform tasks like
* Computing checksums on large datasets
* Extracting data and metadata from public datasets
* Populating internal databases to assess metadata quality
* Generating daily and weekly private data reports
Even though Snakemake was designed with a specific domain (bioinformatics) in mind, it is pretty flexible and easy to use.
Think of a project or workflow that could be easily implemented using Snakemake.
## Dive into CWL
Through interactive exercises and real-world examples, participants will learn how CWL standardizes workflow descriptions, making them portable and reproducible across different platforms.
### Installing cwl-tools
```
module load anaconda3
pip install cwlref-runner --user
```
### Exercise. empties.
Change directory to
```
cd /ocean/projects/cis230059p/$(whoami)/workflow-examples/cwl/empties
```
Consider the following CWL file
```
#!/usr/bin/env cwl-runner
cwlVersion: v1.2
class: Workflow
inputs:
directory:
type: Directory
outputs:
out_empties:
type: File
outputSource: quality_control_1/out_empties
steps:
quality_control_1:
run:
cwlVersion: v1.0
class: CommandLineTool
baseCommand: [bash,"/bil/pscstaff/rlagha/RL/forLuke/scripts/empties2.sh"]
stdout: out_empties.txt
inputs:
directory:
type: Directory
inputBinding: {}
outputs:
out_empties:
type: stdout
in:
directory: directory
out: [out_empties]
```
What do you think this is doing?
Modify the file above so that it runs the `empties2.sh` script from the current location.
Notice that we use a YAML file for the input parameters.
```
cat input.yml
directory:
class: Directory
location: /bil/data/inventory
```
This is really useful when doing it programmatically.
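With both files in place, a typical invocation passes the workflow and the parameter file together (the workflow filename here is an assumption):

```shell
# Run the workflow; inputs are read from input.yml
cwl-runner workflow.cwl input.yml
```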
Modify the YAML file so that its `location` points to
```
/ocean/projects/cis230059p/$(whoami)/workflow-examples
```
Remember to replace `$(whoami)` with your PSC user account.
### Exercise. weather.
Another weather example. Change directory to
```
cd /ocean/projects/cis230059p/$(whoami)/workflow-examples/cwl/weather
```
Notice how this example looks very similar to the Snakemake version
```
# CWL Version 1.2 Declaration
cwlVersion: v1.2
# CommandLineTool Class Declaration
class: CommandLineTool
# Base Command to Execute
baseCommand: ["curl", "wttr.in"]
# Define Inputs (empty in this case)
inputs: []
# Define Outputs (empty in this case)
outputs: []
```
Run it. What is the difference between the output from Snakemake and CWL?
Now modify the file above so that it saves the output to disk
```
# CWL Version 1.2 Declaration
cwlVersion: v1.2
# CommandLineTool Class Declaration
class: CommandLineTool
# Base Command to Execute
baseCommand: ["curl", "wttr.in"]
# Define Inputs (empty in this case)
inputs: []
# Define Outputs
outputs:
# Define an Output Parameter Named 'output_message'
output_message:
# Specify the Type of the Output as 'stdout'
type: stdout
# Specify the Standard Output (stdout) File
stdout: output.txt
```
### Exercise. bioformats2raw.
Change directories to
```
cd /ocean/projects/cis230059p/$(whoami)/workflow-examples/cwl/bioformats2raw
```
Consider the CWL file
```
#!/usr/bin/env cwl-runner
cwlVersion: v1.1
class: CommandLineTool
requirements:
DockerRequirement:
dockerPull: hubmap/ome-tiff-pyramid:latest
inputs:
ometiff_file:
type: File
inputBinding:
position: 0
base_directory:
type: string
inputBinding:
position: 1
processes:
type: int
inputBinding:
position: 2
rgb:
type: boolean?
default: false
inputBinding:
prefix: --rgb
position: 3
downsample_type:
type: string?
inputBinding:
prefix: --downsample-type
position: 4
outputs:
pyramid_dir:
type: Directory
outputBinding:
glob: 'ometiff-pyramids'
n5_dir:
type: Directory
outputBinding:
glob: 'n5'
baseCommand: ['python3', '/opt/ometiff_to_pyramid.py']
```
Notice how this CWL file downloads a container from Docker Hub (the HuBMAP project makes all of its containers publicly available; for more info click [here](https://github.com/hubmapconsortium/portal-containers)).
We can call the `cwl-runner` using the Singularity option
```
cwl-runner --singularity bioformats2raw.cwl input.yaml
```
and it will run the tool inside the container.
As a side note, the only reason this workflow works is that the team made sure the container is well-formed. See the Dockerfile below
```
ARG BUILD_IMAGE=gradle:6.2.1-jdk8
#
# Build phase: Use the gradle image for building.
#
FROM ${BUILD_IMAGE} as build
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update -qq \
&& apt-get -y install \
libblosc1 \
tzdata \
zeroc-ice-all-runtime \
zip \
&& rm -rf /var/cache/apt/*
WORKDIR /opt
RUN wget https://github.com/glencoesoftware/bioformats2raw/releases/download/v0.2.6/bioformats2raw-0.2.6.zip -O bioformats2raw.zip \
&& unzip bioformats2raw.zip \
&& mv bioformats2raw-0.2.6 bioformats2raw \
&& rm bioformats2raw.zip
WORKDIR /bioformats_pyramid
ENV RAW2OMETIFF_VERSION=v0.2.7
# Clone raw pyramid to tiff repo.
RUN git clone -b ${RAW2OMETIFF_VERSION} https://github.com/glencoesoftware/raw2ometiff.git \
&& cd raw2ometiff \
&& gradle build \
&& cd build/distributions \
&& rm raw2ometiff*tar \
&& unzip raw2ometiff*zip \
&& rm -f raw2ometiff*zip \
&& cd ../.. \
&& mv build/distributions/raw2ometiff* /opt/raw2ometiff
# Set working directory containing new cli tools.
WORKDIR /opt
COPY bin /opt
CMD ["/bin/bash"]
```
Submit the file `script.sh` to the scheduler.
### Exercise. OME-TIFF pyramid.
Change directories to
```
cd /ocean/projects/cis230059p/$(whoami)/workflow-examples/cwl/ome-tiff-pyramid
```
Explore the file `pipeline.cwl`. This CWL file is defining a workflow with several steps
```
#!/usr/bin/env cwl-runner
class: Workflow
cwlVersion: v1.1
requirements:
ScatterFeatureRequirement: {}
inputs:
ometiff_directory:
type: Directory
processes:
type: int
default: 1
rgb:
type: boolean?
downsample_type:
type: string?
outputs:
pyramid_dir:
type: Directory[]
outputSource: convert_to_pyramid/pyramid_dir
n5_dir:
type: Directory[]
outputSource: convert_to_pyramid/n5_dir
steps:
collect_ometiff_files:
run: collect-ometiff-files.cwl
in:
ometiff_directory: ometiff_directory
out:
[ometiff_file, base_directory]
convert_to_pyramid:
scatter: [ometiff_file, base_directory]
scatterMethod: dotproduct
run: steps/ometiff-to-pyramid.cwl
in:
ometiff_file: collect_ometiff_files/ometiff_file
base_directory: collect_ometiff_files/base_directory
processes: processes
rgb: rgb
downsample_type: downsample_type
out: [pyramid_dir, n5_dir]
```
It is important to note that each step is defined in its own CWL file.
```
find . -type f -name "*cwl"
./collect-ometiff-files.cwl
./pipeline.cwl
./steps/ometiff-to-pyramid.cwl
```
What makes this workflow important?
* This workflow is using more than one step
* Each step is clearly defined by its own CWL file
* Each step uses a well-defined Docker container hosted in Docker Hub that can be converted to Singularity
* This is a HuBMAP workflow running on Brain Image Library data (talk about interoperability)