# Elevating Scientific Computing with Singularity Containers

Please use this guide for this experience. Before we begin, it is important to charge the proper allocation

```
change_primary_group cis230059p
```

and clone the exercises

```
cd /ocean/projects/cis230059p/$(whoami)
git clone https://github.com/pscedu/workflow-examples.git
```

## Part 1: Elevating Scientific Computing with Singularity Containers

### Exercise 0. Enabling the remote builder on Sylabs.io.

To enable the remote builder on Sylabs.io, follow these steps:

* Create an account on SyLabs.io. Click `Login`

  ![](https://hackmd.io/_uploads/SJ5njFOxp.png)

  and `Sign up`

  <img src="https://hackmd.io/_uploads/ry_SFTKgT.png" width="50%"/>
* Click `Access Tokens` on the left menu

  ![](https://hackmd.io/_uploads/ryTN3Ydxp.png)
* Click `Create a New Access Token`

  <img src="https://hackmd.io/_uploads/Hk70tpFla.png" width="100%" />
* Add a label and click `Create Access Token`

  ![](https://hackmd.io/_uploads/rkor5pKe6.png)
* Click `Copy token to Clipboard`

  ![](https://i.imgur.com/ztbbC8B.png)
* Log in to Bridges-2 and run the command

  ```
  interact -p RM-shared --mem=20000Mb -n 10 --account cis230059p --time "02:00:00" --reservation workshop
  ```

  to start an interactive session, then

  ```
  singularity remote login
  ```
* Paste the token and press `Enter`.

  ![](https://hackmd.io/_uploads/B1uRq6YgT.png)

:::warning
:warning: If you are constantly building containers remotely, make sure to erase them from your account to avoid running out of space.
:::

#### `singularity pull`

The `singularity pull` command downloads container images from a specified source, making it an essential tool for acquiring pre-built containers or custom images. It simplifies the process of obtaining and using Singularity containers by pulling them directly onto your local system, ready for deployment on high-performance computing systems or other computing platforms.

Run the command

```
singularity pull shub://vsoch/hello-world
```
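By default, `pull` names the file after the image (here, `hello-world_latest.sif`). You can also give the downloaded image an explicit filename, a pattern used later in this guide (the name `hello.sif` below is an arbitrary example):

```
singularity pull hello.sif shub://vsoch/hello-world
```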
#### `singularity inspect`

The `singularity inspect` command allows users to gain valuable insight into the contents and metadata of a Singularity container image. With it, you can explore details such as environment variables, labels, and run scripts within the container. `singularity inspect` is a powerful tool for understanding the characteristics of a container image before execution, aiding in the configuration and optimization of your workflows.

Run the command

```
singularity inspect hello-world_latest.sif
```

#### `singularity run`

The `singularity run` command is the gateway to executing applications within a Singularity container. With it, users can run programs encapsulated in a container without the need to enter the container environment explicitly. It simplifies the execution process, making Singularity containers easy to integrate into diverse computing workflows.

Run the command

```
singularity run hello-world_latest.sif
```

#### `singularity cache clean`

The `singularity cache clean` command serves as a housekeeping tool for managing the Singularity cache. With it, users can clear cached container images, freeing up storage space. This is particularly useful for maintaining a streamlined and organized environment, ensuring that only the necessary container images are retained locally, and it contributes to more efficient management of Singularity container resources.

```
singularity cache clean
```

#### `singularity pull` from DockerHub

The `singularity pull` command not only downloads Singularity container images but also extends its capabilities to pull images directly from DockerHub. This versatile command streamlines the acquisition of container images, whether they are Singularity-specific or sourced from DockerHub, making it a powerful tool for accessing a wide range of containerized applications and environments. For example, to build a Singularity container with a barebones copy of [Ubuntu 22.04](https://hub.docker.com/layers/library/ubuntu/22.04/images/sha256-965fbcae990b0467ed5657caceaec165018ef44a4d2d46c7cdea80a9dff0d1ea?context=explore) run

```
singularity pull docker://ubuntu:22.04
```

Some of the benefits of building Singularity containers from Docker images include

1. **Portability:** DockerHub is a widely used and central registry for Docker images. By converting Docker images to Singularity containers, you enhance the portability of your applications, making them accessible across diverse high-performance computing (HPC) environments.
2. **Compatibility:** Singularity enables seamless integration with Docker images, ensuring compatibility with the extensive library of applications available on DockerHub. This compatibility simplifies the deployment of a wide range of software without the need for modification.
3. **Ease of Access:** Leveraging Singularity to build containers from DockerHub images provides a straightforward and convenient method to access a rich ecosystem of pre-built software stacks. This streamlines the process of obtaining and using applications from the DockerHub repository.
4. **Reproducibility:** Building from a pinned Docker image tag captures the same software stack every time, helping analyses produce consistent results across systems.
5. **Security and Isolation:** Singularity maintains the security and isolation principles of Docker containers. However, Singularity containers offer additional benefits such as running in user space without requiring escalated privileges, contributing to a more secure execution environment.
6. **HPC Integration:** Singularity was designed with shared HPC systems in mind, so containers built from Docker images run naturally under schedulers and on parallel filesystems.
7. **Community Support:** Both Singularity and Docker have large user communities, so documentation and maintained pre-built images are widely available.

By combining the strengths of Singularity with the extensive offerings of DockerHub, users can harness a robust and flexible containerization solution that caters to diverse computational needs.

#### Exploring the container using `singularity shell`

The `singularity shell` command opens an interactive shell within a Singularity container, providing users with direct access to the containerized environment. This command is instrumental for exploring and debugging the contents of the container, allowing users to interactively test and troubleshoot applications and dependencies encapsulated in the Singularity image.

Run the command

```
singularity shell ubuntu_22.04.sif
```

to explore the container. For example

```
singularity pull docker://esolang/gnuplot
singularity shell gnuplot_latest.sif
```

containerizes `gnuplot`, hence it is available when we shell into the container

```
Singularity> which gnuplot
/usr/bin/gnuplot
```
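When exploring a container you often want host data visible inside it. Host paths that are not mounted by default can be bound with `--bind`; a minimal sketch, assuming your data lives under `/ocean` and that path is not already bound on your system:

```
singularity shell --bind /ocean gnuplot_latest.sif
```

With a single path argument, `--bind` mounts the host path at the same location inside the container.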
### Exercise 1. lazygit.

#### `singularity pull`

In the preceding example, an image was pulled from Singularity Hub. Images can also be pulled from other registries, such as the SyLabs library or Docker Hub. To download this image from Docker Hub and name the resulting file, run:

```
singularity pull lazygit.sif docker://docker.io/icaoberg/lazygit:0.34
```

[LazyGit](https://github.com/jesseduffield/lazygit) is a simple terminal UI for git commands.

![](https://raw.githubusercontent.com/jesseduffield/lazygit/assets/demo/commit_and_push-compressed.gif)

#### `singularity run`

Again, we can shell into the container

```
singularity shell lazygit.sif
Singularity> which lazygit
/usr/local/bin/lazygit
```

However, since the tool is on `$PATH` inside the container, we can also run it directly with

```
singularity exec lazygit.sif lazygit
```

or, equivalently, with the full path

```
singularity exec lazygit.sif /usr/local/bin/lazygit
```

<img src="https://hackmd.io/_uploads/ByMMu1qxT.png" width="75%" />

Singularity containers are very useful for containerizing software-development tools. To see a list of vetted containers built by PSC, click [here](https://github.com/pscedu/singularity).

### Exercise 1a. Figlet.

You can use Figlet to pretty-print strings. For example

```
echo "foobar" | ./figlet_latest.sif
  __             _
 / _| ___   ___ | |__   __ _ _ __
| |_ / _ \ / _ \| '_ \ / _` | '__|
|  _| (_) | (_) | |_) | (_| | |
|_|  \___/ \___/|_.__/ \__,_|_|
```

Build or pull a Singularity container from this [repository](https://hub.docker.com/r/hairyhenderson/figlet) and attempt to recreate the command above.
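One possible approach (a sketch; it assumes the image's runscript invokes `figlet`, which you can verify with `singularity inspect --runscript`):

```
singularity pull figlet_latest.sif docker://hairyhenderson/figlet
echo "foobar" | ./figlet_latest.sif
```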
### Exercise 2. Singularity recipes.

Consider the following Singularity recipe

```
Bootstrap: docker
From: hello-world:latest

%labels
MAINTAINER icaoberg
EMAIL icaoberg@psc.edu
SUPPORT help@psc.edu

%post
```

This recipe pulls a `Hello, world!` example from DockerHub. To build the container, run the command

```
IMAGE=singularity-hello-world.sif
DEFINITION=Singularity
singularity build --remote $IMAGE $DEFINITION
```

which builds the container remotely and downloads it locally

```
INFO:    Starting build...
INFO:    Setting maximum build duration to 1h0m0s
INFO:    Remote "cloud.sylabs.io" added.
INFO:    Access Token Verified!
INFO:    Token stored in /root/.singularity/remote.yaml
INFO:    Remote "cloud.sylabs.io" now in use.
INFO:    Starting build...
Getting image source signatures
Copying blob sha256:719385e32844401d57ecfd3eacab360bf551a1491c05b85806ed8f1b08d792f6
Copying config sha256:0dcea989af054c9b5ab290a0c3ecc3f97947894f575fd08a93d3e048a157022a
Writing manifest to image destination
Storing signatures
2023/10/03 19:31:12  info unpack layer: sha256:719385e32844401d57ecfd3eacab360bf551a1491c05b85806ed8f1b08d792f6
INFO:    Adding labels
INFO:    Creating SIF file...
INFO:    Build complete: /tmp/image-1800109761
INFO:    Performing post-build operations
INFO:    Format for SBOM is not set or file exceeds maximum size for SBOM generation.
INFO:    Calculating SIF image checksum
INFO:    Uploading image to library...
WARNING: Skipping container verification
INFO:    Uploading 45056 bytes
INFO:    Image uploaded successfully.
INFO:    Build complete: singularity-hello-world.sif
```

Now that the file `singularity-hello-world.sif` exists locally, and since the container has an entry point, we can simply run

```
./singularity-hello-world.sif

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (amd64)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://hub.docker.com/

For more examples and ideas, visit:
 https://docs.docker.com/get-started/
```

### Exercise 3. More lazygit.

Consider the recipe

```
Bootstrap: docker
From: debian:latest

%labels
MAINTAINER icaoberg
EMAIL icaoberg@psc.edu
SUPPORT help@psc.edu
REPOSITORY http://github.com/pscedu/singularity-lazygit
COPYRIGHT Copyright © 2022-2023 Pittsburgh Supercomputing Center. All Rights Reserved.
VERSION 0.31.4

%post
apt update
apt install -y git wget
wget -nc https://github.com/jesseduffield/lazygit/releases/download/v0.31.4/lazygit_0.31.4_Linux_x86_64.tar.gz
tar -xvf lazygit_0.31.4_Linux_x86_64.tar.gz && rm -f lazygit_0.31.4_Linux_x86_64.tar.gz
mv lazygit /usr/local/bin
rm -f README.md LICENSE lazygit_0.31.4_Linux_x86_64.tar.gz
apt remove -y wget
apt clean

%runscript
/usr/local/bin/lazygit
```

Use this recipe to build a container named `singularity-lazygit-0.31.4.sif`. Notice this recipe has an entry point. How would you start `lazygit`?
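One possible answer (a sketch): the `%runscript` section is the container's entry point, so `singularity run`, or executing the image itself, starts the tool:

```
singularity run singularity-lazygit-0.31.4.sif
# equivalently, since .sif images are executable:
./singularity-lazygit-0.31.4.sif
```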
### Exercise 4. ImageMagick.

Consider the following `Dockerfile`

```
# Dockerfile for ImageMagick
FROM debian:latest

LABEL MAINTAINER="icaoberg" \
      EMAIL="icaoberg@psc.edu" \
      SUPPORT="help@psc.edu" \
      REPOSITORY="http://github.com/pscedu/singularity-imagemagick" \
      COPYRIGHT="Copyright © 2023 Pittsburgh Supercomputing Center. All Rights Reserved." \
      VERSION="7.1.1-15"

RUN apt-get update && \
    apt-get -y upgrade && \
    apt-get install -y imagemagick libtiff-tools && \
    sed -i 's|<policy domain="resource" name="width" value="16KP"/>|<policy domain="resource" name="width" value="128KP"/>|g' /etc/ImageMagick-6/policy.xml && \
    sed -i 's|<policy domain="resource" name="height" value="16KP"/>|<policy domain="resource" name="height" value="128KP"/>|g' /etc/ImageMagick-6/policy.xml && \
    sed -i 's|<policy domain="resource" name="memory" value="256MiB"/>|<policy domain="resource" name="memory" value="32GiB"/>|g' /etc/ImageMagick-6/policy.xml && \
    sed -i 's|<policy domain="resource" name="map" value="512MiB"/>|<policy domain="resource" name="map" value="12GiB"/>|g' /etc/ImageMagick-6/policy.xml && \
    sed -i 's|<!-- <policy domain="resource" name="temporary-path" value="/tmp"/> -->|<policy domain="resource" name="temporary-path" value="/tmp"/>|g' /etc/ImageMagick-6/policy.xml && \
    sed -i 's|<policy domain="resource" name="disk" value="1GiB"/>|<policy domain="resource" name="disk" value="12GiB"/>|g' /etc/ImageMagick-6/policy.xml

CMD ["/bin/bash"]
```

Convert the file above into a Singularity recipe, attempt to build it remotely, and name the resulting file `singularity-imagemagick-7.1.1-15.sif`.

Bonus: push the Singularity container to your SyLabs account.
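A possible skeleton for the conversion (a sketch, not the official PSC recipe): the Dockerfile's `LABEL` entries become `%labels`, and the `RUN` instruction becomes the `%post` section.

```
Bootstrap: docker
From: debian:latest

%labels
MAINTAINER icaoberg
VERSION 7.1.1-15

%post
apt-get update && apt-get -y upgrade
apt-get install -y imagemagick libtiff-tools
# ...repeat the sed edits to /etc/ImageMagick-6/policy.xml from the Dockerfile here...
```

For the bonus, `singularity push -U singularity-imagemagick-7.1.1-15.sif library://<username>/<collection>/imagemagick:7.1.1-15` uploads an unsigned image to your SyLabs account (replace the placeholders with your own namespace).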
# Part 2: Streamlining Reproducible Data Analysis using Workflow Management Systems and Singularity Containers

## Dive into Nextflow

Through hands-on exploration and interactive sessions, attendees will discover how Nextflow simplifies the orchestration of complex data analysis pipelines.

### Installing Nextflow on Bridges-2

Nextflow is available on Bridges-2

```
module avail nextflow/21.10.6
```

If you wish to install a newer version of Nextflow (and we will for this workshop), then run the following commands

```
cd /ocean/projects/cis230059p/$(whoami)
if [ -d sdkman ]; then rm -rf sdkman; fi
mkdir sdkman && ln -s $(pwd)/sdkman $HOME/.sdkman
curl -s "https://get.sdkman.io" | bash
source "$HOME/.sdkman/bin/sdkman-init.sh"
sdk install java 17.0.6-amzn
```

```
if [ ! -d ~/bin ]; then mkdir ~/bin; fi
cd ~/bin && curl -s https://get.nextflow.io | bash
chmod +x ~/bin/nextflow
export PATH=$PATH:~/bin
```

### Exercise. Hello World!

For this workshop we will mostly be using workflows from nf-core. First, though, run the classic hello-world pipeline in the terminal

```
module load nextflow
nextflow run hello
```

### Exercise. Reverse.

Consider the following Nextflow definition file

```
nextflow.enable.dsl=1

params.in = "genome_tree.fasta"
sequences = file(params.in)
SPLIT = (System.properties['os.name'] == 'Mac OS X' ? 'gcsplit' : 'csplit')

process splitSequences {
    input:
    file 'input.fa' from sequences

    output:
    file 'seq_*' into records

    """
    $SPLIT input.fa '%^>%' '/^>/' '{*}' -f seq_
    """
}

process reverse {
    input:
    file x from records

    output:
    stdout result

    """
    cat $x | rev
    """
}

result.subscribe { println it }
```

What do you think this is doing? To figure it out, run the commands

```
cd /ocean/projects/cis230059p/$(whoami)/workflow-examples/nextflow/example1
bash ./script.sh
```

What output do you see? Inspect the script

```
#!/bin/bash
#SBATCH -p RM-small

export NXF_DEFAULT_DSL=1

if [ ! -f genome_tree.fasta ]; then
    cp -v /ocean/datasets/community/genomics/checkm/20210915/genome_tree/genome_tree_reduced.refpkg/genome_tree.fasta .
fi

module load nextflow
nextflow workflow
```

Notice that you can also submit it to the scheduler by typing

```
sbatch ./script.sh
```

### Exercise. bamtofasta.

Consider the following config file

```
singularity {
    enabled = true
}

process {
    executor = 'slurm'
    queue = 'RM'
}
```

This file is needed if we wish to use a specific partition on Bridges-2 as well as to enable the use of containers (note that the script below does not actually pass a config file). Run this command

```
cd /ocean/projects/cis230059p/$(whoami)/workflow-examples/nextflow/nfcore/bamtofasta
```

Now look at this script

```
#!/bin/bash
#SBATCH -p RM-shared
#SBATCH -n 10
#SBATCH --mem=20000M

module load anaconda3

export SDKMAN_DIR="$HOME/.sdkman"
[[ -s "$HOME/.sdkman/bin/sdkman-init.sh" ]] && source "$HOME/.sdkman/bin/sdkman-init.sh"

export PATH=~/bin:~/.local/bin/:$PATH
export NXF_SINGULARITY_CACHEDIR=./containers

if [ ! -d ./containers ]; then mkdir ./containers; fi

nextflow run nf-core/bamtofastq -r 2.0.0 -profile test --outdir ./results
```

There are several tools needed by this pipeline. Hint: FastQC and samtools are some of them. If you have doubts about the dependencies, refer to the documentation [here](https://nf-co.re/bamtofastq/2.0.0/docs/usage).
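The pieces above can be combined into a single site config. A minimal sketch (hypothetical; the repository's actual `psc.config` may differ) that enables Singularity, caches images in `./containers`, and submits each process to Slurm:

```
singularity {
    enabled  = true
    cacheDir = './containers'
}

process {
    executor = 'slurm'
    queue    = 'RM-shared'
}
```

A config like this is passed to a run with `nextflow run <pipeline> -c psc.config`, as the next exercise does.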
### Exercise. rnaseq.

Change directory to

```
cd /ocean/projects/cis230059p/$(whoami)/workflow-examples/nextflow/nfcore/rnaseq
```

and consider the script

```
#!/bin/bash
#SBATCH -p RM-shared

if [ -f ~/bin/nextflow ]; then
    export PATH=$PATH:~/bin/
fi

export NXF_SINGULARITY_CACHEDIR=./containers
if [ ! -d ./containers ]; then mkdir ./containers; fi

nextflow run nf-core/rnaseq -profile test --outdir results -c psc.config
```

This exercise requires

* the latest version of Nextflow (not the version available on Lmod)
* the creation of a folder to host the containers

:::warning
Do not submit this script until the end of the experience, as we might run out of allocation and space. This workflow will generate at least 150 GB of data/results.
:::

### Exercise. More...

Explore the other exercises in

```
cd /ocean/projects/cis230059p/$(whoami)/workflow-examples/nextflow/nfcore
```

:::warning
Do not submit these scripts until the end of the experience, as we might run out of allocation and space.
:::

## Dive into Snakemake

Through practical exercises and interactive discussions, attendees will learn about Snakemake's potential in simplifying the creation and execution of data analysis pipelines.

### Installing Snakemake on Bridges-2

```
module load anaconda3
pip install snakemake --user -q
```

### Exercise. Weather.

Change directory to

```
cd /ocean/projects/cis230059p/$(whoami)/workflow-examples/snakemake/weather
```

Consider the Snakefile

```
rule wttr:
    output:
        touch("output.txt")
    shell:
        "curl wttr.in > {output}"
```

Notice the notation of this Snakefile; it is a Python-based DSL, not YAML. What is this workflow doing? Visit this [site](https://github.com/chubin/wttr.in) and update the Snakefile to pull the weather information for another city.

### Exercise. fortune.

Change directory to

```
cd /ocean/projects/cis230059p/$(whoami)/workflow-examples/snakemake/fortune
```

Consider the following Snakefile

```
# Rule 1: Generate fortune message
rule fortune:
    shell:
        "fortune fortunes"
```

* What changes do you need to make to this Snakefile to have the output written to disk?
* What changes do you need to make to the Snakefile and script to have the output of `fortune` piped into `cowsay`? (See the sketch below.)
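One possible answer to both questions (a sketch; it assumes `fortune` and `cowsay` are both available on `$PATH`):

```
# Sketch: write the fortune to disk, then pipe it through cowsay.
rule all:
    input:
        "cowsay.txt"

# Redirect fortune's stdout into a file instead of the terminal.
rule fortune:
    output:
        "fortune.txt"
    shell:
        "fortune fortunes > {output}"

# cowsay reads stdin when given no message arguments.
rule cowsay:
    input:
        "fortune.txt"
    output:
        "cowsay.txt"
    shell:
        "cowsay < {input} > {output}"
```

Running `snakemake --cores 1` builds the default target `all`, and Snakemake resolves the dependency chain backwards from it.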
### Exercise. More...

At PSC, the Biomed group uses Snakemake to perform tasks like

* computing checksums on large datasets
* extracting data and metadata from public datasets
* populating internal databases to assess metadata quality
* generating daily and weekly private data reports

Even though Snakemake was designed with a specific domain in mind (bioinformatics), it is quite flexible and easy to use. Think of a project or workflow of your own that could be easily implemented using Snakemake.

## Dive into CWL

Through interactive exercises and real-world examples, participants will learn how CWL standardizes workflow descriptions, making them portable and reproducible across different platforms.

### Installing cwl-tools

```
module load anaconda3
pip install cwlref-runner --user
```

### Exercise. empties.

Change directory to

```
cd /ocean/projects/cis230059p/$(whoami)/workflow-examples/cwl/empties
```

Consider the following CWL file

```
#!/usr/bin/env cwl-runner
cwlVersion: v1.2
class: Workflow

inputs:
  directory:
    type: Directory

outputs:
  out_empties:
    type: File
    outputSource: quality_control_1/out_empties

steps:
  quality_control_1:
    run:
      cwlVersion: v1.0
      class: CommandLineTool
      baseCommand: [bash, "/bil/pscstaff/rlagha/RL/forLuke/scripts/empties2.sh"]
      stdout: out_empties.txt
      inputs:
        directory:
          type: Directory
          inputBinding: {}
      outputs:
        out_empties:
          type: stdout
    in:
      directory: directory
    out: [out_empties]
```

What do you think this is doing? Modify the file above so that it runs an `empties2.sh` script located in the current directory. Notice that we use a YAML file for the input parameters

```
cat input.yml
directory:
  class: Directory
  location: /bil/data/inventory
```

which is really useful when running workflows programmatically. Modify the YAML file so that it points to (and validates against) the directory

```
/ocean/projects/cis230059p/$(whoami)/workflow-examples
```

Remember to replace `$(whoami)` with your PSC user account, since it will not be expanded inside the YAML file.

### Exercise. weather.

Another weather example. Change directory to

```
cd /ocean/projects/cis230059p/$(whoami)/workflow-examples/cwl/weather
```

Notice how this example looks very similar to the Snakemake version

```
# CWL Version 1.2 Declaration
cwlVersion: v1.2

# CommandLineTool Class Declaration
class: CommandLineTool

# Base Command to Execute
baseCommand: ["curl", "wttr.in"]

# Define Inputs (empty in this case)
inputs: []

# Define Outputs (empty in this case)
outputs: []
```

Run it. What is the difference between the output from Snakemake and CWL? Now modify the file above so that it saves the output to disk

```
# CWL Version 1.2 Declaration
cwlVersion: v1.2

# CommandLineTool Class Declaration
class: CommandLineTool

# Base Command to Execute
baseCommand: ["curl", "wttr.in"]

# Define Inputs (empty in this case)
inputs: []

# Define Outputs
outputs:
  # Define an Output Parameter Named 'output_message'
  output_message:
    # Specify the Type of the Output as 'stdout'
    type: stdout

# Specify the Standard Output (stdout) File
stdout: output.txt
```

### Exercise. bioformats2raw.

Change directory to

```
cd /ocean/projects/cis230059p/$(whoami)/workflow-examples/cwl/bioformats2raw
```

Consider the CWL file

```
#!/usr/bin/env cwl-runner
cwlVersion: v1.1
class: CommandLineTool

requirements:
  DockerRequirement:
    dockerPull: hubmap/ome-tiff-pyramid:latest

inputs:
  ometiff_file:
    type: File
    inputBinding:
      position: 0
  base_directory:
    type: string
    inputBinding:
      position: 1
  processes:
    type: int
    inputBinding:
      position: 2
  rgb:
    type: boolean?
    default: false
    inputBinding:
      prefix: --rgb
      position: 3
  downsample_type:
    type: string?
    inputBinding:
      prefix: --downsample-type
      position: 4

outputs:
  pyramid_dir:
    type: Directory
    outputBinding:
      glob: 'ometiff-pyramids'
  n5_dir:
    type: Directory
    outputBinding:
      glob: 'n5'

baseCommand: ['python3', '/opt/ometiff_to_pyramid.py']
```

Notice how this CWL file downloads a container from Docker Hub (the HuBMAP project makes all of its containers publicly available; for more info click [here](https://github.com/hubmapconsortium/portal-containers)). We can call `cwl-runner` with the Singularity option

```
cwl-runner --singularity bioformats2raw.cwl input.yaml
```

and it will run the tool inside the container. As a side note, the only reason this workflow works is that the team made sure the container is well formed. See the Dockerfile below

```
ARG BUILD_IMAGE=gradle:6.2.1-jdk8

#
# Build phase: Use the gradle image for building.
#
FROM ${BUILD_IMAGE} as build

ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update -qq \
    && apt-get -y install \
       libblosc1 \
       tzdata \
       zeroc-ice-all-runtime \
       zip \
    && rm -rf /var/cache/apt/*

WORKDIR /opt
RUN wget https://github.com/glencoesoftware/bioformats2raw/releases/download/v0.2.6/bioformats2raw-0.2.6.zip -O bioformats2raw.zip \
    && unzip bioformats2raw.zip \
    && mv bioformats2raw-0.2.6 bioformats2raw \
    && rm bioformats2raw.zip

WORKDIR /bioformats_pyramid
ENV RAW2OMETIFF_VERSION=v0.2.7

# Clone raw pyramid to tiff repo.
RUN git clone -b ${RAW2OMETIFF_VERSION} https://github.com/glencoesoftware/raw2ometiff.git \
    && cd raw2ometiff \
    && gradle build \
    && cd build/distributions \
    && rm raw2ometiff*tar \
    && unzip raw2ometiff*zip \
    && rm -f raw2ometiff*zip \
    && cd ../.. \
    && mv build/distributions/raw2ometiff* /opt/raw2ometiff

# Set working directory containing new cli tools.
WORKDIR /opt
COPY bin /opt
CMD ["/bin/bash"]
```

Submit the file `script.sh` to the scheduler.

### Exercise. OME-TIFF pyramid.

Change directory to

```
cd /ocean/projects/cis230059p/$(whoami)/workflow-examples/cwl/ome-tiff-pyramid
```

Explore the file `pipeline.cwl`. This CWL file defines a workflow with several steps

```
#!/usr/bin/env cwl-runner
class: Workflow
cwlVersion: v1.1

requirements:
  ScatterFeatureRequirement: {}

inputs:
  ometiff_directory:
    type: Directory
  processes:
    type: int
    default: 1
  rgb:
    type: boolean?
  downsample_type:
    type: string?

outputs:
  pyramid_dir:
    type: Directory[]
    outputSource: convert_to_pyramid/pyramid_dir
  n5_dir:
    type: Directory[]
    outputSource: convert_to_pyramid/n5_dir

steps:
  collect_ometiff_files:
    run: collect-ometiff-files.cwl
    in:
      ometiff_directory: ometiff_directory
    out: [ometiff_file, base_directory]

  convert_to_pyramid:
    scatter: [ometiff_file, base_directory]
    scatterMethod: dotproduct
    run: steps/ometiff-to-pyramid.cwl
    in:
      ometiff_file: collect_ometiff_files/ometiff_file
      base_directory: collect_ometiff_files/base_directory
      processes: processes
      rgb: rgb
      downsample_type: downsample_type
    out: [pyramid_dir, n5_dir]
```

It is important to note that each step is defined in its own CWL file

```
find . -type f -name "*cwl"
./collect-ometiff-files.cwl
./pipeline.cwl
./steps/ometiff-to-pyramid.cwl
```

What makes this workflow important?

* This workflow uses more than one step
* Each step is clearly defined by its own CWL file
* Each step uses a well-defined Docker container hosted on Docker Hub that can be converted to Singularity
* This is a HuBMAP workflow running on Brain Image Library data (talk about interoperability)
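To tie it together, a hypothetical invocation (a sketch; adjust the path to your own data) that runs the whole pipeline with Singularity:

```
cwl-runner --singularity pipeline.cwl --ometiff_directory /path/to/ometiffs
```

`cwl-runner` accepts workflow inputs either on the command line, as here, or through a YAML job file like the ones used in the earlier exercises.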