<style>
.reveal {
font-size: 18px;
}
.reveal pre {
font-size: 20px;
}
.reveal section p {
text-align: left;
font-size: 18px;
line-height: 1.2em;
vertical-align: top;
}
.reveal section figcaption {
text-align: center;
font-size: 20px;
line-height: 1.2em;
vertical-align: top;
}
.reveal section h1 {
font-size: 26px;
vertical-align: top;
}
.reveal section h2 {
font-size: 24px;
line-height: 1.2em;
vertical-align: top;
}
.reveal section h3 {
font-size: 22px;
line-height: 1.2em;
vertical-align: top;
}
.reveal ul {
display: block;
}
.reveal ol {
display: block;
}
</style>
<img align="center" width="25%" src="https://hackmd.io/_uploads/Syhyrl9uT.png" />
# Spatial Transcriptomics Working Group
## Analysis Resources
Ivan E. Cao-Berg
Research Software Specialist
Brain Image Library
Biomedical Applications Group
Jan 9, 2024 1-4 pm ET
---
## Before we begin
- :question: Have a question during the presentation?
<a href="https://www.lifewire.com/raise-hand-in-zoom-5100882"><img src="https://hackmd.io/_uploads/r1PWX3YOa.png" width="50%" /></a>
- :warning: Have an issue or a question after the workshop?
  - Send an email to the Help Desk at `bil-support@psc.edu`
---
## Resources available during this workshop
* A SLURM reservation named `workshop` that lasts 24 hours.
* Access to the large-memory nodes through the `compute` partition, which is shared among all users.
---
<img align="left" src="https://slurm.schedmd.com/slurm_logo.png" width="15%"/>
Slurm is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters.
```bash=
sinfo #view information about Slurm nodes and partitions
squeue #view information about jobs located in the Slurm scheduling queue
scontrol #view or modify Slurm configuration and state
sbatch #submit a batch script to Slurm
```
The commands above are the ones you will most likely use during this hackathon. For the full SLURM documentation, click [here](https://slurm.schedmd.com/documentation.html).
---
### `sinfo` - Example 1
```bash=
sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
compute*     up 2-00:00:00      1  drain l008
compute*     up 2-00:00:00      7   idle l[001-007]
```
As a participant in this hackathon, you should have access to the `compute` partition through the reservation `hackathon`.
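To check how many nodes are free before submitting, the table above can be summarized with `awk`. This is a minimal sketch: the sample text is copied from the example output, and `count_idle` is a hypothetical helper name, not part of Slurm.

```bash
# Sample text copied from the sinfo example on this slide.
sample='PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
compute* up 2-00:00:00 1 drain l008
compute* up 2-00:00:00 7 idle l[001-007]'

# count_idle (hypothetical helper): sum the NODES column (field 4)
# over rows whose STATE column (field 5) is "idle".
count_idle() {
  awk 'NR > 1 && $5 == "idle" { n += $4 } END { print n + 0 }'
}

printf '%s\n' "$sample" | count_idle
```

On the cluster, `sinfo -p compute | count_idle` should feed live data through the same filter, assuming the default `sinfo` column layout.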
---
### `squeue` - Example 1
Use `squeue -u $(whoami)` to list your jobs and their status.
```bash=
squeue -u $(whoami)
JOBID PARTITION     NAME     USER ST  TIME NODES NODELIST(REASON)
14243   compute script.s icaoberg  R 15:34
```
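The `ST` column holds a short job state code. The sketch below expands the most common codes; `describe_state` is a hypothetical helper written for this slide, and the list of codes is not exhaustive.

```bash
# describe_state (hypothetical helper): expand a squeue ST code
# into a readable word. Only common codes are covered.
describe_state() {
  case "$1" in
    PD) echo "pending" ;;
    R)  echo "running" ;;
    CG) echo "completing" ;;
    CD) echo "completed" ;;
    CA) echo "cancelled" ;;
    F)  echo "failed" ;;
    TO) echo "timeout" ;;
    *)  echo "other ($1)" ;;
  esac
}

describe_state R    # state of job 14243 in the example above
describe_state PD
```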
---
### `sbatch` - Example 1
Consider the following file named `script.sh`
```bash=
cat script.sh
#!/bin/bash
module load anaconda3
pip install --user cowsay
cowsay "Hello, World"
```
`sbatch` is used to submit jobs to the scheduler. For more info on `sbatch`, click [here](https://slurm.schedmd.com/sbatch.html).
---
### `sbatch` - Example 1 (cont.)
:::info
:bulb: Remember to use the reservation `hackathon` when submitting to the scheduler.
:::
```bash
sbatch -p compute --reservation=hackathon script.sh
Submitted batch job 82721
```
For more info on `sbatch`, click [here](https://slurm.schedmd.com/sbatch.html).
---
### `sbatch` - Example 1 (cont.)
If you do not specify an output filename, the scheduler creates one automatically, named after the job ID. In this example, the file is `slurm-82721.out`.
```bash
cat slurm-82721.out
  ____________
| Hello, World |
  ============
            \
             \
               ^__^
               (oo)\_______
               (__)\       )\/\
                   ||----w |
                   ||     ||
```
For more info on `sbatch`, click [here](https://slurm.schedmd.com/sbatch.html).
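The job ID in the `Submitted batch job` message determines the default output filename, `slurm-<jobid>.out`. A small sketch of deriving that name in a script, using the message from this example as sample text; the variable names are illustrative.

```bash
# Message printed by sbatch in the example above.
msg='Submitted batch job 82721'

# Keep only the last word of the message, which is the job ID.
jobid=${msg##* }

# Default output file, written to the directory where sbatch was run.
outfile="slurm-${jobid}.out"
echo "$outfile"
```

On a single-cluster setup, `sbatch --parsable` prints only the job ID, which avoids the parsing step, and `-o <file>` sets a different output filename.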
---
### `sbatch` - Example 2
```bash
sbatch -p compute -N1 script.sh                   #number of nodes - please avoid using it!
sbatch -p compute -n1 script.sh                   #number of tasks (one core each by default)
sbatch -p compute --mem=64G script.sh             #memory per node
sbatch -p compute -N1 -n10 --mem=128G script.sh   #combine them all as needed
```
For more info on `sbatch`, click [here](https://slurm.schedmd.com/sbatch.html).
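The same options can also be written inside the script as `#SBATCH` directives, so the submission command stays short. A sketch under assumptions: the file name `job.sh`, the resource values, and the output name are illustrative; the partition and reservation are the ones used in the earlier `sbatch` example.

```bash
# Write an example batch script; the quoted 'EOF' keeps the text literal.
cat > job.sh <<'EOF'
#!/bin/bash
# Partition and reservation, as on the earlier slides.
#SBATCH -p compute
#SBATCH --reservation=hackathon
# One task and 64 GB of memory per node (illustrative values).
#SBATCH -n 1
#SBATCH --mem=64G
# %j expands to the job ID in the output filename.
#SBATCH -o cowsay-%j.out

module load anaconda3
pip install --user cowsay
cowsay "Hello, World"
EOF

# Check the script for shell syntax errors without running it.
bash -n job.sh && echo "job.sh parses cleanly"
```

Submit it with `sbatch job.sh`. Options given on the command line take precedence over the directives in the file.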
---
### `scancel` - Example 1
```bash
scancel -u $(whoami) #cancel all my jobs
scancel 1234 #cancel job 1234
```
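`scancel` also accepts filters, which helps when only some jobs should be removed. A sketch with a dry-run guard: `run` and `DRY_RUN` are hypothetical helpers for this example, while `--state` and `-u` are standard `scancel` options.

```bash
# run (hypothetical helper): print the command when DRY_RUN=1 (the default),
# execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Cancel only my jobs that are still waiting in the queue.
run scancel --state=PENDING -u "$(whoami)"
```

Set `DRY_RUN=0` once the printed command looks right.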
----
## `interact`
The interact command is an in-house script for starting interactive sessions.
```bash
> interact -h
Usage: interact [OPTIONS]
    -d              Turn on debugging information
    --debug
    --noconfig      Do not process config files
    -gpu            Allocate 1 gpu in the GPU-shared partition
    --gpu
    --gres=<list>   Specifies a comma delimited list of generic consumable
                    resources. e.g.: --gres=gpu:1
    --mem=<MB>      Real memory required per node in MegaBytes
...
```
---
### `interact` - Useful Tips and Tricks
- `interact` is a wrapper built in house.
- Use `interact` and avoid using `salloc` or `srun` on BIL hardware.
- The template is
```bash
interact -A tra220018p -p compute -n <number-of-cores> --mem=<memory>
```
- Remember to specify the account and reservation when using `interact`
  - Account: `tra220018p`
  - Reservation: `reservation`
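The template can be filled in from a couple of variables. This is a sketch only: the core count and memory size are assumed values, and because the help excerpt documents `--mem` in megabytes, the size is converted from gigabytes first. The reservation option does not appear in the help excerpt, so it is left out here.

```bash
cores=4      # assumed value for this example
mem_gb=16    # assumed value for this example

# The help text documents --mem in megabytes, so convert from gigabytes.
mem_mb=$(( mem_gb * 1024 ))

# Build the command from the template and print it for review.
cmd="interact -A tra220018p -p compute -n ${cores} --mem=${mem_mb}"
echo "$cmd"
```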
---
## In a nutshell
- `LMOD` is used to load software in the workshop.
- `SLURM` is used to submit jobs to the scheduler managing the large-memory nodes.
- `interact` is used to start interactive sessions on the large-memory nodes.
---
## Workflow management systems (WMS)
A practical understanding of WMS benefits, automation, and implementation.
1. **[Snakemake](https://snakemake.github.io/).** Snakemake is a workflow management system that uses a Python-based domain-specific language. It is known for its simplicity and flexibility, making it popular among bioinformaticians for defining and executing data analysis pipelines.
2. **[Nextflow](https://www.nextflow.io/).** Nextflow is a data-driven workflow management system that enables the creation of reproducible and scalable bioinformatics workflows. Its domain-specific language is based on Groovy; DSL2 is the current version of the syntax.
3. **[Common Workflow Language (CWL)](https://www.commonwl.org/).** CWL is not a workflow management system itself but a standard for describing workflows so that different engines can execute them. Several workflow engines, including Cromwell and Rabix Bunny, support CWL, making it a popular choice for interoperability.
These workflow management systems are widely used in bioinformatics to automate and streamline the analysis of biological data, from genomics to proteomics and beyond. Researchers often choose a system based on their specific requirements, familiarity with the tools, and the nature of their data analysis tasks.
---
[https://ondemand.bil.psc.edu](https://ondemand.bil.psc.edu)
---
## Hands-on Activity: Running Cellpose
---
* without GPUs