<style>
.reveal {
font-size: 18px;
}
.reveal pre {
font-size: 20px;
}
.reveal section p {
text-align: left;
font-size: 18px;
line-height: 1.2em;
vertical-align: top;
}
.reveal section figcaption {
text-align: center;
font-size: 20px;
line-height: 1.2em;
vertical-align: top;
}
.reveal section h1 {
font-size: 26px;
line-height: 1.2em;
vertical-align: top;
}
.reveal section h2 {
font-size: 24px;
line-height: 1.2em;
vertical-align: top;
}
.reveal section h3 {
font-size: 22px;
line-height: 1.2em;
vertical-align: top;
}
.reveal ul {
display: block;
}
.reveal ol {
display: block;
}
</style>

# Part 1: Elevating Scientific Computing with Singularity Containers
Ivan E. Cao-Berg
Research Software Specialist
Pittsburgh Supercomputing Center
Carnegie Mellon University
---
## Meet the Team and Introductions
<img src="https://hackmd.io/_uploads/SyxxS39gp.png" width="50%" />
---
## Before we begin
- :warning: Have an issue or question?
  - Feel free to ask during the presentation, on chat, or on Slack.
  - Send an email to the Help Desk at `help@psc.edu` after the workshop.
- :computer: What is the project charge ID?
  - `cis230059p`
- :computer: What is the reservation name?
  - `workshop`
- :computer: Where can I find the code and data?
  - The code and data are located in `/ocean/projects/cis230059p/shared`.
  - The code can also be found in this [repo](https://github.com/pscedu/workflow-examples).
- :computer: Where do I save my output?
  - You can save your output in `/ocean/projects/cis230059p/$(whoami)`.
- :computer: Where can I find the docs?
  - You can find the documentation [here](https://hackmd.io/@icaoberg/Ske8b00oh).
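---
## Before we begin (cont.)
A minimal sketch of preparing a personal output directory, assuming your account can write under the project directory listed earlier. The variable names `PROJECT_DIR` and `OUTPUT_DIR` are illustrative and not part of any PSC tooling.

```shell
# Illustrative variable names; only the paths come from the workshop notes.
PROJECT_DIR="/ocean/projects/cis230059p"
OUTPUT_DIR="${PROJECT_DIR}/$(whoami)"

# Create the directory only when the project space exists on this machine.
if [ -d "${PROJECT_DIR}" ]; then
    mkdir -p "${OUTPUT_DIR}"
fi
echo "Output will be saved in ${OUTPUT_DIR}"
```

On a machine without the `/ocean` file system, the snippet only prints the path and creates nothing.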
---
## Before we begin (cont.)
- :warning: Have an issue or question during the workshop?
  - Raise your hand on Zoom, message us on Slack, or ask questions during the presentation and the hands-on session.
---
## Resources available during this experience
* 30 regular-memory compute nodes that can be accessed using SLURM from the partition named `RM-shared` and reservation `workshop`.
* If you do not wish to install software, you can use Open OnDemand to connect to Bridges-2 at `http://ondemand.bridges2.psc.edu`.
* To connect to Bridges-2 with an SSH client, follow the official [documentation](https://www.psc.edu/resources/bridges-2/user-guide/#:~:text=Using%20your%20ssh%20client%2C%20connect,username%20and%20password%20when%20prompted).
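---
## Resources available during this experience (cont.)
A hedged sketch of a batch script that targets the workshop partition, reservation, and charge ID. The time limit, the task count, and the `MESSAGE` variable are placeholder assumptions; adjust them to your job.

```shell
#!/bin/bash
#SBATCH --partition=RM-shared
#SBATCH --reservation=workshop
#SBATCH --account=cis230059p
#SBATCH --ntasks-per-node=1
#SBATCH --time=00:10:00

# Report which node ran the job so you can confirm the allocation worked.
MESSAGE="Running on $(hostname)"
echo "${MESSAGE}"
```

Save it under a name of your choice, for example `hello.sbatch` (hypothetical), and submit it with `sbatch hello.sbatch`.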
---
## What to expect
* A gentle introduction to workflow management systems.
* Instructions on how to set up your user account for Nextflow, Snakemake and CWL-runner.
* Inspect and run some simple examples to get you started.
* This presentation is given from the perspective of a basic power user. Take some of my statements with a grain of salt, since some things may be doable with the support of PSC engineers.
* We will monitor the Slack workspace for a week after the workshop for any questions or concerns.
* The presentations, documentation and video recording will be made available.
---
## Motivation for this workshop

----
## Motivation for this workshop (cont.)
- FAIR principles are used in data management and stewardship.
- FAIR stands for **Findable**, **Accessible**, **Interoperable**, and **Reusable**.
- Generally, FAIR principles are applied to data and metadata.
- FAIR principles are crucial for advancing data-driven research and innovation.
- Implementing FAIR practices enhances the overall quality and impact of scientific work.
- **A commitment to FAIR principles contributes to a more open, collaborative, and reproducible research ecosystem.**
---
## Motivation for this workshop (cont.)

---
## Containerization in Computing
A **container** is a lightweight, standalone software package that encapsulates everything needed to run an application, including code, runtime, libraries, and settings.
---
## Why is container technology popular?
**1. Isolation**
- *Lightweight:* Containers are lighter than virtual machines.
- *Isolation:* Each container isolates its application and dependencies.

**2. Portability**
- *Consistency:* Containers run consistently across environments.
- *Platform-agnostic:* Containers run on various platforms.

**3. Efficiency**
- *Resource Efficiency:* Containers share the host OS kernel.
- *Fast Start-up and Scaling:* Containers start quickly and scale easily.

**4. Flexibility**
- *Polyglot Environments:* Supports multiple programming languages.
---
## Why is container technology popular? (cont.)
**5. Resource Utilization**
- *Optimized Resource Utilization:* Containers use resources efficiently because they do not boot a full guest operating system.
- *Density:* Many containers can run on a single host.

**6. Security**
- *Isolation:* Containers limit the impact of security breaches.
- *Immutable Infrastructure:* Container images are immutable, which makes deployments easier to audit and reproduce.

**7. Community and Ecosystem**
- *Open Source Ecosystem:* Strong open-source communities.
- *Standardization:* Containers are a standard unit of deployment.
---
## My biased opinion about containers
* Users do not have to wait for an engineer to install a tool system-wide.
* Users can install non-traditional applications, such as editors and utilities, in their own user space.
* Users can deploy applications that cannot be built with the toolkits available on Bridges-2.
* Users can easily deploy applications that are unsupported, outdated or deprecated.
---
## What is Docker?
* [Docker](https://www.docker.com/) is a popular **containerization platform** that simplifies the process of creating, deploying, and managing containers.
* While Docker is very popular, most HPC clusters do not support Docker out of the box :no_entry:.
* [Docker Hub](https://hub.docker.com/) is a **cloud-based registry** provided by Docker that serves as a centralized platform for managing and distributing Docker containers.
* **udocker** is a user-level tool that runs Docker containers without elevated privileges. It can stand in for Docker where Docker itself cannot run, for example when you lack root access. Note that it does not work with every container.
---
## What is Singularity?
* [Singularity](https://sylabs.io/singularity/) is an open-source container platform designed for high-performance computing (HPC) and scientific workloads.
* Singularity is designed for high compatibility with various Linux distributions and HPC environments.
* Singularity is relatively easy to use, especially for users familiar with containerization concepts.
* Singularity containers generally introduce minimal overhead, making them suitable for high-performance computing tasks.
* Singularity facilitates reproducibility by encapsulating the entire software stack and dependencies within containers.
* Singularity can convert Docker images, enhancing the usability of existing containerized applications.
* Singularity is well-suited for scientific workflows, particularly in research and data analysis.
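---
## What is Singularity? (cont.)
A small sketch of converting a Docker image, assuming Singularity is installed and the node can reach Docker Hub. The `alpine` image and the file name `alpine.sif` are arbitrary examples.

```shell
# Assumed example image; other public Docker Hub images should work similarly.
SOURCE="docker://alpine:latest"
IMAGE="alpine.sif"

# Skip gracefully on machines where Singularity is not installed.
if command -v singularity >/dev/null 2>&1; then
    # Download the Docker image and convert it into a single SIF file.
    singularity pull "${IMAGE}" "${SOURCE}"
    # Run one command inside the resulting container.
    singularity exec "${IMAGE}" cat /etc/os-release
else
    echo "singularity not found; try this on a compute node"
fi
```

The pull step writes one `.sif` file to the current directory, which you can copy or share like any other file.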
---
## Limitations
* Even though most software can be containerized, some software will not work properly inside a container because of how it is implemented.
* For example, Singularity images are read-only by default, so software that writes temporary files inside the container may fail.
* Some microservices can be deployed with Singularity. However, orchestrating them with Singularity can be challenging.
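---
## Limitations (cont.)
Two common workarounds for software that needs to write files, sketched under the assumption that an image named `alpine.sif` (hypothetical) already exists in the current directory. Whether either option suits a given application depends on where it writes.

```shell
IMAGE="alpine.sif"        # hypothetical image pulled or built earlier
SCRATCH="$(mktemp -d)"    # writable directory on the host

if command -v singularity >/dev/null 2>&1 && [ -f "${IMAGE}" ]; then
    # Option 1: bind the host directory to /mnt inside the container.
    singularity exec --bind "${SCRATCH}:/mnt" "${IMAGE}" touch /mnt/scratch-file
    ls "${SCRATCH}"
    # Option 2: add a temporary writable layer that is discarded on exit.
    # Normal file permissions inside the image still apply.
    singularity exec --writable-tmpfs "${IMAGE}" sh -c 'echo "writable layer active"'
else
    echo "singularity or ${IMAGE} not found; nothing to run"
fi
```

With the bind mount, files written to `/mnt` in the container land in the host directory and survive after the container exits.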
---
## Docker vs Singularity
| Feature | Singularity | Docker |
| ------------------------------ | ----------------------------------------------- | ------------------------------------------------ |
| **Use Case** | High-performance computing (HPC), Scientific workloads | General-purpose containerization |
| **Compatibility** | Optimized for HPC environments | Versatile, used in various environments and platforms |
| **User Privileges** | User-friendly, runs with user privileges | Typically requires administrative privileges |
| **Container Format** | Single-file format (.sif) | Multi-layer image format |
| **Daemon Requirement** | No daemon required | Requires a background daemon for running containers |
| **Security** | Emphasizes security, user namespace feature | Strong security features, with namespaces and cgroups |
| **Transport and Sharing** | Single-file container, easy to transport and share | Images can be shared via registries like Docker Hub |
| **Integration with Docker** | Can run Docker containers | Natively supports Docker container execution |
| **Popularity** | Commonly used in HPC and scientific communities | Widely adopted in the software development community |

*Note: This table provides a general comparison based on common characteristics, and specific use cases may influence the choice between Singularity and Docker.*
---
## What do I need to build a Singularity container?
1. **Base Operating System Image**
2. **Definition File (Singularity Recipe)**
3. **Bootstrap Process**
4. **Environment Setup**

Remember that Singularity simplifies many aspects of containerization, making it user-friendly and particularly suitable for high-performance computing environments.
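---
## What do I need to build a Singularity container? (cont.)
The four ingredients can be sketched in one small definition file. This is an illustrative recipe rather than the workshop's own: the file names `hello.def` and `hello.sif`, the Ubuntu base image, and the installed package are all assumptions.

```shell
# Write a minimal definition file; the file names here are hypothetical.
cat > hello.def <<'EOF'
Bootstrap: docker
From: ubuntu:22.04

%post
    # Runs once at build time, inside the base image.
    apt-get update && apt-get install -y --no-install-recommends curl

%environment
    # Set every time the container starts.
    export LC_ALL=C

%runscript
    echo "Hello from Singularity"
EOF

# Building usually needs root or the --fakeroot feature, which may not be
# enabled on every cluster; skip when Singularity is unavailable.
if command -v singularity >/dev/null 2>&1; then
    singularity build --fakeroot hello.sif hello.def && singularity run hello.sif
else
    echo "singularity not found; hello.def was written but not built"
fi
```

Here `Bootstrap` and `From` select the bootstrap process and the base operating system image, while `%post` and `%environment` handle software installation and environment setup.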
---
## Before we continue
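* Singularity now exists as two closely related projects: SingularityCE, maintained by Sylabs, and Apptainer, hosted by the Linux Foundation.
* Sylabs provides licensing, enterprise-level support, professional services, cloud services, and value-added tooling for performance-intensive, mission-critical compute environments and edge deployments.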
* Apptainer is an open-source project with a friendly community of developers and users. The user base continues to expand, with Apptainer/Singularity now used across industry and academia in many areas of work.
---
## Exercises
Click [here](https://hackmd.io/@icaoberg/SkeHG6Kxa).