TTT4HPC - Singularity episode

tags: `Training` `TTT4HPC`

Maiken's comments/suggestions on material 06.05.24

The image: https://github.com/coderefinery/ttt4hpc_containers/blob/main/content/img/containerized_application.png - would be nice to keep the not-equal sign in the first three layers there as in the previous figure
After https://coderefinery.github.io/ttt4hpc_containers/intro_and_motivation/#apptainer-vs-singularity I think we should have something about the different container technologies: that Docker is a very popular one, and that most others are compatible with it.
In https://coderefinery.github.io/ttt4hpc_containers/basics_running_containers/#how-to-read-apptainer-commands - for the commands that apptainer ignores - should we also say that anything given after the image file applies to the application that the image runs, depending on which subcommand you use?
In https://coderefinery.github.io/ttt4hpc_containers/basics_running_containers/#obtaining-a-container-from-container-registry - I think we should explain the name python.sif: whether it is an existing file or the filename we chose to give the image file.
Should we have an example of an entry script? https://coderefinery.github.io/ttt4hpc_containers/basics_running_containers/#running-the-container - or maybe we should wait until we mention entry scripts, as it is not yet clear what these are and it may add confusion? I think we should explain a bit more what the difference is between the `run` command and the `exec` command, i.e. that `run` typically runs the set of commands in the runscript and is not interactive, while `exec` bypasses this and instead runs what we tell it to run inside the container?
In https://coderefinery.github.io/ttt4hpc_containers/services_apps/ you do not explain the `instance start` command, nor that we give a name to the service we start and that this name is used when running apps on the service. Maybe that could be added? It is also worth mentioning what a "service" is. Something like: not just a program that executes and finishes, but a program that keeps running in the background waiting for instructions. For services in apptainer, does one always start them with `instance start` and then run the apps afterwards?
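
A possible sketch for the `run` / `exec` / `instance start` discussion above. It is only an illustration, not taken from the current materials: the image tag `python:3.12-slim`, the file name `python.sif` and the instance name `demo_service` are our own choices, and the function is only defined here, not executed.

```shell
# python.sif is simply the output filename we choose when pulling.
IMAGE="python.sif"

demo_run_exec_instance() {
    # Download a Docker image and convert it into a single SIF file.
    apptainer pull "$IMAGE" docker://python:3.12-slim

    # Show the runscript, i.e. what "apptainer run" will execute.
    apptainer inspect --runscript "$IMAGE"

    # run: execute the runscript stored in the image.
    apptainer run "$IMAGE"

    # exec: bypass the runscript and run the command we give.
    # Everything after the image file belongs to that command.
    apptainer exec "$IMAGE" python3 --version

    # instance start: launch the container in the background under a name...
    apptainer instance start "$IMAGE" demo_service
    apptainer instance list
    # ...and use that name to run commands inside the running instance.
    apptainer exec instance://demo_service python3 --version
    apptainer instance stop demo_service
}
```

Calling `demo_run_exec_instance` on a node where apptainer is installed walks through the steps in order. For this particular image the `run` step probably opens a Python prompt, so it is best tried in an interactive session.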

Notes from 26/4 meeting

Simo drafted 4 sections https://coderefinery.github.io/ttt4hpc_containers/
there are many flags and they might be obscure to most people. We need to explain the commands inside apptainer
apptainer syntax (we can also point to apptainer documentation pages rather than making our own list. It can be a "box" which points to the docs https://apptainer.org/docs/user/main/cli.html )
maybe somebody did an apptainer cheatsheet that we can link. Even collecting all the commands we show might be interesting as a cheatsheet
we need to make sure to cover the basics and make no assumptions (shipping applications -> shipping containers as used by boats (standard boxes) -> lots of shipping metaphors (e.g. dock) -> container and outside world do not interfere)

todo list:

making or finding a cheatsheet
adding a box about apptainer syntax from the apptainer docs (see above)
a brief intro on what is a "container image" already on page 1. This can be done during the talk "but what is an image simo?"
Tasks for materials
- …
- …
Other tasks to do:
- try the commands on various clusters
FAQs/typical-uses: should we add them?
- e.g. root did the installation and you cannot access it when running (do not be root when installing)
- existing container with existing code how do I edit (copy out, edit, re-bind)
- do we want an example for binding ports? mysql example? a web app, e.g. tensorboard or rstudio

timings morning

Enrico screen shares the docs + white bg terminal on compute node + history
general intro of ttt4hpc: 05m (10:00 - 10:05 EEST)
- Enrico and Richard: about the course in general, credits, exercises, etc
Intro to containers (on HPC): 15m (10:05 - 10:20)
- Simo leads and others ask
- Enrico starts from previous part with "Ok simo what is a container?"
- Simo goes through part "What even is a container?"
- Section: "What is the intended use case of Apptainer?" We can add this as a question to the users in the notes doc and see what they add. This part could start with Simo asking Maiken "what do you think containers can help with?"
- Q-suggestion: What are libraries actually? Ref Figure 2 with caption "Without containers, your program uses libraries from the host system"
Basics of running containers 20m (10:20 - 10:40)
- Simo leads and others respond
- Q-suggestion: https://coderefinery.github.io/ttt4hpc_containers/basics_running_containers/#obtaining-a-container-from-container-registry Why are we using Docker here? Is Docker so important to our usage of Apptainer?
- Q-suggestion: https://coderefinery.github.io/ttt4hpc_containers/basics_running_containers/#running-the-container - what are actual entrypoint files? And are they always present?
Intro to container images: 20m (10:40 - 11:00)
- Simo leads
break: 10m (11:00 - 11:10 EEST)
discussion from the notes doc: 05m (11:10 - 11:15 EEST)
- Enrico hosts discussion and filters questions
Building Apptainer images: 25m (11:15 - 11:40 EEST)
- Simo leads
Binding folders into your container: 15m (11:40 - 11:55 EEST)
- Simo leads
end discussions and see you at the exercises: 5m (11:55 - 12:00 EEST)
- Enrico can host/filter questions discussion and wrap up

timings afternoon

Exercises can be in the second table of content of the page (now called examples)
- test container
- build a simple cowsay with lolcat
- containerised conda? build and run something small
- test a gpu? (pytorch?)
Other more optional exercises later?
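
A rough sketch of what the cowsay + lolcat exercise could look like. The file names `cowsay.def` and `cowsay.sif` and the base image `ubuntu:22.04` are assumptions for illustration; the build step only runs where apptainer is available and unprivileged builds are allowed.

```shell
# Write a minimal definition file for the exercise.
cat > cowsay.def <<'EOF'
Bootstrap: docker
From: ubuntu:22.04

%post
    # Runs once at build time, inside the container.
    export DEBIAN_FRONTEND=noninteractive
    apt-get update
    apt-get install -y cowsay lolcat
    rm -rf /var/lib/apt/lists/*

%environment
    # Ubuntu installs both tools under /usr/games.
    export PATH="/usr/games:$PATH"

%runscript
    # Arguments given to "apptainer run" end up in "$@".
    cowsay "$@" | lolcat
EOF

# Build and try the image only if apptainer is on PATH.
if command -v apptainer >/dev/null 2>&1; then
    apptainer build cowsay.sif cowsay.def
    apptainer run cowsay.sif "Hello from the container"
fi
```

Note that no `sudo` appears in `%post`: the commands there already run with the privileges the build provides.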

Notes from 18/4 meeting

Demos and exercises is confusing
Intro
- Simo drafted the intro https://coderefinery.github.io/ttt4hpc_containers/intro_and_motivation/
discussion on use cases/approaches
- maybe you just want to be a user (reuse existing containers)
- maybe you need to build
- and here what i need if I want to
main learning outcome
- you need to install something where you don't have permissions and containers are the only way (e.g. LUMI supercomputer)
- some usual cases: pytorch container (e.g. pulled from their repo)
- we could ask Radovan and Diana
- the containerised conda (very important for CSC cluster) https://github.com/simo-tuomisto/micromamba-apptainer/
things to point out
- in the conda example, you can of course install ubuntu, add conda, etc etc… or get an existing container (like the micromamba) that already has tools
we can show where we are with an icon or textprompt (am I on the host, am I on the container)
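
One possible way to implement the "am I on the host, am I in the container" text prompt is sketched below. It assumes that the runtime sets `APPTAINER_CONTAINER` (Apptainer) or `SINGULARITY_CONTAINER` (SingularityCE) inside the container; the function name `where_am_i` is our own.

```shell
# Print "host" outside a container, or the image path when inside one.
where_am_i() {
    # Assumption: one of these variables is set by the container runtime.
    image="${APPTAINER_CONTAINER:-${SINGULARITY_CONTAINER:-}}"
    if [ -n "$image" ]; then
        echo "container: $image"
    else
        echo "host"
    fi
}

where_am_i
```

During the stream the same function can be called both before and after `apptainer shell` to make the switch visible. The interactive shell inside the container also changes its prompt (`Apptainer>` by default), which we can point out.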

TODOs

Simo adds placeholders for the 2h morning
MP and/or EG can help expand on placeholders or iterate/review
We are all free to add more "demos" stand alone examples following the template "https://raw.githubusercontent.com/coderefinery/ttt4hpc_containers/main/content/verify_installation.rst" (the solutions do not have to be there)

Touching base:

timeline

13/02 meeting: notes at the end. Simo and Maiken do a first write-up here on hackmd. Enrico gives it a second pass next week
14/02

What's out there / references to cite

Use stories / use cases

think of cases for the people to identify themselves with

learning goals

expand on the goals (e.g. based on the use cases above)

Structure for the day

Stream + Lunch + Zoom-hands-on

0) prerequisites

run some singularity command before? (e.g. get an image)
check what you have in your cluster (singularity-ce or apptainer?)
install (but you are on your own)
basic of linux shell + how to connect to your cluster
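
For the "check what you have in your cluster" prerequisite, participants could run something like the following. This is a sketch only; the function name `detect_container_runtime` is our own, and some clusters may require loading a module before either command appears on PATH.

```shell
# Print the first container runtime found on PATH, or "none".
detect_container_runtime() {
    for candidate in apptainer singularity; do
        if command -v "$candidate" >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    echo "none"
    return 1
}

runtime=$(detect_container_runtime) || true
echo "Container runtime: $runtime"

# If something was found, also report its version.
if [ "$runtime" != "none" ]; then
    "$runtime" --version
fi
```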

1) Live stream on TwitchTV 2h

00:00 - 00:05 Intro on the day (Enrico + Richard)
00:05 - 00:30 First steps + first commands (SIMO?)
- Basic concepts added concept by concept
- Why singularity? When is it useful?
  - Mention the use cases we outline
- Some notes at the end of this doc: start simple and add concepts+commands
- first command
- very simple exercise singularity pull e.g.
- Introduce the concept of the container image (why is it named image? because of the layers)
- ssh into the image (singularity shell…) you are now on a different OS…
- not everything that is on the materials has to be on streaming
- image is read only, how do we do stuff with it? mounting / interacting with local filesystem
- definition files
01:00 - 01:15 Break
01:10 - 01:30 More complex stuff
- env vars
- mounting
- volumes
- converting from docker to singularity
- showing several container services communicating with each other (?) At least a demo would be good.
- docker compose is not available in singularity, so communication between containers has to happen on the machine (e.g. a singularity container running a mysql db and another container running the client) https://github.com/AaltoSciComp/secure-workflows
- when you develop stuff, the read-only nature of the singularity image is annoying. But you can keep the dependencies fixed in the image and mount the "live code". If you need to add packages all the time it is not the best solution.
- sandbox images … too advanced?
- minimizing image size?
01:50 - 02:00 Discussion, future directions, what we will do in the exercises
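
The "fixed dependencies in the image, live code mounted from outside" idea and the env vars item could be demonstrated roughly like this. It is a sketch under assumptions: `python.sif` is assumed to exist already (e.g. from an earlier pull), and the paths `src/` and `/opt/src` are made up for the example.

```shell
# Code we keep editing on the host.
mkdir -p src
cat > src/hello.py <<'EOF'
print("edited on the host, executed in the container")
EOF

IMAGE="python.sif"   # assumed to exist already

# Run only where apptainer and the image are actually available.
if command -v apptainer >/dev/null 2>&1 && [ -f "$IMAGE" ]; then
    # --bind host_path:container_path makes ./src visible at /opt/src.
    apptainer exec --bind "$PWD/src:/opt/src" "$IMAGE" python3 /opt/src/hello.py

    # Variables prefixed with APPTAINERENV_ are passed into the container
    # without the prefix.
    APPTAINERENV_GREETING="hello" apptainer exec "$IMAGE" sh -c 'echo "$GREETING"'
fi
```

Editing `src/hello.py` on the host and re-running the `exec` line shows the change immediately, without rebuilding the image.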

2) Lunch 1h

02:00 - 03:00 Go eat

3) Zoom 1.5h

TO BE CHANGED (ENRICO)

00:00 - 00:10 Intro to Exercise 1
00:10 - 00:30 Exercise 1
00:30 - 00:45 Exercise 1 solution
00:45 - 00:50 Intro to Exercise 2
00:50 - 01:10 Break + Exercise 2
01:10 - 01:30 Exercise 2 Solution + q&a + where to go from here

What could we cover?

(brainstorming list)

singularity vs apptainer
containerize a conda environment
build a container from dockerhub
build a container from local docker
build a container from recipe
shell in a container
run app from the container
graphical app in the container
- (useful but maybe too difficult for this workshop with various setups?)
singularity on nvidia/AMD gpu (singularity on lumi)
fakeroot stuff
container file format
other useful commands with singularity
singularity slurm example
env variables by apptainer / singularity
cache: how to use it, empty it
tempdir and cachedir and how to avoid going overquota
sharing singularity images
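
For the cache and tempdir items in the list above, a minimal sketch of how to keep both out of the home directory quota. The scratch location is a placeholder; each cluster has its own recommended path.

```shell
# Hypothetical scratch location; replace with what your cluster recommends.
SCRATCH_DIR="${SCRATCH_DIR:-/tmp/$(id -un)}"

# Apptainer reads these two variables for its cache and temporary build files.
export APPTAINER_CACHEDIR="$SCRATCH_DIR/apptainer/cache"
export APPTAINER_TMPDIR="$SCRATCH_DIR/apptainer/tmp"
mkdir -p "$APPTAINER_CACHEDIR" "$APPTAINER_TMPDIR"

# SingularityCE uses the SINGULARITY_-prefixed equivalents.
export SINGULARITY_CACHEDIR="$APPTAINER_CACHEDIR"
export SINGULARITY_TMPDIR="$APPTAINER_TMPDIR"

# Inspect the cache where apptainer is available.
if command -v apptainer >/dev/null 2>&1; then
    apptainer cache list
    # To empty the cache: apptainer cache clean
fi
```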

Notes from 13/2 meeting

getting it to work on various systems might be tricky
no hope for running it on mac/windows laptops
prerequisites / preparation for our audience are necessary (on HPC or on a VDI system?)
installation instructions if you want to run it on your laptop
make clear that it works on our clusters, others are welcome to watch but things might not work in their cluster
important to stress the reason for using singularity: it is to move things to other systems rather than to develop on your own computer
also showcase features that are beneficial for the admins: lack of many files (e.g. one file for a whole conda environment)
exercises:
- solutions that singularity offers, e.g. a custom conda setup to put in a container (this is how LUMI uses it for example)
- if something comes via ubuntu (e.g. apt-get install something) it might be faster to put it inside a container rather than trying to port it to the operating system of the cluster (redhat? centos? different ubuntu version?)
- reusing ready made images (from docker also)
- you can share your code with e.g. zenodo
users are not usually familiar with container terms; a clear intro is important independent of the solution, e.g. from the CodeRefinery workshop https://coderefinery.github.io/reproducible-research/environments/
it would be best to go from the simple metaphors to the more complex things (e.g. file system mounting cannot be the first thing). Make it clear "this command is inside the container, this command is outside".
Metaphor of connecting to a remote computer
More advanced stuff: containers cannot be changed, modify things outside and mount them inside
metaphor of graphical tools where you work with layers and then squash them into a single picture
metaphors are nice but not too many metaphors
building 10 different images with similar structures is not helpful
on LUMI, containers are the recommended way of working with conda environments. In practice people need to change things often and rebuild the image often. We should stress this.
we can implicitly say why containers: why I want to wrap everything into a single file and run it somewhere else
good to mention + outline usecases
use case for publications: put image on zenodo
graphical applications are tricky
installation scripts of packages often use sudo, which fails in singularity, so we should be clear that sudo is not needed. Containerize something
- would be good to find an example?
- or at least in the materials we can have a list of these "commonly encountered problems"
