# RSH 015 internal: containers
## Planning
What do we want to convey to the audience?
* When to use a container ~~vs~~ and when to produce a reproducible environment using [conda, virtualenv, installable software, ...] <-- I would not put versus
* How to build a container [pick some engine(s)]
* Basics of how to create a container definition file
## Outline
- what is a container?
- recipe -> image -> container
- maybe show the figure: https://journals.plos.org/ploscompbiol/article/figure/image?size=large&id=10.1371/journal.pcbi.1008316.g002
- how they differ from virtual machine ~~hypervisors~~ <-- may not be very relevant for our audience
- Basic example
- docker pull something
- docker images
- docker run
- docker run -it (interactive shell)
- inside the shell: cat /etc/os-release
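A possible script for the basic example above, as a minimal sketch. The image `ubuntu:22.04` is an assumption picked for illustration; any small official image would do. The guard is only there so the snippet does not fail on machines without Docker.

```shell
# Image name and tag are an assumption for this demo, not a decision from the notes.
IMAGE="ubuntu:22.04"

if command -v docker >/dev/null 2>&1; then
    # Download the image from the default registry.
    docker pull "$IMAGE"

    # List the images that are now available locally.
    docker images

    # Start a container, run one command in it, and remove the container afterwards.
    docker run --rm "$IMAGE" cat /etc/os-release

    # For the interactive part of the demo (needs a terminal), run instead:
    #   docker run --rm -it ubuntu:22.04 bash
    # and then type `cat /etc/os-release` inside the shell.
else
    echo "docker not found; install a container engine to follow along"
fi
```

The point of `cat /etc/os-release` is that the output describes the image's operating system, not the host's.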
* Taxonomy of containers in science (what are the types of **use cases**)
* simple code portability for a micro-task
* workflows
* whole development environment
* "works on my computer"
* transparency: documentation of dependencies
* testing of dependencies in isolation
* reproducibility
* distributing data - not a use case
* data cannot travel (too big, too sensitive), "computer" travels to the data
- how we use containers
- containers as abstraction and isolation
- Lessons learned from "10 rules" to have:
https://upload.wikimedia.org/wikipedia/commons/thumb/d/dc/NZ_Defence_Force_assistance_to_OP_Rena.jpg/1280px-NZ_Defence_Force_assistance_to_OP_Rena.jpg
rather than https://upload.wikimedia.org/wikipedia/commons/thumb/7/77/Rena_ship_07.jpg/800px-Rena_ship_07.jpg
- https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008316
- contains a very nice analogy for recipes, images, and containers
- transparency/understandability vs performance (in space or time)
- containers are built from text file recipes which are understandable for humans and computers
- good for open-source software
- use existing tools
- repo2docker
- build on top of existing images
- official images
- how to order layers
- make sure you can inspect the recipe
- let CI build the container from recipe instead of building it locally
- use version-specific tags, avoid "latest"
- format for clarity
- document within the dockerfile
- add comments
- group related commands
- add metadata
- include usage instructions
- specify software versions
- pin versions
- balance: specify in Dockerfile or in requirements.txt/environment.yml?
- use version control
- put Dockerfile into the project repo
- mount datasets at runtime
- make the image one-click runnable
- define reasonable entrypoints and unsurprising default behavior
- again: usage instructions
- order the instructions
- first those that change the least often
- regularly use and rebuild containers
- eat your own dog food: use the container for your work, not only at the end
- singularity vs docker
- use cases
- how to use docker images in singularity
- there can be a short demo (Radovan)
- registries
- dockerhub
- quay
- singularity hub vs the other one
- gitlab
- github
- zenodo
- Risks with containers?
- can encourage practices that make it difficult to use a piece of software outside of a container
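As a possible companion to the "10 rules" items in the outline (version-specific tags, comments and metadata, pinned versions, instruction ordering, entrypoints, mounting data at runtime), here is a minimal sketch that writes an example recipe to disk. Everything named here is hypothetical and only for illustration: the project directory `example-analysis`, the script `analysis.py`, the author label, and the pinned `numpy` version. The base image `python:3.11-slim` is likewise just one choice of official image.

```shell
# Hypothetical project directory, used only for this sketch.
mkdir -p example-analysis

# Pinned dependency list (the package and version are illustrative assumptions).
cat > example-analysis/requirements.txt <<'EOF'
numpy==1.26.4
EOF

# The recipe itself; the quoted heredoc keeps the content literal.
cat > example-analysis/Dockerfile <<'EOF'
# Use a version-specific tag of an official image, not "latest".
FROM python:3.11-slim

# Metadata: who maintains the image and what it is for.
LABEL org.opencontainers.image.authors="Jane Doe <jane@example.org>" \
      org.opencontainers.image.description="Example analysis image; expects data mounted at /data"

# Instructions that change least often come first, so their layers stay cached.
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Project code changes most often, so it is copied last.
COPY analysis.py .

# Unsurprising default: running the image without arguments prints the usage text.
ENTRYPOINT ["python", "analysis.py"]
CMD ["--help"]
EOF

# Build with a version-specific tag, then run with the dataset mounted read-only
# (assumes analysis.py exists in the project directory and accepts a file path):
#   docker build -t example-analysis:0.1.0 example-analysis
#   docker run --rm -v "$PWD/data:/data:ro" example-analysis:0.1.0 /data/input.csv
```

Keeping the pins in `requirements.txt` rather than in the Dockerfile is one side of the balance mentioned in the outline: the same file can then be used outside the container as well, which also speaks to the risk noted above.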