# EESSI hackathon (Jan'22) - general notes
- **when**: week of 17-21 Jan 2022
- **main goal**: focused effort on various tasks in EESSI
- **expectations**:
    - joining kickoff/sync/show & tell meetings
    - spending a couple of hours that week on one or more of the outlined tasks (in a group)
    - taking extensive notes (to integrate into documentation later)
- registration: https://doodle.com/poll/z74qi2gcvau69kgr
- **main communication channel: EESSI Slack - join via https://eessi-hpc.org/join**
## Links
* [Dec'21 hackathon notes](https://github.com/EESSI/meetings/wiki/EESSI-hackathon-Dec'21)
* **GitHub repo for EESSI hackathon(s): https://github.com/EESSI/hackathons**
* show & tell shared slide deck: https://docs.google.com/presentation/d/1Ok5l1x1EydHg0ptsglSkPKOImPTHp_kFTvRQBDq-VjY
## Meetings
- **Mon 17 Jan 2022, 09:00 UTC**: kickoff
    - clarify expectations
    - overview of tasks
    - getting organised: who works on what, form groups
- **Wed 19 Jan 2022, 09:00 UTC**: sync
    - sync meeting notes:
    - quick progress report per group
    - briefly discuss next steps
    - notes: https://hackmd.io/9yX5_kLeTJKbStedlX2jWg
- **Fri 21 Jan 2022, 13:00 UTC**: show & tell
    - each group briefly demos/presents what they worked on
    - outline follow-up steps
    - slides: *(coming soon)*
    - recording: *(coming soon)*
## Attendees
***If you plan to actively participate in this hackathon:***
- add your name + affiliation + GitHub handle below (or ask someone to do it for you)
- feel free to pick **ONE** task you would like to work on, and add your name to the list for that task (see `people working on this`)
Joining:
- Kenneth Hoste (HPC-UGent) - `@boegel`
- Thomas Röblitz (HPC-UBergen) - `@trz42`
- Bob Dröge (HPC-UGroningen) - `@bedroge`
- Martin Errenst (University of Wuppertal) - `@stderr-enst`
- Bartosz Kostrzewa (University of Bonn) - `@kostrzewa`
- Terje Kvernes (University of Oslo) - `@terjekv`
- Jacob Ziemke
- Jure Pečar - `@jpecar`
- Hugo Meiland
- Frank Everdij (Delft University of Technology) - `@frankeverdij`
- Jörg Sassmannshausen
- Michael Hübner
- Maicon Faria (HPCNow!)
## Available infrastructure
**Please use the virtual clusters we have set up for this hackathon!**
* EESSI pilot repository is readily available
* Different CPU types supported
* Singularity is installed
### Magic Castle clusters in AWS + FENIX + Azure
* managed by Alan
* all info at https://github.com/EESSI/hackathons/tree/main/2022-01/magic_castle
### Cluster-in-the-Cloud in AWS
* managed by Kenneth
* all info at https://github.com/EESSI/hackathons/tree/main/2022-01/citc
## Communication
If you need help, contact us via the [EESSI Slack](https://eessi-hpc.slack.com) (join via https://www.eessi-hpc.org/join).
General hackathon channel: [#hackathon](https://eessi-hpc.slack.com/archives/C02NB46EK9P).
**See also task-specific channels!**
## Selected tasks & task teams
A subset of the main tasks was selected for this hackathon:
* [02] **Installing software on top of EESSI**
    * task lead: Kenneth
    * participating: Kenneth, Jacob, Frank, Martin
    * notes: https://hackmd.io/sLBLV7RDQdmyYfh1rYHGSQ
    * Slack channel: [#hackathon-software_on_top](https://eessi-hpc.slack.com/archives/C02PXMGTD46)
    * Zoom: https://uib.zoom.us/j/63932793319?pwd=Ry9BQ3VUaHcwTncwSFFrdU1PWnF3QT09
    * subtasks:
        * [02.1] document how to install software on top of EESSI
            * with EasyBuild
            * manually (depends on [02.2]?)
        * [02.2] implement support in EasyBuild to install RPATH wrapper scripts along with compiler (see https://github.com/easybuilders/easybuild-framework/issues/3918)
        * [02.3] standalone script to install software on top of EESSI
* [03] **Workflow to propose additions to EESSI software stack**
    * task lead: Bob
    * participating: Bob, Kenneth, Jörg
    * notes: https://hackmd.io/6V91CHRWRtuutANPaZRVPw
    * Slack channel: [#hackathon-contribution_workflow](https://eessi-hpc.slack.com/archives/C02Q6LJBJ9J)
    * Zoom: https://uib.zoom.us/j/69105180487?pwd=YldPU2ZPYWRGV1duV2JaV082MEVJdz09
    * subtasks:
        * [03.1] Use EESSI build container for the actual software builds
        * [03.2] Monitor build/test jobs, handle failures
        * [03.3] Let the app pick up the logs and tarball(s)
        * [03.4] Reply results/failures back to PR (success/failure comment, logs as gist, ...)
        * [03.5] Support more architectures
        * [03.6] Add dependency structure between subtasks, e.g. tests should be triggered when build has succeeded
        * [03.7] Handle many more events
        * [03.8] Prepare slides for [EasyBuild User Meeting talk](https://easybuild.io/eum22/#eessi-workflow)
* [05] **GPU support**
    * task lead: Alan
    * participating: Alan, Bartek, Michael, Maicon, (Hugo?)
    * notes: https://hackmd.io/47FAwaeWRi66tdiqjy2Zvg
    * Slack channel: [#hackathon-gpu_support](https://eessi-hpc.slack.com/archives/C02Q4DJT7J7)
    * Zoom: https://uib.zoom.us/j/62193929082?pwd=em0wMUp4enorKzZIUkI5MWJqMDY2QT09
    * subtasks:
        * [05.01] Install GROMACS, OSU Microbenchmarks, TensorFlow on top of foss/2021a + CUDA for skylake + zen2
* [06] **EESSI test suite**
    * task lead: Thomas
    * participating: Thomas, Hugo
    * notes: https://hackmd.io/wx2hjHiWQnmkERSVR2-a2A
    * Slack channel: [#hackathon-test_suite](https://eessi-hpc.slack.com/archives/C02QH1XVAKT)
    * Zoom: ask Thomas if you need one
    * subtasks: collection of ideas/todos from last hackathon
        * [06.1] add tests for compat layer (e.g. https://github.com/EESSI/compatibility-layer/issues/152)
        * [06.2] repeatedly run existing tests at different places (hackathon resources, your HPC, your machine, ...)
        * [06.3] integrate tests in CI pipeline (may benefit from [06.1], [06.2])
        * [06.4] guidelines/cookbook on how to develop application tests
        * [06.5] document how to run test suite for CI/monitoring (likely depends on [06.3] or [06.8])
        * [06.6] develop tests for results by other tasks (e.g. GPU support)
        * [06.7] decide on resourcedir setup to handle large inputs (WRF), finish WRF and create/merge PR
        * [06.8] many more tests...... ("steal" from other places, e.g. CSCS ReFrame repo)
* [07] **Monitoring**
    * task lead: Terje (?)
    * participating: Terje (?)
    * notes: https://hackmd.io/YWDG2GO5R3Sm3wS1SpvYrg
    * Slack channel: [#hackathon-monitoring](https://eessi-hpc.slack.com/archives/C02Q1S6484X)
    * Zoom: ask Thomas if you need one
    * subtasks:
        * [07.1] Finalize CVMFS monitoring
            * Should involve role-based install of filesystem-layer for any node
            * Working dashboards out of the box would be... nice.
            * <span style="color:red">**PROBLEM:**</span> [cvmfs_exporter](https://gitlab.cern.ch/cloud/cvmfs-prometheus-exporter/) only works if the CVMFS volume is mounted. On Stratum1, that ain't happening. We will need (to write?) a monitoring tool with role comprehension.
        * [07.2] Decide on naming scheme for Ansible repos and the roles shipped
        * [07.3] Should probably migrate the Ansible role repo to a collection
            * Today's solution works. Collection migration should probably be a task unto itself, depending on time.
        * [07.4] Move repos to EESSI
* [08] **Setting up a private Stratum 1**
    * task lead: ???
    * participating: ???
    * notes: https://hackmd.io/TyNaBqgkTHGjqwNZcasx5Q
    * Slack channel: [#hackathon-monitoring](https://eessi-hpc.slack.com/archives/C02Q1S6484X)
    * Zoom: ask Thomas if you need one
    * subtasks:
        * [08.1] develop ReFrame tests to verify that a Stratum 1 server is configured correctly (see https://github.com/EESSI/filesystem-layer/issues/111)
        * [08.2] set up a new Stratum 1 server, make some intentional mistakes (?), and use the tests from [08.1] to detect the misconfigurations; then fix them and rerun the tests
* [16] **Export a version of the EESSI stack to a tarball and/or container image**
    * task lead: Jure
    * participating: Jure
    * notes: https://hackmd.io/2YpzQGgUSDyTvW3ILulzwA
    * Slack channel: [#hackathon-export_software_stack](https://eessi-hpc.slack.com/archives/C02QH2H8TH7)
    * Zoom: ask Thomas if you need one
    * subtasks:
        * [16.1] investigate if/how variant symlinks can be included
        * [16.2] make script developed during last hackathon available & document its use
        * [16.3] integrate existing ReFrame tests to simplify validation of running a container elsewhere
        * [16.4] investigate if/how PIDs (persistent identifiers, e.g. DOI) could be attached to a container image
## TODOs
- shared presentation for show & tell on Friday
    - https://docs.google.com/presentation/d/1Ok5l1x1EydHg0ptsglSkPKOImPTHp_kFTvRQBDq-VjY