# MultiXscale WP1+WP5 sync meetings
- Monthly, every 2nd Tuesday of the month at 10:00 CE(S)T
- Notes of previous meetings at https://github.com/multixscale/meetings/wiki
---------------------------
## Next meetings
- Tue 11 Feb 2025 10:00 CET
- Caspar can't make it due to time difference, Kenneth will chair
- Tue 11 March 2025 10:00 CET
- Caspar _probably_ can't make it (traveling around this time), Kenneth will chair
- Tue 8 April 2025 10:00 CEST
- Tue 13 May 2025 10:00 CEST
- Tue 10 June 2025 10:00 CEST
- Tue 8 July 2025 10:00 CEST (without Kenneth)
- reschedule to 1 July?
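
The "every 2nd Tuesday of the month" rule can be checked against the dates listed above with a short script. This is a minimal sketch using only the Python standard library; the helper name `second_tuesday` is made up for illustration and is not part of any MultiXscale tooling.

```python
import calendar
from datetime import date


def second_tuesday(year: int, month: int) -> date:
    """Return the date of the second Tuesday of the given month."""
    # monthcalendar() yields weeks as lists of day numbers (Monday first by default),
    # with 0 for days that fall outside the month.
    weeks = calendar.monthcalendar(year, month)
    tuesdays = [week[calendar.TUESDAY] for week in weeks if week[calendar.TUESDAY] != 0]
    return date(year, month, tuesdays[1])


# Print the meeting dates for February through July 2025
for month in range(2, 8):
    print(second_tuesday(2025, month).strftime("%a %d %b %Y"))
```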
---------------------------
## Agenda/notes 2025-01-14
attending: Caspar (SURF) | Kenneth, Lara (UGent) | Thomas, Richard (UiB) | Helena, Eli, Pedro, Susana, Nadia (HPCNow!) | Alan (UB) | Bob, Pedro (RUG) | Julián (BSC) | Neja (NIC)
- Final word on deliverables (M24):
- D1.3 ...
- D6.2 ...
- D7.2 ...
- D8.5 ...
- All of these were successfully submitted on time (except for WP3)
- Final versions available on Zenodo (+ linked from our website)
- Upcoming deliverables (**M30 - June 2025**):
- (Alan) create Overleaf project
- by next sync meeting (Tue 11 Feb): come up with outline ASAP + make sure deliverable description from Grant Agreement is covered
- D1.4 => RIJKSUNI (Pedro)
- D1.5 => SURF (Caspar)
- D5.3 => UGent (Kenneth)
- D6.3 => NIC (Neja)
- keep deliverables short => ~15 pages max.
- set early internal deadline to get these fully done: 1st week of June?
- D1.4 Support for emerging system architectures (RIJKSUNI)
- Arm CPUs (in place for `aarch64/{generic,neoverse_n1,neoverse_v1}`)
- NVIDIA Grace (to start)
- AMD GPUs / ROCm (to start)
- Zen4 (AMD Genoa) (how did we manage that, what did we do differently)
- (outlook to) Zen5 (AMD Turin)
- improvements in glibc
- also cover RISC-V (despite that having a separate task)
- D1.5 Portable test suite for shared software stack (UGent - actual: SURF)
- Mixin class, easier portability
- D5.3 Report on testing provided software (SURF - actual: UGent)
- Not just test suite, but also test suites run during build of software
- Can say something on the need for GPU build infra so that we can run GPU unit tests
- Integration with Ramble
- EESSI testsuite Dashboard
- D6.3 Interim report on Community outreach, education and Training (NIC)
- WP status updates
- [SURF] WP1 Developing a Central Platform for Scientific Software on Emerging Exascale Technologies
- [UGent] T1.1 Stable (EESSI) - **D1.3 due M24 (Dec'24)**
- `dev.eessi.io`:
- WIP from last time:
- Meeting planned in January with Tilen to have him experiment with `dev.eessi.io`
- => Moving to February. They need a GPU (Rome+A100), so also blocked by that
- Meeting tomorrow to see what still needs to be implemented before the January meeting (Lara+Pedro)
- meeting done, see notes?
- couple of things still need to be done
- docs for setting up bot: OK
- support for changing software subdir
- support for GPU builds
- schedule other tiger team meeting after MultiXscale GA
- ideally before meeting w/ Tilen on 5 Feb'25
- so tiger team to be scheduled last week of Jan'25
- Documentation effort to describe what we should do if we want to onboard a new code / repo to build for `dev.eessi.io`
- Pedro is ready to open PR for this
- NVIDIA GPU support:
- we had a tiger team meeting on this topic this morning
- bot setup in service account at UGent for GPU build nodes is WIP (Lara)
- Key results:
- Fix GPU availability in EESSI container in test step [#847](https://github.com/EESSI/software-layer/pull/847)
- WIP:
- Deploy bot @UGent, testing first builds in [#842](https://github.com/EESSI/software-layer/pull/842)
- Fix issues in automatically determining ReFrame config from template in test step [#114](https://gitlab.com/eessi/support/-/issues/114)
- TODO from last time:
- [WIP] updating the `SitePackage.lua` for proper GPU support ([see PR #798](https://github.com/EESSI/software-layer/pull/798)) => STILL waiting for review
- will be required for stuff that depends on cuDNN
- Deploy bot @SURF
- Re-install GPU software in proper location (not in CPU-only prefix)
- only applies to CUDA itself + OSU benchmarks + CUDA-samples
- these cause some headaches when installing CUDA & co for newer architectures
- "we will benchmark software from the shared software stack and compare the performance against on-premise software stacks to identify potential performance limitations, ..."
- work done by Satish for Espresso, LAMMPS, GROMACS?, OSU
- All put into the deliverable, no surprises. EESSI generally on par with local SW stack.
- [RUG] T1.2 Extending support - **D1.4 due M30 (June'25)**
- `zen4` _almost_ on par with the rest.
- PR to do this was merged but not yet deployed, so that still needs to be done: https://github.com/EESSI/software-layer/pull/841
- Question: should these be _hidden_ modules?
- Then merge https://github.com/EESSI/software-layer/pull/766
- NVIDIA Grace
- @Thomas: any update?
- => set up tiger team for this (Thomas)
- AMD ROCm (see [planning issue #31](https://github.com/multixscale/planning/issues/31) + [support issue #71](https://gitlab.com/eessi/support/-/issues/71))
- @Pedro/Bob: any update?
- Bob looked at open EasyBuild PRs a bit for ROCm, plans to keep working on this
- => set up tiger team for this (Bob)
- [SURF] T1.3 Test suite - D1.5 due M30 (June'25)
- Ongoing effort: porting tests to use the `eessi_mixin` class is 80% complete
- Dealing more elegantly with read-only data now [issue](https://github.com/EESSI/test-suite/issues/211)
- would be nice to get more contributors...
- talk at EasyBuild User Meeting + hands-on session?
- webinar on EESSI test suite, maybe via EPICURE?
- maybe include hands-on too?
- also show off dashboard
- are we ready to let other sites push in their results and expose it via dashboard?
- probably require a policy in terms of which data is required (which scales)
- there will be a talk on Continuous Benchmarking by JSC at EUM'25
- also remote talk by Ramble?
- [BSC] T1.4 RISC-V (due M48, D1.6)
- ... (is build bot active? Who can control it? Should all PRs try to build for this, or not?)
- Bob is working on this, we're close
- BSC firewall was blocking events coming from smee
- should reach out to Vitamin-V project?
- https://vitamin-v.upc.edu/
- BSC training using Paraver on RISC-V, will use EESSI for hands-on
- [SURF] T1.5 Consolidation (starts M25 - Jan'25)
- continuation of effort on EESSI (T1.1, etc.)
- [UGent] WP5 Building, Supporting and Maintaining a Central Shared Stack of Optimized Scientific Software Installations
- [SURF] T5.2 Monitoring/testing, D5.3 due M30 (June'25)
- Plan to separate dashboard & database into two separate VMs (security) => Status?
- Vega agreed to make test data public. Karolina is waiting for response from their director.
- Caspar sent reminder on 13-01@16:00h
- [UGent] T5.4 support/maintenance - D5.4 due M48 (Dec'26)
- should be a bit more proactive on getting support issues closed + follow-up on software-layer PRs
- [UB] WP6 Community outreach, education, and training
- **deliverables due: D6.3 (M30 - June'25)**
- Upcoming activities:
- [Alan] EESSI tutorial at HiPEAC 2025 accepted (20-22 Jan'25)
- standard 2h tutorial + extending EESSI
- 39 people registered for EESSI tutorial
- Lara is giving demo at some time, but can maybe be moved to end of the session (closer to 5pm)
- details are unclear for Lara
- [Lara] Also at HiPEAC: another workshop (about CoEs), Lara will present workshop there
- 2 workshops for MXS: a talk (Mon), and a demo (Wed - collision with EESSI tutorial).
- [HPCNow] WP7 Dissemination, Exploitation & Communication
- podcast interview for EuroHPC podcast
- Any updates? Date planned yet?
- Contacted HPCwire to see if they can make an article about EESSI => Status?
- TODO last time: we could make a press release ourselves. Susana would take lead, Kenneth provides input for quote. Status...?
- WIP by Susana
- new edition of newsletter is ready to be published
- would be nice to promote this at HiPEAC
- website update with new newsletter
- T7.1 Scientific applications provisioned on demand (lead: HPCNow) (started M13, ends M48)
- EESSI on 'paid layer' on top of Parallel Cluster: WIP. Status? (Pedro @ HPCNow)
- PR to AWS merged
- blog post once new AWS Tech Short is recorded?
- some discussion with Open OnDemand team on integrating EESSI (Eli, Kenneth)
- Task 7.2 - Dissemination and communication activities (lead: NIC)
- Updates ... ?
- see deliverable + GA
- Task 7.3 - Sustainability (lead: NIC, started M18, due M42)
- Updates ... ?
- see deliverable + GA
- Task 7.4 - Industry-oriented training activities (lead: HPCNow)
- Updates ... ?
- upcoming events
- HiPEAC'25 (Barcelona, 20-22 Jan)
- EuroHPC Summit (Krakow, 18-20 March)
- EasyBuild User Meeting (Jülich, 25-27 March)
- https://easybuild.io/eum25/
- 3rd day will probably be focused on EESSI
- MultiXscale scientific CECAM workshop @ Ljubljana (April 2025, WP6)
- talk on EESSI (by Alan, remote), would be nice to cover `dev.eessi.io` service
- EuroHPC User Days (Copenhagen, Denmark, 30 Sept-1 Oct)
- https://www.deic.dk/events/eurohpc-user-days-2025
- [NIC] WP8 (Management and Coordination)
- Amendment not accepted in current form
- Status? (last time: waiting for IIT for changes, then resubmit on Friday after the sync meeting)
- amendment was re-submitted 20 Dec, waiting for reply
- waiting for report from special review
- for travel budget, a more detailed overview is probably desired?
- next General Assembly meeting
- 23-24 Jan'25 in Barcelona/Sitges
- Neja will send reminder for quarterly reports 2024Q4
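
Deadlines in the notes above are given as project months (`M24`, `M30`, ...). A minimal sketch of the conversion to calendar months, assuming project month M1 is January 2023; that start date is inferred from "M24 (Dec'24)" and "M30 - June 2025" above and is not stated explicitly in these notes. The function name `project_month_to_date` is hypothetical.

```python
from datetime import date

# Assumption: project month M1 corresponds to January 2023,
# inferred from the notes (M24 = Dec'24, M30 = June'25).
PROJECT_START = date(2023, 1, 1)


def project_month_to_date(m: int) -> date:
    """Map a project month number (M1, M2, ...) to the first day of that calendar month."""
    if m < 1:
        raise ValueError("project months start at M1")
    # Zero-based month offset counted from January of the start year
    offset = (PROJECT_START.month - 1) + (m - 1)
    return date(PROJECT_START.year + offset // 12, offset % 12 + 1, 1)


# Project months mentioned in these notes
for m in (24, 25, 30, 48):
    print(f"M{m}: {project_month_to_date(m):%b %Y}")
```

Under that assumption the mapping agrees with the pairs quoted in the notes: M25 falls in Jan'25 and M48 in Dec'26.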
### Other topics
- Interim EESSI Steering Committee (https://www.eessi.io/docs/governance/), had initial meeting
- will meet quarterly (+ additional topical meetings)
---------------------------
## Notes of previous meetings
see https://github.com/multixscale/meetings/wiki
----------------------------
## Template for sync meeting notes
TO COPY-PASTE
- overview of MultiXscale planning
- https://github.com/orgs/multixscale/projects/1/views/1
- WP status updates
- [SURF] WP1 Developing a Central Platform for Scientific Software on Emerging Exascale Technologies
- [UGent] T1.1 Stable (EESSI) - due M12+M24
- ...
- [RUG] T1.2 Extending support (starts M9, due M30)
- [SURF] T1.3 Test suite - due M12+M24
- ...
- [BSC] T1.4 RISC-V (starts M13)
- [SURF] T1.5 Consolidation (starts M25)
- [UGent] WP5 Building, Supporting and Maintaining a Central Shared Stack of Optimized Scientific Software Installations
- [UGent] T5.1 Support portal - due M12
- ...
- [SURF] T5.2 Monitoring/testing (starts M9)
- [UiB] T5.3 community contributions (bot) - due M12
- ...
- [UGent] T5.4 support/maintenance (starts M13)
- [UB] WP6 Community outreach, education, and training
- ...
- [HPCNow] WP7 Dissemination, Exploitation & Communication
- ...