# Sync meeting on MultiXscale deliverables due M30 June 2025 (20250528)

Present: Alan, Pedro, Satish, Kenneth, Lara, Maksim, Thomas

## D1.4 Support for emerging system architectures

- Rhea v1 (Neoverse V1) already covered, Rhea v2 (Neoverse V2) very close to NVIDIA Grace which is already supported
- not building yet for Graviton 4 (only 2+3), but we could/should
- we could get in touch with SiPearl (via Estella -> Alan)
- ROCm compat guarantee only from 6.4
- Overview of ROCm ecosystem (external contribution with our input)
- LLVM - see if we have enough for a full section (maybe out of scope); external contribution (w/ internal contribution). We run and pass the test suite. Important for future architectures.
- Future RISC-V work will be referenced in future deliverable

## D1.5

- Caspar tackled all comments, to be reviewed again by Kenneth
- OpenFOAM test is close to being ready, so that should be reflected in deliverable
  - see PR https://github.com/EESSI/test-suite/pull/243
  - cut the knot on this during EESSI test suite sync of Thu 12 June

## D5.3

- placeholder page on dashboard to be added to EESSI docs
- w.r.t. performance comparisons: just mention that no attempt was made to tune the tests for a particular system, it's not a benchmark suite
- mention that dashboard will need to be actively maintained to keep the service up

## Next sync

- during WP1+WP5 sync meeting on Tue 10 June
- aim to have deliverables 100% ready for final review by SC + submission by Petra

---

# Sync meeting on MultiXscale deliverables due M30 June 2025 (20250509)

Present: Pedro, Bob, Petra, Richard, Thomas, Maksim, Kenneth, Satish

## D1.4 Support for emerging system architectures

- Written by Pedro and Bob
- Thomas will review the document
- Some things are still missing or need to be updated
  - E.g. the metadata on the first page and the RISC-V LLVM section
  - Numbers of available apps need to be filled in just before submitting the deliverable
- Naming and references may not be consistent throughout the document
- A64FX is not an emerging microarchitecture, has been there for many years
  - we could name it "additional" or "interesting for EuroHPC"
  - The additional targets should match the aforementioned criteria
  - Reason for adding cascadelake could be that we now have a system that allows us to build for GPUs
- Is the NVIDIA GPU section too detailed?
- The ROCm section could stress a bit more how complex and fast-evolving the ROCm stack is
  - Many libraries, many dependencies
  - Definitely refer to https://semianalysis.com/2024/12/22/mi300x-vs-h100-vs-h200-benchmark-part-1-training/, that is quite brutal on the state of ROCm
  - Based on this SemiAnalysis article, AMD is putting more effort into their software ecosystem
- LLVM can be removed from the RISC-V section (or just mention it in the first subsection)

## D1.5 Portable test suite for shared software stack

- List of authors needs to be fixed/updated
- Satish has reviewed the document
- Should we show a bit of relevant code of the EESSI Mixin class?
- This deliverable is close to done, just need to address some details (see comments in Overleaf)
- More or less complete, details.
- Write some more details about the tests.
- Community contributions.

## D5.3 Report on testing provided software

- Remove IP addresses from figure 2
- Combine figure 4 and 5 (with the same layout as on the website)
- Connected system 4.6 could move to section 3 Periodic testing.
  - But the dashboard only collects info on some systems. Otherwise refer to D1.5
  - Refer to the list of systems table in periodic testing.
- 5.1 title (sanity checks -> test step of install procedure)
- 5.2.2 figure 11, include timepoints before the performance drop
- Hardware-based comparison -> Show a plot comparing between ARM systems
- Also more or less complete, great overview.

## Next meetings

- Wed 28 May 2025 10:30-12:00 CEST
  - goal: have deliverables reviewed + camera-ready for handover for final review to Steering Committee

---

# Sync meeting on MultiXscale deliverables due M30 June 2025 (20250407)

## D1.4 Support for emerging system architectures

writing effort led by: Bob + Pedro (RUG)

- process for identifying emerging targets
  - new systems (within EuroHPC context, national systems)
  - support requests (e.g. https://gitlab.com/eessi/support/-/issues/68 for Sapphire Rapids)
  - supported instructions + (expected) performance difference
  - also Intel Cascade Lake + Ice Lake
- overview of procedure to provide installations for additional CPU target
  - reproduce installation order
  - step-by-step overview: bot on system where CPU target is supported, etc.
  - see also https://gitlab.com/eessi/support/-/wikis/building_add_new_cpu_target
  - tension between doing installations in same order as they were done for existing CPU targets vs required bug fixes only present in newer EasyBuild versions
- lessons learned from adding additional CPU targets to existing EESSI version
- TODO
  - A64FX: currently ~1/3rd of modules available
    - set up bot in service account + give others access to it
    - use EasyBuild 4.9.4 to install missing bits?
    - use Bob's script to generate easystack files from existing installations
  - NVIDIA Grace CPU: close to ready?
  - AMD ROCm: quite a lot of work to do
  - also make progress on NVIDIA GPU?
    - workflow is missing to put more software installations in place
    - no workflow that includes testing
    - no fixed set of GPU targets (CUDA compute capabilities)
    - expose GPU software in overview in docs
    - capture whether modules were built on GPU build node, or not (in description or module-load-message)
    - should have sanity check for CUDA compute capabilities in place, so we can rely on it
      - see [EasyBuild framework PR #4692](https://github.com/easybuilders/easybuild-framework/pull/4692)
    - more relevant for progress report (due in Aug'25, work done by June'25)
      - but should be kept short
    - tiger team should convene again! => Thomas
- timeline
  - [Pedro] early draft by end of April
  - [Thomas] review done of the draft by mid May
    - can start reviewing on Monday, May 12
  - [Pedro] camera-ready by end of May
  - June as buffer

---

## D1.5 Portable test suite for shared software stack

writing effort led by Caspar (SURF)

- focus on EESSI test suite itself
- Supported software in EESSI test suite
- How it's used (daily runs, test step as part of deployment procedure, and even on local software stacks)
- EESSI mixin class: extraction of common logic in portable tests to a single mixin class
- Check for and discuss other substantial improvements in the test suite repo (go through release notes?)
- Community building (EB user meeting / docs & tutorial on how to contribute)
  - EESSI mixin facilitates writing new tests, as it points you to missing keywords etc.
- timeline
  - [Caspar+Satish] early draft by end of April
  - [Kenneth] review done of the draft by mid May
  - [Caspar+Satish] camera-ready by end of May
  - June as buffer

## D5.3 Report on testing provided software

writing effort led by:

- Lara (UGent) + Satish (SURF) for daily runs + performance results
- Maksim (SURF) on dashboard

- Daily runs: which systems? Improvements to configuration (decoupling test version and config version), automatically use latest test suite release
  - Lessons learned
    - alerting based on sudden change of performance?
- Study performance results
  - more variation for e.g. TensorFlow tests
  - RHEL8 vs RHEL9 (Snellius, HPC-UGent Tier-2)
  - interesting patterns in Vega?
  - OSU Microbenchmarks faster with 2023a compared to 2023b?
  - performance drop in early version of OpenMPI 5.0.x (2024a toolchain)
  - Performance variations due to change in system software, changes to test suite, etc.
    - lack of impact from changes to EESSI (which is a good thing!)
- Dashboard
  - Inclusion of more sites on the dashboard?
  - Challenges? (e.g. permission to publish?)
- timeline
  - early draft by end of April
    - Lara+Kenneth on daily runs of test suite
    - Satish on performance results
    - Maksim on dashboard
  - [???] review done of the draft by mid May
  - [Lara/Satish/Maksim] camera-ready by end of May
  - June as buffer

## Next meetings

- also invite Petra to these meetings?
- Wed 30 April 2025 10:30-12:00 CEST
  - goal: have drafts ready for review
- Wed 28 May 2025 10:30-12:00 CEST
  - goals:
    - have camera-ready versions ready
    - assess whether deliverables are ready for review by MultiXscale Steering Committee