# Notes for 20210401 meeting

(should be copied to https://github.com/EESSI/meetings/wiki/meeting-Apr-1-2021)

---

* date & time: Thu Apr 1st 2021 - 2pm CEST (12:00 UTC)
  * (every first Thursday of the month)
* venue: *(online, see mail for meeting link, or ask in Slack)*
* agenda:
  - Quick introduction by new people
  - EESSI-related meetings in last month
  - Application for CZI grant “Essential Open Source Software for Science”
  - Progress update per EESSI layer
  - Pilot repository: changes & status
  - Usage of AWS resources
  - Discussion with NVIDIA w.r.t. CUDA
  - S4 NeIC project proposal + NESSI test lab
  - Next steps
  - Past & upcoming events
  - Q&A

### Slides

* see https://github.com/EESSI/meetings/blob/main/meetings/EESSI_meeting_20210401.pdf

### Meeting notes

*(by Bob, Kenneth)*

#### Introduction by new people

- New people on the call:
  - Jure Pečar from EMBL
    - experienced EasyBuild user, helping out with BLIS evaluation

#### CernVM-FS coordination meeting

- ephemeral publish container is in Program of Work
- discussed inclusion of EESSI in default cvmfs-config repo

#### Application for CZI grant

- Proposal written/submitted by Alan in collaboration with the UMCG (University Medical Center Groningen), focussing on rare diseases and supporting biomedical workflows with EESSI
- question by Victor: how will this impact EESSI in general?
  - Alan/Kenneth: the general goal should not be impacted; the proposal is mostly focused on a particular use case, which probably means we should provide a bunch more bioinformatics software in our repo to support typical workflows used in rare disease studies

#### Progress update: filesystem layer

- planning to create a new master key for Stratum-0
  - only store on YubiKeys (+ physical backup like USB stick in safe storage)
- then get EESSI configuration into cvmfs-config-default repository, so any client can get easy access to EESSI
  - Caspar: this would be a good time to document native installation of CernVM-FS for EESSI

#### Stratum 1 in AWS

- by Jörg with help from Bob & Terje
- up and running in AWS instance (tx.large) in eu.west region (?)
- not included yet in latest EESSI configuration (0.3.0), but will be soon
- "it was fairly easy"
- currently using XFS, which was not recommended in older CVMFS versions -> to be checked with CVMFS developers
- question by Jure if he should set up a Stratum 1 at EMBL
  - makes sense, eventually every "big" HPC site would want their own Stratum 1 anyway (to have a full copy of the repo, to protect themselves from network issues)
- how many Stratum 1 servers do we want and how should they be distributed?
  - CVMFS devs warned us that we shouldn't have too many either...
  - but not all Stratum 1 servers need to be included in EESSI configuration
- still looking for volunteers to go through the process to set up an additional Stratum 1 (in different AWS region)

#### Progress update: compatibility layer

- we should reach out to the Gentoo developers to check if/how we can help to get the problems that we run into (Lmod, bootstrap on POWER) resolved
  - can we help with testing stuff, showing that it works, etc.?
- check if we can use ReFrame instead of pytest for the compatibility layer validation
  - ask Victor if these compat layer tests can be run through ReFrame?
- the installation of 2021.02 was broken and has been removed from the repository
- ppc64le installation on hold for 2021.03, hoping that the upstream fix for the bootstrap script will get done soon

#### Progress update: software layer

- some experiments with speeding up the install script by running multiple EasyBuild sessions in parallel, to overlap installations with different dependencies
  - not fully working yet
  - better approach would be to farm out installations to a Slurm cluster (via CitC in AWS)

#### 2021.03 pilot repo

- ppc64le on hold, until the bootstrap issue has been fixed upstream
- More or less the same software and hardware targets
- GPU installations on hold until we get green light from NVIDIA to include CUDA in the repository

#### Experience report on building software layer for AMD Rome

- by Jörg
- the build script worked very well: the colors are very useful, the comments are nice, and it does some nice checks (e.g. if you are running it in a Prefix environment)
- we need to write some documentation for doing the software installations
  - Jörg volunteers to do this by making a PR in the `docs` repository

#### Discussion with NVIDIA w.r.t. CUDA

- question by Jörg: would it help to invite more vendors?
  - there are already quite a few very big companies joining the call, and we shouldn't make the group too large.

#### S4 NeIC proposal + NESSI

- ...
#### NESSI test lab

- big deployment of EESSI on several systems with different hardware, including GPUs
- access limited to a small group of users, using permissions on `/cvmfs`

#### Sponsorship AWS/Azure

- AWS credits are being spent on several things: Stratum 1, build nodes, test machines, etc.
- Discussions with Azure are ongoing

#### AWS infrastructure

- We need Packer-built images for OpenStack as well
- Terje shows a demo of the infrastructure code / scripts for deploying dynamic infrastructure
  - create/remove nodes
  - grant access to users based on GitHub handles
  - support for different node types
  - available on the AWS login node

#### Q&A

- interest in setting up a separate meeting to discuss how the software stack on new clusters can be set up with a later transition to EESSI in mind?
  - Bob will set up a Doodle poll for this