# EESSI/AWS sync meetings
- link to AWS project doc: https://docs.google.com/document/d/1CHG9fCh2LkfJ-EI8J-_Wr5NpHL5iwm8Wu6syfK9h7-c
## Next meeting
- Thu 12 Oct 2023, 12:00 UTC
- Thu 9 Nov 2023, 12:00 UTC
---
## Notes 14 Sept 2023 (12:00 UTC)
- ...
---
## Notes 10 August 2023 (12:00 UTC)
- status update on sponsored credits
    - Costs are about $3k/month for March-July'23 (up from ~$1.5k/month)
        - EFS costs are on the rise (~50% in July)
            - Build bot is still leaving behind large tarballs to allow debugging failing builds, which are not getting cleaned up currently
    - currently ~$10k left in sponsored credits
- Looking into using a CDN
    - Will be Q4 before we look further into this
- Injecting OpenMPI/libfabric libraries into EESSI
    - Full discussion in https://github.com/EESSI/software-layer/issues/252 (see this comment for details on the [potential way forward](https://github.com/EESSI/software-layer/issues/252#issuecomment-1662202921))
    - Basically two steps
        - Take a copy of the host `libmpi.so`
            - We are using the EESSI linker, so we need to force the library to find some of its libraries on the host (like libfabric)
                - We modify the ELF header of the library to do this
                - We also inject some additional dependencies to effectively preload some other required host libraries
        - Place it in a special location where it gets picked up in preference to the EESSI MPI library
    - Seems to work with latest version of EESSI, GROMACS runs show performance improvement of ~5%
- failing test suites for OpenBLAS/FFTW/numpy (only) on Graviton 3 (not seeing this on Graviton 2)
    - popping up while populating software stack in EESSI pilot 2023.06
    - Numerical errors with OpenBLAS in LAPACK test suite
        - Some toolchains use an older OpenBLAS which lacks optimisations
            - We see an increased number of failing tests
        - Discussion on issue at https://github.com/xianyi/OpenBLAS/issues/4187
            - Note OpenBLAS devels are only just starting to test on `neoverse_v1`
        - We ignored these failing tests for now, assuming they're mostly harmless
            - cfr. https://github.com/EESSI/software-layer/pull/309
    - FFTW: erratic error with single FFTW test (not always the same one)
        - cfr. https://github.com/EESSI/software-layer/pull/310
        - still figuring this out
    - handful of failing tests in numpy test suite
        - cfr. https://github.com/EESSI/software-layer/pull/306
    - planning to open upstream issues for this to figure out how serious these are
    - Kenneth will send email to Angel on this, could be useful to get some feedback on this from AWS Performance Engineering team
- progress on making it easy to integrate EESSI with ParallelCluster
    - Matt is working on open source add-ons for ParallelCluster
- booth talk at AWS booth at SC'23
    - long talks (~45min), repeated a couple of times
    - live demo of getting EESSI working on AWS
    - can cover different aspects
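
Rough sketch of the two-step `libmpi.so` injection discussed above (under "Injecting OpenMPI/libfabric libraries into EESSI"). This is only an illustration, not what was actually run: it assumes `patchelf` is the tool used to modify the ELF header (the notes do not name one), and the function name, paths and library names are hypothetical. It only builds the commands and does not execute anything.

```python
from pathlib import Path


def build_injection_commands(host_libmpi, override_dir, host_lib_dirs, extra_needed):
    """Return the commands (as argv lists) for the two steps; nothing is executed.

    All arguments are hypothetical placeholders:
      host_libmpi   -- path to the host's libmpi.so
      override_dir  -- the "special place" that is searched before the EESSI MPI library
      host_lib_dirs -- host directories that provide libfabric and friends
      extra_needed  -- extra host libraries to add as dependencies ("preload")
    """
    target = Path(override_dir) / Path(host_libmpi).name
    commands = [
        # Take a copy of the host libmpi.so (following symlinks) and place it
        # in the override location.
        ["cp", "--dereference", str(host_libmpi), str(target)],
        # Modify the ELF header of the copy so that it looks for its own
        # dependencies in the host directories (assumption: done via RPATH).
        ["patchelf", "--set-rpath", ":".join(host_lib_dirs), str(target)],
    ]
    # Inject additional dependencies so the required host libraries get loaded
    # together with the copied libmpi.so.
    for lib in extra_needed:
        commands.append(["patchelf", "--add-needed", lib, str(target)])
    return commands
```

Example use: `build_injection_commands("/usr/lib64/libmpi.so.40", "/tmp/override", ["/usr/lib64"], ["libfabric.so.1"])` returns one `cp` command followed by two `patchelf` commands, which can be reviewed (e.g. printed with `shlex.join`) before running them by hand.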
## Notes previous meetings
- 10 Aug 2023: https://github.com/EESSI/meetings/wiki/AWS-meeting-2023-08-10
- July 2023: (skipped)
- 8 June 2023: https://github.com/EESSI/meetings/wiki/AWS-meeting-2023-06-08
- 11 May 2023: https://github.com/EESSI/meetings/wiki/AWS-meeting-2023-05-11
- 13 April 2023: https://github.com/EESSI/meetings/wiki/AWS-meeting-2023-04-13
- 9 Mar 2023: https://github.com/EESSI/meetings/wiki/AWS-meeting-2023-03-09
- 11 Jan 2023: https://github.com/EESSI/meetings/wiki/AWS-meeting-2023-01-11