# CVMFS coordination meetings

see https://indico.cern.ch/category/5485

next meetings:

- ...

## Notes for CernVM-FS coordination meeting 2024-06-10

attending: Kenneth

- notes @ https://codimd.web.cern.ch/HEqxj15dSUGi_CUkGk-u6w?edit#
- release of CernVM-FS v2.12.0 is being prepared
- poll for updates from EESSI project
    - thanks for doing the CernVM-FS sticker promotion at ISC'24
    - expressed interest in getting official CernVM-FS builds for RISC-V, should be possible (maybe already for CernVM-FS 2.12.0?)

---

## Notes for CernVM-FS coordination meeting 2023-11-13

- CVMFS workshop 9-11 Sept 2024, probably @ CERN
- large interest in "Best Practices for CernVM-FS in HPC" - via Zoom on Mon 4 Dec 2023
    - 50 seats taken in a matter of hours, now way more seats available
    - is the time US-friendly?
        - East Coast: yes (kind of)
        - could consider a repeat event in a US-friendly time zone
- additional package repositories for CVMFS release candidates?
    - cvmfs-testing package repo - access to bleeding edge (nightly builds)
    - maybe an additional package repo for release candidates of patch releases
        - less rigorous testing for release candidates
        - maybe "pre-release" is a better term for this - "release candidate" is more commonly used for major releases
- more frequent patch releases?
    - takes more time (a full test takes ~24h), so people need to wait longer for the fixes
- discussion of problems with hanging CernVM-FS
    - very difficult to reproduce, still unclear what the problem is or what triggers it
    - only seen with selected repos like `ligo.osgstorage.org`
    - see also https://github.com/cvmfs/cvmfs/issues/3375
    - CVMFS developers have not been able to "catch" the issue live
- next meeting: Mon 11 Dec 2023

## Notes for CernVM-FS coordination meeting 2023-10-02

(attended by Kenneth)

- packaging problem
    - cvmfs-server package depends on the init-scripts package (which is deprecated)
    - related: `/etc/init.d` needs to be removed (which cannot be done on live systems, of course), and then `chkconfig` reinstalled
    - cvmfs-server actually only needs chkconfig
    - see also https://access.redhat.com/solutions/6969215
    - only really needed on EL6, so probably OK to remove (?)
- upcoming patch release (2.11.1), probably this week
    - incl. fix for the packaging issue
    - and anything else that is ready in time
- probably another patch release later to deal with unresponsive CVMFS clients
    - see issues:
        - https://github.com/cvmfs/cvmfs/issues/3378
        - https://github.com/cvmfs/cvmfs/issues/3383
        - https://github.com/cvmfs/cvmfs/issues/3402
    - progress on setting up a test environment to trigger the overload
    - the "no HTTP outside of Fermilab" requirement is causing some trouble there
        - there is already an exception for CernVM-FS Stratum-1 servers, asking for another one
        - could also go via HTTPS instead
    - alternative setup with "extrad" servers, which is actually closer to the configuration for which the report was made
        - will hopefully make it possible to reproduce the problem and enable an additional debug setup
    - may also be related to authentication (although it also popped up at RAL without auth)
    - current hypothesis is that the origin server is being overloaded
- progress on /opt campaign
    - identified which sites are still using /opt
    - WLCG squid monitoring machine can check which IP addresses are trying to access /opt
    - ~15-20 sites
- is a reboot of the client required to recover from hangs? (see also the command sketch right after these notes)
    - depends on what type of "hang"
    - if trying to access the repo returns an error, then it should be possible to kill CernVM-FS with `cvmfs_config killall` without having to reboot
    - `cvmfs_config reload <repo>` or `wipecache` could be tried first, before `killall`
    - or just kill the `cvmfs` process using the `kill` command
    - would be useful to run `cvmfs_config bugreport` first, which creates a tarball that can be used to report a bug
- some discussion on large repos looking for a new home (?)
    - not in cms.cern.ch since it's too much data
- next meeting: Mon 13 Nov'23
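A minimal sketch of the client-side recovery sequence discussed above (run as root; `<repo>` is a placeholder for the fully qualified repository name, and the escalation order follows the notes rather than an official procedure):

```
# collect debug info first: creates a tarball that can be attached to a bug report
cvmfs_config bugreport

# least invasive options: reload the affected repository, or wipe the client cache
cvmfs_config reload <repo>
cvmfs_config wipecache

# last resort before a reboot: kill all CernVM-FS processes, then remount by probing
cvmfs_config killall
cvmfs_config probe <repo>
```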
## Notes for CernVM-FS coordination meeting 2023-04-17

(attended by Kenneth)

- Improve handling of noop transactions [issue #2839](https://github.com/cvmfs/cvmfs/issues/2839)
    - @Bob: Are we relying on empty transactions somehow in EESSI?
- Forbid patching of publisher/gateway if there is any open transaction [issue #3207](https://github.com/cvmfs/cvmfs/issues/3207)
- Problems with CernVM-FS 2.10.1 w.r.t. publishing
    - CVMFS server cannot be installed unless the Apache server is *running* (?)
        - That's a pretty significant constraint
        - That should not be the case currently: cvmfs_server checks at runtime whether the Apache server is running (but not at install time)
    - Is it possible to have the Apache server on a separate host from the CVMFS server?
        - Unclear, needs to be checked
    - Does there actually need to be a dependency on a specific web server at all?
        - CVMFS doesn't do anything "wild" with HTTP, so it shouldn't matter which HTTP server is used (but maybe that's naive)
- Deadline for changes to be included with CernVM-FS release 2.11?
    - We would like to get the EESSI configuration included with CernVM-FS by default
    - We're working on switching to a new domain (eessi.io instead of eessi-hpc.org), which should be done first
    - Hoping to release CernVM-FS 2.11 in June/July 2023
- Gateway issue (probably for a patch release)
- Refactoring of download manager + load catalogue
- Telemetry feature (already in develop): CVMFS internal stats will be exposed, can be ingested into InfluxDB
- Idea for a "Best Practices with CernVM-FS for HPC" tutorial
    - Virtual tutorial (to maximize possible attendance) in Fall 2023 (Sept-Oct'23)
    - In scope of the MultiXscale EuroHPC Centre-of-Excellence
    - In collaboration with CVMFS developers
        - Mainly soliciting feedback on the tutorial content we produce
    - Similar to https://cvmfs-contrib.github.io/cvmfs-tutorial-2021
    - Also work on improving the CernVM-FS documentation for the aspects that would be covered in this tutorial
    - Valentin + Dave are interested in this
    - Jose: a lot of work was done on this in OpenScienceGrid recently?
        - For example for clients that do not have internet access
        - There were some solutions, but there's probably no (separate OSG-specific) documentation on them
        - Using Squid as a gateway could be a workaround
        - Loopback filesystem, use that as cache (should be well documented; see also the sketch right after these notes)
        - Laura: please don't use NFS - not a good idea
    - Should contact the cvmfs-* mailing lists
    - Initial meeting can be planned via the cvmfs coordination meetings mailing list
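As a concrete illustration of the "loopback filesystem as cache" workaround mentioned above, a minimal sketch could look like the following; the file location and size are arbitrary placeholders, not recommendations from the meeting, and the usual caveats about cache sizing still apply:

```
# create and format a file-backed filesystem (path and size are placeholders)
dd if=/dev/zero of=/scratch/cvmfs-cache.img bs=1M count=20480
mkfs.ext4 -F /scratch/cvmfs-cache.img

# mount it via a loop device and point the CernVM-FS client cache at it
mkdir -p /var/lib/cvmfs-loop
mount -o loop /scratch/cvmfs-cache.img /var/lib/cvmfs-loop
chown cvmfs:cvmfs /var/lib/cvmfs-loop   # cache must be writable by the cvmfs service user
echo 'CVMFS_CACHE_BASE=/var/lib/cvmfs-loop' >> /etc/cvmfs/default.local

# re-run the client setup and verify
cvmfs_config setup
cvmfs_config chksetup
```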
## Notes for CernVM-FS coordination meeting 2023-02-13

- CernVM Program of Work 2023
    - a lot of performance improvements for the client in CVMFS 2.11 (2023Q2)
        - improved use of the kernel cache for the client hot cache
            - there will be a CHEP'23 talk about this
        - requires a recent kernel (EL9) to unlock the performance gains
    - improvement for the cold cache (prefetching) planned for CVMFS 2.12 (2023Q4)
- next CVMFS workshop planned for early 2024
- Milestones for 2.10.1: https://github.com/cvmfs/cvmfs/milestone/7

## Questions to ask

- tracking usage of a CVMFS repo
    - trigger an init script to send a UDP packet to a "counting server"
    - see https://github.com/DrDaveD/expcounter
- ingesting updated versions of a directory with software installations
    - currently, we do something like:

      ```
      cvmfs_server transaction ...
      # remove the old installation directory, then recreate it before extracting the tarball
      rm -rf /cvmfs/<repo>/path/to/dir
      mkdir -p /cvmfs/<repo>/path/to/dir
      tar -C /cvmfs/<repo>/path/to/dir -xzf <tar_file>
      cvmfs_server publish ...
      ```

    - this may result in broken permissions in the repo
        - seems like this can be mitigated using `tar --no-same-owner --no-same-permissions ...`
    - but ideally we can just use `cvmfs_server ingest` instead, after somehow also removing the dir *first*... (see the sketch at the end of these notes)
        - there is an option for this, so the dir can be removed first before doing the ingest
        - an ingest option is being implemented that will allow "resetting" the ownership of files
- is it possible to set up a (private) stratum 1 using another stratum 1 (i.e. without access to the stratum 0)?
    - yes, requires touching a `cvmfs_master_replica` file (done at OSG)
    - done via a separate port (Frontier Squid config)
    - see also https://github.com/opensciencegrid/oasis-server/blob/master/etc/squid/oasiscustomize.sh#L12
- experiences with Stratum-1 on top of S3 storage
    - still need CVMFS to access the repo, can't do it directly via S3?
    - can we mix S1's backed by S3 (in AWS/Azure cloud) with S1's backed by regular storage (on-prem)?
    - no geoapi from S3
        - client may fail when querying an S1 for geoapi info
        - (passthrough option for S1, purely for monitoring)
    - aspects: cost (volume of data), reliability of hardware, performance
    - @RAL: looking into Ceph as backend storage for (S0+?)S1
    - @Fermilab: 6-7 years of S0 with dedicated disk, no complaints about performance
    - can also directly publish to S3 - no S1's at all (like Jump Trading)
    - costs
        - number of requests has a big impact on cost
        - maybe the answer is no S1's in commercial cloud
            - could still have squids in the cloud
                - be careful with their auto-scaling load-balancing squids
        - Frontier Squid can be configured to auto-register, see https://twiki.cern.ch/twiki/bin/view/Frontier/InstallSquid#Enabling_discovery_through_WLCG
            - side effect is that this also excludes it from WLCG Squid Monitoring
    - using WLCG Stratum-1's would alleviate the need to run S1's in commercial cloud
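A possible `cvmfs_server ingest` based alternative for the directory-update workflow sketched above, assuming the `--delete`, `--tar_file` and `--base_dir` options behave as described in the CernVM-FS documentation (option names and path conventions are worth double-checking against the installed version); `<repo>` and `<tar_file>` are placeholders as in the snippet above:

```
# drop the old installation directory from the repository
# (paths are assumed to be relative to the repository root here)
cvmfs_server ingest --delete path/to/dir <repo>

# ingest the tarball directly, extracting it under path/to/dir;
# no explicit transaction/publish step is needed
# (the tarball may need to be uncompressed first; check the docs)
cvmfs_server ingest --tar_file <tar_file> --base_dir path/to/dir <repo>
```

Whether the delete and the tarball ingestion can be combined into a single invocation is also worth checking.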