# CernVM-FS coordination meeting (2022-05-09)
## Garbage collection issue
- see https://sft.its.cern.ch/jira/browse/CVM-1921
- starvation problem: GC is blocking publication
- if `CVMFS_AUTO_GC` is enabled in the configuration, GC should run automatically after a publish
- can also control how often GC is run via `CVMFS_AUTO_GC_LAPSE`
- operations like GC and publishing should be pushed onto a queue so that they do not interfere with each other
- GC can take a long time, so a plan is also needed to keep it from getting in the way of frequent publishing
- for example publishing during the day and GC at night when there's very little publishing activity
- with a dynamic installation environment à la EESSI, how is this to be organized?
- highly regular GC will be tricky due to conflicts with publishing, while irregular GC means each run takes longer the longer one waits
- end result may be GC runs of 3+ hours
- at Fermilab there's a requirement to run GC at least once every 10 days
- seems like John may be on to a bug in CernVM-FS?
- incremental garbage collection is on the CVMFS wishlist (but implementing other stuff has higher priority)
- GC can already be interrupted; the progress it has made (during the marking phase) will be picked up the next time GC runs
- see also https://cvmfs.readthedocs.io/en/stable/cpt-repo.html#repository-garbage-collection
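
The queue idea mentioned above could look roughly like the following. This is only a minimal sketch under stated assumptions, not how CernVM-FS works today: `RepoOperationQueue`, `submit` and `run_all` are hypothetical names, and the actions are plain callables standing in for whatever would actually invoke `cvmfs_server publish` or `cvmfs_server gc`. All operations go through a single queue so only one runs at a time, and publishing is given priority over GC so that a pending GC cannot starve publications.

```python
import heapq
import itertools

# Lower number runs first: publishing outranks GC (assumed policy, for illustration).
PRIORITY = {"publish": 0, "gc": 1}


class RepoOperationQueue:
    """Hypothetical helper that serializes repository operations."""

    def __init__(self):
        self._heap = []
        # Monotonic counter keeps FIFO order among operations of the same kind
        # and ensures the callables themselves are never compared.
        self._counter = itertools.count()

    def submit(self, kind, action):
        """Queue an operation; `kind` is 'publish' or 'gc', `action` is a callable."""
        heapq.heappush(self._heap, (PRIORITY[kind], next(self._counter), kind, action))

    def run_all(self):
        """Run queued operations one at a time and return the kinds in execution order."""
        order = []
        while self._heap:
            _, _, kind, action = heapq.heappop(self._heap)
            action()  # in practice this would wrap the real publish or GC command
            order.append(kind)
        return order


if __name__ == "__main__":
    log = []
    queue = RepoOperationQueue()
    queue.submit("gc", lambda: log.append("gc"))
    queue.submit("publish", lambda: log.append("publish A"))
    queue.submit("publish", lambda: log.append("publish B"))
    queue.run_all()
    print(log)  # ['publish A', 'publish B', 'gc']
```

A strict priority like this only moves the starvation problem to the GC side if publishing never pauses, so a real setup would likely combine it with a time window (for example GC at night) or an upper bound on how long GC may be postponed.
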
## JIRA vs GitHub
- idea is to stop creating new issues in JIRA and to favor GitHub instead
- lots of GitHub issues and PRs being opened by JumpTrading to reflect the patches they have done in their CVMFS fork
- lots of them related to performance improvements
- some are specific to their use case
- fellow @ CERN funded by this company
- JumpTrading will present a talk at CVMFS workshop
## Can ducc command be used to only unpack container images (not ingest them)?
- ducc currently doesn't support skipping ingestion into CVMFS
- Maybe Singularity is a better option for this?
## Installing CVMFS on stratums or proxies pulls in Apache as a dependency
- default index.html is left in place, so visiting the URL with a browser gives you the default Apache landing page
- makes it look like something wasn't cleaned up or configured properly
- it's not the job of CVMFS to clean up anything outside of the scope of `var/www/html/cvmfs`
## Large scale corruption issue
- https://sft.its.cern.ch/jira/browse/CVM-2001
- related to https://github.com/cvmfs/cvmfs/issues/2834
- some fixes in the works for this, but mostly to fix performance issues, not to fix corruption due to kernel cache poisoning
- can be discussed again during next CVMFS coordination meeting (with Jakob joining too)
## Impact of IPS @ Compute Canada
- mainly happened with CVMFS preload (populating an alien cache)
- unclear what type of problems this is triggering exactly
- getting corrupt files should not be possible due to hashing mechanism in CVMFS
- easy solution is to ask for an exemption in the IPS
- unclear whether something could be improved in CVMFS (better error reporting?)
- Compute Canada should report this to CVMFS with more details to provide better context
- CVMFS still uses HTTP rather than HTTPS, since HTTPS only adds overhead and also interferes with putting a squid in between
- HTTPS should be OK between Stratum-0 and Stratum-1
- Squids could also use HTTPS "up"
- between client and squid, plain HTTP will be hard to avoid because of transparent proxying
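
The remark above that corrupt files should not be possible rests on content addressing: an object is requested under the hash of its content, so the receiver can recompute the hash and reject anything that was altered in transit, even over plain HTTP. The snippet below is a simplified illustration of that check, not the actual CVMFS object format or client code; `content_hash` and `verify_object` are hypothetical names, and SHA-1 is used here only as an example algorithm.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Return the hex digest that serves as the object's name (SHA-1 for illustration)."""
    return hashlib.sha1(data).hexdigest()


def verify_object(expected_hash: str, data: bytes) -> bool:
    """Accept downloaded data only if it hashes to the name it was requested under."""
    return content_hash(data) == expected_hash


if __name__ == "__main__":
    original = b"example file content"
    name = content_hash(original)
    # Simulate a middlebox (such as an IPS) modifying the payload in transit.
    tampered = original + b" [modified in transit]"
    print(verify_object(name, original))  # True
    print(verify_object(name, tampered))  # False
```

With a check like this, interference from an IPS would be expected to show up as failed or retried downloads rather than as silently corrupted files, which is why more detailed reports of the observed symptoms would help.
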
## Next meeting
- Mon 13 June 2022, 17:00 CEST