conda-forge bot/infra meetings
first friday of every month at 14:00 UTC
2025-03-07
Attendees:
Agenda:
- JRG/MRB: OCI requirements
- Budget: ~10K
- Work items:
- Finalize the CEP pending issues
- Name mangling to account for naming standards (no leading underscore, 128 chars max or so).
- Update current GHCR deployment with new naming (PR needs to be in conda-oci-mirror)
- Mamba needs their logic updated; conda and the rest need implementation from scratch
- Teach conda-package-handling to transform the OCI artifacts back and forth
- Implement downloading OCI packages in rattler
- Implement downloading OCI packages in conda
- Repodata indexing and publication in OCI contexts. Might need another CEP.
- Compatibility with sharded variant.
- Figure out how to do staging and production environments
- Old ideas include private/public states and a queue somewhere
- Another idea: pick up event from merges to main and other branches that triggers the build somewhere else. We would end up with a centralized "builds repo". Similar to linter setup with multiple isolated stages.
- Important: How to provide feedback back to the feedstock. Commit status required. Figure out what to do with the README badges. The linter dynamically sets the name.
- Repodata patching and broken packages
- How to do labels in efficient ways for RCs, and other non-production artifacts
- Note about limits: no length limits for names, 128 chars for labels/tags.
- Action items:
- Standardize only what an OCI-uploaded package looks like, leave repodata out
- Later, generate repodata ourselves in conda-forge instead of mirroring
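The naming constraints noted above (no leading underscore in names, 128 characters for tags) could be handled by a small mangling helper. This is only a sketch under assumptions, not the scheme from the CEP or from conda-oci-mirror: the `zzz` prefix, the `__p__` / `__e__` / `__eq__` replacements and both function names are hypothetical placeholders. The tag pattern is the one from the OCI distribution spec.

```python
import re

MAX_TAG_LENGTH = 128  # limit for labels/tags noted in the meeting

# Hypothetical replacements for characters that are valid in conda
# versions / build strings but not allowed in OCI tags.
TAG_REPLACEMENTS = {"+": "__p__", "!": "__e__", "=": "__eq__"}

# OCI distribution spec tag grammar: one leading word character,
# then up to 127 characters from [a-zA-Z0-9._-].
VALID_TAG = re.compile(
    rf"[a-zA-Z0-9_][a-zA-Z0-9._-]{{0,{MAX_TAG_LENGTH - 1}}}"
)


def mangle_name(name: str) -> str:
    """Make a conda package name usable as an OCI repository name."""
    # Assumption: a leading underscore is rejected, so add a prefix.
    if name.startswith("_"):
        return "zzz" + name
    return name


def mangle_tag(version: str, build: str) -> str:
    """Build an OCI tag from a conda version and build string."""
    tag = f"{version}-{build}"
    for char, replacement in TAG_REPLACEMENTS.items():
        tag = tag.replace(char, replacement)
    # fullmatch also enforces the 128-character limit.
    if not VALID_TAG.fullmatch(tag):
        raise ValueError(f"cannot express {tag!r} as an OCI tag")
    return tag
```

Any real scheme also needs to be reversible, so that clients can recover the original conda name and version from the mangled form.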
2025-02-07
Attendees:
Agenda:
- YT: Reviews for integration tests in bot codebase.
- MRB:
- Recommend splitting changes into smaller PRs for easier reviewability.
- All of these are good ideas and probably needed, but it's better to have more actionable submissions.
- Possible idea: write a big PR, but then split things like API changes from new feature requests. This way the reviewer can assess whether something will break quickly, and then once that's established it's safer to build on top of a solid, working codebase.
2024-11-01
Attendees:
Agenda:
- MRB:
- SC:
- Combining Pulumi w/ 1password for secret managing
- Need to figure out necessary scopes per service
- What would be a good candidate for the first workflow to try?
- JRG (via KZ):
- Zulip vote has passed, invite link incoming
- WV:
- Python bindings for rattler are underway
- wishes welcome
- rattler-solver for conda part of that (collab w/ JRG)
- parallel linking is a separate issue
- sharded repodata is being rolled out on prefix
2024-10-04
Attendees:
Agenda:
- WV:
- Version bumping for recipe v1
- Duplicate version/URL detection in the bot?
- YT: update_sources will write the detected version to countyfair, then the migrator jobs will process the new versions, and then the autotick will execute the new migration data.
- WV: The version is kept, but the detected URL is not. Maybe it would be nice to keep around to simplify the process.
- KZ: Some docs at https://github.com/regro/cf-scripts/?tab=readme-ov-file#current-bot-jobs-and-structure
- YT: Some opportunities for optimization of some steps
- KZ: Consider constraints of resources
- KZ:
- Going through the backlog of broken version updates.
- JRG:
- Raw links in status page now available for broken version updates and migration details.
- Many errors can be explained with a handful
2024-09-06
Attendees:
Agenda:
- YT:
- Big queue of PRs awaiting reviews about adding abstractions for local debugging
- Working on adding integration tests to the bot to at least make sure that it can start and work. Tests are basic but sufficient to detect 'obvious' issues that required reverts in the past.
- ED: Where do the mockup repos live?
- YT: Fake orgs and repos set up separately, suffixed with -staging.
- MRB: Can we have the repos embedded in regro itself to reduce the spread of orgs?
- YT: Trying to avoid interfering with the production environment.
- MRB: With one more bot account it should be ok.
- Is it helpful to add review comments after a PR has been merged?
- JRG:
- Almost done with the CZI grant, ~70h left. Will prioritize Pulumi secret management. Should any other tasks be bumped?
- MRB:
- Yes for secrets
- Mirroring: Wolf's prototype and layers, are full .conda artifacts stored?
- JRG: Yes
- Delegate to original source for non-mirrorable content
- DO use redirections if possible
- "OCI middleware to provide static-like downloads so conda can use it blindly"
- MRB to KZ: how's the conda-build Lazy Index PR going?
2024-07-02
Attendees:
Agenda:
- WV: Working on rattler-build integrations across conda-smithy, staged-recipes
- JRG: Main challenges?
- WV: A lot of moving pieces across regro / conda-forge, unclear relationship, API is basically frozen, dependencies on main. Trying to modernize projects as they are being worked on (ruff, pyproject.toml).
- ED: do we have dev standards for these things? i.e., if app, use lock files, data model = pydantic, etc.
- WV: sovereign tech fund "reduce tech debt" PRs have kind of rotted because they didn't get merged in. Also, some of the PRs feel like automated tooling changes (os.path -> pathlib)
2024-06-07
Attendees:
Agenda:
- YT: Local debugging of migration runners so it's not as tied to the Github infra.
- Git logic needs to be decoupled from the other logic so this refactor is possible. Using OO approach to replace the Git logic with a DryRun logic. Some parts already merged, but still some more needed.
- Question about maintenance bus factors, and how to help with the bot tasks / reviews.
- MRB: Help welcome, take into account maintenance model. Sometimes quick patches are needed instead of a perfect solution.
- NM: rattler-build in conda-smithy
- Received PR comments from Isuru. What to do after the (eventual) approval. Staged-recipes integration?
- MRB recommends starting integration tests with an existing feedstock.
- KZ: Problem in openmpi5 migration? https://github.com/conda-forge/libnetcdf-feedstock/pull/192#issuecomment-2149885644
- MRB: https://github.com/regro/conda-forge-feedstock-check-solvable can use rattler under the hood now
- JRG: More verbose logs coming soon
- MRB: Status on conda-build using the new conda Index object?
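The Git/DryRun split that YT described above could look roughly like the following. This is a sketch of the general object-oriented pattern only: the class and function names are hypothetical and are not taken from the cf-scripts codebase.

```python
import subprocess
from abc import ABC, abstractmethod


class GitBackend(ABC):
    """Interface the migration code talks to instead of calling git directly."""

    @abstractmethod
    def push(self, repo_dir: str, remote: str, branch: str) -> None:
        ...


class CliGitBackend(GitBackend):
    """Production backend: really runs git."""

    def push(self, repo_dir: str, remote: str, branch: str) -> None:
        subprocess.run(["git", "push", remote, branch], cwd=repo_dir, check=True)


class DryRunGitBackend(GitBackend):
    """Local-debugging backend: records what would have happened."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, str, str, str]] = []

    def push(self, repo_dir: str, remote: str, branch: str) -> None:
        self.calls.append(("push", repo_dir, remote, branch))


def run_migration(backend: GitBackend, repo_dir: str) -> None:
    """Hypothetical migration runner; recipe edits would happen before the push."""
    backend.push(repo_dir, "origin", "migration-branch")
```

With this shape, a local run passes `DryRunGitBackend()` and inspects `backend.calls` afterwards, so nothing touches GitHub.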
2024-04-05
Attendees:
Agenda:
- YT: Changes in pydantic schema PR
- JRG: Summary of progress in managing Github teams with Pulumi
- MRB: Same but for repo secrets
2024-03-01
Attendees:
Agenda:
- JRG: new pydantic docs for conda-forge.yml
- JRG: move orga docs to community
- JRG+KZ: reorg docs into new Diataxis-based framework
- MRB: what to do with libcfgraph
- JRG: conda-forge-metadata still defaults to libcfgraph; switch to OCI+anaconda combo
- MRB: need to check who else is using that repo
- JRG: Python import maps still unavailable elsewhere; we need the files-to-artifact DB somewhere.
- (We talked about mongodb free tier, or a regular SQL server.)
2024-02-02
Attendees:
Agenda:
- JRG: Add latest updates for conda-forge.org/packages
- JRG: Github Actions runners for Apple Silicon
- JRG: Make Travis CI opt-in
- Bela: Debugging improvements for autotickbot.
- https://github.com/regro/cf-scripts/pull/2131
- What kind of problems are usually found while working with the bot?
- Recipes not being parsable by the bot. We try to expose some of these errors to the status page but they don't always make it. Need local debugging by running the bot code against the problematic recipe.
- Improvement: test the recipe parsing.
- Global state of the bot gets in the way of graph traversal
- Very tricky to debug
- Debugging via notebook in regro/cf-graph-countyfair
- "I didn't get a version update"
- Parsing issues
- Upstream changed source URL and we don't know about that novel schema
- Checking URLs for existence (e.g. new version) sometimes fails with curl, but not wget, or requests
- Solvability of the dependencies. It got a little better, but still not perfect.
- A few times a month, they need to reset some metadata in the bot database. If you accidentally push a tag with confusing syntax that is mistakenly taken as too far in the future, the bot will remember it even if deleted in the repo, so folks have to manually submit a PR in the graph to clear it up.
- This would be nice to have as an admin-requests PR-workflow.
- What's the biggest time sink while debugging issues in cf-scripts?
- Setting up, updating the repos and metadata, loading up the repo
- About 6 months ago, an admin command was added that allows updating the version in-feedstock. It pulls the bot code. Note that the webservices server is too slow to run the code itself, so instead it makes GHA run it on behalf of the user with a dance of webhooks and repository_dispatch payloads. See:
- Idea: devise automation to debug a migration via issue-title admin-command, e.g. "@conda-forge-admin, please debug migration XYZ"
- Wolf: rattler-build and conda-forge: availability soonish.
- MRB: 1st question: Define availability. Just compatibility for recipe.yaml, or also bot integrations.
- WV: Tier 1 integration would be to recognize recipe.yaml presence to run for recipe.yaml. Autotick-bot compatibility would come after.
- MRB: Editing the YAML is easier, yes, but there's a lot of hardcoded logic and assumptions about meta.yaml and might be not trivial. Needs to ensure that both formats receive the same output.
- WV: Render output format will also be different. The names in-tarball will be different too (instead of info/meta.yaml it'll be info/recipe.yaml, and so on).
- MRB: There's some replacement done for Jinja functions like compiler(), which are ignored in the output graph. Also some complexity with run_exports and pinnings.
- MRB: Rendering happens for each .ci_support/ file, so there's some solving happening, but partially.
- The code is difficult to follow in some places.
- Python bindings would come in very handy
- MRB: Does rattler understand recipe/conda_build_config.yaml files?
- WV: Yes, to a degree. No selectors are taken into account. Considering changing it to have a 2-stage parsing to get platforms and some fields only, and the rest will come in a second phase.
- MRB: This might get in the way of conda-smithy rerendering processing and migrators. If not compatible, it might require two versions of the same migrator for each format. It'd be easier to have rattler understand CBC as we move forward.
- CBC.yaml can run arbitrary Python in the selectors (it's an eval after all). We need to be careful with things like os.environ calls
- MRB: Idea: Generate meta.yaml from recipe.yaml to make the bots work initially.
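The concern above about CBC selectors being an eval can be illustrated with a toy selector evaluator. This is a simplified sketch, not conda-build's actual implementation: the regex, the `keep_line` helper and the contents of the namespace are assumptions made for illustration.

```python
import os
import re

# Matches a trailing "# [expression]" selector comment on a line.
SELECTOR = re.compile(r"#\s*\[(?P<expr>.+)\]\s*$")

# Assumed namespace: platform flags plus the os module, so that
# selectors can read environment variables.
EXAMPLE_NAMESPACE = {"linux": True, "osx": False, "os": os}


def keep_line(line: str, namespace: dict) -> bool:
    """Return True if the line has no selector or its selector is truthy."""
    match = SELECTOR.search(line)
    if match is None:
        return True
    # eval() executes arbitrary Python: whatever the namespace exposes
    # (here, os and therefore os.environ) is reachable from the selector.
    return bool(eval(match.group("expr"), {}, namespace))
```

A line such as `- foo  # [os.environ.get("CBC_DEMO_FLAG") == "1"]` is kept or dropped depending on the environment of whoever renders it. A parser that does not run Python would have to decide how to treat selectors like that.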
2024-01-12
Attendees:
Agenda:
- JRG: web updates
- OCI: mamba in process
- Uwe: rattler build work is coming, lots of bot changes, will be working on it
2023-11-03
Attendees:
Agenda:
2023-09-01
Attendees:
- Jaime Rodríguez-Guerra
- Vini
- Hind
- Eric
- MRB
Agenda:
- Jaime: Potential SDG application for a Scaleway-based Apple Silicon CI. Amit is looking into it to assess whether it's feasible.
- Apple licensing
- MacStadium no public CI
- Matt: New recipe format in the bot
- Main point: Parsing recipe format to add nodes in the graph
- Dependencies on conda-build
- Might be good timing to start anew
- Good option for CZI grant (rattler-build + conda-forge?)
- Discussion about OCI and community standardization:
2023-07-06
Attendees:
Agenda:
2023-03-03
Attendees:
Notes:
- depfinder: problem is because we're fnmatching on a bunch of stuff that includes the full path
- https://github.com/conda/grayskull/issues/441 this one?
- DEBUG depfinder:reports.py:168 ******found ignore match for name: /usr/share/miniconda3/envs/test/conda-bld/praw_1677855233037/work/praw/reddit.py
- problem 1: we have an env named "test" so the fnmatch for "/test/"
- problem 2:
bot background:
- biggest thing that's happened in bot land in the last 2 years is that the bot will now update recipes with dependencies from grayskull, or depfinder actually; it does both. grayskull is broken because grayskull has a bug that marcelo hasn't had a chance to fix: things related to carets in the deps, which conda doesn't understand, and the extra requirements syntax from pip-land. so that's all broken. depfinder broke too, which we think we just figured out.
- second big change - slow process - started before bot meetings stopped - completed it in the past week. refactored a lot of the internals of the bot's data model so that everything can run in parallel. much more responsive now because it's on cron jobs. before that was done - vini you were working on the version stuff - if you made any changes to the json blob - one json blob per feedstock - this comes from the github API - what happened was that if you wanted to run different parts of the bot in parallel they'd all edit the file simultaneously and the whole thing would fall apart. we took that single PR-json file and extracted the different elements into different files. jobs that collect versions, jobs that update PRs, jobs that update feedstocks and remake metadata, and some other jobs; those are the main three, and all of those jobs can run in parallel now. what happens now is that when the bot starts up it grabs the current state, so it does use a consistent data model at a given time. if we were to switch to a DB and we allowed updates on the fly, the bot would potentially see a changing data model as it runs. now you just download a git repo so it all works. that's cut out a huge amount of latency, even though CF has grown by ~2x. now the bot runs on a cycle that takes ~2 hours, which is down from 3, which is down from even 4-5 hours not too long ago.
- V: some stuff is lingering around. graph, date. another job can start while the prev job is running.
- M: bot always had this deadman switch. current job kicks off the next one, as opposed to the cron job. sometimes that step of kicking off the next one fails so we have a cron job that sees if one is running and starts one if it's not. switched to GH concurrency feature in their workflows and use the github cron syntax for kicking things off. now you'll see dup jobs coming in. that cron job kicks off a new one but concurrency of 1 means that it doesn't actually run. bot status badges don't mean anything anymore. when github does the cancellation, it doesn't know that we actually are working fine.
- M: With Eric's help we moved things back into one repo instead of splitting into two. cf-autotick-bot was merged with cf-scripts
- V: knew we had migrations and admin-requests. noticed on J infra source of truth there is a curator app. is this new?
- M: Security discussion. not transcribing.
- M: Bot latency is the biggest issue right now. "Half a day" in practice (30 mins min, plus the CI job itself - 2h-ish). Cutting the time down is the biggest impact on users. Splitting tasks in separate jobs could also work (version updates different from migrations).
- V: How are tasks split in version updates?
- M: Some modulo operation on the feedstock name hash (?)
- M: chaindb still used for dependency tracking (?), which requires xonsh. If we remove that, we can remove xonsh completely.
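The task splitting mentioned above ("some modulo operation on the feedstock name hash") was not confirmed in the meeting. One way it could work is sketched below; the function names and the choice of SHA-256 are assumptions, not what the bot actually does. A stable hash is used because Python's built-in `hash()` for strings is randomized per process, which would send the same feedstock to different jobs on different runs.

```python
import hashlib


def shard_for(feedstock: str, n_shards: int) -> int:
    """Map a feedstock name to a shard index in [0, n_shards)."""
    digest = hashlib.sha256(feedstock.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_shards


def select_for_job(feedstocks: list[str], job_index: int, n_jobs: int) -> list[str]:
    """Return the feedstocks that a given job is responsible for."""
    return [name for name in feedstocks if shard_for(name, n_jobs) == job_index]
```

Each feedstock lands in exactly one job, and the assignment stays the same between runs as long as the number of jobs does not change.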
2023-04-07
cf-graph-countyfair - working data for the bot. what versions its found, all of the nodes, etc.
libcfgraph - indexes all of our artifacts so we can access them quickly
- really useful for a whole bunch of tasks / stuff
https://hackmd.io/63JsxC1GSWWXX3IClySLtA
move cf-graph into the DO database
easiest place to start is the version update loop
next place to go is PR update loop
- issue here is that sometimes the bot has to go all the way down
- two layers of PR data
- list of prs that the bot has made
- github pr info something something etag
Action items: