# RPM meeting

* RPM Developer Docs: https://hackmd.io/@pbrochad/rpm-dev-notes
    * General notes on RPM ecosystem, Pulp troubleshooting, etc.

<!--
## Agenda template

### date, 2022

Pulp 3:

Open PRs:
* https://github.com/pulp/pulp_rpm/pulls

Un-triaged issues:
* https://github.com/pulp/pulp_rpm/issues?q=is%3Aissue+is%3Aopen+label%3ATriage-Needed

CI status check
* https://github.com/pulp/pulp_rpm/actions?query=workflow%3A%22Pulp+Nightly+CI%2FCD%22

3-month planning checkin (every 1.5 months):

-->

## Pending action items:

* discuss adopting zero-downtime-migration strategy
    * https://pulpproject.org/pulpcore/docs/dev/learn/plugin-concepts/?h=zero#zero-downtime-upgrades
* last copr issue https://github.com/pulp/pulp_rpm/issues/2271

## Agenda template

```
### Month DD, 2023

Action Items:

Discussion Topics:
* Review PRs
* Triage new issues
* Check CI
```

Open, non-Draft PRs:
* https://github.com/pulp/pulp_rpm/pulls?q=is%3Apr+is%3Aopen+draft%3Afalse

Un-triaged bugs:
* https://github.com/pulp/pulp_rpm/issues?q=is%3Aissue+is%3Aopen+label%3ATriage-needed

CI status check
* https://github.com/pulp/pulp_rpm/actions/workflows/nightly.yml?query=workflow%3A%22Rpm+Nightly+CI%2FCD%22

## 2025

## Action Items

* Think about v4 changes

## Upcoming

## September 18, 2025

* Triage and EVR sort discussion
    * https://github.com/pulp/pulp_rpm/issues/4124
    * https://github.com/pulp/pulp_rpm/blob/main/pulp_rpm/app/migrations/0013_RAW_rpm_evr_extension.py
    * can't take advantage of postgres 16 new collations feature because backportability
        * https://www.postgresql.org/docs/current/collation.html
* Discuss sync memory reductions
    * Customer case resolved - workaround works
    * Document with long term ideas tradeoffs - https://hackmd.io/6fwkDMOXRamBz27d5CM-IQ
* rpm-builder usefulness for testing
    * https://crates.io/crates/rpm-builder
    * https://github.com/rpm-rs/rpm-builder
    * discussion ensues, along with many huzzahs!
    * discussion around licensing
* report on django-import-export-4 progress

## September 11, 2025

* dalley to create document on sync-time memory reduction strategies & workarounds
    * https://hackmd.io/6fwkDMOXRamBz27d5CM-IQ
    * https://github.com/pulp/pulp_rpm/issues/2271

## August 21, 2025

* [decko / brian] memory use issues on sync
    * issue seems to be the filelists. lots of packages with lots of files
    * https://github.com/pulp/pulp_rpm/issues/4086
    * https://github.com/pulp/pulp_rpm/issues/4085
    * not a *huge* problem right now
        * maybe this is just "these repos are kind of terrible, and we may not be able to do anything rational with them"
    * dalley's done some investigation
        * "the problem" is ballooning memory when extracting the filelist from a package, for all the packages in the repo
        * the offending repos have MANY versions of a *small number* of packages
    * discussion around ways to respond to memory-pressure adaptively, ensues
    * discussion around dealing w/ filelists specifically
        * e.g., "store filelists in some complicated compression-format/tree-format"
            * this is how libsolv and internals of rpm-headers work currently
            * def would need to be considered from a performance/scaling/migration POV
        * another possible answer: store filelists as compressed-json-blob
    * thoughts in our brains
        * being able to solve the adaptive-sync-stage-one *generally* would be really cool for Pulp
        * rethinking how filelist is stored in pulp_rpm specifically would be "pretty" interesting
        * just fixing "these two repos" is..less interesting from a prioritization standpoint
    * AI: [decko] try opening communications with the third party owners involved, "hey, your repos are a little...crazy. Can you make them somewhat-more-sane?"
        * can we get thirdparty vendor issues opened?
        * leave original issues open til we've had some communications w/ vendors
* [pbrochad] strategic team review of capsule sync PR
    * https://github.com/pulp/pulp_rpm/pull/4070
    * on intake/sync - seems reasonable
    * Daniel has Thoughts in his Brain RE appearance/history/side-effects - see issue for his comments
    * discussion ensues
        * it's *possible* that we don't even need to make this conditional?
        * location_href vs filename vs relative_path, vs publications vs distributions vs mirroring, makes us all Very Sad
* [dalley] CI issues

## August 14, 2025

* pulpcore 3.85 compatibility status
    * migrations squashed - resolves the BaseDistribution issue
    * bump minimum supported pulpcore to 3.85? with these issues it may be simpler?
* note: trying to get a new libcomps release

## June 12, 2025

* https://discourse.pulpproject.org/t/poc-simple-reopsitory-tmp-wiped-afer-sync/2013

## June 6, 2025

* continue talking through/firming up the pulp_rpm-v4 changes (see above)

## May 15, 2025

* Make a Y release: pulp_rpm 3.30.0
    * Done!
* Discuss a plan for building Pulp developer-focused documentation about the RPM world and its dark corners.
    * advisory handling
    * RPM filenames/conflicts
        * https://issues.redhat.com/browse/PULP-294
    * onboarding/scratchpad
    * high-level "here's which pieces of pulp_rpm map to which pieces of the RPM Ecosystem metadata"
    * AI: [dalley] has a google doc "somewhere" - will find and link to us
    * AI: [pbrochado] to take first pass at turning that into a public hackmd
        * note: review for any customer/release info
        * https://hackmd.io/@pbrochad/rpm-dev-notes
* Think about v4 changes
    * Don't sync .treeinfo by default, opt-in
        * what is "least user astonishment"?
    * Drop location_base, location_href from Package, maybe replace w/ "filename"
        * this is a high-prio thing to address for v4
    * Should pkgid and checksum type really be part of the Package?
        * should def be able to look up by pkgid
        * what happens at upload time?
    * Drop publishing as sha384?
        * package-checksum and metadata-checksum
        * we currently *do not allow* md5/sha1 publication
        * we want to reduce rpm's dependency on sha384 - in case core decides to phase out unused checksums
        * we'd still allow it *at sync time* - but say "pulp only lets you specify 'more reasonable' checksums for publications"
        * can this be done "in" a Pulp3 breaking change release?
        * discussion: current checksum strategy is not long-term tenable
            * maybe a Pulp5 discussion
            * more kinds-of checksums already in use in various places
            * we'll need a better way than "generate always all of these checksums"
    * Evaluate whether we *really* still need depsolving?
        * let's have a "make an actual decision here" w/ katello/satellite
    * If any changes need to be made to deconflict copy APIs between pulp_rpm and pulp_ansible and pulp_deb
        * def a good idea for pulp-4
        * prob needs Copy -> RpmCopy

## March 6, 2025

* pulp_rpm content-label perms need to be done differently (soon)
    * rework
* investigating "fun with aiohttp and SSL and self-signed certs" again

## February 13, 2025

* ready for 3.28 release?
    * [change-distribution-layout PR](https://github.com/pulp/pulp_rpm/pull/3878) should be included
        * just investigating a test-fixture issue
        * consensus: wait 3.28 on this please
    * [remove deprecated options](https://github.com/pulp/pulp_rpm/pull/3879)
        * should we just...leave these?
        * *does* change the published-API, in ways that we don't really "have to"
        * consensus: not in 3.28, probably when Pulp 4 happens
            * "soon"!

## January 30, 2025

* discussion around zero-downtime-migrations
    * review the rules
    * Probably when it becomes relevant i.e.
      when we have a major migration of some kind
* [null content origin pr](https://github.com/pulp/pulp_rpm/pull/3865)
    * ggainey to make sure dalley/pbrochado have access
    * team will decide next week whether to adjust it or wait til he is back from PTO
* [PRN support PR](https://github.com/pulp/pulp_rpm/pull/3864)
    * no breaking news
    * implementation details are Fun
    * Q: on the view, check src/dest repo sanity
    * lots of discussion ensues

## January 23, 2025

* discussion around [checking GPG keys before adding to repo](https://github.com/pulp/pulp_rpm/pull/2954)
* branch-protection-rules
    * we only had them for `[0-9].[0-9]`
    * so nothing after 3.9 had branch-protection
    * modified to match pulpcore - `[0-9].[0-9]*`
* discussion: [PRN support PR](https://github.com/pulp/pulp_rpm/pull/3864)
* discussion: [Content Origin](https://github.com/pulp/pulp_rpm/pull/3865)

## January 16, 2025

Discussion:
* priorities
    * core/3.70 support
        * https://github.com/pulp/pulp_rpm/issues/3854
    * null-CONTENT_ORIGIN impacts
        * audit and fix
        * https://github.com/pulp/pulp_rpm/issues/3856
    * PRN support (e.g., advanced copy):
        * https://github.com/pulp/pulp_rpm/issues/3853
        * audit to see where else we need to change
    * pulp-smash removal:
        * https://github.com/pulp/pulp_rpm/issues/3855
* did ggainey archive 2024 minutes?
    * yes : https://hackmd.io/@pulp/rpm_meeting_2024
* next pulp_rpm Y-release should include 3854, 3853, 3856
    * dalley creating a 3.28 milestone

### January 9, 2025

* Rename this meeting to satellite?
    * maaaaybe - but we really don't talk much about non-rpm/file issues
        * e.g., container?
    * ggainey: we talk about Satellite/katello A LOT
    * pbrochado: we do spend time dealing with just-upstream-issues
    * dalley: we do talk about Satellite, but maybe only because it's the biggest stakeholder
    * anthomas: what about rpm/stakeholder?
    * dalley: the name may not be important, as long as we know what we're here for
    * ggainey: do we *need* a Satellite-specific meeting?
    * dalley: no - there's already the katello integration
    * consensus: let's not
* ttereshc: jira d2d dashboard updated/cleaned up
    * what else do we need/want on this dashboard for us? Let Tanya know!
* some process-discussion has happened
    * ttereshc: Story Points on any/everything you're working on
        * in-progress/closed, please
    * dalley: only on pulp-side? **Yes** please.
    * background:
        * goal is, Pulp team doesn't touch top-level Sat jiras
        * pulp-part is not a subtask, it's **an issue in the Pulp tracker** that gets linked
        * "shouldn't" need to set the Sat-jira-status
    * ttereshc:
        * do still need to set "fixed in" on the satellite issue
        * discussion/comments will prob happen on the satellite-jira
    * ggainey:
        * remind me what the process is when there are sat-jiras for each sat-version, that all map to "one github issue that is backported/released to multiple pulp versions"
            * answer: RTFM, Grant!
            * https://docs.google.com/document/d/1V0rl8PNV6xEbu_xA__iEuoINubVTGWpXGpqK_eKXLeo/edit?tab=t.0&authuser=2&hl=en#heading=h.giqhjac9tqij
    * discussion when upstream finds problems when there are not yet any Sat-Jiras/customer-cases
        * def go to Sat-Eng and make it known
        * prob want to discuss at katello-integration-mtg the right thing to do
    * ttereshc to send anthomas doc/jiras on the current process and its logic
* PRN support with RPM advanced copy API?
    * discussion to bring anthomas up to speed on "what the hell are PRNs?"
    * needs a github issue - dalley volunteers to open one
    * where else might RPM need to do work?
        * specifically - things we don't just get "for free" from inheriting from core?
        * downloaders? content?
    * https://github.com/pulp/pulp_rpm/issues/3853
* pbrochado: discuss backporting migrations?
    * discussion on why we don't do this
    * there is a way to handle this for a specific fix under discussion
        * there is a django-command that can make this happen
        * if a migration has the same name/order/depends-on, in every single backported branch, then this can work

# v4 Planning

* Let's keep in mind that /v4/ has to work in parallel with /v3/, *and* that Pulp4 is more than just /v4/

## Ready for Katello Feedback

* Don't sync .treeinfo by default, opt-in
    * https://github.com/pulp/pulp_rpm/issues/4008
    * what is "least user astonishment"?
    * separate it out to a separate option rather than being part of skip_types? treeinfo is a different kind of thing than, say, skipping modules or errata or source rpms.
    * Katello: no problems with this
* Move skip_types to remote
    * https://github.com/pulp/pulp_rpm/issues/4009
    * also drop it from the sync-time options?
    * Katello: this sounds better than status quo
        * May require a data migration on the Katello side
        * *could* be added to v3 remotes ahead of time, but not hugely important
* Default to zstd compression metadata
    * https://github.com/pulp/pulp_rpm/issues/4006
    * Anything but EL7 can consume it - EL7 is now EOL
    * Katello: should be OK, what we do will depend on how many EL7 clients we expect to need to support at the time
        * Could just always pass compression: gzip to publication creation - then nothing really changes
        * Could set a migration to use gzip for any existing publications and enable configuring gzip or zstd for future ones via UI or something
* If any changes need to be made to deconflict copy APIs between pulp_rpm and pulp_ansible and pulp_deb
    * https://github.com/pulp/pulp_rpm/issues/4017
    * def a good idea for pulp-4
    * prob needs Copy -> RpmCopy
    * !This will break bindings! (but also it's already broken)
    * Katello: do not currently have issues with this
        * Only time this is used is with dependency solving
* Force immediate download of md5 / sha1 repos?
    * https://github.com/pulp/pulp_rpm/issues/4019
    * so that we can generate sha256 checksums and not have on-demand issues when user turns off md5/sha1 checksums
    * first cut: we could "refuse on first attempt" with error-msgs that describe why, and "switch to immediate to sync this repo"
    * Katello: we may need more discussion on this
        * Don't know how many customers may rely on this, e.g. syncing from artifactory
        * Don't really want to surprise anyone with big unexpected downloads, failure is probably preferable
        * A setting that would enable you to bypass this if required would be good
* Evaluate whether we *really* still need depsolving?
    * let's have a "make an actual decision here" w/ katello/satellite
    * Drop from v4 API, keep around for v3 for the time being? Deprecate & remove later on?
    * https://github.com/pulp/pulp_rpm/issues/4020
    * Katello: we can probably remove it technically speaking
        * But it's a bit of a social problem because of ingrained opinions about depsolving
        * Very very old content views might still be an issue?
        * We should talk to the product people to get a read on how
        * We should gather some information to present to help make the case to justify removing it
* Drop location_base, location_href from Package, maybe replace w/ "filename"
    * base vs href:
        * loc_href is "what is the path to this RPM in this repository structure"
        * base is "where is this repository" (can start with, for example, "https:" or "../../../allour binaries")
    * do we even save location_base?
    * this is a high-prio thing to address for v4 (or maybe Pulp4?)
    * let's make sure we understand and document how these interact with the answer to "what things are legal for a user to do in terms of layout of RPMs in a given repository"
        * https://issues.redhat.com/browse/PULP-294
        * https://github.com/pulp/pulp_rpm/issues/2580
    * Remove location_href from repository unique constraint
        * collides with content-dedup
        * same pkg w/ diff loc-href in 2 repos causes Problems
    * "filename" - constructed from NEVRA
        * "basename" vs "path"
        * katello currently treats location-href as basename (already doing a split-and-use-base if it's a path)
    * deduplicating content is part of what causes this to be a Hard Problem
    * repo-version-layout might be its own Thing? (this is for Much Future Work)
* Have more accurate serializer types for Package
    * https://github.com/pulp/pulp_rpm/issues/3694
        * see comment: we think this may not affect Ruby / Python bindings, but probably does affect Go bindings
    * would be a change in the "shape of the data"
    * maybe we could get a bit of pre-test w/ Satellite's help for the Ruby bindings?
    * python-binding-changes would be caught immediately by Pulp's CI
    * JSONField serializer is too generic, it can be any json type
* Remove is_modular flag from Package model
    * https://github.com/pulp/pulp_rpm/issues/3524
    * really isn't "owned" by the Package (or should not be) - it's an attribute caused-by "exists in a modulemd"
    * "changing the flag" violates the "immutable content" rule
    * what's a good answer?
        * ask "is this package in a module *in this repo*"?
    * one specific collision case:
        * ORA and RHEL had the same Package, which was "in a module" in one and "not in a module" in the other
        * second-sync "won" and changed things out from under the "other" repo
    * can we add an API (even to v3) where one can ask "is Package X modular *in repo-version-y*"

## Need more refinement

* Should pkgid and checksum type really be part of the Package?
    * should def be able to look up by pkgid
    * what happens at upload time?
      (same thing that always happens?)
    * ggainey sez "Hell no" :)
* Drop publishing as sha384?
    * package-checksum and metadata-checksum
    * we currently *do not allow* md5/sha1 publication
    * we want to reduce rpm's dependency on sha384 - in case core decides to phase out unused checksums
    * we'd still allow it *at sync time* - but say "pulp only lets you specify 'more reasonable' checksums for publications"
    * can this be done "in" Pulp3 breaking change release
    * discussion: current checksum strategy is not long-term tenable
        * maybe a Pulp5 discussion
        * more kinds-of checksums already in use in various places
        * we'll need a better way than "generate always all of these checksums"
    * ggainey sez "Either Just Do It, or drop the discussion"

## [2024 minutes](https://hackmd.io/@pulp/rpm_meeting_2024)

## [2023 minutes](https://hackmd.io/@pulp/rpm_meeting_2023)

## [2022 minutes](https://hackmd.io/@pulp/rpm_meeting_2022)

## [2021 minutes](https://hackmd.io/@pulp/rpm_meeting_2021)

###### tags: `RPM`, `Minutes`