# Pulp3 and FilesystemExports

This is in response to https://bugzilla.redhat.com/show_bug.cgi?id=2028377#c12

(In reply to Glenn Snead from comment #12)
> I took a look at https://github.com/Katello/katello/pull/9925, and I don't
> see the difference from the existing Satellite 6.10 content view export
> method.
>
> Unless the resulting tarball has the proper file tree i.e.
> content/rel8/x86_64/baseos/os/{Packages,repodata} format with a full copy of
> the latest repository metadata we cannot use Satellite 6.10 to support
> disconnected customers who are running their own Satellite servers. These
> Satellite servers expect an available CDN server to supply them with content
> regardless of what their Satellite Organization(s) are named, and what is in
> their Organization's entitlement manifest.

Pulp3 supports this kind of export; it's called a FilesystemExport. It currently doesn't produce the tarfile/toc/chunk that PulpExport does, but it does export a specified repository publication to the filesystem. This is in tech-preview because it hasn't had much use/testing, and therefore needs more eyes/thoughts on whether there are requirements we haven't thought of - this BZ is probably exactly what it needs :)

This code missed the 3.14 deadline; it's in 3.15.

Doc is here:
* https://docs.pulpproject.org/pulpcore/restapi.html#tag/Exporters:-Filesystem
* https://docs.pulpproject.org/pulpcore/restapi.html#tag/Exporters:-Filesystem-Exports

The export-workflow is "create a FilesystemExporter, with a name and an export path; invoke that exporter with a repository-publication href or a repository-version to create a filesystem-export; tar the results."

Example script, starting from "create and sync a repo" (NB: pulp-cli doesn't have file-export support yet, so I use direct HTTP requests):

```
$ pulp rpm remote create --name test --url https://fixtures.pulpproject.org/rpm-signed/ --policy immediate
$ pulp rpm repository create --name test --remote test
$ pulp rpm repository sync --name test
$ pulp rpm publication create --repository test
# use resulting publication-HREF below
$ http POST :/pulp/api/v3/exporters/core/filesystem/ \
    name=test \
    path=/tmp/fsexports/ \
    method=write
# use resulting Exporter-HREF below
$ http POST :/pulp/api/v3/exporters/core/filesystem/cad6493d-9412-4bf7-95ad-d2fb8b74fdd1/exports/ \
    publication=/pulp/api/v3/publications/rpm/rpm/ad5c8424-1adb-4966-b619-01b9d19ecc74/
$ ls /tmp/fsexports/
bear-4.1-1.noarch.rpm         dolphin-3.10.232-1.noarch.rpm  horse-0.22-2.noarch.rpm     squirrel-0.1-1.noarch.rpm
camel-0.1-1.noarch.rpm        duck-0.6-1.noarch.rpm          kangaroo-0.2-1.noarch.rpm   stork-0.12-2.noarch.rpm
cat-1.0-1.noarch.rpm          duck-0.7-1.noarch.rpm          kangaroo-0.3-1.noarch.rpm   tiger-1.0-4.noarch.rpm
cheetah-1.25.3-5.noarch.rpm   duck-0.8-1.noarch.rpm          lion-0.4-1.noarch.rpm       trout-0.12-1.noarch.rpm
chimpanzee-0.21-1.noarch.rpm  elephant-8.3-1.noarch.rpm      mouse-0.1.12-1.noarch.rpm   walrus-0.71-1.noarch.rpm
cockateel-3.1-1.noarch.rpm    fox-1.1-2.noarch.rpm           penguin-0.9.1-1.noarch.rpm  walrus-5.21-1.noarch.rpm
cow-2.2-3.noarch.rpm          frog-0.1-1.noarch.rpm          pike-2.2-1.noarch.rpm       whale-0.2-1.noarch.rpm
crow-0.8-1.noarch.rpm         giraffe-0.67-2.noarch.rpm      repodata                    wolf-9.4-2.noarch.rpm
dog-4.23-1.noarch.rpm         gorilla-0.62-1.noarch.rpm      shark-0.1-1.noarch.rpm      zebra-0.1-2.noarch.rpm
$ cd /tmp
$ tar cvf test.tar fsexports/
fsexports/
fsexports/bear-4.1-1.noarch.rpm
fsexports/camel-0.1-1.noarch.rpm
fsexports/cat-1.0-1.noarch.rpm
fsexports/cheetah-1.25.3-5.noarch.rpm
fsexports/chimpanzee-0.21-1.noarch.rpm
fsexports/cockateel-3.1-1.noarch.rpm
fsexports/cow-2.2-3.noarch.rpm
fsexports/crow-0.8-1.noarch.rpm
fsexports/dog-4.23-1.noarch.rpm
fsexports/dolphin-3.10.232-1.noarch.rpm
fsexports/duck-0.6-1.noarch.rpm
fsexports/duck-0.7-1.noarch.rpm
fsexports/duck-0.8-1.noarch.rpm
fsexports/elephant-8.3-1.noarch.rpm
fsexports/fox-1.1-2.noarch.rpm
fsexports/frog-0.1-1.noarch.rpm
fsexports/giraffe-0.67-2.noarch.rpm
fsexports/gorilla-0.62-1.noarch.rpm
fsexports/horse-0.22-2.noarch.rpm
fsexports/kangaroo-0.2-1.noarch.rpm
fsexports/kangaroo-0.3-1.noarch.rpm
fsexports/lion-0.4-1.noarch.rpm
fsexports/mouse-0.1.12-1.noarch.rpm
fsexports/penguin-0.9.1-1.noarch.rpm
fsexports/pike-2.2-1.noarch.rpm
fsexports/shark-0.1-1.noarch.rpm
fsexports/squirrel-0.1-1.noarch.rpm
fsexports/stork-0.12-2.noarch.rpm
fsexports/tiger-1.0-4.noarch.rpm
fsexports/trout-0.12-1.noarch.rpm
fsexports/walrus-0.71-1.noarch.rpm
fsexports/walrus-5.21-1.noarch.rpm
fsexports/whale-0.2-1.noarch.rpm
fsexports/wolf-9.4-2.noarch.rpm
fsexports/zebra-0.1-2.noarch.rpm
fsexports/repodata/
fsexports/repodata/68f65a6687ddf7616b17a283382da9303e1132913e4e4756e5255346ec5de3ca-primary.xml.gz
fsexports/repodata/f437734156437b264d07c9f17896d39953b6659d918a9ae605c5ec26301673ce-filelists.xml.gz
fsexports/repodata/06df8526da1cbf5a73951c97fa066a076ee0c95d916c90992e3ec5b078d6a651-other.xml.gz
fsexports/repodata/b65a7e04ba544b7339e848e23e0cd7021843e1c29eed06cb04707658c0aeb699-updateinfo.xml.gz
fsexports/repodata/a71ea6fe231802411ac8df72e00b74b5211c3dec3ffefefa96f8473691b03d7b-comps.xml
fsexports/repodata/repomd.xml
$
```

The import-workflow, in this context, is "copy test.tar to downstream and unpack, create a 'file:' remote pointing to the unpack location, and sync into a downstream repository."
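
The copy-and-unpack part of that workflow could be sketched roughly as below. This is only an illustration: it assumes the tarball has already been moved to the downstream host by whatever means is available and that it was built as in the export example (a single top-level directory holding the repository). The `unpack_export` function name and the `repodata/repomd.xml` sanity check are hypothetical additions, not part of Pulp or pulp-cli.

```shell
#!/bin/sh
# Hypothetical helper (not part of Pulp): unpack an fsexport tarball and
# confirm that the result looks like a yum repository.
# Usage: unpack_export <tarball> <destination-dir>
unpack_export() {
    tarball=$1
    dest=$2

    mkdir -p "$dest" || return 1
    # Extract into the destination; the archive carries its own top-level directory.
    tar xf "$tarball" -C "$dest" || return 1

    # A syncable repository needs repodata/repomd.xml somewhere under the unpack location.
    repomd=$(find "$dest" -type f -path '*/repodata/repomd.xml' | head -n 1)
    if [ -z "$repomd" ]; then
        echo "no repodata/repomd.xml found under $dest" >&2
        return 1
    fi

    # Print the repository root, i.e. the directory a file: remote URL should point at.
    dirname "$(dirname "$repomd")"
}
```

Unpacking into a dedicated, otherwise empty directory keeps the `find` check from picking up an unrelated repository; the printed path is what would go after `file:` in the remote URL.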

Starting from the above:

```
$ pulp rpm remote create --name file --url file:/tmp/fsexports --policy immediate
$ pulp rpm repository create --name file --remote file
$ pulp rpm repository sync --name file
$ http :/pulp/api/v3/tasks/696723d6-5c67-444f-921f-194e34b3acf3/
{
    "child_tasks": [],
    "created_resources": [
        "/pulp/api/v3/repositories/rpm/rpm/3ba87621-c246-47c0-baa1-a4c7af2f4eba/versions/1/"
    ],
    "error": null,
    "finished_at": "2022-03-03T16:55:16.720442Z",
    "logging_cid": "4120190b8c3a496b9787bd76f8eadc3d",
    "name": "pulp_rpm.app.tasks.synchronizing.synchronize",
    "parent_task": null,
    "progress_reports": [
        {
            "code": "sync.downloading.metadata",
            "done": 6,
            "message": "Downloading Metadata Files",
            "state": "completed",
            "suffix": null,
            "total": null
        },
        {
            "code": "sync.downloading.artifacts",
            "done": 0,
            "message": "Downloading Artifacts",
            "state": "completed",
            "suffix": null,
            "total": null
        },
        {
            "code": "associating.content",
            "done": 43,
            "message": "Associating Content",
            "state": "completed",
            "suffix": null,
            "total": null
        },
        {
            "code": "sync.parsing.packages",
            "done": 35,
            "message": "Parsed Packages",
            "state": "completed",
            "suffix": null,
            "total": null
        },
        {
            "code": "sync.parsing.comps",
            "done": 3,
            "message": "Parsed Comps",
            "state": "completed",
            "suffix": null,
            "total": 3
        },
        {
            "code": "sync.parsing.advisories",
            "done": 4,
            "message": "Parsed Advisories",
            "state": "completed",
            "suffix": null,
            "total": 4
        }
    ],
    "pulp_created": "2022-03-03T16:55:15.731439Z",
    "pulp_href": "/pulp/api/v3/tasks/696723d6-5c67-444f-921f-194e34b3acf3/",
    "reserved_resources_record": [
        "/pulp/api/v3/repositories/rpm/rpm/3ba87621-c246-47c0-baa1-a4c7af2f4eba/",
        "shared:/pulp/api/v3/remotes/rpm/rpm/56b5c79e-b2ed-4f43-88b3-27d030ef1234/"
    ],
    "started_at": "2022-03-03T16:55:15.781989Z",
    "state": "completed",
    "task_group": null,
    "worker": "/pulp/api/v3/workers/8734ffe2-504a-4cb1-af83-fb6872918d4a/"
}
```

The net is, I think we have much of the machinery in place to answer this need on the Pulp side.
We need to think about how we expose it in a way that makes sense to the Satellite user, and there's work to be done on packaging the result (e.g., teach FilesystemExport about tar and toc and chunks etc).

> Is there a plan for this use case? We have a lot of customers, 83 at the
> current state with hundreds more to come, who are depending on it.
>
> Here's what we need:
>
> Full repository exports containing a full copy of each repository's metadata
> that is both Satellite Organization, Content View, and software entitlement
> manifest neutral.
>
> Incremental repository exports containing a full copy of each repository's
> metadata that is both Satellite Organization, Content View, and software
> entitlement manifest neutral.

fsexport doesn't support incrementals; that would be a new feature.

> Single full or incremental repository exports containing a full copy of each
> repository's metadata that is both Satellite Organization, Content View, and
> software entitlement manifest neutral. This is to address critical CVEs that
> have mission critical impact. Think heart bleed and log4j.

## Notes from 9-MAR

### attendees: bbuckingham, paji, ggainey, gsnead

* requirements:
    * disconnected CDN
    * is the "upstream" for pre-vetted "downstreams"
    * currently 5.5Tb in AWS
    * one in AWS that can talk to us
    * produces tarfile exports
    * tarfiles moved to 'disconnected' satellite
    * disconnected satellite syncs
    * downstreams are vetted, sync from this upstream
* questions
    * how does current Sat6 repo-export not meet our needs?

### Proposed solution from Katello's point of view:

* Katello will use fsexport to export the contents of a repository along with its latest metadata.
* The user is responsible for creating, enabling and syncing this repository in their destination Satellite.
* Something like

```bash
$ hammer content-export complete repository --id=22 --cdn-format=true
[.....................................................................................................................................................................................] [100%]
Exported to '/var/lib/pulp/exports/2022-03-16T20-08-35-00-00/Default_Organization/Library/content/dist/rhel/server/7/7Server/x86_64/ansible/2.4/os'.
$ ls -lh '/var/lib/pulp/exports/2022-03-16T20-08-35-00-00/Default_Organization/Library/content/dist/rhel/server/7/7Server/x86_64/ansible/2.4/os'
....
-rw-r--r--. 1 pulp pulp 489K Mar 16 20:08 python-passlib-1.6.5-1.1.el7.noarch.rpm
drwxr-xr-x. 2 pulp pulp 4.0K Mar 16 20:08 repodata
-rw-r--r--. 1 pulp pulp  22K Mar 16 20:08 sshpass-1.06-1.el7.x86_64.rpm
.....
```

* All rpms will be hard linked to the correct Pulp artifacts to save on space.
* The user would have to manually tar and gzip them and send the archive to the downstream Katello server.
* No listing files in the destination (these have to be generated for import). We could look into hammer generating them automatically.
* PS: No support for incremental exports (until fsexport supports them).

### On the downstream Katello server

* Extract the archive and copy/rsync it to the local CDN.
* The user would have to manually create listing files for releasever and arch if they are not already there in the local CDN.
* The user would have to configure the Katello server to pull from the local CDN.
* Enable the repo of interest using the manifest, via the Red Hat Repositories page.
* Sync the enabled repository.

### Notes/Questions on the proposed solution

* I assume I would use tar's "--hard-dereference" option for hard links. Can someone confirm?
* Would the exported repository metadata be full copies of each repository's metadata?
* I have a python script Rich Jerido wrote years ago to create any missing listing files.
* Incremental exports are a must-have, as RHEL 7 repositories are quite large. The base repository is over 57 GB in size.

###### tags: `import/export`
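
For the listing-file step discussed in the notes above, a minimal sketch of filling in missing `listing` files might look like the following. It assumes a `listing` file is simply a sorted, newline-separated list of the child directory names at that level, and it writes one in every directory above a repository root (a directory containing `repodata/`), whereas a real CDN tree may only need them at the releasever and arch levels. The `make_listings` name is hypothetical, and this is not the Rich Jerido script mentioned in the notes.

```shell
#!/bin/sh
# Hypothetical helper: create missing "listing" files in a CDN-style tree.
# Assumption: a listing file holds the child directory names, one per line.
# Usage: make_listings <cdn-root>
make_listings() {
    root=${1%/}
    # Every repodata directory marks a repository root (its parent).
    find "$root" -type d -name repodata | while IFS= read -r repodata; do
        # Start at the parent of the repository root.
        dir=$(dirname "$(dirname "$repodata")")
        # Skip repositories that sit directly at (or outside) the tree root.
        case "$dir/" in
            "$root"/*) ;;
            *) continue ;;
        esac
        # Walk upward to the tree root, writing a listing where none exists.
        while :; do
            if [ ! -e "$dir/listing" ]; then
                for child in "$dir"/*/; do
                    [ -d "$child" ] && basename "$child"
                done | LC_ALL=C sort > "$dir/listing"
            fi
            [ "$dir" = "$root" ] && break
            dir=$(dirname "$dir")
        done
    done
}
```

Existing `listing` files are left untouched, so the function can be re-run after each new archive is extracted into the local CDN; a listing that needs to pick up a newly added directory would have to be removed first.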