# Pulp3 RPM Advanced-Copy depsolving issue(s)

## Related issues

* ~~[1965942](https://bugzilla.redhat.com/show_bug.cgi?id=1965942) - warnings RE missing dependencies~~
  * ~~[9293](https://pulp.plan.io/issues/9293) (backport, CLOSED-CURRENT)~~
* [2003764](https://bugzilla.redhat.com/show_bug.cgi?id=2003764) - depsolving once-per-rpm is nonperformant
  * [9387](https://pulp.plan.io/issues/9387), PR [2123](https://github.com/pulp/pulp_rpm/pull/2123)
  * [9388](https://pulp.plan.io/issues/9388) Backport request
  * **symptoms** of this problem - closed-DUP of '3764/9387/9388
    * ~~[1934545](https://bugzilla.redhat.com/show_bug.cgi?id=1934545) - assert during depsolving/logging~~
      * ~~[9336](https://pulp.plan.io/issues/9336), [9381](https://pulp.plan.io/issues/9381) Backport request~~
    * ~~[1965936](https://bugzilla.redhat.com/show_bug.cgi?id=1965936) - memory use bz~~
      * ~~[9335](https://pulp.plan.io/issues/9335)~~
* [1995232](https://bugzilla.redhat.com/show_bug.cgi?id=1995232) - postgres-cpu/explain bz
  * [9331](https://pulp.plan.io/issues/9331), [9378](https://pulp.plan.io/issues/9378) Backport request

## Executive Summary

There are two underlying problems. One is the query performance raised in [1995232](https://bugzilla.redhat.com/show_bug.cgi?id=1995232). The other is that Pulp3 computes repoclosure for every single RPM added to a destination during a copy ([2003764](https://bugzilla.redhat.com/show_bug.cgi?id=2003764)).

Fixing '3764 addresses the majority of the depsolving problem.

## Issue/BZ shenanigans

### Final state desired

* 6.10 GA
  * Fix [1965942](https://bugzilla.redhat.com/show_bug.cgi?id=1965942) - warnings RE missing dependencies **[DONE]**
  * Fix [2003764](https://bugzilla.redhat.com/show_bug.cgi?id=2003764) - depsolving once-per-rpm is nonperformant
  * Backport '3764 to pulp-rpm 3.14
  * Close various BZs/issues/backport-requests around the symptoms as DUPS
* Post-6.10
  * Address [1995232](https://bugzilla.redhat.com/show_bug.cgi?id=1995232) - postgres-cpu/explain bz

## Observations

### Pulp issues

* Logging: "can't install" warnings aren't relevant to the repo-depsolve-closure case. PR submitted to remediate.
* assert/OOM signal handling can't happen: the code that needs to set a signal-handler isn't running in the Python main thread
* postgres query performance is...terrible.
* assert-cause still unclear

### Katello issue

* Filter-API doesn't do what the user expects
  * errata-exclude-before - pulls in ALL 32K RPMS
  * results in 4 copy-tasks
  * each task takes progressively longer, as dest fills
* Doing advanced-copy 'well' (or even 'acceptably') probably wants a new feature for next-year-katello, "advanced-copy", that sends what the user is actually asking for ("repo as of date-X plus security fixes not including any of the following RPMs")

## Investigation

* Satellite-CV-creation actually fails with a cancelled task, caused by this error message:
  ~~~
  pulpcore-worker-4[1160]: python3: ../src/rules.c:261: solver_addrule: Assertion `!p2 && d > 0' failed.
  pulpcore-worker-4[1160]: pulp [None]: pulpcore.tasking.pulpcore_worker:INFO: Cleaning up and canceling Task b4d6e7ec-bcff-4663-9ef5-599bd3bce24b
  ~~~
  * This error is **FATAL** - but rather than failing the task with a traceback, it **CANCELS** the task with no explanation
    * need to open an issue for this, it's...Rude
  * Have not recreated the fatal error above yet
    * see [1934545](https://bugzilla.redhat.com/show_bug.cgi?id=1934545)
* On standalone Pulp, can recreate the warnings by doing an advanced-copy, of RHEL7, of "all RPMs that are not associated with Advisories" (see recipes, below)
  * biggest thing Not Found: libc.so.6 (!!), 705 (!!!) times

Investigating on Pulp3-upstream shows that, when adding base-RPMs to advanced-copy (RHEL7-x86_64 and RHEL8-x86_64 tested), **all** of the RPMs that couldn't find their files were i686 (i.e. 32-bit) RPMs. Multi-arch strikes again?

RE assert and logging - we're going to need a SIGABRT handler instantiated prior to solver.solve(). The handler can log the problem more thoroughly; not sure how much access we have "inside" such a thing? Investigation needed.

* SQL performance during advanced copy is terrible.
  Here is an example of a single running postgres task - note the current elapsed time:
  ~~~
  5032 | 01:28:04.176268 | pulp | SELECT "core_content"."pulp_id", "core_content"."pulp_created", "core_content"."pulp_last_updated",
      "core_content"."pulp_type", "core_content"."upstream_id", "core_content"."timestamp_of_interest"
      FROM "core_content"
      LEFT OUTER JOIN "core_repositorycontent" ON ("core_content"."pulp_id" = "core_repositorycontent"."content_id")
      WHERE ("core_repositorycontent"."pulp_id" IN
          (SELECT U0."pulp_id" FROM "core_repositorycontent" U0
           INNER JOIN "core_repositoryversion" U2 ON (U0."version_added_id" = U2."pulp_id")
           LEFT OUTER JOIN "core_repositoryversion" U3 ON (U0."version_removed_id" = U3."pulp_id")
           WHERE (U0."repository_id" = '17e556a3-9e77-4ada-9b61-f5fa0f31b112'::uuid AND U2."number" <= 1
                  AND NOT (U3."number" <= 1 AND U3."number" IS NOT NULL)))
      AND "core_content"."pulp_id" IN ('577501a8-5c99-4980-9295-eb8ce1d8bbec'::uuid, '4a9a4042-484d-4f8b-a19e-612c848dab99'::uuid,
          'cfbefb76-5587-4404-99ba-2024e96f67b3'::uuid, '13a83bb6-8c05-4624-95ed-7b8f79cd8ee8'::uuid,
          'ea76d66d-948a-4080-94e7-5b4126561bda'::uuid, 'be0f23dc-872
  ~~~

## Useful Sat6 recipes

### basic functions

* Canceling a task:
  ~~~
  curl -X PATCH -d state=canceled \
    --cert /etc/pki/katello/certs/pulp-client.crt \
    --key /etc/pki/katello/private/pulp-client.key \
    https://$(hostname -f)/pulp/api/v3/tasks/UUID/
  ~~~
* Accessing postgres:
  ~~~
  [root@sat-r220-09 ~]# sudo su - postgres
  -bash-4.2$ psql pulpcore
  ~~~

### Preparing an advanced-copy call "by hand"

* Find all advisories with an updated-date prior to a specified date
  ~~~
  http :/pulp/api/v3/content/rpm/advisories/?repository_version=/pulp/api/v3/repositories/rpm/rpm/a534c985-4add-46f3-8ef1-49de3bbcff8b/versions/1/\&fields='pulp_href,updated_date'\&limit=5000 \
    | jq '.results[]
          | select(.updated_date < "2016-03-17")' \
    | jq .pulp_href > errata_config
  ~~~
* Find packages that aren't part of an UpdateRecord (**note: works when there's only one repo sync'd.** Needs work to find just "...for a specific repository")
  ~~~
  -bash-4.2$ psql pulpcore
  pulpcore=# select count(p.name)
               from rpm_package p
              where not exists (select 1
                                  from rpm_updatecollectionpackage ucp
                                 where ucp.name = p.name
                                   and ucp.epoch = p.epoch
                                   and ucp.version = p.version
                                   and ucp.release = p.release
                                   and ucp.arch = p.arch);
  ~~~
* Find content-ptr-id for all packages not in UpdateRecords:
  ~~~
  psql -U pulp -d pulp --host 127.0.0.1 \
    -c "select content_ptr_id from rpm_package p where not exists (select 1 from rpm_updatecollectionpackage ucp where ucp.name = p.name and ucp.epoch = p.epoch and ucp.version = p.version and ucp.release = p.release and ucp.arch = p.arch)" \
    > base_packages
  ~~~
* Prepare base_packages to hand off to /copy/:
  * Delete header/footer
  * Replace header with:
    ~~~
    [
      {
        "source_repo_version": "/pulp/api/v3/repositories/rpm/rpm/e9e67b6c-5a50-4a48-907b-5d527169c633/versions/1/",
        "dest_repo": "/pulp/api/v3/repositories/rpm/rpm/d0422d2b-e232-4f5c-9ec2-fdf2d737c5bf/",
        "content": [
    ~~~
  * Replace footer with:
    ~~~
        ]
      }
    ]
    ~~~
  * Replace UUIDs with `` "/pulp/api/v3/content/rpm/packages/UUID/",`` (no trailing comma on the last entry)
  * **EXAMPLE**:
    ~~~
    [
      {
        "source_repo_version": "/pulp/api/v3/repositories/rpm/rpm/e9e67b6c-5a50-4a48-907b-5d527169c633/versions/1/",
        "dest_repo": "/pulp/api/v3/repositories/rpm/rpm/d0422d2b-e232-4f5c-9ec2-fdf2d737c5bf/",
        "content": [
          "/pulp/api/v3/content/rpm/packages/ceb32b0a-11cf-41fc-ac06-96ed4b8a9cec/",
          "/pulp/api/v3/content/rpm/packages/6b78b4fd-5cb3-48cf-a3d1-a448a529fed4/",
          ....
          "/pulp/api/v3/content/rpm/packages/e18707da-dae6-4309-b523-be0393f72a65/"
        ]
      }
    ]
    ~~~
* Issue the /copy/ command:
  ~~~
  http POST :/pulp/api/v3/rpm/copy/ \
    dependency_solving=True \
    config:=@./base_packages
  ~~~

### Pulling useful info from journalctl

* Find the things missing-from depsolve warnings:
  ~~~
  journalctl | \
    grep "WARNING: Encountered problems solving dependencies, copy may be incomplete: package" | \
    awk -F"copy may be incomplete:" '{print $2}' | \
    awk -F" " '{print $4}' | \
    sort | uniq -c | sort -n
  ~~~
* Find the packages-looking-for-missing:
  ~~~
  journalctl | \
    grep "WARNING: Encountered problems solving dependencies, copy may be incomplete: package" | \
    awk -F"copy may be incomplete:" '{print $2}' | \
    awk -F" " '{print $2}' | \
    sort | uniq -c | sort -n
  ~~~