--- tags: advisories --- # Advisory Merging Woes ## Problem Statement When we resync a repo, copy advisories from one repo to another, or upload an adisory into an existing repo, it's possible that an advisory with the same "id" (e.g., "RHSA-2020:1234") already exists. We need to decide whether the incoming advisory is identical to the existing one (in which case we keep the old), a replacement for the existing one (in which case we keep the new), or a modification of the existing one's related lists (e.g., packages) (in which case we try to merge them). There have been some issues around advisory-conflict-resolution. Specifically, [7282](https://pulp.plan.io/issues/7282) points out a failure when uploading advisory-JSON attempting to "add to" an existing advisory. The problem is, at least in the upload case, we are attempting to make that decision **without access to the incoming advisory's package- or reference-lists.** This is...Bad. ## Current Flow Upload starts at the [pulp_rpm/app/serializers/advisory.create()](http://github.com/pulp/pulp_rpm/blob/master/pulp_rpm/app/serializers/advisory.py#L112-L132) call. ``validated_data`` is the incoming JSON file. At [pulp_rpm/app/serializers/advisory.create()](http://github.com/pulp/pulp_rpm/blob/master/pulp_rpm/app/serializers/advisory.py#L125), we are making a call to ``super().create(validated_data)`` - but we've already stripped pkglist and ref-list from the JSON via the ``pop()`` calls at #120 and #121. Our ``super()`` is a [pulpcore.plugin.serializer.content.NoArtifactContentUploadSerializer()](https://github.com/pulp/pulpcore/blob/master/pulpcore/plugin/serializers/content.py#L60-L69), where we pop any ``file`` entity from ``validated_data`` and then pass it up the chain. Advisories don't have files, so this is harmless. In the same file we are now in [UploadSerializerFieldsMixin](https://github.com/pulp/pulpcore/blob/master/pulpcore/plugin/serializers/content.py#L37-L54). Here we remove the ``repository`` field from ``validated_data``, create an in-memory ``UpdateRecord`` from ``validated_data``, a new in-memory repo-version for the specified repository, and then we add the new ``UpdateRecord`` to the new repo-version. Remember - at this point, our ``UpdateRecord`` has only its base metadata fields. ``new_version.add_content()`` eventually calls ``RepositoryContent.bulk_create()``, followed by falling into its [__exit()__](https://github.com/pulp/pulpcore/blob/master/pulpcore/app/models/repository.py#L857), which invokes [repository.finalize_new_version()](https://github.com/pulp/pulpcore/blob/master/pulpcore/app/models/repository.py#L866), which sends us back to [pulp_rpm land](https://github.com/pulp/pulp_rpm/blob/master/pulp_rpm/app/models/repository.py#L210) - which [**finally**](https://github.com/pulp/pulp_rpm/blob/master/pulp_rpm/app/models/repository.py#L242) calls [pulp_rpm.add.advisory.resolve_advisories()](https://github.com/pulp/pulp_rpm/blob/master/pulp_rpm/app/advisory.py#L27), which calls [resolve_advisory_conflicts()](https://github.com/pulp/pulp_rpm/blob/master/pulp_rpm/app/advisory.py#L108). Which relies on ``pkglist``, which we haven't had for half-a-dozen nested routines, to do advisory-resolution. And now as is traditional among our people, we sit down and have a good cry. ## Implications, and Infinite Sadness I think it's possible that this is only a problem for the advisory-upload case. (In fact, I'm pretty certain of that, because otherwise every sync and copy that involved advisories would have had, at best, 'odd' results, and at worst failed every time...) I haven't yet tried to dig through every other advisory-creation path at this level of detail, however, so I could be wrong. In the upload-case, though, we simply can't let this be the processing-path. Upload, I think, needs to create an in-memory copy of the incoming JSON, pkglists and all, compute its digest[0], dig up any existing advisory with the same name and/or digest in the target repository, and short-circuit to the ``resolve_advisory_conflict()`` code immediately to decide what to do. [0]There is a short-circuit for "incoming has the same digest as existing", which is to ignore the 'new' one and return the href of the existing instance. Or something like that. ## Notes from 12-MAR brainstorm * upload w/out repository may work around the issue * pulp_rpm/serializer/advisory prob needs its own Upload serializer (not the one from pulpcore) * think about UpdateRecord and digest constraint collisions * think about whether UpdateRecord needs to/can be saved until the pklist/reflist collections are built, in create - so that digest is correctly calculated/stored * evaluate where/how UpdateRecord and Friends are persisted, and whether the current ordering is a good thing or not * may be a ticket RE "granular digests" separate pkglist-digest, reflist-digest, and metadata-digest? * https://pulp.plan.io/issues/3942 * it adds another aspect, it suggests to calculate digest from fields which is likely better than from XML. At the same time out db structure/fields might change why advisory format changes very rarely. ## Reproducer Here's a quick script for getting to where you can make the problem happen: ``` #!/bin/bash pulp rpm repository destroy --name zoo pulp rpm remote destroy --name zoo pulp orphans delete REPO=`pulp rpm repository create --name zoo | jq -r '.pulp_href'` echo "REPO HREF $REPO" pulp rpm remote create --name zoo --url https://fixtures.pulpproject.org/rpm-updated-updateinfo/ pulp rpm repository sync --name zoo --remote zoo TASK=`http -f POST :/pulp/api/v3/content/rpm/advisories/ file@/home/vagrant/devel/bear_advisory repository=$REPO | jq -r '.task'` echo "TASK $TASK" http :$TASK echo "DO THIS NEXT:" echo "http -f POST :/pulp/api/v3/content/rpm/advisories/ file@/home/vagrant/devel/camel_advisory repository=$REPO | jq -r '.task'" ``` The data being used in the above script: **bear_advisory** ```json= { "issued": "2020-03-08 20:04:01", "id": "CEBA-2019--666", "type": "Bug Fix Advisory", "release": "1", "version": "1", "pkglist": [ { "packages": [ { "arch": "noarch", "epoch": "0", "filename": "bear-4.1-1.noarch.rpm", "name": "bear", "reboot_suggested": false, "relogin_suggested": false, "restart_suggested": false, "release": "1", "src": "http://www.fedoraproject.org", "sum": "", "sum_type": "", "version": "4.1" } ] } ], "severity": "", "description": "Not available", "reboot_suggested": false, "updated": "2020-03-08 20:04:01", "solution": "Not available", "fromstr": "centos-announce@centos.org" } ``` **camel_advisory** ```json= { "issued": "2020-03-08 20:04:01", "id": "CEBA-2019--666", "type": "Bug Fix Advisory", "release": "1", "version": "1", "pkglist": [ { "packages": [ { "arch": "noarch", "epoch": "0", "filename": "camel-0.1-1.noarch.rpm", "name": "camel", "reboot_suggested": false, "relogin_suggested": false, "restart_suggested": false, "release": "1", "src": "http://www.fedoraproject.org", "sum": "", "sum_type": "", "version": "0.1" } ] } ], "severity": "", "description": "Not available", "reboot_suggested": false, "updated": "2020-03-08 20:04:01", "solution": "Not available", "fromstr": "centos-announce@centos.org" } ``` ###### tags: `advisories`, `RPM`