# Pulp 3 Dependency Solving synopsis
## Use Cases
Red Hat customers with especially cautious risk profiles often do not blindly trust updates from the CDN. Instead, they use Satellite to manage and curate their own repositories. The primary use case for this is to apply particular security advisories to their repositories and make them available without pulling in unrelated changes.
However, if a package that is part of an advisory being applied to the custom repo requires new or updated dependencies, then those dependencies must be copied along with it.
The primary use case is thus: "As a user, I want to take an Errata/Advisory and copy it, along with every RPM package / module it references, and any non-present RPM dependencies and modules they depend on, to my custom repository."
## Code Walkthrough
Relevant files:
* https://github.com/pulp/pulp_rpm/blob/master/pulp_rpm/app/tasks/copy.py
* https://github.com/pulp/pulp_rpm/blob/master/pulp_rpm/app/depsolving.py
Starting with this code: https://github.com/pulp/pulp_rpm/blob/master/pulp_rpm/app/tasks/copy.py#L175-L207
The input parameters are stored in a list of mappings that looks roughly like the following:
```
[
{"source_repository": "rhel8-baseos", "destination_repository": "rhel8-custom-baseos", "units_to_copy": [...]},
{"source_repository": "rhel8-appstream", "destination_repository": "rhel8-custom-appstream", "units_to_copy": [...]},
]
```
The user provides a list of pairs of "source" and "destination" repositories, along with the list of content (RPMs, Modules, PackageGroups, Advisories) which they would like to copy between each pair of repos.
Many pairs can be specified at once, because in order to fully resolve the dependencies for packages in (e.g.) "rhel8-appstream" you would likely need "rhel8-baseos" as well.
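As a rough illustration of how such a configuration might be unpacked before any solving happens, here is a minimal sketch. The function name `split_config`, the unit identifiers, and the rule that a source repository may only be listed once are assumptions made for this example and are not taken from the pulp_rpm code.

```python
from typing import Dict, List, Tuple

CopyConfig = List[dict]


def split_config(config: CopyConfig) -> Tuple[Dict[str, str], Dict[str, List[str]]]:
    """Return (source -> destination, source -> units requested from that source)."""
    dest_for_source: Dict[str, str] = {}
    units_for_source: Dict[str, List[str]] = {}
    for entry in config:
        source = entry["source_repository"]
        # Assumption for this sketch: each source repository is listed only once.
        if source in dest_for_source:
            raise ValueError(f"source repository listed twice: {source}")
        dest_for_source[source] = entry["destination_repository"]
        units_for_source[source] = list(entry["units_to_copy"])
    return dest_for_source, units_for_source


# Hypothetical input in the shape shown above; the unit names are made up.
example = [
    {
        "source_repository": "rhel8-baseos",
        "destination_repository": "rhel8-custom-baseos",
        "units_to_copy": ["advisory-a"],
    },
    {
        "source_repository": "rhel8-appstream",
        "destination_repository": "rhel8-custom-appstream",
        "units_to_copy": ["advisory-b", "module-c"],
    },
]

dest_for_source, units_for_source = split_config(example)
```

Keeping the source-to-destination pairing in its own dictionary is convenient later, when solved content has to be routed to the destination that belongs to the repository it came from.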
For each repository that was listed as a "source", we create a new libsolv repo and load it with the contents of the Pulp repository. We track the name of the "libsolv repo" so that we can map it backwards to the original Pulp repo it is representing. For each repository that was listed as a destination, we load its contents into a single libsolv repo which we will later mark as the "installed" repository.
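The bookkeeping described here can be modelled without libsolv at all. The sketch below uses a plain Python class as a stand-in for the pool; `PoolModel`, the `source:` name prefix, and the `@installed` repo name are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional

INSTALLED = "@installed"  # hypothetical name for the combined destination repo


class PoolModel:
    """Pure-Python stand-in for a libsolv pool (not the real bindings)."""

    def __init__(self) -> None:
        self.repos: Dict[str, List[str]] = {}
        self.installed: Optional[str] = None
        # libsolv repo name -> Pulp repo it represents
        self.pulp_repo_for: Dict[str, str] = {}

    def load_source(self, pulp_repo: str, units: List[str]) -> str:
        """Create one solver repo per source and remember where it came from."""
        name = f"source:{pulp_repo}"
        self.repos[name] = list(units)
        self.pulp_repo_for[name] = pulp_repo
        return name

    def load_destination(self, units: List[str]) -> None:
        """All destinations are merged into a single 'installed' repo."""
        self.repos.setdefault(INSTALLED, []).extend(units)
        self.installed = INSTALLED


pool = PoolModel()
baseos = pool.load_source("rhel8-baseos", ["glibc", "bash"])
appstream = pool.load_source("rhel8-appstream", ["httpd"])
pool.load_destination(["bash"])   # contents of rhel8-custom-baseos
pool.load_destination(["nginx"])  # contents of rhel8-custom-appstream
```

Note how the destinations lose their identity once merged, which is why the source-to-destination pairing has to be tracked outside the solver.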
Now following this code: https://github.com/pulp/pulp_rpm/blob/master/pulp_rpm/app/depsolving.py#L581-L712
The process of "loading contents" entails querying Pulp's PostgreSQL database for: NEVRA, filelists, vendor, "requires" and "provides" for each RPM; NSVCA, dependencies and artifacts for each Module; and module, stream and profiles for each Module-default.
This is done "manually" for each unit; we do not load the solver directly from RPM metadata.
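To give a feel for what "manually" means, the sketch below turns one database row into a plain record holding the fields a solvable needs. It is a simplified model, not the pulp_rpm implementation: `SolvableRecord`, `format_evr`, `rpm_row_to_record`, and the row keys are hypothetical names, and dropping a zero epoch from the EVR string is a simplification made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SolvableRecord:
    name: str
    evr: str
    arch: str
    vendor: Optional[str] = None
    provides: List[str] = field(default_factory=list)
    requires: List[str] = field(default_factory=list)
    files: List[str] = field(default_factory=list)


def format_evr(epoch, version: str, release: str) -> str:
    """Build an 'epoch:version-release' string, omitting a missing or zero epoch."""
    prefix = f"{epoch}:" if epoch not in (None, "", "0", 0) else ""
    return f"{prefix}{version}-{release}"


def rpm_row_to_record(row: dict) -> SolvableRecord:
    evr = format_evr(row.get("epoch"), row["version"], row["release"])
    return SolvableRecord(
        name=row["name"],
        evr=evr,
        arch=row["arch"],
        vendor=row.get("vendor"),
        # Every package provides itself, in addition to its listed provides.
        provides=[f"{row['name']} = {evr}"] + list(row.get("provides", [])),
        requires=list(row.get("requires", [])),
        files=list(row.get("files", [])),
    )


# Hypothetical row; the values are illustrative only.
row = {
    "name": "bash", "epoch": "0", "version": "4.4.19", "release": "12.el8",
    "arch": "x86_64", "provides": ["/bin/sh"], "requires": ["glibc"],
}
record = rpm_row_to_record(row)
```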
For each RPM in the repository, the libsolv "solvable" is created thusly: https://github.com/pulp/pulp_rpm/blob/master/pulp_rpm/app/depsolving.py#L116-L259
For each Module in the repository, the libsolv "solvable" is created thusly: https://github.com/pulp/pulp_rpm/blob/master/pulp_rpm/app/depsolving.py#L262-L451
For each Module-default in the repository, the libsolv "solvable" is created thusly: https://github.com/pulp/pulp_rpm/blob/master/pulp_rpm/app/depsolving.py#L454-L509
The libsolv ID of each newly created libsolv solvable is then registered in a mapping which keeps track of the primary key of the Pulp "content" which it corresponds to. This is done so that once we have the list of dependency-solved "solvables" at the end of the process, we can then copy the correct Pulp "content" between repositories. The mapping and registration code is here: https://github.com/pulp/pulp_rpm/blob/master/pulp_rpm/app/depsolving.py#L512-L542
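A two-way mapping of this kind can be sketched in a few lines. The class below is an illustrative stand-in rather than the linked code; `SolvableRegistry` and its method names are hypothetical. It keys units by repository as well as primary key, on the assumption that the same Pulp content may be present in more than one loaded repository.

```python
from typing import Dict, Tuple

UnitKey = Tuple[str, str]  # (repo name, Pulp content primary key)


class SolvableRegistry:
    """Two-way lookup between solvable IDs and Pulp content."""

    def __init__(self) -> None:
        self._unit_for_solvable: Dict[int, UnitKey] = {}
        self._solvable_for_unit: Dict[UnitKey, int] = {}

    def register(self, solvable_id: int, repo_name: str, content_pk: str) -> None:
        key = (repo_name, content_pk)
        if solvable_id in self._unit_for_solvable or key in self._solvable_for_unit:
            raise ValueError(f"already registered: {solvable_id} / {key}")
        self._unit_for_solvable[solvable_id] = key
        self._solvable_for_unit[key] = solvable_id

    def solvable_id(self, repo_name: str, content_pk: str) -> int:
        return self._solvable_for_unit[(repo_name, content_pk)]

    def unit(self, solvable_id: int) -> UnitKey:
        return self._unit_for_solvable[solvable_id]


registry = SolvableRegistry()
# The same content can appear in a source repo and in the installed repo,
# where it is represented by two distinct solvables.
registry.register(2, "rhel8-baseos", "pk-glibc")
registry.register(3, "@installed", "pk-glibc")
```

The forward direction is used when turning user requests into solver jobs, and the reverse direction when turning solver results back into content to copy.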
Finally, once the libsolv Pool has been completely created and populated with solvables, it is finalized: https://github.com/pulp/pulp_rpm/blob/master/pulp_rpm/app/depsolving.py#L581-L599
Then we perform dependency solving. For each Pulp "content" which the user asked to be copied, we look up the solvable ID in the mapping we created previously. We create an install job for each solvable, with a special case for solvables that represent modules: for those we add an additional install job for each module artifact. We take the entire list of install jobs and solve for them all at once [0]. From the resulting solver transaction we read the "newsolvables()", map those solvable IDs back to the original Pulp "content" IDs, and use the mapping to determine which repository each one came from. Each is then copied to the "destination" repository paired with the "source" repository it originated from. https://github.com/pulp/pulp_rpm/blob/master/pulp_rpm/app/depsolving.py#L738-L841
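The two bookkeeping steps around the solver call, expanding requests into install targets and routing the results, can be sketched in plain Python. The solver itself is left out; the solvable IDs, the `module_artifacts` table, and both function names are hypothetical stand-ins for what the real code obtains from libsolv.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

UnitKey = Tuple[str, str]  # (source repo name, Pulp content primary key)


def build_install_targets(
    requested: Iterable[UnitKey],
    solvable_for_unit: Dict[UnitKey, int],
    module_artifacts: Dict[int, List[int]],
) -> List[int]:
    """One target per requested unit, plus one per artifact of each module."""
    targets: List[int] = []
    for key in requested:
        solvable_id = solvable_for_unit[key]
        targets.append(solvable_id)
        targets.extend(module_artifacts.get(solvable_id, []))
    return targets


def plan_copies(
    new_solvable_ids: Iterable[int],
    unit_for_solvable: Dict[int, UnitKey],
    dest_for_source: Dict[str, str],
) -> Dict[str, Set[str]]:
    """Group solved content by the destination paired with its source repo."""
    plan: Dict[str, Set[str]] = defaultdict(set)
    for solvable_id in new_solvable_ids:
        source_repo, content_pk = unit_for_solvable[solvable_id]
        plan[dest_for_source[source_repo]].add(content_pk)
    return dict(plan)


# Illustrative data: solvable 10 is a module whose artifacts are 11 and 12.
solvable_for_unit = {("appstream", "pk-module"): 10, ("baseos", "pk-glibc"): 20}
unit_for_solvable = {
    10: ("appstream", "pk-module"), 11: ("appstream", "pk-rpm-a"),
    12: ("appstream", "pk-rpm-b"), 20: ("baseos", "pk-glibc"),
}
targets = build_install_targets(
    [("appstream", "pk-module")], solvable_for_unit, {10: [11, 12]}
)
# Pretend the solver reported the targets plus one pulled-in dependency (20).
plan = plan_copies(
    targets + [20], unit_for_solvable,
    {"appstream": "custom-appstream", "baseos": "custom-baseos"},
)
```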
## Key points
* The fact that libsolv only has a single "installed" repo at any time doesn't map very well to what we need to accomplish, and results in extra relationship tracking.
* At least with the current architecture, we need to provide the information used to initialize each solvable ourselves, rather than parsing existing metadata.
* At the end of the process, we need to know which solvables correspond to which Pulp "content".
[0] We had at one point been trying to do it one solvable at a time, but it led to severe memory leaks and horrifically bad performance.
###### tags: `RPM`