# Pulp Replication
Pulp CLI configuration supports defining multiple Pulp instances [0].
An Ansible role can easily talk to different servers [1].

[0] https://docs.pulpproject.org/pulp_cli/configuration/#config-profiles
[1] https://github.com/mdellweg/squeezer/tree/replicate_pulp
## Problem Statement
Users have trouble serving the exact same content in multiple data centers (DC) or geographies (Geo).
## The Current Solution
Users set up a Pulp in each DC or Geo and configure them to sync from each other. This takes a lot of work.
## Opportunity
Make configuring one Pulp to "be just like another" easier.
## Terminology
Primary Pulp - The Pulp where content originates from
Replica Pulp - The Pulp that receives its content from the Primary Pulp
Replica Repo - A Repo on a Replica Pulp that is configured to sync from a Repo on a Primary Pulp
Replica Distribution - A Distribution on Replica Pulp that has the same base_path and Repository pairing as a Distribution on the Primary Pulp
Replica Content Guard - A Content Guard on a Replica Pulp that is configured to guard a Replica Distribution the same way as a Content Guard on the Primary Pulp's Distribution
Background Sync - An on_demand sync followed by an immediate sync
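
The Background Sync definition can be sketched as two passes over the same repository. This is a minimal illustration, not an existing Pulp API: `background_sync` and the `sync(repository, policy)` callable are hypothetical names, and only the policy strings `on_demand` and `immediate` come from Pulp's remote download policies. The intent is that the first pass makes content servable quickly, and the second pass then downloads the artifacts.

```python
from typing import Callable, List

# Download policy names as accepted by Pulp remotes.
ON_DEMAND = "on_demand"
IMMEDIATE = "immediate"


def background_sync(repository: str, sync: Callable[[str, str], None]) -> List[str]:
    """Run an on_demand sync, then an immediate sync, for one repository.

    `sync(repository, policy)` is a hypothetical callable standing in for
    whatever sets the remote's policy and triggers the sync task.
    """
    policies_run = []
    for policy in (ON_DEMAND, IMMEDIATE):
        sync(repository, policy)
        policies_run.append(policy)
    return policies_run


# Usage with a stub that only records the calls it receives.
calls = []
policies = background_sync(
    "replica-repo", lambda repo, policy: calls.append((repo, policy))
)
```

Passing the sync step in as a callable keeps the ordering logic separate from how a real deployment would talk to the Replica Pulp.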
## Use Cases
As a user I can...
* declare a Replica Repo on a Replica Pulp that also has a remote which will sync from a Primary Pulp
* declare a distribution on a Replica Pulp that matches the repo and base_path of a distribution on a Primary Pulp and any associated content guards
* trigger a background sync of a Replica Repo on a Replica Pulp
* configure a periodic task that creates Replica Repos and Replica Distributions for all Repositories and Distributions from a Primary Pulp
* configure a periodic task that triggers background syncs every N minutes
## Proposal for CLI
Do the simplest, highest-value thing first and deliver it as a fully working feature.
* Create a Replica Repo
    * Assumptions
        * Each repository has only one distribution associated with it
    * CLI command
        * `pulp file repository replicate --replica-profile <profile name> --name <repository name>`
    * This command will do:
        - Find the repository and associated distribution in the default profile pulp
        - Create a remote on the replica pulp pointing to the base_url of the Distribution on the default pulp
        - Create a repository on the replica pulp that has the same attributes as the one on the default pulp
        - Create a distribution on the replica pulp that matches the distribution on the default pulp
        - Sync the repository on the replica pulp
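
The steps the proposed command would perform can be sketched as a pure planning function. This is a rough illustration under stated assumptions, not the CLI's implementation: `plan_replica` and `READ_ONLY_FIELDS` are hypothetical names, the input dictionaries only mimic the shape of Pulp API responses, and which repository attributes are safe to copy is an assumption.

```python
from typing import Any, Dict

# Assumption: server-generated fields that should not be copied to the replica.
READ_ONLY_FIELDS = {"pulp_href", "pulp_created", "versions_href", "latest_version_href"}


def plan_replica(
    primary_repo: Dict[str, Any], primary_dist: Dict[str, Any]
) -> Dict[str, Dict[str, Any]]:
    """Describe the objects to create on the replica Pulp for one repository."""
    name = primary_repo["name"]
    # The remote points at the base_url of the primary's distribution.
    # (A file remote would likely need the manifest path appended; omitted here.)
    remote = {"name": name, "url": primary_dist["base_url"]}
    # Copy the repository attributes, dropping read-only fields and the
    # primary's own remote reference.
    repository = {
        key: value
        for key, value in primary_repo.items()
        if key not in READ_ONLY_FIELDS and key != "remote"
    }
    # The replica distribution mirrors the primary's name and base_path.
    distribution = {
        "name": primary_dist["name"],
        "base_path": primary_dist["base_path"],
    }
    # Final step: sync the new repository from the new remote.
    sync = {"repository": name, "remote": name}
    return {
        "remote": remote,
        "repository": repository,
        "distribution": distribution,
        "sync": sync,
    }


# Usage with made-up data shaped like primary API responses.
plan = plan_replica(
    {
        "pulp_href": "/pulp/api/v3/repositories/file/file/0001/",
        "name": "foo",
        "description": None,
        "autopublish": True,
        "remote": None,
    },
    {
        "name": "foo-dist",
        "base_path": "foo",
        "base_url": "https://primary.example.com/pulp/content/foo/",
    },
)
```

Separating planning from execution would let the command show what it intends to create before touching the replica.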
## Proposal for Pulp API
- PulpServer API with full CRUD
    - base_url
    - username
    - password
    - api_root
    - cert
    - key
    - verify_ssl
    - label_to_replicate
- 'Replicate' action on the PulpServer API will dispatch a Task Group that will do the following:
    - If label_to_replicate is specified:
        - Search for all distributions with the specified label
    - Create a remote, repository, and distribution for each discovered distribution with autopublish enabled
    - Sync each repository
    - If no label_to_replicate is specified, replicate all distributions.
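
The 'Replicate' action described above might be sketched as a selection step followed by per-distribution task planning. Everything here is illustrative: `select_distributions` and `plan_task_group` are hypothetical helpers, the task dictionaries are not a real Task Group format, and treating label_to_replicate as a key looked up in each distribution's `pulp_labels` is an assumption, since the proposal does not define the label format.

```python
from typing import Any, Dict, List, Optional


def select_distributions(
    distributions: List[Dict[str, Any]], label_to_replicate: Optional[str] = None
) -> List[Dict[str, Any]]:
    """Return distributions carrying the label, or all of them if no label is set."""
    if not label_to_replicate:
        return list(distributions)
    return [
        dist
        for dist in distributions
        if label_to_replicate in (dist.get("pulp_labels") or {})
    ]


def plan_task_group(
    distributions: List[Dict[str, Any]], label_to_replicate: Optional[str] = None
) -> List[Dict[str, Any]]:
    """List the create and sync steps for every selected upstream distribution."""
    tasks: List[Dict[str, Any]] = []
    for dist in select_distributions(distributions, label_to_replicate):
        name = dist["name"]
        tasks.append({"action": "create_remote", "name": name, "url": dist["base_url"]})
        # Repositories are created with autopublish enabled, per the proposal.
        tasks.append({"action": "create_repository", "name": name, "autopublish": True})
        tasks.append(
            {"action": "create_distribution", "name": name, "base_path": dist["base_path"]}
        )
        tasks.append({"action": "sync", "repository": name})
    return tasks


# Usage with two made-up upstream distributions, one of them labelled.
upstream = [
    {
        "name": "a",
        "base_path": "a",
        "base_url": "https://primary.example.com/pulp/content/a/",
        "pulp_labels": {"replicate": "true"},
    },
    {
        "name": "b",
        "base_path": "b",
        "base_url": "https://primary.example.com/pulp/content/b/",
        "pulp_labels": {},
    },
]
labelled_tasks = plan_task_group(upstream, "replicate")
all_tasks = plan_task_group(upstream)
```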
## Questions
- What should happen when an upstream distribution does not point to any repository or publication? Syncing would produce a 404.
- For RPM repositories, which sync_policy should be used? 'mirror_content_only' or 'mirror_complete'?
## Differences between Remotes
RpmRemote
- sles_auth_token
UlnRemote
- uln_server_base_url
AptRemote
- distributions
- components
- architectures
- sync_sources
- sync_udebs
- sync_installer
- gpgkey
- ignore_missing_package_indices
ContainerRemote
- upstream_name
- include_foreign_layers
- include_tags
- exclude_tags
- sigstore
RoleRemote (ansible)
CollectionRemote (ansible)
- requirements_file
- auth_url
- token
- sync_dependencies
- signed_only
GitRemote (ansible)
- metadata_only
- git_ref
## Differences between Repositories
RpmRepository
- metadata_signing_service
- original_checksum_types
- last_sync_details
- retain_package_versions
- autopublish
- metadata_checksum_type
- package_checksum_type
- gpgcheck
- repo_gpgcheck
- sqlite_metadata
FileRepository
- manifest
- autopublish
AptRepository
ContainerRepository
- manifest_signing_service
AnsibleRepository
- last_synced_metadata_time
## Differences in Distributions
ContainerDistribution
- namespace
- private
- description