# Pulp Replication

Pulp CLI configuration supports defining multiple Pulp instances [0]. An Ansible role can easily talk to different servers [1].

[0] https://docs.pulpproject.org/pulp_cli/configuration/#config-profiles

[1] https://github.com/mdellweg/squeezer/tree/replicate_pulp

## Problem Statement

Users have trouble serving the exact same content in multiple data centers (DC) or geographies (Geo).

## The Current Solution

Users set up a Pulp in each DC or Geo and configure them to sync from each other. This takes a lot of work.

## Opportunity

Make configuring one Pulp to "be just like another" easier.

## Terminology

* Primary Pulp - The Pulp where content originates
* Replica Pulp - The Pulp that receives its content from the Primary Pulp
* Replica Repo - A Repo on a Replica Pulp that is configured to sync from a Repo on a Primary Pulp
* Replica Distribution - A Distribution on a Replica Pulp that has the same base_path and Repository pairing as a Distribution on the Primary Pulp
* Replica Content Guard - A Content Guard on a Replica Pulp that is configured to guard a Replica Distribution the same way as a Content Guard on the Primary Pulp's Distribution
* Background Sync - An on_demand sync followed by an immediate sync

## Use Cases

As a user I can...

* declare a Replica Repo on a Replica Pulp that also has a remote which will sync from a Primary Pulp
* declare a distribution on a Replica Pulp that matches the repo and base_path of a distribution on a Primary Pulp and any associated content guards
* trigger a background sync on a Replica Pulp's Replica Repo
* configure a periodic task that creates Replica Repos and Replica Distributions for all Repositories and Distributions from a Primary Pulp
* configure a periodic task that triggers background syncs every N minutes

## Proposal for CLI

Do the simplest, highest-value thing first and deliver that as a fully working thing.
* Create a Replica Repo
    * Assumptions
        * Each repository has only one distribution associated with it
    * CLI command
        * `pulp file repository replicate --replica-profile <profile name> --name <repository name>`
    * This command will:
        - Find the repository and associated distribution in the default profile pulp
        - Create a remote on the replica pulp pointing to the base_url of the Distribution on the default pulp
        - Create a repository on the replica pulp that has the same attributes as the one on the default pulp
        - Create a distribution on the replica pulp that matches the distribution on the default pulp
        - Sync the repository on the replica pulp

## Proposal for Pulp API

- PulpServer API with full CRUD
    - base_url
    - username
    - password
    - api_root
    - cert
    - key
    - verify_ssl
    - label_to_replicate
- 'Replicate' action on the PulpServer API will dispatch a Task Group that will do the following:
    - If label_to_replicate is specified:
        - Search for all distributions with the specified label
        - Create a remote, repository, and distribution for each discovered distribution with autopublish enabled
        - Sync each repository
    - If no label_to_replicate is specified, replicate all distributions.

## Questions

- What should happen when an upstream distribution does not point to any repository or publication? Syncing would produce a 404.
- For RPM repositories, which sync_policy should be used? 'mirror_content_only' or 'mirror_complete'?
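
The 'Replicate' action described in the Pulp API proposal can be sketched as a pure planning step that decides which remote, repository, and distribution to create on the Replica Pulp. This is only an illustration, not the Pulp implementation: the `Distribution` and `ReplicaPlan` classes, the `plan_replication` function, and the example URLs are all hypothetical. Skipping distributions that serve no repository is an assumption made here, since that case is still an open question.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Distribution:
    """Hypothetical view of a distribution on the Primary Pulp."""
    name: str
    base_path: str
    base_url: str
    repository: Optional[str]  # None when the distribution serves no repository
    labels: dict = field(default_factory=dict)


@dataclass
class ReplicaPlan:
    """What would be created on the Replica Pulp for one upstream distribution."""
    remote_url: str
    repository_name: str
    distribution_name: str
    base_path: str


def plan_replication(distributions, label_to_replicate=None):
    """Return one ReplicaPlan per upstream distribution selected for replication."""
    plans = []
    for dist in distributions:
        # With a label set, only labeled distributions are replicated;
        # without one, every distribution is a candidate.
        if label_to_replicate is not None and label_to_replicate not in dist.labels:
            continue
        # Assumption of this sketch: skip distributions with nothing to sync,
        # because syncing from them would produce a 404.
        if dist.repository is None:
            continue
        plans.append(
            ReplicaPlan(
                remote_url=dist.base_url,  # the remote points at the upstream base_url
                repository_name=dist.repository,
                distribution_name=dist.name,
                base_path=dist.base_path,  # same base_path as on the Primary Pulp
            )
        )
    return plans


upstream = [
    Distribution("fedora", "fedora/38", "https://primary.example.com/pulp/content/fedora/38/",
                 "fedora-38", {"replicate": "true"}),
    Distribution("tools", "internal/tools", "https://primary.example.com/pulp/content/internal/tools/",
                 "tools"),
    Distribution("empty", "empty", "https://primary.example.com/pulp/content/empty/",
                 None, {"replicate": "true"}),
]

labeled_plans = plan_replication(upstream, label_to_replicate="replicate")
all_plans = plan_replication(upstream)
for plan in labeled_plans:
    print(plan.repository_name, "<-", plan.remote_url)
```

Keeping the planning step separate from the API calls that create and sync the objects would make the selection rules easy to test without a running Pulp.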
## Difference between remotes

RpmRemote
- sles_auth_token

UlnRemote
- uln_server_base_url

AptRemote
- distributions
- components
- architectures
- sync_sources
- sync_udebs
- sync_installer
- gpgkey
- ignore_missing_package_indices

ContainerRemote
- upstream_name
- include_foreign_layers
- include_tags
- exclude_tags
- sigstore

RoleRemote (ansible)

CollectionRemote (ansible)
- requirements_file
- auth_url
- token
- sync_dependencies
- signed_only

GitRemote (ansible)
- metadata_only
- git_ref

## Differences between Repositories

RpmRepository
- metadata_signing_service
- original_checksum_types
- last_sync_details
- retain_package_versions
- autopublish
- metadata_checksum_type
- package_checksum_type
- gpgcheck
- repo_gpgcheck
- sqlite_metadata

FileRepository
- manifest
- autopublish

AptRepository

ContainerRepository
- manifest_signing_service

AnsibleRepository
- last_synced_metadata_time

## Differences in Distributions

ContainerDistribution
- namespace
- private
- description
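
One possible way to use the per-plugin remote fields listed under "Difference between remotes" is a lookup table that separates plugin-specific attributes from everything else. The table contents below come from that list. The `split_remote_fields` helper and the example data are hypothetical, and the sketch takes no position on which fields a Replica Pulp should actually carry over.

```python
# Plugin-specific remote fields, as listed in this document.
PLUGIN_REMOTE_FIELDS = {
    "RpmRemote": ("sles_auth_token",),
    "UlnRemote": ("uln_server_base_url",),
    "AptRemote": (
        "distributions", "components", "architectures", "sync_sources",
        "sync_udebs", "sync_installer", "gpgkey", "ignore_missing_package_indices",
    ),
    "ContainerRemote": (
        "upstream_name", "include_foreign_layers", "include_tags",
        "exclude_tags", "sigstore",
    ),
    "RoleRemote": (),
    "CollectionRemote": (
        "requirements_file", "auth_url", "token", "sync_dependencies", "signed_only",
    ),
    "GitRemote": ("metadata_only", "git_ref"),
}


def split_remote_fields(remote_type, data):
    """Split a remote's attributes into (shared, plugin_specific) dicts."""
    specific_names = set(PLUGIN_REMOTE_FIELDS[remote_type])
    specific = {k: v for k, v in data.items() if k in specific_names}
    shared = {k: v for k, v in data.items() if k not in specific_names}
    return shared, specific


# Hypothetical attributes of an AptRemote.
apt_remote = {
    "name": "debian",
    "url": "https://deb.example.com/debian/",
    "distributions": "bookworm",
    "architectures": "amd64",
}

shared, specific = split_remote_fields("AptRemote", apt_remote)
print(sorted(shared), sorted(specific))
```

The same table shape would work for the repository and distribution differences.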