Pulp3 Deployment Considerations

Current Usage

Pulp2 used for a large deployment, serves content to
* Pulls down content from content sources, e.g. RH or other channels
* Snapshot content on bi-weekly or weekly cadence
* Multiple PoPs
* Desiring to roll out RHEL 9
* Custom tooling to organize the repos and promotion using Pulp APIs
* Performs some quality checks, e.g. linting, signature checks, etc.
* Copies content between repos
* Uses rsync distributor to a webserver

Goal

  • Desiring the ability to have the content live natively on the cloud instead of having cloud
  • Having PoPs serve content when they are disconnected from the other PoPs

Use Cases

Snapshot Use Cases

  • As a user I can …
    • Define snapshot Red Hat CDN content via console.redhat.com to
    • Easily connect systems to any snapshot with my existing RH credentials

Point of Presence Use Cases

  • As a user I can …
    • Launch a point of presence (PoP) which will auto-register with console.redhat.com
    • Configure the PoP to sync one or more c.rh.c snapshot repositories
      • On Demand - Metadata only, binary data delivered as pull-through cache
      • Full Sync - Metadata and binary data synchronized

Operator notes

Architecture

Questions (Pulp)

  • "quality check the package"
    • Won't clash with existing package name
    • has changelog
    • signed with the correct key
  • How much third-party content, custom content
    • Mostly Red Hat
  • What are the primary compose workflows
    • Repos are managed as bundles, treated in a sense as immutable snapshots
    • Only use newest versions, no "incremental update" with errata
  • Is rollback an aspect of the Pulp3 feature set that is useful?
    • Yes, but currently the Pulp2 distributor allowed them to publish a point in time. It did take a long time though
  • If Pulp3 had an Rsync Exporter (like the Pulp2 rsync distributor), would you use that instead of launching a container-based Pulp on the PoP?
    • One is a push model, the other is a pull model
  • Filesystem export + Rsync, or native Rsync
  • What is the high-availability need?
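The "quality check the package" item above lists three checks: no clash with an existing package name, a changelog is present, and the package is signed with the correct key. A minimal sketch of how such a gate might be expressed follows. The `Package` type, its fields, and the `quality_check` function are hypothetical names for illustration only; they are not part of Pulp or of the custom tooling discussed in these notes.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    """Hypothetical, simplified view of the package metadata being checked."""
    name: str
    changelog: list = field(default_factory=list)
    signing_key_id: str = ""


def quality_check(pkg, existing_names, trusted_key_ids):
    """Return a list of failures; an empty list means the package passes."""
    failures = []
    # Check 1: the name must not clash with a package that already exists.
    if pkg.name in existing_names:
        failures.append(f"name clash: {pkg.name}")
    # Check 2: the package must carry at least one changelog entry.
    if not pkg.changelog:
        failures.append("missing changelog")
    # Check 3: the signing key must be one of the trusted keys.
    if pkg.signing_key_id not in trusted_key_ids:
        failures.append(f"untrusted or missing signing key: {pkg.signing_key_id or 'none'}")
    return failures
```

Returning a list of failures rather than a boolean lets the promotion tooling report every problem in one pass.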

AI

  • paul: Issue discovered: checksums of pulp_rpm repos aren't available for on_demand repos. Need a reproducer reported
    • How to reproduce:
      • Create a repo
      • Create a remote
      • Sync the repo with policy on_demand
  • bmbouter: discuss with pulpcore if we can prioritize 1817 or 3155
    • biggest issue is security - how do we make sure sensitive data is always censored appropriately
  • bmbouter: to organize a cost estimate calculator
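The reproducer steps listed above (create a repo, create a remote, sync with on_demand) could be scripted roughly as follows. This is a sketch under assumptions: it assumes the default pulp_rpm REST paths under `/pulp/api/v3/`, it sets the `on_demand` policy on the remote, and the host, names, and upstream URL are placeholders. The function only builds the requests; sending them (and authenticating) is left to whatever HTTP client is in use.

```python
from urllib.parse import urljoin

API_ROOT = "/pulp/api/v3/"  # assumed default API root; adjust if customized


def build_reproducer_requests(base_url, repo_name, remote_name, upstream_url):
    """Build the (method, url, json_body) tuples for the reproducer steps."""
    api = urljoin(base_url, API_ROOT)
    # Step 1: create a repo.
    create_repo = ("POST", urljoin(api, "repositories/rpm/rpm/"), {"name": repo_name})
    # Step 2: create a remote whose download policy is on_demand.
    create_remote = (
        "POST",
        urljoin(api, "remotes/rpm/rpm/"),
        {"name": remote_name, "url": upstream_url, "policy": "on_demand"},
    )

    # Step 3: sync. The hrefs are only known after steps 1 and 2 return,
    # so this request is built lazily from the values in those responses.
    def sync_request(repo_href, remote_href):
        return ("POST", urljoin(base_url, repo_href + "sync/"), {"remote": remote_href})

    return create_repo, create_remote, sync_request
```

After the sync task finishes, the reproducer would compare the checksums reported for the repo's content against the upstream metadata.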

Next

March 20, 2024

Have a PR open to fix replication bug: https://github.com/pulp/pulpcore/pull/5140

March 6, 2024

Need to prioritize https://github.com/pulp/pulpcore/issues/4637

Nov 1

  • pulpcon is next week Nov 6 - 9, agenda here
  • slides for multi-geo pulpcon talk here
  • pulpcore 3.40 released
    • now contains Pulp file
    • will require an update to any plugin for compatibility reasons, e.g. pulp_rpm
    • upgrading should still be done with a planned outage; in the future it can be done online
  • note the pulp-oci-images now runs the migrations as a separate container
  • what to do with a replica server that has had changes made to it?
    • please file a bug on this and we'll look at it

Oct 18

  • Production is going well
  • How to store some arbitrary data on a Repository, e.g. notes about a specific package being present in a repository
    • recommendation: use the label API on a Repository and use NEVRA as the key and whatever needs to be stored as the value
  • Telemetry Update
  • Pulpcon coming up Nov 6-9th
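The label recommendation above (NEVRA as the key, the note as the value) might look like the sketch below. It assumes labels are sent as a `pulp_labels` mapping in a PATCH body on the Repository; the helper names are hypothetical. Pulp may restrict which characters are allowed in label keys, so a raw NEVRA string may need to be normalized first; that is worth verifying against the installed pulpcore version.

```python
def nevra(name, epoch, version, release, arch):
    """Format a package identity as name-epoch:version-release.arch."""
    return f"{name}-{epoch}:{version}-{release}.{arch}"


def label_patch_payload(existing_labels, package_nevra, note):
    """Return a PATCH body that adds one NEVRA-keyed note to the labels.

    The existing labels are copied so the caller's dict is not mutated,
    and so labels already on the repository are preserved in the update.
    """
    labels = dict(existing_labels)
    labels[package_nevra] = note
    return {"pulp_labels": labels}
```

For example, `label_patch_payload({}, nevra("bash", 0, "5.1.8", "6.el9", "x86_64"), "held back pending review")` yields a body with a single label keyed by the package's NEVRA.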

Oct 4

August 23

August 9

  • Clusters are spun up with testing and replication being used
  • Having an issue with password rotation of the database
    • password changes and Pulp needs to be restarted
  • replication updates
    • commands have been added to the CLI
    • bugfixes for replication have been released, please let us know if anything else isn't working
  • Need to revisit the OTEL work soon

May 3

  • Performance testing is showing that S3 object storage with PULP_REDIRECT_TO_OBJECT_STORAGE=False causes high memory and CPU usage relative to a clustered backend solution
  • pulp_rpm == 3.20 to release today containing the replication bits. It'll be ready for testing

April 19

  • Experimenting with solutions to timeouts - is it S3 or not?
  • Replica support for pulp_rpm should be released by early next week
  • Metrics work ongoing

March 16

  • Pulp 3.23 released with replica support
  • Metrics Work Ongoing
    • What do we know we'll get?
      • for pulp-content and pulp-api we'll get response status, url, and latency raw data
    • What else would we want?
      • for tasking we'll get a 1 second summary of:
        • busy/free proportion
        • top/sar style resource metrics like cpu usage, ram, network usage within the 1 second summary
      • for tasking we'll also get event based metrics:
        • task uuid, task start time, stop time, task name
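To make the tasking metrics above concrete, the sketch below shows one way the event-based fields (task uuid, start time, stop time, task name) could feed the 1 second busy/free summary. The `TaskEvent` type and `busy_fraction` function are hypothetical and model a single worker; they are not the metrics implementation in Pulp.

```python
from dataclasses import dataclass


@dataclass
class TaskEvent:
    """One event-based record, using the fields listed in the notes."""
    task_uuid: str
    task_name: str
    start: float  # seconds since the epoch
    stop: float   # seconds since the epoch


def busy_fraction(events, window_start, window_len=1.0):
    """Fraction of the window during which the worker was running a task."""
    window_end = window_start + window_len
    busy = 0.0
    for ev in events:
        # Only the part of the task that overlaps the window counts.
        overlap = min(ev.stop, window_end) - max(ev.start, window_start)
        if overlap > 0:
            busy += overlap
    # Cap at 1.0 in case overlapping events are passed in.
    return min(busy / window_len, 1.0)
```

The free proportion for the same window is simply `1 - busy_fraction(...)`.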

March 1

  • Issue filed: https://github.com/pulp/pulpcore/issues/3621
  • pulp-replica, when will it be released?
    • goal: to be included in 3.23
  • gave overview of domains
    • they are interested in using domains. Today lots of content comes in and sometimes it clashes, e.g. on NEVRA; domains would solve this problem
  • updates on the image tag changes that have been made
  • use case: get secrets from KMS via a sidecar container that gets the secrets and loads a config map
  • open telemetry update
    • metrics and tracing are working well for pulp-api
    • next step: add support for pulp-worker and pulp-content
    • I'll record a youtube video showing off the tracing and metrics for pulp
    • next time: let's discuss feedback / ideas on the metrics for Pulp to find what would be useful

Feb 15

Feb 1

  • issue from last week about 0 bytes returned from pulp-content app was legitimately a 0 byte package! So no issue there
  • the yum/dnf timeout was increased from 2 seconds to 10 seconds. With 2 seconds it was failing because the occasional package was just a little too slow
  • pulp-replica demo
  • talked through the pulp_concurrent setting some
    • It's concurrent TCP connections from 1 task
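The point above, that the concurrency setting bounds the simultaneous connections opened by one task, can be illustrated with a semaphore. This is a generic asyncio sketch of the idea, not Pulp's downloader code; `download_all` and the `fetch` callable are hypothetical.

```python
import asyncio


async def download_all(urls, fetch, max_connections):
    """Fetch every URL, with at most max_connections fetches in flight."""
    semaphore = asyncio.Semaphore(max_connections)

    async def bounded(url):
        # Each download waits here until one of the slots is free.
        async with semaphore:
            return await fetch(url)

    # gather preserves the order of the input URLs in its result list.
    return await asyncio.gather(*(bounded(u) for u in urls))
```

The limit applies per call, so two tasks running at once could together open up to twice as many connections.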

Jan 18

Dec 14th

  • Not an issue to continue using non-clustered Redis
  • Status of deployment
    • Integrated with additional orchestration/mush APIs
    • Testing against the RHEL 9 release streams
  • Open Telemetry
  • Cost Analysis
    • desire: to have a cost estimator for running a Pulp installation on AWS in terms of infra and network storage + network delivery costs
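A starting point for the cost estimator described above could be as small as the sketch below, which splits a monthly estimate into infra, storage, and network delivery. All names are hypothetical, and the rates are inputs that must come from current provider pricing; no real AWS prices are assumed here.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    """Unit prices, to be filled in from the provider's current price list."""
    instance_hour: float     # cost of one instance for one hour
    storage_gb_month: float  # cost of storing one GB for one month
    egress_gb: float         # cost of delivering one GB over the network


def monthly_cost(rates, instances, stored_gb, egress_gb, hours_per_month=730):
    """Return the monthly estimate broken down by cost category."""
    infra = instances * hours_per_month * rates.instance_hour
    storage = stored_gb * rates.storage_gb_month
    delivery = egress_gb * rates.egress_gb
    return {
        "infra": infra,
        "storage": storage,
        "delivery": delivery,
        "total": infra + storage + delivery,
    }
```

Keeping the categories separate makes it easier to see whether delivery or storage dominates as the number of PoPs grows.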

Nov 30th

  • Issue discovered: checksums of pulp_rpm repos aren't available for on_demand repos. Need a reproducer reported
  • Redis issue figured out: It was a clustered Redis install, but Pulp doesn't support clustered Redis
    • <discourse link needed>
  • Hard to tell when a sync task is on_demand versus immediate.
  • Identified that FS Exports may be an option for their geo distribution
    • problem is: it doesn't deduplicate RPMs and that's a lot
  • Difficult to know, when querying a task, what the task is doing. Kind of the only thing to go on is the created resources, which don't get created until the end

Nov 16th

Nov 2 Updates

Oct 12th Updates

Sept 28th Updates

  • general updates
  • Propose we shorten to a 30 minute, 2 week call
  • [question from pulp operator team] Could you share some more details about the permissions that k8s operators typically require that are not acceptable for your environment?
    • it downloads a lot of untrusted assets, but that could be gotten around by pointing to your own registry
    • the permissions would need a more specific review
    • they mostly use helm charts today
    • uncomfortable with the API access to k8s itself because the "deployer" here is not the admin, they are general users
    • we should be offering a helm chart
  • Documented the dockerfile, see updates here https://docs.pulpproject.org/pulp_oci_images/
  • Two upcoming goals (likely):
    • combine the single container and the operator images to have one set of technology
    • produce an operations manual
  • pulpcon coming up Nov 7-12: CFP is open until Monday, we'd welcome any talk about how Pulp3 is being used

Sept 14th Updates

  • Using single container to pull RHEL content
  • Deployed on AWS and using AWS RDS as the db backend for it
  • Having some issues with running on k8s
  • Enjoying the ability to associate a repo version with a distribution
    • improvement from pulp2
  • next step: to try to use the pulp_installer roles to build a container
  • need identified: desire a dockerfile that we would share

HCaaS open questions

  • privacy - some content is deeply sensitive, questions about which systems are allowed to touch it, which systems are allowed to store it
  • third party (potentially licensed) content, e.g. content from VMWare, Nvidia
  • reliability - cybersecurity is critical, updates are critical, the infra must be available, SLAs are important
  • quality checking (as described earlier) - where would custom, organization-defined quality checks fit into a hosted service model

AI

  • bmbouter to share dockerfile
  • Investigate container privileges - running without root
  • tiho to setup followup time to explore use cases and operational needs for a SaaS model