## Operating Pulp in Production
Brian Bouterse
---
Q: Why am I the worst person to give this talk?
---
Q: Why am I the worst person to give this talk?
A: I don't run Pulp in production!
---
## Agenda
* Lessons learned
* Community of Practice for Pulp in production
* Zero-Downtime working group
* Performance and Scale Testing
* Containers for Production
* Topics totally unaddressed
---
## Lessons Learned
Info gathered from users I've talked to.
---
## Deploying Pulp
Buildable assets are kept in a repo, and committed changes trigger rebuilds in a Jenkins pipeline. The pipeline pushes the image to Quay, and other tooling then deploys the manifests to OpenShift. This happens for 3 images:
* the application itself
* an nginx reverse proxy
* an image for a linting check, process-isolated so that it can run untrusted code
---
## Right Sizing Pulp
---
### Install #1
* ~250,000 requests/day
* OpenShift running on AWS
* Uses RDS Database
* No Redis Caching
---
### Install #1
~250,000 requests/day
nginx:
- 3 pods
- CPU request/limit: 100m/200m
- Memory request/limit: 128Mi/128Mi
---
### Install #1
~250,000 requests/day
api:
- 3 pods
- CPU request/limit: 1/1
- Memory request/limit: 2048Mi/2048Mi
---
### Install #1
~250,000 requests/day
content-app:
- 3 pods
- CPU request/limit: 200m/1
- Memory request/limit: 1536Mi/1536Mi
---
### Install #1
~250,000 requests/day
worker:
- 6 pods
- CPU request/limit: 200m/500m
- Memory request/limit: 256Mi/512Mi
---
### Install #1
~250,000 requests/day
database:
- 8 vCPU
- 32 GB Memory
- 100 GB disk
---
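To get a feel for the overall footprint of Install #1, the per-pod figures from the slides above can be summed. This is only a back-of-the-envelope sketch: the dictionary layout and the `totals` helper are made up for illustration, and the database is left out because it runs on RDS rather than in pods.

```python
# Pod counts and per-pod (request, limit) values copied from the Install #1 slides.
# CPU is in millicores (1 CPU = 1000m), memory in MiB.
INSTALL_1 = {
    "nginx":       {"pods": 3, "cpu_m": (100, 200),   "mem_mi": (128, 128)},
    "api":         {"pods": 3, "cpu_m": (1000, 1000), "mem_mi": (2048, 2048)},
    "content-app": {"pods": 3, "cpu_m": (200, 1000),  "mem_mi": (1536, 1536)},
    "worker":      {"pods": 6, "cpu_m": (200, 500),   "mem_mi": (256, 512)},
}


def totals(install):
    """Sum requests (index 0) and limits (index 1) across all pods."""
    out = {"cpu_m": [0, 0], "mem_mi": [0, 0]}
    for spec in install.values():
        for key in out:
            for i in (0, 1):
                out[key][i] += spec["pods"] * spec[key][i]
    return out


install_1_totals = totals(INSTALL_1)
print(install_1_totals)
```

Under these numbers the pods request about 5.1 CPU and 12,672 MiB of memory in total, with limits of 9.6 CPU and 14,208 MiB.

---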
### Install #2
* 6,500,000 requests/day
* In an OpenShift environment (not AWS, I think)
* Uses RDS Database
* No Redis Caching
* Older install (has resource manager)
---
### Install #2
~6,500,000 requests/day
nginx:
- 2 pods
- CPU request/limit: 100m/200m
- Memory request/limit: 64Mi/128Mi
---
### Install #2
~6,500,000 requests/day
api:
- 32 pods
- CPU request/limit: 250m/500m
- Memory request/limit: 1Gi/1536Mi
---
### Install #2
~6,500,000 requests/day
resource-manager:
- 1 pod
- CPU request/limit: 250m/500m
- Memory request/limit: 256Mi/512Mi
---
### Install #2
~6,500,000 requests/day
worker:
- 2 pods
- CPU request/limit: 250m/1
- Memory request/limit: 256Mi/512Mi
---
### Install #2
~6,500,000 requests/day
database:
- 16 vCPU
- 64 GB Memory
- 100 GB disk
---
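Requests per day can be hard to compare with pod sizing, so it may help to convert both installs to an average rate. The sketch below assumes traffic is spread evenly over 24 hours, which real traffic almost certainly is not, so treat the result as a lower bound on peak load. The `avg_rps` helper is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def avg_rps(requests_per_day):
    """Average requests per second, assuming perfectly even traffic."""
    return requests_per_day / SECONDS_PER_DAY


# Daily request counts taken from the Install #1 and Install #2 slides.
install_1_rps = avg_rps(250_000)
install_2_rps = avg_rps(6_500_000)

print(f"Install #1: {install_1_rps:.1f} req/s on average")
print(f"Install #2: {install_2_rps:.1f} req/s on average")
```

That works out to roughly 2.9 requests/second for Install #1 and roughly 75 requests/second for Install #2, averaged over the day.

---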
## Story
* User had set `retain_repo_versions` to 1.
* Bug! Accidentally created a new repo version with 0 content.
* Effect: deleted all content from important repositories!
---
## Lesson Learned
Keeping more version history is a good safeguard. Recovery was possible because other copies of the repositories existed.
---
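Why retained history matters can be shown with a toy model. This is not Pulp code: `ToyRepository` and its methods are hypothetical and only mimic the idea of keeping the N most recent repository versions.

```python
class ToyRepository:
    """Hypothetical in-memory stand-in for a repository with version retention."""

    def __init__(self, retain):
        if retain < 1:
            raise ValueError("retain must be at least 1")
        self.retain = retain   # how many most-recent versions to keep
        self.versions = []     # each version is a frozenset of content units

    def new_version(self, content):
        self.versions.append(frozenset(content))
        # Drop everything older than the newest `retain` versions.
        self.versions = self.versions[-self.retain:]

    def recoverable_content(self):
        """Union of content still reachable from any retained version."""
        return set().union(*self.versions)


# Same sequence of events, two retention settings.
strict = ToyRepository(retain=1)
lenient = ToyRepository(retain=3)
for repo in (strict, lenient):
    repo.new_version({"pkg-a", "pkg-b"})
    repo.new_version(set())  # the buggy, empty version

print(strict.recoverable_content())   # nothing left to roll back to
print(lenient.recoverable_content())  # the previous version is still there
```

With `retain=1` the empty version replaces the only retained version, so there is nothing to roll back to; with a larger value the earlier version survives.

---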
## Community of Practice for Pulp in production
---
A community of practice (CoP) is a group of people who "share a concern or a passion for something they do and learn how to do it better as they interact regularly". - [Etienne and Beverly Wenger-Trayner](https://www.wenger-trayner.com/introduction-to-communities-of-practice/)
---
Designed to be async and low-effort for participation.
https://discourse.pulpproject.org/t/community-of-practice-running-pulp-in-production/683
---
## Containers for Production
Claim: Pulp is hard to deploy
Goal: Focus on container based installs only
---
## Containers for Production
* Feedback on Operator in Production
* Feedback on container usage outside k8s
* Upgrading from Ansible Installer -> Containers
---
## Topics Totally Unaddressed
* Monitoring
* Configuring external logging
* Documenting how to
  * configure with AWS RDS DBs
  * deploy Pulp on EKS
  * configure with Amazon MemoryDB for Redis
* Scaling Up/Down
* Backup / Restore
* Multi-Geography Installations