## Current Priority List
* Performance degradation investigation
* https://github.com/pulp/pulpcore/issues/3970
* AI: gerrod willing to take this one on
* Repository content querying performance
* https://github.com/pulp/pulpcore/issues/3969
* AI: lmjachky to take this one on:
* try to narrow down the performance implications of upgrading pulpcore to 3.25+, like noted in #3970
* proposals: store content in ArrayList vs reference repository versions as numbers instead of FKs
* AI: ggainey to dig up minutes of last discussion on this topic and link here/in-issue
## Template
```
## 2023-MM-DD 1000 GMT-4
* attendees:
* regrets:
* Prev AIs:
* Agenda
* AIs:
* ggainey to post to discourse https://discourse.pulpproject.org/t/performance-working-group/944
```
## Upcoming
* DONE!
## 2023-08-21 1000 GMT-4
* attendees: ggainey, gerrod, lmjachky, dalley, jhutar
* regrets:
* Prev AIs:
* [lmjachky] compare filter-output pre/post #4275
* Agenda
* Can we get jhutar here to talk to us about his perf work?
* biggest issue: someone needs to PAY ATTENTION to the results
* how do you decide on red/green for a test?
* need to define a range (for a number of metrics), note when something is "outside" allowed
* perfteam has an easy process on-demand, but hard to **keep** same hardware
* results in "noisy" results
* as always - exact-same-hardware is important for reliable results-reporting
* current setup is internal - would be Exciting to try and get results published outside
* talk to jhutar's mgt to set priorities
* AI: [gerrod] to open communications
* "90% of the work" is "defining the test and running tests reliably"
* can we work w/ other downstream projects to get reliable access to pulp-hardware?
* internal OpenStack instance exists
* if jhutar had a pulp-setup-script and a pulp-performance-test-script, would be pretty straightforward to get something started
* "ideas are cheap, implementation sucks"
* Up/Down vote: is this group "done"?
* dalley working a few:
* ACS issue from last week
* https://bugzilla.redhat.com/show_bug.cgi?id=2228592
* might be improved by lmjachky's work
* measurements needed
* neither should block closing the working group
* consensus: This Group Is Done!
* AIs:
* ggainey to post to discourse https://discourse.pulpproject.org/t/performance-working-group/944
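
The red/green question from this meeting (define an allowed range per metric, note when something falls outside it) could be sketched roughly as below. The metric names and bounds are made up for illustration; none of them come from the perf team's setup.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class AllowedRange:
    """Inclusive bounds a metric must stay within to count as green."""

    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


def evaluate(results: dict[str, float], ranges: dict[str, AllowedRange]) -> dict[str, str]:
    """Return 'green' or 'red' per metric; metrics without a range are 'unknown'."""
    verdicts = {}
    for name, value in results.items():
        allowed = ranges.get(name)
        if allowed is None:
            verdicts[name] = "unknown"
        else:
            verdicts[name] = "green" if allowed.contains(value) else "red"
    return verdicts


# Hypothetical metrics and bounds, for illustration only.
RANGES = {
    "sync_seconds": AllowedRange(0.0, 120.0),
    "list_content_ms": AllowedRange(0.0, 250.0),
}

if __name__ == "__main__":
    run = {"sync_seconds": 95.0, "list_content_ms": 410.0, "publish_seconds": 12.0}
    for metric, verdict in evaluate(run, RANGES).items():
        print(f"{metric}: {verdict}")
```

Reporting "unknown" for a metric with no range keeps new measurements visible until someone decides what is acceptable for them. Noisy hardware would still need wider ranges or repeated runs.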
## 2023-08-14 1000 GMT-4
* attendees: gerrod, lmjachky, dalley, ggainey
* regrets:
* Prev AIs:
* Agenda
* Performance improvement in content app ready for review
* https://github.com/pulp/pulpcore/pull/3803
* ACS artifact stage improvements
* Need someone familiar with ACS to sanity check the changes I'm making
* https://github.com/pulp/pulpcore/pull/4274
* needing to hydrate the RemoteArtifact is "unfortunate" from a performance POV
* gerrod to take a look and think about stages-use
* dalley still working on tests/perf-analysis
* Resolution for repo_version.get_content():
* https://github.com/pulp/pulpcore/pull/4275
* Collaborated with ipanova, tested the performance on a machine with 112k repositories -> got good results
* Do we need to touch the DB schema if the improvements were significant?
* no, please - keep this very backport-able
* Should we do an output-comparison between original/modified query against a COPR(ish) repo to make sure we get the same thing?
* yes please
* What should lmjachky look at next?
* maybe, nothing specific based on the comments below
* More generally - what future work do we want this group to work on?
* implications of immediate-syncing from an upstream on-demand remote - can completely overload the upstream content-app
* https://github.com/pulp/pulpcore/issues/3549
* Q: are we actually at a point where we declare **this** working group "done"?
* What about "automated performance tests" as part of CI?
* sounds like a fine, fine idea
* There Exist ansible playbooks that run perf-tests against downstream and spit out charts
* jhutar@redhat.com - invite him to this mtg to talk to us about perf-testing
* AIs:
* [lmjachky] compare filter-output pre/post #4275
* ggainey to post to discourse https://discourse.pulpproject.org/t/performance-working-group/944
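
The pre/post #4275 output comparison agreed on in this meeting amounts to checking that both queries return the same rows, ignoring order. A minimal sketch, with plain Python iterables of primary keys standing in for the two query results (the function name and sample keys are hypothetical):

```python
from collections import Counter
from typing import Hashable, Iterable


def compare_outputs(original: Iterable[Hashable], modified: Iterable[Hashable]) -> dict:
    """Compare two result sets by primary key, ignoring row order.

    Counters are used instead of sets so that duplicate rows
    (for example from an unintended join) also show up as a difference.
    """
    before, after = Counter(original), Counter(modified)
    return {
        # Keys returned by the original query but not the modified one.
        "missing": sorted((before - after).elements(), key=str),
        # Keys (or extra copies of keys) returned only by the modified query.
        "extra": sorted((after - before).elements(), key=str),
        "identical": before == after,
    }


if __name__ == "__main__":
    report = compare_outputs(["a", "b", "c"], ["c", "b", "a"])
    print(report)
```

Against real querysets, the two inputs could be something like `qs.values_list("pk", flat=True)` from the old and new query; that wiring is left out here.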
## 2023-08-07 1000 GMT-4
* attendees: gerrod, lmjachky, dalley, tsanders
* regrets: ggainey
* Prev AIs:
* lmjachky will start optimizing the repo_version.get_content(plugin.Model) (baseline with pulp_rpm) query to get better results in general (no longer focusing on pulp_ansible performance)
* https://github.com/pulp/pulpcore/issues/3969#issuecomment-1662977035
* gerrod to measure times using session auth & investigate performance regarding DRF web renderer
* Session auth test resulted in no performance difference across versions
* Agenda
* Performance improvement around publications in pulp_rpm
* https://github.com/pulp/pulp_rpm/pull/3224/files
* Suggestions for lmjachky's query testing
* Contact ipanova to use COPR machine to test
* Use potentially slow seq scans in explain statement as a guide for DB model changes
* Sync stage query for existing content dominates sync pipeline for re-sync tasks
* Potentially could create a large cache of previous version's content to check against
* Only use content's natural uniqueness fields to make cache as small and fast as possible
* Probably would require some refactoring of the stage's pipeline as it doesn't have knowledge of the previous repo-version
* AIs:
* gerrod to post to discourse https://discourse.pulpproject.org/t/performance-working-group/944
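
The cache idea from the agenda could look roughly like this: build a set of natural-key tuples from the previous repo-version once, then check incoming units against it in memory. The `Unit` type and its fields are hypothetical stand-ins, not pulpcore models, and nothing here touches the real stages pipeline.

```python
from typing import Iterable, NamedTuple


class Unit(NamedTuple):
    """Hypothetical content unit; name/version/arch form its natural key."""

    name: str
    version: str
    arch: str
    summary: str = ""  # bulky, non-identifying data kept out of the cache


def natural_key(unit: Unit) -> tuple:
    """Only the fields that make a unit unique."""
    return (unit.name, unit.version, unit.arch)


def build_cache(previous_version: Iterable[Unit]) -> frozenset:
    """Keep only natural-key tuples so the cache stays small."""
    return frozenset(natural_key(u) for u in previous_version)


def split_incoming(incoming: Iterable[Unit], cache: frozenset):
    """Partition incoming units into already-present and new."""
    existing, new = [], []
    for unit in incoming:
        (existing if natural_key(unit) in cache else new).append(unit)
    return existing, new


if __name__ == "__main__":
    previous = [Unit("bash", "5.1", "x86_64", "shell"), Unit("zsh", "5.9", "x86_64")]
    cache = build_cache(previous)
    existing, new = split_incoming(
        [Unit("bash", "5.1", "x86_64"), Unit("fish", "3.6", "x86_64")], cache
    )
    print(len(existing), "existing,", len(new), "new")
```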
## 2023-07-31 1000 GMT-4
* attendees: gerrod, lmjachky, dalley
* regrets: ggainey
* Prev AIs:
* AI: all - take 20 min before next mtg to review/triage "Performance" labelled issues in core
* https://github.com/pulp/pulpcore/issues?q=is%3Aopen+is%3Aissue+label%3APerformance
* AI: lmjachky needs to get more info about SQL queries run inside Django
* Agenda
* Performance labelled issues
* Issues look good
* https://github.com/pulp/pulpcore/issues/3549 candidate for future work
* lmjachky's SQL query investigation (https://github.com/pulp/pulpcore/issues/3969#issuecomment-1652531793 +)
* Used DEBUG=True and explain() to see the queries running under the hood:
* no significant differences like complex joins between the v.get_content() (faster) and v.get_content(packages) (slower) queries:
* just selection of more fields + a loop with one iteration in explain
* **worth rewriting the get_content query from scratch**
* 3.25 performance update:
* Fairly confident the majority of the slowdown is from Basic Auth changes in Django 4.2
* Slight bump when domains were introduced in 3.23, but consistent response times from 3.24->3.28 when auth is removed
* AIs:
* lmjachky will start optimizing the repo_version.get_content(plugin.Model) (baseline with pulp_rpm) query to get better results in general (no longer focusing on pulp_ansible performance)
* gerrod to measure times using session auth & investigate performance regarding DRF web renderer
* gerrod to post to discourse https://discourse.pulpproject.org/t/performance-working-group/944
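
Reading a query plan for full scans, as was done with explain() in this investigation, can be tried out without a Pulp install. The sketch below uses SQLite's `EXPLAIN QUERY PLAN` from the Python standard library purely as a stand-in: Pulp runs on PostgreSQL, whose `EXPLAIN` output looks different, and the table and index names here are made up.

```python
from __future__ import annotations

import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> list[str]:
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    rows = conn.execute(f"EXPLAIN QUERY PLAN {sql}", params).fetchall()
    return [row[3] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE content (id INTEGER PRIMARY KEY, name TEXT, repo_id INTEGER)")
conn.executemany(
    "INSERT INTO content (name, repo_id) VALUES (?, ?)",
    [(f"pkg-{i}", i % 10) for i in range(1000)],
)

sql = "SELECT * FROM content WHERE name = ?"

# Without an index on name, the planner has to scan the whole table.
plan_before = query_plan(conn, sql, ("pkg-42",))

# After adding an index, the same query becomes an index search.
conn.execute("CREATE INDEX content_name_idx ON content (name)")
plan_after = query_plan(conn, sql, ("pkg-42",))

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan lines varies between SQLite versions, so the useful signal is the change from a scan to an index search rather than the literal text.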
## 2023-07-24 1000 GMT-4
* attendees: ggainey, dalley, lmjachky, gubben
* regrets:
* Prev AIs:
* **DONE** AI: gerrod to schedule weekly
* AI: all - take 20 min before next mtg to review/triage "Performance" labelled issues in core
* **DONE** AI: ggainey to dig out where initial perf-discussion happened
* See [this Matrix discussion](https://matrix.to/#/!aVApiNMtnstWbwDcVU:matrix.org/$kTOGE_lFJWaILRMBN9WYvDPQotqtTaGG0PXq8d1nsmo?via=libera.chat&via=matrix.org&via=fedora.im)
* AI: gerrod takes lead on [#3970](https://github.com/pulp/pulpcore/issues/3970)
* https://docs.google.com/spreadsheets/d/1LpiTBzA-L9sR9B2xcVmHqPoWfgDONdTArztj4auBBHw/edit?usp=sharing
* some brute-force test scripts in place
* 4 endpoints x 4 versions x 4 scenarios (empty, 1000 content, empty-noauth, 1000 content-noauth)
* looks strongly like Django4/basic-auth hashing change
* discussion ensued
* AI: lmjachky takes lead on [#3969](https://github.com/pulp/pulpcore/issues/3969) (contents-from-repo-version)
* working w/ originators to get perf-testing scripts to use
* discussion ensued
* NEW AI: lmjachky needs to get more info about SQL queries run inside Django
* **DONE** ~~AI: dalley continues lead on [#2250](https://github.com/pulp/pulpcore/issues/2250)~~ (memory-growth)
* "number of queries" question post-fix
* pinged originator (gmbnomis) on their tests in pulp_cookbook that exposed the original problem
* Agenda
* discuss issues from prev-AI
* AIs:
* AI: all - take 20 min before next mtg to review/triage "Performance" labelled issues in core
* AI: lmjachky needs to get more info about SQL queries run inside Django
* ggainey to post minutes to https://discourse.pulpproject.org/t/performance-working-group/944
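
One way to see why a basic-auth hashing change would show up on every request: with Basic Auth the password is checked, and therefore hashed, on each call, so per-request cost grows with the hasher's iteration count. The sketch below times PBKDF2-SHA256 from the standard library at two iteration counts. The counts are assumptions picked for the example; they are not taken from the spreadsheet above or verified against any Django release.

```python
import hashlib
import os
import time


def time_pbkdf2(iterations: int, repeats: int = 3) -> float:
    """Return the best-of-N wall time in seconds for one PBKDF2-SHA256 hash."""
    password, salt = b"not-a-real-password", os.urandom(16)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        hashlib.pbkdf2_hmac("sha256", password, salt, iterations)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    for iterations in (100_000, 400_000):  # illustrative counts only
        elapsed_ms = time_pbkdf2(iterations) * 1000
        print(f"{iterations:>7} iterations: {elapsed_ms:.1f} ms per hash")
```

This would be consistent with the noauth scenarios being the ones that stay flat across versions, but the numbers it prints say nothing about Pulp itself.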
## 2023-07-17 1000 GMT-4
* attendees: ggainey, gubben, lmjachky, dalley
* agenda
* Decide on weekly meeting schedule
* consensus vote says "this time slot works"
* AI: gerrod to schedule weekly
* Add any new performance issues under 'Performance' label
* https://github.com/pulp/pulpcore/issues?q=is%3Aopen+is%3Aissue+label%3APerformance
* this list probably needs to be triaged
* possible some of these are dups/already-addressed/etc
* Go over Priority list, assign work
* discussion about how to approach this effort
* need to have a well-defined baseline
* prob in 3.24/3.25/main? (for #3970)
* measure a well-defined set of REST calls for each
* for repository-query (#3969) - Repository._content_relationships()
* prob is the same prob for all versions - need a pre/post-FIX measurement
* AI: all - take 20 min before next mtg to review/triage "Performance" labelled issues in core
* AI: ggainey to post a discourse thread
* AIs
* AI: gerrod to schedule weekly
* AI: all - take 20 min before next mtg to review/triage "Performance" labelled issues in core
* AI: ggainey to dig out where initial perf-discussion happened
* See [this Matrix discussion](https://matrix.to/#/!aVApiNMtnstWbwDcVU:matrix.org/$kTOGE_lFJWaILRMBN9WYvDPQotqtTaGG0PXq8d1nsmo?via=libera.chat&via=matrix.org&via=fedora.im)
* AI: gerrod takes lead on [#3970](https://github.com/pulp/pulpcore/issues/3970)
* AI: lmjachky takes lead on [#3969](https://github.com/pulp/pulpcore/issues/3969)
* AI: dalley continues lead on [#2250](https://github.com/pulp/pulpcore/issues/2250)
* AI: ggainey to post a discourse thread
* https://discourse.pulpproject.org/t/performance-working-group/944