
Current Priority List

  • Performance degradation investigation
  • Repository content querying performance
    • https://github.com/pulp/pulpcore/issues/3969
    • AI: lmjachky to take this one on:
      • try to narrow down the performance implications of upgrading pulpcore to 3.25+, as noted in #3970
    • proposals: store content in an ArrayList vs. referencing repository versions by number instead of FKs
    • AI: ggainey to dig up minutes of last discussion on this topic and link here/in-issue

Template

## 2023-MM-DD 1000 GMT-4
* attendees:
* regrets:
* Prev AIs:
* Agenda
* AIs:
  * ggainey to post to discourse https://discourse.pulpproject.org/t/performance-working-group/944

Upcoming

  • DONE!

2023-08-21 1000 GMT-4

  • attendees: ggainey, gerrod, lmjachky, dalley, jhutar
  • regrets:
  • Prev AIs:
    • [lmjachky] compare filter-output pre/post #4275
  • Agenda
    • Can we get jhutar here to talk to us about his perf work?
      • biggest issue: someone needs to PAY ATTENTION to the results
      • how do you decide on red/green for a test?
        • need to define a range (for a number of metrics), note when something is "outside" allowed
      • perfteam has an easy process on-demand, but hard to keep same hardware
        • results in "noisy" results
        • as always - exact-same-hardware is important for reliable results-reporting
      • current setup is internal - would be Exciting to try and get results published outside
      • talk to jhutar's mgt to set priorities
        • AI: [gerrod] to open communications
      • "90% of the work" is "defining the test and running tests reliably"
      • can we work w/ other downstream projects to get reliable access to pulp-hardware?
      • internal OpenStack instance exists
        • if jhutar had a pulp-setup-script and a pulp-performance-test-script, would be pretty straightforward to get something started
      • "ideas are cheap, implementation sucks"
    • Up/Down vote: is this group "done"?
  • AIs:
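
The red/green question above boils down to per-metric allowed ranges and flagging anything "outside". A minimal framework-free sketch; the metric names and bounds are hypothetical examples, not real perf-team values:

```python
# Sketch of a red/green check for perf-test results: each metric gets an
# allowed (low, high) range, and a run is "red" if any measured value falls
# outside its range. Metric names and bounds here are hypothetical.

RANGES = {
    "sync_seconds": (0.0, 120.0),
    "publish_seconds": (0.0, 60.0),
    "peak_rss_mb": (0.0, 2048.0),
}

def evaluate(run: dict) -> list:
    """Return the metrics outside their allowed range (empty list == green)."""
    failures = []
    for name, value in run.items():
        low, high = RANGES.get(name, (float("-inf"), float("inf")))
        if not (low <= value <= high):
            failures.append(name)
    return failures

print(evaluate({"sync_seconds": 95.0, "publish_seconds": 30.0}))   # green
print(evaluate({"sync_seconds": 180.0, "publish_seconds": 30.0}))  # red
```

Noisy hardware (as noted above) mainly affects how wide these ranges have to be; same-hardware runs let them be tightened.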

2023-08-14 1000 GMT-4

  • attendees: gerrod, lmjachky, dalley, ggainey
  • regrets:
  • Prev AIs:
  • Agenda
    • Performance improvement in content app ready for review
    • ACS artifact stage improvements
      • Need someone familiar with ACS to sanity check the changes I'm making
      • https://github.com/pulp/pulpcore/pull/4274
      • needing to hydrate the RemoteArtifact is "unfortunate" from a performance POV
        • gerrod to take a look and think about stages-use
      • dalley still working on tests/perf-analysis
    • Resolution for repo_version.get_content():
      • https://github.com/pulp/pulpcore/pull/4275
        • Collaborated with ipanova, tested the performance on a machine with 112k repositories -> got good results
        • Do we need to touch the DB schema if the improvements were significant?
          • no, please - keep this very backport-able
        • Should we do an output-comparison between original/modified query against a COPR(ish) repo to make sure we get the same thing?
          • yes please
    • What should lmjachky look at next?
      • maybe, nothing specific based on the comments below
    • More generally - what future work do we want this group to work on?
      • implications of immediate-syncing an upstream on-demand remote - it can completely overload the upstream content-app
      • https://github.com/pulp/pulpcore/issues/3549
      • Q: are we actually at a point where we declare this working group "done"?
    • What about "automated performance tests" as part of CI?
      • sounds like a fine, fine idea
      • There Exist ansible playbooks that run perf-tests against downstream and spit out charts
      • jhutar@redhat.com - invite him to this mtg to talk to us about perf-testing
  • AIs:
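
The output-comparison requested above can be as simple as diffing the sets of content PKs returned by the original and modified queries. A pure-Python sketch; in practice the inputs would come from the two querysets, e.g. `set(qs.values_list("pk", flat=True))`, and the PK values below are made up:

```python
# Sketch of the requested output-comparison: run both the original and the
# modified get_content() implementations against the same repo version and
# diff the sets of returned content PKs. Identical output means the rewrite
# is safe to backport.

def diff_results(original, modified):
    """Return (missing, extra): PKs the modified query dropped or invented."""
    original, modified = set(original), set(modified)
    return original - modified, modified - original

# Hypothetical PK lists, as if pulled from the pre- and post-change queries:
old_pks = ["a1", "b2", "c3"]
new_pks = ["a1", "b2", "c3"]
missing, extra = diff_results(old_pks, new_pks)
assert not missing and not extra  # same content either way
```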

2023-08-07 1000 GMT-4

  • attendees: gerrod, lmjachky, dalley, tsanders
  • regrets: ggainey
  • Prev AIs:
    • lmjachky will start optimizing the repo_version.get_content(plugin.Model) (baseline with pulp_rpm) query to get better results in general (no longer focusing on pulp_ansible performance)
    • gerrod to measure times using session auth & investigate performance regarding DRF web renderer
      • Session auth test resulted in no performance difference across versions
  • Agenda
    • Performance improvement around publications in pulp_rpm
    • Suggestions for lmjachky's query testing
      • Contact ipanova to use COPR machine to test
      • Use potentially slow seq scans in explain statement as a guide for DB model changes
    • Sync stage query for existing content dominates sync pipeline for re-sync tasks
      • Potentially could create a large cache of previous version's content to check against
      • Only use content's natural uniqueness fields to make cache as small and fast as possible
      • Probably would require some refactoring of the stage's pipeline as it doesn't have knowledge of the previous repo-version
  • AIs:
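
The cache idea above can be sketched as an in-memory index over the previous version's content, keyed only on the natural-uniqueness fields to keep it small. A hedged stand-in; the field names (`name`, `version`, `checksum`) are hypothetical, as real natural keys vary by content type:

```python
# Sketch of the proposed existing-content cache: index the previous repo
# version's content by its natural-key tuple so the sync stage can test
# membership without a per-item DB query. Field names are hypothetical.

def build_cache(previous_content):
    """Map natural-key tuple -> content PK for the previous version's content."""
    return {
        (c["name"], c["version"], c["checksum"]): c["pk"]
        for c in previous_content
    }

def existing_pk(cache, candidate):
    """Return the PK of already-known content, or None if it must be queried."""
    key = (candidate["name"], candidate["version"], candidate["checksum"])
    return cache.get(key)

prev = [{"pk": 1, "name": "foo", "version": "1.0", "checksum": "abc"}]
cache = build_cache(prev)
print(existing_pk(cache, {"name": "foo", "version": "1.0", "checksum": "abc"}))
```

As noted, wiring this in would still require giving the stages pipeline knowledge of the previous repo version.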

2023-07-31 1000 GMT-4

  • attendees: gerrod, lmjachky, dalley
  • regrets: ggainey
  • Prev AIs:
  • Agenda
    • Performance labelled issues
    • lmjachky's SQL query investigation (https://github.com/pulp/pulpcore/issues/3969#issuecomment-1652531793 +)
      • Used DEBUG=True and explain() to see the queries running under the hood:
        • no significant differences (e.g., complex joins) between the v.get_content() (faster) and v.get_content(packages) (slower) queries:
          • just selection of more fields + a loop with one iteration in explain
        • worth rewriting the get_content query from scratch
    • 3.25 performance update:
      • Fairly confident majority of slow down is from Basic Auth Changes in django 4.2
      • Slight bump when domains were introduced in 3.23, but consistent response times from 3.24->3.28 when auth is removed
  • AIs:
    • lmjachky will start optimizing the repo_version.get_content(plugin.Model) (baseline with pulp_rpm) query to get better results in general (no longer focusing on pulp_ansible performance)
    • gerrod to measure times using session auth & investigate performance regarding DRF web renderer
    • gerrod to post to discourse https://discourse.pulpproject.org/t/performance-working-group/944
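
Version-to-version response-time comparisons like the 3.24->3.28 one above are easiest to trust as medians over repeated calls rather than single samples. A small framework-free timing helper; `fake_request` is a stand-in for a real REST call against each pulpcore version under test:

```python
import statistics
import time

# Helper for the kind of per-version response-time comparison described
# above: time the same call repeatedly and report the median, which is less
# sensitive to one-off noise than a single measurement.

def median_seconds(call, repeats=5):
    """Median wall-clock duration of call() over `repeats` runs."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        call()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

def fake_request():
    # Stand-in for hitting an API endpoint on the version under test.
    time.sleep(0.001)

print(f"median: {median_seconds(fake_request):.4f}s")
```

Running the same helper with and without auth enabled is one way to isolate the Basic Auth contribution to the slowdown.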

2023-07-24 1000 GMT-4

  • attendees: ggainey, dalley, lmjachky, gubben
  • regrets:
  • Prev AIs:
    • DONE AI: gerrod to schedule weekly
    • AI: all - take 20 min before next mtg to review/triage "Performance"-labelled issues in core
    • DONE AI: ggainey to dig out where initial perf-discussion happened
    • AI: gerrod takes lead on #3970
    • AI: lmjachky takes lead on #3969 (contents-from-repo-version)
      • working w/ originators to get perf-testing scripts to use
      • discussion ensued
      • NEW AI: lmjachky needs to get more info about SQL queries run inside Django
    • DONE AI: dalley continues lead on #2250 (memory-growth)
      • "number of queries" question post-fix
      • pinged originator (gmbnomis) on their tests in pulp_cookbook that exposed the original problem
  • Agenda
    • discuss issues from prev-AI
  • AIs:

2023-07-17 1000 GMT-4

  • attendees: ggainey, gubben, lmjachky, dalley
  • agenda
    • Decide on weekly meeting schedule
      • consensus vote says "this time slot works"
      • AI: gerrod to schedule weekly
    • Add any new performance issues under 'Performance' label
    • Go over Priority list, assign work
    • discussion about how to approach this effort
      • need to have a well-defined baseline
        • prob in 3.24/3.25/main? (for #3970)
        • measure a well-defined set of REST calls for each
      • for repository-query (#3969) - Repository._content_relationships()
        • problem is likely the same for all versions - need a pre/post-FIX measurement
    • AI: all - take 20 min before next mtg to review/triage "Performance"-labelled issues in core
    • AI: ggainey to post a discourse thread
  • AIs