# Proposal: gather /status/ telemetry

## What question will this help us answer?

The general use is to get a feel for what a "typical" Pulp3 installation looks like.

The most important question being answered is "What specific plugin-versions are in use?" This will start giving us insight into our community's upgrade pace, and let us gauge plugin popularity.

We'll be able to start gauging what a typical/standard/median "size" of a Pulp3 installation is, by learning:

* How many content-apps are in use?
* How many hosts do these content-apps run on?
* How many workers are in use?
* How many hosts do the workers run on?
* ~~How much disk is available~~

Finally, we'll learn how important "redis" is to the community, since /status/ reports on whether it is in use or not.

## What is a specific example of the data to be gathered?

```
$ http :/pulp/api/v3/status/
{
    "database_connection": {
        "connected": true
    },
    "online_content_apps": [
        {
            "last_heartbeat": "2022-01-26T18:58:36.372233Z",
            "name": "49030@pulp3-source-fedora34.padre-fedora.example.com"
        },
        {
            "last_heartbeat": "2022-01-26T18:58:36.532093Z",
            "name": "49031@pulp3-source-fedora34.padre-fedora.example.com"
        }
    ],
    "online_workers": [
        {
            "current_task": null,
            "last_heartbeat": "2022-01-26T18:58:39.618259Z",
            "name": "49007@pulp3-source-fedora34.padre-fedora.example.com",
            "pulp_created": "2022-01-26T15:17:47.865912Z",
            "pulp_href": "/pulp/api/v3/workers/202beb44-4c54-48d9-a3f1-c671c406310e/"
        },
        {
            "current_task": null,
            "last_heartbeat": "2022-01-26T18:58:39.655667Z",
            "name": "49017@pulp3-source-fedora34.padre-fedora.example.com",
            "pulp_created": "2022-01-26T15:17:48.479940Z",
            "pulp_href": "/pulp/api/v3/workers/4d3bfd4c-1bd2-4930-8c7b-15e800bec3e0/"
        }
    ],
    "redis_connection": {
        "connected": true
    },
    "storage": {
        "free": 31851548672,
        "total": 42006183936,
        "used": 7990427648
    },
    "versions": [
        {
            "component": "core",
            "version": "3.18.0.dev"
        },
        {
            "component": "file",
            "version": "1.11.0.dev"
        },
        {
            "component": "rpm",
            "version": "3.18.0.dev"
        },
        {
            "component": "container",
            "version": "2.11.0.dev"
        },
        {
            "component": "deb",
            "version": "2.18.0.dev"
        },
        {
            "component": "certguard",
            "version": "1.6.0.dev"
        },
        {
            "component": "pulp_2to3_migration",
            "version": "0.16.0.dev"
        }
    ]
}
```

## How will this metric be stored (in the database or gathered at runtime)?

* Gathered directly from the /status/ endpoint at telemetry-send-time.

## Will the gathering and/or storage of this cause unacceptable burden/load on Pulp?

* No - /status/ is a very low-impact API.

## Is this metric Personally-Identifiable-Data?

* **YES** - the `name` fields in `online_content_apps` and `online_workers` include machine names, which can carry personally identifiable data. These will need to be sanitized.

### How can we sanitize this output?

* remove the "name" field (it doesn't teach us anything)
* replace `"name": <actual-process-id>@<actual-host>` with `"name": PID@HOST`
* replace `"name": <actual-process-id>@<actual-host>` with `"name": PID@sha256(actual-host-name)`
    * this would let us track number-of-unique-hosts, without knowing the hostname
* record only number-of-processes/number-of-hosts
    * requires a little more processing

## What pulpcore version will this be collected with?

* 3.19

## Discussion

* keeping unique-host-info vs counts
* can we tell plugins-per-host? is it useful? is it even possible?
    * currently don't/can't do this
    * would allow scaling-control
* what about artifact/content "sizes"?
    * yes! needs its own proposal - volunteers?

## Is this approved/not-approved?

* accept alternative proposal:
    * Aye: 6
    * Nay: 0

## Alternative Proposal

* Lose db/redis info
    * "on" isn't useful - "version" is
    * should be their own telemetry
* Lose storage
    * not very useful
    * when connected to object-storage, not very useful
    * should also be its own telemetry option
* Summary info
    * change to count processes and hosts
    * see the questions at top and the sanitization section

### Proposed alternate telemetry data

```
{
    "online_content_apps": {
        "processes": 2,
        "hosts": 1
    },
    "online_workers": {
        "processes": 2,
        "hosts": 1
    },
    "versions": [
        {
            "component": "core",
            "version": "3.18.0.dev"
        },
        {
            "component": "file",
            "version": "1.11.0.dev"
        },
        {
            "component": "rpm",
            "version": "3.18.0.dev"
        },
        {
            "component": "container",
            "version": "2.11.0.dev"
        },
        {
            "component": "deb",
            "version": "2.18.0.dev"
        },
        {
            "component": "certguard",
            "version": "1.6.0.dev"
        },
        {
            "component": "pulp_2to3_migration",
            "version": "0.16.0.dev"
        }
    ]
}
```

## Parking Lot for potential future/RFE work

* can we tell plugins-per-host? is it useful? is it even possible?
    * currently don't/can't do this
    * would allow scaling-control
* Determining cluster solutions
    * Pulp instances that are scaled out horizontally, how could that be visualised (give away in the status?)
    * Unique Pulp instances that form part of a "cluster" from a client's perspective (does that matter? probably not)

###### tags: `Telemetry`

## Graphs to be produced

* How many unique systems are there?
    * represent as a line graph over time (count of total unique systems)
* Versions per component
    * For each component, e.g. rpm, certguard, pulpcore
    * use a pie chart to show the version distribution for that component
* Bar graph reports the number of users per component
    * Regardless of version
    * how to distinguish whether this is a single container installation?
* Average hosts
    * `online_content_apps` hosts summarized into a single average, and graphed as a time series
    * `online_workers` hosts summarized into a single average, and graphed as a time series
* Average processes
    * same as above, only for processes
* Average processes / host
    * same as above, only for average processes / host
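
The count-only sanitization option, the accepted summary format, and the "average processes / host" graph above can be illustrated with a short Python sketch. This is not pulpcore code: the function names `count_processes_and_hosts`, `summarize_status`, and `average_processes_per_host` are hypothetical, and the sketch assumes every `name` has the `<process-id>@<host>` shape shown in the example /status/ output.

```python
from statistics import mean


def count_processes_and_hosts(entries):
    """Reduce /status/ process entries to counts; no hostname is kept."""
    # Assumption: each name looks like "<process-id>@<host>".
    hosts = {entry["name"].partition("@")[2] for entry in entries}
    return {"processes": len(entries), "hosts": len(hosts)}


def summarize_status(status):
    """Build the alternate telemetry payload from a parsed /status/ response."""
    return {
        "online_content_apps": count_processes_and_hosts(
            status.get("online_content_apps", [])
        ),
        "online_workers": count_processes_and_hosts(
            status.get("online_workers", [])
        ),
        # Database, redis, and storage info are dropped on purpose.
        "versions": [
            {"component": v["component"], "version": v["version"]}
            for v in status.get("versions", [])
        ],
    }


def average_processes_per_host(payloads, key):
    """Average processes/host across many payloads for one key."""
    ratios = [
        p[key]["processes"] / p[key]["hosts"]
        for p in payloads
        if p[key]["hosts"] > 0  # skip installations reporting no hosts
    ]
    return mean(ratios) if ratios else 0.0


if __name__ == "__main__":
    sample = {
        "online_workers": [
            {"name": "49007@pulp3-source-fedora34.padre-fedora.example.com"},
            {"name": "49017@pulp3-source-fedora34.padre-fedora.example.com"},
        ],
        "versions": [{"component": "core", "version": "3.18.0.dev"}],
    }
    payload = summarize_status(sample)
    print(payload["online_workers"])  # {'processes': 2, 'hosts': 1}
    print(average_processes_per_host([payload], "online_workers"))  # 2.0
```

Counting distinct hosts locally and sending only the totals means no hostname, hashed or otherwise, has to leave the installation. The trade-off, as noted in the discussion, is that unique hosts can no longer be tracked across reports.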