# Ideas for task-throttling on large installs

## Problem Statement

- Pulp processes asynchronous jobs using workers from the worker-pool
- While the number of workers is easily scaled, there are always a finite number available
- A Pulp installation should have as many workers "as its CPU and database can handle". More than that may actually decrease performance/throughput
- Tasks are processed in strictly FIFO order, assuming resources are available
- On a Pulp instance serving many users and/or many domains, it is quite possible for one user/domain to fill up the queue of tasks in a way that forces all other users/domains to wait
  - This is especially an issue in an environment aiming for as much self-service as possible
  - A large instance where the "large number of users" are only content-consumers (e.g., COPR) is not as strongly affected
    - Even in such an instance, it can be desirable for specific kinds of tasks to be throttled (see "Current experience" below)
- In general, allowing a single end-user to monopolize background processing for the entire instance is a (very) suboptimal user experience

## "What Are We Trying To Solve?"

- Prevent a single task-group/user/domain from monopolizing the available workers for extended periods of time, while
- ensuring that workers are never idle, if there are tasks in the queue that are ready-to-run

## Possible solutions/brainstorming

- See [related discussion/thoughts](https://hackmd.io/NuueTBQjTImbTmtXjogZDQ) from last year!
- It should be possible for the Pulp admin to specify a "maximum percentage" of the worker-pool that any user and/or domain can consume at any one time
- Tasks are picked up by workers only when all "reserved resources" are available
  - a reserved-resource is an arbitrary string
  - if every task dispatched included a reserved-resource identifying the user and/or domain that requested it, a given user/domain would only consume one worker at a time
  - if such an identifier included a `-N` suffix, where "N" represents the number-of-workers that user/domain is allowed at a time, then the user/domain could consume (only) N workers at any given time
- maybe just address the task-group issue first, and leave user/domain changes for a second phase?

## Current experience

- Several task-groups that dispatch multiple tasks atomically already use a similar throttling mechanism
  - [core-import](https://github.com/pulp/pulpcore/blob/main/pulpcore/app/tasks/importer.py#L503-L512)
  - [rpm-prune](https://github.com/pulp/pulp_rpm/blob/main/pulp_rpm/app/tasks/prune.py#L129-L140)

## Problems

- It is possible under this approach to end up with idle workers even when there are tasks ready-to-run in the queue
  - can we teach the process to understand this state and arrange for it to not happen?
- Only core knows how many workers are currently in the pool (plugins can't (currently) "know" this)
  - That number can change dynamically
- The round-robin approach in the two examples above works only because the "index" is being monitored by the single task doing the dispatching
  - In the general case, an arbitrary dispatch() call would need to pick an "index" to use
    - preferably in some "stateless" manner
    - how to accomplish this is left as an exercise for the reader
      - percentage modulo current-milliseconds? (see the sketch after this list)
- Should the number-of-workers-allowed be settable per-user/per-domain?
  - Should only the admin be allowed to set it? (yes)
  - If it's per-user/per-domain, should the user/domain-admin be allowed to *lower* the number?
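To make the "stateless index" idea concrete, here is a minimal sketch, not existing Pulp code: a hypothetical `throttle_resource()` helper builds the reserved-resource string for a user or domain, picking a slot index from the current milliseconds modulo the allowed concurrency so that independent dispatch() calls need no shared counter. The function name, the string format, and the slot-selection rule are all assumptions for illustration.

```python
import time


def throttle_resource(scope: str, max_concurrent: int) -> str:
    """
    Build a reserved-resource string that caps `scope` (e.g. a user or
    domain identifier) at roughly `max_concurrent` concurrently-running
    tasks.

    The slot index is chosen statelessly (current milliseconds modulo the
    limit), so unrelated dispatch() calls need no shared state. Tasks that
    land on the same slot serialize behind each other; tasks on different
    slots may run in parallel, up to `max_concurrent` at a time.
    """
    slot = int(time.time() * 1000) % max_concurrent
    return f"throttle:{scope}:{slot}-of-{max_concurrent}"


# Illustrative only: the resulting string would be handed to dispatch()
# alongside the task's normal exclusive resources, e.g.:
#   throttle_resource("domain:acme", 4)  ->  e.g. "throttle:domain:acme:2-of-4"
```

Note that the "idle workers" problem from the list above is visible here: two dispatches that happen to pick the same slot will serialize even if other slots are free, and the caller still has to learn a sensible `max_concurrent` from somewhere (only core knows the current worker count).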
## Ideas from ChatGPT

To address the issue of preventing users from monopolizing all the workers while still ensuring that jobs are handled efficiently, consider implementing the following strategies:

- Job Queue Prioritization:
  - Fair Queuing: Use a fair queuing system where jobs from different users are interleaved to ensure fairness. This prevents any single user from dominating the queue.
  - Priority Queuing: Assign priorities to jobs based on user type or job importance. This ensures critical jobs are processed more quickly.
- Job Rate Limiting:
  - Per-User Limits: Set a maximum number of jobs that a single user can submit within a certain time frame. This prevents any user from flooding the system with jobs.
  - Batch Submission Limits: Limit the number of jobs a user can submit as part of a single operation. For instance, if a user's operation involves N jobs, limit how many of these can be submitted at once.
- Worker Allocation:
  - Dynamic Allocation: Dynamically allocate workers based on the job queue, ensuring that no user gets more than a fair share of workers. For example, if there are 10 workers and 5 users, each user could get up to 2 workers.
  - Reserved Workers: Reserve a portion of workers for high-priority or time-sensitive jobs, ensuring that critical operations are not delayed by large batches of regular jobs.
- Job Scheduling Algorithms:
  - Round-Robin Scheduling: Rotate through users in a round-robin fashion to ensure an even distribution of worker time.
  - Weighted Fair Queuing: Assign weights to users based on their job submission patterns and job sizes, ensuring that heavier users get proportionally less worker time.
- Monitoring and Adjustment:
  - Real-Time Monitoring: Implement monitoring to track job submission rates, worker utilization, and job completion times. This helps in identifying bottlenecks and user behavior patterns.
  - Adaptive Controls: Adjust rate limits, worker allocation, and priorities dynamically based on the monitored data to maintain a balance between fairness and efficiency.
- User Quotas:
  - Quota Management: Implement a quota system where each user has a defined quota of jobs they can submit within a certain period. Users can request quota increases if necessary, subject to approval.
  - Penalty for Overuse: Introduce penalties or throttling for users who exceed their quotas, ensuring that they do not disrupt the system for others.

By combining these strategies, you can create a balanced system that prevents any single user from monopolizing resources while ensuring that all users' jobs are handled efficiently.

## The "Toy Sharing" Scheduler

A kid (user) is playing (dispatching) with as many toys (workers) as they want (and are available) as long as they are the only kid doing so. When another kid comes to play, the first must share the toys, even if they planned to play with them longer (had more queued tasks). The more kids that join, the more the toys must be shared.

* This is an implementation of a dynamic worker-allocation strategy
* Add a second condition to the task-selection algorithm that finds the number of running tasks per user. Choose the next task to better allocate running tasks per user against a set limit (a sketch follows this list).
* This limit could be a static global number, dynamically calculated based on the current number of users with tasks, or set on a per-user basis.
* Users can go over their task limit if there are idle workers and no other waiting users/available tasks in the queue.
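As a rough sketch only: in Pulp the decision would actually live in the worker's unblocked-task query, not in an in-memory list, and `pick_next_task`, the `(task_id, user)` tuples, and the static `per_user_limit` are illustrative assumptions. The worked example that follows plays the same policy out by hand.

```python
from collections import Counter
from typing import Optional, Sequence

# Illustrative task record: (task_id, user) pairs in FIFO dispatch order.
Task = tuple[str, str]


def pick_next_task(
    ready_tasks: Sequence[Task],
    running_by_user: Counter,
    per_user_limit: int,
) -> Optional[Task]:
    """
    "Toy sharing" selection rule (sketch):

    * Walk the ready (unblocked) queue in FIFO order.
    * Skip tasks whose user already has `per_user_limit` tasks running.
    * If every ready task belongs to an at-limit user, fall back to plain
      FIFO so a free worker is never left idle.
    """
    for task in ready_tasks:
        _, user = task
        if running_by_user[user] < per_user_limit:
            return task
    # Everyone with ready work is at their limit; don't idle the worker.
    return ready_tasks[0] if ready_tasks else None


# Example: A already holds 2 workers, so B's task is chosen next.
ready = [("t1", "A"), ("t2", "A"), ("t3", "B")]
print(pick_next_task(ready, Counter({"A": 2}), per_user_limit=2))  # ("t3", "B")
```

Switching to a dynamically calculated limit (for instance, worker-count divided by the number of users with queued tasks) would only change how `per_user_limit` is computed for each selection.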
* Example: the system has 5 available idle workers and each user has a static 2-worker limit.
  User A dispatches 10 sync tasks; User A gets 5 tasks running. User B then dispatches 10 sync tasks; the next two tasks will go to User B. User A will continue with 3 workers for their remaining tasks, since their tasks were first, while User B will continue with a guaranteed 2 workers until User A's tasks are finished; then any extra workers will be available to them. If a third User C comes along and dispatches tasks, they will get the next available worker (from A's latest finished task) and will continue having only that 1 worker until A's or B's tasks are exhausted.

## Thoughts from face-to-face

* heterogeneous kinds of tasks
  * too long for a web-response
  * we know it's going to be long-running (sync)
  * we know it can be immediate/deferred
* proposal
  * tasks can be flagged as "immediate"
  * when a worker becomes available, choose immediate-unblocked tasks ahead of long-running-unblocked ones (see the sketch below)
  * problem: allows tasks that expect ordered-by-resource processing to become unordered
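A minimal sketch of the "immediate" preference from the proposal above, using illustrative names (`QueuedTask`, `pick_next`, `immediate`) rather than existing Pulp code; the ordering concern noted in the last bullet is called out in the comment.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class QueuedTask:
    """Minimal stand-in for an unblocked task row (illustrative only)."""
    task_id: str
    dispatched_at: float     # dispatch timestamp; lower means older
    immediate: bool = False  # the proposed "immediate" flag


def pick_next(unblocked: Sequence[QueuedTask]) -> Optional[QueuedTask]:
    """
    Prefer immediate-flagged unblocked tasks over long-running ones,
    falling back to FIFO within each class.

    The problem called out above is visible here: an immediate task
    dispatched *after* a long-running task can now run *before* it, so
    callers that rely on strict dispatch-order semantics would see
    reordering.
    """
    if not unblocked:
        return None
    return min(unblocked, key=lambda t: (not t.immediate, t.dispatched_at))
```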