On our OpenShift clusters, teams generally use memory and disk storage responsibly. That is, they use what they ask for. When it comes to CPU, however, teams are far less efficient and need support to get back into alignment with our best practices.
Generally, teams misuse CPU for a few key reasons:
1. They misunderstand where to allocate CPU to make their app performant. When this is the case, they allocate too much to `request.cpu`, thinking that will improve application performance - it does not. The only way to make an app performant is to increase `limit.cpu`.
2. They misunderstand what CPU is. Teams misunderstand how CPU works on Linux and Kubernetes: more CPU is thought of as a "buff" that makes the CPU they have more powerful so it can do more work. In reality, it simply provides the container more time to do the work it needs to do.
3. Building on #2 above, they misunderstand how this applies to multithreaded applications, notably those written in .NET or Java.
4. They architected the application like VMs. It scales vertically (one pod that does a lot) rather than horizontally (lots of small pods with a Horizontal Pod Autoscaler (HPA)).
To help teams, the first step is to understand the problem ourselves, then educate them and provide clear steps they can follow for corrective action.
Start by running a PromQL query against the namespace or project to gauge whether the project is using its resources very well, reasonably well, or poorly. This will help you decide whether to rubber-stamp the request or whether it needs further consideration.
The PromQL below will sum up request, limit, and usage for an entire project or namespace. See the pro-tip below on how to change it in meaningful ways.
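A minimal sketch of such a query, built from the recording rules that appear in the results table below (run each line as its own query; exact rule names may vary by cluster version):

```
# CPU usage, requests, and limits, one series per namespace.
namespace:container_cpu_usage:sum
namespace_cpu:kube_pod_container_resource_requests:sum
namespace_cpu:kube_pod_container_resource_limits:sum

# Narrowed to a single namespace, looking back one week:
namespace:container_cpu_usage:sum{namespace="fc726a-dev"} offset 1w
```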
Pro-Tip:
- Use `fc726a-dev` to isolate and look at an individual namespace in the PromQL example above.
- Use `offset 1w` or `offset 2w` to see how the namespace or project has used quota over the past one or two weeks respectively.

| Metric | Value |
|---|---|
| `namespace:container_cpu_usage:sum` | 0.066 |
| `namespace_cpu:kube_pod_container_resource_limits:sum` | 35.5 |
| `namespace_cpu:kube_pod_container_resource_requests:sum` | 14.95 |
Divide `container_cpu_usage` by `kube_pod_container_resource_requests` to determine what percentage of its requested CPU the namespace or project is actually using. In this case, it's 0.066 / 14.95 * 100 = 0.44%. Platform Services recommends teams target 80% utilization, whereas this project is using less than 1% of the CPU it's requesting.
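If you prefer to let Prometheus do the arithmetic, a sketch of the same calculation as a single query (the `on(namespace)` matcher is an assumption and may need adjusting if your series carry extra labels):

```
# Percentage of requested CPU actually in use, per namespace.
# ~80 is the target; values near 0 indicate a lot of slack.
namespace:container_cpu_usage:sum
  / on(namespace) namespace_cpu:kube_pod_container_resource_requests:sum
  * 100
```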
The difference between what the deployments ask for and what the containers use is called "slack". An analogy I often use with teams is an "all you can eat" buffet. When a deployment requests substantially more than it uses, it's analogous to a person filling a heaping plate at a buffet but only eating one or two bites of food. What is left on the plate becomes food waste because it cannot go back on the buffet. Likewise, no other project can utilize the CPU they have tied up as slack - it's wasted.
Once the problem is understood you can dig deeper as needed. The following PromQL will check whether any of the pods are experiencing CPU throttling. Throttling is when a container does not have enough time (remember, we're on a time-share system) to complete its work in the time allocated (100ms blocks).
In the query below, `container_cpu_cfs_periods_total` is the total number of 100ms blocks of time the container has been given since it started; its total run time can be calculated as `container_cpu_cfs_periods_total * 100ms = total-run-time-in-milliseconds`. The other metric, `container_cpu_cfs_throttled_periods_total`, is the number of those blocks of time where the container needed more time to complete its work. The two are summed and divided to express the value for the given namespace as a percentage.
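A sketch of such a throttling query, scoped to this example's namespaces (the namespace regex is a placeholder; adjust it, or add a `pod` matcher, as the pro-tip below describes):

```
# Fraction of 100ms CFS periods in which containers needed more time (were throttled).
sum by (namespace) (container_cpu_cfs_throttled_periods_total{namespace=~"fc726a-.*"})
  /
sum by (namespace) (container_cpu_cfs_periods_total{namespace=~"fc726a-.*"})
```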
Pro-Tip:
- Use `fc726a-dev` to isolate and look at an individual namespace in the PromQL example above.
- Use the `pod` parameter regex to identify a specific group of pods, like `patroni.*`.

| Namespace | Value |
|---|---|
| fc726a-dev | 0.013 |
| fc726a-test | 0.037 |
| fc726a-prod | 0.055 |
In the case of fc726a-prod, the average container did not have enough time to do all its work about 5% of the time. It's not uncommon for this value to be 0%, and even low values are reasonable, but as values creep up past 10% (a heuristic value) it tends to imply a problem with improperly set limits, which may require additional quota.
Once "slack" and "throttling" are understood it is beneficial to check out how the namespace is utilizing exiting quota. In the image below, if this project were asking for additional CPU quota it is worth asking further questions to understand why the team believes additional resources are required.
The section above on slack and throttling provides enough information to go digging for some useful suggestions for the team. The next step is to look through Deployments and StatefulSets to see if any pro-tips can be given to the teams. Generally, look for specific advice such as:
- `request.cpu` that is too high (slack) or `limit.cpu` that is too low (throttling).

Once your analysis is completed, you can either approve the quota request because resources are being used well enough, or because there is not much to be gained by further optimization. If you feel the request for quota is unnecessary:
Email the technical contacts and Cc the PO, outlining your thoughts, such as:
If you do approve the quota request, I generally require the team to demonstrate good resource utilization before approving subsequent requests. For example, if a team requests quota for `dev` and the request is small, it's usually OK and approved on the spot. But to move to the next namespace I'll need to see evidence that additional quota is required. An email like the following helps prepare teams for this so there are no surprises:
When approving disk storage, I generally follow up with an email encouraging teams to grow PVCs as needed rather than allocating large PVCs that hold very little data.
If you reject a disk storage request because the team just made a bunch of large PVCs and has nothing left for new ones, this is what I might say: