# Constraining Akri Containers' Resource Usage

Currently, Akri's components (Controller, Agent, Discovery Handlers) do not specify compute resource (memory/CPU) requests or limits. This means there is no defined bound on the amount of resources Akri may consume.

**Note:** brokers have been excluded from this analysis, since a user can provide any broker image; however, the ability to set broker resource requests and limits should be added to Akri's Helm charts.

This documentation investigates the resource usage of Akri's components and defines requests and limits for each component based on the results. It walks through the steps described in Daz Wilkin's blog, which investigated Akri's resource usage using the Vertical Pod Autoscaler (VPA).

After the first round of investigation, the VPA recommended its default minimum CPU (15m) and memory (100Mi) for all of Akri's components as the lower bound, target, and uncapped target. In response, the VPA's minimums were set to 0 to obtain the subsequent values.

**Note:** minimums were set by passing the following values to the VPA Helm chart maintained by the Fairwinds team:

```yaml
recommender:
  extraArgs:
    pod-recommendation-min-cpu-millicores: 0
    pod-recommendation-min-memory-mb: 0
```
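
For reference, the VPA can produce these recommendations without evicting Pods when its update mode is set to `"Off"`. A minimal `VerticalPodAutoscaler` sketch targeting the Controller is shown below; the Deployment name `akri-controller-deployment` matches the `kubectl top` output later in this document.

```yaml
# Recommendation-only VPA for the Akri Controller: with updateMode
# "Off", the recommender reports a lower bound, target, uncapped
# target, and upper bound without restarting any Pods.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: akri-controller-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: akri-controller-deployment
  updatePolicy:
    updateMode: "Off"
```

An analogous object can target each of the Agent and Discovery Handler DaemonSets.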

## Vertical Pod Autoscaler's resource recommendations

### Summary

The Controller and Agent required slightly more compute resources than the Discovery Handlers.

Controller:

```yaml
resources:
  requests:
    memory: "11Mi"
    cpu: "10m"
  limits:
    memory: "100Mi"
    cpu: "26m"
```

Agent:

```yaml
resources:
  requests:
    memory: "11Mi"
    cpu: "10m"
  limits:
    memory: "79Mi"
    cpu: "26m"
```

All Discovery Handlers:

```yaml
resources:
  requests:
    memory: "11Mi"
    cpu: "10m"
  limits:
    memory: "24Mi"
    cpu: "24m"
```

**Note:** the udev Discovery Handler's recommended upper bound was 1m CPU lower (23m rather than 24m); however, for consistency, all Discovery Handlers will receive the same requests and limits for the time being.
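
For context, each of the `resources` blocks above is added under the corresponding container entry in the component's workload spec (a Deployment for the Controller, DaemonSets for the Agent and Discovery Handlers). A minimal sketch for the Controller follows; the image path is an assumption, and the surrounding Deployment fields are trimmed to the essentials.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: akri-controller-deployment
spec:
  selector:
    matchLabels:
      app: akri-controller
  template:
    metadata:
      labels:
        app: akri-controller
    spec:
      containers:
        - name: akri-controller
          # Image path assumed for illustration.
          image: ghcr.io/project-akri/akri/controller:latest
          # The recommended values land here, per container.
          resources:
            requests:
              memory: "11Mi"
              cpu: "10m"
            limits:
              memory: "100Mi"
              cpu: "26m"
```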

### Change in recommendations over time

The upper bounds of all of the recommendations started high and dropped over time. After letting Akri run overnight on a two-node cluster with all Discovery Handlers and Configurations applied, the recommendations fell, with the upper bounds dropping significantly.

For example, at first the Controller's upper bound was about 2.2 CPU cores and 2.2Gi of memory (the VPA reports memory in bytes):

     "upperBound": {
       "cpu": "2226m",
       "memory": "2327583916"
     }

By the next day, as the VPA gathered more information about usage trends, it had dropped to the following (26m CPU and exactly 100Mi of memory):

     "upperBound": {
       "cpu": "26m",
       "memory": "104857600"
     }

*Figure: VPA Memory Recommendations for Controller*

*Figure: VPA Memory Recommendations for Controller after Initial Spike*

The CPU upper bound recommendations also start extremely high, which may skew the final target and lower bound if those calculations take the upper bound into account. The target and lower bound themselves stay consistently in the 0-3m range, jumping with each restart of the Controller:
*Figure: VPA CPU Recommendations for Controller*

**Note:** the Controller appeared to be restarting due to connection errors to the Kubernetes API bubbling up; however, the Controller handles restarts cleanly.

The bounds that the components settled at by the next day are what will be used; however, using Grafana to chart the change in bounds over time and as devices and nodes are added to the system could lead to a more accurate choice of bounds. All resource requests and limits will be customizable via Helm values.
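
As a sketch of what that customization could look like, the defaults could be overridden in a Helm values file. The value paths below (`controller.resources`, `agent.resources`) are illustrative; the exact names depend on how the charts ultimately expose these settings.

```yaml
# Hypothetical Helm values overriding the default requests/limits;
# value paths are assumed, not Akri's current chart schema.
controller:
  resources:
    requests:
      memory: "11Mi"
      cpu: "10m"
    limits:
      memory: "100Mi"
      cpu: "26m"
agent:
  resources:
    requests:
      memory: "11Mi"
      cpu: "10m"
    limits:
      memory: "79Mi"
      cpu: "26m"
```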

The VPAs, after running for 16 hours, settled on similar recommendations for all of Akri's components (Agent, Controller, Udev Discovery Handler, ONVIF Discovery Handler, OPC UA Discovery Handler, and debug echo Discovery Handler). They are as follows.

### Akri Controller Bounds

```
Container Recommendations:
  Container Name:  akri-controller
  Lower Bound:
    Cpu:     10m
    Memory:  11120035
  Target:
    Cpu:     11m
    Memory:  23574998
  Uncapped Target:
    Cpu:     11m
    Memory:  23574998
  Upper Bound:
    Cpu:     26m
    Memory:  104857600
```

The recommendation suggests a request of 10m Kubernetes CPU units, the equivalent of asking for 0.010 vCPUs/cores, and a little under 11Mi of memory (11120035 bytes ≈ 10.6Mi). The recommendations correlate with the following addition to the Controller Deployment YAML:

```yaml
resources:
  requests:
    memory: "11Mi"
    cpu: "10m"
  limits:
    memory: "100Mi"
    cpu: "26m"
```

### Akri Agent Bounds

```
Container Recommendations:
  Container Name:  akri-agent
  Lower Bound:
    Cpu:     10m
    Memory:  11318184
  Target:
    Cpu:     11m
    Memory:  23574998
  Uncapped Target:
    Cpu:     11m
    Memory:  23574998
  Upper Bound:
    Cpu:     26m
    Memory:  82840000
```

The source of the memory values can be seen on the following graph. Note that actual usage was always below all recommendation types:
*Figure: VPA Memory Recommendations for Agent*

The actual CPU usage is also lower than all recommendations:
*Figure: VPA CPU Recommendations for Agent*

The recommendation suggests a request of 10m Kubernetes CPU units, the equivalent of asking for 0.010 vCPUs/cores, and ~11Mi of memory. The recommendations correlate with the following addition to the Agent DaemonSet YAML:

```yaml
resources:
  requests:
    memory: "11Mi"
    cpu: "10m"
  limits:
    memory: "79Mi"
    cpu: "26m"
```

### Akri Discovery Handlers

#### Udev

```
Container Recommendations:
  Container Name:  akri-udev-discovery
  Lower Bound:
    Cpu:     10m
    Memory:  11572964
  Target:
    Cpu:     11m
    Memory:  11500000
  Uncapped Target:
    Cpu:     11m
    Memory:  11500000
  Upper Bound:
    Cpu:     23m
    Memory:  25041396
```

The source of the memory values can be seen on the following graph. Note that actual usage was always below all recommendation types:
*Figure: VPA Memory Recommendations for Udev Discovery Handler*

The actual CPU usage is also lower than all recommendations:
*Figure: VPA CPU Recommendations for Udev Discovery Handler*

The recommendation suggests a request of 10m Kubernetes CPU units, the equivalent of asking for 0.010 vCPUs/cores, and ~11Mi of memory. The recommendations correlate with the following addition to the Udev Discovery Handler DaemonSet YAML:

```yaml
resources:
  requests:
    memory: "11Mi"
    cpu: "10m"
  limits:
    memory: "24Mi"
    cpu: "23m"
```

#### ONVIF

```
Container Recommendations:
  Container Name:  akri-onvif-discovery
  Lower Bound:
    Cpu:     10m
    Memory:  11473094
  Target:
    Cpu:     11m
    Memory:  11500000
  Uncapped Target:
    Cpu:     11m
    Memory:  11500000
  Upper Bound:
    Cpu:     24m
    Memory:  24976379
```

The source of the memory values can be seen on the following graph. Note that actual usage was always below all recommendation types:
*Figure: VPA Memory Recommendations for ONVIF Discovery Handler*

The actual CPU usage is also lower than all recommendations:
*Figure: VPA CPU Recommendations for ONVIF Discovery Handler*

The recommendation suggests a request of 10m Kubernetes CPU units, the equivalent of asking for 0.010 vCPUs/cores, and ~11Mi of memory. The recommendations correlate with the following addition to the ONVIF Discovery Handler DaemonSet YAML:

```yaml
resources:
  requests:
    memory: "11Mi"
    cpu: "10m"
  limits:
    memory: "24Mi"
    cpu: "24m"
```

#### OPC UA

```
Container Recommendations:
  Container Name:  akri-opcua-discovery
  Lower Bound:
    Cpu:     10m
    Memory:  11472167
  Target:
    Cpu:     11m
    Memory:  11500000
  Uncapped Target:
    Cpu:     11m
    Memory:  11500000
  Upper Bound:
    Cpu:     24m
    Memory:  25441349
```

The source of the memory values can be seen on the following graph. Note that actual usage was always below all recommendation types:
*Figure: VPA Memory Recommendations for OPC UA Discovery Handler*

The actual CPU usage is also lower than all recommendations:
*Figure: VPA CPU Recommendations for OPC UA Discovery Handler*

The recommendation suggests a request of 10m Kubernetes CPU units, the equivalent of asking for 0.010 vCPUs/cores, and ~11Mi of memory. The recommendations correlate with the following addition to the OPC UA Discovery Handler DaemonSet YAML:

```yaml
resources:
  requests:
    memory: "11Mi"
    cpu: "10m"
  limits:
    memory: "24Mi"
    cpu: "24m"
```

#### Debug Echo

```
Container Recommendations:
  Container Name:  akri-debug-echo-discovery
  Lower Bound:
    Cpu:     10m
    Memory:  11472285
  Target:
    Cpu:     11m
    Memory:  11500000
  Uncapped Target:
    Cpu:     11m
    Memory:  11500000
  Upper Bound:
    Cpu:     24m
    Memory:  25382523
```

The source of the memory values can be seen on the following graph. Note that actual usage was always below all recommendation types:
*Figure: VPA Memory Recommendations for Debug Echo Discovery Handler*

The actual CPU usage is also lower than all recommendations:
*Figure: VPA CPU Recommendations for Debug Echo Discovery Handler*

The recommendation suggests a request of 10m Kubernetes CPU units, the equivalent of asking for 0.010 vCPUs/cores, and ~11Mi of memory. The recommendations correlate with the following addition to the Debug Echo Discovery Handler DaemonSet YAML:

```yaml
resources:
  requests:
    memory: "11Mi"
    cpu: "10m"
  limits:
    memory: "24Mi"
    cpu: "24m"
```

## Akri Sample Brokers

### Kubectl top

Current usage of the brokers:

```
NAME                                            CPU(cores)   MEMORY(bytes)
debug-echo-broker-deployment-84bfc77467-86gc5   0m           8Mi
debug-echo-broker-deployment-84bfc77467-cplpj   0m           8Mi
debug-echo-broker-deployment-84bfc77467-f698d   0m           8Mi
onvif-broker-deployment-86685869b6-426bb        898m         111Mi
onvif-broker-deployment-86685869b6-cnvw9        897m         117Mi
onvif-broker-deployment-86685869b6-wht77        900m         119Mi
opcua-broker-deployment-64d8647f7-6hp4r         3m           72Mi
opcua-broker-deployment-64d8647f7-9bqdb         2m           72Mi
opcua-broker-deployment-64d8647f7-zcfs5         2m           72Mi
udev-broker-deployment-599c8665bf-6hxkj         0m           2Mi
udev-broker-deployment-599c8665bf-pdq4g         0m           2Mi
udev-broker-deployment-599c8665bf-sqpkb         0m           2Mi
```

*Figure: Broker Pod VPA Analysis Grafana Dashboard*

**Note:** the OPC UA broker was deployed long after the others.
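
Until broker requests and limits are exposed through Akri's Helm charts, they can in principle be set directly in a Configuration, since a broker Pod spec is a standard Kubernetes PodSpec. The sketch below assumes the `brokerPodSpec` field name (which may differ across Akri versions) and an image path; the resource values are illustrative, simply adding headroom over the ONVIF broker usage observed above.

```yaml
# Illustrative only: the brokerPodSpec field name and the image path
# are assumptions; the resource values add headroom over the observed
# ~900m CPU / ~119Mi memory usage of the ONVIF broker.
apiVersion: akri.sh/v0
kind: Configuration
metadata:
  name: akri-onvif
spec:
  discoveryHandler:
    name: onvif
    discoveryDetails: ""
  brokerPodSpec:
    containers:
      - name: akri-onvif-broker
        image: ghcr.io/project-akri/akri/onvif-video-broker:latest
        resources:
          requests:
            memory: "128Mi"
            cpu: "900m"
          limits:
            memory: "256Mi"
            cpu: "1000m"
```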

## Kubectl top

To check current resource usage, use `kubectl top`, which requires the cluster's metrics API (e.g. metrics-server) to be available. For example, `kubectl top pods` shows:

```
NAME                                          CPU(cores)   MEMORY(bytes)
akri-agent-daemonset-7p9p5                    0m           6Mi
akri-agent-daemonset-8hsrg                    3m           10Mi
akri-controller-deployment-85f76c5c8b-gn54p   0m           5Mi
akri-debug-echo-discovery-daemonset-4ttsd     0m           2Mi
akri-debug-echo-discovery-daemonset-nfsch     0m           2Mi
akri-onvif-discovery-daemonset-fghzr          0m           4Mi
akri-onvif-discovery-daemonset-xlt2f          0m           3Mi
akri-opcua-discovery-daemonset-9bvh8          0m           2Mi
akri-opcua-discovery-daemonset-wxl6c          0m           2Mi
akri-udev-discovery-daemonset-cjbqv           0m           2Mi
akri-udev-discovery-daemonset-nqmxq           0m           2Mi
```