##### tags: `Kubernetes`
# KEDA + HPA
Consider a distributed producer-consumer pattern in which messages are independent of one another. The consumer is implemented as a `deployment` running in k8s, and its auto scaling is designed with KEDA + HPA.
## HPA
> https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/
k8s natively supports auto scaling based on cpu/memory, but that does not fit this scenario. The HPA section therefore skips the questions of `when to scale` and `how many to scale` and focuses on `how to scale` first.
### scaling policy
> https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#scaling-policies
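Two policy types can be configured, `Pods` and `Percent`. When several policies are defined, `selectPolicy` decides whether the one allowing the largest change (`Max`, the default) or the smallest change (`Min`) applies. With the configuration below, a scale down removes at most two pods per 10-second period (`periodSeconds`), and the HPA uses the highest replica recommendation from the last 20 seconds (`stabilizationWindowSeconds`) before it scales down.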
```yaml=
behavior:
  scaleDown:
    stabilizationWindowSeconds: 20
    policies:
    - type: Pods
      value: 2
      periodSeconds: 10
```
```plantuml
@startuml
scale 5 as 150 pixels
concise "resource evaluate" as eva
concise "calculated replica" as cal_rep
concise "desired replica" as des_rep
@4 as :suf_start
@:suf_start+10 as :suf_end
@0
eva is undecided
cal_rep is 4
des_rep is 4
@:suf_start
eva is sufficient
@:suf_end
eva is undecided
cal_rep is 2
@:suf_end+6
eva is sufficient
@:suf_end+9
eva is undecided
@:suf_end+20
des_rep is 2
cal_rep@:suf_end -> des_rep@:suf_end+20 : take effect
highlight :suf_start to :suf_end #yellow: periodSeconds
highlight :suf_end to :suf_end+20 #lightyellow: stabilizationWindowSeconds
@enduml
```
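The effect of the policy types and `selectPolicy` on a single scale-down step can be sketched in Python. This is a simplified illustration, not the HPA implementation: the function name `scale_down_floor` is hypothetical, it assumes no replica changes have already happened inside the policy period, and it ignores the stabilization window.

```python
def scale_down_floor(current_replicas, policies, select_policy="Max"):
    """Lowest replica count one scale-down step may reach (simplified sketch).

    Assumption: no scale events occurred during the policy period, so the
    replica count at the start of the period equals current_replicas.
    """
    if select_policy == "Disabled":
        return current_replicas  # scaling down is switched off
    if not policies:
        raise ValueError("at least one policy is required in this sketch")

    floors = []
    for policy in policies:
        if policy["type"] == "Pods":
            # Remove at most `value` pods per period.
            floor = current_replicas - policy["value"]
        elif policy["type"] == "Percent":
            # Remove at most `value` percent of the pods per period.
            floor = int(current_replicas * (1 - policy["value"] / 100))
        else:
            raise ValueError(f"unknown policy type: {policy['type']}")
        floors.append(max(floor, 0))

    # "Max" picks the policy allowing the largest change (lowest floor),
    # "Min" picks the policy allowing the smallest change (highest floor).
    return min(floors) if select_policy == "Max" else max(floors)


pods_policy = [{"type": "Pods", "value": 2, "periodSeconds": 10}]
print(scale_down_floor(4, pods_policy))  # 2
```

With the single `Pods` policy from the configuration above, four replicas can drop to two in one period, which matches the 4 to 2 transition drawn in the timing diagram.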
## KEDA
> https://keda.sh/docs/2.5/concepts/scaling-deployments/
```yaml=
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: {scaled-object-name}
spec:
  scaleTargetRef:
    apiVersion: {api-version-of-target-resource} # Optional. Default: apps/v1
    kind: {kind-of-target-resource} # Optional. Default: Deployment
    name: {name-of-target-resource} # Mandatory. Must be in the same namespace as the ScaledObject
    envSourceContainerName: {container-name} # Optional. Default: .spec.template.spec.containers[0]
  pollingInterval: 30 # Optional. Default: 30 seconds
  cooldownPeriod: 300 # Optional. Default: 300 seconds
  idleReplicaCount: 0 # Optional. Must be less than minReplicaCount
  minReplicaCount: 1 # Optional. Default: 0
  maxReplicaCount: 100 # Optional. Default: 100
  fallback: # Optional. Section to specify fallback options
    failureThreshold: 3 # Mandatory if fallback section is included
    replicas: 6 # Mandatory if fallback section is included
  advanced: # Optional. Section to specify advanced options
    restoreToOriginalReplicaCount: true/false # Optional. Default: false
    horizontalPodAutoscalerConfig: # Optional. Section to specify HPA related options
      behavior: # Optional. Use to modify HPA's scaling behavior
        scaleDown:
          stabilizationWindowSeconds: 300
          policies:
          - type: Percent
            value: 100
            periodSeconds: 15
  triggers:
  # {list of triggers to activate scaling of the target resource}
```
The first field to discuss is `pollingInterval`.
### pollingInterval
> This is the interval to check each trigger on. By default KEDA will check each trigger source on every ScaledObject every 30 seconds.
>
> Example: in a queue scenario, KEDA will check the queueLength every pollingInterval, and scale the resource up or down accordingly.
This parameter determines how often the triggers are checked.
Consider a queue scenario: KEDA checks the queueLength, so the second field to discuss is `queueLength`.
### queueLength
> Target value for queue length passed to the scaler. Example: if one pod can handle 10 messages, set the queue length target to 10. If the actual number of messages in the queue is 30, the scaler scales to 3 pods. (Default: 5, Optional)
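To understand queueLength, first note that a queue is a dynamic system: messages are continuously produced and consumed, and the consumption rate is proportional to the number of pods. Under that premise, `the actual number of messages in the queue` represents a difference: the gap between the message production rate and the consumption rate or, more precisely, that rate gap multiplied by `pollingInterval`. Taken together with pollingInterval, the goal of a KEDA scale out can be read as consuming the surplus messages within one pollingInterval. The number of pods is the message number divided by queueLength, which means a single pod must have the capacity to consume queueLength messages within one pollingInterval.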
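The relation between the rate gap, `pollingInterval`, and per-pod capacity can be worked through with a few numbers. The sketch below is only an illustration of the reasoning in this note: the function names and the rates are made up, and it assumes the queue is empty at the start of the interval and that rates stay constant during it.

```python
def backlog_after_interval(produce_rate, consume_rate_per_pod, pods, polling_interval):
    """Messages left in the queue after one polling interval.

    Assumptions: the queue starts empty and all rates (messages per second)
    are constant during the interval.
    """
    rate_gap = produce_rate - consume_rate_per_pod * pods
    # A negative gap means the pods keep up, so nothing accumulates.
    return max(0.0, rate_gap * polling_interval)


def per_pod_capacity(consume_rate_per_pod, polling_interval):
    """Messages one pod can consume within one polling interval.

    Under the reading in this note, this is the value to use as queueLength.
    """
    return consume_rate_per_pod * polling_interval


backlog = backlog_after_interval(
    produce_rate=5.0, consume_rate_per_pod=1.0, pods=2, polling_interval=30
)
queue_length = per_pod_capacity(consume_rate_per_pod=1.0, polling_interval=30)
print(backlog, queue_length)  # 90.0 30.0
```

Here producers outpace two pods by 3 messages per second, so 90 messages are waiting when KEDA polls after 30 seconds, while one pod can consume 30 messages in that same interval.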
With queueLength and pollingInterval understood, we can return to the questions skipped earlier:
### when to scale
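When the measured queue length is `> 0` (messages are waiting), capacity is insufficient and a scale out is needed.
When the measured queue length is `= 0`, capacity may be excessive and a scale in is worth doing.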
### how many to scale
Number of pods = message number divided by queueLength.
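A minimal Python sketch of this calculation follows. The function name `desired_pods` is hypothetical. Rounding up and clamping to `minReplicaCount`/`maxReplicaCount` are assumptions added for illustration; the default bounds of 0 and 100 are taken from the ScaledObject spec above.

```python
import math


def desired_pods(message_count, queue_length, min_replicas=0, max_replicas=100):
    """Pod count implied by 'message number divided by queueLength'.

    Assumptions: fractional results are rounded up, and the result is
    clamped to the [min_replicas, max_replicas] range.
    """
    if queue_length <= 0:
        raise ValueError("queue_length must be positive")
    raw = math.ceil(message_count / queue_length)
    return max(min_replicas, min(max_replicas, raw))


print(desired_pods(30, 10))  # 3
```

The call reproduces the example from the KEDA description quoted earlier: 30 messages with a queueLength of 10 give 3 pods. An empty queue gives `min_replicas`, which corresponds to the scale-in case.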