##### tags: `Kubernetes`

# KEDA + HPA

Consider a distributed producer-consumer pattern in which messages are independent of one another. The consumer is implemented as a `deployment` running in k8s, and auto scaling is designed with KEDA + HPA.

## HPA
> https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/

What k8s supports natively is auto scaling on cpu/memory, which does not fit this scenario. The HPA section therefore leaves aside the questions of `when to scale` and `how many to scale` and focuses first on `how to scale`.

### scaling policy
> https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#scaling-policies

Two policy types are available, `Pods` and `Percent`. When several policies apply at the same time, `selectPolicy` chooses whether the one allowing the largest change (`Max`, the default) or the smallest change (`Min`) wins. The configuration below allows at most two pods to be removed in any 10-second period. Because of `stabilizationWindowSeconds: 20`, the HPA also acts on the highest replica recommendation from the previous 20 seconds, so a lower replica count takes effect only after it has held for that long.

```yaml=
behavior:
  scaleDown:
    stabilizationWindowSeconds: 20
    policies:
    - type: Pods
      value: 2
      periodSeconds: 10
```

```plantuml
@startuml
scale 5 as 150 pixels
concise "resource evaluate" as eva
concise "calculated replica" as cal_rep
concise "desired replica" as des_rep

@4 as :suf_start
@:suf_start+10 as :suf_end

@0
eva is undecided
cal_rep is 4
des_rep is 4

@:suf_start
eva is sufficient

@:suf_end
eva is undecided
cal_rep is 2

@:suf_end+6
eva is sufficient

@:suf_end+9
eva is undecided

@:suf_end+20
des_rep is 2

cal_rep@:suf_end -> des_rep@:suf_end+20 : take effect

highlight :suf_start to :suf_end #yellow: periodSeconds
highlight :suf_end to :suf_end+20 #lightyellow: stabilizationWindowSeconds
@enduml
```

## KEDA
> https://keda.sh/docs/2.5/concepts/scaling-deployments/

```yaml=
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: {scaled-object-name}
spec:
  scaleTargetRef:
    apiVersion:    {api-version-of-target-resource}  # Optional. Default: apps/v1
    kind:          {kind-of-target-resource}         # Optional. Default: Deployment
    name:          {name-of-target-resource}         # Mandatory. Must be in the same namespace as the ScaledObject
    envSourceContainerName: {container-name}         # Optional. Default: .spec.template.spec.containers[0]
  pollingInterval:  30                               # Optional. Default: 30 seconds
  cooldownPeriod:   300                              # Optional. Default: 300 seconds
  idleReplicaCount: 0                                # Optional. Must be less than minReplicaCount
  minReplicaCount:  1                                # Optional. Default: 0
  maxReplicaCount:  100                              # Optional. Default: 100
  fallback:                                          # Optional. Section to specify fallback options
    failureThreshold: 3                              # Mandatory if fallback section is included
    replicas: 6                                      # Mandatory if fallback section is included
  advanced:                                          # Optional. Section to specify advanced options
    restoreToOriginalReplicaCount: true/false        # Optional. Default: false
    horizontalPodAutoscalerConfig:                   # Optional. Section to specify HPA related options
      behavior:                                      # Optional. Use to modify HPA's scaling behavior
        scaleDown:
          stabilizationWindowSeconds: 300
          policies:
          - type: Percent
            value: 100
            periodSeconds: 15
  triggers:
  # {list of triggers to activate scaling of the target resource}
```

The first parameter to discuss is `pollingInterval`.

### pollingInterval

> This is the interval to check each trigger on. By default KEDA will check each trigger source on every ScaledObject every 30 seconds.
>
> Example: in a queue scenario, KEDA will check the queueLength every pollingInterval, and scale the resource up or down accordingly.

This parameter determines how often the triggers are checked.

In a queue scenario, KEDA checks the queueLength, so the second parameter to discuss is `queueLength`.

### queueLength

> Target value for queue length passed to the scaler. Example: if one pod can handle 10 messages, set the queue length target to 10. If the actual number of messages in the queue is 30, the scaler scales to 3 pods. (Default: 5, Optional)

To understand queueLength, first recognize that a queue is a dynamic system: messages are continuously produced and continuously consumed, and the consumption rate is proportional to the number of pods. Under that premise, `the actual number of messages in the queue` represents a `difference`: the gap between the message production rate and the consumption rate. More precisely, it is that `rate difference` multiplied by `pollingInterval`, which is the backlog accumulated since the previous check if the queue was empty at that point.

Combined with pollingInterval, the goal of KEDA's scale out becomes clear: bring up enough pods to consume the surplus messages within one pollingInterval. The number of pods is the message count divided by queueLength, which means a single pod must have the capacity to consume queueLength messages within one pollingInterval.

With queueLength and pollingInterval understood, we can return to the questions skipped earlier:

### when to scale

When the actual number of messages in the queue is greater than 0, capacity is insufficient and a scale out is needed.

When the actual number of messages in the queue is 0, capacity may be excessive and a scale in is worth doing.

### how many to scale

number of pods = message count divided by queueLength, rounded up
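
The replica calculation above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not KEDA's or the HPA's actual code: the function name `desired_replicas` and its parameters are hypothetical, the rounding up and the clamping to `minReplicaCount`/`maxReplicaCount` are assumed to behave as described in this note, and the sketch ignores the HPA tolerance, stabilization windows, and `idleReplicaCount`.

```python
import math


def desired_replicas(message_count, queue_length, min_replicas=0, max_replicas=100):
    """Hypothetical helper: pods needed for a backlog of message_count messages.

    queue_length is the per-pod target (the scaler's queueLength setting).
    """
    if queue_length <= 0:
        raise ValueError("queue_length must be positive")
    # Round up so that a partial batch of messages still gets its own pod.
    wanted = math.ceil(message_count / queue_length)
    # Clamp to the configured bounds (minReplicaCount / maxReplicaCount).
    return max(min_replicas, min(max_replicas, wanted))


print(desired_replicas(30, 10))                  # 3, the example from the quote
print(desired_replicas(31, 10))                  # 4, one extra message needs one more pod
print(desired_replicas(0, 10, min_replicas=1))   # 1, never below minReplicaCount
print(desired_replicas(5000, 10))                # 100, capped at maxReplicaCount
```

With `queueLength: 10`, a backlog of 30 messages maps to 3 pods, which matches the example in the scaler documentation quoted above.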