[Cloud] K8s / Operator / Helm-based
===
###### tags: `Cloud`, `K8s`, `operator`, `helm`
<br>

[TOC]
<br>
## [Helm-based Operator](https://sdk.operatorframework.io/docs/building-operators/helm/)
> Guide to building a Helm-based Operator using the Operator SDK
<br>
<hr>
<br>
## [[Prerequisites] Install operator-sdk](https://sdk.operatorframework.io/docs/installation/install-operator-sdk/)
- ### Main flow: install the following 3 binaries
- operator-sdk (size: 63M) (only this one is needed at first)
- ansible-operator (size: 46M)
- helm-operator (size: 54M)
- ### Basic environment checks
- docker version 17.03+
```
$ docker -v
Docker version 19.03.4, build 9013bf583a
```
- kubectl version v1.11.3+
```
$ kubelet --version
Kubernetes v1.19.3
```
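A minimal sketch of checking the kubectl client directly (the snippet above checks the node's kubelet, which serves as a proxy; `--client` is a standard kubectl flag):
```bash
# Check the kubectl client version (what the v1.11.3+ requirement refers to)
$ kubectl version --client
```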
- ### Download the binaries (per the docs)
- Install from GitHub release
```bash
# Set the release version to use
$ RELEASE_VERSION=v1.2.0
# Download the 3 binaries
# Linux
$ curl -LO https://github.com/operator-framework/operator-sdk/releases/download/${RELEASE_VERSION}/operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu
$ curl -LO https://github.com/operator-framework/operator-sdk/releases/download/${RELEASE_VERSION}/ansible-operator-${RELEASE_VERSION}-x86_64-linux-gnu
$ curl -LO https://github.com/operator-framework/operator-sdk/releases/download/${RELEASE_VERSION}/helm-operator-${RELEASE_VERSION}-x86_64-linux-gnu
# Download the 3 signature files
# Linux
$ curl -LO https://github.com/operator-framework/operator-sdk/releases/download/${RELEASE_VERSION}/operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu.asc
$ curl -LO https://github.com/operator-framework/operator-sdk/releases/download/${RELEASE_VERSION}/ansible-operator-${RELEASE_VERSION}-x86_64-linux-gnu.asc
$ curl -LO https://github.com/operator-framework/operator-sdk/releases/download/${RELEASE_VERSION}/helm-operator-${RELEASE_VERSION}-x86_64-linux-gnu.asc
```
After downloading, there will be 6 files:
```
$ tree -h
.
├── [ 45M] ansible-operator-v1.2.0-x86_64-linux-gnu
├── [ 589] ansible-operator-v1.2.0-x86_64-linux-gnu.asc
├── [ 53M] helm-operator-v1.2.0-x86_64-linux-gnu
├── [ 589] helm-operator-v1.2.0-x86_64-linux-gnu.asc
├── [ 63M] operator-sdk-v1.2.0-x86_64-linux-gnu
└── [ 589] operator-sdk-v1.2.0-x86_64-linux-gnu.asc
```
- In practice you do not strictly need to verify the sources...
- Official release artifacts are usually fine
- Downloading just operator-sdk-v1.2.0-x86_64-linux-gnu is enough
<br>
- Troubleshooting: No public key
```bash
$ gpg --verify operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu.asc
gpg: assuming signed data in 'operator-sdk-v1.2.0-x86_64-linux-gnu'
gpg: Signature made Wed Nov 11 07:54:26 2020 CST
gpg: using RSA key 0CF50BEE7E4DF6445E08C0EA9AFDE59E90D2B445
gpg: issuer "joe.lanford@gmail.com"
gpg: Cannot check signature: No public key
# To download the key, use the following command,
# replacing $KEY_ID with the RSA key string
# provided in the output of the previous command:
$ KEY_ID="0CF50BEE7E4DF6445E08C0EA9AFDE59E90D2B445"
$ gpg --recv-key "$KEY_ID"
gpg: keyserver receive failed: General error
# You’ll need to specify a key server if one hasn’t been configured.
$ gpg --keyserver keyserver.ubuntu.com --recv-key "$KEY_ID"
gpg: /home/ubuntu/.gnupg/trustdb.gpg: trustdb created
gpg: key 9AFDE59E90D2B445: public key "Joe Lanford <joe.lanford@gmail.com>" imported
gpg: Total number processed: 1
gpg: imported: 1
# Verifying again now succeeds
$ gpg --verify operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu.asc
gpg: assuming signed data in 'operator-sdk-v1.2.0-x86_64-linux-gnu'
gpg: Signature made Wed Nov 11 07:54:26 2020 CST
gpg: using RSA key 0CF50BEE7E4DF6445E08C0EA9AFDE59E90D2B445
gpg: issuer "joe.lanford@gmail.com"
gpg: Good signature from "Joe Lanford <joe.lanford@gmail.com>" [unknown]
gpg: WARNING: This key is not certified with a trusted signature!
gpg: There is no indication that the signature belongs to the owner.
Primary key fingerprint: 0CF5 0BEE 7E4D F644 5E08 C0EA 9AFD E59E 90D2 B445
```
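As a convenience, all three signatures can be checked in one loop (a small sketch; gpg infers each data file from the matching `.asc` name):
```bash
# Verify all three downloaded binaries against their detached signatures
for f in operator-sdk ansible-operator helm-operator; do
  gpg --verify "${f}-${RELEASE_VERSION}-x86_64-linux-gnu.asc"
done
```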
- ### Install operator-sdk (per the docs)
```
$ chmod +x operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu && sudo mkdir -p /usr/local/bin/ && sudo cp operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu /usr/local/bin/operator-sdk && rm operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu
```
- Main steps:
- Make the downloaded operator-sdk file executable
```
$ chmod +x operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu
```
- Copy the operator-sdk executable into the /usr/local/bin/ directory
```
$ sudo mkdir -p /usr/local/bin/
$ sudo cp operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu /usr/local/bin/operator-sdk
```
- Delete the downloaded file
```
$ rm operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu
```
- Quick test of operator-sdk
```
$ operator-sdk version
operator-sdk version: "v1.2.0", commit: "215fc50b2d4acc7d92b36828f42d7d1ae212015c", kubernetes version: "v1.18.8", go version: "go1.15.3", GOOS: "linux", GOARCH: "amd64"
```
<br>
<hr>
<br>
## [Quickstart for Helm-based Operators](https://sdk.operatorframework.io/docs/building-operators/helm/quickstart/)
> A simple set of instructions that demonstrates the basics of setting up and running a Helm-based operator.
### [STEP-0] Getting started
- Topic:
- Build an nginx-operator
- Notes:
- nginx is used by default
> because the default scaffold produced by the `helm create` command is nginx on port 80
<br>
### [STEP-1] Create a working directory and scaffold a Helm-based operator project
- Commands
```bash
$ mkdir nginx-operator
$ cd nginx-operator
$ operator-sdk init --plugins=helm
```
- Inspect the generated files and directories
```
├── Dockerfile <--- Helm-based Dockerfile
├── Makefile <--- Helm-based Makefile
├── PROJECT <--- Helm-based PROJECT
├── config
│ ├── default
│ │ ├── kustomization.yaml
│ │ └── manager_auth_proxy_patch.yaml
│ ├── manager
│ │ ├── kustomization.yaml
│ │ └── manager.yaml
│ ├── prometheus
│ │ ├── kustomization.yaml
│ │ └── monitor.yaml
│ ├── rbac
│ │ ├── auth_proxy_client_clusterrole.yaml
│ │ ├── auth_proxy_role.yaml
│ │ ├── auth_proxy_role_binding.yaml
│ │ ├── auth_proxy_service.yaml
│ │ ├── kustomization.yaml
│ │ ├── leader_election_role.yaml
│ │ ├── leader_election_role_binding.yaml
│ │ ├── role.yaml
│ │ └── role_binding.yaml
│ └── scorecard
│ ├── bases
│ │ └── config.yaml
│ ├── kustomization.yaml
│ └── patches
│ ├── basic.config.yaml
│ └── olm.config.yaml
├── helm-charts
└── watches.yaml <--- Helm-based watches file
9 directories, 23 files
```
```docker
$ cat Dockerfile
# Build the manager binary
FROM quay.io/operator-framework/helm-operator:v1.2.0
ENV HOME=/opt/helm
COPY watches.yaml ${HOME}/watches.yaml
COPY helm-charts ${HOME}/helm-charts
WORKDIR ${HOME}
```
<br>
### [STEP-2] Create the API for a new resource
> Create a new K8s resource (a custom resource); this new resource is called Nginx
- Command
```bash
$ operator-sdk create api --group demo1 --version v1 --kind Nginx
```
- group & version are used in the apiVersion field
```
apiVersion: group[.domain]/version
```
- Inspect the actual resource settings:
```
$ cat PROJECT
domain: my.domain
layout: helm.sdk.operatorframework.io/v1
projectName: nginx-operator
resources:
- group: demo1
kind: Nginx
version: v1
version: 3-alpha
```
```
$ cat watches.yaml
# Use the 'create api' subcommand to add watches to this file.
- group: demo1.my.domain
version: v1
kind: Nginx
chart: helm-charts/nginx
# +kubebuilder:scaffold:watch
```
- **Parameters in play**
- domain: my.domain
- group: demo1
- version: v1
- kind: Nginx<br><br>
- **The apiVersion corresponding to the Nginx resource**
```apiVersion: demo1.my.domain/v1```
<br>
- **The HTTP path corresponding to the Nginx resource**
- Template (named API groups live under /apis, and the path uses the lowercase plural resource name):
```
/apis/GROUP/VERSION/namespaces/NAMESPACE/RESOURCE/NAME
```
- Filled in, this should be (see the query sketch at the end of this step)
```
/apis/demo1.my.domain/v1/namespaces/NAMESPACE/nginxes/NAME
```
- Newly added files
```
├── Dockerfile (old)
├── Makefile (old)
├── PROJECT (old)
├── config
│ ├── crd (new)
│ │ ├── bases
│ │ │ └── demo1.my.domain_nginxes.yaml
│ │ └── kustomization.yaml
│ ├── ...
│ ├── samples (new)
│ │ ├── demo1_v1_nginx.yaml
│ │ └── kustomization.yaml
│ └── ...
├── helm-charts (new)
│ └── nginx
│ ├── Chart.yaml
│ ├── charts
│ ├── templates
│ │ ├── NOTES.txt
│ │ ├── _helpers.tpl
│ │ ├── deployment.yaml
│ │ ├── hpa.yaml
│ │ ├── ingress.yaml
│ │ ├── service.yaml
│ │ ├── serviceaccount.yaml
│ │ └── tests
│ │ └── test-connection.yaml
│ └── values.yaml
```
- For the purpose/meaning of the helm-charts directory layout, see [Project Layout of Helm-based Operators](https://sdk.operatorframework.io/docs/building-operators/helm/reference/project_layout/)
- The Custom Resource Definition (CRD): ```config/crd/bases/demo1.my.domain_nginxes.yaml```
```yaml
---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
name: nginxes.demo1.my.domain
spec:
group: demo1.my.domain
names:
kind: Nginx
listKind: NginxList
plural: nginxes
singular: nginx
scope: Namespaced
versions:
- name: v1
schema:
openAPIV3Schema:
description: Nginx is the Schema for the nginxes API
properties:
apiVersion:
description: 'APIVersion defines the versioned schema of this representation
of an object. Servers should convert recognized schemas to the latest
internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
type: string
kind:
description: 'Kind is a string value representing the REST resource this
object represents. Servers may infer this from the endpoint the client
submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
type: string
metadata:
type: object
spec:
description: Spec defines the desired state of Nginx
type: object
x-kubernetes-preserve-unknown-fields: true
status:
description: Status defines the observed state of Nginx
type: object
x-kubernetes-preserve-unknown-fields: true
type: object
served: true
storage: true
subresources:
status: {}
```
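Once this CRD is installed (STEP-4 below), the HTTP path from earlier can be exercised directly. A small sketch using `kubectl get --raw` (the `default` namespace is an assumption):
```bash
# List Nginx objects through the raw REST path of the demo1.my.domain/v1 group
$ kubectl get --raw /apis/demo1.my.domain/v1/namespaces/default/nginxes
```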
<br>
### [STEP-3] Build and push the Nginx-Operator image
> Package the Helm chart resources into an image
> When the Nginx resource is used later, this image is what gets deployed
>
> Note:
> this process pushes the image to a registry for staging,
> so it can be pulled from the registry whenever the Nginx resource is used
- Command (before fixing)
```bash
$ sudo make docker-build docker-push IMG=nginx
docker build . -t nginx
...
denied: requested access to the resource is denied
Makefile:46: recipe for target 'docker-push' failed
make: *** [docker-push] Error 1
```
The registry can be a local one; here is how to run your own registry:
```bash
# Pull the docker image of registry
# "2" is a specific version tag
docker pull registry:2
# Start the registry container
docker run -d -p 5000:5000 --name registry registry:2
# Test the registry (the v2 API base endpoint is /v2/)
curl -X GET 127.0.0.1:5000/v2/
# Pull from the public registry and push to the local one
docker pull nginx
docker tag nginx 127.0.0.1:5000/nginx_local
docker push 127.0.0.1:5000/nginx_local
# Inspect the registry
curl -X GET 127.0.0.1:5000/v2/_catalog
# {"repositories":["nginx_local"]}
curl -X GET http://127.0.0.1:5000/v2/nginx_local/manifests/latest
{
"schemaVersion": 1,
"name": "nginx_local",
"tag": "latest",
...
```
- Command (fixed)
```bash
$ make docker-build docker-push IMG=127.0.0.1:5000/nginx_local:latest
```
- List the docker images
```bash
$ docker images | grep -i nginx_local
REPOSITORY TAG IMAGE ID CREATED SIZE
127.0.0.1:5000/nginx_local latest 908bd3bb2dd9 7 minutes ago 159MB
```
<br>
### [STEP-4] Deploy the operator's Custom Resource Definition
- ### Command
```
$ make install
```
Result:
```
.../nginx-operator/bin/kustomize build config/crd | kubectl apply -f -
customresourcedefinition.apiextensions.k8s.io/nginxes.demo1.my.domain created
```
- This is equivalent to ```kubectl apply -f config/crd/bases/demo1.my.domain_nginxes.yaml```
<br>
- ### Inspect the CRD
```
$ kubectl get crd
NAME CREATED AT
...
nginxes.demo1.my.domain 2020-12-10T08:11:48Z
...
```
- ### Compare: the local side vs. the K8s side
```
$ cat config/crd/bases/demo1.my.domain_nginxes.yaml
```
```
$ kubectl describe crd/nginxes.demo1.my.domain
```
![](https://i.imgur.com/w00G43U.png)
![](https://i.imgur.com/6e3yKon.png)
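A quick way to confirm the API server now understands the new type (a sketch; `kubectl explain` reads the schema the server serves):
```bash
# Show the served schema of the new kind
$ kubectl explain nginx.spec
```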
<br>
### [STEP-5] Deploy the operator's controller & RBAC permissions
> RBAC: Role-Based Access Control
- Command
```bash
$ make deploy IMG=127.0.0.1:5000/nginx_local:latest
```
Result:
```
cd config/manager && .../nginx-operator/bin/kustomize edit set image controller=127.0.0.1:5000/nginx_local:latest
.../nginx-operator/bin/kustomize build config/default | kubectl apply -f -
namespace/nginx-operator-system created
customresourcedefinition.apiextensions.k8s.io/nginxes.demo1.my.domain unchanged
role.rbac.authorization.k8s.io/nginx-operator-leader-election-role created
clusterrole.rbac.authorization.k8s.io/nginx-operator-manager-role created
clusterrole.rbac.authorization.k8s.io/nginx-operator-metrics-reader created
clusterrole.rbac.authorization.k8s.io/nginx-operator-proxy-role created
rolebinding.rbac.authorization.k8s.io/nginx-operator-leader-election-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/nginx-operator-manager-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/nginx-operator-proxy-rolebinding created
service/nginx-operator-controller-manager-metrics-service created
deployment.apps/nginx-operator-controller-manager created
```
- Inspect K8s
- namespaces
```
$ kubectl get ns
NAME STATUS AGE
...
nginx-operator-system Active 3m39s
...
```
- pod
```
$ kubectl get pod --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
nginx-operator-system nginx-operator-controller-manager-b4fd9bb67-9gz7l 2/2 Running 0 8m36s 10.244.1.111 alprworker-1203417-iaas
```
- deployment
```
$ kubectl get deployments --all-namespaces
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
nginx-operator-system nginx-operator-controller-manager 1/1 1 1 8m47s
```
- service
```
$ kubectl get svc --all-namespaces -o wide
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
nginx-operator-system nginx-operator-controller-manager-metrics-service ClusterIP 10.103.247.83 <none> 8443/TCP 11m control-plane=controller-manager
```
- get-all:
```
$ kubectl get all -n nginx-operator-system
...
NAME READY STATUS RESTARTS AGE
pod/nginx-operator-controller-manager-b4fd9bb67-9gz7l 2/2 Running 0 19m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/nginx-operator-controller-manager-metrics-service ClusterIP 10.103.247.83 <none> 8443/TCP 19m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/nginx-operator-controller-manager 1/1 1 1 19m
NAME DESIRED CURRENT READY AGE
replicaset.apps/nginx-operator-controller-manager-b4fd9bb67 1 1 1 19m
```
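If anything looks off, the controller logs are the first place to check (a sketch; the container name `manager` comes from the scaffolded Deployment):
```bash
# Tail the Helm operator's controller logs
$ kubectl logs deployment/nginx-operator-controller-manager -c manager -n nginx-operator-system
```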
<br>
### [STEP-6] Test the new Nginx resource
- Command
```bash
$ kubectl apply -f config/samples/demo1_v1_nginx.yaml
```
- View the file contents
```
$ cat config/samples/demo1_v1_nginx.yaml
```
```yaml
apiVersion: demo1.my.domain/v1
kind: Nginx
metadata:
name: nginx-sample
spec:
# Default values copied from <project_dir>/helm-charts/nginx/values.yaml
affinity: {}
autoscaling:
enabled: false
maxReplicas: 100
minReplicas: 1
targetCPUUtilizationPercentage: 80
fullnameOverride: ""
image:
pullPolicy: IfNotPresent
repository: nginx
tag: ""
imagePullSecrets: []
ingress:
annotations: {}
enabled: false
hosts:
- host: chart-example.local
paths: []
tls: []
nameOverride: ""
nodeSelector: {}
podAnnotations: {}
podSecurityContext: {}
replicaCount: 1
resources: {}
securityContext: {}
service:
port: 80
type: ClusterIP
serviceAccount:
annotations: {}
create: true
name: ""
tolerations: []
```
- Inspect K8s
```
$ kubectl get all -l "app.kubernetes.io/instance=nginx-sample" -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod/nginx-sample-646f977b4f-q75gm 1/1 Running 0 13m 10.244.1.112 alprworker-1203417-iaas <none> <none>
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
service/nginx-sample ClusterIP 10.98.159.30 <none> 80/TCP 13m app.kubernetes.io/instance=nginx-sample,app.kubernetes.io/name=nginx
NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
deployment.apps/nginx-sample 1/1 1 1 13m nginx nginx:1.16.0 app.kubernetes.io/instance=nginx-sample,app.kubernetes.io/name=nginx
NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR
replicaset.apps/nginx-sample-646f977b4f 1 1 1 13m nginx nginx:1.16.0 app.kubernetes.io/instance=nginx-sample,app.kubernetes.io/name=nginx,pod-template-hash=646f977b4f
```
- Connect to nginx
```
$ curl 10.244.1.112 # test pod connectivity (default port is 80)
$ curl 10.98.159.30 # test service connectivity (default port is 80)
```
Expected output:
```html
...
<title>Welcome to nginx!</title>
...
```
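The custom resource itself can also be listed like any built-in kind (a small check; without extra printer columns the CRD shows only NAME and AGE):
```bash
# List instances of the new custom resource
$ kubectl get nginxes
```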
<br>
### Summary:
- ### At the start, there are the Nginx Helm chart resources
- pod
- service
- ### Generate the Nginx Operator
- Define a new resource via operator-sdk, called ==Nginx==
- and package the Helm chart resources into an image
- this image is what gets deployed whenever the Nginx resource is used
- ### Deploy the Nginx Operator
- the CRD (the definition file of the new Nginx resource)
- the controller & its corresponding RBAC permissions
- ### Test the new Nginx resource
- tj-nginx-deploy.yaml
```
apiVersion: demo1.my.domain/v1
kind: Nginx
metadata:
name: tj-nginx-sample
spec:
service:
type: ClusterIP
port: 38080
```
- ```kubectl apply -f tj-nginx-deploy.yaml```
- Check the K8s state
```
$ kubectl get all -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod/tj-nginx-sample-58d48f574b-f9hh2 1/1 Running 0 79s 10.244.1.115 alprworker-1203417-iaas <none> <none>
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
service/tj-nginx-sample ClusterIP 10.97.161.27 <none> 38080/TCP 79s app.kubernetes.io/instance=tj-nginx-sample,app.kubernetes.io/name=nginx
NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
deployment.apps/tj-nginx-sample 1/1 1 1 79s nginx nginx:1.16.0 app.kubernetes.io/instance=tj-nginx-sample,app.kubernetes.io/name=nginx
NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR
replicaset.apps/tj-nginx-sample-58d48f574b 1 1 1 79s nginx nginx:1.16.0 app.kubernetes.io/instance=tj-nginx-sample,app.kubernetes.io/name=nginx,pod-template-hash=58d48f574b
```
- Connect to nginx
```
$ curl 10.244.1.115:80 # for pod
$ curl 10.97.161.27:38080 # for service
```
Output:
```html
...
<title>Welcome to nginx!</title>
...
```
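To tear everything down again, the reverse order of the steps above works (a sketch; `make undeploy` is part of the scaffolded Makefile):
```bash
# Remove the CR first, then the operator, RBAC, and CRD
$ kubectl delete -f tj-nginx-deploy.yaml
$ make undeploy
```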
<br>
<hr>
<br>
## [Tutorial for Helm-based Operators](https://sdk.operatorframework.io/docs/building-operators/helm/tutorial/)
> An in-depth walkthrough that demonstrates how to build and run a Helm-based operator.
### [STEP-0] Getting started
- Topic:
- Build a BlueWhale Operator
- Notes:
- Uses the hcwxd/blue-whale image, port: 3000
- and exposes the blue-whale service through an ingress
<br>
### [STEP-1] Create a working directory and create the blue-whale chart
- **Reference**
[Hello World 3 (v2 & v3) - with Ingress](/5h33aCEKS5GJFoThP8hF6w#Hello-World-3-v2-amp-v3---with-Ingress)
<br>
- Commands
```bash
$ mkdir blue-whale-operator
$ cd blue-whale-operator
```
:::warning
:warning: **The folder name must not start with a digit**
The Service "1214-blue-whale-operator-controller-manager-metrics-service" is invalid: metadata.name: Invalid value: "1214-blue-whale-operator-controller-manager-metrics-service": a DNS-1035 label must consist of lower case alphanumeric characters or '-', start with an alphabetic character, and end with an alphanumeric character (e.g. 'my-name', or 'abc-123', regex used for validation is '[a-z]([-a-z0-9]*[a-z0-9])?')
Makefile:33: recipe for target 'deploy' failed
make: *** [deploy] Error 1
:::
Then create the blue-whale chart:
```bash
blue-whale-operator$ helm create blue-whale-chart
```
- Inspect the generated files and directories
```
blue-whale-operator/
└── blue-whale-chart
├── Chart.yaml
├── charts
├── templates
│ ├── NOTES.txt
│ ├── _helpers.tpl
│ ├── deployment.yaml <--- to be modified
│ ├── hpa.yaml
│ ├── ingress.yaml
│ ├── service.yaml
│ ├── serviceaccount.yaml
│ └── tests
│ └── test-connection.yaml
└── values.yaml <--- to be modified
```
<br>
### [STEP-2] Modify the blue-whale chart, then test & package it
#### Modify
- Files to modify
- values.yaml
- templates/deployment.yaml
- ==Review the diffs==:
- values.yaml
```diff
image:
- repository: nginx
+ repository: hcwxd/blue-whale
pullPolicy: IfNotPresent
# Overrides the image tag whose default is the chart appVersion.
- tag: ""
+ tag: "latest"
+containerPort: 3000
...
ingress:
- enabled: false
+ enabled: true
- annotations: {}
+ annotations:
# kubernetes.io/ingress.class: nginx
# kubernetes.io/tls-acme: "true"
+ nginx.ingress.kubernetes.io/rewrite-target: /
hosts:
- - host: chart-example.local
- paths: []
+ - paths: ["/blue"]
```
<br>
- templates/deployment.yaml
```diff
ports:
- name: http
- containerPort: 80
+ containerPort: {{ .Values.containerPort}}
protocol: TCP
```
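Before installing, the edited chart can be rendered locally to catch template mistakes (a sketch using standard Helm commands):
```bash
# Lint the chart and render its templates without touching the cluster
$ helm lint blue-whale-chart/
$ helm template tj-demo2 blue-whale-chart/
```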
<br>
#### Test
> Deploy the modified chart to K8s and see whether any errors occur
- Command
```
blue-whale-chart$ helm install tj-demo2 blue-whale-chart/
```
- If an error occurs, e.g. **a misconfigured service type**
```
$ helm install tj-demo2 tj-blue-whale-chart/
Error: Service "tj-demo2-tj-blue-whale-chart" is invalid:
spec.type: Unsupported value: "NodeIP":
supported values: "ClusterIP", "ExternalName", "LoadBalancer", "NodePort"
```
Inspect and delete:
```
$ helm ls
$ helm delete tj-demo2
```
Then keep fixing the yaml files until the chart deploys cleanly
- Connectivity tests (the IPs below are dynamically assigned)
```
# test pod connectivity
$ kubectl get pod -o wide
$ curl 10.244.1.128:3000
# test service connectivity
$ kubectl get svc -o wide
$ curl 10.105.18.96:80
# test ingress connectivity
$ kubectl get ing
$ curl 10.98.112.86/blue
```
<br>
#### Package (optional)
```
$ helm package .
```
The current directory will now contain ```blue-whale-chart-0.1.0.tgz```
<br>
### [STEP-3] Scaffold a Helm-based operator project
- Command
```
$ operator-sdk init \
--plugins=helm \
--domain=asus.com \
--group=demo2 \
--version=v2 \
--kind=BlueWhale \
--helm-chart=blue-whale-chart
```
This can also be split into init + create api:
```
$ operator-sdk init --plugins=helm --domain=asus.com
$ operator-sdk create api \
--group=demo2 \
--version=v2 \
--kind=BlueWhale \
--helm-chart=blue-whale-chart-0.1.0.tgz
```
- The ```--helm-chart``` source can also be a packaged chart archive;
operator-sdk will unpack the archive into the helm-charts directory
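With the flags above, the scaffolded watches.yaml would look roughly like this (a sketch; the chart path follows the chart's own name):
```yaml
# Expected watches.yaml produced by the init command above
- group: demo2.asus.com
  version: v2
  kind: BlueWhale
  chart: helm-charts/blue-whale-chart
```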
<br>
### [STEP-4] Build and push the blue-whale-operator image
- Command
```
$ make docker-build docker-push IMG=127.0.0.1:5000/blue-whale-operator
# can also be split into:
$ make docker-build IMG=127.0.0.1:5000/blue-whale-operator
$ make docker-push IMG=127.0.0.1:5000/blue-whale-operator
```
:::warning
:warning: Without ```docker-push```, the controller manager will not come up,
<br>even if the image exists locally
```bash
$ docker images | grep blue-whale-local
REPOSITORY TAG IMAGE ID SIZE
blue-whale-local latest 4593ed340e62 159MB
```
```
$ kubectl get pod --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
...
blue-whale-operator-1223-system blue-whale-operator-1223-controller-manager-694d9569cd-dwnt4 1/2 ImagePullBackOff 0 24m
...
```
The pod will show **ImagePullBackOff**
- Failed to pull image "blue-whale-local:latest"
<br>
```
$ kubectl describe pod your_pod_name -n your_namespace
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 24m default-scheduler Successfully assigned blue-whale-operator-1223-system/blue-whale-operator-1223-controller-manager-694d9569cd-dwnt4 to alprworker-1203417-iaas
Normal Pulled 24m kubelet Container image "gcr.io/kubebuilder/kube-rbac-proxy:v0.5.0" already present on machine
Normal Created 24m kubelet Created container kube-rbac-proxy
Normal Started 24m kubelet Started container kube-rbac-proxy
Normal Pulling 23m (x4 over 24m) kubelet Pulling image "blue-whale-local:latest"
Warning Failed 23m (x4 over 24m) kubelet Failed to pull image "blue-whale-local:latest": rpc error: code = Unknown desc = Error response from daemon: pull access denied for blue-whale-local, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
Warning Failed 23m (x4 over 24m) kubelet Error: ErrImagePull
Warning Failed 23m (x5 over 24m) kubelet Error: ImagePullBackOff
Normal BackOff 4m50s (x86 over 24m) kubelet Back-off pulling image "blue-whale-local:latest"
```
:::
:::info
:information_source: Possible fix? (unverified)
- Modify config/manager/kustomization.yaml
```
images:
- name: controller
newName: 127.0.0.1:5000/blue-whale-operator
imagePullPolicy: IfNotPresent
```
- Tested: it does not work
this file gets overwritten whenever ```make deploy IMG=...``` is run
:::
<br>
### [STEP-5] Deploy the operator's Custom Resource Definition
> Steps 4 & 5 can be swapped, since this step has nothing to do with the image
- ### Command
```
$ make install
```
<br>
### [STEP-6] Deploy the operator's controller & RBAC permissions
> RBAC: Role-Based Access Control
- ### Command
```
$ make deploy IMG=127.0.0.1:5000/blue-whale-operator
```
- Some causes of errors
- The project folder name is too long
```
The Service "blue-whale-operator-1214-1402-controller-manager-metrics-service" is invalid:
metadata.name: Invalid
value: "blue-whale-operator-1214-1402-controller-manager-metrics-service": must be no more than 63 characters
Makefile:33: recipe for target 'deploy' failed
```
- Forgot to pass the IMG argument
Error message:
```
Error: ImagePullBackOff
Pulling image "controller:latest"
Failed to pull image "controller:latest":
rpc error: code = Unknown desc = Error response from daemon:
pull access denied for controller,
repository does not exist or may require 'docker login':
denied: requested access to the resource is denied
```
Normal case:
```
Pulling image "127.0.0.1:5000/blue-whale-local:latest"
Successfully pulled image "127.0.0.1:5000/blue-whale-local:latest" in 7.689397ms
```
The READY column must be 2/2
```
$ kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
...
blue-whale-operator-1223-system blue-whale-operator-1223-controller-manager-86ddbff9fc-qst5s 2/2 Running 0 60m
```
Inspecting the operator pod's status shows two images
```
$ kubectl describe pod blue-whale-operator-1223-controller-manager-86ddbff9fc-qst5s -n blue-whale-operator-1223-system
...
Containers:
kube-rbac-proxy:
...
Image: gcr.io/kubebuilder/kube-rbac-proxy:v0.5.0
Image ID: docker-pullable://gcr.io/kubebuilder/kube-rbac-proxy@sha256:e10d1d982dd653db74ca87a1d1ad017bc5ef1aeb651bdea089debf16485b080b
Port: 8443/TCP
...
manager:
...
Image: 127.0.0.1:5000/blue-whale-local:latest
...
```
- ### This command deploys the following:
- kind: Namespace
- kind: CustomResourceDefinition
- kind: Role
- kind: ClusterRole
- manager-role
- metrics-reader
- proxy-role
- kind: RoleBinding
- kind: ClusterRoleBinding
- manager-rolebinding
- proxy-rolebinding
- kind: Service
- kind: Deployment
<br>
### [STEP-7] Test the new BlueWhale resource
- Command
```bash
$ kubectl apply -f config/samples/demo2_v2_bluewhale.yaml
```
- View the new resource
```bash
$ kubectl get BlueWhale
# or (case-insensitive)
$ kubectl get bluewhale
```
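The operator should now render the chart for the sample CR. A quick look at the resulting objects (a sketch; `bluewhale-sample` assumes the default sample name scaffolded by `create api`):
```bash
# List everything the chart created for the sample CR
$ kubectl get all -l "app.kubernetes.io/instance=bluewhale-sample" -o wide
```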
<br>
### Summary:
- ### Test the new BlueWhale resource
```tj-bluewhale.yaml```
```
apiVersion: demo2.asus.com/v2
kind: BlueWhale
metadata:
name: tj-test1
spec:
image:
repository: hcwxd/purple-whale
ingress:
hosts:
- paths:
- /tjblue
```
<br>
<hr>
<br>
## Using the new resource's spec
### Explanation
- The spec of the new resource must have the same structure as the chart's values.yaml
- The spec effectively overrides the chart's values.yaml; you could call it an **override mechanism**
- Extra attributes added to the spec
that are not wired to any k8s resource are silently ignored, with no error message.
### Example: configuring a NodePort
```tj-test2.yaml```
```yaml
apiVersion: tj2.asus.com/v2
kind: BlueWhale
metadata:
name: tj-test1
spec:
image:
repository: hcwxd/purple-whale
service:
nodePortEnabled: true
type: NodePort
nodePort: 30080
ingress:
hosts:
- paths:
- /tjblue
# extra attributes added here
# are silently ignored (no error message) if not wired to any k8s resource
tj_prifle:
name: tj_tsai
id: AA1600128
```
- Image: changed from the default "blue whale" to the "**purple whale**"
- Service type: changed from the default "ClusterIP" to "**NodePort**"
- ClusterIP and NodePort are two mutually exclusive service types
- You cannot pre-set the nodePort attribute while the service type is ClusterIP;
testing this yields the following error:
```bash
$ helm install tj-test1 .
Error: Service "tj-test1-my-chart" is invalid:
spec.ports[0].nodePort: Forbidden:
may not be used when `type` is 'ClusterIP'
```
- Hence a boolean parameter is pulled out to gate an if block:
nodePort: 30080 is only set when NodePort is enabled
```
nodePortEnabled: true
nodePort: 30080
```
- Access path: changed from the default "/blue" to "**/tjblue**"
<br>
### Modify values.yaml to define the defaults
```yaml
...
service:
type: ClusterIP
nodePortEnabled: false
nodePort: 30001
port: 80
...
```
### Modify templates/service.yaml for the control logic
```yaml
apiVersion: v1
kind: Service
metadata:
name: {{ include "my-chart.fullname" . }}
labels:
{{- include "my-chart.labels" . | nindent 4 }}
spec:
type: {{ .Values.service.type }}
ports:
- port: {{ .Values.service.port }}
{{- if .Values.service.nodePortEnabled }}
nodePort: {{ .Values.service.nodePort }}
{{- end }}
targetPort: http
protocol: TCP
name: http
selector:
{{- include "my-chart.selectorLabels" . | nindent 4 }}
```
- ```{{- }}``` marks template control-flow / executable statements
- If NodePort is enabled (```nodePortEnabled = true```), the nodePort field is set
```yaml
{{- if .Values.service.nodePortEnabled }}
nodePort: {{ .Values.service.nodePort }}
{{- end }}
```
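The conditional can be verified without installing anything (a sketch using standard Helm flags; `--show-only` renders just one template file):
```bash
# Render only the Service template with NodePort enabled
$ helm template tj-test1 . \
    --set service.type=NodePort \
    --set service.nodePortEnabled=true \
    --show-only templates/service.yaml
```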
### Concepts mentioned in the official docs
- [Understanding the Nginx CR spec](https://sdk.operatorframework.io/docs/building-operators/helm/tutorial/#understanding-the-nginx-cr-spec)
- Overriding these defaults is as simple as setting the desired values in the CR spec.
- As you may have noticed, the Helm operator simply applies the entire spec as if it was the contents of a values file, just like helm install -f ./overrides.yaml works.
<br>
<hr>
<br>
## [Value override mechanism](https://sdk.operatorframework.io/docs/building-operators/helm/reference/advanced_features/override_values/)
### Use case (this explanation is not entirely clear to me)
> If your Operator is deployed in a disconnected environment (no network access to the default images location) you can use this mechanism to set them globally at the Operator level using environment variables versus individually per CR / chart release.
- How do you control an "environment variable" dynamically?
- If you exec into the container and change an env var, it is back to the original value the next time you log in
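One way to change them without rebuilding the image (a sketch; `kubectl set env` patches the Deployment, which rolls the pod with the new values; the `xxx` names are placeholders):
```bash
# Update an env var on the operator Deployment declaratively
$ kubectl set env deployment/xxx-controller-manager -n xxx-system \
    IMAGE_REPOSITORY=hcwxd/blue-whale
```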
### Override precedence
1. watches.yaml (highest)
2. CR (Custom Resource), e.g. ```config/samples/xxx.yaml```
3. Chart values.yaml (lowest)
4. config/*** (defaults)
### Method 1 (written directly in watches.yaml)
- Add an overrideValues attribute to watches.yaml
```yaml=
# Use the 'create api' subcommand to add watches to this file.
- group: tj04.asus.com
version: v04
kind: TjNginx
chart: helm-charts/tj-nginx
overrideValues:
image.repository: hcwxd/blue-whale
image.containerPort: 3000
# +kubebuilder:scaffold:watch
```
- image.repository & containerPort override the values in the chart's ```values.yaml```
```yaml
image:
  repository: nginx # <--- overridden
  tag: "latest"
  containerPort: 80 # <--- overridden
```
- The operator image must be rebuilt, pushed, and redeployed
> because the operator image (Dockerfile) contains watches.yaml
>
```make docker-build docker-push IMG=...```
```make deploy IMG=...```
:::warning
:warning: **If a CR instance is already running**
1. First run ```kubectl delete -f config/samples/xxx.yaml```
2. then run ```make undeploy```
<br>
If you run ```make undeploy``` without stopping the CR instance first, it gets stuck at
```
...
deployment.apps "xxx-controller-manager" deleted
```
and you then have to clean up manually:
1. Stop the deployment first
2. Then stop the pod & service
3. kubectl get all -A | grep your-keyword
4. kubectl delete crd xxx (in the end the crd still could not be deleted; ```--force``` did not help either)
<br>
How to resolve it? Re-run the whole flow:
- ```make install```
- ```make docker-build docker-push IMG=...```
- ```make deploy IMG=...```
- ```kubectl apply -f config/samples/xxx.yaml```
- ```kubectl delete -f config/samples/xxx.yaml```
- ```make undeploy``` (this includes ```make uninstall```)
- ```kubectl get crd``` (to double-check)
(Conclusion: deploy & undeploy must be **symmetric operations**)
:::
- Inspect the deployed operator pod
```
$ kubectl get pod -A
$ kubectl describe pod/xxx-controller-manager-yyy -n xxx-system
...
manager: <--- the 2nd container
...
Environment:
IMAGE_REPOSITORY: hcwxd/blue-whale
IMAGE_CONTAINER_PORT: 3000
AUTHOR: tj_tsai
```
- Deploy the CR
```kubectl apply -f config/samples/xxx.yaml```
- xxx.yaml uses nginx (port: 80)
```yaml
...
spec:
...
image:
containerPort: 80
pullPolicy: IfNotPresent
repository: nginx
tag: latest
...
```
- Looking at the pod & svc again, the result is blue-whale rather than nginx

It can be viewed via [kubectl port-forward & ssh tunnel](https://hackmd.io/kD5ynMNsRH-TWq9KUwXfCQ#kubectl-port-foward)
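Another way to confirm the override took effect without port-forwarding (a sketch; the deployment name depends on the CR's metadata.name, so `xxx` is a placeholder):
```bash
# Print the image actually used by the chart-rendered Deployment
$ kubectl get deployment xxx -o jsonpath='{.spec.template.spec.containers[0].image}'
```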
<br>
### Method 2 (pull the watches.yaml values out into operator environment variables)
```watches.yaml```
- Before:
```yaml=
- ...
overrideValues:
image.repository: hcwxd/blue-whale
image.containerPort: 3000
# +kubebuilder:scaffold:watch
```
- After:
```yaml=
- ...
overrideValues:
image.repository: $IMAGE_REPOSITORY
image.containerPort: $IMAGE_CONTAINER_PORT
# +kubebuilder:scaffold:watch
```
<br>
Put the environment variables into the manager container of the operator pod (image: controller)
```config/manager/manager.yaml```
```yaml=
...
spec:
...
template:
metadata:
labels:
control-plane: controller-manager
spec:
containers:
- image: controller:latest
...
env:
- name: IMAGE_REPOSITORY
value: hcwxd/purple-whale
- name: IMAGE_CONTAINER_PORT
value: "3000"
- name: AUTHOR
value: tj_tsai
```
Then rebuild, push, and redeploy as above.
The final result:
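To confirm the variables landed in the manager container (a sketch; the `control-plane=controller-manager` label comes from the scaffolded Deployment, and `<project>-system` is a placeholder namespace):
```bash
# Show the Environment section of the operator pod
$ kubectl describe pod -l control-plane=controller-manager -n <project>-system
```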

<br>
<hr>
<br>
## Other references
- For the purpose/meaning of the helm-charts directory layout, see:
[Project Layout of Helm-based Operators](https://sdk.operatorframework.io/docs/building-operators/helm/reference/project_layout/)