# job-operator

This project provides Kubernetes users with a ManagedJob CRD. The operator monitors Kafka consumer lag to determine whether a ManagedJob is processing too slowly, selects a target machine based on information such as the hardware resource utilization of the cluster's nodes, and migrates the slow ManagedJob to it.

## Installation

### Prerequisites

This project was generated with [operator-sdk](https://github.com/operator-framework/operator-sdk). The operator-sdk prerequisites are:

- git
- go version v1.13+
- mercurial version 3.9+
- docker version 17.03+
- kubectl version v1.12.0+

#### Installing go

```bash
sudo add-apt-repository ppa:longsleep/golang-backports
sudo apt-get update
sudo apt-get install golang-go
```

#### Installing operator-sdk

```bash
# Set the release version variable
RELEASE_VERSION=v0.12.0
# Linux
curl -LO https://github.com/operator-framework/operator-sdk/releases/download/${RELEASE_VERSION}/operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu
# install into PATH
chmod +x operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu && sudo mkdir -p /usr/local/bin/ && sudo cp operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu /usr/local/bin/operator-sdk && rm operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu
```

### Build

```bash
cd job-operator
export GOROOT=$(go env GOROOT)
operator-sdk build <image_name>
docker push <image_name>
```

### Deployment

1. Edit deploy/role_binding.yaml and deploy/service_account.yaml

Replace `<operator-namespace>` in both files with the namespace the operator will be deployed to. If no namespace is specified at deployment time, use `default`.

deploy/role_binding.yaml

```yaml
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: job-operator
subjects:
- kind: ServiceAccount
  name: job-operator
  namespace: <operator-namespace>
roleRef:
  kind: ClusterRole
  name: job-operator
  apiGroup: rbac.authorization.k8s.io
```

deploy/service_account.yaml

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: job-operator
  namespace: <operator-namespace>
```

2. Install every configuration file except the operator itself into Kubernetes

```bash
kubectl apply -f deploy/crds/vgm.io_managedjobs_crd.yaml
kubectl apply -f deploy/crds/vgm.io_regions_crd.yaml
kubectl apply -f deploy/crds/vgm.io_consumerlagmonitors_crd.yaml
kubectl apply -f deploy/role.yaml
kubectl apply -f deploy/role_binding.yaml
kubectl apply -f deploy/service_account.yaml
```

3. Edit deploy/operator.yaml

Set the image and the WATCH_NAMESPACE environment variable.

- WATCH_NAMESPACE: the namespace the operator works in. Only CRs that belong to this namespace are processed. If WATCH_NAMESPACE is "", CRs in all namespaces are processed.

```yaml
# operator.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: job-operator
spec:
  replicas: 1
  selector:
    matchLabels:
      name: job-operator
  template:
    metadata:
      labels:
        name: job-operator
    spec:
      serviceAccountName: job-operator
      containers:
        - name: job-operator
          # Replace this with the built image name
          image: REPLACE_IMAGE
          command:
          - job-operator
          args:
          - --zap-encoder=console # using human readable logging format
          imagePullPolicy: Always
          env:
            - name: WATCH_NAMESPACE
              value: ""
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: OPERATOR_NAME
              value: "job-operator"
```

4. Install operator.yaml into Kubernetes

```bash
kubectl apply -f deploy/operator.yaml
```

## Usage

See deploy/crds/*_cr.yaml for examples.

**ManagedJob**

```yaml
# vgm.io_v1alpha1_managedjob_cr.yaml
apiVersion: vgm.io/v1alpha1
kind: ManagedJob
metadata:
  name: example-managedjob
spec:
  consumerGroupID: test1 # consumer group ID used by this job
  inputTopic: test # topic this job reads its data from
  regionID: "7" # reserved; will be used to identify a private region
  podSpec: # follows the Kubernetes pod spec
    containers:
    - name: ubuntu
      image: ubuntu
      command: ["/bin/bash", "-ec", "while :; do echo '.'; sleep 1 ; done"]
```

**Region**

```yaml
apiVersion: vgm.io/v1alpha1
kind: Region
metadata:
  name: example-region
spec:
  nodes:
  - node: exp-worker
  - node: exp-worker2
  type: region # value = region/cloud
```

**ConsumerLagMonitor**

```yaml
apiVersion: vgm.io/v1alpha1
kind: ConsumerLagMonitor
metadata:
  name: <name>
  namespace: <namespace>
spec:
  port: <http_service_listen_port>
  regionID: <region_ID> # has to match an existing region
  template:
    spec:
      containers:
      - args:
        - -brokers=<kafka_brokers_ip:port,...>
        - -port=<http_service_listen_port> # default 8080 when not specified
        image: monitor
        name: monitor
        ports:
        - containerPort: <http_service_listen_port>
          name: http
          protocol: TCP
```

## File structure

The files worth noting are listed below.

- pkg/apis/vgm/v1alpha1/*_types.go

  Defines the CRD contents; used to generate the actual CRD manifests and the corresponding type definitions.
- pkg/apis/controller/\*/\*_controller.go

  Defines the event loop of each CRD controller.
- pkg/coordinator

  The coordinator component (remote wrapper).
- pkg/monitor

  The Kafka consumer lag monitoring component (remote wrapper).
- pkg/rest

  The RESTful API service.
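
## Migration decision sketch

The overview at the top describes two decisions: whether a ManagedJob is lagging, and which machine it should move to. The following is a minimal Go sketch of that logic under stated assumptions. The `NodeUsage` type, the `isLagging` and `pickTarget` functions, the fixed lag threshold, and the "lowest max(CPU, memory)" score are all hypothetical names and choices made for illustration. They are not taken from this repository's pkg/coordinator or pkg/monitor code, which may decide differently.

```go
package main

import (
	"errors"
	"fmt"
)

// NodeUsage is a hypothetical summary of one node's resource
// utilization, with each value expressed as a fraction in [0, 1].
type NodeUsage struct {
	Name string
	CPU  float64
	Mem  float64
}

// isLagging reports whether the consumer lag of a job exceeds
// an assumed fixed threshold (both measured in messages).
func isLagging(lag, threshold int64) bool {
	return lag > threshold
}

// pickTarget returns the name of the least loaded node other than
// the one the job currently runs on. The load of a node is assumed
// to be the larger of its CPU and memory utilization.
func pickTarget(nodes []NodeUsage, current string) (string, error) {
	best := ""
	bestLoad := 2.0 // larger than any valid utilization
	for _, n := range nodes {
		if n.Name == current {
			continue
		}
		load := n.CPU
		if n.Mem > load {
			load = n.Mem
		}
		if load < bestLoad {
			best, bestLoad = n.Name, load
		}
	}
	if best == "" {
		return "", errors.New("no candidate node available")
	}
	return best, nil
}

func main() {
	// Node names reuse the ones from the Region example; the
	// utilization numbers are made up.
	nodes := []NodeUsage{
		{Name: "exp-worker", CPU: 0.85, Mem: 0.60},
		{Name: "exp-worker2", CPU: 0.30, Mem: 0.45},
	}
	if isLagging(12000, 10000) {
		target, err := pickTarget(nodes, "exp-worker")
		if err != nil {
			fmt.Println("cannot migrate:", err)
			return
		}
		fmt.Println("migrate job to", target)
	}
}
```

A real controller would likely obtain the lag from the ConsumerLagMonitor HTTP service and the node metrics from the cluster rather than from literals; the sketch only shows the shape of the decision.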