# RKE2 logging: collecting kubelet logs

* In an RKE2 environment, kubelet logs are stored in `/var/lib/rancher/rke2/agent/logs/`. To collect kubelet logs into Rancher Logging, use the HostTailer feature.

## Implementation

### 0. Create a crashing pod

```
$ kubectl create ns test
$ kubectl -n test create deploy crash --image=quay.io/hahappyman/myapp
$ kubectl -n test get po
NAME                     READY   STATUS             RESTARTS       AGE
crash-85bcfd65f6-mh8bq   1/2     CrashLoopBackOff   1 (108s ago)   <invalid>
```

### 1. Install the logging app

* Rancher Logging collects, filters, and outputs logs; it cannot monitor the collected logs or send alerts. A separate tool that receives the logs and supports alerting is still needed for that.

![image](https://hackmd.io/_uploads/rymnsgd66.png)

```
$ kubectl -n cattle-logging-system get all
NAME                                                    READY   STATUS      RESTARTS   AGE
pod/rancher-logging-787c5fdbbc-jq8xg                    1/1     Running     0          95s
pod/rancher-logging-rke2-journald-aggregator-tpt96      1/1     Running     0          95s
pod/rancher-logging-rke2-journald-aggregator-z4qzm      1/1     Running     0          95s
pod/rancher-logging-root-fluentbit-gmq55                1/1     Running     0          24s
pod/rancher-logging-root-fluentbit-kllzr                1/1     Running     0          23s
pod/rancher-logging-root-fluentd-0                      2/2     Running     0          25s
pod/rancher-logging-root-fluentd-configcheck-ac2d4553   0/1     Completed   0          74s

NAME                                            TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)               AGE
service/rancher-logging                         ClusterIP   None           <none>        8080/TCP              98s
service/rancher-logging-root-fluentd            ClusterIP   10.43.51.213   <none>        24240/TCP,24240/UDP   28s
service/rancher-logging-root-fluentd-headless   ClusterIP   None           <none>        24240/TCP,24240/UDP   27s

NAME                                                      DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
daemonset.apps/rancher-logging-rke2-journald-aggregator   2         2         2       2            2           kubernetes.io/os=linux   98s
daemonset.apps/rancher-logging-root-fluentbit             2         2         2       2            2           kubernetes.io/os=linux   24s

NAME                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/rancher-logging   1/1     1            1           96s

NAME                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/rancher-logging-787c5fdbbc   1         1         1       96s

NAME                                            READY   AGE
statefulset.apps/rancher-logging-root-fluentd   1/1     28s
```

### 2. Install the loki app

1. First create a `loki` namespace under the System project.
2. In Apps, create a repository:

   ```
   Name: grafana-charts
   URL: https://grafana.github.io/helm-charts
   ```

   ![](https://i.imgur.com/dlVQV5a.png)

3. Install this loki chart.

   ![](https://i.imgur.com/OHupSRk.png)

```
$ kubectl -n loki get all
NAME                      READY   STATUS    RESTARTS   AGE
pod/loki-0                1/1     Running   0          3m46s
pod/loki-promtail-9tlrn   1/1     Running   0          3m46s
pod/loki-promtail-fhbq8   1/1     Running   0          3m46s

NAME                      TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
service/loki              ClusterIP   10.43.150.97   <none>        3100/TCP   3m47s
service/loki-headless     ClusterIP   None           <none>        3100/TCP   3m47s
service/loki-memberlist   ClusterIP   None           <none>        7946/TCP   3m47s

NAME                           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/loki-promtail   2         2         2       2            2           <none>          3m47s

NAME                    READY   AGE
statefulset.apps/loki   1/1     3m46s
```

### 3. Configure the HostTailer

```
apiVersion: logging-extensions.banzaicloud.io/v1alpha1
kind: HostTailer
metadata:
  name: rke2-kubelet-logs
  namespace: cattle-logging-system
spec:
  fileTailers:
    - containerOverrides:
        volumeMounts:
          # Directory on the host that holds the RKE2 kubelet logs
          - mountPath: /var/lib/rancher/rke2/agent/logs/
            name: var-lib-rke2-kubelet-log
          - mountPath: /var/pos
            name: positions
      name: kubelet-log
      path: /var/lib/rancher/rke2/agent/logs/kubelet*.log
  workloadOverrides:
    # tolerations:
    #   - effect: string
    #     key: string
    #     operator: string
    #     tolerationSeconds: int
    #     value: string
    volumes:
      - hostPath:
          path: /var/lib/rancher/rke2/agent/logs/
        name: var-lib-rke2-kubelet-log
      - hostPath:
          path: /var/pos
        name: positions
  workloadMetaOverrides:
    labels: {}
    annotations: {}
```

### 4. Configure ClusterOutputs & ClusterFlows

```
apiVersion: logging.banzaicloud.io/v1beta1
kind: ClusterOutput
metadata:
  name: demo-output
  namespace: cattle-logging-system
spec:
  loki:
    configure_kubernetes_labels: true
    extract_kubernetes_labels: true
    url: http://loki.loki:3100
---
apiVersion: logging.banzaicloud.io/v1beta1
kind: ClusterFlow
metadata:
  name: demo-flow
  namespace: cattle-logging-system
spec:
  globalOutputRefs:
    - demo-output
  match:
    # Only forward records from the HostTailer's kubelet-log container
    - select:
        container_names:
          - kubelet-log
```

### 5. Log in to Grafana

> Username: admin
> Password: prom-operator

* Go to Data Sources and add loki.

![image](https://hackmd.io/_uploads/H14VpldTp.png)

* Enter the loki address.

![](https://i.imgur.com/703hNnX.png)

* Click save and test.

![image](https://hackmd.io/_uploads/r1woTgO6a.png)

* Go to Explore and select loki.
* Filter on `container: kubelet-log`, then search for messages related to crash.

![image](https://hackmd.io/_uploads/BkvgeTKxgg.png)

* In Grafana you can see the kubelet's log entries for the pod that keeps crashing.

![image](https://hackmd.io/_uploads/rybaR3Kxee.png)

## References

https://kube-logging.dev/docs/configuration/extensions/kubernetes-host-tailer/
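
As an optional cross-check of step 5 without Grafana, the kubelet entries can probably also be fetched straight from Loki's HTTP query API. The following is only a minimal sketch under several assumptions that are not part of the note above: Loki is made reachable on `localhost:3100` (for example by running `kubectl -n loki port-forward svc/loki 3100:3100` in another terminal), the records carry a `container="kubelet-log"` label as the Grafana filter suggests, and the names `LOKI_URL`, `CONTAINER_LABEL`, `kubelet_logql`, and `query_loki` are hypothetical helpers made up for illustration.

```shell
#!/bin/sh
# Assumption: Loki is reachable here, e.g. through a kubectl port-forward.
LOKI_URL="${LOKI_URL:-http://localhost:3100}"
# Assumption: HostTailer records are labelled with this container name.
CONTAINER_LABEL="${CONTAINER_LABEL:-kubelet-log}"

# Build a LogQL query: select the kubelet-log stream, keep lines containing $1.
kubelet_logql() {
  printf '{container="%s"} |= "%s"' "$CONTAINER_LABEL" "$1"
}

# Send the query to Loki's query_range endpoint.
# $1 = keyword to search for, $2 = maximum number of lines (defaults to 20).
query_loki() {
  curl -sG "${LOKI_URL}/loki/api/v1/query_range" \
    --data-urlencode "query=$(kubelet_logql "$1")" \
    --data-urlencode "limit=${2:-20}"
}
```

After sourcing the script, `query_loki crash` should return a JSON document; if the pipeline works, its `data.result` array would contain kubelet lines mentioning the crashing pod. An empty result may simply mean the label name differs in your setup, so check the labels shown in Grafana Explore first.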