---
tags: Kubernetes, Sidecar for log - fluentd
description: About pod design for service logs - how to add a sidecar (using fluentd) to a web service.
robots: index, follow
---

# Kubernetes Sidecar for log - fluentd

In Kubernetes, when we need to aggregate application (AP) logs somewhere else, the old virtualization-era answer was to install an agent on the machine and have it forward the AP logs to an external log system. With pod design we can do this differently: have the application write its logs to a designated directory, and run an additional container in the same pod that forwards them. This pattern is called a sidecar. There are other sidecar patterns as well, which we will explore another day.

The official documentation introduces the [Sidecar](https://kubernetes.io/docs/concepts/cluster-administration/logging/#sidecar-container-with-a-logging-agent "Sidecar container with a logging agent") pattern in detail, but it stops short of shipping the data to another system for presentation.

This article walks through building an EFK stack (Elasticsearch, Fluentd, Kibana), building a fluentd docker image, designing the sidecar, and finally storing the AP logs in EFK for presentation.

Why build our own fluentd docker image? Once you have stepped on this particular landmine you will know: the fluentd images published on Docker Hub may not include the plugin you need. Our goal is to store AP logs in Elasticsearch, which requires the fluent-plugin-elasticsearch gem; the process is documented in detail in `build elasticsearch fluentd image`.

Next comes the sidecar design: how to build a pod that automatically ships its logs to EFK, resolving the EFK endpoint through the pod's internal DNS. This is documented in detail in `pod design(sidecar)`.

Finally, `Kibana setup` covers how to present the AP logs.

This article is aimed at readers who are evaluating EFK, just starting to explore it, or already running it; some of its sections should prove very helpful.

## 1. environment

1. Rancher
2. Master * 1, worker * 3 (must)
3. Kubernetes: 1.20.8
4. CNI: Calico
5. CRI: Docker

:::warning
1. Log collection is fairly demanding on disk I/O. If you build the verification environment with virtualization on a single physical host backed by conventional hard disks, the system's response time will be painfully slow.
:::

## 2. EFK

Installing EFK is somewhat involved. If your cluster management environment has neither OpenShift nor Rancher, preparing it may take considerably more time, so I strongly recommend installing with one of those two tools. This example installs via a Rancher app; once the installation finishes, the result can be confirmed in either of the following two ways.
1. CLI

```
inwin@master:~$ kubectl -n efk get all
NAME                                          READY   STATUS    RESTARTS   AGE
pod/efk-filebeat-mskwf                        1/1     Running   3          26h
pod/efk-filebeat-rbrls                        1/1     Running   3          26h
pod/efk-filebeat-wb8pr                        1/1     Running   5          26h
pod/efk-kibana-dfd59f779-lqvkd                2/2     Running   1          26h
pod/efk-kube-state-metrics-546dcf5d7d-5csls   1/1     Running   0          14h
pod/efk-metricbeat-5h58g                      1/1     Running   5          26h
pod/efk-metricbeat-jqfrr                      1/1     Running   2          26h
pod/efk-metricbeat-metrics-bc9d5744-8ddld     1/1     Running   6          26h
pod/efk-metricbeat-nfcrc                      1/1     Running   3          26h
pod/elasticsearch-master-0                    1/1     Running   1          26h
pod/elasticsearch-master-1                    1/1     Running   0          26h
pod/elasticsearch-master-2                    1/1     Running   0          174m

NAME                                    TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
service/efk-kibana                      NodePort    10.43.84.200    <none>        5601:30859/TCP      26h
service/efk-kube-state-metrics          ClusterIP   10.43.64.177    <none>        8080/TCP            26h
service/elasticsearch-master            ClusterIP   10.43.123.194   <none>        9200/TCP,9300/TCP   26h
service/elasticsearch-master-headless   ClusterIP   None            <none>        9200/TCP,9300/TCP   26h
service/kibana-http                     ClusterIP   10.43.239.82    <none>        80/TCP              26h

NAME                            DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/efk-filebeat     3         3         3       3            3           <none>          26h
daemonset.apps/efk-metricbeat   3         3         3       3            3           <none>          26h

NAME                                     READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/efk-kibana               1/1     1            1           26h
deployment.apps/efk-kube-state-metrics   1/1     1            1           26h
deployment.apps/efk-metricbeat-metrics   1/1     1            1           26h

NAME                                                DESIRED   CURRENT   READY   AGE
replicaset.apps/efk-kibana-dfd59f779                1         1         1       26h
replicaset.apps/efk-kube-state-metrics-546dcf5d7d   1         1         1       26h
replicaset.apps/efk-metricbeat-metrics-bc9d5744     1         1         1       26h

NAME                                    READY   AGE
statefulset.apps/elasticsearch-master   3/3     26h
```

2. GUI

**rancher**

![](https://i.imgur.com/4ng1bzh.png)

**EFK**

![](https://i.imgur.com/AiE1Uja.jpg)
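One detail worth knowing before moving on: the index names you will later type into Kibana come from fluent-plugin-elasticsearch's `logstash_format` option (enabled in the config map further below), which writes one Elasticsearch index per day named `<logstash_prefix>-YYYY.MM.DD`. A minimal Python sketch of that naming scheme (the `logstash_index` helper is illustrative only, not part of any tool; it assumes the plugin's default UTC daily rotation):

```python
from datetime import datetime, timezone

def logstash_index(prefix: str, ts: datetime) -> str:
    """Mimic fluent-plugin-elasticsearch's logstash_format naming:
    one index per UTC day, named "<prefix>-YYYY.MM.DD"."""
    return f"{prefix}-{ts.astimezone(timezone.utc).strftime('%Y.%m.%d')}"

# An access-log event from 2021-08-17 lands in this index, which is also
# the index pattern created in Kibana at the end of this post:
print(logstash_index("nginx.aplog", datetime(2021, 8, 17, 6, 0, tzinfo=timezone.utc)))
# → nginx.aplog-2021.08.17
```

Because rotation is by UTC day, an event logged near local midnight can land in a neighboring day's index; keep that in mind if a Kibana index pattern seems to be missing fresh logs.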
## 3. build elasticsearch fluentd image

The default fluentd image does not ship the Elasticsearch plugin, so we need to add fluent-plugin-elasticsearch; this build uses version 1.13.3.

```dockerfile=
# AUTOMATICALLY GENERATED
# DO NOT EDIT THIS FILE DIRECTLY, USE /Dockerfile.template.erb

FROM alpine:3.13
LABEL Description="this is sidecar sample."

# Do not split this into multiple RUN!
# Docker creates a layer for every RUN-Statement
# therefore an 'apk delete' has no effect
RUN apk update \
 && apk add --no-cache \
        ca-certificates \
        ruby ruby-irb ruby-etc ruby-webrick \
        tini \
 && apk add --no-cache --virtual .build-deps \
        build-base linux-headers \
        ruby-dev gnupg \
 && echo 'gem: --no-document' >> /etc/gemrc \
 && gem install oj -v 3.10.18 \
 && gem install json -v 2.4.1 \
 && gem install async-http -v 0.54.0 \
 && gem install ext_monitor -v 0.1.2 \
 && gem install fluentd -v 1.13.3 \
 && gem install bigdecimal -v 1.4.4 \
 && gem install fluent-plugin-elasticsearch -v 1.13.3 \
 # NOTE: resolv v0.2.1 includes the fix for CPU spike issue due to DNS resolver.
 # This hack is needed for Ruby 2.6.7, 2.7.3 and 3.0.1. (alpine image is still kept on 2.7.3)
 && gem install resolv -v 0.2.1 \
 && apk del .build-deps \
 && rm -rf /tmp/* /var/tmp/* /usr/lib/ruby/gems/*/cache/*.gem /usr/lib/ruby/gems/2.*/gems/fluentd-*/test

RUN addgroup -S fluent && adduser -S -g fluent fluent \
 # for log storage (maybe shared with host)
 && mkdir -p /fluentd/log \
 # configuration/plugins path (default: copied from .)
 && mkdir -p /fluentd/etc /fluentd/plugins \
 && chown -R fluent /fluentd && chgrp -R fluent /fluentd

COPY fluent.conf /fluentd/etc/
COPY entrypoint.sh /bin/

ENV FLUENTD_CONF="fluent.conf"
ENV LD_PRELOAD=""

# NOTE: resolv v0.2.1 includes the fix for CPU spike issue due to DNS resolver.
# Forcing to load specific version of resolv (instead of bundled by default) is needed for Ruby 2.6.7, 2.7.3 and 3.0.1.
# alpine image is still kept on 2.7.3.
# See https://pkgs.alpinelinux.org/packages?name=ruby&branch=v3.13
ENV RUBYLIB="/usr/lib/ruby/gems/2.7.0/gems/resolv-0.2.1/lib"

EXPOSE 24224 5140

USER fluent

ENTRYPOINT ["tini", "--", "/bin/entrypoint.sh"]
CMD ["fluentd"]
```

:::warning
1. Before building the image, fetch entrypoint.sh and fluent.conf from the [official fluent GitHub repo](https://github.com/fluent/fluentd-docker-image/tree/master/v1.13/alpine "fluent github") and place them in the same directory as the Dockerfile.
2. entrypoint.sh needs the execute bit; remember to add it with chmod a+x.
:::

Build the image

```shell=
inwin@rancher:~/efkes$ sudo docker build -t sidecarfd .
[sudo] password for inwin:
Sending build context to Docker daemon  6.656kB
Step 1/13 : FROM alpine:3.13
3.13: Pulling from library/alpine
540db60ca938: Pull complete
Digest: sha256:1d30d1ba3cb90962067e9b29491fbd56997979d54376f23f01448b5c5cd8b462
Status: Downloaded newer image for alpine:3.13
 ---> 6dbb9cc54074
...
...
...
Step 13/13 : CMD ["fluentd"]
 ---> Running in e8a26532a538
Removing intermediate container e8a26532a538
 ---> d5ddba0a3a7b
Successfully built d5ddba0a3a7b
Successfully tagged sidecarfd:latest
```

List the image, then tag it and push it to Docker Hub

```shell=
inwin@rancher:~/efkes$ sudo docker image list
REPOSITORY   TAG       IMAGE ID       CREATED          SIZE
sidecarfd    latest    d5ddba0a3a7b   19 seconds ago   50.5MB
inwin@rancher:~/efkes$ sudo docker image tag sidecarfd yansheng133/sidecarfd:0.1
inwin@rancher:~/efkes$ sudo docker push yansheng133/sidecarfd:0.1
The push refers to repository [docker.io/yansheng133/sidecarfd]
9f26f7ca8b1e: Pushed
a4d4c803b809: Pushed
c0628dc213c6: Pushed
053fabb5278f: Pushed
b2d5eeeaba3a: Mounted from library/alpine
0.1: digest: sha256:7f16aadd6b9677059c21188b63c443daa3e4b5f939e07f37e7e46db9c5c79c6f size: 1362
```

:::info
1. If you do not have a Docker Hub account, register one and [login](https://docs.docker.com/engine/reference/commandline/login/ "login") before pushing.
:::

[Confirm that the image has been uploaded](https://hub.docker.com/repository/docker/yansheng133/sidecarfd "yansheng133 sidecar sample")

![](https://i.imgur.com/zKnRpmr.png)

That completes the fluentd image; we are ready to move on to the sidecar design.
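The image we just built will tail an nginx access log using fluentd's built-in `nginx` parser (selected with `format nginx` in the config map of the next section). To see what that parser extracts from each line, here is a rough Python approximation of its regex — an illustrative sketch, not the exact expression fluentd ships — with field names mirroring fluentd's:

```python
import re

# Rough approximation of fluentd's built-in "nginx" parser (assumption:
# not the exact regex fluentd ships). Field names mirror fluentd's.
NGINX_LOG = re.compile(
    r'^(?P<remote>[^ ]*) (?P<host>[^ ]*) (?P<user>[^ ]*) '
    r'\[(?P<time>[^\]]*)\] "(?P<method>\S+) (?P<path>[^"]*?) \S*" '
    r'(?P<code>[^ ]*) (?P<size>[^ ]*) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# One line in nginx's default combined format, as written to access.log:
line = ('10.42.1.1 - - [17/Aug/2021:06:00:00 +0000] '
        '"GET / HTTP/1.1" 200 612 "-" "curl/7.64.0"')
record = NGINX_LOG.match(line).groupdict()
print(record["method"], record["path"], record["code"])
# → GET / 200
```

Each matched line becomes a structured record in Elasticsearch, which is what makes the per-field filtering in Kibana possible later on.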
## 4. pod design(sidecar)

Pod design is one of the more interesting parts of application development. When the application and its config map are designed to work well together, swapping configuration is easy: update the config map, delete the pod, and let it re-read the config on restart. Whether deleting a pod is acceptable still has to be evaluated and confirmed per application type, and note that updating a config map does not automatically refresh the configuration inside a running pod.

A quick primer on config maps: a config map is a mechanism that lets a pod mount a specified configuration directly. An entire configuration file can be embedded in it and mounted into a target directory through a volume at pod creation time. A config map can of course hold other things as well, such as variables, but that is a long topic, so let us stay focused on the point of this article: getting the `AP Log` into EFK.

In the config map below, the fluentd log path is /var/log/access.log and the Elasticsearch endpoint is elasticsearch-master.efk.svc.cluster.local.

:::warning
1. Note that this address cannot be reached from outside the cluster: the cluster IP is not exposed externally, and the name only resolves inside pods.
2. If you want to use a different endpoint, remember to substitute it.
:::

**Recap of the efk service**

```shell=
inwin@master:~$ kubectl -n efk get svc elasticsearch-master
NAME                           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
service/elasticsearch-master   ClusterIP   10.43.123.194   <none>        9200/TCP,9300/TCP   26h
```

The fluentd config map below is a basic example; for more applications see the [official site](https://docs.fluentd.org/ "fluentd"). Logs will use the logstash format, prefixed with nginx.aplog, which can later be used as the keyword when creating the index pattern in Kibana; creating the index is covered shortly.

```yaml=
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config-test
data:
  fluent.conf: |
    <source>
      @type tail
      format nginx
      path /var/log/access.log
      pos_file /var/log/access.log.pos
      tag nginx.aplog
    </source>
    <match **>
      @type elasticsearch
      logstash_format true
      logstash_prefix nginx.aplog
      host elasticsearch-master.efk.svc.cluster.local
      port 9200
      index_name nginx.aplog #(optional; default=fluentd)
      flush_interval 60s
      reconnect_on_error true
      reload_on_failure true
      reload_connections false
      time_key @log_time
      time_format %Y%m%dT%H%M%S%z
    </match>
```

That completes the config map. Now for the central question of this pod design: how should the sidecar be built? As mentioned earlier, we need a second container that mounts a designated directory and forwards the logs, so the crux is how to share a common volume.

The efkweb sample below achieves that goal. Here is what we need:

1. Create an emptyDir volume (web-volume) to hold the web logs and share them with fluentd.
2. Create a volume holding the config map (fluentd-config-test).
3. Create a web container using the nginx image, exposing port 80 and mounting the web-volume volume so the web logs are stored there.
4. Create a count-agent container using the image built earlier (this example uses the author's image yansheng133/sidecarfd:0.2), mounting both the web-volume volume and the config map volume (fluentd-config-test).
5. Create a service (efkweb) using NodePort mode, exposing port 32100.

```yaml=
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: efkweb
  name: efkweb
spec:
  containers:
  - image: nginx
    name: efkweb
    ports:
    - containerPort: 80
    volumeMounts:
    - mountPath: /var/log/nginx/
      name: web-volume
    resources: {}
  - name: count-agent
    image: yansheng133/sidecarfd:0.2
    env:
    - name: FLUENTD_ARGS
      value: -c /fluentd/etc/fluent.conf
    volumeMounts:
    - name: web-volume
      mountPath: /var/log
    - name: config-volume
      mountPath: /fluentd/etc/
  dnsPolicy: ClusterFirst
  restartPolicy: Always
  volumes:
  - name: web-volume
    emptyDir: {}
  - name: config-volume
    configMap:
      name: fluentd-config-test
---
apiVersion: v1
kind: Service
metadata:
  labels:
    run: efkweb
  name: efkweb
spec:
  ports:
  - nodePort: 32100
    port: 80
    protocol: TCP
    targetPort: 80
  selector:
    run: efkweb
  type: NodePort
status:
  loadBalancer: {}
```

Create the config map and the aplog pod

```shell=
inwin@rancher:~/efkes$ kubectl create -f fluent_cm.yaml
configmap/fluentd-config-test created
inwin@rancher:~/efkes$ kubectl create -f aplog.yaml
pod/efkweb created
service/efkweb created
inwin@rancher:~/efkes$ kubectl get po efkweb
NAME     READY   STATUS    RESTARTS   AGE
efkweb   2/2     Running   0          53s
```

Create an nginx pod and install the watch utility inside it

```shell=
inwin@rancher:~/efkes$ kubectl run accessgen --image=nginx
pod/accessgen created
inwin@rancher:~/efkes$ kubectl exec -it accessgen -- bash
root@accessgen:/# apt update
Get:1 http://security.debian.org/debian-security buster/updates InRelease [65.4 kB]
Get:2 http://deb.debian.org/debian buster InRelease [122 kB]
Get:3 http://deb.debian.org/debian buster-updates InRelease [51.9 kB]
Get:4 http://security.debian.org/debian-security buster/updates/main amd64 Packages [301 kB]
Get:5 http://deb.debian.org/debian buster/main amd64 Packages [7907 kB]
Get:6 http://deb.debian.org/debian buster-updates/main amd64 Packages [15.2 kB]
Fetched 8463 kB in 3s (2754 kB/s)
Reading package lists... Done
Building dependency tree
Reading state information... Done
4 packages can be upgraded.
Run 'apt list --upgradable' to see them.
root@accessgen:/# apt install watch -y
Reading package lists... Done
Building dependency tree
Reading state information... Done
....
....
....
update-alternatives: warning: skip creation of /usr/share/man/man1/w.1.gz because associated file /usr/share/man/man1/w.procps.1.gz (of link group w) doesn't exist
Processing triggers for libc-bin (2.28-10) ...
root@accessgen:/# exit
```

Start generating logs

```shell=
inwin@rancher:~/efkes$ kubectl exec -it accessgen -- bash
root@accessgen:/# watch -n 0.1 curl efkweb.default.svc
```

Leave the accessgen pod generating logs continuously; next we need to set up the Kibana side.

## 5. Kibana setup

Create the Kibana index: click the gear icon at the bottom left, then click Index Patterns.

![](https://i.imgur.com/BTHZclj.png)

Enter nginx.aplog-2021.XX.XX (08.17 in this example) and click Next step.

![](https://i.imgur.com/F0lTzPI.png)

Choose the Time Filter Field Name

![](https://i.imgur.com/fwOiXiA.png)

Select @timestamp and click Create index pattern.

![](https://i.imgur.com/RrufvxK.png)

Confirm that the pattern was created.

![](https://i.imgur.com/kR6xu3p.png)

Click the compass icon (Discover)

![](https://i.imgur.com/EX1Fna1.png)

In the drop-down menu, select the index pattern created a moment ago (nginx.aplog-2021.08.17)

![](https://i.imgur.com/on4U2iR.png)

Confirm (through Kibana) that the logs have been ingested into Elasticsearch

![](https://i.imgur.com/Q7tjZFC.png)

That is the complete journey: building EFK, building a usable fluentd docker image, designing the config map, designing the sidecar, configuring Kibana, and finally presenting the AP logs.

## 6. reference

1. [OpenShift](https://www.redhat.com/en/technologies/cloud-computing/openshift "OpenShift")
2. [Rancher](https://rancher.com/ "Rancher")
3. [fluent 1.13 docker files and others](https://github.com/fluent/fluentd-docker-image/tree/master/v1.13/alpine "fluent 1.13")
4. [entrypoint.sh execute permission issue](https://github.community/t/permission-denied-exec-entrypoint-sh/16216 "entrypoint.sh execute permission issue")
5. [fluentd official site](https://docs.fluentd.org/ "fluentd")
6. [Elastic Kibana 快速入門](https://linyencheng.github.io/2020/09/10/elastic-kibana-quick-start/ "Elastic Kibana 快速入門")
7. [Kubernetes Sidecar – Logging with FluentD to EFK](https://www.middlewareinventory.com/blog/kubernetes-sidecar-logging-with-fluentd-to-efk/ "Kubernetes Sidecar – Logging with FluentD to EFK")
8. [在 Kubernetes 上搭建 EFK 日志收集系统](https://www.qikqiak.com/post/install-efk-stack-on-k8s/ "在 Kubernetes 上搭建 EFK 日志收集系统")
9. [fluentd stop sending logs to elasticsearch after a few hours](https://github.com/fluent/fluentd/issues/2334 "fluentd stop sending logs to elasticsearch after a few hours")
10. [踩坑 - fluentd daemonset failed to flush the buffer](https://blog.downager.com/2019/11/24/%E8%B8%A9%E5%9D%91-fluentd-daemonset-failed-to-flush-the-buffer/ "踩坑 - fluentd daemonset failed to flush the buffer")
11. [Fluentd 使用自定 Log 時間當做 Timestamp](https://blog.yowko.com/fluentd-log-time/ "Fluentd 使用自定 Log 時間當做 Timestamp")
12. [學習 Fluentd(三):寫入 Elasticsearch](https://www.gushiciku.cn/pl/p89f/zh-tw "學習 Fluentd(三):寫入 Elasticsearch")