# Set up OpenShift Service Mesh 3.0 + Distributed Tracing (Tempo)

## !!! This document is for validating the service mesh architecture; it is NOT a production installation guide !!!

## Environment and important pre-sets

- OCP 4.16 cluster
- Minio is installed in ns/minio
- Minio Access Key: `tempo`
- Minio Secret Key: `supersecret`
- Minio's svc is named `minio`
- TempoStack is installed in ns/tempo
- TempoStack is named `sample`
- An OpenTelemetryCollector named `otel` is installed in ns/istio-system
- The otel collector's svc is named `otel-collector`

## Install the NFS Provisioner Operator (any other StorageClass also works)

### Install the Operator

```bash=
# Create a new namespace
oc new-project nfsprovisioner-operator

# Deploy the NFS Provisioner operator from the terminal (you can also use the OpenShift Console)
cat << EOF | oc apply -f -
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: nfs-provisioner-operator
  namespace: openshift-operators
spec:
  channel: alpha
  installPlanApproval: Automatic
  name: nfs-provisioner-operator
  source: community-operators
  sourceNamespace: openshift-marketplace
EOF
```

### Create the NFS server mount path on a worker node

```bash=
# Check nodes
oc get nodes
NAME                                      STATUS   ROLES                  AGE   VERSION
master-0.smtest.lab.psi.pnq2.redhat.com   Ready    control-plane,master   44m   v1.29.6+aba1e8d
master-1.smtest.lab.psi.pnq2.redhat.com   Ready    control-plane,master   44m   v1.29.6+aba1e8d
master-2.smtest.lab.psi.pnq2.redhat.com   Ready    control-plane,master   44m   v1.29.6+aba1e8d
worker-0.smtest.lab.psi.pnq2.redhat.com   Ready    worker                 22m   v1.29.6+aba1e8d
worker-1.smtest.lab.psi.pnq2.redhat.com   Ready    worker                 22m   v1.29.6+aba1e8d

# Set an env variable for the target node name
export target_node=worker-0.smtest.lab.psi.pnq2.redhat.com
oc label node/${target_node} app=nfs-provisioner

# ssh to the node
oc debug node/${target_node}

# Create a directory and set up the SELinux label
chroot /host
mkdir -p /home/core/nfs
chcon -Rvt svirt_sandbox_file_t /home/core/nfs
exit
exit
```
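Before moving on, it can help to confirm that the operator subscription resolved and that the SELinux label actually landed on the export directory. A minimal sanity check, assuming the same `target_node` variable as above (the CSV name pattern is an assumption and may differ):

```bash=
# Confirm the operator CSV reached the Succeeded phase (name pattern may vary)
oc get csv -n openshift-operators | grep -i nfs

# Re-enter the node and check the SELinux context of the export directory;
# expect a label containing svirt_sandbox_file_t
oc debug node/${target_node} -- chroot /host ls -Zd /home/core/nfs
```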
### Create the NFSProvisioner custom resource

```bash=
# Create the NFSProvisioner custom resource
cat << EOF | oc apply -f -
apiVersion: cache.jhouse.com/v1alpha1
kind: NFSProvisioner
metadata:
  name: nfsprovisioner-sample
  namespace: nfsprovisioner-operator
spec:
  hostPathDir: /home/core/nfs
  nodeSelector:
    app: nfs-provisioner
EOF

# Check that the NFS server is running
oc get pod
NAME                               READY   STATUS    RESTARTS   AGE
nfs-provisioner-77bc99bd9c-57jf2   1/1     Running   0          2m32s
```

If the provisioner pod cannot mount the hostPath, grant its service account the `hostmount-anyuid` SCC:

```bash=
oc adm policy add-scc-to-user hostmount-anyuid -z nfs-provisioner -n nfsprovisioner-operator
```

```bash=
# Make the NFS StorageClass the default by annotating it
oc patch storageclass nfs -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

# Check that "(default)" appears next to the nfs StorageClass
oc get sc
NAME            PROVISIONER       RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
nfs (default)   example.com/nfs   Delete          Immediate           false                  4m29s
```

## Install Minio to provide object storage

```bash=
# Create the ns for minio
oc new-project minio

# Create a StatefulSet for minio
cat << EOF | oc apply -f -
kind: StatefulSet
apiVersion: apps/v1
metadata:
  name: minio
  namespace: minio
  labels:
    app: minio
spec:
  serviceName: minio
  revisionHistoryLimit: 10
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Retain
    whenScaled: Retain
  volumeClaimTemplates:
    - kind: PersistentVolumeClaim
      apiVersion: v1
      metadata:
        name: data
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
        storageClassName: nfs # << should match your default SC
        volumeMode: Filesystem
  template:
    metadata:
      labels:
        app: minio
    spec:
      containers:
        - resources: {}
          terminationMessagePath: /dev/termination-log
          name: minio
          env:
            - name: MINIO_ACCESS_KEY
              value: tempo
            - name: MINIO_SECRET_KEY
              value: supersecret
            - name: MINIO_KMS_SECRET_KEY
              value: 'my-minio-key:oyArl7zlPECEduNbB1KXgdzDn2Bdpvvw0l8VO51HQnY='
          ports:
            - containerPort: 9000
              protocol: TCP
          imagePullPolicy: Always
          volumeMounts:
            - name: data
              mountPath: /data
          terminationMessagePolicy: File
          image: 'quay.io/rhn_support_jaliang/minio/minio:RELEASE.2023-05-27T05-56-19Z'
          args:
            - server
            - /data
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
      dnsPolicy: ClusterFirst
      securityContext: {}
      schedulerName: default-scheduler
  podManagementPolicy: OrderedReady
  replicas: 1
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 0
  selector:
    matchLabels:
      app: minio
EOF
```

Create a Service for Minio:

```bash=
cat << EOF | oc apply -f -
apiVersion: v1
kind: Service
metadata:
  name: minio
  labels:
    app: minio
spec:
  type: NodePort
  ports:
    - port: 9000
      name: minio
      nodePort: 30080
  selector:
    app: minio
EOF
```

Create a route for the minio client:

```bash=
oc expose svc/minio -n minio
```
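Before pointing Tempo at it, you can check that Minio is actually serving. A minimal liveness probe against the route created above (`/minio/health/live` is part of MinIO's standard health API and returns HTTP 200 when the server is up):

```bash=
MINIO_HOST=$(oc get route minio -n minio -o jsonpath='{.spec.host}')

# Expect "200" if the MinIO server process is healthy
curl -s -o /dev/null -w "%{http_code}\n" "http://${MINIO_HOST}/minio/health/live"
```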
Use the minio client to create a bucket for Tempo from the bastion:

```bash=
curl https://dl.min.io/client/mc/release/linux-amd64/mc \
  --create-dirs \
  -o $HOME/minio-binaries/mc

chmod +x $HOME/minio-binaries/mc
export PATH=$PATH:$HOME/minio-binaries/

[quickcluster@upi-0 ~]$ oc get route
NAME    HOST/PORT                                         PATH   SERVICES   PORT    TERMINATION   WILDCARD
minio   minio-minio.apps.smtest.lab.psi.pnq2.redhat.com          minio      minio                 None

[quickcluster@upi-0 ~]$ mc alias set myminio http://minio-minio.apps.smtest.lab.psi.pnq2.redhat.com --insecure
mc: Configuration written to `/home/quickcluster/.mc/config.json`. Please update your access credentials.
mc: Successfully created `/home/quickcluster/.mc/share`.
mc: Initialized share uploads `/home/quickcluster/.mc/share/uploads.json` file.
mc: Initialized share downloads `/home/quickcluster/.mc/share/downloads.json` file.
Enter Access Key: tempo
Enter Secret Key:
Added `myminio` successfully.

[quickcluster@upi-0 ~]$ mc ready myminio
The cluster is ready

[quickcluster@upi-0 ~]$ mc mb myminio/tempo
Bucket created successfully `myminio/tempo`.
```

## Install the Service Mesh Operator 3.0

### Install the required Operators

https://docs.redhat.com/en/documentation/red_hat_openshift_service_mesh/3.0/html-single/installing/index#ossm-installing-operator_ossm-about-deployment-and-update-strategies

Install the Service Mesh Operator 3.0 with the default settings.

![image](https://hackmd.io/_uploads/BJRRs5X0yx.png)

### Create ns/istio-system and an Istio resource

```bash=
oc new-project istio-system

cat << EOF | oc apply -f -
kind: Istio
apiVersion: sailoperator.io/v1
metadata:
  name: default
spec:
  namespace: istio-system
  updateStrategy:
    inactiveRevisionDeletionGracePeriodSeconds: 30
    type: InPlace
  version: v1.24.3
EOF
```

### Create ns/istio-cni and an IstioCNI resource

```bash=
oc new-project istio-cni
```

```bash=
cat << EOF | oc apply -f -
kind: IstioCNI
apiVersion: sailoperator.io/v1
metadata:
  name: default
spec:
  namespace: istio-cni
  version: v1.24.3
EOF
```

### Label istio-system to enable service mesh discovery

```bash=
oc label namespace istio-system istio-discovery=enabled
```

Modify the Istio control plane resource to include a discoverySelectors section with the same label:

```bash=
oc edit Istio default

kind: Istio
apiVersion: sailoperator.io/v1
metadata:
  name: default
spec:
  namespace: istio-system
  values:                            # ADD
    meshConfig:                      # ADD
      discoverySelectors:            # ADD
        - matchLabels:               # ADD
            istio-discovery: enabled # ADD
```

With this configuration, every namespace labeled `istio-discovery: enabled` joins the mesh.
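Before enrolling workloads, it is worth confirming the control plane reports healthy. A minimal check, assuming the sail-operator's Istio resource exposes a `Ready` condition (it does in current releases, but treat the exact condition name as an assumption):

```bash=
# Wait until the control plane is reconciled and istiod is serving
oc wait --for=condition=Ready istio/default --timeout=3m

# istiod should be Running in istio-system
oc get pods -n istio-system
```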
## Join a workload to the mesh

https://docs.redhat.com/en/documentation/red_hat_openshift_service_mesh/3.0/html-single/installing/index#deploying-book-info_ossm-about-bookinfo-application

### Create a project for the bookinfo sample application

```bash=
oc new-project bookinfo
oc label namespace bookinfo istio-discovery=enabled istio-injection=enabled
oc apply -f https://raw.githubusercontent.com/openshift-service-mesh/istio/release-1.24/samples/bookinfo/platform/kube/bookinfo.yaml -n bookinfo
```

### Create a Service Mesh ingress gateway in ns/bookinfo

Note: the ingress gateway can also be installed centrally in ns/istio-system, as in Service Mesh 2.0. Which approach to use depends on the gateway selector in the application's Gateway objects later on.

```bash=
cat << EOF | oc apply -f -
apiVersion: v1
kind: Service
metadata:
  name: istio-ingressgateway
spec:
  type: ClusterIP
  selector:
    istio: ingressgateway
  ports:
    - name: http2
      port: 80
      targetPort: 8080
    - name: https
      port: 443
      targetPort: 8443
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: istio-ingressgateway
spec:
  selector:
    matchLabels:
      istio: ingressgateway
  template:
    metadata:
      annotations:
        # Select the gateway injection template (rather than the default sidecar template)
        inject.istio.io/templates: gateway
      labels:
        # Set a unique label for the gateway. This is required to ensure Gateways can select this workload
        istio: ingressgateway
        # Enable gateway injection. If connecting to a revisioned control plane, replace with "istio.io/rev: revision-name"
        sidecar.istio.io/inject: "true"
    spec:
      containers:
        - name: istio-proxy
          image: auto # The image will automatically update each time the pod starts
---
# Set up roles to allow reading credentials for TLS
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: istio-ingressgateway-sds
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: istio-ingressgateway-sds
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: istio-ingressgateway-sds
subjects:
  - kind: ServiceAccount
    name: default
#---
# Allow outside traffic to access the gateway.
# This is optional and only needed if your cluster contains restrictive NetworkPolicies
# (i.e. a default deny-all policy); this rule acts as an allowlist.
#apiVersion: networking.k8s.io/v1
#kind: NetworkPolicy
#metadata:
#  name: gatewayingress
#spec:
#  podSelector:
#    matchLabels:
#      istio: ingressgateway
#  ingress:
#    - {}
#  policyTypes:
#    - Ingress
EOF
```

### Create a Gateway definition for the bookinfo app

```bash=
cat << EOF | oc apply -f -
apiVersion: networking.istio.io/v1
kind: Gateway
metadata:
  name: bookinfo-gateway
  namespace: bookinfo
spec:
  # The selector matches the ingress gateway pod labels,
  # i.e. the label on the ingress gateway Deployment above.
  # If you installed Istio using Helm following the standard documentation, this would be "istio=ingress"
  selector:
    istio: ingressgateway # use istio default controller
  servers:
    - port:
        number: 8080
        name: http
        protocol: HTTP
      hosts:
        - "*"
---
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: bookinfo
  namespace: bookinfo
spec:
  hosts:
    - "*"
  gateways:
    - bookinfo-gateway
  http:
    - match:
        - uri:
            exact: /productpage
        - uri:
            prefix: /static
        - uri:
            exact: /login
        - uri:
            exact: /logout
        - uri:
            prefix: /api/v1/products
      route:
        - destination:
            host: productpage
            port:
              number: 9080
EOF
```

### Expose the istio ingress gateway svc

```bash=
oc expose service istio-ingressgateway -n bookinfo

HOST=$(oc get route istio-ingressgateway -n bookinfo -o jsonpath='{.spec.host}')
curl -k -I "http://$HOST/productpage"

HTTP/1.1 200 OK
server: istio-envoy
date: Thu, 10 Apr 2025 02:32:21 GMT
content-type: text/html; charset=utf-8
content-length: 9429
vary: Cookie
x-envoy-upstream-service-time: 77
set-cookie: cd10b69e39387eb7ec9ac241201ab1ab=3a438e42bb8cb835119129d0c202e377; path=/; HttpOnly
```
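The 200 above only proves the route works; to confirm the workloads are actually meshed, check that each bookinfo pod runs the injected istio-proxy sidecar. A quick sketch (pod names will differ in your cluster):

```bash=
# Every bookinfo pod should show READY 2/2 (app container + istio-proxy)
oc get pods -n bookinfo

# List container names per pod to confirm the istio-proxy sidecar is present
oc get pods -n bookinfo -o jsonpath='{range .items[*]}{.metadata.name}{": "}{range .spec.containers[*]}{.name}{" "}{end}{"\n"}{end}'
```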
## Enable Service Mesh distributed tracing with Tempo

https://docs.redhat.com/en/documentation/red_hat_openshift_service_mesh/3.0/html-single/observability/index#ossm-distr-tracing

### Install the Tempo Operator with the default settings

![image](https://hackmd.io/_uploads/ByiZBjroC.png)

### Create a TempoStack in ns/tempo

```bash=
oc new-project tempo
```

Check minio's svc under ns/minio:

```bash=
[quickcluster@upi-0 ~]$ oc get svc -n minio
NAME    TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)          AGE
minio   NodePort   172.30.63.66   <none>        9000:30080/TCP   8m9s
```

>> minio.minio.svc:9000

Create a Secret with the S3 endpoint info for Tempo:

```bash=
cat << EOF | oc apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: minio
  namespace: tempo
stringData:
  endpoint: http://minio.minio.svc:9000
  bucket: tempo
  access_key_id: tempo
  access_key_secret: supersecret
type: Opaque
EOF
```

Create a TempoStack named `sample` under ns/tempo:

```bash=
cat << EOF | oc apply -f -
apiVersion: tempo.grafana.com/v1alpha1
kind: TempoStack
metadata:
  name: sample
  namespace: tempo
spec:
  storageSize: 1Gi # 1Gi is for POC; the default is 10Gi
  storage:
    secret:
      name: minio
      type: s3
  resources:
    total:
      limits:
        memory: 2Gi
        cpu: 2000m
  template:
    queryFrontend:
      jaegerQuery:
        enabled: true
        ingress:
          route:
            termination: edge
          type: route
EOF
```

Once the TempoStack is ready, you will be able to access the Jaeger console via the route in ns/tempo:

```bash=
[quickcluster@upi-0 ~]$ oc get route -n tempo
NAME                          HOST/PORT                                                                PATH   SERVICES                      PORT        TERMINATION   WILDCARD
tempo-sample-query-frontend   tempo-sample-query-frontend-tempo.apps.smtest.lab.psi.pnq2.redhat.com           tempo-sample-query-frontend   jaeger-ui   edge          None
```

### Install the Red Hat build of OpenTelemetry Operator

Install the Red Hat build of OpenTelemetry Operator with the default settings.

https://docs.redhat.com/en/documentation/openshift_container_platform/4.16/html/red_hat_build_of_opentelemetry/install-otel

![image](https://hackmd.io/_uploads/H1cZbnEAJl.png)

### Create an OpenTelemetryCollector in ns/istio-system

```bash=
oc project istio-system

cat << EOF | oc apply -f -
kind: OpenTelemetryCollector
apiVersion: opentelemetry.io/v1beta1
metadata:
  name: otel
  namespace: istio-system
spec:
  observability:
    metrics: {}
  deploymentUpdateStrategy: {}
  config:
    exporters:
      otlp:
        endpoint: 'tempo-sample-distributor.tempo.svc.cluster.local:4317'
        tls:
          insecure: true
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: '0.0.0.0:4317'
          http: {}
    service:
      pipelines:
        traces:
          exporters:
            - otlp
          receivers:
            - otlp
EOF
```

Check the otel-collector pod and svc:

```bash=
[quickcluster@upi-0 ~]$ oc get po,svc -n istio-system
NAME                                  READY   STATUS    RESTARTS   AGE
pod/otel-collector-7659b9d696-ktsgk   2/2     Running   0          112s

NAME                                TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/otel-collector              ClusterIP   172.30.215.76   <none>        4317/TCP   112s
service/otel-collector-headless     ClusterIP   None            <none>        4317/TCP   112s
service/otel-collector-monitoring   ClusterIP   172.30.185.70   <none>        8888/TCP   112s
```

### Enable tracing in Red Hat OpenShift Service Mesh and define the tracing provider in meshConfig

```bash=
oc edit Istio default

apiVersion: sailoperator.io/v1
kind: Istio
metadata:
  # ...
  name: default
spec:
  namespace: istio-system
  # ...
  values:
    meshConfig:                                                    # <- Add
      enableTracing: true                                          # <- Add
      extensionProviders:                                          # <- Add
        - name: otel                                               # <- Add
          opentelemetry:                                           # <- Add
            port: 4317                                             # <- Add
            service: otel-collector.istio-system.svc.cluster.local # <- Add
```

### Create a Telemetry resource referencing the extensionProvider defined above

```bash=
cat << EOF | oc apply -f -
apiVersion: telemetry.istio.io/v1
kind: Telemetry
metadata:
  name: otel-demo
  namespace: istio-system
spec:
  tracing:
    - providers:
        - name: otel
      randomSamplingPercentage: 100
EOF
```

You can now generate some traffic to bookinfo and check the Jaeger frontend console:

```bash=
HOST=$(oc get route istio-ingressgateway -n bookinfo -o jsonpath='{.spec.host}')
curl -k -I "http://$HOST/productpage"

HTTP/1.1 200 OK
server: istio-envoy
date: Thu, 10 Apr 2025 02:32:21 GMT
content-type: text/html; charset=utf-8
content-length: 9429
vary: Cookie
x-envoy-upstream-service-time: 77
set-cookie: cd10b69e39387eb7ec9ac241201ab1ab=3a438e42bb8cb835119129d0c202e377; path=/; HttpOnly
```

```bash=
[quickcluster@upi-0 ~]$ oc get route -n tempo
NAME                          HOST/PORT                                                                    PATH   SERVICES                      PORT        TERMINATION   WILDCARD
tempo-sample-query-frontend   tempo-sample-query-frontend-tempo.apps.smtest.lab.upshift.rdu2.redhat.com           tempo-sample-query-frontend   jaeger-ui   edge          None
```
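Instead of opening the UI, you can also confirm traces are landing in Tempo from the CLI. A sketch using the Jaeger HTTP API that the query frontend exposes (`productpage.bookinfo` follows Istio's usual service-naming convention, so treat the exact service name as an assumption):

```bash=
TEMPO_HOST=$(oc get route tempo-sample-query-frontend -n tempo -o jsonpath='{.spec.host}')

# List services that have reported spans; expect the bookinfo services after some traffic
curl -sk "https://${TEMPO_HOST}/api/services"

# Fetch a few recent traces for the productpage service
curl -sk "https://${TEMPO_HOST}/api/traces?service=productpage.bookinfo&limit=5" | head -c 500
```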
## Integrate Service Mesh 3.0 with Kiali

https://docs.redhat.com/en/documentation/red_hat_openshift_service_mesh/3.0/html-single/observability/index#ossm-kiali-about_ossm-kiali

Install the Kiali Operator with the default settings.

![image](https://hackmd.io/_uploads/SyuswsNAkx.png)

### Enable User Workload Monitoring

```bash=
oc -n openshift-monitoring edit configmap cluster-monitoring-config

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    enableUserWorkload: true # ADD
```

### Create a ServiceMonitor to monitor the Istio control plane

https://docs.redhat.com/en/documentation/red_hat_openshift_service_mesh/3.0/html-single/observability/index#ossm-config-openshift-monitoring-only_ossm-metrics

```bash=
cat << EOF | oc apply -f -
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: istiod-monitor
  namespace: istio-system
spec:
  targetLabels:
    - app
  selector:
    matchLabels:
      istio: pilot
  endpoints:
    - port: http-monitoring
      interval: 30s
EOF
```
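Before expecting the ServiceMonitor above to be scraped, confirm the user-workload monitoring stack actually started after the ConfigMap change:

```bash=
# Prometheus and Thanos Ruler pods should appear here within a minute or two
oc get pods -n openshift-user-workload-monitoring
```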
### Create PodMonitors to collect metrics from the Istio proxies

1. For the istio-system namespace:

```bash=
vi podMonitor-istioSystem.yaml

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: istio-proxies-monitor
  namespace: istio-system
spec:
  selector:
    matchExpressions:
      - key: istio-prometheus-ignore
        operator: DoesNotExist
  podMetricsEndpoints:
    - path: /stats/prometheus
      interval: 30s
      relabelings:
        - action: keep
          sourceLabels: ["__meta_kubernetes_pod_container_name"]
          regex: "istio-proxy"
        - action: keep
          sourceLabels: ["__meta_kubernetes_pod_annotationpresent_prometheus_io_scrape"]
        - action: replace
          regex: (\d+);(([A-Fa-f0-9]{1,4}::?){1,7}[A-Fa-f0-9]{1,4})
          replacement: '[$2]:$1'
          sourceLabels: ["__meta_kubernetes_pod_annotation_prometheus_io_port","__meta_kubernetes_pod_ip"]
          targetLabel: "__address__"
        - action: replace
          regex: (\d+);((([0-9]+?)(\.|$)){4})
          replacement: '$2:$1'
          sourceLabels: ["__meta_kubernetes_pod_annotation_prometheus_io_port","__meta_kubernetes_pod_ip"]
          targetLabel: "__address__"
        - sourceLabels: ["__meta_kubernetes_pod_label_app_kubernetes_io_name","__meta_kubernetes_pod_label_app"]
          separator: ";"
          targetLabel: "app"
          action: replace
          regex: "(.+);.*|.*;(.+)"
          replacement: "${1}${2}"
        - sourceLabels: ["__meta_kubernetes_pod_label_app_kubernetes_io_version","__meta_kubernetes_pod_label_version"]
          separator: ";"
          targetLabel: "version"
          action: replace
          regex: "(.+);.*|.*;(.+)"
          replacement: "${1}${2}"
        - sourceLabels: ["__meta_kubernetes_namespace"]
          action: replace
          targetLabel: namespace
        - action: replace
          replacement: "mesh-default"
          targetLabel: mesh_id
```

```
oc apply -f podMonitor-istioSystem.yaml
```

2. For the bookinfo namespace:

```bash=
vi podMonitor-bookinfo.yaml

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: istio-proxies-monitor
  namespace: bookinfo
spec:
  selector:
    matchExpressions:
      - key: istio-prometheus-ignore
        operator: DoesNotExist
  podMetricsEndpoints:
    - path: /stats/prometheus
      interval: 30s
      relabelings:
        - action: keep
          sourceLabels: ["__meta_kubernetes_pod_container_name"]
          regex: "istio-proxy"
        - action: keep
          sourceLabels: ["__meta_kubernetes_pod_annotationpresent_prometheus_io_scrape"]
        - action: replace
          regex: (\d+);(([A-Fa-f0-9]{1,4}::?){1,7}[A-Fa-f0-9]{1,4})
          replacement: '[$2]:$1'
          sourceLabels: ["__meta_kubernetes_pod_annotation_prometheus_io_port","__meta_kubernetes_pod_ip"]
          targetLabel: "__address__"
        - action: replace
          regex: (\d+);((([0-9]+?)(\.|$)){4})
          replacement: '$2:$1'
          sourceLabels: ["__meta_kubernetes_pod_annotation_prometheus_io_port","__meta_kubernetes_pod_ip"]
          targetLabel: "__address__"
        - sourceLabels: ["__meta_kubernetes_pod_label_app_kubernetes_io_name","__meta_kubernetes_pod_label_app"]
          separator: ";"
          targetLabel: "app"
          action: replace
          regex: "(.+);.*|.*;(.+)"
          replacement: "${1}${2}"
        - sourceLabels: ["__meta_kubernetes_pod_label_app_kubernetes_io_version","__meta_kubernetes_pod_label_version"]
          separator: ";"
          targetLabel: "version"
          action: replace
          regex: "(.+);.*|.*;(.+)"
          replacement: "${1}${2}"
        - sourceLabels: ["__meta_kubernetes_namespace"]
          action: replace
          targetLabel: namespace
        - action: replace
          replacement: "mesh-default"
          targetLabel: mesh_id
```

```
oc apply -f podMonitor-bookinfo.yaml
```

### On the OpenShift Console, go to Observe → Metrics and run the query `istio_requests_total`

You should see metrics:

![image](https://hackmd.io/_uploads/Bk8Q8hVCJg.png)

### Create a ClusterRoleBinding for Kiali to access metrics

```bash=
cat << EOF | oc apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kiali-monitoring-rbac
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-monitoring-view
subjects:
  - kind: ServiceAccount
    name: kiali-service-account
    namespace: istio-system
EOF
```
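To verify this binding actually grants metric access, you can query Thanos Querier with the service account's token. Note that `kiali-service-account` is only created once the Kiali instance below exists, so run this check afterwards (a sketch; `thanos-querier` is the standard route in openshift-monitoring):

```bash=
TOKEN=$(oc create token kiali-service-account -n istio-system)
THANOS=$(oc get route thanos-querier -n openshift-monitoring -o jsonpath='{.spec.host}')

# A successful query returns {"status":"success",...} with istio_requests_total samples
curl -sk -H "Authorization: Bearer ${TOKEN}" \
  "https://${THANOS}/api/v1/query" --data-urlencode 'query=istio_requests_total' | head -c 300
```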
### Create a Kiali instance with the default settings in ns/istio-system

**!!!! Replace the tempo URLs with your own !!!!**

```bash=
cat << EOF | oc apply -f -
kind: Kiali
apiVersion: kiali.io/v1alpha1
metadata:
  name: kiali
  namespace: istio-system
spec:
  external_services:
    grafana:
      enabled: false
    prometheus:
      auth:
        type: bearer
        use_kiali_token: true
      thanos_proxy:
        enabled: true
      url: 'https://thanos-querier.openshift-monitoring.svc.cluster.local:9091'
    tracing:
      auth:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt
        insecure_skip_verify: true
        type: bearer
        use_kiali_token: true
      enabled: true
      in_cluster_url: 'http://tempo-sample-query-frontend.tempo.svc.cluster.local:3200'
      provider: tempo
      tempo_config:
        url_format: jaeger
      url: 'https://tempo-sample-query-frontend-tempo.apps.smtest.lab.upshift.rdu2.redhat.com'
      use_grpc: false
  version: default
  istio_namespace: istio-system
  deployment:
    logger:
      log_level: info
    view_only_mode: false
EOF
```

## [Optional] Enable the Kiali OCP console plugin

https://docs.redhat.com/en/documentation/red_hat_openshift_service_mesh/3.0/html-single/observability/index#ossm-install-console-plugin-ocp-web-console_ossm-console-plugin

```bash=
cat <<EOM | oc apply -f -
apiVersion: kiali.io/v1alpha1
kind: OSSMConsole
metadata:
  namespace: openshift-operators
  name: ossmconsole
spec:
  version: default
EOM
```

# References

- Install NFS Operator: https://developers.redhat.com/articles/2022/04/20/create-and-manage-local-persistent-volumes-codeready-containers
- Install Minio in OCP CRC: https://blog.min.io/develop-on-openshift-with-minio/