# OpenShift cluster monitoring operator (I'm not telling you)
slide: https://hackmd.io/p/OonUQ9QKQ7-7JPBd1N9tOA?both
---
We have a collaborative session.
Please prepare a laptop or smartphone to join!
---
## Who am I?
- Jason Li
- SRE/Backend developer
- :heart: Kubernetes, Go, Rust
- :cat: lover
- Constantly going from getting started to giving up
---
## Agenda
- Background
- Related Work
- Method
- Conclusion
---
## Background
The monitoring stack spans many components: Prometheus Operator, Prometheus, Prometheus Adapter, kube-state-metrics, etc.
Managing such diverse components requires a single, centralized configuration file.
---
## Related Work
- UI
- Prometheus
- Metrics
- Thanos
---
### UI
- Grafana
---
### Prometheus
- Prometheus Operator
- Prometheus-k8s
- :-1: Prometheus-user-workload
- Alertmanager
---
#### Prometheus Operator
- Provides Kubernetes-native deployment and management of Prometheus and related monitoring components.
- Automates the configuration of a Prometheus-based monitoring stack for Kubernetes clusters:
  - Prometheus
  - Alertmanager
  - Related components
---
#### Prometheus Operator (cont’d)
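A minimal sketch of what Kubernetes native management looks like: Prometheus itself is declared as a custom resource and the operator reconciles the matching workload. The `Prometheus` kind and the `monitoring.coreos.com/v1` API group come from the Prometheus Operator CRDs; the resource name, namespace, service account and label below are hypothetical.

```yaml
# Sketch only: name, namespace, service account and label are assumptions.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: example                    # hypothetical
  namespace: monitoring            # hypothetical
spec:
  replicas: 2                      # the operator runs two Prometheus replicas
  serviceAccountName: prometheus   # assumed to exist already
  serviceMonitorSelector:          # which ServiceMonitor objects to pick up
    matchLabels:
      team: frontend               # hypothetical label
```

In an OpenShift cluster the monitoring operator creates resources of this kind for you, so this is for orientation rather than something to apply by hand.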

---
### Metrics
- node-exporter
- kube-state-metrics
- openshift-state-metrics
- :-1: prometheus-adapter
- :-1: Telemeter Client
- :-1: configuration sharing
---
#### node-exporter
- Prometheus exporter for hardware and OS metrics exposed by *NIX kernels.
- Its scrape output includes a wide variety of system metrics, all prefixed with `node_`.
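
For orientation, a hedged sketch of a plain Prometheus scrape job for node-exporter. It assumes the exporter listens on its default port 9100 on `localhost`; the job name is arbitrary, and an operator-managed cluster would normally generate the equivalent configuration for you.

```yaml
# Sketch only: target host and job name are assumptions.
scrape_configs:
  - job_name: node                   # arbitrary job label
    static_configs:
      - targets: ['localhost:9100']  # node-exporter default port
```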
---
```text
# HELP node_network_transmit_queue_length transmit_queue_length value of /sys/class/net/<iface>.
# TYPE node_network_transmit_queue_length gauge
node_network_transmit_queue_length{device="br0"} 1000
node_network_transmit_queue_length{device="eth0"} 1000
node_network_transmit_queue_length{device="lo"} 1000
node_network_transmit_queue_length{device="ovs-system"} 1000
node_network_transmit_queue_length{device="tun0"} 1000
node_network_transmit_queue_length{device="veth24377b8e"} 0
node_network_transmit_queue_length{device="veth58bd788d"} 0
...
```
---
#### kube-state-metrics
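- Focuses on the health of the objects inside Kubernetes, such as deployments, nodes and pods, rather than on the health of the individual Kubernetes components.
- Exposes raw data unmodified from the Kubernetes API.
- Designed to be consumed either by Prometheus itself or by any scraper compatible with a Prometheus client endpoint.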
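---
#### kube-state-metrics (cont’d)
Because the metrics are raw object state, the alerting logic lives in Prometheus rules. A hedged sketch, assuming a `PrometheusRule` resource from the Prometheus Operator and the `kube_deployment_status_replicas_unavailable` metric exposed by kube-state-metrics; the rule name, namespace, threshold and duration are hypothetical.

```yaml
# Sketch only: names, namespace, threshold and duration are assumptions.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-deployment-rules   # hypothetical
  namespace: monitoring            # hypothetical
spec:
  groups:
    - name: deployment.rules
      rules:
        - alert: DeploymentReplicasUnavailable
          # Fires when a deployment reports unavailable replicas for 10 minutes.
          expr: kube_deployment_status_replicas_unavailable > 0
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "Deployment {{ $labels.deployment }} has unavailable replicas"
```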
---
#### openshift-state-metrics
- Expands upon kube-state-metrics by adding metrics for OpenShift-specific resources.
- Exposes these metrics at the cluster level.
---
#### openshift-state-metrics (cont’d)
- BuildConfig Metrics
- Build Metrics
- DeploymentConfig Metrics
- ClusterResourceQuota Metrics
- Route Metrics
- Group Metrics
ref: https://github.com/openshift/openshift-state-metrics
---
### Thanos
- Thanos
- Thanos Querier
- Thanos Ruler
---
### Thanos (cont’d)
- Global view
- High availability (HA)
- Unlimited retention
ref: https://kkc.github.io/2019/08/22/coscup-ha-prometheus-solution-thanos/
---
### Thanos (cont’d)

ref: https://banzaicloud.com/blog/multi-cluster-monitoring/
---
## Method
|Component|Key|
|--- |--- |
|Prometheus Operator|prometheusOperator|
|Prometheus|prometheusK8s|
|Alertmanager|alertmanagerMain|
|kube-state-metrics|kubeStateMetrics|
|openshift-state-metrics|openshiftStateMetrics|
|Grafana|grafana|
|Telemeter Client|telemeterClient|
|Prometheus Adapter|k8sPrometheusAdapter|
|Thanos Querier|thanosQuerier|
---
## Method (cont’d)
- Only Prometheus and Alertmanager have extensive configuration options.
- Other components usually expose only the `nodeSelector` field.
---
## Method (cont’d)
Move components to specific nodes
```yaml
data:
  config.yaml: |
    prometheusOperator:
      nodeSelector:
        foo: bar
    prometheusK8s:
      nodeSelector:
        foo: bar
```
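---
## Method (cont’d)
The `data` fragment belongs to a ConfigMap. A sketch of the complete manifest, assuming the OpenShift 4 convention of a ConfigMap named `cluster-monitoring-config` in the `openshift-monitoring` namespace; `foo: bar` is a placeholder node label, and the component keys come from the table of keys.

```yaml
# Sketch only: ConfigMap name and namespace follow the assumed OpenShift 4 convention.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusOperator:
      nodeSelector:
        foo: bar        # placeholder node label
    alertmanagerMain:
      nodeSelector:
        foo: bar        # placeholder node label
```

Saved to a file, this could be applied with `oc apply -f <file>`.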
---
Configure a persistent volume claim
```yaml
data:
  config.yaml: |
    prometheusK8s:
      volumeClaimTemplate:
        metadata:
          name: localpvc
        spec:
          storageClassName: local-storage
          resources:
            requests:
              storage: 40Gi
```
---
### Custom Alertmanager configuration
- At this stage, the cluster monitoring configuration file does not expose Alertmanager settings.
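
Alertmanager is therefore configured through its own file rather than through `config.yaml`. A minimal sketch in upstream Alertmanager syntax; the receiver name and webhook URL are hypothetical, and where the file is stored (in OpenShift 4, reportedly the `alertmanager-main` secret in `openshift-monitoring`) is an assumption that may differ by version.

```yaml
# Sketch only: receiver name, timings and URL are assumptions.
global:
  resolve_timeout: 5m
route:
  receiver: default            # every alert goes to the single receiver below
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 12h
receivers:
  - name: default
    webhook_configs:
      - url: http://example.invalid/alert   # hypothetical endpoint
```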
---
## Conclusion
:100: :muscle: :tada:
### Wrap up
- A self-updating monitoring stack based on Prometheus and its wider ecosystem
- Provides monitoring of cluster components
- Expect to manage every component through the configuration file :tada:
---
## Thank you! :sheep: