> The Helm charts mentioned in this doc are primarily from [Prometheus Community](https://github.com/prometheus-community/helm-charts/).
## Prometheus Stack
The [kube-prometheus stack](https://github.com/prometheus-operator/kube-prometheus) is a collection of Kubernetes manifests, [Grafana](http://grafana.com/) dashboards, and [Prometheus rules](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/) combined with documentation and scripts to provide easy-to-operate, end-to-end Kubernetes cluster monitoring with [Prometheus](https://prometheus.io/) using the [Prometheus Operator](https://github.com/prometheus-operator/prometheus-operator).
### Installation
```bash
helm pull kube-prometheus-stack --repo https://prometheus-community.github.io/helm-charts
helm install prometheus-stack kube-prometheus-stack-48.3.1.tgz --create-namespace --namespace monitoring
```
* **Note:**
The release name is `prometheus-stack`. It will be referenced multiple times in this doc.
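After installation, the release and the `Prometheus` CR it creates can be confirmed as follows:
```bash
# List the Helm releases in the monitoring namespace
helm -n monitoring list
# The chart also creates a Prometheus CR, referenced later in this doc
kubectl -n monitoring get prometheus
```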
## Redis Exporter
### Installation
```bash
helm pull prometheus-redis-exporter --repo https://prometheus-community.github.io/helm-charts
helm install redis-exporter prometheus-redis-exporter-5.5.0.tgz --namespace redis -f redis-exporter.yaml
```
### Values file `redis-exporter.yaml`
```yaml
env:
  - name: REDIS_EXPORTER_IS_CLUSTER
    value: "true"
redisAddress: ml-redis-leader:6379
serviceMonitor:
  enabled: true
  namespace: monitoring
  labels:
    release: prometheus-stack
  interval: 60s
  timeout: 30s
auth:
  enabled: true
  secret:
    name: ml-redis-auth
    key: redis-password
```
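A quick sanity check that the exporter is up (the Service name below follows the chart's `<release>-<chart>` naming convention, so adjust if yours differs):
```bash
# 9121 is the redis exporter's default port
kubectl -n redis port-forward svc/redis-exporter-prometheus-redis-exporter 9121:9121 &
# redis_up should be 1 if the exporter can reach Redis
curl -s http://localhost:9121/metrics | grep redis_up
```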
### Grafana Dashboards
#### [redis-dashboard-for-prometheus-redis-exporter-1](https://grafana.com/grafana/dashboards/11835-redis-dashboard-for-prometheus-redis-exporter-helm-stable-redis-ha/)
*Currently used*
#### [redis-dashboard-for-prometheus-redis-exporter-2](https://grafana.com/grafana/dashboards/11692-redis-dashboard-for-prometheus-redis-exporter-1-x/)
## RabbitMQ Exporter
### Installation
```bash
helm pull prometheus-rabbitmq-exporter --repo https://prometheus-community.github.io/helm-charts
helm install rabbitmq-exporter prometheus-rabbitmq-exporter-1.8.0.tgz --namespace rabbitmq -f rabbitmq-exporter.yaml
```
The helm chart has a bug: the `rabbitmq.password` set in the values file doesn't take effect, so a k8s Secret has to be used instead. See the [link](https://github.com/prometheus-community/helm-charts/pull/3649) for details.
I modified the helm chart to make `rabbitmq.password` work. Why not use the RabbitMQ Secret created by the RabbitMQ Operator? Because somehow the username/password stored in that Secret are the default ones, not the ones I set in the RabbitmqCluster, a k8s CR handled by the RabbitMQ Operator. The patch is as follows:
```diff
diff --git a/charts/prometheus-rabbitmq-exporter/templates/deployment.yaml b/charts/prometheus-rabbitmq-exporter/templates/deployment.yaml
index 7c4bfd0b..4b66890b 100644
--- a/charts/prometheus-rabbitmq-exporter/templates/deployment.yaml
+++ b/charts/prometheus-rabbitmq-exporter/templates/deployment.yaml
@@ -47,7 +47,7 @@ spec:
             {{- if .Values.rabbitmq.configMapOverrideReference }}
             - configMapRef:
                 name: {{ .Values.rabbitmq.configMapOverrideReference }}
-           {{- end }}
+            {{- end }}
           env:
           {{- if .Values.rabbitmq.existingPasswordSecret }}
           - name: RABBIT_PASSWORD
@@ -55,6 +55,9 @@ spec:
               secretKeyRef:
                 name: "{{ .Values.rabbitmq.existingPasswordSecret }}"
                 key: {{ .Values.rabbitmq.existingPasswordSecretKey }}
+          {{- else if .Values.rabbitmq.password }}
+          - name: RABBIT_PASSWORD
+            value: {{ .Values.rabbitmq.password }}
           {{- end }}
           ports:
             - containerPort: {{ .Values.service.internalPort }}
```
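For reference, the Secret-based workaround that avoids patching the chart would look roughly like this (the Secret name below is my own choice):
```bash
# Create a Secret holding the password
kubectl -n rabbitmq create secret generic rabbitmq-exporter-password \
  --from-literal=RABBIT_PASSWORD='s1t2c3b4@rabbitmq'
# Then, in the values file, replace rabbitmq.password with:
#   rabbitmq:
#     existingPasswordSecret: rabbitmq-exporter-password
#     existingPasswordSecretKey: RABBIT_PASSWORD
```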
### Values file `rabbitmq-exporter.yaml`
```yaml
rabbitmq:
  url: http://ml-rmq.rabbitmq:15672
  user: ml-rabbitmq
  password: s1t2c3b4@rabbitmq
  capabilities: bert,no_sort
  include_queues: ".*"
  include_vhost: ".*"
  skip_queues: "^$"
  skip_verify: "false"
  skip_vhost: "^$"
  exporters: "exchange,node,overview,queue"
  output_format: "TTY"
  timeout: 30
  max_queues: 0
  excludeMetrics: ""
prometheus:
  monitor:
    enabled: true
    interval: 60s
    namespace:
      - rabbitmq
    additionalLabels:
      release: prometheus-stack
```
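Another sanity check, analogous to the Redis one (the Service name again follows the `<release>-<chart>` convention, so it's an assumption):
```bash
# 9419 is the rabbitmq exporter's default port
kubectl -n rabbitmq port-forward svc/rabbitmq-exporter-prometheus-rabbitmq-exporter 9419:9419 &
# rabbitmq_up should be 1 if the exporter can reach the management API
curl -s http://localhost:9419/metrics | grep rabbitmq_up
```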
### Grafana Dashboards
#### [The Dashboard for prometheus rabbitmq exporter](https://grafana.com/grafana/dashboards/4279-rabbitmq-monitoring/)
*Currently used*
#### [The Official Dashboard](https://grafana.com/grafana/dashboards/10991-rabbitmq-overview/)
*Currently used*
The official dashboard doesn't consume data from the RabbitMQ exporter; it consumes data directly from the RabbitMQ server.
It requires the `rabbitmq_prometheus` plugin to be enabled, a built-in plugin since [RabbitMQ v3.8.0](https://github.com/rabbitmq/rabbitmq-server/releases/tag/v3.8.0). The plugin is enabled by default in most cases.
```bash
# Enable the plugin
# Run the command within a rabbitmq container
rabbitmq-plugins enable rabbitmq_prometheus
```
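To confirm the plugin is serving metrics (run from a pod inside the cluster; `ml-rmq.rabbitmq` is the server address from the values file above, and 15692 is the plugin's default port):
```bash
curl -s http://ml-rmq.rabbitmq:15692/metrics | head
```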
So Prometheus has to be configured to scrape data from the RabbitMQ server. Thus, a `ServiceMonitor` is needed (if the Prometheus Operator is being used, which is my case). The `ServiceMonitor` can be installed through the following commands. See the [official link](https://www.rabbitmq.com/kubernetes/operator/operator-monitoring.html) for details.
```bash
curl -O https://raw.githubusercontent.com/rabbitmq/cluster-operator/main/observability/prometheus/monitors/rabbitmq-servicemonitor.yml
# Edit the manifest file, see the diff below
kubectl -n monitoring apply -f rabbitmq-servicemonitor.yml
```
```diff
--- rabbitmq-servicemonitor.yml.orig 2023-08-08 15:58:22.505454451 -0700
+++ rabbitmq-servicemonitor.yml 2023-08-08 16:02:14.368766617 -0700
@@ -3,7 +3,8 @@
 kind: ServiceMonitor
 metadata:
   name: rabbitmq
-  # If labels are defined in spec.serviceMonitorSelector.matchLabels of your deployed Prometheus object, make sure to include them here.
+  labels:
+    release: prometheus-stack
 spec:
   endpoints:
   - port: prometheus
@@ -45,4 +46,5 @@
     matchLabels:
       app.kubernetes.io/component: rabbitmq
   namespaceSelector:
-    any: true
+    matchNames:
+    - rabbitmq
```
* **Note:**
The label `release: prometheus-stack` is very important; without it, the `ServiceMonitor` won't work.
Why?
Run the following command to check the spec of the `Prometheus` CR:
```bash
kubectl -n monitoring get prometheus prometheus-stack-kube-prom-prometheus -o yaml
```
And note the following part:
```yaml
spec:
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector:
    matchLabels:
      release: prometheus-stack
```
It means the Prometheus Operator watches `ServiceMonitor`s in all namespaces but only pays attention to those carrying the label `release: prometheus-stack`.
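So, to see which `ServiceMonitor`s are actually selected, list them by that label:
```bash
# Every ServiceMonitor carrying the label, across all namespaces
kubectl get servicemonitors -A -l release=prometheus-stack
```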
## ElasticSearch Exporter
### Installation
```bash
helm pull prometheus-elasticsearch-exporter --repo https://prometheus-community.github.io/helm-charts
helm install elasticsearch-exporter prometheus-elasticsearch-exporter-5.2.0.tgz --namespace opensearch -f elasticsearch-exporter.yaml
```
### Values file `elasticsearch-exporter.yaml`
```yaml
env:
  ES_USERNAME: ml-elastic
extraEnvSecrets:
  ES_PASSWORD:
    secret: opensearch-extra-admin-password
    key: password
es:
  uri: https://ml-os:9200
  sslSkipVerify: true
  all: true
  indices: true
  indices_settings: true
  indices_mappings: true
  aliases: false
  shards: true
  snapshots: true
  cluster_settings: false
  slm: false
  data_stream: false
  timeout: 30s
serviceMonitor:
  enabled: true
  namespace: monitoring
  labels:
    release: prometheus-stack
  interval: 30s
  scrapeTimeout: 10s
```
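The same kind of sanity check applies here (Service name per the `<release>-<chart>` convention; 9108 is the chart's default port):
```bash
kubectl -n opensearch port-forward svc/elasticsearch-exporter-prometheus-elasticsearch-exporter 9108:9108 &
curl -s http://localhost:9108/metrics | grep '^elasticsearch_' | head
```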
### Grafana Dashboards
#### [ElasticSearch](https://grafana.com/grafana/dashboards/6483-elasticsearch/)
*Currently used*
Bugs:
- The "cluster health" panel displays "N/A"
#### [Elasticsearch Exporter Quickstart and Dashboard](https://grafana.com/grafana/dashboards/14191-elasticsearch-overview/)
*Currently used*
Similar to the one above, but more recently updated.
#### [Elasticsearch - Index Stats](https://grafana.com/grafana/dashboards/13072-elasticsearch-index-stats/)
*Currently used*
#### [Elasticsearch Cluster - Indices](https://grafana.com/grafana/dashboards/3598-elasticsearch-cluster-indices/)
> This dashboard monitors a cluster using the data collected through the x-pack monitoring collector.
I haven't figured out how to add an x-pack data source for this dashboard yet.
## OpenCTI Server
Two kinds of metrics:
- General NodeJS metrics
  There are plenty of third-party Grafana dashboards (see the [Grafana Dashboards](#Grafana-Dashboards3) below).
- OpenCTI-specific metrics
  There are no third-party Grafana dashboards; I have to create my own.
  Access `http://<opencti-server-exporter>:14269/metrics` to find out what OpenCTI-specific metrics there are.
### How a NodeJS Application exposes its metrics
The NodeJS metrics pipeline works this way:
```mermaid
graph LR;
a("NodeJS Application<br/>(Also a Prometheus Exporter)") --> |scraped by| b(Prometheus) --> |fetched by| c(Grafana) --> |displayed in| d(A Dashboard)
```
I.e., it works as long as the NodeJS application makes itself a Prometheus exporter. AFAIK, there are two ways for a NodeJS application to become a Prometheus exporter: one is to use [Prom-Client](https://www.npmjs.com/package/prom-client), the other is to use [Express Prometheus Middleware](https://www.npmjs.com/package/express-prometheus-middleware).
### Enable Prometheus Exporter of OpenCTI Server
Via configuration file
```json
"app": {
"telemetry": {
"metrics": {
"enabled": true,
"exporter_prometheus": 14269
}
}
}
```
Via environment variables
```bash
APP_TELEMETRY__METRICS__ENABLED="true"
APP_TELEMETRY__METRICS__EXPORTER_PROMETHEUS="14269"
```
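Once enabled, the endpoint can be checked before any `Service` exists by port-forwarding the workload directly (the Deployment name `opencti-server` is an assumption about your deployment):
```bash
kubectl -n opencti port-forward deploy/opencti-server 14269:14269 &
curl -s http://localhost:14269/metrics | head
```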
### Kubernetes Resources
#### The changes of `Service`
*Used to open the exporter port for Prometheus*
Example
```yaml
apiVersion: v1
kind: Service
metadata:
  ...
spec:
  ports:
  - name: prometheus
    port: 14269
    targetPort: 14269
```
The `prometheus` port can be added to the existing OpenCTI `Service`.
Or create a new `Service` for the exporter only, if the existing OpenCTI `Service` is a public-facing service (e.g., of type `NodePort` or `LoadBalancer`) and you don't want to expose the `prometheus` port to the public.
#### A new `ServiceMonitor`
*Used to configure Prometheus to scrape data from OpenCTI*
Example
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: opencti-server
    app.kubernetes.io/component: opencti-server
    prometheus-exporter: opencti-server
    release: prometheus-stack
  name: opencti-prometheus-exporter-opencti-server
  namespace: monitoring
spec:
  endpoints:
  - honorLabels: true
    port: prometheus
  jobLabel: opencti
  namespaceSelector:
    matchNames:
    - opencti
  selector:
    matchLabels:
      app: opencti-server
      app.kubernetes.io/component: opencti-server
      prometheus-exporter: opencti-server
```
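To verify that Prometheus picked up the new target, query its targets API (the Prometheus Service name below matches the CR name shown earlier, which is what the chart creates for the release `prometheus-stack`):
```bash
kubectl -n monitoring port-forward svc/prometheus-stack-kube-prom-prometheus 9090:9090 &
curl -s 'http://localhost:9090/api/v1/targets?state=active' | grep -i opencti
```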
### Grafana Dashboards
#### [Node.js and Express Metrics](https://grafana.com/grafana/dashboards/14565-node-js-dashboard/)
*Currently Used*
Works with NodeJS applications that use [Express Prometheus Middleware](https://www.npmjs.com/package/express-prometheus-middleware) to turn themselves into Prometheus exporters.
Some visual panels don't work (they have no data to display), because the current OpenCTI no longer uses [Express Prometheus Middleware](https://www.npmjs.com/package/express-prometheus-middleware). (It used to; here is the [evidence](https://github.com/OpenCTI-Platform/opencti/pull/1598/files), take a look at the diff of the file `package.json`.)
#### [NodeJS Application Dashboard](https://grafana.com/grafana/dashboards/11159-nodejs-application-dashboard/)
*Currently Used*
*The most downloaded NodeJS dashboard*
Works with NodeJS applications that use [Prom-Client](https://www.npmjs.com/package/prom-client) to turn themselves into Prometheus exporters.
## OpenCTI Worker
### Enable Prometheus Exporter of OpenCTI Worker
Via environment variables
```bash
WORKER_TELEMETRY_ENABLED="TRUE"
WORKER_PROMETHEUS_TELEMETRY_PORT="14270"
```
### Kubernetes Resources
#### A new `Service`
```yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app: opencti-worker
    app.kubernetes.io/component: opencti-worker
    prometheus-exporter: opencti-worker
  name: opencti-prometheus-exporter-opencti-worker
  namespace: opencti
spec:
  ports:
  - name: prometheus
    port: 14270
    protocol: TCP
    targetPort: 14270
  selector:
    app: opencti-worker
    app.kubernetes.io/component: opencti-worker
    app.kubernetes.io/instance: opencti
    app.kubernetes.io/name: opencti
```
#### A new `ServiceMonitor`
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: opencti-worker
    app.kubernetes.io/component: opencti-worker
    prometheus-exporter: opencti-worker
    release: prometheus-stack
  name: opencti-prometheus-exporter-opencti-worker
  namespace: monitoring
spec:
  endpoints:
  - honorLabels: true
    port: prometheus
  jobLabel: opencti
  namespaceSelector:
    matchNames:
    - opencti
  selector:
    matchLabels:
      app: opencti-worker
      app.kubernetes.io/component: opencti-worker
      prometheus-exporter: opencti-worker
```
### Grafana Dashboards
No predefined (third-party or official) Grafana dashboards are available; I have to create my own.
Access `http://<opencti-worker-exporter-service>:14270/metrics` to find out what metrics are exposed.
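For example, through the `Service` created above:
```bash
kubectl -n opencti port-forward svc/opencti-prometheus-exporter-opencti-worker 14270:14270 &
curl -s http://localhost:14270/metrics | head
```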