Prometheus@EPFL

<style>
.reveal { /*background-color: #eaeaea;*/ background-image: url('https://epfl-si.github.io/elements/svg/epfl-logo.svg'); background-repeat: no-repeat; background-position: 5px 5px; }
.reveal { color: #1c1c1c; }
.reveal h1, .reveal h2, .reveal h3, .reveal h4, .reveal h5, .reveal h6 { color: #eee; text-shadow: 2px 2px #ff0000; }
.reveal a { color: #f009; }
.reveal a:hover { color: #f00; }
.reveal code { padding-top: 0.2em; padding-bottom: 0.2em; margin: 0; font-size: 85%; background-color: rgba(255, 255, 255, 0.46); border-radius: 3px; }
[data-contrast="on"] > div { background-color: #ffffff50; }
[data-contrast="on+"] > div { background-color: #ffffff99; }
</style>

## Prometheus

Presentation for ITOP-SDDC

Nicolas Borboën <<nicolas.borboen@epfl.ch>>

----

## Introduction

----

## Why are we here

Note:
- Better understand Prometheus
- Pool our efforts
- Share knowledge
- Standardize tooling

----

## Link to the presentation

https://hackmd.io/@ponsfrilus/prometheus

All photos come from https://unsplash.com (royalty-free).

Note:
The theme of the background images is black sand (https://en.wikipedia.org/wiki/Black_sand)

---

![](https://i.imgur.com/0vw8VpF.png)

----

## History

* Prometheus was developed at SoundCloud starting in 2012
* https://sre.google/sre-book/practical-alerting/#the-rise-of-borgmon

----

> Prometheus is a **metrics collection** and **alerting tool** developed and released to open source by SoundCloud. Prometheus is similar in design to **Google's Borgmon monitoring system**. Properly tuned and deployed, a Prometheus cluster **can collect millions of metrics every second**.
>
> Source: https://www.redhat.com/sysadmin/introduction-prometheus-metrics-and-performance-monitoring

----

## Time series

> Prometheus fundamentally **stores all data as time series**: streams of timestamped values belonging to the same metric and the same set of labeled dimensions.
>
> Source: https://prometheus.io/docs/concepts/data_model/

----

## Metrics

> Every time series is uniquely identified by its **metric name** and optional key-value pairs called labels.
>
> Source: https://prometheus.io/docs/concepts/data_model/

----

## Metric types

* Counter
* Gauge
* Histogram
* Summary

Source: https://prometheus.io/docs/concepts/metric_types/

----

## Counter

> A counter is a cumulative metric that represents a single monotonically increasing counter whose value can only **increase or be reset to zero** on restart.

----

## Gauge

> A gauge is a metric that represents a **single numerical value** that can arbitrarily go up and down.

----

## Histogram and summary

> A histogram **samples** observations and counts them in configurable **buckets**. It also provides a **sum of all observed values**.
>
> Similar to a histogram, a summary samples observations.

----

## Labels

> Labels enable Prometheus's **dimensional data model**: any given combination of labels for the same metric name identifies a particular dimensional instantiation of that metric.
>
> See also the [best practices for naming metrics and labels](https://prometheus.io/docs/practices/naming/).

----

## Examples

Given a metric name and a set of labels, time series are frequently identified using this notation:

`<metric name>{<label name>=<label value>, ...}`

For example, a time series with the metric name `api_http_requests_total` and the labels `method="POST"` and `handler="/messages"` could be written like this:

`api_http_requests_total{method="POST", handler="/messages"}`

----

## PromQL

> The **query language** allows filtering and aggregation based on these dimensions.
>
> Changing any label value, including adding or removing a label, will create a new time series.

----

## Federation

> Federation allows a Prometheus server to **scrape** selected time series from **another** Prometheus server.
Source: https://prometheus.io/docs/prometheus/latest/federation/

---

## Ecosystem

----

## Prometheus

----

## Exporters

* node-exporter
* blackbox-exporter
* ...
* https://prometheus.io/docs/instrumenting/exporters/

----

## Alertmanager

* https://prometheus.io/docs/alerting/latest/alertmanager/
* https://github.com/prometheus/alertmanager

----

## Pushgateway

The Pushgateway is an intermediary service which allows you to push metrics from jobs which cannot be scraped.

https://prometheus.io/docs/practices/pushing/#when-to-use-the-pushgateway

----

## Grafana

![](https://i.imgur.com/azj9mxH.png)

https://grafana.com/

----

## Thanos

![](https://i.imgur.com/JlrJjKl.png)

https://thanos.io/

---

## Deployment

* Prometheus Operator
* kube-prometheus
* Helm chart → details

----

### Prometheus Operator

https://github.com/prometheus-operator/prometheus-operator

The Prometheus Operator uses Kubernetes custom resources to simplify the deployment and configuration of Prometheus, Alertmanager, and related monitoring components.

----

### kube-prometheus

https://github.com/prometheus-operator/kube-prometheus

kube-prometheus provides example configurations for a complete cluster monitoring stack based on Prometheus and the Prometheus Operator.

----

### Helm chart

https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack

The prometheus-community/kube-prometheus-stack Helm chart provides a similar feature set to kube-prometheus. This chart is maintained by the Prometheus community.

----

## Argo CD

![](https://i.imgur.com/89rQUUg.png)

https://argoproj.github.io/

---

## Storage

* Thanos, S3, WAL 2 GB, PVC 4 GB (4h)

---

## Upgrades

[stability promise](https://prometheus.io/docs/prometheus/latest/migration/)

----

## Promtool

* https://prometheus.io/docs/prometheus/latest/configuration/unit_testing_rules/

---

## WordPress

![](https://i.imgur.com/bCNcwr6.png)

----

### Containers in the OpenShift Prometheus and Pushgateway pods

![](https://i.imgur.com/3WfG82O.png)

----

### Details of the NOC containers

![](https://i.imgur.com/Eve9I05.png)

---

## Demo(s)

----

## Monitoringrafana

A small project that deploys Prometheus, Grafana, and a node-exporter with Docker. Made for hacking and learning. Perfect on a laptop.

https://github.com/epfl-dojo/monitoringrafana

----

## Demo as a service

---

## This is the end

Any more questions? Anything to discuss?

Nicolas Borboën <<nicolas.borboen@epfl.ch>>