Prometheus Overview

B06902031, Computer Science and Information Engineering, 4th year, 何承勳

Outline

Introduction
Architecture
Components
- Prometheus Server
- Exporters
- Push Gateway
- Alertmanager
- Consoles and Dashboards
- Service Discovery
Pros and Cons

Introduction

Prometheus is an open-source systems monitoring and alerting toolkit. It is written in Go and licensed under the Apache 2.0 License.
Prometheus collects metrics from configured targets at given intervals and stores the resulting time series data in its database.
Prometheus runs rules over the collected data to aggregate it or to generate alerts.
Several dashboards are available for administrators to visualize the collected data.

Features

Prometheus stores all data as time series, each identified by a metric name and optional key-value labels.
Prometheus provides a query language called PromQL that allows the user to query and aggregate time series data.
Prometheus collects data over HTTP using a pull model. Alternatively, pushing is supported through the Pushgateway.
Prometheus can trigger alerts when a configured condition is observed to be true.

Prometheus Server

The Prometheus Server retrieves data from monitored targets, stores the time series data in its database, and provides an interface for users to query the database.
Broadly, it consists of three components:
- Time Series Database (TSDB)
- HTTP Server
- Prometheus Query Language (PromQL)

TSDB

The Prometheus server includes a time series database (TSDB), a database optimized for handling time series data.
Prometheus stores all data as time series. Every time series is uniquely identified by its metric name and optional key-value pairs called labels.

Metrics

For example, a time series with the metric name prometheus_http_requests_total (the cumulative number of HTTP requests handled by the Prometheus server) and the labels method="POST" and handler="/messages" (which narrow it to POST requests to the /messages handler) is written as follows:

prometheus_http_requests_total{method="POST", handler="/messages"}
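To make the identity rule concrete, here is a minimal Python sketch of how a metric name plus a label set can act as the key of a series. The helper names `series_selector` and `series_key` are made up for illustration and are not part of Prometheus; label-value escaping is ignored.

```python
def series_selector(name: str, labels: dict[str, str]) -> str:
    """Render a metric name and its labels in the name{key="value"} notation."""
    pairs = ", ".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
    return f"{name}{{{pairs}}}" if pairs else name


def series_key(name: str, labels: dict[str, str]) -> tuple:
    """Hashable identity of a series: the metric name plus the full label set."""
    return (name, tuple(sorted(labels.items())))


post = series_key("prometheus_http_requests_total", {"method": "POST", "handler": "/messages"})
get = series_key("prometheus_http_requests_total", {"method": "GET", "handler": "/messages"})

print(series_selector("prometheus_http_requests_total", {"method": "POST", "handler": "/messages"}))
print(post != get)  # same metric name, different labels: two distinct series
```

Changing any single label value yields a different key, which is why each label combination is stored as its own time series.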

Metrics

Metrics can be categorized into four types:
1. Counter: A cumulative metric whose value only increases (or resets to zero on restart), such as the number of HTTP GET requests served.
2. Gauge: A metric that represents a single current value that can go up or down, such as memory usage.
3. Histogram: Samples observations (such as request durations) and counts them in configurable buckets, together with their sum and count.
4. Summary: Similar to a Histogram, but it reports configurable quantiles of the sampled observations over a sliding time window.
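The difference between the types is easiest to see in code. The following is a simplified, standard-library-only sketch of the first three types; it is not the implementation used by the Prometheus client libraries, and the class and attribute names are chosen for illustration. Summary is omitted because quantile estimation over a sliding window needs considerably more machinery.

```python
import bisect


class Counter:
    """Cumulative value that only goes up (a restart would reset it to zero)."""

    def __init__(self) -> None:
        self.value = 0.0

    def inc(self, amount: float = 1.0) -> None:
        if amount < 0:
            raise ValueError("a counter can only increase")
        self.value += amount


class Gauge:
    """Current value that may go up or down, such as memory in use."""

    def __init__(self) -> None:
        self.value = 0.0

    def set(self, value: float) -> None:
        self.value = value


class Histogram:
    """Counts observations per upper bound, plus a running sum and count."""

    def __init__(self, upper_bounds: list[float]) -> None:
        self.upper_bounds = sorted(upper_bounds)
        self.bucket_counts = [0] * (len(self.upper_bounds) + 1)  # last slot is +Inf
        self.sum = 0.0
        self.count = 0

    def observe(self, value: float) -> None:
        # bisect_left finds the first bucket whose upper bound is >= value.
        self.bucket_counts[bisect.bisect_left(self.upper_bounds, value)] += 1
        self.sum += value
        self.count += 1

    def cumulative(self) -> list[int]:
        """Bucket counts as exposed: each bucket includes all smaller ones."""
        total, out = 0, []
        for n in self.bucket_counts:
            total += n
            out.append(total)
        return out


requests = Counter()
requests.inc()
requests.inc()

latency = Histogram([0.1, 0.5, 1.0])
for seconds in (0.05, 0.3, 0.5, 2.0):
    latency.observe(seconds)

print(requests.value, latency.cumulative())
```

The histogram keeps only bucket counts, not the raw samples, which is what makes it cheap to store and aggregate.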

PromQL

PromQL (Prometheus Query Language) is the query language provided by Prometheus. It allows the user to select, examine, and aggregate time series data.
Official Documentation of PromQL

HTTP Server

The Prometheus server provides an HTTP API for users to query the database.
The current stable HTTP API is reachable under /api/v1 on a Prometheus server.
The API response format is in JSON.

HTTP Server

Example: We can use curl to query the Prometheus server:

curl 'http://localhost:9090/api/v1/query?query=up'

The Prometheus server will then return the result in JSON format:

{
   "status" : "success",
   "data" : {
      "resultType" : "vector",
      "result" : [
         {
            "metric" : {
               "__name__" : "up",
               "job" : "prometheus",
               "instance" : "localhost:9090"
            },
            "value": [ 1435781451.781, "1" ]
         }
      ]
   }
}
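The same instant query can be issued from a program. Below is a minimal Python sketch using only the standard library; the function names are illustrative, and `http://localhost:9090` is assumed to be a reachable Prometheus server as in the curl example. The parsing step is kept separate from the network call so it can be exercised on a saved response.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL; urlencode escapes the PromQL expression."""
    return f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": promql})


def parse_instant_vector(body: str) -> list[tuple[dict, float, float]]:
    """Turn a 'vector' response into (labels, unix_timestamp, value) tuples."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']}")
    # Sample values arrive as strings, so convert them explicitly.
    return [(s["metric"], float(s["value"][0]), float(s["value"][1])) for s in data["result"]]


def query(base_url: str, promql: str) -> list[tuple[dict, float, float]]:
    """Send the query to a running server (needs network access)."""
    with urllib.request.urlopen(build_query_url(base_url, promql), timeout=5) as response:
        return parse_instant_vector(response.read().decode("utf-8"))


print(build_query_url("http://localhost:9090", "up"))
```

Calling `query("http://localhost:9090", "up")` against a live server would return one tuple per scraped target, with a value of 1.0 for targets that are up.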

Exporters

Exporters expose the metrics of third-party services to the Prometheus Server. They are installed on the monitored device.
Each exporter exposes an HTTP endpoint from which the Prometheus server retrieves metrics. Prometheus mainly uses an HTTP pull model: it collects metrics by periodically scraping the HTTP endpoints of the monitored targets.
Exporters are written using the Prometheus Client Libraries, which are available for many different languages. A client library provides an API that returns the current metric values when Prometheus scrapes the target's HTTP endpoint.

Exporter

Node Exporter is one of the most common official exporters. It exposes hardware and OS metrics of *NIX kernels, for example CPU usage, memory statistics, disk I/O statistics, and network statistics. (Node Exporter Github Page)
MySQL Server Exporter is another common official exporter. It allows us to monitor and measure database performance, examine resource utilization, and so on. (MySQL Exporter Github Page)
If no existing exporter meets our needs, we can write our own exporter using the Prometheus Client Libraries.
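To show what a scrape endpoint boils down to, here is a rough Python sketch of a tiny exporter built on the standard library's `http.server` rather than on a Prometheus client library (which would normally handle the formatting). The metric names `demo_scrapes_total` and `demo_uptime_seconds` and the port 8000 are assumptions made for this example.

```python
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

START_TIME = time.time()
scrapes_total = 0  # counter: only ever incremented


def render_metrics(scrapes: int, uptime_seconds: float) -> str:
    """Render the values in the Prometheus text exposition format."""
    lines = [
        "# HELP demo_scrapes_total Number of times /metrics was served.",
        "# TYPE demo_scrapes_total counter",
        f"demo_scrapes_total {scrapes}",
        "# HELP demo_uptime_seconds Seconds since this exporter started.",
        "# TYPE demo_uptime_seconds gauge",
        f"demo_uptime_seconds {uptime_seconds:.3f}",
    ]
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        global scrapes_total
        if self.path != "/metrics":
            self.send_error(404)
            return
        scrapes_total += 1
        body = render_metrics(scrapes_total, time.time() - START_TIME).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Block forever, serving /metrics on the given (hypothetical) port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()


print(render_metrics(1, 0.0), end="")
```

Running `serve()` and adding `localhost:8000` as a scrape target would let Prometheus pull these two metrics. The exporter never contacts the server itself; it only answers scrapes.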

Pushgateway

Occasionally, we might need to monitor components that cannot be scraped. The Pushgateway addresses this case: such components push their metrics to the Pushgateway first, and Prometheus then periodically pulls the metrics from the Pushgateway.
The official documentation states that "Usually, the only valid use case for the Pushgateway is for capturing the outcome of a service-level batch job". An example of a "service-level" batch job is one that deletes a number of users for an entire service: it is a discrete job that is not tied to a specific machine.
In conclusion, the Pushgateway is seldom used.
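As an illustration of the push path, the sketch below builds the HTTP request a batch job could send to a Pushgateway when it finishes. It assumes a Pushgateway at `http://localhost:9091`; the job name `user_cleanup` and the metric `user_cleanup_last_success_timestamp_seconds` are invented for the example. The request is only constructed here, not sent.

```python
import urllib.parse
import urllib.request


def build_push_request(gateway_url: str, job: str, metrics_text: str) -> urllib.request.Request:
    """Build a PUT request that replaces all metrics in the group for `job`."""
    return urllib.request.Request(
        url=f"{gateway_url}/metrics/job/{urllib.parse.quote(job, safe='')}",
        data=metrics_text.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )


def push(gateway_url: str, job: str, metrics_text: str) -> int:
    """Send the metrics to a running Pushgateway and return the HTTP status."""
    with urllib.request.urlopen(build_push_request(gateway_url, job, metrics_text), timeout=5) as response:
        return response.status


# The body uses the same text format that exporters serve; it must end with a newline.
body = (
    "# TYPE user_cleanup_last_success_timestamp_seconds gauge\n"
    "user_cleanup_last_success_timestamp_seconds 1700000000\n"
)
request = build_push_request("http://localhost:9091", "user_cleanup", body)
print(request.get_method(), request.full_url)
```

Prometheus then scrapes the Pushgateway like any other target, so the pushed value remains visible after the batch job has exited.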

Alertmanager

Alerting rules are defined in Prometheus' configuration, and Prometheus evaluates them periodically. When a rule's trigger condition is met, Prometheus pushes an alert to the Alertmanager.
The Alertmanager then notifies the administrator of the abnormal event via email, PagerDuty, etc.
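The periodic evaluation can be pictured as a small state machine. The sketch below is a conceptual model only, not Prometheus' actual rule engine: the class `AlertRule`, the threshold comparison, and the hold period (loosely modeled on the `for` clause of an alerting rule) are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlertRule:
    name: str
    threshold: float      # condition: sample value > threshold
    hold_seconds: float   # how long the condition must hold before firing
    active_since: Optional[float] = None

    def evaluate(self, value: float, now: float) -> str:
        """Return 'inactive', 'pending', or 'firing' for one evaluation cycle."""
        if value <= self.threshold:
            self.active_since = None  # condition cleared: reset the timer
            return "inactive"
        if self.active_since is None:
            self.active_since = now   # condition just became true
        if now - self.active_since >= self.hold_seconds:
            return "firing"           # this is when an alert would be sent
        return "pending"


rule = AlertRule(name="HighErrorRate", threshold=0.05, hold_seconds=60)
samples = [(0, 0.10), (30, 0.12), (60, 0.11), (90, 0.01)]  # (time in s, error ratio)
states = [rule.evaluate(value, now) for now, value in samples]
print(states)
```

Requiring the condition to hold for a while before firing avoids notifying the administrator about a single noisy sample.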

Expression Browser

The expression browser is available at /graph on the Prometheus server. It allows us to enter any PromQL query and see its result in a table or a graph.

Grafana

Grafana is a universal visualization tool suitable for visualizing and displaying data stored in different databases including Prometheus.

Console Template

Prometheus includes a simple built-in console template system that allows users to build their own console interfaces.
Official documentation of Prometheus Console Template

Service Discovery

In a cloud environment, monitoring targets are not fixed: nearly every monitored object changes dynamically. Thus, we cannot maintain a static list of every device to monitor.
The solution is to introduce an intermediate agent that knows all currently monitored targets.
Prometheus only needs to ask the agent which monitoring targets exist. This mechanism is called service discovery.

Example

In some cloud environments such as AWS, Prometheus can find all cloud hosts that need to be monitored by using the API provided by the platform.
In Kubernetes, the master node manages the information of all nodes. Thus, Prometheus only needs to interact with the master node to find all the containers and service objects that need to be monitored.
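For platforms without built-in support, the "agent" can be as simple as a script that writes the current target list to a file that Prometheus re-reads (file-based service discovery). The sketch below converts a made-up inventory into the JSON target-group structure used by that mechanism; the inventory fields, addresses, and label names are assumptions for illustration.

```python
import json


def to_target_groups(instances: list[dict]) -> list[dict]:
    """Group 'host:port' addresses by (service, env) into target-group objects."""
    groups: dict[tuple[str, str], list[str]] = {}
    for inst in instances:
        groups.setdefault((inst["service"], inst["env"]), []).append(inst["address"])
    return [
        {"targets": sorted(addresses), "labels": {"service": service, "env": env}}
        for (service, env), addresses in sorted(groups.items())
    ]


# Hypothetical inventory, e.g. as returned by a cloud platform's API.
inventory = [
    {"address": "10.0.0.2:9100", "service": "node", "env": "prod"},
    {"address": "10.0.0.1:9100", "service": "node", "env": "prod"},
    {"address": "10.0.1.5:9104", "service": "mysql", "env": "staging"},
]

target_groups = to_target_groups(inventory)
print(json.dumps(target_groups, indent=2))
```

Regenerating this output whenever instances appear or disappear keeps the scrape targets current without editing the main Prometheus configuration.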

Pros

Prometheus is mainly used for event monitoring and alerting. It works particularly well for recording purely numeric time series data.
Prometheus fits well for monitoring dynamic, service-oriented cloud environments such as Kubernetes.
Prometheus offers high reliability because each Prometheus server is a standalone monitoring system that does not depend on network storage or other remote services.

Cons

Prometheus does not offer durable long-term storage. Its data storage is ephemeral, since it is mainly used for event monitoring and alerting.
Prometheus does not support logging. It is designed to collect and process metrics; it is not an event logging system.

Conclusion

In our project, Elastic Stack can be used for long-term data storage, monitoring, and data retrieval, while Prometheus can be used for short-term event monitoring and alerting.
Since Prometheus works well for monitoring cloud environments, it can be deployed into our Kubernetes cluster to monitor the entire cluster.
