Yasin Taha Erol
## Blackbox Operator: ServiceMonitor? Probe? How?

Written by Emin Aktaş, Furkan Türkal, Necatican Yıldırım, developer-guy, Yasintahaerol

Probing endpoints is important when you run multiple internal or external endpoints. If any of them fails or goes down, you need to be informed immediately. Blackbox exporter plays an important role in observing a variety of endpoints, so let us look at it closely. 🌚

Blackbox exporter produces metrics based on the responses of many kinds of internal or external endpoints, over HTTP/S, TCP, ICMP, and DNS. It also gathers information about SSL certificates, so you can create an alert for an expired or invalid certificate. All of these metrics can be passed to Grafana to build detailed dashboards (DNS lookup time, HTTP latencies, etc.).

Blackbox exporter can be run in different ways. One is as a systemd service. Another is deploying it on Kubernetes. Today we will focus on deploying it on Kubernetes and use the Helm chart to configure it.

Before we start, I want to introduce some prometheus-operator concepts. The Prometheus Operator continuously watches the Kubernetes API server for configuration changes and compares the actual state with the desired state. Then it tries to sync them without manual intervention. The Operator has many custom resource definitions (CRDs). One of them is ServiceMonitor.

### What is ServiceMonitor?

According to the documentation, a ServiceMonitor declaratively specifies how groups of Kubernetes services should be monitored. The Operator automatically generates Prometheus scrape configuration based on the current state of the objects in the API server. This lets us specify a set of targets to be monitored by Prometheus without any changes on the Prometheus server side.

### What is probe ❓

According to the documentation, a Probe defines monitoring for a set of static targets or ingresses.
It is a declarative way of defining how a set of ingresses or static targets should be monitored. In terms of what they do, a Probe resembles a ServiceMonitor. When a Probe is created in the cluster, Prometheus starts scraping the configured targets automatically.

#### Why are these CRDs important ❓

- It is easy to set up either one (Probe or ServiceMonitor) independently, so you do not have to make any manual change in the Prometheus configuration. These CRDs take care of the integration.
- Teams can create their own resources without affecting each other.
- Deployment and troubleshooting are easy.

### How can we use blackbox-exporter ❓

Now it is time for some hands-on work. This article shows three different methods, with examples of how to configure each of them. To make things clearer, I will split services into external and internal. Let me start with external services.

---

![](https://i.imgur.com/fPA4WiL.png)

---

#### External Services

There are two ways to probe external services. The first is creating a ServiceMonitor, the second is creating a Probe. For our use case, Probe resources are the better fit: each team can scrape metrics about its external services by creating a Probe.

1. ServiceMonitor

If you deploy blackbox-exporter with Helm, it is easy to configure a ServiceMonitor. The chart values contain a `serviceMonitor` section that lets us activate it. When this property is enabled, the necessary configuration is created automatically for you. All URLs to be probed are specified in the `targets` section.

```
serviceMonitor:
  enabled: true
  defaults:
    additionalMetricsRelabels: {}
    labels: {}
    interval: 30s
    scrapeTimeout: 30s
    module: http_2xx
  scheme: http
  tlsConfig: {}
  bearerTokenFile:
  # Each entry is one URL to probe; per-target values override the defaults above
  targets:
    - name: google
      url: http://google.com/
      interval: 60s
      scrapeTimeout: 60s
      module: http_2xx
```

2. Probe

Alternatively, an external Probe resource can be deployed in the cluster.
In the end, both approaches produce similar results. In the `prober` section, enter the FQDN of the relevant blackbox-exporter service.

```
apiVersion: monitoring.coreos.com/v1
kind: Probe
metadata:
  name: blackbox-exporter
  namespace: monitoring
spec:
  jobName: http-get
  interval: 60s
  module: http_2xx
  prober:
    url: <blackbox-exporter-svc>.<ns>.svc:19115
    scheme: http
    path: /probe
  targets:
    staticConfig:
      static: []
```

#### Internal Services

There is a feature for this in p8s-operator. We will create a job in prometheus.yml. With the `kubernetes_sd_configs` feature (using the `service` role), development teams can define an annotation on their services so that blackbox-exporter collects metrics for them. In the commit, if any service has the specific annotation `prometheus.io/probe: true`, Prometheus starts sending requests to blackbox-exporter automatically. Also, with the power of the Prometheus relabeling mechanism, it is possible to probe a variety of different sources such as the Consul catalog, endpoints, etc. Moreover, a variety of module definitions can be added to the p8s-operator configuration. Common modules are HTTP/S, TCP, ICMP, etc.

    kubernetes_sd_configs:
      - role: service
    metrics_path: /probe
    params:
      module:
        - http_2xx
    relabel_configs:
      - action: keep
        regex: true
        source_labels:
          - __meta_kubernetes_service_annotation_prometheus_io_probe
      - source_labels:
          - __address__
        target_label: __param_target
      - replacement: blackbox-exporter-prometheus-blackbox-exporter:9115
        target_label: __address__
      - source_labels:
          - __param_target
        target_label: instance
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels:
          - __meta_kubernetes_namespace
        target_label: kubernetes_namespace
      - source_labels:
          - __meta_kubernetes_service_name
        target_label: kubernetes_name

Here is a simple example of a service with the `prometheus.io/probe: true` annotation.
```
$ kubectl describe svc nginx
Name:                     nginx
Namespace:                monitoring
Labels:                   app=nginx
Annotations:              prometheus.io/probe: true
Selector:                 app=nginx
Type:                     NodePort
IP Families:              <none>
IP:                       10.233.40.80
IPs:                      10.233.40.80
Port:                     http  80/TCP
TargetPort:               80/TCP
NodePort:                 http  30301/TCP
Endpoints:                10.233.71.180:80
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>
```

### 🔔 How to create ALERTRULES ❓

Numerous alert rules can be configured for blackbox-exporter. Alerts can be created for different problems such as SSL certificate expiry, probe slowdown, or an unreachable service. These alerts can be broadcast on different channels via webhook.

→ https://awesome-prometheus-alerts.grep.to/rules.html#blackbox-1

![](https://i.imgur.com/0Ny9r2G.png)

### 🌅 How to visualize data ❓

Blackbox metrics can be turned into a human-readable form with detailed Grafana dashboards. Here, you can find many dashboard templates for blackbox exporter depending on your need.

![](https://i.imgur.com/OATbivh.png)

## BONUS

Let us get our hands dirty with the Blackbox Exporter to understand how it works. So far we have provided the necessary parameters via Prometheus, which does the dirty job for us. Time to do it ourselves.

![](https://i.imgur.com/RCUovz1.png)

Blackbox Exporter gets its abilities through modules. Here are a couple of examples: it can make basic HTTP requests such as GET or POST and expect to receive a 2xx status code within the timeout period, or it can match a regex against the response body or headers. If you want more details about the probes and options, check the [documentation](https://github.com/prometheus/blackbox_exporter/blob/master/CONFIGURATION.md).
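To make the module idea concrete, here is a minimal sketch of what such module definitions might look like in a Blackbox Exporter configuration file. The module names `http_post_2xx` and `http_body_contains_ok`, the request body, and the regular expressions are illustrative assumptions, not taken from the original setup. The option names are the ones described in the linked configuration reference, so check them against the exporter version you run.

```yaml
modules:
  # Hypothetical module: send a POST and expect a 2xx answer within the timeout
  http_post_2xx:
    prober: http
    timeout: 5s
    http:
      method: POST
      headers:
        Content-Type: application/json
      body: '{}'
      # An empty list means the default: any 2xx status code is accepted
      valid_status_codes: []

  # Hypothetical module: fail the probe unless the body and a header match
  http_body_contains_ok:
    prober: http
    timeout: 5s
    http:
      method: GET
      fail_if_body_not_matches_regexp:
        - "ok"
      fail_if_header_not_matches:
        - header: Content-Type
          regexp: "text/html.*"
          allow_missing: false
```

A module defined this way would be selected through the `module` query parameter of the `/probe` endpoint, for example `module=http_post_2xx`.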
```bash
# Create a tmp directory and switch to it
$ pushd "$(mktemp -d -q "/tmp/blackbox_exporter.XXXXXX")"

# Download a good configuration file
$ wget https://raw.githubusercontent.com/prometheus/blackbox_exporter/master/config/testdata/blackbox-good.yml -O blackbox.yml

# Run the Blackbox Exporter as a Docker container
$ docker run --rm -d -p 9115:9115 --name blackbox_exporter -v "$(pwd)":/config prom/blackbox-exporter:master --config.file=/config/blackbox.yml

# When you are done, you can leave the directory with
$ popd
```

We can now probe any target with the `http_2xx` module, which is defined in the configuration file along with other module configurations. Simply calling the following URL returns Prometheus metrics:

http://localhost:9115/probe?target=www.trendyol.com&module=http_2xx

`probe_success` is the first metric we should check. A value of 1 means that the probe succeeded. We can also debug by adding `debug=true` to the end of the URL, like this:

http://localhost:9115/probe?target=www.trendyol.com&module=http_2xx&debug=true

We are going to see more details along with our module configuration.
```
Logs for the probe:
ts=2021-11-10T12:03:19.539609322Z caller=main.go:320 module=http_2xx target=www.trendyol.com level=info msg="Beginning probe" probe=http timeout_seconds=5
ts=2021-11-10T12:03:19.539685705Z caller=http.go:335 module=http_2xx target=www.trendyol.com level=info msg="Resolving target address" ip_protocol=ip6
ts=2021-11-10T12:03:19.570921716Z caller=http.go:335 module=http_2xx target=www.trendyol.com level=info msg="Resolved target address" ip=104.17.133.16
ts=2021-11-10T12:03:19.570980068Z caller=client.go:251 module=http_2xx target=www.trendyol.com level=info msg="Making HTTP request" url=http://104.17.133.16 host=www.trendyol.com
ts=2021-11-10T12:03:19.74709647Z caller=client.go:492 module=http_2xx target=www.trendyol.com level=info msg="Received redirect" location=https://www.trendyol.com/
ts=2021-11-10T12:03:19.747202186Z caller=client.go:251 module=http_2xx target=www.trendyol.com level=info msg="Making HTTP request" url=https://www.trendyol.com/ host=
ts=2021-11-10T12:03:19.747223777Z caller=client.go:251 module=http_2xx target=www.trendyol.com level=info msg="Address does not match first address, not sending TLS ServerName" first=104.17.133.16 address=www.trendyol.com
ts=2021-11-10T12:03:20.085912327Z caller=main.go:130 module=http_2xx target=www.trendyol.com level=info msg="Received HTTP response" status_code=200
ts=2021-11-10T12:03:20.309809321Z caller=main.go:130 module=http_2xx target=www.trendyol.com level=info msg="Response timings for roundtrip" roundtrip=0 start=2021-11-10T12:03:19.571052069Z dnsDone=2021-11-10T12:03:19.571052069Z connectDone=2021-11-10T12:03:19.631688228Z gotConn=2021-11-10T12:03:19.631718525Z responseStart=2021-11-10T12:03:19.747027915Z tlsStart=0001-01-01T00:00:00Z tlsDone=0001-01-01T00:00:00Z end=0001-01-01T00:00:00Z
ts=2021-11-10T12:03:20.309844977Z caller=main.go:130 module=http_2xx target=www.trendyol.com level=info msg="Response timings for roundtrip" roundtrip=1 start=2021-11-10T12:03:19.747300002Z dnsDone=2021-11-10T12:03:19.751055881Z connectDone=2021-11-10T12:03:19.846510737Z gotConn=2021-11-10T12:03:19.914806905Z responseStart=2021-11-10T12:03:20.085834663Z tlsStart=2021-11-10T12:03:19.846537968Z tlsDone=2021-11-10T12:03:19.914701661Z end=2021-11-10T12:03:20.309796122Z
ts=2021-11-10T12:03:20.309911491Z caller=main.go:320 module=http_2xx target=www.trendyol.com level=info msg="Probe succeeded" duration_seconds=0.770276769

Metrics that would have been returned:
# HELP probe_dns_lookup_time_seconds Returns the time taken for probe dns lookup in seconds
# TYPE probe_dns_lookup_time_seconds gauge
probe_dns_lookup_time_seconds 0.031248997
# HELP probe_duration_seconds Returns how long the probe took to complete in seconds
# TYPE probe_duration_seconds gauge
probe_duration_seconds 0.770276769
# HELP probe_failed_due_to_regex Indicates if probe failed due to regex
# TYPE probe_failed_due_to_regex gauge
probe_failed_due_to_regex 0
# HELP probe_http_content_length Length of http content response
# TYPE probe_http_content_length gauge
probe_http_content_length -1
# HELP probe_http_duration_seconds Duration of http request by phase, summed over all redirects
# TYPE probe_http_duration_seconds gauge
probe_http_duration_seconds{phase="connect"} 0.156121312
probe_http_duration_seconds{phase="processing"} 0.286337175
probe_http_duration_seconds{phase="resolve"} 0.035004882
probe_http_duration_seconds{phase="tls"} 0.068163702
probe_http_duration_seconds{phase="transfer"} 0.223961445
# HELP probe_http_redirects The number of redirects
# TYPE probe_http_redirects gauge
probe_http_redirects 1
# HELP probe_http_ssl Indicates if SSL was used for the final redirect
# TYPE probe_http_ssl gauge
probe_http_ssl 1
# HELP probe_http_status_code Response HTTP status code
# TYPE probe_http_status_code gauge
probe_http_status_code 200
# HELP probe_http_uncompressed_body_length Length of uncompressed response body
# TYPE probe_http_uncompressed_body_length gauge
probe_http_uncompressed_body_length 222945
# HELP probe_http_version Returns the version of HTTP of the probe response
# TYPE probe_http_version gauge
probe_http_version 2
# HELP probe_ip_addr_hash Specifies the hash of IP address. It's useful to detect if the IP address changes.
# TYPE probe_ip_addr_hash gauge
probe_ip_addr_hash 1.231528671e+09
# HELP probe_ip_protocol Specifies whether probe ip protocol is IP4 or IP6
# TYPE probe_ip_protocol gauge
probe_ip_protocol 4
# HELP probe_ssl_earliest_cert_expiry Returns earliest SSL cert expiry in unixtime
# TYPE probe_ssl_earliest_cert_expiry gauge
probe_ssl_earliest_cert_expiry 1.652864248e+09
# HELP probe_ssl_last_chain_expiry_timestamp_seconds Returns last SSL chain expiry in timestamp seconds
# TYPE probe_ssl_last_chain_expiry_timestamp_seconds gauge
probe_ssl_last_chain_expiry_timestamp_seconds 1.652864248e+09
# HELP probe_ssl_last_chain_info Contains SSL leaf certificate information
# TYPE probe_ssl_last_chain_info gauge
probe_ssl_last_chain_info{fingerprint_sha256="0315524193aa6ceb020b85a8311534d51d7b32d0344895687c57b9f0928eb9bb"} 1
# HELP probe_success Displays whether or not the probe was a success
# TYPE probe_success gauge
probe_success 1
# HELP probe_tls_version_info Contains the TLS version used
# TYPE probe_tls_version_info gauge
probe_tls_version_info{version="TLS 1.3"} 1

Module configuration:
prober: http
timeout: 5s
http:
  ip_protocol_fallback: true
  follow_redirects: true
tcp:
  ip_protocol_fallback: true
icmp:
  ip_protocol_fallback: true
dns:
  ip_protocol_fallback: true
```

## References

- https://sysdig.com/blog/blackbox-exporter-sysdig/
- https://github.com/prometheus/blackbox_exporter
- https://medium.com/codex/prometheus-blackbox-what-why-how-28290dbb22ce
