Declare Your Dashboards

# Declare Your Dashboards

Provisioning Grafana with NixOS

---

## Why declare your dashboards?

----

- Commit your configs
- Reduce toil
- Patterns $\to$ functions

Note:
- git will always remember better than you do.
- having all your config in the same place is a prerequisite for push-button deployment.
- once your configs live next to your code, you start noticing potential abstractions.
- it lets you normalize and deduplicate stuff.
- in other words, the same reasons you declare any resource. a good dashboard is complicated! save and edit your work like you would any other code.

---

## Why declare your dashboards in Nix/OS?

----

### What is Nix?

- Nix
  - a language
  - also a package manager
  - builds _derivations_
- NixOS
  - Linux distro built around Nix
- Nixpkgs
  - Primary package repository used by NixOS
  - Also provides _NixOS modules_ -- ways to configure services to run on NixOS

Note:
Some quick definitions up front: Nix is the language. It builds derivations, which are basically packages wrapped with a nice dependency idiom and dumped somewhere read-only. Nix is relatively easy to write, which means it's actually pretty cheap to package things like configuration files, systemd service files, etc. in it as well.

NixOS is a Linux distro that's built around Nix, i.e. entirely out of derivations.

You have to understand, it's really more than a package manager _or_ a language _or_ a distro; it's really a whole way of being, if you zoom out... you can run distributed builds, provision remote machines, walk your dog, call your mom...

For our purposes today, we also care about nixpkgs, which is the primary package repository for NixOS. Now nixpkgs provides something called NixOS modules, which are basically ways to configure _services_ to run in NixOS.
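----

### Sketch: a config file as a derivation

To make "cheap to package configuration files" concrete, here's a minimal sketch of a derivation that is nothing but a JSON file. The file name and the dashboard fields are made up for illustration; only `pkgs.writeText` and `builtins.toJSON` are real nixpkgs/Nix functions.

```nix
# example-dashboard.nix (hypothetical file name)
# Build with: nix-build example-dashboard.nix
{ pkgs ? import <nixpkgs> { } }:

# writeText builds a derivation whose output is a single read-only file
# in the Nix store.
pkgs.writeText "example-dashboard.json" (builtins.toJSON {
  # Hypothetical contents; a real Grafana dashboard has many more fields.
  title = "Example";
  panels = [ ];
})
```

The result is a store path you can reference from any other Nix expression, the same way you would reference a package.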
----

### Use NixOS modules to declare Grafana resources

- One True Answer To Declarative Configuration™
- Declare all your observability resources _as dependencies of the services themselves_
- Build bespoke abstractions and control logic
- Sidestep (much of) service discovery
- Profit

Note:
The best reason to use NixOS for observability is if you're already using it for your services.
- Declare all your observability resources - your service, its exporter, the Prometheus job that scrapes it, the data source entry for that Prometheus, and the dashboard that displays it, together
- and once you have all of those resources defined in one language, you get to Parametrize! Figure out what's changing between machines, services, regions, or whatever other dimension you care about, and turn your code into a function that takes that dimension as an argument. Lots of times, that type of variable - a network address, say, or a node id - needs to be re-discovered at runtime through service discovery. This way, you don't need to!
- Let's do a walkthrough of what that might look like.

---

### What's a module?

```nix!=
# module-a.nix
{ config, pkgs, lib, ... }:
{
  imports = [ ./module-b.nix ]; # link modules here
  options = { # define parameters here
    module-a-option = lib.mkOption {
      type = lib.types.int;
      default = 1234;
      description = "a value I want to reference elsewhere";
    };
  };
  config = { # set parameter values here
    module-b-option = config.module-a-option + 3;
  };
}
```

Note:
So here's how a module works. There are three fields: imports, which help you link modules together; options, which are parameters you can define; and config, which is where you set or override option values. Once you've defined an option, you can reference it from anywhere in the entire DAG of modules; think of a module set as like one giant function built via composition, to which every option is an argument. This bottoms out at _derivations_ -- eventually you need a module DAG to _build_ something.
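----

### Sketch: the other side of the import

Here's a minimal sketch of what `module-b.nix` could look like, to show where an option value ends up. The talk never shows this file, so its contents are an assumption: the option declaration mirrors the one in `module-a.nix`, and writing the value to `/etc/module-b-value` is just one hypothetical way to make the module build something.

```nix
# module-b.nix (hypothetical counterpart to module-a.nix)
{ config, lib, ... }:
{
  options = {
    # Declared here; module-a.nix assigns it in its `config` block.
    module-b-option = lib.mkOption {
      type = lib.types.int;
      description = "a value set from module-a";
    };
  };

  config = {
    # Consume the option: render it into a file under /etc on the built
    # system. The file name is made up for illustration.
    environment.etc."module-b-value".text = toString config.module-b-option;
  };
}
```

With the default of 1234 in `module-a.nix`, the file would contain 1237. Override `module-a-option` anywhere in the module set and the file changes with it.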
---

### Declaring dashboards (and datasources) in Grafana

https://grafana.com/docs/grafana/latest/administration/provisioning/

Note:
Now, Grafana dashboards have a backing JSON spec, as I'm sure you've seen, which you can access via the dashboard settings. Though it's less accessible from the UI, datasources do as well! You can save these JSONs, put them in a folder, and tell Grafana to pull those dashboards from that folder. Nix leverages this provisioning method under the hood to convert options into Grafana-legible JSON.

---

- Let's take a look at an example that puts this all together:
  - http://github.com/itihas/dashdeclare
- you can see it running on
  - http://catpix.itihas.xyz
  - Hit the New Cat button a few times!
- Dashboards at
  - http://catpix.itihas.xyz/grafana
  - guest creds guest:guest

Note:
Click into the code and follow along!

----

## So what's in here?

```mermaid
graph LR
    A[Browser] --> B[Nginx]
    B --> C[MinIO]
    B -.->|metrics| D[Prometheus]
    C -.->|metrics| D
    D --> E[Grafana]
    F[Cat Images] --> C
```

---

## Nginx

```nix!=
nixosModules.catpix-nginx = { config, pkgs, lib, ... }: {
  # [...]
  services.nginx = {
    enable = true;
    package = pkgs.openresty;
    statusPage = true;
    virtualHosts.${config.networking.fqdn} = {
      # [...]
    };
  };
```

Note:
So here I'm creating a module to handle all the nginx service stuff. I enable the nginx service, configure what it serves...

----

### Prometheus

```nix!=11
  # [...]
  # exporter
  services.prometheus.exporters.nginx.enable = true;
  # prometheus scrape target
  services.prometheus.scrapeConfigs = [{
    job_name = "nginx";
    scheme = "https";
    static_configs = [{
      targets = [
        "localhost:${
          builtins.toString config.services.prometheus.exporters.nginx.port
        }"
      ];
    }];
  }];
```

Note:
...enable an exporter for it, create the prometheus scrape job that's pointed at that exporter...

----

### Grafana

```nix!=25
  # [...]
  # grafana dashboard
  services.grafana.provision.dashboards.settings.providers = [
    (self.lib.mkProvider "nginx" ./dashboards/nginx.json)
  ];
}
```

Note:
...and pull the dashboards from the dashboard directory!

---

## Minio

```nix!=
nixosModules.catpix-minio = { config, pkgs, lib, ... }: {
  services.minio = {
    enable = true;
    browser = true;
  };
  # hack to remove prometheus auth for this demo
  systemd.services.minio.environment.MINIO_PROMETHEUS_AUTH_TYPE = "public";
```

Note:
Same thing for the minio module!

----

### Prometheus

```nix!=11
  services.prometheus.scrapeConfigs = [{
    job_name = "minio";
    metrics_path = "/minio/v2/metrics/cluster";
    scheme = "https";
    static_configs = [{ targets = [ "localhost" ]; }];
  }];
```

----

### Grafana

```nix!=17
  services.grafana.provision.dashboards.settings.providers = [
    (self.lib.mkProvider "minio" ./dashboards/minio.json)
  ];
```

---

## Monitoring

```nix!=
nixosModules.catpix-monitoring = { config, pkgs, lib, ... }: {
  services.prometheus.enable = true;
```

Note:
Now you might, correctly, expect the monitoring module to work a little differently. You enable the prometheus service...

----

## Grafana

```nix!=3
  services.grafana = {
    enable = true;
    # [...]
    provision = {
      enable = true;
      datasources.settings = {
        apiVersion = 1;
        datasources = [{
          name = "prometheus";
          type = "prometheus";
          url = "http://localhost:${
              toString config.services.prometheus.port
            }";
        }];
      };
    };
  };
```

Note:
...enable the grafana service, enable provisioning, configure our prometheus as a grafana datasource...

----

```nix
  services.nginx.virtualHosts."${config.networking.fqdn}" = {
    locations = {
      "/grafana/" = self.lib.mkProxy
        "http://localhost:${toString config.services.grafana.settings.server.http_port}";
    };
  };
};
```

Note:
...and configure the location where we want grafana to serve!

---

# Machine definition

```nix!=
catpix = { name, nodes, ... }: {
  deployment = { targetHost = "34.29.30.70"; };
  networking.fqdn = "catpix.itihas.xyz";
  services.prometheus.exporters.node.enable = true;
  imports = with self.nixosModules; [
    "${nixpkgs}/nixos/modules/virtualisation/google-compute-image.nix"
    catpix-nginx
    catpix-minio
    catpix-monitoring
  ];
};
```

Note:
Now I'm using colmena to deploy my NixOS machine, which is just one among several solutions you can use for this. All of those solutions, however, will necessarily have to deal with nixosModules, because modules are the building blocks with which NixOS is configured.

In this case, the "machine", i.e. the thing that compiles to a complete system image, is almost entirely specified by the contents of `imports`. Everything under the `deployment` namespace is there, as you might imagine, to control _deployment context_. In my case, I consider the FQDN also part of the deployment context, which is partly a fact about how I'm deploying this -- I made the A record manually -- and partly an opinion I have that I get to enforce since this is my repository. It could be lots of other ways -- most notably in situations where you want to change your DNS records programmatically. In that case, you'd put that option in a different place.

---

## things I think are cool about this

- you can declare your scrapeConfigs _next to your services_!
  - this means that for a single nix-based deployment, you don't need service discovery.
- dashboards are versioned and parametrized
- modules can be imported to any machine, or even multiple machines

Note:
So why'd I divide these up the way I did? Essentially, so I can reuse this stuff. I want more targets to be added when I deploy more minios! It comes down to that. This repo isn't a complete illustration of how to do that, but it does get you well started along that path.
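----

### Sketch: more minios, more targets

One way the "more targets when I deploy more minios" idea might look, as a minimal sketch. `minioHosts` and the example hostnames are made up for illustration and aren't in the demo repo; the job fields mirror the MinIO scrape config shown earlier.

```nix
# hypothetical module: one MinIO scrape job covering several hosts
{ ... }:
let
  # Assumption: a hand-maintained list of MinIO hosts (made-up names).
  minioHosts = [ "minio-1.example.org" "minio-2.example.org" ];
in {
  services.prometheus.scrapeConfigs = [{
    job_name = "minio";
    metrics_path = "/minio/v2/metrics/cluster";
    scheme = "https";
    # One target per host; deploying another MinIO means adding one entry
    # to minioHosts.
    static_configs = [{ targets = minioHosts; }];
  }];
}
```

The list is the only thing that changes between deployments, which makes it the natural candidate to promote to a module option.

Note: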
- further reading and other approaches to modularizing deployments:
  - [Flakes overview](https://nixos.wiki/wiki/Flakes)
  - [flake.parts](https://flake.parts/)

----

# things I wish were different

- you have to harvest dashboard JSONs from the Grafana web UI.
- Dashboard provisioning and WYSIWYG editing don't play well together.
- current setup doesn't actually parametrize everything that ought to be parametrized.
  - this ~~wasn't done in time for this talk~~ is left as an exercise for the reader.

----

## Questions, comments, snide remarks?

---

# Acknowledgments

- long-suffering friends who let me demo to them at short notice
- https://github.com/max-mapper/cats
- NixOS community