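---
tags: Prometheus, Grafana, Histogram, Summary, Metrics, Bucket, le
---

# A Simple Guide to Prometheus Metrics

## Histogram

A Histogram is typically used to record how long an event takes. The following example records the response time of each outgoing HTTP request:

```csharp
// Requires: using System.Diagnostics; using Prometheus;
// These members live inside a DelegatingHandler subclass.
private static readonly Histogram ResponseTime = Metrics.CreateHistogram(
    "HttpClientResponseTime",
    "Number of response time.",
    new HistogramConfiguration
    {
        // 16 buckets starting at 0.001s, each upper bound double the previous one
        Buckets = Histogram.ExponentialBuckets(0.001, 2, 16),
        LabelNames = new[] { "StatusCode", "Path" }
    });

protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
{
    var watch = Stopwatch.StartNew();
    var response = await base.SendAsync(request, cancellationToken).ConfigureAwait(false);

    // Observe fractional seconds: Elapsed.Seconds is only the whole-seconds component
    // and would report 0 for any sub-second response.
    // The "Path" label uses the request path (assumes request.RequestUri is set).
    ResponseTime
        .WithLabels(response.StatusCode.ToString(), request.RequestUri.AbsolutePath)
        .Observe(watch.Elapsed.TotalSeconds);

    return response;
}
```

Here is a simple dashboard query. It computes the 95th-percentile response time per `Path` over a 5-minute window and multiplies by 1000 to convert seconds to milliseconds:

```
histogram_quantile(0.95, sum(rate(HttpClientResponseTime_bucket{StatusCode="OK", job="ticketservice"}[5m])) by (le, Path)) * 1000
```

(Dashboard screenshot to be added)

### What the metrics look like

![](https://i.imgur.com/MmqEYVt.png)

**[basename]_bucket{le="upper bound"}** is the number of observations less than or equal to the upper bound, that is, the number of responses that took at most the given number of seconds. Buckets are cumulative: because every response here took less than 0.001s, every bucket shows a count of 3.

Here is the bucket distribution of another histogram. There are 10 requests in total: 5 took at most 0.016s, 9 took at most 0.032s, and all 10 took at most 2.048s.

![](https://i.imgur.com/9xQ5xDx.png)

### Designing the metric

Going back to the metric definition from the start of this section:

1. `0.001` is the upper bound, in seconds, of the first bucket.
2. `2` is the factor by which each bucket grows; here every bucket's upper bound is twice the previous one.
3. `16` is the total number of buckets, so the result is 0.001, 0.002, ... 16.384, 32.768, 16 buckets in all (plus the implicit `+Inf` bucket).
4. `"HttpClientResponseTime"` is the name of the metric.
5. `"Number of response time."` is the description of the metric.
6. `"StatusCode"` and `"Path"` are the labels attached to the metric, which make it easy to group the data in Grafana.

```csharp
private static readonly Histogram ResponseTime = Metrics.CreateHistogram(
    "HttpClientResponseTime",
    "Number of response time.",
    new HistogramConfiguration
    {
        Buckets = Histogram.ExponentialBuckets(0.001, 2, 16),
        LabelNames = new[] { "StatusCode", "Path" }
    });
```

## Counter

A Counter records how many times an event has occurred. It only ever increases (it starts again from zero only when the process restarts). The following example records the distribution of SportType and IsLicensee across GetTicket calls:

```csharp
// Requires: using System.Linq; using Microsoft.AspNetCore.Mvc.Filters; using Prometheus;
public class SportTypeMetricsAttribute : ActionFilterAttribute
{
    // Counter with two labels: "IsLicensee" and "SportType"
    private readonly Counter _counter = Metrics.CreateCounter(
        "under_over_get_ticket_sports",
        "Count of received underOver.GetTicket() of sports",
        "IsLicensee", "SportType");

    public override void OnActionExecuting(ActionExecutingContext context)
    {
        // Only count calls whose first action argument is an UnderOverTicket
        if (context.ActionArguments.Any() &&
            context.ActionArguments.Values.First() is UnderOverTicket underOverTicket)
        {
            var withLabels = _counter.WithLabels(
                (underOverTicket.DepositSiteType != 0).ToString(),
                underOverTicket.SportType.ToString());
            withLabels.Inc();
        }

        base.OnActionExecuting(context);
    }
}
```

The query below shows the per-second rate of GetTicket calls over a 5-minute window, grouped by `IsLicensee` and `SportType`:

```
sum by (IsLicensee, SportType) (rate(under_over_get_ticket_sports{IsLicensee="$IsLicenseeOptions"}[5m]))
```

The resulting dashboard:

![](https://i.imgur.com/3o0bInJ.png)

## Gauge

A Gauge reflects the current state of the system. Its value can go up or down, and it is commonly used for things like memory size or disk size. The following example records the execution time of each job run:

```csharp
// Requires: using System.Diagnostics; using Prometheus;
// The gauge holds the duration of the most recent run of each job, in milliseconds.
private static readonly Gauge ProcedureJobExecuteTime = Metrics
    .CreateGauge("ProcedureJob", "Execution time of the most recent job run, in milliseconds.", "JobName");

public async Task Execute(IJobExecutionContext context)
{
    var watch = Stopwatch.StartNew();
    await ExecuteJob(context);
    // Set overwrites the previous value, so only the latest run is kept
    ProcedureJobExecuteTime.WithLabels(typeof(T).Name).Set(watch.ElapsedMilliseconds);
}
```

The query below divides by 1000 to convert milliseconds to seconds and averages the result per `JobName`:

```
avg (ProcedureJob{job="refdatawebapi"}/1000) by (JobName)
```

The resulting dashboard:

![](https://i.imgur.com/jWTyAsm.png)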