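# Using OpenTelemetry and Prometheus - .NET

==This project uses .NET Framework to expose OpenTelemetry telemetry data (metrics) and display it in Prometheus.==

## OpenTelemetry

> OpenTelemetry is an open-source framework that provides a unified way to collect, process, and export an application's ==telemetry data==. Telemetry data helps developers and operations staff understand and monitor how their applications are running.
>> **Types of telemetry data**
>>
>> ==Tracing==:
>> Definition: records the flow of requests through the application, tracking each request's path from one service to another, including its start and end times.
>> Purpose: analyze and troubleshoot problems, optimize performance, and understand dependencies between services. Visualizes request processing and latency.
>>
>> ==Metrics==:
>> Definition: quantifies the application's performance and behavior, such as request count, error rate, and response time.
>> Purpose: set alerts, monitor system health, and assess resource usage. Helps uncover performance bottlenecks and wasted resources.
>>
>> ==Logs==:
>> Definition: records detailed information about events, typically used for debugging and troubleshooting.
>> Purpose: track the application's running state, analyze exceptions and errors, and record important events. Combined with traces and metrics, logs provide fuller context.

[Microsoft OpenTelemetry overview](https://learn.microsoft.com/zh-tw/dotnet/core/diagnostics/observability-with-otel)

## Prometheus

> Receives data in metrics format and renders it as charts (can be paired with Grafana).

| Metric | The USE Method | The Four Golden Signals | The RED Method |
|---------------------|-------------------------------|--------------------------------------|---------------------------------------|
| **Utilization** | Core metric, suited to resource diagnosis | Rarely mentioned; focuses on service observation | Not included |
| **Saturation** | Core metric, suited to resource diagnosis | Emphasized, but focused on performance degradation when resources are full | Not included |
| **Errors** | Present, but loosely defined | Core metric; distinguishes explicit from implicit errors | Core metric, suited to microservice architectures |
| **Latency** | None | Core metric, focused on server-side observation | Included, but focused only on the distribution |
| **Rate** | None | Secondary metric | Core metric, suited to request-driven systems |
| **Applicability** | Resource-level diagnostic tool, system-oriented | Service-level monitoring for distributed and web systems | Microservice optimization tool, but overlaps with the Four Golden Signals |

[Prometheus: Using Prometheus as your OpenTelemetry backend](https://prometheus.io/docs/guides/opentelemetry/)

# First attempt

#### .NET Framework Web MVC project

[Install packages](https://opentelemetry.io/docs/languages/net/netframework/)
: OpenTelemetry
: OpenTelemetry.Instrumentation.AspNet
: OpenTelemetry.Instrumentation.Http (instrumentation for outgoing HTTP calls made through HttpClient and HttpWebRequest)
: OpenTelemetry.Exporter.Prometheus.HttpListener (exposes metrics to Prometheus)

[Package reference](https://learn.microsoft.com/zh-tw/dotnet/core/diagnostics/observability-with-otel)

Registration

```csharp
using System;
using System.Diagnostics;
using System.Diagnostics.Metrics;
using System.Web;
using OpenTelemetry;
using OpenTelemetry.Metrics;
using OpenTelemetry.Resources;

public class MvcApplication : System.Web.HttpApplication
{
    private MeterProvider _meterProvider;
    private static readonly ActivitySource ActivitySource =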
        new ActivitySource("MyService");
    private static readonly Meter Meter = new Meter("MyService.Metrics");
    private static readonly Counter<int> RequestCounter =
        Meter.CreateCounter<int>("api_call_count");
    private static readonly Histogram<double> RequestDuration =
        Meter.CreateHistogram<double>("api_call_duration");

    protected void Application_Start()
    {
        // Set up the MeterProvider for metrics
        _meterProvider = Sdk.CreateMeterProviderBuilder()
            .AddMeter("MyService.Metrics")
            .AddPrometheusHttpListener(options =>
            {
                options.UriPrefixes = new string[] { "http://localhost:9464/" };
            })
            .SetResourceBuilder(
                ResourceBuilder.CreateDefault()
                    .AddService(serviceName: "MyService")) // same name as the ActivitySource above
            .Build();
    }

    protected void Application_BeginRequest(object sender, EventArgs e)
    {
        // Start timing at the beginning of every request
        HttpContext.Current.Items["RequestStartTime"] = Stopwatch.StartNew();
    }

    protected void Application_EndRequest(object sender, EventArgs e)
    {
        // Stop the timer and compute the request duration
        if (HttpContext.Current.Items["RequestStartTime"] is Stopwatch stopwatch)
        {
            stopwatch.Stop();
            var duration = stopwatch.Elapsed.TotalMilliseconds;

            // Record the API call count and the execution time
            RequestCounter.Add(1);
            RequestDuration.Record(duration);
        }
    }

    protected void Application_End()
    {
        _meterProvider?.Dispose();
    }
}
```

1. `MeterProvider` configures how metrics are collected and exported.
2. `Meter` is used to define the [metric types](https://prometheus.io/docs/concepts/metric_types/) you need.
3. Two instruments are defined: a **Counter** named `api_call_count` and a **Histogram** named `api_call_duration`.
4. `.AddMeter` sets the source of the metrics.
5. Prometheus scrapes metrics from the endpoint set in `AddPrometheusHttpListener` (`http://localhost:9464/`).
6. At the end of every request, `Application_EndRequest` stops the Stopwatch, computes the request duration, and records it through `RequestCounter` and `RequestDuration`.

:::info
To record metrics from other controllers, create instruments from the same `Meter` there and record the values you need (an `ActivitySource` is only required for traces).
:::

------

#### Installing Prometheus

Create a prometheus.yml file:

```yaml
global:
  scrape_interval: 5s

scrape_configs:
  - job_name: 'opentelemetry'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9464']
```

[Install with Docker. The Prometheus OTLP receiver is disabled by default, so remember to enable it manually.](https://prometheus.io/docs/guides/opentelemetry/)

```shell
# Passing any argument replaces the image's default command,
# so the config file path has to be given explicitly.
docker run --rm -v "Try\prometheus.yml:/etc/prometheus/prometheus.yml" -p 9090:9090 prom/prometheus --config.file=/etc/prometheus/prometheus.yml --enable-feature=otlp-write-receiver
```

Once it is running, open the Targets page and check the connection (UP means success, DOWN means failure).

![image](https://hackmd.io/_uploads/rJ9b6fAx1e.png)

If the target is not connected, Docker is probably resolving localhost:9464 on a different network segment from the project running on the host. Add the host IP to /etc/hosts inside the Docker container.

```shell
docker ps
docker exec -it --user root <containerId> /bin/sh
vi /etc/hosts
# press i to enter insert mode
# add the line: <host IP> localhost
# type :wq! to save and exit
```

1. Open http://localhost:9464/metrics to view the raw metrics.
2. On the Prometheus Graph page, enter a metric name and click Execute to plot it.

---

## Adding OpenTelemetry to an existing .NET Framework (4.8) project

### Packages to install

:::success
1. OpenTelemetry
2. OpenTelemetry.Instrumentation.AspNet
3. OpenTelemetry.Exporter.Prometheus.HttpListener
4. OpenTelemetry.Exporter.Zipkin (not needed if you switch to Tempo)
5. OpenTelemetry.Exporter.OpenTelemetryProtocol (for Tempo support)

(Other packages may fail to resolve. Install the missing references as the error messages indicate.)
:::

### Global.asax

```csharp
using System;
using OpenTelemetry;
using OpenTelemetry.Metrics;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;

public class MvcApplication : System.Web.HttpApplication
{
    private TracerProvider _tracerProvider;
    private MeterProvider _meterProvider;

    protected void Application_Start()
    {
        var resourceBuilder = ResourceBuilder.CreateDefault().AddService("FarmerWelfare");

        // Traces: ASP.NET requests exported to Zipkin
        _tracerProvider = Sdk.CreateTracerProviderBuilder()
            .SetResourceBuilder(resourceBuilder)
            .AddAspNetInstrumentation()
            .AddZipkinExporter(opt =>
            {
                opt.Endpoint = new Uri("http://localhost:9411/api/v2/spans");
            })
            .Build();

        // Metrics: ASP.NET request metrics exposed for Prometheus to scrape
        _meterProvider = Sdk.CreateMeterProviderBuilder()
            .SetResourceBuilder(ResourceBuilder.CreateDefault().AddService("FarmerWelfare"))
            .AddAspNetInstrumentation()
            .AddPrometheusHttpListener(options =>
            {
                options.UriPrefixes = new string[] { "http://localhost:9464/" };
            })
            .Build();
    }

    protected void Application_End()
    {
        // Flush and release both providers on shutdown
        _tracerProvider?.Dispose();
        _meterProvider?.Dispose();
    }
}
```

### Starting Zipkin and Prometheus

```yaml
services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"

  zipkin:
    image: openzipkin/zipkin:latest
    container_name: zipkin
    ports:
      - "9411:9411"
```

```shell
docker compose up -d
```

(Remember to add the localhost-to-IP mapping in /etc/hosts inside the container, as described earlier.)

:::warning
In Grafana, Zipkin only supports looking up a single trace ID, so the trace log of an entire flow cannot be viewed.
:::

---

# Second attempt

## OpenTelemetry -> Prometheus, Grafana Tempo -> Grafana

### OpenTelemetry (.NET Framework)

[Trace & Metric](https://www.notion.so/Trace-Metric-17b9b1c7e2e3802c8586d26b1fb465dd?pvs=21)

## Starting Prometheus, Grafana Tempo, and Grafana with Docker

⇒ Follow the official Grafana steps:

[https://grafana.com/docs/tempo/latest/getting-started/docker-example/](https://grafana.com/docs/tempo/latest/getting-started/docker-example/)

### docker-compose.yaml

```yaml
services:
  # Tempo runs as user 10001, and docker compose creates the volume as root.
  # As such, we need to chown the volume in order for Tempo to start correctly.
# The init service below runs once and then exits.
  init:
    image: &tempoImage grafana/tempo:latest
    user: root
    entrypoint:
      - "chown"
      - "10001:10001"
      - "/var/tempo"
    volumes:
      - ./tempo-data:/var/tempo

  memcached:
    image: memcached:1.6.29
    container_name: memcached
    ports:
      - "11211:11211"
    environment:
      - MEMCACHED_MAX_MEMORY=64m  # Set the maximum memory usage
      - MEMCACHED_THREADS=4       # Number of threads to use

  tempo:
    image: *tempoImage
    command: [ "-config.file=/etc/tempo.yaml" ]
    volumes:
      - ./tempo.yaml:/etc/tempo.yaml
      - ./tempo-data:/var/tempo
    ports:
      - "14268:14268"  # jaeger ingest
      - "3200:3200"    # tempo
      - "9095:9095"    # tempo grpc
      - "4317:4317"    # otlp grpc
      - "4318:4318"    # otlp http
      - "9411:9411"    # zipkin
    depends_on:
      - init
      - memcached

  k6-tracing:
    image: ghcr.io/grafana/xk6-client-tracing:v0.0.5
    environment:
      - ENDPOINT=tempo:4317
    restart: always
    depends_on:
      - tempo

  prometheus:
    image: prom/prometheus:latest
    command:
      - --config.file=/etc/prometheus.yaml
      - --web.enable-remote-write-receiver
      - --enable-feature=exemplar-storage
      - --enable-feature=native-histograms
    volumes:
      - ../shared/prometheus.yaml:/etc/prometheus.yaml
    ports:
      - "9090:9090"

  grafana:
    image: grafana/grafana:11.2.0
    volumes:
      - ../shared/grafana-datasources.yaml:/etc/grafana/provisioning/datasources/datasources.yaml
    environment:
      - GF_AUTH_ANONYMOUS_ENABLED=true
      - GF_AUTH_ANONYMOUS_ORG_ROLE=Admin
      # - GF_AUTH_DISABLE_LOGIN_FORM=true  # skips the login form
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=admin
      - GF_FEATURE_TOGGLES_ENABLE=traceqlEditor metricsSummary
      - GF_INSTALL_PLUGINS=https://storage.googleapis.com/integration-artifacts/grafana-exploretraces-app/grafana-exploretraces-app-latest.zip;grafana-traces-app
    ports:
      - "4000:3000"
```

### prometheus.yml

```yaml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: [ 'localhost:9090' ]
  - job_name: 'opentelemetry'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9464']
  - job_name: 'tempo'
    static_configs:
      - targets: [ 'tempo:3200' ]
```

### tempo.yml

```yaml
stream_over_http_enabled: true
server:
  http_listen_port: 3200
  log_level: info

query_frontend:
  search:
    duration_slo: 5s
    throughput_bytes_slo: 1.073741824e+09
    metadata_slo:
      duration_slo: 5s
      throughput_bytes_slo: 1.073741824e+09
  trace_by_id:
    duration_slo: 5s

distributor:
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: "tempo:4317"

ingester:
  max_block_duration: 5m  # cut the headblock when this much time passes. this is being set for demo purposes and should probably be left alone normally

compactor:
  compaction:
    block_retention: 1h  # overall Tempo trace retention. set for demo purposes

metrics_generator:
  registry:
    external_labels:
      source: tempo
      cluster: docker-compose
  storage:
    path: /var/tempo/generator/wal
    remote_write:
      - url: http://prometheus:9090/api/v1/write
        send_exemplars: true
  traces_storage:
    path: /var/tempo/generator/traces

storage:
  trace:
    backend: local  # backend configuration to use
    wal:
      path: /var/tempo/wal  # where to store the wal locally
    local:
      path: /var/tempo/blocks

overrides:
  defaults:
    metrics_generator:
      processors: [service-graphs, span-metrics, local-blocks]  # enables metrics generator
      generate_native_histograms: both
```

> Make sure the Prometheus and Tempo exporter ports in the .NET registration match the ports in the YAML files.

```csharp
.AddOtlpExporter(opt =>
{
    opt.Endpoint = new Uri("http://localhost:4317"); // Tempo's default OTLP gRPC port
}).Build();
```

```yaml
# tempo.yml
distributor:
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: "tempo:4317"
```

```csharp
.AddPrometheusHttpListener(options =>
{
    options.UriPrefixes = new string[] { "http://localhost:9464/" };
}).Build();
```

```yaml
scrape_configs:
  - job_name: 'opentelemetry'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9464']
  - job_name: 'tempo'
    static_configs:
      - targets: [ 'tempo:3200' ]
```

## NuGet Install

- OpenTelemetry
- OpenTelemetry.Instrumentation.AspNet
- OpenTelemetry.Instrumentation.Http (HttpClient)
- OpenTelemetry.Instrumentation.SqlClient (SqlClient)
- OpenTelemetry.Exporter.Prometheus.HttpListener (Prometheus)
- OpenTelemetry.Exporter.OpenTelemetryProtocol (exports in the unified OTLP format)

## Registering OpenTelemetry in .NET Framework

Web.Config

```xml
<system.webServer>
  <modules>
    <add name="TelemetryHttpModule"
         type="OpenTelemetry.Instrumentation.AspNet.TelemetryHttpModule, OpenTelemetry.Instrumentation.AspNet.TelemetryHttpModule"
         preCondition="integratedMode,managedHandler" />
  </modules>
</system.webServer>
```

Global.asax

```csharp
using System;
using System.Diagnostics;
using System.Web;
using OpenTelemetry;
using OpenTelemetry.Metrics;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;

public class MvcApplication : System.Web.HttpApplication
{
    private TracerProvider _tracerProvider;
    private MeterProvider _meterProvider;

    protected void Application_Start()
    {
        var resourceBuilder = ResourceBuilder.CreateDefault().AddService("MyService");

        _tracerProvider = Sdk.CreateTracerProviderBuilder()
            .SetResourceBuilder(resourceBuilder)
            .AddAspNetInstrumentation(options =>
            {
                // Custom EnrichWithHttpRequest callback
                options.EnrichWithHttpRequest = (activity, httpRequest) =>
                {
                    var routeData = httpRequest.RequestContext.RouteData;

                    // Add the controller name to the trace
                    if (routeData.Values.TryGetValue("controller", out var controller))
                    {
                        activity.SetTag("http.route.controller", controller?.ToString());
                    }

                    // Add the action name to the trace
                    if (routeData.Values.TryGetValue("action", out var action))
                    {
                        activity.SetTag("http.route.action", action?.ToString());
                    }

                    // Add the id route value to the trace
                    if (routeData.Values.TryGetValue("id", out var id))
                    {
                        activity.SetTag("http.route.id", id?.ToString());
                    }

                    // Add the full path as a trace tag
                    activity.SetTag("http.route.path", httpRequest.Url?.AbsolutePath);
                };
            })
            .AddHttpClientInstrumentation()      // optional
            .AddSqlClientInstrumentation()       // optional
            .AddProcessor(new CustomProcessor()) // register CustomProcessor
            .AddOtlpExporter(opt =>
            {
                opt.Endpoint = new Uri("http://localhost:4317"); // Tempo's default OTLP gRPC port
            })
            .Build();

        _meterProvider = Sdk.CreateMeterProviderBuilder()
            .SetResourceBuilder(ResourceBuilder.CreateDefault().AddService("MyService"))
            .AddAspNetInstrumentation(options =>
                // Add tags manually
                options.Enrich = (HttpContext context, ref TagList tags) =>
                {
                    // Add request content type to the metric tags.
                    if (!string.IsNullOrEmpty(context.Request.ContentType))
                    {
                        tags.Add("custom.content.type", context.Request.ContentType);
                    }

                    tags.Add("route", context.Request.Url.AbsolutePath);
                })
            .AddHttpClientInstrumentation() // optional
            .AddSqlClientInstrumentation()  // optional
            .AddPrometheusHttpListener(options =>
            {
                options.UriPrefixes = new string[] { "http://localhost:9464/" };
            })
            .Build();
    }

    protected void Application_End()
    {
        // Flush and release both providers on shutdown
        _tracerProvider?.Dispose();
        _meterProvider?.Dispose();
    }
}
```

```csharp
using System.Diagnostics;
using System.Linq;
using OpenTelemetry;

// Custom processor that overrides the http.route tag
// (the instrumentation overwrites it by default).
public class CustomProcessor : BaseProcessor<Activity>
{
    public override void OnEnd(Activity activity)
    {
        // Only touch activities that carry an http.route tag
        if (activity.Tags.FirstOrDefault(t => t.Key == "http.route").Value is string)
        {
            // Full path, set earlier in EnrichWithHttpRequest
            var newRoute = activity.Tags.FirstOrDefault(t => t.Key == "http.route.path").Value;
            var method = activity.Tags.FirstOrDefault(t => t.Key == "http.request.method").Value;

            // Both values are overwritten by default, so the processor
            // sets them one last time here.
            activity.SetTag("http.route", newRoute);
            activity.DisplayName = $"{method} {newRoute}";
        }
    }
}
```

## Checking the Prometheus and Grafana connections

:::info
Containers on the Docker network cannot reach services running on the host's localhost. Add a `<host IP> localhost` entry to /etc/hosts inside the container so that it can find the matching service by port number.

```shell
docker ps                                           # 1. find the container ID
docker exec -it --user root <containerId> /bin/sh   # 2. open a root shell in the container
vi /etc/hosts                                       # 3. open the hosts file
# 4. press i to enter insert mode
# 5. add the line: <host IP> localhost
# 6. type :wq! to save and exit
```
:::

### Prometheus

Go to Status → Targets. A state of UP means the connection succeeded.

![image](https://hackmd.io/_uploads/BJPbLoXD1x.png)

### Grafana Tempo

- Add new connection → Add new data source.

  ![image](https://hackmd.io/_uploads/HyEfUiXPkg.png)

- In the data source, set Name and URL. The remaining settings can be configured later.

  ![image](https://hackmd.io/_uploads/SJmQ8iXv1l.png)

- Finally, test the connection.

  ![image](https://hackmd.io/_uploads/Bye4Ls7Dye.png)

- To view trace data, go to Explore, select Tempo, set Query type to Search, choose the service name configured in the .NET Framework project ("MyService" above) under Service Name, then click Run query.

  ![image](https://hackmd.io/_uploads/S1A48jXwkl.png)

---

### Grafana Prometheus

- Configure the connection under Data Source.

  ![image](https://hackmd.io/_uploads/BJaSLsmDyg.png)

  ![image](https://hackmd.io/_uploads/rJdLUjXPyx.png)

- Explore → Metrics (metric names can be looked up at http://localhost:9464/metrics)

  ![image](https://hackmd.io/_uploads/SJEv8j7vkx.png)

## Dashboards (building charts)

![image](https://hackmd.io/_uploads/H1udUoQwJg.png)

![image](https://hackmd.io/_uploads/B1-KIimwkx.png)
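
## Custom metrics from other controllers (sketch)

The First attempt mentions that other controllers can record their own metrics. The following is a minimal sketch of one way that might look, assuming the `MyService.Metrics` meter name from the First attempt is registered on the `MeterProvider` with `.AddMeter("MyService.Metrics")`. `AppMetrics`, `OrdersController`, and the `orders_viewed` instrument are hypothetical names used only for illustration, not part of the original project.

```csharp
using System.Collections.Generic;
using System.Diagnostics.Metrics;
using System.Web.Mvc;

// Hypothetical shared holder so every controller uses the same Meter.
// "MyService.Metrics" must match the name passed to .AddMeter(...).
public static class AppMetrics
{
    public static readonly Meter Meter = new Meter("MyService.Metrics");

    // Hypothetical instrument name, chosen for illustration only.
    public static readonly Counter<long> OrdersViewed =
        Meter.CreateCounter<long>("orders_viewed");
}

// Hypothetical controller showing where the counter is incremented.
public class OrdersController : Controller
{
    public ActionResult Index()
    {
        // Record one event, tagged with the action name.
        AppMetrics.OrdersViewed.Add(
            1, new KeyValuePair<string, object>("action", "index"));

        return View();
    }
}
```

The Second attempt registration does not call `.AddMeter(...)`, so a line such as `.AddMeter("MyService.Metrics")` would probably need to be added to its `MeterProvider` builder before a counter like this shows up at the Prometheus scrape endpoint.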