<div style="display:flex">
  <img src="https://hackmd.io/_uploads/By0N9OUjgg.png" alt="grafana-icon" style="width:250px"/>
  <img src="https://hackmd.io/_uploads/H1OgPPIjxe.png" alt="opentelemetry-icon" style="width:250px"/>
</div>

## Why?

Tracing a production issue across multiple services is painful. Our team uses OpenSearch as its log system, but it only provides plain logs, so we can't visualize a request and see where the bottleneck is. With the Faro Web SDK in the browser and OpenTelemetry on the server, the tracing flow becomes simple and clear.

## Get started with Faro Web SDK

### Install dependencies

```
pnpm add @grafana/faro-react @grafana/faro-web-sdk @grafana/faro-web-tracing @opentelemetry/api @opentelemetry/context-zone @opentelemetry/instrumentation @opentelemetry/instrumentation-document-load @opentelemetry/instrumentation-fetch @opentelemetry/instrumentation-xml-http-request @opentelemetry/propagator-jaeger @opentelemetry/resources @opentelemetry/sdk-trace-web
```

The component below also imports `next-runtime-env`; add it too if your project doesn't already have it.

### Add required global variables in .env file

```
NEXT_PUBLIC_FARO_URL=https://alloy-gw.line.dev/ecbot-d0ec/collect
NEXT_PUBLIC_FARO_API_KEY=demo-xxxxxxx
NEXT_PUBLIC_FARO_APP_NAME=nextjs-demo:client
NEXT_PUBLIC_FARO_APP_NAMESPACE=nextjs-demo
```

The component also reads `NEXT_PUBLIC_PHASE` as the environment name, so define it as well.

### Implement Client Component

This component initializes the Faro instance. You can set the variables per environment in `.env.{ENV}`.

Note that `propagateTraceHeaderCorsUrls` is set to `window.location.origin` because our Next.js project only uses server actions and server-side rendering, so the browser only calls its own origin. If the browser fetches directly from a third-party API that integrates OpenTelemetry, add that origin to the list.

`FetchInstrumentation` is the crucial instrumentation: it adds the `uber-trace-id` request header to outgoing requests, which is what lets you look up the trace in the Grafana dashboard.
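The `uber-trace-id` header follows the Jaeger propagation format `{trace-id}:{span-id}:{parent-span-id}:{flags}`, so the trace ID is the part before the first colon. As a minimal sketch, a small helper can pull it out of a copied header value. The name `parseUberTraceId` and the `UberTraceId` type are hypothetical and not part of Faro or OpenTelemetry.

```typescript
// Hypothetical helper, not part of Faro or OpenTelemetry.
// Splits a Jaeger `uber-trace-id` header value into its four fields.
interface UberTraceId {
  traceId: string
  spanId: string
  parentSpanId: string
  flags: string
}

function parseUberTraceId(header: string): UberTraceId | null {
  const parts = header.trim().split(':')
  // The Jaeger format always has exactly four colon-separated fields.
  if (parts.length !== 4) return null

  const [traceId = '', spanId = '', parentSpanId = '', flags = ''] = parts
  // The trace ID is hexadecimal, at most 32 characters (128 bits).
  if (!/^[0-9a-f]{1,32}$/i.test(traceId)) return null

  return { traceId, spanId, parentSpanId, flags }
}

// Example: take the value copied from the request header and keep
// only the trace ID, which is what you paste into Tempo.
const copied = '4bf92f3577b34da6a3ce929d0e0e4736:00f067aa0ba902b7:0:1'
console.log(parseUberTraceId(copied)?.traceId)
```

The helper returns `null` instead of throwing, so a mistyped or truncated value is easy to detect before searching.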
```tsx
// components/faro-telemetry.tsx
'use client'

import type { Faro } from '@grafana/faro-react'
import { initializeFaro as coreInit } from '@grafana/faro-react'
import { FaroSessionSpanProcessor, FaroTraceExporter } from '@grafana/faro-web-tracing'
import { context, trace } from '@opentelemetry/api'
import { ZoneContextManager } from '@opentelemetry/context-zone'
import { registerInstrumentations } from '@opentelemetry/instrumentation'
import { DocumentLoadInstrumentation } from '@opentelemetry/instrumentation-document-load'
import { FetchInstrumentation } from '@opentelemetry/instrumentation-fetch'
import { XMLHttpRequestInstrumentation } from '@opentelemetry/instrumentation-xml-http-request'
import { JaegerPropagator } from '@opentelemetry/propagator-jaeger'
import { resourceFromAttributes } from '@opentelemetry/resources'
import { BatchSpanProcessor, WebTracerProvider } from '@opentelemetry/sdk-trace-web'
import { env } from 'next-runtime-env'
import { useEffect, useRef } from 'react'

function initializeFaro(): Faro {
  const PHASE = env('NEXT_PUBLIC_PHASE')
  const URL = env('NEXT_PUBLIC_FARO_URL')
  const API_KEY = env('NEXT_PUBLIC_FARO_API_KEY')
  const NAME = env('NEXT_PUBLIC_FARO_APP_NAME')
  const VERSION = '1.0.0'

  // initialize faro
  const faro = coreInit({
    url: URL,
    apiKey: API_KEY,
    app: { name: NAME, version: VERSION, environment: PHASE }
  })

  // set up otel
  const resource = resourceFromAttributes({
    'service.name': NAME,
    'service.version': VERSION,
    'deployment.environment.name': PHASE
  })

  const provider = new WebTracerProvider({
    resource,
    spanProcessors: [
      new FaroSessionSpanProcessor(
        new BatchSpanProcessor(new FaroTraceExporter({ ...faro })),
        faro.metas
      )
    ]
  })

  provider.register({
    propagator: new JaegerPropagator(),
    contextManager: new ZoneContextManager()
  })

  // Do not trace the requests that send telemetry to the collector itself.
  const ignoreUrls = URL ? [URL] : []

  // This is a list of specific URIs or regular expressions
  const propagateTraceHeaderCorsUrls = [window.location.origin]

  // Please be aware that this instrumentation originates from OpenTelemetry
  // and cannot be used directly in the initializeFaro instrumentations options.
  // If you wish to configure these instrumentations using the initializeFaro function,
  // please utilize the instrumentations options within the TracingInstrumentation class.
  registerInstrumentations({
    instrumentations: [
      new DocumentLoadInstrumentation(),
      new FetchInstrumentation({ ignoreUrls, propagateTraceHeaderCorsUrls }),
      new XMLHttpRequestInstrumentation({ ignoreUrls, propagateTraceHeaderCorsUrls })
    ]
  })

  // register OTel with Faro
  faro.api.initOTEL(trace, context)

  return faro
}

export const FaroTelemetry = () => {
  // Keep the instance in a ref so Faro is initialized only once.
  const faroRef = useRef<Faro | null>(null)

  useEffect(() => {
    if (!faroRef.current) {
      faroRef.current = initializeFaro()
    }
  }, [])

  return null
}
```

### How to use it to trace production issue?

1. Copy the trace ID from the `uber-trace-id` request header (the text before the first colon).
![Screenshot 2025-10-14 4.56.53 PM](https://hackmd.io/_uploads/HyqJZqopll.png)
2. Paste it into Tempo and search.
![Screenshot 2025-10-14 4.57.42 PM](https://hackmd.io/_uploads/BJeZZ9jpeg.png)
3. Get the trace and node graph (screenshots are in the Result section).

## Get started with OpenTelemetry

### Install dependencies

```
pnpm add @opentelemetry/api @opentelemetry/auto-instrumentations-node @opentelemetry/exporter-trace-otlp-http @opentelemetry/propagator-jaeger @opentelemetry/resources @opentelemetry/sdk-node @opentelemetry/sdk-trace-base
```

### Add required global variables in .env file

```
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector.xxx-system.svc.cluster.xxxxx:xxxxx/api/traces
OTEL_SERVICE_NAME=nextjs-demo:server
OTEL_RESOURCE_ATTRIBUTES=service.namespace=nextjs-demo
```

### Manual Implementation

You can follow the [Next.js tutorial](https://nextjs.org/docs/app/guides/open-telemetry) to set it up on your own. In my case, I use the Jaeger propagator and export traces over OTLP/HTTP.
```typescript
// instrumentation.ts
export async function register() {
  // Load the Node.js SDK only in the Node.js runtime, not in the edge runtime.
  if (process.env.NEXT_RUNTIME === 'nodejs') {
    await import('./instrumentation.node')
  }
}
```

```typescript
// instrumentation.node.ts
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node'
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http'
import { JaegerPropagator } from '@opentelemetry/propagator-jaeger'
import { resourceFromAttributes } from '@opentelemetry/resources'
import { NodeSDK } from '@opentelemetry/sdk-node'
import { BatchSpanProcessor } from '@opentelemetry/sdk-trace-base'
import { env } from 'next-runtime-env'

// Send spans to the collector over OTLP/HTTP.
const exporter = new OTLPTraceExporter({
  url: env('OTEL_EXPORTER_OTLP_ENDPOINT')
})

const sdk = new NodeSDK({
  serviceName: env('OTEL_SERVICE_NAME'),
  resource: resourceFromAttributes({
    'service.name': env('OTEL_SERVICE_NAME'),
    'service.version': '1.0.0',
    'deployment.environment.name': env('NEXT_PUBLIC_PHASE')
  }),
  traceExporter: exporter,
  spanProcessors: [new BatchSpanProcessor(exporter)],
  // Use the same `uber-trace-id` header as the browser side.
  textMapPropagator: new JaegerPropagator(),
  instrumentations: [getNodeAutoInstrumentations({})]
})

sdk.start()
```

```
// Get traceId from api response header if target service has integrated with opentelemetry.
// Log the response and the traceId in your nextjs server.
// You should customize the code with your own.
```

### How to use it to trace production issue?

1. Copy the trace ID.
    - Locally, copy it from the Next.js terminal output.
    - In production, copy it from the IU log.
2. Paste it into Tempo.

## Result

All the work is done! Query Tempo with the target trace ID and you get the following two types of diagram in the Tempo dashboard.

### Trace

It shows all the spans under the same trace ID. The timing of each span is recorded, which helps you find performance issues.

![Screenshot 2025-09-08 4.24.33 PM](https://hackmd.io/_uploads/ryfg3UUile.png)

### Node Graph

A simple graph of the relationships between services. It looks fancy and cool, but I seldom use it to trace production issues.
![Screenshot 2025-09-08 4.24.59 PM](https://hackmd.io/_uploads/Hkgb2LIoxg.png)

## Reference

- https://nextjs.org/docs/app/guides/open-telemetry
- https://github.com/grafana/faro-web-sdk/blob/main/docs/sources/tutorials/quick-start-browser.md#with-custom-opentelemetry-tracing-configuration
- https://opentelemetry.io/docs/languages/js/propagation/