Ticket Selling Service
---
- NoSQL database: Redis
- Monitoring tools: Prometheus + Grafana

![ticket-selling-architecture](https://hackmd.io/_uploads/ryAts5Itp.png)
Figure. Service and monitoring architecture

**Description:** This service is a ticket registration system (RESTful API) that receives a high volume of requests from users within a short time. Redis is used to make data access more efficient. For monitoring, Prometheus collects the metrics and Grafana visualizes them, so the application's state can be observed and rules can be set up to raise alerts proactively. (JMeter can additionally be used to simulate different scenarios.)

**Code**
(1) Github: https://github.com/yaahsin/ticket-selliing
(2) Docker Hub: yahsin/side-project-phrase1 / image: ticket-selling.jar

Spring Boot Application to Docker image
---
0. Build the jar first: run Maven `install` (`mvn install`); the jar is written to the `target` folder.
1. Add a Dockerfile in the project root (same level as `pom.xml`)
```dockerfile
# Base image with JDK 17
FROM openjdk:17-oracle
# Copy the jar built by Maven into the image
COPY ./target/*.jar <filename>.jar
# Creates an empty marker file; not required to run the application
RUN sh -c 'touch changeinfo.jar'
# Start the Spring Boot application
ENTRYPOINT ["java","-jar","<filename>.jar"]
```
2. From a terminal in the project directory, run: `docker build -t <filename>.jar .`
3. When the build finishes, the image is listed by: `docker images`
4. Start the container: `docker run -p 8080:8080 -d <filename>.jar`

REF:
- [Spring Boot Docker](https://spring.io/guides/topicals/spring-boot-docker/)
- [Day16: Docker Container 簡介,從 Jar 到 Container (上)](https://ithelp.ithome.com.tw/articles/10302393)

Monitoring: Prometheus + Grafana
---
> Prometheus collects metrics from the application and serves as the datasource for Grafana dashboards, which visualize and consolidate the information; alerts are added for proactive monitoring.

1. Add the dependencies
```xml
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
    <scope>runtime</scope>
</dependency>
```
2. Configure endpoint exposure so the metrics can be accessed
```yml
management:
  endpoints:
    web:
      exposure:
        include: '*'
```
- actuator: http://localhost:8080/actuator
- prometheus metrics: http://localhost:8080/actuator/prometheus

3.
Start the monitoring services using docker compose

(0-1) Single Redis. Prerequisite: put the containers on the same network (containers reach each other using the container name as hostname)
- Custom network: `docker network create -d bridge local-network`
- Redis: `docker run -d --name local-redis -p 6379:6379 --network local-network redis`
- Redis-exporter: `docker run -d --name redis-exporter-local -p 9121:9121 --network local-network oliver006/redis_exporter --redis.addr=redis://local-redis:6379`

![image](https://hackmd.io/_uploads/H191_BH5p.png)
Figure. Confirm the containers are on the same network: `docker network inspect local-network`

(0-2) Start the application and the redis exporter with docker compose
```yml
version: '3.7'
services:
  ticket-selling:
    image: ticket-selling.jar
    container_name: ticket-selling
    restart: always
    ports:
      - "8080:8080"
    networks:
      - my-local-redis-network
    environment:
      - SPRING_PROFILES_ACTIVE=ut
  redis-exporter-local:
    image: oliver006/redis_exporter:v1.51.0
    container_name: redis-exporter-local
    ports:
      - "9121:9121"
    restart: unless-stopped
    networks:
      - my-local-redis-network
# use the network that was already created outside this compose file
networks:
  my-local-redis-network:
    external: true
    name: local-redis-cluster
```
![image](https://hackmd.io/_uploads/BJ2KiQ9h6.png)
Figure.
Confirm the containers are on the same network: `docker network inspect local-redis-cluster`

(1-0) Prometheus + Grafana: docker compose
```yml
version: '3.7'
services:
  prometheus:
    image: prom/prometheus:v2.44.0
    container_name: prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
    networks:
      - my-local-redis-network
  grafana:
    image: grafana/grafana:9.5.2
    container_name: grafana
    ports:
      - "3000:3000"
    restart: unless-stopped
    volumes:
      - ./grafana/provisioning/datasources:/etc/grafana/provisioning/datasources
    networks:
      - my-local-redis-network
# use the network that was already created outside this compose file
networks:
  my-local-redis-network:
    external: true
    name: local-redis-cluster
```
(1-1) Prometheus configuration, using the host's internal address (`host.docker.internal`)
- Configure 2 targets to collect the Spring Boot metrics and the Redis metrics (from redis-exporter)
```yml
scrape_configs:
  - job_name: 'MyAppMetrics'
    metrics_path: '/actuator/prometheus'
    scrape_interval: 3s
    static_configs:
      - targets: ['host.docker.internal:8080']
        labels:
          application: 'My Spring Boot Application'
  - job_name: 'RedisMetrics'
    static_configs:
      - targets: ['host.docker.internal:9121']
```
![image](https://hackmd.io/_uploads/SJFEQBS56.png)
Figure.
Prometheus targets

Configuration for a Redis cluster, where a single exporter scrapes multiple Redis targets:
```yml
scrape_configs:
  - job_name: 'MyAppMetrics'
    metrics_path: '/actuator/prometheus'
    scrape_interval: 3s
    static_configs:
      - targets: ['host.docker.internal:8080']
        labels:
          application: 'My Spring Boot Application'
  ## config for scraping the exporter itself
  - job_name: 'Redis_exporter'
    static_configs:
      - targets: ['host.docker.internal:9121']
  ## config for the multiple Redis targets that the exporter will scrape
  - job_name: 'redis_exporter_targets'
    static_configs:
      - targets:
          - redis://host.docker.internal:9011
          - redis://host.docker.internal:9012
          - redis://host.docker.internal:9013
          - redis://host.docker.internal:9014
          - redis://host.docker.internal:9015
          - redis://host.docker.internal:9016
    metrics_path: /scrape
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: host.docker.internal:9121
```
![image](https://hackmd.io/_uploads/BJaCtmcnT.png)
Figure. Redis cluster targets

(1-2) Grafana configuration: Prometheus as the datasource
```yml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
```
![image](https://hackmd.io/_uploads/HkzkPrSqT.png)
Figure. Grafana datasource

(2) Start the containers: `docker compose up`
![image](https://hackmd.io/_uploads/B1T4DBrqT.png)
Figure. Startup output

(3) Check that the metrics are being collected
![image](https://hackmd.io/_uploads/ryHyrBH5T.png)
Figure. Application metrics
![image](https://hackmd.io/_uploads/SyesVHScT.png)
Figure. Redis metrics from Redis Exporter

(4) Grafana dashboard
![image](https://hackmd.io/_uploads/Hkyw9HSq6.png)
Figure. Testing with dashboard template <763>
<!-- TODO: customize dashboard metrics -->
![image](https://hackmd.io/_uploads/BJp3qrBca.png)
Figure. Comparison of JMeter runs configured with 200 and 2000 threads/sec
![image](https://hackmd.io/_uploads/Byx6tl232p.png)
![image](https://hackmd.io/_uploads/Hkwjln33a.png)
Figure.
Grafana dashboard

---
URL:
- Prometheus: http://localhost:9090/graph
- Grafana: http://localhost:3000/ (default login: admin/admin)
- Actuator: http://localhost:8080/actuator/prometheus
- Redis-exporter: http://localhost:9121/metrics

4. Key metrics to monitor
   1. K8S/container
      - Pod Restart Count
      - Node Health
      - CPU usage
      - memory usage
      - network throughput
   2. JVM
      - Heap Memory Usage
      - Non-Heap Memory Usage
      - GC (Garbage Collection)
   3. Application
      - Request Count
      - Request Duration
      - Request Error Rate
   4. Redis
      - General metrics
        - uptime (redis_uptime_in_seconds): Redis should not restart repeatedly. Alert: uptime below a chosen duration
        - client (redis_connected_clients): Alert: the number of connections should be no less than the number of applications
        - persistence timestamp (rdb_last_save_time): the time of the last save to disk. Alert: an acceptable interval between saves
        - connected slaves (redis_connected_slaves)
      - System resources
        - memory usage (redis_memory_used_bytes, redis_memory_max_bytes)
        - fragmentation ratio: a value < 1 indicates swapping (data is moved between memory and disk, which frees up space but degrades performance). Ideally used_memory_rss is slightly higher than used_memory. Alert: ratio > 1.5
          - used_memory_rss: Number of bytes that Redis allocated as seen by the operating system
          - used_memory: Total number of bytes allocated by Redis using its allocator
        - evicted_keys: when memory usage exceeds maxmemory, Redis evicts keys/values to free space; frequent evictions add latency
      - Throughput (rate of the operations)
        - Command execution rate (instantaneous_ops_per_sec)
        - Hit rate: Hits / Misses per Sec (redis_keyspace_hits_total, redis_keyspace_misses_total)
        - Total Commands (redis_commands_total)
        - Network I/O (redis_net_input_bytes_total, redis_net_output_bytes_total)
      - Latency
        - AVG Response Time (redis_commands_duration_seconds_total, redis_commands_processed_total)
        - Slow query (redis_slowlog_length: number of logged entries; slowlog-log-slower-than): tracks slow commands (only the command execution phase, excluding time spent on work such as returning the reply). The default threshold is 10 ms. To view the content of the entries, pair it with other tools (Elasticsearch, ElastiCache). Alert: number of slow queries within a time window
   5. Others:
      - Registered ticket count
      - source
      - quantity
      - response time

---
Ways to categorize monitoring
- By source category, grouped accordingly
- State before and after specific events

REF.
monitor
- [Monitoring Made Simple: Empowering Spring Boot Applications with Prometheus and Grafana](https://medium.com/simform-engineering/revolutionize-monitoring-empowering-spring-boot-applications-with-prometheus-and-grafana-e99c5c7248cf)
- [Docker localhost IP](https://stackoverflow.com/questions/33726929/docker-localhost-ip)
- [How to Monitor Redis Performance](https://medium.com/@MetricFire/how-to-monitor-redis-performance-819125702401)
- [oliver006/redis_exporter](https://github.com/oliver006/redis_exporter/blob/master/README.md)
- [redis-3.0.7内存碎片过高 mem_fragmentation_ratio >1.5](https://www.twblogs.net/a/5d7ef832bd9eee541c3484a3)
- [Redis进阶 - 运维监控:Redis的监控详解](https://pdai.tech/md/db/nosql-redis/db-redis-y-monitor.html)
- [Redis Monitoring | 101 Guide to Redis Metrics Monitoring](https://signoz.io/blog/redis-monitoring/)
- [Redis](https://docs.asserts.ai/assertion-catalog/data-stores/redis)
- [使用 Redis 的 slowlog get [n] 慢查询日志彻底解决生产问题!](https://blog.csdn.net/weixin_44018338/article/details/99460667)
- [redis slow log metrics](https://github.com/oliver006/redis_exporter/issues/67)

REF. redis
- [Redis commands INFO](https://redis.io/commands/info/)
- [local-redis-cluster](https://github.com/mnadeem/local-redis-cluster)
- [DOCKER BASED REDIS CLUSTER UP AND RUNNING IN LOCAL WINDOWS MACHINE](https://reachmnadeem.wordpress.com/2020/10/03/docker-based-redis-cluster-up-and-running-in-local-windows-machine/)
- [Redis (六) - 主從複製、哨兵與叢集模式](https://hackmd.io/@tienyulin/redis-master-slave-replication-sentinel-cluster#%E5%8F%A2%E9%9B%86%E6%A8%A1%E5%BC%8F)

REF. [Networking in Compose](https://docs.docker.com/compose/networking/)

---
<!-- Collecting data with GrayLog
![image](https://hackmd.io/_uploads/BkEdujnip.png)
![image](https://hackmd.io/_uploads/H1zAjohoT.png)
![image](https://hackmd.io/_uploads/r1GFeHq26.png)
<!-- ![image](https://hackmd.io/_uploads/S1cRsj3jT.png) -->
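
Appendix sketch: the alert criteria noted for the Redis metrics earlier in this document (uptime, connected clients, memory usage, fragmentation ratio > 1.5) could be written as Prometheus alerting rules. The fragment below is only an illustration, not part of the project. The file name, group name, alert names, `for` durations and every threshold except 1.5 are assumptions. It also assumes the exporter exposes `redis_mem_fragmentation_ratio`, which should be checked against http://localhost:9121/metrics first.

```yml
# redis-alerts.yml (hypothetical file name)
groups:
  - name: redis-alerts                 # hypothetical group name
    rules:
      # Redis restarted recently: uptime below an assumed 5-minute threshold
      - alert: RedisRecentlyRestarted
        expr: redis_uptime_in_seconds < 300
        labels:
          severity: warning
        annotations:
          summary: "Redis {{ $labels.instance }} restarted within the last 5 minutes"

      # Fewer connections than expected; 2 is a placeholder
      # (assumed: one application plus the exporter's own connection)
      - alert: RedisTooFewClients
        expr: redis_connected_clients < 2
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Redis {{ $labels.instance }} has fewer clients than expected"

      # Memory above an assumed 90% of maxmemory; skipped when maxmemory is unset (0)
      - alert: RedisMemoryHigh
        expr: (redis_memory_used_bytes / redis_memory_max_bytes > 0.9) and (redis_memory_max_bytes > 0)
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Redis {{ $labels.instance }} memory usage is above 90% of maxmemory"

      # Fragmentation ratio above the 1.5 threshold from the notes
      # (metric name assumed; verify it in the exporter output)
      - alert: RedisFragmentationHigh
        expr: redis_mem_fragmentation_ratio > 1.5
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Redis {{ $labels.instance }} fragmentation ratio is above 1.5"
```

To try it, the file would need to be mounted into the Prometheus container next to `prometheus.yml` and listed under a top-level `rule_files:` entry there. Firing alerts then show up on the Prometheus Alerts page; routing them to a notification channel would additionally require Alertmanager or Grafana alerting.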