---
title: LogQL
tags: Loki
description: LogQL
---

# LogQL

LogQL (Language to Query Logs from Loki) is the query language used by Loki. Its syntax and operators are heavily inspired by PromQL.
In use it feels like a distributed [grep](https://ithelp.ithome.com.tw/articles/10265120) that aggregates and views logs.
LogQL filters with labels and operators.

There are two kinds of queries:
- Log queries filter and search the content of log lines
- Metric queries extend log queries and compute or aggregate values from the results

Together, these two parts let us compose whatever we need in LogQL.

## Log Queries
![](https://i.imgur.com/0D4WVlD.png)

A basic log query consists of a log stream selector and a log pipeline.
Every query contains a stream selector, optionally followed by a pipeline that passes each log line from one stage to the next.
A query is simply these two parts combined in the order in which they execute.

Example:
```bash=
{container="query-frontend",namespace="loki-dev"} |= "metrics.go" | logfmt | duration > 10s and throughput_mb < 500
```

![](https://i.imgur.com/i2Ckp4p.png)

### Log stream selector
The selector is delimited by **{ }** (curly braces).
```bash=
{app="mysql",name="mysql-backup"}
```
This selects every log stream that has a label app with the value mysql and a label name with the value mysql-backup.
A selector needs at least one label-value pair and may contain several.

Besides `=`, the following match operators are available:
- `=` exactly equal
- `!=` not equal
- `=~` regex matches
- `!~` regex does not match

Regex matchers are fully anchored: the pattern must match the entire label value, including newlines.
```bash=
{name =~ "mysql.+"}
{name !~ "mysql.+"}
{name !~ `mysql-\d+`}
```

```bash=
{job="ta3", level!~"FAT.+"}
{job="ta3", level=~`FATA\w+`}
```

### Log pipeline
A log pipeline is appended to a log stream selector and consists of a set of expressions.
The expressions are executed from left to right for each log line.
If an expression filters out a log line, the pipeline stops processing that line and moves on to the next one.
```bash=
| line_format "{{.status_code}}"
```

#### Log pipeline expressions
There are three types of log pipeline expressions:
- Filter expressions
    - Line filter expressions
    - Label filter expressions
- Parsing expressions
- Formatting expressions
    - Line format expressions
    - Label format expressions

![](https://i.imgur.com/n3BGtkk.png)

##### Line filter expression
- `|=` Log line contains the string
- `!=` Log line does not contain the string
- `|~` Log line contains a match to the regex
- `!~` Log line does not contain a match to the regex

```bash=
{job="mysql"} |= "error" # lines containing the substring error
{instance=~"kafka-[23]",name="kafka"} != "kafka.server:type=ReplicaManager" # drop lines containing the substring kafka.server:type=ReplicaManager
{name="cassandra"} |~ `error=\w+` # lines containing error= followed by word characters
{job="mysql"} |= "error" != "timeout" # lines containing error but not timeout
```

##### Label filter expression
Filters log lines by the labels attached to each entry, both the original labels and the ones extracted by a parser.
The supported value types are:
- string : for example, the value of the `__error__` label is a string
- duration : Valid time units are “ns”, “us” (or “µs”), “ms”, “s”, “m”, “h”
- number : floating-point number (64bits)
- bytes : Valid bytes units are “b”, “kib”, “kb”, “mib”, “mb”, “gib”, “gb”, “tib”, “tb”, “pib”, “pb”, “eib”, “eb”

```bash=
{job="iis",site="default_site"} | json | cs_method = "GET" and (time_taken > 500 and sc_status > 200)
```

##### Parser expression
- JSON
Turns every property of a JSON log line into a label.
```bash=
| json
```
- logfmt
```bash=
at=info method=GET path=/ host=grafana.net fwd="124.133.124.161" service=8ms status=200
| logfmt
"at" => "info"
"method" => "GET"
"path" => "/"
"host" => "grafana.net"
"fwd" => "124.133.124.161"
"service" => "8ms"
"status" => "200"
```
- pattern
Describes the structure of a log line with a pattern expression. `<label_name>` captures a field into a label of that name.
`<_>` matches a field but discards it.
```bash=
0.191.12.2 - - [10/Jun/2021:09:14:29 +0000] "GET /api/plugins/versioncheck HTTP/1.1" 200 2 "-" "Go-http-client/2.0" "13.76.247.102, 34.120.177.193" "TLSv1.2" "US" "" 3
| pattern `<ip> - - <_> "<method> <uri> <_>" <status> <size> <_> "<agent>" <_>`
"ip" => "0.191.12.2"
"method" => "GET"
"uri" => "/api/plugins/versioncheck"
"status" => "200"
"size" => "2"
"agent" => "Go-http-client/2.0"
```
- regexp
Uses [Golang RE2 syntax](https://github.com/google/re2/wiki/Syntax).
Works like pattern, except that labels come from named capture groups of the form `(?P<name>regex)`.
```bash=
POST /api/prom/api/v1/query_range (200) 1.5s
| regexp "(?P<method>\\w+) (?P<path>[\\w|/]+) \\((?P<status>\\d+?)\\) (?P<duration>.*)"
"method" => "POST"
"path" => "/api/prom/api/v1/query_range"
"status" => "200"
"duration" => "1.5s"
```

##### Line format expression
```bash=
| line_format "{{.label_name}}"
```
This expression rewrites the content of the log line.
Its single string parameter is a template, in Go text/template syntax, that describes what the rewritten line looks like.
```bash=
{job="iis"} | json | line_format "{{.cs_method}},{{.s_ip}} "
GET,10.10.240.103
```

##### Label format expression
Renames, modifies, or adds labels.
```bash=
| label_format
```

## Binary operators
### Arithmetic operators
- `+` (addition)
- `-` (subtraction)
- `*` (multiplication)
- `/` (division)
- `%` (modulo)
- `^` (power)

You can run 1+1 in LogQL and look at the result!
```bash=
sum(rate({job="ta3"}[10m])) * 100
```
Double the per-second log rate over a time range:
```bash=
sum(rate({job="ta3"}[1m]))*2
```

### Logical operators
- `and` (intersection) : v1 and v2
- `or` (union) : v1 or v2
- `unless` (complement) : v1 unless v2

### Comparison operators
- `==` (equality)
- `!=` (inequality)
- `>` (greater than)
- `>=` (greater than or equal to)
- `<` (less than)
- `<=` (less than or equal to)

You can try these out as well:
```bash=
1 >= 1
1 > 2
```

### Order of operations
Precedence follows ordinary arithmetic, and operators with the same precedence are left-associative.
```bash=
1 + 2 / 3 # is equal to 1 + (2 / 3)
2 * 3 % 2 # is equal to (2 * 3) % 2
```

### Comments
Use **#**.
```bash=
{app="foo"}
    | json
    # this line will be ignored
    | bar="baz" # this checks if bar = "baz"
```

### Pipeline Errors
```bash=
__error__
```
When something goes wrong in a pipeline stage, Loki does not filter out the affected log line. The line is still passed on to the next stage, but with an extra `__error__` label attached.
That label can then be used to filter for those errors.
Try it, and check whether the log entries carry the extra label:
```bash=
{job="ta3"} | logfmt | __error__!=""
```

## Metric Queries
### [Log range aggregations](https://grafana.com/docs/loki/latest/logql/metric_queries/#log-range-aggregations)
An aggregation function is applied to a query over a time range.
- rate(log-range) computes the number of log entries per second
- count_over_time(log-range) counts the log entries of each log stream within the time range

```bash=
count_over_time({job="ta3"}[1h]) # number of log entries in the last hour
rate({job="ta3"}[1h]) # per-second log rate over the last hour
```

### [Unwrapped range aggregations](https://grafana.com/docs/loki/latest/logql/metric_queries/#unwrapped-range-aggregations)
Unlike the aggregations above, these operate on label values rather than on log lines.
The query expression must end with an unwrap expression, which selects the label whose value is used in the aggregation.
```bash=
<aggr-op>([parameter,] <unwrapped-range>) [without|by (<label list>)]
```
- rate(unwrapped-range) per-second rate of the sum of the values in the interval
- sum_over_time(unwrapped-range) sum of the values in the interval
- avg_over_time(unwrapped-range) average of the values in the interval
- max_over_time(unwrapped-range) maximum of the values in the interval
- min_over_time(unwrapped-range) minimum of the values in the interval
- quantile_over_time(scalar, unwrapped-range) the φ-quantile (0 ≤ φ ≤ 1) of the values in the specified interval

```bash=
# p99 latency for each path
quantile_over_time(0.99,
  {cluster="ops-tools1",container="ingress-nginx"}
    | json
    | __error__ = ""
    | unwrap request_time [1m]) by (path)
```

```bash=
# total bytes processed for each org_id
sum by (org_id) (
  sum_over_time(
  {cluster="ops-tools1",container="loki-dev"}
      |= "metrics.go"
      | logfmt
      | unwrap bytes_processed [1m])
  )
```

### [Built-in aggregation operators](https://grafana.com/docs/loki/latest/logql/metric_queries/#built-in-aggregation-operators)
A built-in aggregation operator aggregates the elements of a single vector and returns a new vector with fewer elements.
```bash=
<aggr-op>([parameter,] <vector expression>) [without|by (<label list>)]
```
aggr-operators :
- sum
- avg
- min
- max
- stddev
- stdvar
- count
- topk Select largest k elements by sample value
- bottomk Select smallest k elements by sample value

```bash=
# QPS of iis (per-second log rate), grouped by site
sum(rate({job="iis"}[5m])) by (site)
# the 10 application names with the highest log throughput; the 10 is the parameter of topk
topk(10,sum(rate({region="us-east1"}[5m])) by (name))
```
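To make "a new vector with fewer elements" more concrete, the toy Python sketch below imitates `sum(...) by (name)` followed by `topk(2, ...)` on a hand-written vector. It is only an illustration of the grouping idea, not Loki's implementation: the sample data, the `pod` label, and the helper names `sum_by` and `topk` are all assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical instant vector: each sample is (labels, value),
# e.g. the per-second log rate of one stream. The data is made up.
samples = [
    ({"name": "api", "pod": "api-0"}, 12.0),
    ({"name": "api", "pod": "api-1"}, 8.0),
    ({"name": "db", "pod": "db-0"}, 5.0),
    ({"name": "cache", "pod": "cache-0"}, 30.0),
]


def sum_by(vector, label_names):
    """Imitate `sum(...) by (<labels>)`: group on the listed labels and add the values."""
    groups = defaultdict(float)
    for labels, value in vector:
        # Only the grouping labels survive; every other label is dropped.
        key = tuple((name, labels.get(name, "")) for name in label_names)
        groups[key] += value
    return [(dict(key), total) for key, total in groups.items()]


def topk(k, vector):
    """Imitate `topk(k, ...)`: keep the k samples with the largest values."""
    return sorted(vector, key=lambda sample: sample[1], reverse=True)[:k]


by_name = sum_by(samples, ["name"])  # 4 input samples collapse into 3 groups
top2 = topk(2, by_name)
print(top2)  # → [({'name': 'cache'}, 30.0), ({'name': 'api'}, 20.0)]
```

The two `api` samples merge into one element because `pod` is not in the grouping list, which is the same effect `by (name)` has in the `topk(10, ...)` query above.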