--- title: postgresql json/jsonb tags: postgresql, json, yoctol, workshop --- # postgresql json/jsonb postgresql 提供了兩種 json 的 type: json/jsonb ## type json - 可以視為單純將 json stringify 後存入 text 欄位 - 寫入較快,因為比起 jsonb 沒有處理成本 - 每次取用(不論讀寫)都需要把字串 reparse 成 json,所以除了寫入以外的效率都相對較差。 - 不能做 index - 因為是存字串,空白、順序、甚至重複 key 都會被保留 ## type jsonb - 存了處理過後的 binary - 寫入需要花點成本來轉換成 binary - 省去 reparse 的動作,所以大部分操作都會比較快 - 可以做 index - 某些情況會讓 jsonb 佔的 storage 更大。 - 處理過程中,空白、順序、重複 key,都會被處理掉 - 重複 key 會以後面輸入的為準 - 排序以 key 的長度來排 ## JSON vs pg json | JSON primitive type | PostgreSQL type | Notes | | -------- | -------- | -------- | | string | text | \u0000 is disallowed, as are Unicode escapes representing characters not available in the database encoding | |number | numeric | NaN and infinity values are disallowed |boolean | boolean | Only lowercase true and false spellings are accepted |null |(none) | SQL NULL is a different concept | ```sql -- Simple scalar/primitive value -- Primitive values can be numbers, quoted strings, true, false, or null SELECT '5'::json; -- Array of zero or more elements (elements need not be of same type) SELECT '[1, 2, "foo", null]'::json; -- Object containing pairs of keys and values -- Note that object keys must always be quoted strings SELECT '{"bar": "baz", "balance": 7.77, "active": false}'::json; -- Arrays and objects can be nested arbitrarily SELECT '{"foo": [true, "bar"], "tags": {"a": 1, "b": null}}'::json; -- Different handling for json and jsonb SELECT '{ "bar": "baz", "balance": 7.77, "bar": "bar", "active":false}'::jsonb; /* result: jsonb -------------------------------------------------- {"bar": "bar", "active": false, "balance": 7.77} (1 row) */ SELECT '{ "bar": "baz", "balance": 7.77, "bar": "bar", "active":false}'::json; /* result: json -------------------------------------------------- { "bar": "baz", "balance": 7.77, "bar": "bar", "active":false} (1 row) */ SELECT '{"reading": 1.230e-5}'::json, '{"reading": 1.230e-5}'::jsonb; /* result: json | jsonb ----------------------- ------------------------- {"reading": 1.230e-5} | {"reading": 0.00001230} */ ``` ## Designing JSON Documents - predictable structure. - each JSON column/documents should each represent an atomic datum that business rules dictate. - if constantly updating documents, should keep each JSON documents small to decrease row lock contention. ## json functions & operators ```sql -- Simple scalar/primitive values contain only the identical value: SELECT '"foo"'::jsonb @> '"foo"'::jsonb; -- The array on the right side is contained within the one on the left: SELECT '[1, 2, 3]'::jsonb @> '[1, 3]'::jsonb; -- Order of array elements is not significant, so this is also true: SELECT '[1, 2, 3]'::jsonb @> '[3, 1]'::jsonb; -- Duplicate array elements don't matter either: SELECT '[1, 2, 3]'::jsonb @> '[1, 2, 2]'::jsonb; -- The object with a single pair on the right side is contained -- within the object on the left side: SELECT '{"product": "PostgreSQL", "version": 9.4, "jsonb": true}'::jsonb @> '{"version": 9.4}'::jsonb; -- The array on the right side is not considered contained within the -- array on the left, even though a similar array is nested within it: SELECT '[1, 2, [1, 3]]'::jsonb @> '[1, 3]'::jsonb; -- yields false -- But with a layer of nesting, it is contained: SELECT '[1, 2, [1, 3]]'::jsonb @> '[[1, 3]]'::jsonb; -- Similarly, containment is not reported here: SELECT '{"foo": {"bar": "baz"}}'::jsonb @> '{"bar": "baz"}'::jsonb; -- yields false -- A top-level key and an empty object is contained: SELECT '{"foo": {"bar": "baz"}}'::jsonb @> '{"foo": {}}'::jsonb; ``` documents: https://www.postgresql.org/docs/13/functions-json.html ## json path https://github.com/digoal/blog/blob/master/202010/20201013_01_doc_001.pdf 要裝新版 postgres 12 以上才有 ```shell $ docker pull postgres $ YOUR_PORT=5566 $ docker run -d \ --name jsonpath_test_db \ -p $YOUR_PORT:5432 \ -e POSTGRES_USER=user \ -e POSTGRES_PASSWORD=password \ postgres $ psql -p $YOUR_PORT -U user -h localhost postgres or $ docker exec -it $YOUR_DATABASE_CONTAINER_ID psql -U user postgres ``` ```sql postgres=# \df *json*path* List of functions Schema | Name | Result data type | Argument data types | Type ------------+------------------------------+------------------+-------------------------------------------------------------------------------------------+------ pg_catalog | gin_consistent_jsonb_path | boolean | internal, smallint, jsonb, integer, internal, internal, internal, internal | func pg_catalog | gin_extract_jsonb_path | internal | jsonb, internal, internal | func pg_catalog | gin_extract_jsonb_query_path | internal | jsonb, internal, smallint, internal, internal, internal, internal | func pg_catalog | gin_triconsistent_jsonb_path | "char" | internal, smallint, jsonb, integer, internal, internal, internal | func pg_catalog | json_extract_path | json | from_json json, VARIADIC path_elems text[] | func pg_catalog | json_extract_path_text | text | from_json json, VARIADIC path_elems text[] | func pg_catalog | jsonb_delete_path | jsonb | jsonb, text[] | func pg_catalog | jsonb_extract_path | jsonb | from_json jsonb, VARIADIC path_elems text[] | func pg_catalog | jsonb_extract_path_text | text | from_json jsonb, VARIADIC path_elems text[] | func pg_catalog | jsonb_path_exists | boolean | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false | func pg_catalog | jsonb_path_exists_opr | boolean | jsonb, jsonpath | func pg_catalog | jsonb_path_exists_tz | boolean | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false | func pg_catalog | jsonb_path_match | boolean | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false | func pg_catalog | jsonb_path_match_opr | boolean | jsonb, jsonpath | func pg_catalog | jsonb_path_match_tz | boolean | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false | func pg_catalog | jsonb_path_query | SETOF jsonb | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false | func pg_catalog | jsonb_path_query_array | jsonb | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false | func pg_catalog | jsonb_path_query_array_tz | jsonb | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false | func pg_catalog | jsonb_path_query_first | jsonb | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false | func pg_catalog | jsonb_path_query_first_tz | jsonb | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false | func pg_catalog | jsonb_path_query_tz | SETOF jsonb | target jsonb, path jsonpath, vars jsonb DEFAULT '{}'::jsonb, silent boolean DEFAULT false | func pg_catalog | jsonpath_in | jsonpath | cstring | func pg_catalog | jsonpath_out | cstring | jsonpath | func pg_catalog | jsonpath_recv | jsonpath | internal | func pg_catalog | jsonpath_send | bytea | jsonpath | func (25 rows) ``` ```sql CREATE TABLE t_track ( a jsonb); INSERT INTO t_track (a) VALUES (' { "gpsname": "gps1", "track" : { "segments" : [ { "location": [ 47.763, 13.4034 ], "start time": "2018-10-14 10:05:14", "HR": 73 }, { "location": [ 47.706, 13.2635 ], "start time": "2018-10-14 10:39:21", "HR": 130 } ] } }'); SELECT jsonb_pretty(a) FROM t_track; /* result: jsonb_pretty ----------------------------------------------------- { + "track": { + "segments": [ + { + "HR": 73, + "location": [ + 47.763, + 13.4034 + ], + "start time": "2018-10-14 10:05:14"+ }, + { + "HR": 130, + "location": [ + 47.706, + 13.2635 + ], + "start time": "2018-10-14 10:39:21"+ } + ] + }, + "gpsname": "gps1" + } (1 row) */ -- postgres 11 可以用 ->> 查詢 json 內部元素,如下: postgres=# SELECT a ->> 'gpsname' FROM t_track ; /* result: ?column? ---------- gps1 (1 row) */ -- postgres 12 可以使用 SQL/JSON path 查詢,如下: postgres=# SELECT jsonb_path_query(a,'$.gpsname') FROM t_track ; /* result: jsonb_path_query ------------------ "gps1" (1 row) */ -- 在 json 數據層數變多或是查詢條件複雜的時候 json path 查詢的優勢就會顯現出來,例如查詢 t_track 的 track.segments,如下: postgres=# SELECT jsonb_pretty(jsonb_path_query(a,'$.track.segments')) FROM t_track ; /* result: jsonb_pretty --------------------------------------------- [ { "HR": 73, "location": [ 47.763, 13.4034 ], "start time": "2018-10-14 10:05:14" }, { "HR": 130, "location": [ 47.706, 13.2635 ], "start time": "2018-10-14 10:39:21" } ] (1 row) */ -- track.segments 是 array,可以用 [] 查詢 array 內元素,如下: /* result: postgres=# SELECT jsonb_pretty(jsonb_path_query(a,'$.track.segments[0]')) FROM t_track ; jsonb_pretty ----------------------------------------- { "HR": 73, "location": [ 47.763, 13.4034 ], "start time": "2018-10-14 10:05:14" } (1 row) */ postgres=# SELECT jsonb_pretty(jsonb_path_query(a,'$.track.segments[1]')) FROM t_track ; /* result: jsonb_pretty ----------------------------------------- { "HR": 130, "location": [ 47.706, 13.2635 ], "start time": "2018-10-14 10:39:21" } (1 row) */ -- 查詢下一層的元素,如下 postgres=# SELECT jsonb_path_query(a,'$.track.segments[1].HR') FROM t_track ; /* result: jsonb_path_query ------------------ 130 (1 row) */ -- SQL/JSON Path Expressions 篩選 -- 使用 SQL/JSON path 函式查询 Json 數據时,可以指定 filter 條件查詢滿足條件的 Json 元素,例如查詢 HR 元素值大於 100 的 track.segments 元素,如下: postgres=# SELECT jsonb_pretty(jsonb_path_query(a,'$.track.segments ? ( @.HR > 100)')) FROM t_track ; /* result: jsonb_pretty ----------------------------------------- { "HR": 130, "location": [ 47.706, 13.2635 ], "start time": "2018-10-14 10:39:21" } (1 row) */ -- 如果只想挑出 "start time" 元素,如下: postgres=# SELECT jsonb_pretty(jsonb_path_query(a,'$.track.segments ? ( @.HR > 100)."start time"')) FROM t_track ; /* result: jsonb_pretty ----------------------- "2018-10-14 10:39:21" (1 row) */ -- 可以設定多個 filter 條件,如下: postgres=# SELECT jsonb_pretty(jsonb_path_query(a,'$.track.segments ? ( @.HR > 100) ? ( @.location[*] < 40)')) FROM t_track ; /* result: jsonb_pretty ----------------------------------------- { "HR": 130, "location": [ 47.706, 13.2635 ], "start time": "2018-10-14 10:39:21" } (1 row) */ -- jsonb_path_exists() -- jsonb_path_exists() 判断是否存在指定的 json path,syntax 如下: -- jsonb_path_exists(target jsonb, path jsonpath [, vars jsonb, silent bool]) -- 範例如下: postgres=# SELECT jsonb_path_exists(a,'$.track.segments.HR') FROM t_track ; /* result: jsonb_path_exists ------------------- t (1 row) */ postgres=# SELECT jsonb_path_exists(a,'$.track.segments.ab') FROM t_track ; /* result: jsonb_path_exists ------------------- f (1 row) */ ``` ## Update 2024 https://paquier.xyz/postgresql-2/2023-05-07-postgres-16-json-predicate/ ## reference https://www.postgresql.org/docs/current/datatype-json.html https://github.com/digoal/blog/blob/master/202010/20201013_01_doc_001.pdf https://github.com/digoal/blog/blob/master/202010/20201013_01.md