owned this note
owned this note
Published
Linked with GitHub
# postgresql 樹狀結構
## 什麼是樹狀結構


## 可能的使用情境
- 目錄/麵包屑/分類/子分類
- 物料清單
- 討論串
## 範例樹

## 可能的解決方案
### Adjacency list
使用一個 parent_id 來儲存關聯,假設是討論串的情境,則欄位如下:
- id
- parent_id
- content
資料如下:
| id | parent_id | content |
|--- |-----------|---------|
| 1 | NULL | aa |
| 2 | 1 | bb |
| 3 | 1 | cc |
| 4 | 2 | dd |
| 5 | 3 | ee |
| 6 | 4 | ff |
| 7 | 4 | gg |
#### 查詢父層
```sql
SELECT parent.* from adjancency_lists node
join adjancency_lists parent on node.parent_id = parent.id
where node.id = 4;
```
#### 查詢子層
```sql
SELECT * from adjancency_lists node
where parent_id = 4;
```
#### 查詢所有祖先
WITH RECURSIVE CTE 參考文件:https://www.postgresql.org/docs/current/queries-with.html
```sql
WITH RECURSIVE ancestors(id, parent_id, content) AS (
SELECT node.id, node.parent_id, node.content
FROM adjancency_lists node
WHERE id = 4
UNION ALL
SELECT node.id, node.parent_id, node.content
FROM adjancency_lists node
JOIN ancestors ON ancestors.parent_id = node.id
)
SELECT * FROM ancestors;
```
#### 查詢所有後代
```sql
WITH RECURSIVE decendants(id, parent_id, content) AS (
SELECT node.id, node.parent_id, node.content
FROM adjancency_lists node
WHERE id = 4
UNION ALL
SELECT node.id, node.parent_id, node.content
FROM adjancency_lists node
JOIN decendants ON decendants.id = node.parent_id
)
SELECT * FROM decendants;
```
#### 寫入節點
```sql
INSERT INTO adjancency_lists(parent_id, content)
VALUES (4, 'qq');
SELECT * from adjancency_lists;
INSERT INTO adjancency_lists(parent_id, content)
VALUES (10, 'qqq');
```
#### 刪除節點(包含後代)使用 foreign_key 連動刪除
```sql
delete from adjancency_lists where id = 10;
```
### Path enumeration / Materialized Path
使用一個 path 字串([或陣列](https://medium.com/notes-from-a-messy-desk/representing-trees-in-postgresql-cbcdae419022))欄位來儲存關聯,假設是討論串的情境,則欄位如下:
- id
- path
- content
資料如下:
| id | path | content |
|----|----------|---------|
| 1 | | aa |
| 2 | 1/ | bb |
| 3 | 1/ | cc |
| 4 | 1/2/ | dd |
| 5 | 1/3/ | ee |
| 6 | 1/2/4/ | ff |
| 7 | 1/2/4/ | gg |
其中,path 可以是天然的麵包屑
#### 查詢 id 4 的父層
```sql
SELECT * from materialized_paths
where path = '1/2/' ;
```
#### 查詢 id 4 的子層
```sql
SELECT * from materialized_paths
where path = '1/2/4/';
```
#### 查詢 id 4 的所有祖先
```sql
SELECT * from materialized_paths
where '1/2/' ~* path;
```
#### 查詢 id 4 的所有後代
```sql
SELECT * from materialized_paths
where path ~* '1/2/';
```
#### 寫入節點
```sql
INSERT INTO materialized_paths(path, content)
VALUES ('1/2/4/', 'gg');
SELECT * from materialized_paths;
INSERT INTO materialized_paths(path, content)
VALUES ('1/2/4/11/', 'qqq');
```
#### 刪除節點(包含後代)
```sql
delete from materialized_paths where id = 11 or path ~* '1/2/4/11';
```
### Nested sets



使用兩個數字欄位(left, right)來儲存關聯,假設是討論串的情境,則欄位如下:
- id
- left
- right
- content
資料如下:
| id | left | right | content |
|----|------|-------|---------|
| 1 | 1 | 14 | aa |
| 2 | 2 | 9 | bb |
| 3 | 10 | 13 | cc |
| 4 | 3 | 8 | dd |
| 5 | 11 | 12 | ee |
| 6 | 4 | 5 | ff |
| 7 | 6 | 7 | gg |
#### 查詢 id 4 的父層
```sql
SELECT parent.* from nested_sets node
join nested_sets parent on node.left BETWEEN parent."left" + 1 and parent."right" - 1
where node.id = 4
order by parent.right limit 1;
```
#### 查詢 id 4 的子層
[因為太麻煩了所以可能多儲存一個 depth](https://en.wikipedia.org/wiki/Nested_set_model#Variations)
```sql
SELECT node.* from nested_sets node
join nested_sets parent on node.left BETWEEN parent."left" + 1 and parent."right" - 1
where parent.id = 4
and not exists (
select *
from nested_sets mid
where mid.left BETWEEN parent.left + 1 and parent.right - 1
and node.left BETWEEN mid.left + 1 and mid.right - 1
);
```
#### 查詢 id 4 的所有祖先
```sql
SELECT parent.* from nested_sets node
join nested_sets parent on node.left BETWEEN parent."left" and parent."right"
where node.id = 4;
```
#### 查詢 id 4 的所有後代
```sql
SELECT node.* from nested_sets node
join nested_sets parent on node.left BETWEEN parent."left" and parent."right"
where parent.id = 4;
```
#### 寫入節點
```sql
select "right" from nested_sets
where id = 4;
update nested_sets
set "right" = "right" + 2
WHERE "right" >= 8;
update nested_sets
set "left" = "left" + 2
WHERE "left" >= 8;
insert into nested_sets("left", "right", content)
values (8,9,'qq');
-- 看一下寫入的狀況
select * from nested_sets
order by "left", "right" desc;
-- 再寫入一筆
select "right" from nested_sets
where id = 8;
update nested_sets
set "right" = "right" + 2
WHERE "right" >= 9;
update nested_sets
set "left" = "left" + 2
WHERE "left" >= 9;
insert into nested_sets("left", "right", content)
values (9,10,'qqq');
select * from nested_sets
order by "left", "right" desc;
```
#### 刪除節點(包含後代)
```sql
select "left", "right" from nested_sets
where id = 8;
delete from nested_sets
where "left" >= 8 and "right" <= 11;
update nested_sets
set "left" = "left" - (11-8+1)
WHERE "left" >= 11;
update nested_sets
set "right" = "right" - (11-8+1)
WHERE "right" >= 11;
```
### Subsets / Closure table
使用一個資料表來儲存關聯,假設是討論串的情境,則欄位如下:
- closeure_table_comments
- id
- content
- closeure_table_relations
- ancestor_id
- descendant_id
- length
資料如下:
- closeure_table_comments
| id | content |
|----|---------|
| 1 | aa |
| 2 | bb |
| 3 | cc |
| 4 | dd |
| 5 | ee |
| 6 | ff |
| 7 | gg |
- closeure_table_relations
| id | ancestor_id | dencendant_id | length |
|----|-------------|---------------|--------|
| 2 | 1 | 1 | 0 |
| 3 | 1 | 2 | 1 |
| 4 | 1 | 3 | 1 |
| 5 | 1 | 4 | 2 |
| 6 | 1 | 5 | 2 |
| 7 | 1 | 6 | 3 |
| 8 | 1 | 7 | 3 |
| 9 | 2 | 2 | 0 |
| 10 | 2 | 4 | 1 |
| 11 | 2 | 6 | 2 |
| 12 | 2 | 7 | 2 |
| 13 | 3 | 3 | 0 |
| 14 | 3 | 5 | 1 |
| 15 | 4 | 4 | 0 |
| 16 | 4 | 6 | 1 |
| 17 | 4 | 7 | 1 |
| 18 | 5 | 5 | 0 |
| 19 | 6 | 6 | 0 |
| 20 | 7 | 7 | 0 |
#### 查詢 id 4 的父層
```sql
select c.* from closeure_table_comments c
join closeure_table_relations r on ancestor_id = c.id
where r.dencendant_id = 4 and r."length" = 1;
```
#### 查詢 id 4 的子層
```sql
select c.* from closeure_table_comments c
join closeure_table_relations r on dencendant_id = c.id
where r.ancestor_id = 4 and r."length" = 1;
```
#### 查詢 id 4 的所有祖先
```sql
select c.* from closeure_table_comments c
join closeure_table_relations r on ancestor_id = c.id
where r.dencendant_id = 4 and r."length" >= 1;
```
#### 查詢 id 4 的所有後代
```sql
select c.* from closeure_table_comments c
join closeure_table_relations r on dencendant_id = c.id
where r.ancestor_id = 4 and r."length" >= 1;
```
#### 寫入節點
```sql
insert into closeure_table_comments("content")
values('QQ');
select * from closeure_table_comments;
insert into closeure_table_relations(ancestor_id, dencendant_id, "length")
SELECT ancestor_id, 8 as dencendant_id, length + 1 as length FROM closeure_table_relations
WHERE dencendant_id = 4
union select 8,8,0;
-- 看一下寫入的狀況
select * from closeure_table_relations;
-- 再寫入一筆
insert into closeure_table_comments("content")
values('QQQ');
select * from closeure_table_comments;
insert into closeure_table_relations(ancestor_id, dencendant_id, "length")
SELECT ancestor_id, 9 as dencendant_id, length + 1 as length FROM closeure_table_relations
WHERE dencendant_id = 8
union select 9,9,0;
```
#### 刪除節點(包含後代)
```sql
select dencendant_id from closeure_table_relations
where ancestor_id = 8;
delete from closeure_table_relations
where dencendant_id in (8,9);
delete from closeure_table_comments
where id in (8,9);
```
## 結論
- RECURSIVE CTE 好像還不錯
- 大量寫入、少量查詢時,使用 Adjancency list
- 資料量大但是不高(depth)的樹,使用 Closure Table 或 Materialized Path
- Nested Set 感覺比較遜,但是我都沒實測過
- 需要在真實的大量資料上實測各種方法實際的效能(搭配 index)
## 相關套件
- ltree: https://www.postgresql.org/docs/current/ltree.html
- hierarchy-data-closure-table: https://github.com/developerworks/hierarchy-data-closure-table
## 參考資料
- https://www.facebook.com/etrex.kuo/posts/2725099867500858
- https://blog.frost.tw/posts/2017/10/23/Use-PostgreSQL-s-recursive-query-to-find-ancestors/
- https://www.slideshare.net/billkarwin/models-for-hierarchical-data
- https://www.postgresql.org/docs/current/queries-with.html
- https://bitworks.software/en/2017-10-20-storing-trees-in-rdbms.html