正規化是一個重要,但是近年來較少討論,甚至被誤解的議題.
最近時常看到一些人,說到正規化,關聯式資料庫,就是以前磁碟貴,要節省空間與金錢,所以才這樣做.現在儲存科技進步,所以就不需要云云.
正規化主要是做一致性,去冗餘只是為了一致性的過程中,帶來的特性,並不是目的.
來一個簡化情況的影片網站.
首先建立兩個 schema, s1212a 與 s1212bak.
s1212a 是主要的schema, s1212bak 的用處稍後就能理解.
create schema s1212a;
create schema s1212bak;
create table s1212a.gal (
id int not null primary key
, gname text
);
create table s1212a.company (
id int not null primary key
, cname text unique
);
insert into s1212a.gal values
(1, '小島みなみ'), (2, '葵つかさ'), (3, 'あやみ旬果'),
(4, '鈴村あいり');
insert into s1212a.company values
(1, 'Alice Japan'), (2, 'S1'), (3, 'PRESTIGE');
create table s1212a.contract (
id int not null primary key
, gal_id int not null REFERENCES s1212a.gal(id)
ON DELETE CASCADE ON UPDATE CASCADE
, com_id int not null REFERENCES s1212a.company(id)
ON DELETE CASCADE ON UPDATE CASCADE
, unique (gal_id, com_id)
);
insert into s1212a.contract values
(1, 1, 1), (2, 1, 2), -- 小島みなみ
(3, 2, 1), (4, 2, 2), -- 葵つかさ
(5, 3, 3), (6, 3, 2), -- あやみ旬果
(7, 4, 3);
select c.id, g.gname, co.cname
from s1212a.contract c
join s1212a.gal g
on c.gal_id = g.id
join s1212a.company co
on c.com_id = co.id;
id | gname | cname
----+------------+-------------
1 | 小島みなみ | Alice Japan
2 | 小島みなみ | S1
3 | 葵つかさ | Alice Japan
4 | 葵つかさ | S1
5 | あやみ旬果 | PRESTIGE
6 | あやみ旬果 | S1
7 | 鈴村あいり | PRESTIGE
(7 rows)
create table s1212a.videos (
id int not null primary key
, cid int not null REFERENCES s1212a.contract(id)
ON DELETE CASCADE ON UPDATE CASCADE
, vid text not null
);
insert into s1212a.videos values
(1, 1, 'DV-1279'), (2, 2, 'SNIS-133'), (3, 2, 'SSIS-911'), -- 小島みなみ
(4, 3, 'DV-1195'), (5, 4, 'SNIS–436'), (6, 4, 'SSIS-974'), -- 葵つかさ
(7, 5, 'EDD-209'), (8, 5, 'ABP-356'), (9, 6, 'SSNI-203'), -- あやみ旬果
(10, 7, 'EDD-218'), (11, 7, 'ABF-061'); -- 鈴村あいり
--
create view s1212a.current_vidoes as
select g.gname, co.cname
, array_agg(v.vid) as vids
from s1212a.contract c
join s1212a.gal g
on c.gal_id = g.id
join s1212a.company co
on c.com_id = co.id
join s1212a.videos v
on c.id = v.cid
group by g.gname, co.cname;
select *
from s1212a.current_vidoes;
gname | cname | vids
------------+-------------+---------------------
葵つかさ | Alice Japan | {DV-1195}
葵つかさ | S1 | {SNIS–436,SSIS-974}
あやみ旬果 | PRESTIGE | {EDD-209,ABP-356}
あやみ旬果 | S1 | {SSNI-203}
小島みなみ | Alice Japan | {DV-1279}
小島みなみ | S1 | {SSIS-911,SNIS-133}
鈴村あいり | PRESTIGE | {EDD-218,ABF-061}
(7 rows)
最後建立了一個view.方便我們取得資料.
在上面我們建立 REFERENCES 時,有宣告 ON DELETE CASCADE ON UPDATE CASCADE
我們以 葵つかさ 為例,來看在 gal 與 contract 的 id
select g.gname, g.id, c.gal_id
from s1212a.gal g
join s1212a.contract c
on c.gal_id = g.id
where g.gname = '葵つかさ';
gname | id | gal_id
----------+----+--------
葵つかさ | 2 | 2
葵つかさ | 2 | 2
(2 rows)
update s1212a.gal
set id = -1
where gname = '葵つかさ';
select g.gname, g.id, c.gal_id
from s1212a.gal g
join s1212a.contract c
on c.gal_id = g.id
where g.gname = '葵つかさ';
gname | id | gal_id
----------+----+--------
葵つかさ | -1 | -1
葵つかさ | -1 | -1
(2 rows)
-- s1212a.contract 的 gal_id 也跟著更新了!
-- view 的內容也跟上面相同(多table join 結果也維持一致性)
select *
from s1212a.current_vidoes;
gname | cname | vids
------------+-------------+---------------------
葵つかさ | Alice Japan | {DV-1195}
葵つかさ | S1 | {SNIS–436,SSIS-974}
あやみ旬果 | PRESTIGE | {EDD-209,ABP-356}
あやみ旬果 | S1 | {SSNI-203}
小島みなみ | Alice Japan | {DV-1279}
小島みなみ | S1 | {SSIS-911,SNIS-133}
鈴村あいり | PRESTIGE | {EDD-218,ABF-061}
(7 rows)
-- create mirror tables
-- 使用 create table like
create table s1212bak.gal (like s1212a.gal including all);
create table s1212bak.company (like s1212a.company including all);
create table s1212bak.contract (like s1212a.contract including all);
create table s1212bak.videos (like s1212a.videos including all);
這樣就會在 schmea 下建立對應的mirror table,而且會連pk,index 都建立好.
但是 references 是沒有建立的,可以讓我們視需要再行建立.
create or replace function s1212a.f_gal_del()
returns trigger
language plpgsql as
$code$
begin
insert into s1212bak.gal
select old.*;
return old;
end;
$code$;
create or replace function s1212a.f_company_del()
returns trigger
language plpgsql as
$code$
begin
insert into s1212bak.company
select old.*;
return old;
end;
$code$;
create or replace function s1212a.f_contract_del()
returns trigger
language plpgsql as
$code$
begin
insert into s1212bak.contract
select old.*;
return old;
end;
$code$;
create or replace function s1212a.f_videos_del()
returns trigger
language plpgsql as
$code$
begin
insert into s1212bak.videos
select old.*;
return old;
end;
$code$;
create trigger tri_bd_gal
before delete on s1212a.gal
for each row execute function s1212a.f_gal_del();
create trigger tri_bd_company
before delete on s1212a.company
for each row execute function s1212a.f_company_del();
create trigger tri_bd_contract
before delete on s1212a.contract
for each row execute function s1212a.f_contract_del();
create trigger tri_bd_videos
before delete on s1212a.videos
for each row execute function s1212a.f_videos_del();
select *
from s1212a.company;
id | cname
----+-------------
1 | Alice Japan
2 | S1
3 | PRESTIGE
(3 rows)
delete
from s1212a.company
where cname = 'PRESTIGE';
commit; -- 注意到這個 commit.
select *
from s1212a.current_vidoes;
gname | cname | vids
------------+-------------+---------------------
葵つかさ | Alice Japan | {DV-1195}
葵つかさ | S1 | {SNIS–436,SSIS-974}
あやみ旬果 | S1 | {SSNI-203}
小島みなみ | Alice Japan | {DV-1279}
小島みなみ | S1 | {SNIS-133,SSIS-911}
(5 rows)
select *
from s1212bak.company;
id | cname
----+----------
3 | PRESTIGE
(1 row)
select *
from s1212bak.contract;
id | gal_id | com_id
----+--------+--------
5 | 3 | 3
7 | 4 | 3
(2 rows)
select *
from s1212bak.videos;
id | cid | vid
----+-----+---------
7 | 5 | EDD-209
8 | 5 | ABP-356
10 | 7 | EDD-218
11 | 7 | ABF-061
(4 rows)
select g.gname, co.cname
, array_agg(v.vid) as vids
from s1212bak.contract c
join s1212a.gal g
on c.gal_id = g.id
join s1212bak.company co
on c.com_id = co.id
join s1212bak.videos v
on c.id = v.cid
group by g.gname, co.cname;
gname | cname | vids
------------+----------+-------------------
あやみ旬果 | PRESTIGE | {EDD-209,ABP-356}
鈴村あいり | PRESTIGE | {EDD-218,ABF-061}
(2 rows)
可以看到只有使用一個簡單的delete!
關聯資料都已經順利刪除,並且在s1212bak下的mirror table 也都對應存起來了.
-- 復原的方式
with t1 as (
delete
from s1212bak.company
where cname = 'PRESTIGE'
returning *
)
insert into s1212a.company
select *
from t1;
with t1 as (
delete
from s1212bak.contract c
using s1212a.company co
where co.cname = 'PRESTIGE'
and c.com_id = co.id
returning c.*
)
insert into s1212a.contract
select *
from t1;
-- 觀察現在s1212a下的 company, contract
select co.*, '|'
, c.*
from s1212a.company co
join s1212a.contract c
on c.com_id = co.id
where co.cname = 'PRESTIGE';
id | cname | ?column? | id | gal_id | com_id
----+----------+----------+----+--------+--------
3 | PRESTIGE | | | 5 | 3 | 3
3 | PRESTIGE | | | 7 | 4 | 3
(2 rows)
-- videos
with t1 as (
delete
from s1212bak.videos v
using s1212a.contract c
, s1212a.company co
where co.cname = 'PRESTIGE'
and c.com_id = co.id
returning v.*
)
insert into s1212a.videos
select *
from t1;
select *
from s1212a.current_vidoes;
gname | cname | vids
------------+-------------+---------------------
葵つかさ | Alice Japan | {DV-1195}
葵つかさ | S1 | {SNIS–436,SSIS-974}
あやみ旬果 | PRESTIGE | {ABP-356,EDD-209}
あやみ旬果 | S1 | {SSNI-203}
小島みなみ | Alice Japan | {DV-1279}
小島みなみ | S1 | {SSIS-911,SNIS-133}
鈴村あいり | PRESTIGE | {EDD-218,ABF-061}
(7 rows)
-- 資料都回來了
-- 檢查 s1212bak schema 下的資料
select *
from s1212bak.videos;
id | cid | vid
----+-----+-----
(0 rows)
使用delete 搭配join的方式,再做insert.先做最上級,然後逐步搭配已經復原的資料,就能復原了!
是的! 不使用 cascade 的方式,一樣可以透過join, 先刪除最下級資料,
逐步往上,與復原那段類似但反向.
但是,select大家常下,delete其實是相對較少的,而且有時候下架時機是有壓力的,甚至有時候會有一些商務法務上的需求.可以,但是有時候就出包,往往還是大包.
為了減少失誤,提高效率與正確性,現成的機制,為何不善用?
這時候需要與片商協商,目前 已經接獲 S1 同意,先下架 あやみ旬果 與 S1 合作之影片.
delete
from s1212a.contract c
using s1212a.gal g
, s1212a.company co
where c.gal_id = g.id
and c.com_id = co.id
and g.gname = 'あやみ旬果'
and co.cname = 'S1';
select *
from s1212a.current_vidoes;
gname | cname | vids
------------+-------------+---------------------
葵つかさ | Alice Japan | {DV-1195}
葵つかさ | S1 | {SSIS-974,SNIS–436}
あやみ旬果 | PRESTIGE | {ABP-356,EDD-209}
小島みなみ | Alice Japan | {DV-1279}
小島みなみ | S1 | {SSIS-911,SNIS-133}
鈴村あいり | PRESTIGE | {EDD-218,ABF-061}
(6 rows)
-- 可以看到 あやみ旬果 與 S1 的影片都下架了!
-- 檢查 s1212bak schema 下的資料
select *
from s1212bak.contract;
id | gal_id | com_id
----+--------+--------
6 | 3 | 2
(1 row)
select *
from s1212bak.videos;
id | cid | vid
----+-----+----------
9 | 6 | SSNI-203
(1 row)
rollback;
-- 因為我是把 autocommit off, 所以 rollback 就都復原了!
select *
from s1212bak.videos;
id | cid | vid
----+-----+-----
(0 rows)
select *
from s1212a.current_vidoes;
gname | cname | vids
------------+-------------+---------------------
葵つかさ | Alice Japan | {DV-1195}
葵つかさ | S1 | {SNIS–436,SSIS-974}
あやみ旬果 | PRESTIGE | {ABP-356,EDD-209}
あやみ旬果 | S1 | {SSNI-203}
小島みなみ | Alice Japan | {DV-1279}
小島みなみ | S1 | {SSIS-911,SNIS-133}
鈴村あいり | PRESTIGE | {EDD-218,ABF-061}
(7 rows)
一樣是一道delete 搞定!
這時候有兩種作法, 一種是直接將 gal 裡面 葵つかさ 刪除,
另一種是將 contract 裡面與 葵つかさ 關聯的 刪除.
第二種方法與上面 あやみ旬果 下架的方式類似,而且還不用判斷company部分.只跟gal join.第一種方法更簡單.
至於使用哪種方式,要看我們實際需求,一般來說,我會採用第二種,保留av女優的名字在系統裡.因為有共演片,至於共演片是否下架,這是另外議題,假設共演片先不下架,這樣就能關聯到已下架影片女優的名字了.當然這是比現在範例複雜許多的情境.
delete
from s1212a.gal
where gname = '葵つかさ';
-- 觀察這時候的 s1212bak schema 下的
select g.gname, co.cname
, array_agg(v.vid) as vids
from s1212bak.contract c
join s1212bak.gal g
on c.gal_id = g.id
join s1212a.company co
on c.com_id = co.id
join s1212bak.videos v
on c.id = v.cid
group by g.gname, co.cname;
gname | cname | vids
----------+-------------+---------------------
葵つかさ | Alice Japan | {DV-1195}
葵つかさ | S1 | {SNIS–436,SSIS-974}
(2 rows)
-- 觀察 current_videos
select *
from s1212a.current_vidoes;
gname | cname | vids
------------+-------------+---------------------
あやみ旬果 | PRESTIGE | {ABP-356,EDD-209}
あやみ旬果 | S1 | {SSNI-203}
小島みなみ | Alice Japan | {DV-1279}
小島みなみ | S1 | {SSIS-911,SNIS-133}
鈴村あいり | PRESTIGE | {EDD-218,ABF-061}
(5 rows)
-- 已經符合需求了,輕鬆簡單
rollback;
-- rollback 就全部回來了.
正規化是很重要,且需要我們深入理解與實踐進而能夠更有效的使用.
or
By clicking below, you agree to our terms of service.
New to HackMD? Sign up