<style>
.slides img {
filter: invert(90%);
}
</style>
# Effective SQL
CH3
<div style="margin-top: 10%; font-size: 60%;">
Presenter: yipo
2019-05-31
</div>
---
不能改變設計時…
<div style="margin-top: 6%;"><!-- .element: class="fragment" -->
## 把壞資料表弄成
## 好看的樣子
</div>
---
18
# `VIEW`
---
## 例一:消除重複資料
無法改變這個
↓
<div style="font-size: 28%;">
CustomerSales
| SalesID | CustFirstName | CustLastName | Address | City | Phone | PurchaseDate | ModelYear | Model | SalesPerson |
| -:| ------ | ------- | ----------------- | -------- | ------------ | ---------- | ---- | --------------------------- | ------------- |
| 1 | Amy | Bacock | 111 Dover Lane | Chicago | 312-222-1111 | 2016-02-14 | 2016 | Mercedes R231 | Mariam Castro |
| 2 | Tom | Frank | 7453 NE 20th St. | Bellevue | 425-888-9999 | 2016-03-15 | 2016 | Land Rover | Donald Ash |
| 3 | Debra | Smith | 3223 SE 12th Pl. | Seattle | 206-333-4444 | 2016-01-20 | 2016 | Toyota Camry | Bill Baker |
| 4 | Barney | Killjoy | 4655 Rainier Ave. | Auburn | 253-111-2222 | 2015-12-22 | 2016 | Subaru Outback | Bill Baker |
| 5 | Homer | Tyler | 1287 Grady Way | Renton | 425-777-8888 | 2015-11-10 | 2016 | Ford Mustang GT Convertible | Mariam Castro |
| 6 | Tom | Frank | 7435 NE 20th St. | Bellevue | 425-888-9999 | 2015-05-25 | 2015 | Cadillac CT6 Seda | Jessica Robin |
</div>
以 檢視表 呈現正規化後的樣子
---
```sql
CREATE VIEW vCustomers AS
SELECT DISTINCT cs.CustFirstName, cs.CustLastName,
cs.Address, cs.City, cs.Phone
FROM CustomerSales AS cs;
```
```sql
CREATE VIEW vAutomobileModels AS
SELECT DISTINCT cs.ModelYear, cs.Model
FROM CustomerSales AS cs;
```
```sql
CREATE VIEW vEmployees AS
SELECT DISTINCT cs.SalesPerson
FROM CustomerSales AS cs;
```
挑選欄位 並 `DISTINCT`,正規化為三個資料表。
---
<div style="font-size: 60%;">
vCustomers
| CustFirstName | CustLastName | Address | City | Phone |
| ------ | ------- | ----------------- | -------- | ------------ |
| Amy | Bacock | 111 Dover Lane | Chicago | 312-222-1111 |
| Barney | Killjoy | 4655 Rainier Ave. | Auburn | 253-111-2222 |
| Debra | Smith | 3223 SE 12th Pl. | Seattle | 206-333-4444 |
| Homer | Tyler | 1287 Grady Way | Renton | 425-777-8888 |
| Tom | Frank | 7435 NE 20th St. | Bellevue | 425-888-9999 |
| Tom | Frank | 7453 NE 20th St. | Bellevue | 425-888-9999 |
</div>
便容易發現錯誤,並在原資料表修正。
---
## 例二:消除重複群組
無法改變這個
↓
<div style="font-size: 40%;">
Assignments
| ID | DrawingNumber | Predecessor_1 | Predecessor_2 | Predecessor_3 | Predecessor_4 | Predecessor_5 |
| -:| ------------ | ---------- | ---------- | ---------- | ---------- | ---------- |
| 1 | LO542B2130 | LS01847409 | LS02390811 | LS02390813 | LS02390817 | LS02390819 |
| 2 | LO426C2133 | LS02388410 | LS02495236 | LS02495238 | LS02495241 | LS02640008 |
| 3 | LO329W2743-1 | LS02388418 | LS02640036 | LS02388418 | NULL | NULL |
| 4 | LO873W1842-2 | LS02388419 | LS02741454 | LS02741456 | LS02769388 | NULL |
| 5 | LO690W1960-1 | LS02742130 | NULL | NULL | NULL | NULL |
| 6 | LO217W1855-1 | LS02388421 | LS02769390 | NULL | NULL | NULL |
</div>
以 檢視表 呈現正規化後的樣子
---
```sql
CREATE VIEW vDrawings AS
SELECT a.ID AS DrawingID, a.DrawingNumber
FROM Assignments AS a;
```
```sql
CREATE VIEW vPredecessors AS
SELECT 1 AS PredecessorID, a.ID AS DrawingID,
a.Predecessor_1 AS Predecessor
FROM Assignments AS a
WHERE a.Predecessor_1 IS NOT NULL
UNION
…
UNION
SELECT 5, a.ID, a.Predecessor_5
FROM Assignments AS a
WHERE a.Predecessor_5 IS NOT NULL;
```
挑選重複欄位 並 `UNION`
---
## 檢視表 可以編輯嗎?
<div style="line-height: 1.5;">
- 可能無法
這裡用了 `DISTINCT` 和 `UNION`
- 部份系統提供 `INSTEAD OF` 觸發器
→ 能透過 檢視表 來更新 底層資料表
</div>
```sql
CREATE TRIGGER view_insert
INSTEAD OF INSERT ON my_view
FOR EACH ROW
EXECUTE FUNCTION view_insert_row();
```
<small>[PostgreSQL: CREATE TRIGGER](https://www.postgresql.org/docs/current/sql-createtrigger.html)</small>
---
## 其他使用場合
<div style="line-height: 2; font-size: 60%;">
- 專注特定的欄或列
- <!-- .element: class="fragment" --> 簡化或更名欄位
- <!-- .element: class="fragment" --> 組合多個資料表
- <!-- .element: class="fragment" --> 省去重複輸入複雜的指令 <span style="color: gray;">確保一致性 參數化檢視表 (??</span>
- <!-- .element: class="fragment" --> 遮蔽敏感資訊 <span style="color: gray;">搭配限制權限 (欄或列) `WITH CHECK OPTION`</span>
- <!-- .element: class="fragment" --> 提供相容性
- <!-- .element: class="fragment" --> 據權限或角色呈現不同資料
- <!-- .element: class="fragment" --> 彙整資料:`SUM()`、`AVERAGE()` 等
- <!-- .element: class="fragment" --> 匯入/匯出 串接其他程式
</div>
---
## 拿檢視表製作檢視表
有效能疑慮
<div style="line-height: 2; font-size: 60%;">
- 例子:以檢視表製作彙整資料,其中連接了兩張檢視表。
- 原理:解構檢視表語法成為等效語法
- 問題
* 對中間結果的欄位求值,恐浪費計算工作。
* 被濾掉的列造成無謂的 IO
- 最好直接參考原始資料表
</div>
---
```sql
CREATE VIEW vActiveCustomers AS
SELECT c.CustomerID, c.CustFirstName, c.CustLastName,
c.CustFirstName + ' ' + c.CustLastName AS CustFullName
FROM Customers AS c
WHERE EXISTS (
SELECT NULL
FROM Orders AS o
WHERE o.CustomerID = c.CustomerID
AND o.OrderDate > DATEADD(MONTH, -6, GETDATE()));
```
<small>近半年下單的客人</small>
```sql
CREATE VIEW vCustomerStatistics AS
SELECT o.CustomerID, COUNT(o.OrderNumber) AS OrderCount,
SUM(o.OrderTotal) AS GrandOrderTotal,
MAX(o.OrderDate) AS LastOrderDate
FROM Orders AS o
GROUP BY o.CustomerID;
```
<small>客人的統計資料</small>
---
```sql
CREATE VIEW vActiveCustomerStatistics AS
SELECT a.CustomerID, a.CustFirstName, a.CustLastName,
s.LastOrderDate, s.GrandOrderTotal
FROM vActiveCustomers AS a
INNER JOIN vCustomerStatistics AS s
ON a.CustomerID = s.CustomerID;
```
<small>近半年下單客人的統計資料 (但沒 OrderCount)</small>
---
```sql
SELECT c.CustomerID, c.CustFirstName, c.CustLastName,
s.LastOrderDate, s.GrandOrderTotal
FROM Customers AS c
INNER JOIN (
SELECT o.CustomerID,
SUM(o.OrderTotal) AS GrandOrderTotal,
MAX(o.OrderDate) AS LastOrderDate
FROM Orders AS o
GROUP BY o.CustomerID) AS s
ON c.CustomerID = s.CustomerID
WHERE EXISTS (
SELECT NULL FROM Orders AS o
WHERE o.CustomerID = c.CustomerID
AND o.OrderDate > DATEADD(MONTH, -6, GETDATE()));
```
<small>最佳化程序產生的結果</small>
---
19
# ETL
---
<div style="font-size: 150%;">
Extract, Transform, Load (ETL)
匯入外部資料的工具
</div>
---
![](https://i.imgur.com/yFp7dSy.png)
---
## Access 為例 (1/2)
<div style="line-height: 2; font-size: 80%;">
1. 從 試算表 匯入
2. 辨識首列可能為欄位名稱
3. 辨識欄位可能的型別
4. 忽略不重要的欄
</div>
<div style="height: 200px; overflow: hidden;">
![](https://i.imgur.com/yFp7dSy.png)
</div>
---
## Access 為例 (2/2)
<div style="line-height: 2; font-size: 80%;">
5. 挑選主鍵、是否 `AUTO INCREMENT` (或不要主鍵)
6. 命名資料表
7. 整合其他工具分析
8. 進行後續正規化 (Table Analyzer)
</div>
<div style="height: 200px; overflow: hidden;">
![](https://i.imgur.com/yFp7dSy.png)
</div>
---
<div style="font-size: 150%;">
大部份系統皆提供有類似工具
(或其他開源或付費工具)
</div>
---
20
# `SUMMARY`
---
## 目的?
<div style="line-height: 2;">
- 整合資料,呈現較易讀易懂的資訊。
- 預先計算,加快速度。
</div>
---
## 土炮做法
建立額外資料表來 匯總 (summary) 原始資料表
<div style="line-height: 2;">
1. 以 觸發器 在原資料更動時一併更新
→ 若時常異動要留意處理成本
2. 以 預存程序 定期全盤重新建立
→ 通常較好
</div>
---
## DB2 支援 匯總資料表
<div style="line-height: 2; font-size: 80%;">
- 來源可以是多個資料表
- 更新:自動 或 手動
- 效能佳、具最佳化選項
`ENABLE QUERY OPTIMIZATION`
(使用已匯總的資料??)
- 省去維護 觸發器 或 預存程序 的麻煩
</div>
---
## 語法
```sql
CREATE SUMMARY TABLE SalesSummary AS (
SELECT
…
)
```
<small>DB2</small>
```sql
CREATE MATERIALIZED VIEW SalesSummary
…
```
<small>Oracle</small>
---
## 語法
<div style="line-height: 2; font-size: 80%;">
- 類似 實質化檢視表 (加了 `GROUP BY`)
- 實質化查詢限制使用 `INNER JOIN` (??
- `COUNT(*)` 以使用 `REFRESH IMMEDIATE` (??
- Oracle:以 實質化檢視表 (`MATERIALIZED VIEW`) 達成
- MS SQL:對 檢視表 建構 索引 達成
- 系統各有不同,建議參考文件。
</div>
---
## 成本
<div style="line-height: 2; font-size: 80%;">
- 佔儲存空間
- 須維護、管理 (觸發器、預存程序、約束??)
- 根據使用情境設計需匯總的資料
- 不同的 分組 或 過濾 需多個 匯總資料表
- 可能需要排程更新
</div>
---
## 行內匯總
來避免維護、管理的成本
<div style="line-height: 2; font-size: 80%;">
- 在現有資料表加 匯總欄
- 以 `INSERT INTO` 陳述資料
- 非匯總欄設定為特定值
- 好處:匯總與細節可一併或分開查詢
- 壞處:查詢時留意區分匯總與細節
</div>
---
<div style="font-size: 80%;">
Orders
| OrderID | CustomerID | Amount | Price | Count | Total |
| -:| ----:| ----:| ----:| ----:| ----:|
| 1 | 506 | 5 | $10 | NULL | NULL |
| 2 | 506 | 1 | $80 | NULL | NULL |
| 3 | 506 | 2 | $60 | NULL | NULL |
| 4 | 506 | NULL | NULL | 3 | $250 |
</div>
想像中的樣子 (不確定…)
---
21
# 以 `UNION` 整理資料
---
## 舉例討論
![](https://i.imgur.com/j7XKm8R.png)
從 Excel 來的試算表
---
1. 先設法匯入
<div style="margin: 5%; font-size: 30%;">
SalesSummary
| Category | OctQuantity | OctSales | NovQuantity | NovSales | DecQuantity | DecSales | JanQuantity | JanSales | FebQuantity | FebSales |
|:----------- | ---:| -----------:| ---:| -----------:| ---:| -----------:| ----:| -----------:| --- | -----------:|
| Accessories | 930 | $61,165.40 | 923 | $60,883.03 | 987 | $62,758.14 | 1223 | $80,954.76 | 979 | $60,242.47 |
| Bikes | 413 | $536,590.50 | 412 | $546,657.00 | 332 | $439,831.50 | 542 | $705,733.50 | 450 | $585,130.50 |
| Car racks | 138 | $24,077.15 | 96 | $16,772.05 | 115 | $20,137.05 | 142 | $24,794.75 | 124 | $21,763.30 |
| Clothing | 145 | $5,903.20 | 141 | $5,149.96 | 139 | $4,937.74 | 153 | $5,042.62 | 136 | $5,913.98 |
| Components | 286 | $34,228.55 | 322 | $35,451.79 | 265 | $27,480.22 | 325 | $35,181.97 | 307 | $32,828.02 |
| Skateboards | 164 | $60,530.06 | 203 | $89,040.58 | 129 | $59,377.20 | 204 | $79,461.30 | 147 | $61,125.19 |
| Tires | 151 | $4,356.91 | 110 | $3,081.24 | 150 | $4,388.55 | 186 | $5,377.60 | 137 | $3,937.70 |
</div>
```sql
SELECT Category, OctQuantity, OctSales
FROM SalesSummary
```
不同月份資料要分開查詢
---
2. 直接以 `UNION` 將各月份查詢合併
```sql
SELECT Category, OctQuantity, OctSales
FROM SalesSummary
UNION
SELECT Category, NovQuantity, NovSales
FROM SalesSummary
UNION
…
```
<div style="font-size: 40%;">
| Category | OctQuantity | OctSales |
|:----------- | ---:| -----------:|
| Accessories | 930 | $61,165.40 |
| Accessories | 923 | $60,883.03 |
| Accessories | 987 | $62,758.14 |
| ... | ... | ... |
| Tires | 150 | $4,388.55 |
| Tires | 186 | $5,377.60 |
| Tires | 137 | $3,937.70 |
</div>
分不清是何月
---
3. ⑴ 於 `SELECT` 增加月份欄位 ⑵ 善用 `AS` 更名欄位
```sql
SELECT Category, 'Oct' AS SalesMonth,
OctQuantity AS Quantity, OctSales AS SalesAmt
FROM SalesSummary
UNION
SELECT Category, 'Nov', NovQuantity, NovSales
FROM SalesSummary
UNION
…
```
<div style="font-size: 40%;">
| Category | SalesMonth | Quantity | Sales |
|:----------- |:--- | ---:| -----------:|
| Accessories | Oct | 930 | $61,165.40 |
| Accessories | Nov | 923 | $60,883.03 |
| Accessories | Dec | 987 | $62,758.14 |
| ... | ... | ... | ... |
| Tires | Dec | 150 | $4,388.55 |
| Tires | Jan | 186 | $5,377.60 |
| Tires | Feb | 137 | $3,937.70 |
</div>
---
4. 在最後加個 `ORDER BY` 排序資料
```sql
SELECT Category, 'Oct' AS SalesMonth,
OctQuantity AS Quantity, OctSales AS SalesAmt
FROM SalesSummary
UNION
…
UNION
SELECT Category, 'Jan', JanQuantity, JanSales
FROM SalesSummary
UNION
SELECT Category, 'Feb', FebQuantity, FebSales
FROM SalesSummary
ORDER BY SalesMonth, Category;
```
---
## 規則
<div style="line-height: 2;">
- 欄位數目相同
- 欄位順序相同
- 每個欄位的型態要相容
(欄位的名稱沒差)
</div>
---
`UNION ALL` 不排除重複,
速度較快。
---
END;
{"metaMigratedAt":"2023-06-14T21:56:20.608Z","metaMigratedFrom":"YAML","breaks":"true","description":"","title":"Effective SQL (CH3)","showTags":"false","showTitle":"false","lang":"zh-TW","contributors":"[{\"id\":\"8128865c-d2fb-4723-b1e0-73e1fb0c2614\",\"add\":38336,\"del\":43287}]"}