<style> .slides img { filter: invert(90%); } </style> # Effective SQL CH3 <div style="margin-top: 10%; font-size: 60%;"> Presenter: yipo 2019-05-31 </div> --- 不能改變設計時… <div style="margin-top: 6%;"><!-- .element: class="fragment" --> ## 把壞資料表弄成 ## 好看的樣子 </div> --- 18 # `VIEW` --- ## 例一:消除重複資料 無法改變這個 ↓ <div style="font-size: 28%;"> CustomerSales | SalesID | CustFirstName | CustLastName | Address | City | Phone | PurchaseDate | ModelYear | Model | SalesPerson | | -:| ------ | ------- | ----------------- | -------- | ------------ | ---------- | ---- | --------------------------- | ------------- | | 1 | Amy | Bacock | 111 Dover Lane | Chicago | 312-222-1111 | 2016-02-14 | 2016 | Mercedes R231 | Mariam Castro | | 2 | Tom | Frank | 7453 NE 20th St. | Bellevue | 425-888-9999 | 2016-03-15 | 2016 | Land Rover | Donald Ash | | 3 | Debra | Smith | 3223 SE 12th Pl. | Seattle | 206-333-4444 | 2016-01-20 | 2016 | Toyota Camry | Bill Baker | | 4 | Barney | Killjoy | 4655 Rainier Ave. | Auburn | 253-111-2222 | 2015-12-22 | 2016 | Subaru Outback | Bill Baker | | 5 | Homer | Tyler | 1287 Grady Way | Renton | 425-777-8888 | 2015-11-10 | 2016 | Ford Mustang GT Convertible | Mariam Castro | | 6 | Tom | Frank | 7435 NE 20th St. | Bellevue | 425-888-9999 | 2015-05-25 | 2015 | Cadillac CT6 Seda | Jessica Robin | </div> 以 檢視表 呈現正規化後的樣子 --- ```sql CREATE VIEW vCustomers AS SELECT DISTINCT cs.CustFirstName, cs.CustLastName, cs.Address, cs.City, cs.Phone FROM CustomerSales AS cs; ``` ```sql CREATE VIEW vAutomobileModels AS SELECT DISTINCT cs.ModelYear, cs.Model FROM CustomerSales AS cs; ``` ```sql CREATE VIEW vEmployees AS SELECT DISTINCT cs.SalesPerson FROM CustomerSales AS cs; ``` 挑選欄位 並 `DISTINCT`,正規化為三個資料表。 --- <div style="font-size: 60%;"> vCustomers | CustFirstName | CustLastName | Address | City | Phone | | ------ | ------- | ----------------- | -------- | ------------ | | Amy | Bacock | 111 Dover Lane | Chicago | 312-222-1111 | | Barney | Killjoy | 4655 Rainier Ave. | Auburn | 253-111-2222 | | Debra | Smith | 3223 SE 12th Pl. | Seattle | 206-333-4444 | | Homer | Tyler | 1287 Grady Way | Renton | 425-777-8888 | | Tom | Frank | 7435 NE 20th St. | Bellevue | 425-888-9999 | | Tom | Frank | 7453 NE 20th St. | Bellevue | 425-888-9999 | </div> 便容易發現錯誤,並在原資料表修正。 --- ## 例二:消除重複群組 無法改變這個 ↓ <div style="font-size: 40%;"> Assignments | ID | DrawingNumber | Predecessor_1 | Predecessor_2 | Predecessor_3 | Predecessor_4 | Predecessor_5 | | -:| ------------ | ---------- | ---------- | ---------- | ---------- | ---------- | | 1 | LO542B2130 | LS01847409 | LS02390811 | LS02390813 | LS02390817 | LS02390819 | | 2 | LO426C2133 | LS02388410 | LS02495236 | LS02495238 | LS02495241 | LS02640008 | | 3 | LO329W2743-1 | LS02388418 | LS02640036 | LS02388418 | NULL | NULL | | 4 | LO873W1842-2 | LS02388419 | LS02741454 | LS02741456 | LS02769388 | NULL | | 5 | LO690W1960-1 | LS02742130 | NULL | NULL | NULL | NULL | | 6 | LO217W1855-1 | LS02388421 | LS02769390 | NULL | NULL | NULL | </div> 以 檢視表 呈現正規化後的樣子 --- ```sql CREATE VIEW vDrawings AS SELECT a.ID AS DrawingID, a.DrawingNumber FROM Assignments AS a; ``` ```sql CREATE VIEW vPredecessors AS SELECT 1 AS PredecessorID, a.ID AS DrawingID, a.Predecessor_1 AS Predecessor FROM Assignments AS a WHERE a.Predecessor_1 IS NOT NULL UNION … UNION SELECT 5, a.ID, a.Predecessor_5 FROM Assignments AS a WHERE a.Predecessor_5 IS NOT NULL; ``` 挑選重複欄位 並 `UNION` --- ## 檢視表 可以編輯嗎? <div style="line-height: 1.5;"> - 可能無法 這裡用了 `DISTINCT` 和 `UNION` - 部份系統提供 `INSTEAD OF` 觸發器 → 能透過 檢視表 來更新 底層資料表 </div> ```sql CREATE TRIGGER view_insert INSTEAD OF INSERT ON my_view FOR EACH ROW EXECUTE FUNCTION view_insert_row(); ``` <small>[PostgreSQL: CREATE TRIGGER](https://www.postgresql.org/docs/current/sql-createtrigger.html)</small> --- ## 其他使用場合 <div style="line-height: 2; font-size: 60%;"> - 專注特定的欄或列 - <!-- .element: class="fragment" --> 簡化或更名欄位 - <!-- .element: class="fragment" --> 組合多個資料表 - <!-- .element: class="fragment" --> 省去重複輸入複雜的指令 <span style="color: gray;">確保一致性 參數化檢視表 (??</span> - <!-- .element: class="fragment" --> 遮蔽敏感資訊 <span style="color: gray;">搭配限制權限 (欄或列) `WITH CHECK OPTION`</span> - <!-- .element: class="fragment" --> 提供相容性 - <!-- .element: class="fragment" --> 據權限或角色呈現不同資料 - <!-- .element: class="fragment" --> 彙整資料:`SUM()`、`AVERAGE()` 等 - <!-- .element: class="fragment" --> 匯入/匯出 串接其他程式 </div> --- ## 拿檢視表製作檢視表 有效能疑慮 <div style="line-height: 2; font-size: 60%;"> - 例子:以檢視表製作彙整資料,其中連接了兩張檢視表。 - 原理:解構檢視表語法成為等效語法 - 問題 * 對中間結果的欄位求值,恐浪費計算工作。 * 被濾掉的列造成無謂的 IO - 最好直接參考原始資料表 </div> --- ```sql CREATE VIEW vActiveCustomers AS SELECT c.CustomerID, c.CustFirstName, c.CustLastName, c.CustFirstName + ' ' + c.CustLastName AS CustFullName FROM Customers AS c WHERE EXISTS ( SELECT NULL FROM Orders AS o WHERE o.CustomerID = c.CustomerID AND o.OrderDate > DATEADD(MONTH, -6, GETDATE())); ``` <small>近半年下單的客人</small> ```sql CREATE VIEW vCustomerStatistics AS SELECT o.CustomerID, COUNT(o.OrderNumber) AS OrderCount, SUM(o.OrderTotal) AS GrandOrderTotal, MAX(o.OrderDate) AS LastOrderDate FROM Orders AS o GROUP BY o.CustomerID; ``` <small>客人的統計資料</small> --- ```sql CREATE VIEW vActiveCustomerStatistics AS SELECT a.CustomerID, a.CustFirstName, a.CustLastName, s.LastOrderDate, s.GrandOrderTotal FROM vActiveCustomers AS a INNER JOIN vCustomerStatistics AS s ON a.CustomerID = s.CustomerID; ``` <small>近半年下單客人的統計資料 (但沒 OrderCount)</small> --- ```sql SELECT c.CustomerID, c.CustFirstName, c.CustLastName, s.LastOrderDate, s.GrandOrderTotal FROM Customers AS c INNER JOIN ( SELECT o.CustomerID, SUM(o.OrderTotal) AS GrandOrderTotal, MAX(o.OrderDate) AS LastOrderDate FROM Orders AS o GROUP BY o.CustomerID) AS s ON c.CustomerID = s.CustomerID WHERE EXISTS ( SELECT NULL FROM Orders AS o WHERE o.CustomerID = c.CustomerID AND o.OrderDate > DATEADD(MONTH, -6, GETDATE())); ``` <small>最佳化程序產生的結果</small> --- 19 # ETL --- <div style="font-size: 150%;"> Extract, Transform, Load (ETL) 匯入外部資料的工具 </div> --- ![](https://i.imgur.com/yFp7dSy.png) --- ## Access 為例 (1/2) <div style="line-height: 2; font-size: 80%;"> 1. 從 試算表 匯入 2. 辨識首列可能為欄位名稱 3. 辨識欄位可能的型別 4. 忽略不重要的欄 </div> <div style="height: 200px; overflow: hidden;"> ![](https://i.imgur.com/yFp7dSy.png) </div> --- ## Access 為例 (2/2) <div style="line-height: 2; font-size: 80%;"> 5. 挑選主鍵、是否 `AUTO INCREMENT` (或不要主鍵) 6. 命名資料表 7. 整合其他工具分析 8. 進行後續正規化 (Table Analyzer) </div> <div style="height: 200px; overflow: hidden;"> ![](https://i.imgur.com/yFp7dSy.png) </div> --- <div style="font-size: 150%;"> 大部份系統皆提供有類似工具 (或其他開源或付費工具) </div> --- 20 # `SUMMARY` --- ## 目的? <div style="line-height: 2;"> - 整合資料,呈現較易讀易懂的資訊。 - 預先計算,加快速度。 </div> --- ## 土炮做法 建立額外資料表來 匯總 (summary) 原始資料表 <div style="line-height: 2;"> 1. 以 觸發器 在原資料更動時一併更新 → 若時常異動要留意處理成本 2. 以 預存程序 定期全盤重新建立 → 通常較好 </div> --- ## DB2 支援 匯總資料表 <div style="line-height: 2; font-size: 80%;"> - 來源可以是多個資料表 - 更新:自動 或 手動 - 效能佳、具最佳化選項 `ENABLE QUERY OPTIMIZATION` (使用已匯總的資料??) - 省去維護 觸發器 或 預存程序 的麻煩 </div> --- ## 語法 ```sql CREATE SUMMARY TABLE SalesSummary AS ( SELECT … ) ``` <small>DB2</small> ```sql CREATE MATERIALIZED VIEW SalesSummary … ``` <small>Oracle</small> --- ## 語法 <div style="line-height: 2; font-size: 80%;"> - 類似 實質化檢視表 (加了 `GROUP BY`) - 實質化查詢限制使用 `INNER JOIN` (?? - `COUNT(*)` 以使用 `REFRESH IMMEDIATE` (?? - Oracle:以 實質化檢視表 (`MATERIALIZED VIEW`) 達成 - MS SQL:對 檢視表 建構 索引 達成 - 系統各有不同,建議參考文件。 </div> --- ## 成本 <div style="line-height: 2; font-size: 80%;"> - 佔儲存空間 - 須維護、管理 (觸發器、預存程序、約束??) - 根據使用情境設計需匯總的資料 - 不同的 分組 或 過濾 需多個 匯總資料表 - 可能需要排程更新 </div> --- ## 行內匯總 來避免維護、管理的成本 <div style="line-height: 2; font-size: 80%;"> - 在現有資料表加 匯總欄 - 以 `INSERT INTO` 陳述資料 - 非匯總欄設定為特定值 - 好處:匯總與細節可一併或分開查詢 - 壞處:查詢時留意區分匯總與細節 </div> --- <div style="font-size: 80%;"> Orders | OrderID | CustomerID | Amount | Price | Count | Total | | -:| ----:| ----:| ----:| ----:| ----:| | 1 | 506 | 5 | $10 | NULL | NULL | | 2 | 506 | 1 | $80 | NULL | NULL | | 3 | 506 | 2 | $60 | NULL | NULL | | 4 | 506 | NULL | NULL | 3 | $250 | </div> 想像中的樣子 (不確定…) --- 21 # 以 `UNION` 整理資料 --- ## 舉例討論 ![](https://i.imgur.com/j7XKm8R.png) 從 Excel 來的試算表 --- 1. 先設法匯入 <div style="margin: 5%; font-size: 30%;"> SalesSummary | Category | OctQuantity | OctSales | NovQuantity | NovSales | DecQuantity | DecSales | JanQuantity | JanSales | FebQuantity | FebSales | |:----------- | ---:| -----------:| ---:| -----------:| ---:| -----------:| ----:| -----------:| --- | -----------:| | Accessories | 930 | $61,165.40 | 923 | $60,883.03 | 987 | $62,758.14 | 1223 | $80,954.76 | 979 | $60,242.47 | | Bikes | 413 | $536,590.50 | 412 | $546,657.00 | 332 | $439,831.50 | 542 | $705,733.50 | 450 | $585,130.50 | | Car racks | 138 | $24,077.15 | 96 | $16,772.05 | 115 | $20,137.05 | 142 | $24,794.75 | 124 | $21,763.30 | | Clothing | 145 | $5,903.20 | 141 | $5,149.96 | 139 | $4,937.74 | 153 | $5,042.62 | 136 | $5,913.98 | | Components | 286 | $34,228.55 | 322 | $35,451.79 | 265 | $27,480.22 | 325 | $35,181.97 | 307 | $32,828.02 | | Skateboards | 164 | $60,530.06 | 203 | $89,040.58 | 129 | $59,377.20 | 204 | $79,461.30 | 147 | $61,125.19 | | Tires | 151 | $4,356.91 | 110 | $3,081.24 | 150 | $4,388.55 | 186 | $5,377.60 | 137 | $3,937.70 | </div> ```sql SELECT Category, OctQuantity, OctSales FROM SalesSummary ``` 不同月份資料要分開查詢 --- 2. 直接以 `UNION` 將各月份查詢合併 ```sql SELECT Category, OctQuantity, OctSales FROM SalesSummary UNION SELECT Category, NovQuantity, NovSales FROM SalesSummary UNION … ``` <div style="font-size: 40%;"> | Category | OctQuantity | OctSales | |:----------- | ---:| -----------:| | Accessories | 930 | $61,165.40 | | Accessories | 923 | $60,883.03 | | Accessories | 987 | $62,758.14 | | ... | ... | ... | | Tires | 150 | $4,388.55 | | Tires | 186 | $5,377.60 | | Tires | 137 | $3,937.70 | </div> 分不清是何月 --- 3. ⑴ 於 `SELECT` 增加月份欄位 ⑵ 善用 `AS` 更名欄位 ```sql SELECT Category, 'Oct' AS SalesMonth, OctQuantity AS Quantity, OctSales AS SalesAmt FROM SalesSummary UNION SELECT Category, 'Nov', NovQuantity, NovSales FROM SalesSummary UNION … ``` <div style="font-size: 40%;"> | Category | SalesMonth | Quantity | Sales | |:----------- |:--- | ---:| -----------:| | Accessories | Oct | 930 | $61,165.40 | | Accessories | Nov | 923 | $60,883.03 | | Accessories | Dec | 987 | $62,758.14 | | ... | ... | ... | ... | | Tires | Dec | 150 | $4,388.55 | | Tires | Jan | 186 | $5,377.60 | | Tires | Feb | 137 | $3,937.70 | </div> --- 4. 在最後加個 `ORDER BY` 排序資料 ```sql SELECT Category, 'Oct' AS SalesMonth, OctQuantity AS Quantity, OctSales AS SalesAmt FROM SalesSummary UNION … UNION SELECT Category, 'Jan', JanQuantity, JanSales FROM SalesSummary UNION SELECT Category, 'Feb', FebQuantity, FebSales FROM SalesSummary ORDER BY SalesMonth, Category; ``` --- ## 規則 <div style="line-height: 2;"> - 欄位數目相同 - 欄位順序相同 - 每個欄位的型態要相容 (欄位的名稱沒差) </div> --- `UNION ALL` 不排除重複, 速度較快。 --- END;
{"metaMigratedAt":"2023-06-14T21:56:20.608Z","metaMigratedFrom":"YAML","breaks":true,"description":"","title":"Effective SQL (CH3)","showTags":"false","showTitle":"false","lang":"zh-TW","contributors":"[{\"id\":\"8128865c-d2fb-4723-b1e0-73e1fb0c2614\",\"add\":38336,\"del\":43287}]"}
    1105 views