### GROUP BY ```sql! SELECT [回傳的欄位] FROM [資料庫.表格] GROUP BY [進行分組的欄位名稱] HAVING [分組後的篩選條件] ORDER BY [排序的欄位名稱] ``` * WHERE是分組前設定的篩選條件 --- * GROUP BY和SELECT DISTINCT都是SQL中用於處理重複數據的語句 - SELECT DISTINCT用於選擇唯一的行。 - GROUP BY用於對行進行分組,並對每個分組進行聚合運算。 這兩者雖然都可以用來處理重複的數據,但SELECT DISTINCT更偏向於選擇唯一的數據行,而GROUP BY更適用於需要對數據進行分組並計算每個分組的情況。 --- ### GROUP BY搭配聚合函數(SUM, COUNT) #### COUNT ![截圖 2024-02-01 下午10.20.39](https://hackmd.io/_uploads/Skf6_Xt5T.png) 以5為例,在4/03出現兩次 ![截圖 2024-02-01 下午10.22.03](https://hackmd.io/_uploads/SkALYmY9a.png) 先分組,然後計算 #### SUM ![截圖 2024-02-01 下午10.27.47](https://hackmd.io/_uploads/HkzY57KqT.png) ![截圖 2024-02-01 下午10.28.27](https://hackmd.io/_uploads/HJCs5Qtqa.png) * 要確認quantity的單位為何 #### COUNT DISTINCT ![截圖 2024-02-01 下午10.34.01](https://hackmd.io/_uploads/H1IkhQY96.png) * 查出顧客在每個市集日期購買多少不同種的產品 COUNT跟SUM可以同時寫入來進行查詢 ### 聚合函數中放入算式 ```sql! SELECT market_date, customer_id, vendor_id, quantity * cost_to_customer_per_qty AS price FROM farmers_market.customer_purchases WHERE customer_id = 3 ORDER BY market_date, vendor_id ``` #### GROUP BY與聚合函數做分組運算 ```sql! SELECT customer_id, SUM(quantity * cost_to_customer_per_qty) AS total_spent FROM farmers_market.customer_purchases GROUP BY market_date ``` #### 連結多個表格分組與聚合 * 先將表格連結,然後才做分組聚合 - 先看到最小粒度 1. ![截圖 2024-02-01 下午10.53.10](https://hackmd.io/_uploads/rk_dxVF9T.png) 2. ![截圖 2024-02-01 下午10.56.39](https://hackmd.io/_uploads/H1fI-VKc6.png) ### MAX & MIN ![截圖 2024-02-01 下午11.00.56](https://hackmd.io/_uploads/rkN4G4K5a.png) 可以搭配GROUP BY來查詢各產品類別中的最高或最低 ### AVG ```sql! SELECT vendor_id, COUNT(DISTINCT product_id) AS different_products_offered, AVG(original_price) AS average_product_price FROM farmers_market.vendor_inventory WHERE market_date BETWEEN '2019-04-03' AND '2019-06-30' GROUP BY vendor_id ORDER BY vendor_id ``` - 計算方法有誤:單價/總數 #### HAVING - 資料分組後篩選資料,而且可以使用聚合函數 (HAVING的執行順序在GROUP BY之後,SELECT之前) ```sql! SELECT vendor_id, COUNT(DISTINCT product_id) AS different_products_offered, SUM(quantity * original_price) AS value_of_inventory, SUM(quantity) AS inventory_item_count, SUM(quantity * original_price) / SUM(quantity) AS average_item_price FROM farmers_market.vendor_inventory WHERE market_date BETWEEN '2019-04-03' AND '2019-06-30' GROUP BY vendor_id HAVING inventory_item_count >= 100 ORDER BY vendor_id ``` #### COUNT(*) > 1 ![截圖 2024-02-01 下午11.24.04](https://hackmd.io/_uploads/rJ7oDNYca.png) ```sql! SELECT customer_name, COUNT(*) AS orders_count FROM orders GROUP BY customer_name HAVING COUNT(*) > 1; ``` ![截圖 2024-02-01 下午11.24.26](https://hackmd.io/_uploads/BkUnPEKqa.png) ### 聚合函數中使用CASE語法 ![截圖 2024-02-01 下午11.30.57](https://hackmd.io/_uploads/SJaNK4Fcp.png) ```sql! SUM(CASE...WHEN...THEN...ELSE...END) AS 欄位別名 ``` ![截圖 2024-02-01 下午11.35.12](https://hackmd.io/_uploads/B1nVqVtca.png) 因為使用聚合函數,所以需要GROUP BY