# CH08 日期與時間函數 ## 8.2 提取datetime局部數值EXTRACT, DATE, TIME ### EXTRACT 如果datetime是'2019-03-02 08:00:00' 使用EXTRACT可以提取出**日, 月, 年, 小時, 分鐘** EXTRACT(DAY FROM market_start_datetime) AS mktsrt_day EXTRACT函式用法 ```sql= EXTRACT() [ 時間單位 ] FROM datetime) ``` - DATE DATE(market_start_datetime) AS mkstrt_date -> 2019-03-02 - TIME TIME(market_start_datetime) AS mkstrt_time -> 08:00:00 ## 8.3 取得時間間隔的結束時間 ### 往後取得結束時間的DATE_ADD函數 ```sql= DATE_ADD(datetime, INTERVAL [增加量] [時間單位]) ``` 範例1 ```sql= DATE_ADD(market_start_datetime, INTERVAL 30 MINUTE) AS mktstrt_date_plus_30min ``` | market_start_datetime | mktstrt_date_plus_30min | | -------- | -------- | | 2019-03-02 08:00:00 | 2019-03-02 08:30:00 | 範例2 ```sql= DATE_ADD(market_start_datetime, INTERVAL 30 DAY) AS mktstrt_date_plus_30days ``` | market_start_datetime | mktstrt_date_plus_30days | | -------- | -------- | | 2019-03-02 08:00:00 | 2019-04-01 08:00:00 | ### 得到反向時間的DATE_SUB函數(與DATE_ADD加負數相同) ```sql= DATE_ADD(market_start_datetime, INTERVAL -30 DAY) AS mktstrt_date_plus_30days DATE_SUB(market_start_datetime, INTERVAL 30 DAY) AS mktstrt_date_plus_30days ``` --- ### 時區 從台北轉紐約 -> CONVERT_TZ ```sql= SELECT meeting_id CONVERT_TZ(meeting_time 'Asia/Taipei', 'America/New_York') AS meeting_time_ny FROM meeting ``` 利用DATE_ADD or DATE_SUB調整 ```sql= SELECT meeting_id DATE_ADD(meeting_time INTERVAL -12 HOUR) AS meeting_time_ny FROM meeting ``` --- ## 8.4 計算時間差異 DATEDIFF 用來計算兩個時間之間的間隔天數有多少 ```sql= SELECT x.first_market, x.last_market, DATEDIFF(x.first_market, x.last_market) AS days_first_to_last FROM ( SELECT min(market_start_datetime) first_market, max(market_start_datetime) last_market, FROM farmers_market.datetime_demo ) AS x ``` | first_market | last_market | days_first_to_last | | -------- | -------- | -------- | | 2019-03-02 08:00:00 | 2020-10-10 08:00:00 | 588 | ## 8.5 指定時間差異單位 TIMESTAMPDIFF ```sql= SELECT market_start_datetime, market_end_datetime, TIMESTAMPDIFF(HOUR, market_start_datetime, market_end_datetime) AS market_duration_hours, TIMESTAMPDIFF(MINUTE, market_start_datetime, market_end_datetime) AS market_duration_mins, FROM farmers_market.datetime_demo ``` | market_start_datetime | market_end_datetime | market_duration_hours | market_duration_mins | | -------- | -------- | -------- | -------- | | 2019-03-02 08:00:00 | 2019-03-02 14:00:00 | 6 | 360 | ```sql= TIMESTAMPDIFF( [ 時間單位 ], datetime_start, datetime_end) ``` ## 8.6 用聚合函數與窗口函數處理datetime資料 ### 找出顧客最早與最近的消費日期 - 步驟一 取得顧客購買資料的欄位細節 ```slq= SELECT customer_id, market_date FROM farmers_market.customer_purchases WHERE customer_id = 1 ``` - 步驟二 1. 先以GROUP BY以customer_id欄位分組 2. 以MIN, MAX取出market_date欄位最早與最近日期 3. 透過COUNT DISTINCT算出多少不同日期有消費 ```sql= MIN(market_date) AS first_purchase, MAX(market_date) AS last_purchase, COUNT(DISTINCT market_date) AS count_of_purchase_dates ``` ```slq= SELECT customer_id, market_date MIN(market_date) AS first_purchase, MAX(market_date) AS last_purchase, COUNT(DISTINCT market_date) AS count_of_purchase_dates FROM farmers_market.customer_purchases WHERE customer_id = 1 GROUP BY customer_id ``` | customer_id | first_purchase | last_purchase | count_of_purchase_dates | | -------- | -------- | -------- | -------- | | 1 | 2019-04-06 | 2020-10-10 | 107 | ### 計算最早與最近消費日期相隔幾天 ```sql= DATEDIFF(MAX(market_date), MIN(market_date)) AS days_between_first_last_purchase ``` ```slq= SELECT customer_id, market_date MIN(market_date) AS first_purchase, MAX(market_date) AS last_purchase, COUNT(DISTINCT market_date) AS count_of_purchase_dates DATEDIFF(MAX(market_date), MIN(market_date)) AS days_between_first_last_purchase FROM farmers_market.customer_purchases WHERE customer_id = 1 GROUP BY customer_id ``` | customer_id | first_purchase | last_purchase | count_of_purchase_dates | days_between_first_last_purchase | | -------- | -------- | -------- | -------- |-------- | | 1 | 2019-04-06 | 2020-10-10 | 107 |553 | * 撈出所有顧客 ### 最近一次消費離現在多少天 ```sql= DATEDIFF(CURDATE(), MAX(market_date)) AS days_since_last_purchase ``` ```slq= SELECT customer_id, market_date MIN(market_date) AS first_purchase, MAX(market_date) AS last_purchase, COUNT(DISTINCT market_date) AS count_of_purchase_dates DATEDIFF(MAX(market_date), MIN(market_date)) AS days_between_first_last_purchase DATEDIFF(CURDATE(), MAX(market_date)) AS days_since_last_purchase FROM farmers_market.customer_purchases WHERE customer_id = 1 GROUP BY customer_id ``` | customer_id | first_purchase | last_purchase | count_of_purchase_dates | count_of_purchase_dates | days_since_last_purchase | | -------- | -------- | -------- | -------- |-------- |-------- | | 1 | 2019-04-06 | 2020-10-10 | 107 |553 |957 | ### 將本次與下次消費日期並列,並算出每兩次消費的間隔天數 ```sql= RANK() OVER (PARTITION BY customer_id ORDER BY market_date) AS purchase_number, LEAD(market_date,1) OVER (PARTITION BY customer_id ORDER BY market_date) AS next_purchase ``` ```slq= SELECT customer_id, market_date RANK() OVER (PARTITION BY customer_id ORDER BY market_date) AS purchase_number, LEAD(market_date,1) OVER (PARTITION BY customer_id ORDER BY market_date) AS next_purchase FROM farmers_market.customer_purchases WHERE customer_id = 1 ``` ![截圖 2024-03-07 上午11.11.56](https://hackmd.io/_uploads/Byg0e3I6a.png) 如果有同一天購買兩次的紀錄,會出現相同的purchase_number -> 在子查詢中使用DISTINCT篩掉 ```sql= SELECT DISTINCT customer_id, market_date FROM farmers_market.customer_purchases WHERE customer_id = 1 ``` ```slq= SELECT x.customer_id, x.market_date, RANK() OVER (PARTITION BY customer_id ORDER BY market_date) AS purchase_number, LEAD(market_date,1) OVER (PARTITION BY customer_id ORDER BY x.market_date) AS next_purchase FROM ( SELECT DISTINCT customer_id, market_date FROM farmers_market.customer_purchases WHERE customer_id = 1 ) AS x ``` 將next_purchase欄位的LEAD函數包進DATEDIFF中,算出與market_date差值 ```slq= SELECT x.customer_id, x.market_date, RANK() OVER (PARTITION BY customer_id ORDER BY market_date) AS purchase_number, DATEDIFF(LEAD(market_date,1) OVER (PARTITION BY customer_id ORDER BY x.market_date), x.market_date) AS days_between_purchase FROM ( SELECT DISTINCT customer_id, market_date FROM farmers_market.customer_purchases WHERE customer_id = 1 ) AS x ``` ![截圖 2024-03-07 上午11.18.01](https://hackmd.io/_uploads/Sy8eMh86a.png) Q: 如果已經有next_purchase與market_date兩個欄位,何不把這兩個欄位直接放入DATEDIFF即可? A: 不行!因為**同層級的程式碼並非由上而下執行** DATEDIFF函數計算時,next_purchase還沒生成 --- ### 巢狀查詢作法 ```sql= SELECT a.customer_id, a.market_date AS first_purchase, a.next_purchase AS second_purchase, DATEDIFF(a.next_purchase, a.market_date) AS time_between_1st_2nd_purchase FROM ( SELECT x.customer_id, x.market_date, RANK() OVER (PARTITON BY x.customer_id ORDER BY x.market_date, 1) OVER (PARTITION BY x.customer_id ORDER BY x.market_date) AS next_purchase FROM ( SELECT DISTINCT customer_id, market_date FROM farmers_market.customer_purchases ) AS x ) AS a WHERE a.purchase_number = 1 ``` **巢狀查詢層數越多 -> 可讀性與執行效率越低** 因為每層查詢執行之前,都要先完成子查詢的結果,所以層數越多,資料量越大。佔用越多記憶體 -> 效率差 --- ### 激勵某一段時間很少消費的顧客回購 兩個步驟 1. 首先,找出十月份消費過的所有顧客 2. 從中挑出只在這月份中消費一次的顧客 用日期時間函數找出十月份有消費紀錄得顧客 ```slq= SELECT DISTINCT customer_id, market_date FROM farmers_market.customer_purchase WHERE DATEDIFF('2020-10-31', market_date) <= 30 ``` 用HAVING子句篩選只有消費過一次的顧客 ```sql= SELECT x.customer_id COUNT(DISTINCT x.market_date) AS market_count FROM ( SELECT DISTINCT customer_id, market_date FROM farmers_market.customer_purchase WHERE DATEDIFF('2020-10-31', market_date) <= 30 ) AS x GROUP BY x.customer_id HAVING COUNT(DISTINCT market_date) = 1 ``` ### Code Review ```bash= find app/models -name "*.rb" | while read file; do echo $file `git log --format=oneline -- $file | wc -l`; done | sort -k 2 -nr | head ``` ![截圖 2024-03-07 中午12.10.39](https://hackmd.io/_uploads/HyTHCnLT6.png)