Databases and SQL for Data Science with Python WEEK_3 - Intermediate SQL

Databases and SQL for Data Science with Python WEEK_3 - Intermediate SQL ==== ###### tags: `IBM Data Engineering Professional Certificate`,`Reading Note`,`Coursera`,`Databases and SQL for Data Science with Python` ### Overview >* 介紹 WHERE 的LIKE， BETWEEN， OR 提取資料。 >* 介紹 ORDER BY 排序資料。 >* 介紹 GROUP BY 的AS， HAVING 為資料進行分類。 >* 介紹 Built in Function 的 SUM， MAX/MIN， AVG， ROUND，LENGTH，UCASE/LCASE 的功能。 >* 介紹 Date & Time 的資料類型以及時間的函數。 >* 介紹 Sub-Queries 和 Nested Selects 的資料提取功能。 >* 提取多個 tables 的資料。 >* 介紹 join 多個 tables。 ## Refining your Results ### 1. Using String Patterns and Ranges * Retrieving rows - using a String Pattern * WHERE...LIKE * 使用LIKE搜尋String Pattern ```SQL= SELECT <column_name> FROM <table_name> WHERE <column_name> LIKE <string_pattern> ``` * 使用LIKE的範例： ```SQL= SELECT firstname FROM Author WHERE firstname LIKE 'R%' ``` * Retrieving rows - using a Range * WHERE...BETWEEN * 使用rows的區間提取資料 ```SQL= SELECT <column_name> FROM <table_name> WHERE <column_name> BETWEEN <num> AND <num> ``` * 使用BETWEEN的範例： ```SQL= SELECT title, pages FROM Book WHERE pages BETWEEN 290 AND 300 ``` * Retrieving rows - using a Set of Values * WHERE...OR * 使用OR搜尋更多關鍵字 ```SQL= SELECT <column_name> FROM <table_name> WHERE <column_name> = 'keyword 1' OR <column_name> = 'keyword 2' ``` * 使用OR的範例： ```SQL= SELECT fisrtname, lastname, country FROM Author WHERE country = 'AU' OR country = 'BR' ``` ### 2. Sorting Result Sets * Using the ORDER BY clause * ORDER BY * 使用 ORDER BY 為 column 中的資料進行排序 ```SQL= SELECT <column_name> FROM <table_name> ORDER BY <column_name> ``` * ORDER BY clause - Descending Order * ORDER BY...DESC / ASC * 使用 ORDER BY 為 column 中的資料進行遞減或遞增排序 ```SQL= SELECT <column_name> FROM <table_name> ORDER BY <column_name> DESC ``` * Specifying Column Sequence Number * ORDER BY * 使用範例 ```SQL= SELECT title, pages FROM Book ORDER BY 2 ``` OUTPUT : ![](https://i.imgur.com/vPcgRQE.png =500x) ### 3. Grouping Result Sets * GROUP BY clause * GROUP BY * 使用 GROUP BY 為資料進行分類 ```SQL= SELECT <column_name> FROM <table> GROUP BY <column_name> ``` * 使用 GROUP BY 的範例 ```SQL= SELECT country, count(country) FROM Author GROUP BY country ``` OUTPUT : ![](https://i.imgur.com/ovoVQO2.png =300x) * AS * 在 SELECT 查詢可以使用 AS 關鍵字為查詢結果取一個別名 ```SQL= SELECT country, count(country) AS Count FROM Author GROUP BY country ``` OUTPUT : ![](https://i.imgur.com/cfAQwLH.png =300x) * HAVING * 在 SELECT 查詢中使用 HAVING 来指定 GROUP BY 的过滤条件 ```SQL= SELECT country, count(country) AS Count FROM Author GROUP BY country HAVING COUNT(country) > 4 ``` OUTPUT : ![](https://i.imgur.com/L67Wp8F.png =300x) ## Functions, Multiple Tables, and Sub-queries ### 1. Built-in Database Functions * Built-in Function * 提升資料的處理能力 * 減少提取資料的數量 * Aggregate or Column Functions * SUM Function: * 把整個Column的數值做相加 ```SQL= SUM(column_name) FROM <table_name> ``` * SUM的使用範例 ```SQL= SELCT SUM(COST) FROM PETRESCUE ``` OUTPUT : ![](https://i.imgur.com/T6u0hMs.png =100x) * 為輸出結果取別名 ```SQL= SELCT SUM(COST) AS SUM_OF_COST FROM PETRESCUE ``` OUTPUT : ![](https://i.imgur.com/G9f7O9h.png =120x) * MIN, MAX * MAX / MIN * 回傳在 Column 裡最大的數值 ```SQL= SELECT MAX(column_name) FROM <table_name> ``` * MIN 與 MAX 的 syntax 一樣的寫法 * 有條件搜尋 MAX / MIN 的值 * 加 WHERE ```SQL= SELECT MIN(column_name) FROM <table_name> WHERE <column_name> = 'keyword' ``` * Average * AVG * 計算column的平均值 ```SQL= SELECT AVG(column_name) FROM <table_name> ``` * 可以處理Column之間的數學運算 * 以計算平均一隻狗的價格為範例 ```SQL= SELECT AVG(COST / QUANTITY) FROM PETRESUE WHERE ANIMAL = 'Dog' ``` OUTPUT: ![](https://i.imgur.com/7OSoYh8.png =300x) * SCALAR and STRING FUNCTIONS * ROUND(X, D) * 回傳x舍入到最接近的整數。如果第D參數，則函數回傳x四捨五入至第D位小數點 ```SQL= SELECT ROUND(column_name, integer) FROM <table_name> ``` * LENGTH * 回傳字串的長度 ```SQL= SELECT LENGTH(column_name) FROM <table_name> ``` * LENGTH 可以直接匯入字串 ```SQL= SELECT LENGTH('string') ``` * UCASE, LCASE * UCASE / LCASE * 把字串轉為大或小寫 ```SQL= SELECT UCASE(column_name) FROM <table_name> ``` ### 2. Date and Time Built-in Functions * 多數的數據庫會提供時間資料類型 ![](https://i.imgur.com/NN5ZJqT.png) DATE: >Y : Year >M : Month >D : Day TIME >H : Hour >M : Minute >S : Seconds TIMESTAMP >Y : Year >X : Month >D : Day >H : Hour >M : Minute >S : Seconds >Z : Microseconds * 關於時間與日期的函數 * YEAR() * 回傳日期的年份 * MONTH() * 回傳日期的月份 * DAY() * 回傳日期的天 * DAYOFMONTH() * 回傳日期月份中的天，回傳值範圍會在0至31之間 * DAYOFWEEK() * 回傳日期的工作索引值，如：SUNDAY為1， MONDAY為2，SATURDAY為7 * DAYOFYEAR() * 回傳一年內的第幾天 * WEEK() * 回傳一年內的第幾週 * HOUR() * 回傳時間的小時部分，範圍為0至23 * MINUTE() * 回傳時間的分鐘部分，範圍為0至59 * SECOND * 回傳時間的秒，範圍為0至59 ### 3. Sub-Queries and Nested Selects * SQL 查詢語在另一個 SQL 查詢語句中 * 錯誤示範 ```SQL= SELECT * FROM employees WHERE salary > AVG(salary) ``` OUTPUT : ![](https://i.imgur.com/LY1mVmR.png =500x) * 無法輸出正常結果 * 使用 Sub-query 完成任務 ```SQL= SELECT EMP_ID, F_NAME, L_NAME, SALARY FROM employees WHERE SALARY < (SELECT AVG(SALARY) FROM employees); ``` OUTPUT: ![](https://i.imgur.com/cBUiLza.png =400x) * Sub-queries in list of columns * 為 Sub-queries 取別名，錯誤示範 ```SQL= SELECT EMP_ID, SALARY, AVG(SALARY) AS AVG_SALARY FROM employees; ``` * 正確的寫法 ```SQL= SELECT EMP_ID, SALARY, (SELECT AVG(SALARY) FROM employees) AS AVG_SALARY FROM employees; ``` OUTPUT: ![](https://i.imgur.com/8gepgQS.png =500x) * Sub-queries in FROM clause * Derived Tables / Table Expressions ```SQL= SELECT * FROM (SELECT EMP_ID, F_NAME, L_NAME, DEP_ID FROM emplyees) AS EMP4ALL; ``` OUTPUT : ![](https://i.imgur.com/nWrXtHP.png =500x) ### 4. Working with Multiple Tables #### 1. Accessing Multiple Tables with Sub-queries * 這部分有點複雜，所以會用範例的方式講解： ![](https://i.imgur.com/3Hvkmf0.png) * 提取 DEPARTMENTS table 與 EMPLOYEES table 有交集的資料： ```SQL= SELECT * FROM employees WHERE DEP_ID IN (SELECT DEPT_ID_DEP FROM departments); ``` * 提取 EMPLOYEES table 與 DEPARTMENTS table 的 location ID 有交集的資料 ```SQL= SELECT * FROM employees WHERE DEP_ID IN (SELECT DEPT_ID_DEP FROM departments WHERE LOC_ID = 'L0002'); ``` OUTPUT: ![](https://i.imgur.com/YMNkbVP.png =500x) * 提取 department ID 和 employees name, 薪資待遇有超過 70000 ```SQL= SELECT DEPT_ID_DEP, DEP_NAME FROM departments WHERE DEPT_ID_DEP IN (SELECT DEP_ID FRPM employees WHERE SALARY > 70000); ``` #### 2. Accessing multiple tables with Implicit Join * Full join / Cartesian join ```SQL= SELECT * FROM <table_name_1>, <table_name_2>; ``` * SQL 範例 ```SQL= SELECT * FROM employees, departments; ``` OUTPUT: ![](https://i.imgur.com/Tius2SM.png) * 提取部分資料 ```SQL= SELECT * FROM employees, departments WHERE employees.DEP_ID = departments.DEPT_ID_DEP; ``` * 簡寫版本 ```SQL= SELECT * FROM employees E, departments D WHERE E.DEP_ID = D.DEPT_ID_DEP; ``` OUTPUT: ![](https://i.imgur.com/TQYceGo.png) * 提取每位 employee 的 department name ```SQL= SELECT EMP_ID, DEP_NAME FROM employees E, departments D WHERE E.DEP_ID = D.DEPT_ID_DEP; ``` OUTPUT: ![](https://i.imgur.com/IKXCpC2.png =300x) ## Summary & Highlights * 課程完整整理的內容，所以把它記錄下來 * You can use the WHERE clause to refine your query results. * You can use the wildcard character (%) as a substitute for unknown characters in a pattern. * You can use BETWEEN ... AND ... to specify a range of numbers. * You can sort query results into ascending or descending order, using the ORDER BY clause to specify the column to sort on. * You can group query results by using the GROUP BY clause. * Most databases come with built-in functions that you can use in SQL statements to perform operations on data within the database itself. * When you work with large datasets, you may save time by using built-in functions rather than first retrieving the data into your application and then executing functions on the retrieved data. * You can use sub-queries to form more powerful queries than otherwise. * You can use a sub-select expression to evaluate some built-in aggregate functions like the average function. * Derived tables or table expressions are sub-queries where the outer query uses the results of the sub-query as a data source.

Syntax	Example	Reference
# Header	Header	基本排版
- Unordered List	Unordered List
1. Ordered List	Ordered List
- [ ] Todo List	Todo List
> Blockquote	Blockquote
Bold font	Bold font
Italics font	Italics font
~~Strikethrough~~	~~Strikethrough~~
19^th^	19^th
H~2~O	H₂O
++Inserted text++	Inserted text
==Marked text==	Marked text
[link text](https:// "title")	Link
![image alt](https:// "title")	Image
`Code`	`Code`	在筆記中貼入程式碼
```javascript var i = 0; ```	`var i = 0;`
:smile:		Emoji list
{%youtube youtube_id %}	Externals
$L^aT_eX$	L^aT_eX
:::info This is a alert area. :::	This is a alert area.