惱人的 SQL - HackMD

# 惱人的 SQL ![](https://i.imgur.com/QGe2oee.jpg) # Select `select column_name from table_name` 語法： * `''` 表示 string 關鍵字： * distinct - 去除重複值：`select distinct column from table` * and, or, not, `!=, <>` - 邏輯運算子 * limit - 限制 # of rows return from the query，常和 order by 一起用 * order by - 搭配 DESC, ASC(預設) * 舉例：`select address, name, id from student order by name, address` * between - match a value against a range of values 有包含性 * 留意：'2007-02-014' 預設後面的毫秒千秒等等的是 0 * in - create a condition that checks to see if a value in included in a list of multiple options，也可以搭配 not * 舉例：`select column from table where columns in(property_1, property_2)` * like and unlike - using pattern matching. like 是 case-sensitive 可以用 ilike 去達成 insensitive. postgresql 也有支援 regex * like 搭配 % match any sequences of character * liek 搭配 _ match any single character * as - 可以增加可讀性，尤其是和 aggregate function 搭配。因為他在 query 最後才被執行，所以不能使用 alias inside a where operator or a having operator to filter by * union - combine the result-set of two or more select statements * `select column_name1 from table1 union select column_name2 from table2` 舉例： ![](https://i.imgur.com/Xmv4nTk.png) ![](https://i.imgur.com/qjIkoBN.png) ![](https://i.imgur.com/XkeHrWb.png) `select * from customer where first_name ilike '%er%'` 任何有 er 在 first_name 中出現的內建 function: * count - return the # of input rows that match the query * 舉例：`select count(distinct name) from actor` # Group by statement group by 的用途是在 aggreate data and apply functions 目的是讓我們更好地理解 aggregate functions: * idea:將多個 input 變成一個 output * 舉例：`AVG(), COUNT(), MIN(), MAX(), SUM()` * 使用在 select clause or having clause 舉例 1. AVG() - 回傳為 decimal, 可以透過 ROUND() 去 format ## Group by 用法：選定 categorical column to group by 底層運作：將表依據你選定的 categorical column 去分組，透過 aggregate function 將分組後的欄位去做相對應的運算 (take multiple values reduce them into one single value) ```sql select categorical_col, AGG(data_col) from table where categorical_col != 'A' -- 和 aggregate 完的結果無關 group by AGG(data_col) ``` 在 SQL 的 SELECT 語句中，如果你使用了聚合函數（如SUM、COUNT、AVG等），那麼在 SELECT 語句中出現的列（columns）就必須是聚合函數，或者是 GROUP BY 子句中列出的列。這是因為聚合函數是針對整個分組（group）進行計算的，而不是針對每個單獨的行進行計算的。如果你想要在 SELECT 語句中列出其他列，並且不是聚合函數或 GROUP BY 子句中列出的列，那麼就會報錯，因為這樣的列沒有定義計算的規則。 ```sql select customer_id, SUM(amount) from payment group by customer_id -- total sum amount per customer_id order by SUM(amount) desc select customer_id, count(amount) from payment group by customer_id -- transcation occurr per customer_id order by count(amount) desc select customer_id, staff_id, sum(amount) from payment group by customer_id, staff_id order by customer_id, staff_id -- the totam amount spend per staff per customer select date(payment_date), sum(amount) from payment group by date(payment_date) -- how much it process each day select round(avg(replacement_cost), 2), rating from film group by rating ``` 處理 group by date 要做轉換, 目的是捨去 timestamp 後面的時分秒 ## Having 用途： filter 已發生的 aggregation （可以想成 aggregate statement 後的 where) ```sql select customer_id, sum(amount) from payment where customer_id not in(184, 87, 477) group by customer_id having sum(amount) < 100 select store_id, count(customer_id) -- 語意較好 from customer group by store_id having count(customer_id) > 300 select customer_id, count(*) from payment group by customer_id having count(*) >= 40 select customer_id, sum(amount) from payment where staff_id = 2 group by customer_id having sum(amount) > 100 ``` # JOINS ![](https://i.imgur.com/nGivCAZ.png) [A Visual Explanation of SQL Joins ](https://blog.codinghorror.com/a-visual-explanation-of-sql-joins/) ## Inner Join An Inner Join will result the set of records that match in both tables. only grab the rows that happen to be in both table A and table B. ![](https://i.imgur.com/6EUE1Gw.png) 1. talbe order 不重要：從表 A 去 join 表 B，跟從表 B 去 join 表 A 得到的結果是一樣的 2. postgres join 預設為 inner join ## Outer Join 目的：how to deal with values only present in one of the table being joined 1. Full outer Join: 可替換 table 的順序，因為他是對稱的。 ![](https://i.imgur.com/L8lrJpu.png) ![](https://i.imgur.com/AcU93BM.png) > full outer join with where - get rows unique to either table(rows not found in both table) ![](https://i.imgur.com/rviKxIT.png) ```sql= select * from customer full outer join payment on customer.customer_id = payment.customer_id where customer.customer_id is null or payment.customer_id is null ``` 2. left outer join: order matters > left outer join with where - get row unique with left table ![](https://i.imgur.com/yZtXTOE.png) 3. right outer join: same as left join expect the tables are switched ![](https://i.imgur.com/ZUAdprt.png) ![](https://i.imgur.com/PNjldcE.png) # 資料庫對應關係 [「筆記」- 資料關聯的三種關係](https://medium.com/pierceshih/%E7%AD%86%E8%A8%98-%E8%B3%87%E6%96%99%E9%97%9C%E8%81%AF%E7%9A%84%E4%B8%89%E7%A8%AE%E9%97%9C%E4%BF%82-245152c093da) # SubQuery > perform a query on the results of another query. The syntax involves two select statements 1. subquery 括號裡面的會先執行，一執行完畢就回傳結果，sql 就執行外部的 query 2. 可以搭配 in operator in conjuction with a subquery to check against multiple results returned ```sql -- 可以透過 join 也可以用 subquery select student, score from test_scores where student in( select student from honer_roll_table ) ``` 3. The Exists operator is used to test for existence of rows in a subquery. Typically a subquery is passed in the Exists() to check if any rows are returned with the subquery. ```sql select column_name from table_name where exists (select column_name from table_name where condition ) ``` 如果 subquery 回傳的是 multiple value 你應該要使用 in operator ```sql select title, rental_rate from film where rental_rate > (select avg(rental_rate) from film) order by rental_rate desc -- grab film title between certain dates select film.film_id, film.title from film where film.film_id in ( select inventory.film_id from rental inner join inventory on inventory.inventory_id = rental.inventory_id where return_date between '2005-05-29' and '2005-05-30') -- find customers at least have one payment whose amount is greater than 11 select first_name, last_name from customer as c where not exists( select * from payment as p where p.customer_id = c.customer_id and amount > 11 ) ``` # self join ![](https://i.imgur.com/MLxhKrc.png) 名詞解釋：A self-join is a query in which a table is joined to itself 可以被看作兩個相同的表去做關聯。The table is not actually copied, but SQL performs the command as though it were. 用途：compare values in a column of rows within the same table 性質： 1. use alias for the table ow the table name would be ambiguous 用途： ![](https://i.imgur.com/kuboAC7.png) ```sql select emp.name, rep.name as rep from employees as emp join employees as rep on emp.emp_id = rep.report_id ``` 透過這個 JOIN 子句，您可以將 Employees 表格自己連接到自己，其中 emp_id 列被連接到 report_id 列。這表示，每個員工報告給另一個員工，而 emp_id 是報告人的 ID，而 report_id 是該報告人報告給的人的 ID。因此，您需要在 JOIN 子句中將 emp_id 列連接到 report_id 列，以便能夠正確匹配員工和他們的報告對象。如果您將 emp_id 列連接到 emp_id 列，那麼您將只會找到員工報告給自己，而不是他們實際上報告給的人。 ``` -- find all the pair of film that have the same length select f1.title, f2.title, f2.length from film as f1 inner join film as f2 on f1.film_id != f2.film_id and f1.length = f2.length ``` ## QA > What's difference between subquery and join ? 子查询是在一個查詢語句內嵌套另一個查詢語句，而 JOIN 是通過將兩個或多個表的行匹配來組合數據。子查询返回一個單獨的值或一個結果集，而 JOIN 返回由兩個或多個表的結合行組成的結果集。子查询可以在 SELECT、INSERT、UPDATE 和 DELETE 陳述中使用，而 JOIN 只能在 SELECT 陳述中使用。當你需要使用多個表或子集進行查詢時，可以考慮使用 JOIN。JOIN 提供了更靈活和有效的方式來組合表中的數據。通常情況下，當你需要選擇和組合多個表中的數據時，JOIN 更為適用。而子查询通常適用於需要使用特定條件過濾數據的情況，例如查找具有某些屬性的行，或者查找與特定值相關聯的行。當你需要在 WHERE 子句中使用特定條件來過濾數據時，子查询是更好的選擇。 # Other Functions ## Timestamps and Extract data type: 1. time: contains only time 2. date: contains only date 3. timestamp: contains date and time 4. timestamptz: contains date, time, and timezone functions and operations: 1. timezone 2. now 3. timeofday 4. current_time 5. current_date ```sql; show timezone select now() select timeofday() select current_time select current_date ``` 擷取資料常用函數 1. extract(): extract or obtain a sub-component of a date value, `EXTRACT(field FROM source)`，用於從日期/時間值中提取指定的元素，例如年、月、日、小时、分钟、秒等等。 * year: `extract(year from date_col)` * month * day * week * ouarter 3. age(): calculates and returns the current age given a timestamp * age(date_col) 回傳 2 years 1 mon 5 day 11.32.13.003423 5. to_char(): convert data type to text. 通常會用在 timestamp 的 format 上 6. dow(): 日期/時間函數，用於獲取指定日期的星期幾。`SELECT EXTRACT(DOW FROM TIMESTAMP '2001-02-16 20:38:40'); ` ```sql select extract(year from payment_date) as payment_year, extract(month from payment_date) as payment_month, age(payment_date), to_char(payment_date, 'YYYY/MM/DD') as payment_date from payment select count(*) as payment_day from payment where to_char(payment_date, 'ID') = '1' ``` # Mathamatical functions and operators Left: `LEFT(string, length)`，是一個選取字串左邊的 function ```sql select upper(first_name) || ' ' || lower(last_name) as full_name, length(first_name), lower(left(first_name, 1)) || last_name || '@gmail.com' as new_mail from customer ``` # HW [sql HW2](https://docs.google.com/document/d/1wiuYbTQslmfolQWgeVPB356csjK6yqOUBhgC7fM44o8/edit) ```sql -- Q01 select * from cd.facilities -- Q02 select distinct fc.name, fc.membercost from cd.facilities fc -- Q03 select distinct * from cd.facilities fc where fc.membercost > 0 -- Q04 select distinct fc.facid, fc.name, fc.membercost, fc.monthlymaintenance from cd.facilities fc where fc.membercost > 0 and fc.membercost < (fc.monthlymaintenance / 50.0) -- Q05 select * from cd.facilities fc where fc.name like '%Tennis%' -- Q06 使用IN運算子，可以在一個查詢中同時搜尋多個值，而不需要使用OR運算子。 select * from cd.facilities fc where fc.facid in (1, 5) -- Q07 也可以使用 date function ex: Date('2012-09-01') select memid, surname, firstname, joindate from cd.members where to_char(joindate, 'YYYY-MM-DD') >= '2012-09-01' -- Q08 select distinct surname from cd.members order by surname limit 10 -- Q09 也可以透過 MAX() MIN() 去解題 select mb.joindate from cd.members mb order by mb.joindate desc limit 1 -- Q10 select count(*) from cd.facilities fc where fc.guestcost > 10 -- Q11 per statement 也就是要用 group by select sum(slots), bk.facid from cd.bookings bk inner join cd.facilities fc on fc.facid = bk.facid where to_char(bk.starttime, 'YYYY-MM') = '2012-09' group by bk.facid order by sum(slots) desc -- Q12 select fc.facid, sum(bk.slots) from cd.bookings bk inner join cd.facilities fc on fc.facid = bk.facid group by fc.facid having sum(bk.slots) > 1000 order by fc.facid -- Q13 select cd.bookings.starttime as start, cd.facilities.name from cd.bookings inner join cd.facilities on cd.bookings.facid = facilities.facid where cd.facilities.name like '%Tennis Court%' and to_char(cd.bookings.starttime, 'YYYY-MM-DD') = '2012-09-21' order by cd.bookings.starttime -- Q14 select count(*) from cd.members inner join cd.bookings on cd.bookings.memid = cd.members.memid where firstname = 'David' and surname = 'Farrell' ``` # PK and FK ## PK * 名詞解釋: columns or a group of columns used to identify a row uniquely in a table * 用途：去 join 其他表 ## FK * 名詞解釋: field or a group of fields in a table that uniquely identify a row in other table * 用途：定義自己這張表如何跟其他表的 PK 去做參照 # Constraints * 名詞解釋：表格欄位定義的規則，避免無效的資料進到 DB * 用途：建立表或定義欄位時，可以透過 constraints 去設定為 PK 或建立 FK 跟其他表的關係 * table constraints & column constraints common constraints 1. not null 2. unique 3. check: ensures all values in the columns satisfy certain conditions 4. exclusion: 如果任意兩個 rows 跟特定欄位或特定運算子形成的表達式做比較，要使用它。not all of the comparisons will return true. table constraints * 名詞解釋: columns or a group of columns used to identify a row uniquely in a table * 用途：去 join 其他表 * 特質：unique & not null ![](https://i.imgur.com/ZBVeKUs.png) ![](https://i.imgur.com/jTqoJHS.png) 舉例：Each row for PK is uniquely ; constraints data based on columns on other table # Create DB and tables ## Create ```sql Create Table table_name( column_name Type column_constraint, column_name Type column_constraint, table_constraint, table_constraint ) INHERITS existing_table_name create table job( job_id serial primary key, job_name varchar(50) unique not null ) -- 使用中間表關聯兩張表 create table account_job( user_id integer references account(user_id), job_id integer references job(job_id), hire_date timestamp ) ``` serial - if the row is removed, the column with the serial data type will not adjust 作為表的主鍵，如果是要說跟其他表的關聯，就用 integer ```sql create table account( user_id serial primary key, name varchar(50) unique not null, password varchar(50) not null, email varchar(200) unique not null, created_on timestamp not null, last_login timestamp ) ``` ## Insert 目的：add a row into table the inserted row values must match up for the table, including constraints. serial columns don't need to be a provided a value. ```sql insert into table(column1, column2) values (value1, value2), (valu1, value2).... insert into table(column1, column2) select colum1, column2 from another table where condition ... ``` ```sql -- Example insert into account( name, password, email, created_on ) values ('Angela', 'password', 'angela@gmail.com', current_timestamp) insert into job( job_name ) values('president') insert into account_job( user_id, job_id, hire_date ) values(1, 1, current_timestamp) ``` ## Update 目的：allows us for the changing of values of the columns in the table ```sql update table set colum1 = value1 column2 = value2 where condition ``` ```sql -- update join update tableA set origin_col = tableB.new_col from tableB where tableA.id = tableB.id -- return affected rows update account set last_login = created_on returning account_id, last_login -- update join update account_job set hire_date = account.created_on from account where account.user_id = account_job.user_id ``` ## Delete 用途：remove rows from a table ```sql -- delete join delete from table_A using table_B where table_A.id = table_B.id -- delete all rows from a table delete from table ``` 可以搭配 returning 關鍵字，去得到我們移除的 rows 是什麼 ```sql insert into job(job_name) values('Doctor') delete from job where job_name = 'Doctor' returning job_id, job_name ``` ## Alter 用途：change for existing table structure 1. add, drop, or rename columns 2. chage a column's data type 3. set default value for a column 4. add check constraint 5. rename table 語法： ```sql alter table table_name action ``` ```sql -- add columns alter table table_name add column new_col type -- remove columns alter table table_name drop column col_name -- alter constraint alter table table_name alter column col_name set default value alter table table_name alter column col_name drop default alter table table_name alter column col_name set not null alter table table_name alter column col_name add constraint constraint_name ``` ```sql! create table information( info_id serial primary key, title varchar(500) not null, person varchar(50) not null unique ) -- rename table alter table information rename to new_information -- rename column alter table new_information rename person to people -- remove constraint for a column alter table new_information alter column people drop not null ``` ## Drop 用途：completely remove a column in a table 在 postgresql 裏面，會自動移除這個欄位相關的 constraints and indexes **However, it will not remove columns used in views, triggers, or stored procedures without the additional CASCADE clause** ```sql! -- general syntax alter table table_name drop column col_name -- remove all dependencies alter table table_name drop column col_name cascade -- check for existence for avoiding error alter table table_name drop column if exists col_name -- drop multiple columns alter table table_name drop column col_one, drop column col_two ``` NOTICE: column "people" of relation "new_information" does not exist, skipping ALTER TABLE ```sql! alter table new_information drop if exists people -- Query returned successfully in 45 msec. ``` ## Check 用途：create constraints that adhere to certain conditions. ```sql! -- ex create table example( id serial primary_key, age smallint check (age > 21), parent_age smallint check (parent_age > age) ) ``` # Conditional Expressions and Procedures 目的：增加邏輯和 workflow 到 sql 裡面去 ## Case The main difference is that the case expression syntax first evaluates an expression then compares the result with each value in the when clauses sequentially. ```sql! -- general case statement case when condition1 then result1 when condition2 then result2 else some_other_result as column_name end -- case expression case expression when condition1 then result1 when condition2 then result2 else some_other_result as column_name end ``` ```sql select customer_id, case when ( customer_id <= 100 ) then 'Premium' when ( customer_id between 100 and 200) then 'Plus' else 'Normal' end as customer_class from customer select customer_id, case customer_id when 2 then 'Winner' when 5 then 'Second Place' else 'Normal' end raffle_results from customer select sum( case rental_rate when 0.99 then 1 else 0 end ) as num_of_bargain_movies, sum( case rental_rate when 2.99 then 1 else 0 end )as num_of_regular_movies, sum( case rental_rate when 4.99 then 1 else 0 end )as num_of_premium_movies from film select sum( case rating when 'R' then 1 else 0 end ) as r, sum( case rating when 'PG' then 1 else 0 end ) as pg, sum( case rating when 'PG-13' then 1 else 0 end ) as pg13 from film ``` ## Coalesce 定義：Coalesce function accepts an unlimited number of arguments. It returns the first argument that is not null. 如果所有引述都為 null 就回傳 null `Coalesce(argument_1, argument_2, ...)` 通常這些引數為 col_name 用途：表如果有 null，又想要替換 null 為其他的值，例如 ` select item, (price - coalesce(discount, 0) as final_price from table` ## Cast 定義：cast operator let's you convert from one data type into another 並不是任意的資料型態都可以透過 cast 被轉成任意的資料型態，他必須要合理。 ```sql! select cast('5' as integer), -- cast function '3'::integer -- cast operator ``` 用途：select query with a column name instead of a single instance ```sql! select cast(date as timestamp) from table select char_length(inventory_id::varchar) from rental ``` ## NULLIF 定義：NULLIF function take two arguments, and returns null if both are equal ow it returns the first argument passed 用途：Null 會在某些 query 造成無法預期的結果或錯誤，這時 NULLIF 就派上用場了 ```sql! NULLIF(arg_1, arg_2) -- example create table department( first_name varchar(50), department varchar(50) ) insert into department( first_name, department ) values ('Tom', 'A'), ('Peter', 'A'), ('Nick', 'B'); select ( sum( case when department = 'A' then 1 else 0 end ) / sum ( case when department = 'B' then 1 else 0 end ) ) as depart_ratio -- 2 from department -- 如果 department = 'B' 裡面的人都離開了，這個 ratio 就會出錯 delete from department where department = 'B' -- 使用 nullif 的時機到了 select ( sum( case when department = 'A' then 1 else 0 end ) / nullif( sum ( case when department = 'B' then 1 else 0 end ), 0 ) ) as depart_ratio from department ``` ## Views 將常用到的 query 存起來 ![](https://i.imgur.com/vYf9Hor.png) 示意： ![](https://i.imgur.com/pgxcogR.png) 1. A view is a database object that is of a stored query 2. A view can be accessed as a virtual table in postgresql 3. A view simply stores query 4. 也可以針對既有的 view 去 update or alter ```sql! -- 語法 create view view_name as query create view customer_info as select first_name, last_name, address from customer inner join address on customer.address_id = address.address_id -- 要異動 view 裡面的內容 create or replace view customer_info as select first_name, last_name, address, district from customer inner join address on customer.address_id = address.address_id -- remove view drop view if exists customer_info -- change name of view alter view customer_info rename to c_info ``` ## Import and Export 用途：import data from a.csv file to an already existing table 檔案路徑要正確且檔案裡面的資料型態要可以被 sql 正確解讀，這樣 import 才到 db 才會正常 ![](https://i.imgur.com/A2jHj1E.png) [COPY](https://www.postgresql.org/docs/15/sql-copy.html) import command 並沒有幫你創建表，他假設使用前你已經自己建好了 ```sql! -- 先建立表 create table simple( a integer, b integer, c integer ) ``` [Can I automatically create a table in PostgreSQL from a csv file with headers?](https://stackoverflow.com/questions/21018256/can-i-automatically-create-a-table-in-postgresql-from-a-csv-file-with-headers) [How to import and export data using CSV files in PostgreSQL ](https://www.enterprisedb.com/postgres-tutorials/how-import-and-export-data-using-csv-files-postgresql) [How to import CSV file data into a PostgreSQL table](https://stackoverflow.com/questions/2987433/how-to-import-csv-file-data-into-a-postgresql-table) # 完食碎念時間好感人，終於完食了 SQL 初階課程。雖然很久以前也在 cocademy 上練習過 sql 但果然沒用到的東西，過不久就會通通忘記。因為需求又不得不回歸 sql 的懷抱，完食初級就要到了進階版的 advanced SQL。這週的課程很基礎，學習上沒有太大的壓力，也有發現一些有趣的小東西，跟有在專案裡看過的老朋友們。希望下週的進食，不會吃到想吐出來，大吼大叫表示：腦子你為什麼要學這麼慢 ORZ。 Q182 done