# UW CSE 414 Database System ## Resource * [Course Website](https://sites.google.com/cs.washington.edu/cse414-24wi) * [Syllabus](https://docs.google.com/document/d/1sBZX7QOy7wmu1seclG5O5VZBO4F3ZbTdt0tlrA6gH6A/edit?pli=1) ## 2024/1/3 Leture ### Course materials * [slides](https://drive.google.com/file/d/1WsBHwExOcfSgrcosHQ4YD03L4exsRiJV/view) * reading: 2.1 ~ 3 ### Database Management System (DBMSs) **Definition of DBMSs:** “A big program written by someone else that allows us to manage efficiently a large database and allows it to persist over long periods of time” **Examples of DMBSs** * Oracle, IBM DB2, Microsoft SQL Server, Vertica, Teradata * Open source: MySQL (Sun/Oracle), PostgreSQL, CouchDB * Open source library: SQLite ### 3 parts of a Data Model * Instance: the actual data * Schema: a description of what data is being stored * Query Language: how to retrieve and manipulate data | Schema 1 | Schema 2 | Schema 3 | | -------- | -------- | -------- | | Instance 1 | Instance 2 | Instance 3 | ## HW1 ### Resource * [When to use SQLite and when not to use it.](https://www.sqlite.org/whentouse.html) * A [simple guide](https://www.pantz.org/software/sqlite/sqlite_commands_and_general_usage) for commonly used command-line functions in SQLite. * [More information](https://www.sqlite.org/cli.html) on formatting output in SQLite. * [An index](https://www.sqlite.org/lang.html) of more detailed information for SQL commands in SQLite. * A [SQL style guide](https://www.sqlstyle.guide) in case you are interested (FYI only). ### basic command ```sql .help: To view help contents .tables: To view a list of all your tables .exit: To exit ``` #### Create a file to store the data ```sql sqlite3 HW1.db // type in the terminal ``` #### Create a table * Write the SQL statement to create a table Edges(Source, Destination) where both Source and Destination are integers. ```sql CREATE TABLE Edges(Source INT, Destination INT); ``` * Create a table called MyRestaurants with the following attributes (you can pick your own names for the attributes, just make sure it is clear which one is for which): a. Name of the restaurant: a varchar field b. Type of food they make: a varchar field c. Distance (in minutes) from your house: an int d. Date of your last visit: a varchar field, interpreted as date e. Whether you like it or not: an int, interpreted as a Boolean ```sql CREATE TABLE MyRestaurants(Name VARCHAR(20), FoodType VARCHAR(20), Distance INT, Date VARCHAR(20), Feedback INT); ``` #### Insert tuples * insert into Edges ```sql INSERT INTO Edges VALUES(10, 5); INSERT INTO Edges VALUES(6, 25); INSERT INTO Edges VALUES(1, 3); INSERT INTO Edges VALUES(4, 4); INSERT INTO Edges VALUES('-1', '2000'); // Type Affinity ``` * Insert into MyRestaurants ```sql INSERT INTO MyRestaurants VALUES('Thai Tom', 'Thai', 15, '2024-01-04', 1); INSERT INTO MyRestaurants VALUES('Green Island', 'Chinese', 25, '2024-01-02', 0); INSERT INTO MyRestaurants VALUES('DTF', 'Taiwanese', 23, '2024-01-01', 1); INSERT INTO MyRestaurants VALUES('Poke', 'Japanese', 10, '2024-01-06', 0); INSERT INTO MyRestaurants VALUES('Local Point', 'American', 5, '2024-01-07', NULL); ``` #### Print out the table * Print out Edges ```sql SELECT * FROM Edges; // print out the whole table 10|5 6|25 1|3 4|4 SELECT Source FROM Edges; // print out the specific column 10 6 1 4 SELECT * FROM Edges WHERE Source > Destination; // condition 10|5 ``` * Print out MyRestaurants: write the code (command and query) to return the results in these three formats: 1. print the results in comma-separated form 2. print the results in list form, delimited by | (vertical line character) 3. print the results in column form where every column has width >= 15. Every column must have this width, not just the first one. 4. Write a SQL query that returns only the name and distance of all restaurants within and including 20 minutes of your house. The query should list the restaurants alphabetically by name. 5. Write a SQL query that returns all attributes of all restaurants that you like, but have not visited for more than 3 months ago (not including exactly 3 months ago). Make sure that you use the date() function to calculate the date 3 months ago. ```sql 1. .headers on .separator ',' SELECT * FROM MyRestaurants; Name,FoodType,Distance,Date,Feedback Thai Tom,Thai,15,2024-01-04,1 Green Island,Chinese,25,2024-01-02,0 DTF,Taiwanese,23,2024-01-01,1 Poke,Japanese,10,2024-01-06,0 Local Point,American,5,2024-01-07, 2. .mode list SELECT * FROM MyRestaurants; Name|FoodType|Distance|Date|Feedback Thai Tom|Thai|15|2024-01-04|1 Green Island|Chinese|25|2024-01-02|0 DTF|Taiwanese|23|2024-01-01|1 Poke|Japanese|10|2024-01-06|0 Local Point|American|5|2024-01-07| 3. .mode column .width 15 15 15 15 15 SELECT * FROM MyRestaurants; Name FoodType Distance Date Feedback --------------- --------------- --------------- --------------- --------------- Thai Tom Thai 15 2024-01-04 1 Green Island Chinese 25 2024-01-02 0 DTF Taiwanese 23 2024-01-01 1 Poke Japanese 10 2024-01-06 0 Local Point American 5 2024-01-07 4. SELECT Name, Distance FROM MyRestaurants WHERE Distance >= 20 ORDER BY Name ASC; DTF 23 Green Island 25 5. SELECT * FROM MyRestaurants WHERE Date > DATE('now', '-3 month') AND Feedback = 1; Thai Tom Thai 15 2024-01-04 1 DTF Taiwanese 23 2024-01-01 1 ``` #### Delete ```sql DELETE FROM MyRestaurants WHERE Distance > 10; SELECT * FROM MyRestaurants; Poke Japanese 10 2024-01-06 0 Local Point American 5 2024-01-07 ``` #### Update the table ```sql UPDATE MyRestaurants SET Name = 'Poke Bowl' WHERE Distance = 10; SELECT * FROM MyRestaurants; Poke Bowl Japanese 10 2024-01-06 0 Local Point American 5 2024-01-07 ``` #### Alter the table * [reference](https://www.sqlitetutorial.net/sqlite-alter-table/) ```sql ALTER TABLE MyRestaurants RENAME Date to Time; ALTER TABLE MyRestaurants ADD COLUMN Stars; ALTER TABLE MyRestaurants DROP Distance; SELECT * FROM MyRestaurants; Name FoodType Time Feedback Stars --------------- --------------- --------------- --------------- --------------- Poke Bowl Japanese 2024-01-06 0 Local Point American 2024-01-07 ``` #### Type Affinity ```sql INSERT INTO Edges VALUES('-1', '2000'); ```sql There is no error when I input the statement above because of the idea "Type Affinity" in SQLite. To maximize compatibility between SQLite and other database engines, SQLite supports the concept of "type affinity" on columns. For example, if I input '123' in an integer column, SQLite will automatically convert this into an integer. #### Date function * [reference](https://www.sqlitetutorial.net/sqlite-date-functions/sqlite-date-function/) ```sql SELECT * FROM MyRestaurants WHERE Date > DATE('now', '-3 month') AND Feedback = 1; ``` ### Order by * [reference](https://www.sqlitetutorial.net/sqlite-order-by/#:~:text=The%20ORDER%20BY%20clause%20comes,The%20ASC%20keyword%20means%20ascending.) ```sql SELECT Name, Distance FROM MyRestaurants WHERE Distance >= 20 ORDER BY Name ASC; ``` ## 2024/1/4 Quiz Section ### Course materials * [slides](https://docs.google.com/presentation/d/e/2PACX-1vTRAPfO-drFlA-Qxq-UzyE0aePCQXbIvtvxcpUgh9Bphzm3W3krFKryZCciNkHV3xC5JUu7qaJLrIt6/pub?start=false&loop=false&delayms=3000&slide=id.g9cd7810522_0_183) * [Demo code](https://drive.google.com/file/d/1GwszQraQIUqS-li7oVlEJVzNBo0vQeVf/view) ### Special commands ```sql .headers on/off // show/hide the column headers in query results .mode [type] // change the type the table display .show // list the display options ``` ## 2024/1/5 Lecture ### Course materials * [slides](https://drive.google.com/file/d/1kyVnLTAswHjIBLpVHSlzb73-KvuqDo7r/view) * [Demo code](https://drive.google.com/file/d/1gjS_ppvvOpx-Glej_DQ-5REs4xs9UBtS/view) ### Commands #### Table ```sql Name Job Salary --------------- --------------- --------------- 123 TA 50000 345 TA 60000 567 Prof 90000 789 Prof 100000 ``` #### AS * [reference](https://www.tutorialspoint.com/sqlite/sqlite_alias_syntax.htm) ```sql SELECT P.Name FROM Payroll AS P WHERE Name > 500; ```sql **Table** ```sql Name --------------- 567 789 ``` #### PRIMARY KEY **Method 1** ```sql CREATE TABLE Payroll ( UserID INT PRIMARY KEY, Name VARCHAR(100), Job VARCHAR(100), Salary INT); ``` **Method 2** ```sql CREATE TABLE Payroll ( UserID INT, Name VARCHAR(100), Job VARCHAR(100), Salary INT, PRIMARY KEY (UserId); ``` **Method 3** ```sql CREATE TABLE Payroll ( Name VARCHAR(100), Job VARCHAR(100), Salary INT, PRIMARY KEY (Name, Job)); ```sql Sometimes no single attribute is unique, but combinations of attributes are a unique key for the table. #### ORDER BY Orders result tuples by specified attributes (default ascending) ```sql SELECT P.Name, P.UserID FROM Payroll AS P WHERE P.Job = 'TA' ORDER BY P.Salary, P.Name; Name UserID --------------- --------------- Allison 345 Jack 123 // inverse ORDER BY P.Salary DESC ``` #### DISTINCT Deduplicates result tuples ```sql SELECT DISTINCT P.Job FROM Payroll AS P WHERE P.Salary > 70000; ``` #### Foreign Key * A Foreign Key is one or more attributes that uniquely identify a row in another table. * Foreign keys must reference (point to) a unique attribute, almost always a primary key. **Method 1** ```sql CREATE TABLE Regist ( UserID INT REFERENCES Payroll(UserID), Car VARCHAR(100)); ``` **Method 2** when attribute name is the same: ```sql CREATE TABLE Regist ( UserID INT S Payroll, Car VARCHAR(100)); ``` **Method 3** ```sql CREATE TABLE Payroll ( UserID INT, Name VARCHAR(100), Job VARCHAR(100), Salary INT, PRIMARY KEY(UserID, Name)); CREATE TABLE Regist ( UserID INT, Name VARCHAR(100), Car VARCHAR(100), FOREIGN KEY (UserID, Name) REFERENCES Payroll); ``` ## 2024/1/8 Lecture ### Course Materials * [slides](https://drive.google.com/file/d/1yNnomOZx1K_SN0rYWwpOmaAJTV7k8L_r/view) ### Course Notes #### Primary Key & Select count * [reference](https://www.sqlitetutorial.net/sqlite-primary-key/) The following example create a Primary Key and demo the usage of the count function. Also, there is an rumtime error caused by the wrong usage of primary key. ```sql CREATE TABLE Payroll ( UserID INT, Name VARCHAR(100), Job VARCHAR(100), Salary INT, PRIMARY KEY(UserID)); INSERT INTO Payroll VALUES(123, 'Peter', 'Student', 1000); INSERT INTO Payroll VALUES(234, 'Allison', 'Student', 2000); INSERT INTO Payroll VALUES(234, 'Tim', 'Engineer', 10000); Runtime error: UNIQUE constraint failed: Payroll.UserID (19) INSERT INTO Payroll VALUES(345, 'Tim', 'Engineer', 10000); INSERT INTO Payroll VALUES(456, 'Christine', 'Housewife', 20000); SELECT * FROM Payroll; UserID Name Job Salary ------ --------- --------- ------ 123 Peter Student 1000 234 Allison Student 2000 345 Tim Engineer 10000 456 Christine Housewife 20000 SELECT COUNT(*) FROM Payroll; COUNT(*) -------- 4 SELECT COUNT(*) AS num_people FROM Payroll; num_people ---------- 4 ``` #### Foreign Key * [reference](https://www.sqlitetutorial.net/sqlite-foreign-key/) The following example create an reference to the primary ket above. ```sql! CREATE TABLE Regist( UserID INT, Car VARCHAR(100), FOREIGN KEY(UserID) REFERENCES Payroll); INSERT INTO Regist VALUES(123, 'Lexus'); INSERT INTO Regist VALUES(234, 'BMW'), (234, 'Toyota'), (345, 'Volvo'), (456, 'Tesla'); SELECT * FROM Regist; UserID Car ------ ------ 123 Lexus 234 BMW 234 Toyota 345 Volvo 456 Tesla ``` #### Joins * Foreign keys are able to describe a relationship between tables * Joins are able to realize combinations of data * Joins do not require a foreign key, but often they go together 1. Method 1 ```sql SELECT P.Name, R.Car FROM Payroll AS P JOIN Regist AS R ON P.UserID = R.UserID; Name Car --------- ------ Peter Lexus Allison BMW Allison Toyota Tim Volvo Christine Tesla DELETE FROM Regist WHERE UserID = 123; SELECT P.Name, R.Car FROM Payroll AS P JOIN Regist AS R ON P.UserID = R.UserID; Name Car --------- ------ Allison BMW Allison Toyota Tim Volvo Christine Tesla ``` 2. Method 2 ```sql SELECT P.Name, R.Car FROM Payroll AS P, Regist AS R WHERE P.UserID = R.UserID; Name Car --------- ------ Allison BMW Allison Toyota Tim Volvo Christine Tesla ``` #### Outer Joins * LEFT OUTER JOIN - All rows in left table are preserved * RIGHT OUTER JOIN - All rows in right table are preserved * FULL OUTER JOIN - All rows are preserved ```sql SELECT P.Name, R.Car FROM Payroll AS P LEFT OUTER JOIN Regist AS R ON P.UserID = R.UserID; Name Car --------- ------ Peter Allison BMW Allison Toyota Tim Volvo Christine Tesla SELECT P.Name, R.Car FROM Payroll AS P RIGHT OUTER JOIN Regist AS R ON P.UserID = R.UserID; Name Car --------- ------ Allison BMW Allison Toyota Tim Volvo Christine Tesla SELECT P.Name, R.Car FROM Payroll AS P FULL OUTER JOIN Regist AS R ON P.UserID = R.UserID; Name Car --------- ------ Peter Allison BMW Allison Toyota Tim Volvo Christine Tesla SELECT P.Name, R.Car FROM Payroll AS P, Regist AS R WHERE P.UserID = R.UserID AND R.Car = 'Volvo'; Name Car ---- ----- Tim Volvo The following example would not work. SELECT P.Name, R.Car FROM Payroll AS P, Regist AS R WHERE P.UserID = R.UserID AND R.Car = ‘BMW’ AND R.Car = ‘Toyota’; Revised: SELECT P.Name, R1.Car FROM Payroll AS P, Regist AS R1, Regist AS R2 WHERE P.UserID = R1.UserID AND P.UserID = R2.UserID AND R1.Car = ‘Civic’ AND R2.Car = ‘Pinto’; ``` ## 2024/1/10 Lecture ### Course Materials * [slide](https://drive.google.com/file/d/1hVJQmvUntSzRJsN7x4I5j2Tb75Rt-8Ao/view) ### Course Notes #### Aggregates * COUNT * AVG * MAX * MIN * SUM ```sql! SELECT * FROM Payroll; UserID Name Job Salary ------ --------- --------- ------ 123 Peter Student 1000 234 Allison Student 2000 345 Tim Engineer 10000 456 Christine Housewife 20000 SELECT COUNT(*) AS People, AVG(P.Salary) AS AVG, SUM(P.Salary) AS SUM, MAX(P.Salary) AS MAX, MIN(P.Salary) AS MIN FROM Payroll; People AVG SUM MAX MIN ------ ------ ----- ----- ---- 4 8250.0 33000 20000 1000 SELECT AVG(P.Salary) FROM Payroll AS P, Regist AS R WHERE P.UserID = R.UserID; AVG(P.Salary) ------------- 8500.0 ``` #### Grouping ```sql! SELECT * FROM Payroll; UserID Name Job Salary ------ --------- --------- ------ 123 Peter Student 1000 234 Allison Student 2000 345 Tim Engineer 10000 456 Christine Housewife 20000 SELECT Job, AVG(Salary) FROM Payroll GROUP BY Job; Job AVG(Salary) --------- ----------- Engineer 10000.0 Housewife 20000.0 Student 1500.0 SELECT Job, AVG(Salary) FROM Payroll GROUP BY Job HAVING AVG(SALARY) > 8000; Job AVG(Salary) --------- ----------- Engineer 10000.0 Housewife 20000.0 ``` #### FWGHOS SELECT: Last evaluate the SELECT clause FROM: First evaluate the FROM clause WHERE: Next evaluate the WHERE clause GROUP BY: Group the attributes in the GROUPBY HAVING: Eliminate groups based on HAVING ORDER BY: Sort the results based on ORDER BY ## 2024/1/11 Quiz Section ### Course Materials * [slide](https://docs.google.com/presentation/d/1aCvChaPlSJhSpe8VPgyWTOzJmgJ-4X6WpzFb7WQAnKM/edit#slide=id.p1) * [worksheet](https://docs.google.com/document/d/1Th5E2-tKiRx64LOiio8Bb3wzLWGi_gLFLk9UFEa1t2A/edit) * [worksheet solution](https://docs.google.com/document/d/13lI9lP_CFgOpBlEu-XSclVqRE19TgoSN/edit) ### Course Notes #### Worksheet ```sql! CREATE TABLE A (a int); CREATE TABLE B (b int); INSERT INTO A VALUES (1), (2), (3), (4); INSERT INTO B VALUES (3), (4), (5), (6); SELECT * FROM A INNER JOIN B ON A.a = B.b; 3|3 4|4 SELECT * FROM A LEFT OUTER JOIN B ON A.a = B.b; 1| 2| 3|3 4|4 SELECT * FROM A RIGHT OUTER JOIN B ON A.a = B.b; 3|3 4|4 |5 |6 SELECT * FROM A FULL OUTER JOIN B ON A.a = B.b; 1| 2| 3|3 4|4 |5 |6 SELECT * FROM A INNER JOIN B; 1|3 1|4 1|5 1|6 2|3 2|4 2|5 2|6 3|3 3|4 3|5 3|6 4|3 4|4 4|5 4|6 CREATE TABLE Employees (id int, boss int); INSERT INTO Employees VALUES (123, 1), (234, 1), (345, 2), (123, 1), (234, 2), (456, 1); SELECT * FROM Employees; 123|1 234|1 345|2 123|1 234|2 456|1 ``` ## 2024/1/12 Lecture ### Course Materials * [slide](https://drive.google.com/file/d/1K2XKmQNxEdrVGjr4HiTGgw7D6eETVP_Z/view) ### Course Notes SELECT, HAVING, ORDER BY Must use aggregate functions or attributes in GROUP BY ## 2024/1/17 Lecture * [slide](https://drive.google.com/file/d/1M6QLddHfyGbhI5ILYCuZSj59lHXH5P_d/view) ### Subquery ```sql! UserID Name Job Salary ------ --------- --------- ------ 123 Peter Student 1000 234 Allison Student 2000 345 Tim Engineer 10000 456 Christine Housewife 20000 SELECT P.Name FROM Payroll AS P WHERE NOT EXISTS (SELECT * FROM Payroll AS P1 WHERE P.Job = P1.Job AND P.UserID != P1.UserID); Name --------- Tim Christine SELECT P.Name, P.Salary FROM Payroll AS P WHERE P.UserID IN (SELECT UserID FROM Regist) Name Salary --------- ------ Allison 2000 Tim 10000 Christine 20000 ``` ### Monotonicity * Monotonic query is one that obeys the following rule where I and J are data instances and q() are the results of running the query q on those instances: * That is for any superset of I, the query over that superset must contain at least the query results of I. * In other words, adding more tuples to the input table never removes tuples from the output on the next query. * Theorem: If Q is a SELECT-FROM-WHERE query that does not have subqueries, and no aggregates, then it is monotone. * Proof: We use the nested loop semantics: if we insert a tuple in a relation Ri , this will not remove any tuples from the answer. * Consequence: If a query is not monotonic, then we cannot write it as a SELECT-FROM-WHERE query without nested subqueries or aggregates. ```sql! 1. a Monotonic query SELECT P.Name, P.Car FROM Payroll AS P, Regist AS R WHERE P.UserID = R.UserID 2. not a Monotonic query SELECT P.Name FROM Payroll AS P WHERE P.Salary >= ALL (SELECT Salary FROM Payroll) ``` ## 2024/1/18 Quiz Section * [slide](https://docs.google.com/presentation/d/1jW8eL5ILPIVRD37AcN7P1UH8ZWzkKj3Q73qVsBEzlUo/edit) * [worksheet](https://docs.google.com/document/d/14uvjmTbrQaA9Pahq-YrXCtqG4cUoDYF1/edit#heading=h.vfe0kcwpk4jc) ## 2024/1/22 Lecture ### Relational Algebra * [slide](https://drive.google.com/file/d/18QLWENG5MmN3rMt-yEwY9h37FqAYnfxT/view?usp=sharing) ## 2024/1/24 Lecture ### ER Diagrams * [slide](https://drive.google.com/file/d/1tN-VPbM4sTXwBP-JrMNfTVHXZcAugRUu/view) ## 2024/1/26 Lecture ### Design Theory * [slide](https://drive.google.com/file/d/1dVoB2m4_3XFcYanSfZsKjZcLdRyUemlu/view) ## 2024/1/29 Lecture ### Design Theory * [slide](https://drive.google.com/file/d/1dVoB2m4_3XFcYanSfZsKjZcLdRyUemlu/view) ## 2024/1/31 Lecture ### BCNF * [slide](https://drive.google.com/file/d/1oowO5nGSIoBH_ueMmJ778W4jAgAecGNw/view) ## 2024/2/2 Lecture ### Transaction * [slide](https://drive.google.com/file/d/1iONPH0iS8Ze3nuOXfdAsgizoJwlaeLWq/view) ## 2024/2/9 Lecture ### Locking * [slide](https://drive.google.com/file/d/1eSzaKChZ5zOCqjJnET4_yhG_LVzQJhCc/view)