--- tags: 大三 --- # 資料庫管理系統導論 - 期中筆記 - 國立中興大學大資訊工程系 Autumn 2020 [![](https://img.shields.io/badge/dynamic/json?style=flat-square&color=yellow&label=%E7%B8%BD%E8%A7%80%E7%9C%8B%E6%95%B8&query=%24.viewcount&suffix=%E4%BA%BA%E6%AC%A1&url=https%3A%2F%2Fhackmd.io%2FJpgIw0ndRgGDP8xn9yTm0Q%2Finfo)](https://hackmd.io/JpgIw0ndRgGDP8xn9yTm0Q/info) [![](https://img.shields.io/static/v1?style=flat-square&label=命中率&message=約50%&color=<COLOR>)](https://hackmd.io/jZfXixhrTmWPZzYY5IH_fQ) <img src="https://i.imgur.com/2WqDLzv.jpg" style="height: 500px;display:inline"> <img src="https://i.imgur.com/6hQ049h.png"> ## Chapter 1 Introduction ### Purpose of Database systems when DB built directly on file systems: + data redundancy and inconsistency + difficulty in accessing data + data isolation + integrity problems + atomicity of updates + concurrent access by multiple users + secure problem ### View of Data + data models > some tools for describing data, data relationships, data semantics... + Data abstraction ### relational model + table-like + columns -> fields + rows -> records ### data abstraction + physical level:how to store + logical level:what to store + view level:what to show ### instances and schemas > instance <-> schemas can be analog to object <-> class + logical schema: + sets of all fields + physical schema: + physical structure ### data independence + physical: change physical schema -> no change of logical schema + logical: change logical schema -> no change of app. + interface should be well defined ``` view +----------------+ <--+ logical +-+ interface +----------------+ <--+ physical ``` ### language used in Database + Data definition language(DDL) + Data manipulation language(DML) + Pure: + Relational algebra + Tuple relational calculus + Domain relational calculus + Commercial + SQL + Procedural DML + Needs to specify how to get data + Declarative DML + Doesn't need to specify how to get data ### SQL + Non-procedural + Not Turing machine equivalent + Usually embedded in other high level language(host language) ### Database Design + Logical design (what schema to have) + Physical design (what physical layout) ### Database Engine ![](https://i.imgur.com/ojn9oLF.png) + Storage manager + <-> OS file manager + components + Authorization and integrity manager + Transaction manager + file manager + BUffer manager + Physical level + Data files + Data dictionary + store metadata + exe plan + statistic + Indices + fast access + Query processor + DDL interpreter + interpret DDL and records definition in data dictionary + DML compiler + translate DML to execution plan consisting of low-level instructions + query optimization + Query evaluation engine executes low-level instructions + Transaction management + properties: 1. Atomicity 2. Consistency 3. Isolation 4. Durability + 1.4. -> recovery manager + 2.3. -> concurrency manager ### Database Architecture + Centralized database ![](https://i.imgur.com/n52190G.png) + Client-server database ![](https://i.imgur.com/luTQigL.png) + Parallel database 沒圖啦 + Distributed database ![](https://i.imgur.com/WyutXyd.png) ## Chapter 2 Intro to Relational Model 中興噁男資料表 (pervert table) | ID | name | dept_name | rank | | :--------: | :----: | :-------: | :--: | | 4107056002 | 許文全 | 大資工系 | 0 | | 4107056003 | 林〇安 | 大資工系 | 1 | | 4106030323 | 〇毅 | 小生技系 | 2 | 學系學院對照表 (college table) | dept\_name | college | | ---------- | -------- | | 大資工系 | 大資電學院 | | 小資電學士班 | 大資電學院 | | 小生技系 | 小農資學院 | ### Structure of Relational Databases > pervert = (ID, name, dept\_name, rank) > > R = (A1, A2, A3, ... ,An) is a relation schema > A1, A2, A3, ... are attributes + The set of allowed values for each attribute is called the **domain of the attribute**.(domain 這個術語如同程式語言常說的 datatype) + Attribute values are (normally) required to be atomic. + **null** is a member of every domain. Indicated that the value is unknown(稍後介紹) + Relations are unordered. ### Database Schema + **Database schema** -- is the <u>logical structure</u> of the database. + **Database instance** -- is a <u>snapshot of the data</u> in the database at a given instant in time. + Example + *schema*: pervert(ID, name, dept\_name, rank) + *instance*: | ID | name | dept_name | rank | | :--------: | :----: | :-------: | :--: | | 4107056002 | 許文全 | 大資工系 | 0 | | 4107056003 | 林〇安 | 大資工系 | 1 | | 4106030323 | 〇毅 | 小生技系 | 2 | ### Keys > 坤芳:用來區別資料用的,比如組合語言中的 address, 物件導向中的 OID,資料庫中的 Key 就是一種 naming schema + Let $K ⊆ R$ + K Is a <mark>superkey</mark> of R if values for **K** are sufficient to identify a unique tuple of each possible relation r(R) i.e. to name a tuple in a relation > 中興噁男資料表中的 name 不能當 key 因為可能有兩個噁男的 name 是同一個值。key 必需是永遠不會重複的,有特殊設計的,比如學號 > > e.g. (ID) 和 (ID, name) 在 pervert 表中都可以是 superkey. + Super key is a <mark>candidate key</mark> if K is minimal. e.g. (ID) + One of the candidate keys is selected to be the primary key > Which One? 在設置 primary key 時,資料庫為了避免值重複,所以會自動建立 index,因此在選擇 primary key 時,應該選擇最常被 query 的欄位 + <u>Foreign key</u> constraint: Value in one relation must appear in another(某個 table 的欄位是另一個 table 欄位中的 primary key) + Referencing relation + Referenced relation + e.g. pervert table 中的 dept\_name 是來自 department table 的 foreign key > | dept\_name | college | > | ---------- | -------- | > | 大資工系 | 大資電學院 | > | 小資電學士班 | 大資電學院 | > | 小生技系 | 小農資學院 | another example: ![](https://i.imgur.com/vme8bxn.png) takes 中的 **id** 是 foreign key, advisor 的 **s\_id** 是 foreign key ## Relational Module > 1. structure (建立table用的) > 2. query language (尋找資料用的又分兩類) > 1. procedural (需要某個資料時我要給予資料庫指示,比較難寫) > 2. declarative (需要某個資料時,只需要給予你所需要的資料的條件,比較簡單) "Pure" language: (Language module) 1. **Relational algebra (RA)** > 透過運算得到新的東西 relational algebra 基本運算單位就是一個 set 比如做交集、聯集 > > e.g. > > > R.A.: ∏ name(σ dept_name='cs'(instructor)) Λ salary > 10⁵ 2. Relational calculus > 透過條件的方式來選取資料 > > **2-1 Tuple relational calculus** (一筆資料) > > e.g. > > > T.R.C.: \{ t.name| t ε instructor \} > > **2-2 Domain relational calculus** (一欄資料) > > e.g. > > > D.R.C.: \{\<n\> | <i, n, d, s> ε instructor (ID, name, dept_name, salary) Λ dept\_name='cs' salary ≥ 10⁵\} The above 3 pure languages (relational algebra, tuple relational calculus, domain relational calculus) are **not really** equivalent in computing power. 一般的情況下這三個 pure languages 是 equivalent 只有在表示不在範圍內的資料時 R.A. 會無法表示 e.g. 如果你要用表示 R.A. 表示 > T.R.C \{¬t | t ε instructor\} ## Relational Algebra six basic operations + select: σ + project: ∏ + union: ∪ + set difference: – + Cartesian product: x + rename: ρ ### Select Operation σ 選擇 The "select" operation selects tuples that satisfy a given predicate. + Notation: $σ_p(r)$ + p is called the **selection predicate** Query: $$ σ_{name="林〇安"}\ (pervert) $$ Result: | ID | name | dept_name | rank | | :--------: | :----: | :-------: | :--: | | 4107056003 | 林〇安 | 大資工系 | 1 | ### Project Operation ∏ 投影 An operation that returns its argument relation, with certain attributes left out. + Notation: $∏_{A_1, A_2, A_3,...,A_k}(r)$ where $A_1, A_2, A_3,...,A_k$ are attribute names and $r$ is a relation name. + The result is defined as the relation of k column obtained by erasing the column that are not listed. + **Duplicated rows removed from result**, since relations are sets. (在實際的SQL中稍微不同) Query: $$ ∏_{name, rank}(pervert) $$ Result: | name | rank | | :----: | :--: | | 許文全 | 0 | | 林〇安 | 1 | | 〇毅 | 2 | ### Cartesian-Product Operation X The Cartesian-Product Operation (denoted by X) allows us to combine information from any two relations. Query: $$ pervert\ X\ college $$ Result: | ID | name | pervert.dept | rank | college.dept | college | |:----------:|:------:|:------------:|:----:| ------------ | ---------- | | 4107056002 | 許文全 | 大資工系 | 0 | 資工系 | 大資電學院 | | 4107056002 | 許文全 | 大資工系 | 0 | 資電學士班 | 大資電學院 | | 4107056002 | 許文全 | 大資工系 | 0 | 小生技系 | 小農資學院 | | 4107056003 | 林〇安 | 大資工系 | 1 | 大資工系 | 大資電學院 | | 4107056003 | 林〇安 | 大資工系 | 1 | 小資電學士班 | 大資電學院 | | 4107056003 | 林〇安 | 大資工系 | 1 | 小生技系 | 小農資學院 | | 4106030323 | 〇毅 | 小生技系 | 2 | 大資工系 | 大資電學院 | | 4106030323 | 〇毅 | 小生技系 | 2 | 小資電學士班 | 大資電學院 | | 4106030323 | 〇毅 | 小生技系 | 2 | 小生技系 | 小農資學院 | ### Join Operation ⋈ 就影 The "join" operation allows us to combine a "select" operation and a Cartesian-Product "theta join": 是條件,滿足條件就是相關 "natural join": 相同欄位要有相同的值 (default) Notation: > $r⋈_{\theta} s = σ_{\theta}(r{\times}s)$ > > *r* and *s* are table name; *θ* is a condition Query: $$ σ_{pervert.dept=college.dept} (pervert\ X\ college)) $$ or > $pervert ⋈_{pervert.dept=college.dept} college$ Result: | ID | name | pervert.dept | rank | college.dept | college | |:----------:|:------:|:------------:|:----:| ------------ | ---------- | | 4107056002 | 許文全 | 大資工系 | 0 | 大資工系 | 大電資學院 | | 4107056003 | 林〇安 | 大資工系 | 1 | 大資工系 | 大電資學院 | | 4106030323 | 〇毅 | 小生技系 | 2 | 小生技系 | 小農資學院 | ### Union Operation ∪ 聯集 The union operation allows us to combine two relations + Notation: $r\, \cup\, s$ For $u \cup s$ to be valid. 1. r, s must have the same **arity** (same number of attributes) 2. The attribute domains must be compatible (e.g. 2<sup>nd</sup> column of *r* deals with the same type of values as does the 2<sup>nd</sup> column of *s*) Query: $$ ∏_{name}(σ_{dept\_name="小生技系"}(pervert))\ \cup\ ∏_{name}(σ_{rank=0}(pervert)) $$ result: | name | | :----: | | 〇毅 | | 許文全 | ### Set Intersection Operation ⋂ 交集 The set intersection operation allows us to find tuples that are in both the input relations. + Notation: $r\, \cap\, s$ Assume: + r, s have the same arity + attributes of *r* and *s* are compatible Query: $$ ∏_{name}(σ_{dept\_name="大資工系"}(pervert))\ \cap\ ∏_{name}(σ_{rank=1}(pervert)) $$ result: | name | | :----: | | 林〇安 | ### Set Difference Operation - 差集 The set-difference operation allows us to find tuples that are in one relation but are not in another. + Notation: $r-s$ set difference must be taken between **compatible** relations. + *r* and *s* must have the same arity + attribute domains of *r* and *s* must be compatible Query: $$ ∏_{name}(σ_{dept\_name="大資工系"}(pervert))\ - ∏_{name}(σ_{rank<1}(pervert)) $$ Result: | name | | :----: | | 林〇安 | ### The Assignment Operation ← 賦值(a.k.a 左箭頭) It is convenient at times to write a relational-algebra expression by assigning parts of it to temporary relation variables. + Notation: new\_name← some\_expressions + The assignment operation is like assignment in a programming language. + With the assignment operation, a query can be written as a sequential program consisting of a series of assignments followed by an expression whose value is displayed as the result of the query. ### The rename Operation ρ 重新命名(a.k.a F2) The results of relational-algebra expressions do not have a name that we can use to refer to them. + Notation: $ρ_x (E)$ returns the result of expression $E$ under the name $x$ --- > **NOTE** > > 1. $ρ_{disgusting} (E)$ > > 2. $disgusting ← E$ > > 這兩個所表達的意思不一樣,第 2 個是指在額外花一個空間去存,而第 1 個則只是更名而已 ### Equivalent Queries There is **more than one way to write a query** in relational algebra. e.g. Find information about name in the 大資工系 with rank smaller than 3 > Query1 > $$ > \sigma _{dept\_name="大資工系" \wedge\ rank<3}(name) > $$ > Query2 > $$ > \sigma_{dept\_name = "大資工系"}(\sigma_{rank<3}(name)) > $$ ## Chapter 3 Introduction to SQL <mark>SQL is case insensitive, thus Name ≡ NAME ≡ name</mark> ### DDL ```sql= create table S( a int, b varchar(20) not null, c varchar(50), d numeric(3,1), /*allows 44.5*/ e float(5), /*allow 5 mantissa including 1.*/ f float, /*double precision*/ g real, /*single precision*/ y smallint, primary key(a, d), foreign key(c) references T(c) ); create table T( c varchar(50), f varchar(50), primary key(c) ); drop table table_name; /*delete a table*/ alter table table_name add column_name datatype; alter table table_name drop column_name; create view U(a, b, f) as select a, b, f from S, T where S.c = T.c and S.y = 2020; ``` ### DML ```sql= /*建立表格*/ create table students( ID int, name varchar(50), dept_name varchar(50), rank int, tot_score int, primary key(ID) ); insert into pervert values (4106030323,'〇毅','生技系',2,200); insert into pervert values (4107056003,'林〇安','資工系',1,201); insert into pervert values (4107056006,'游〇瑋','資工系',3,202); insert into pervert values (4107056012,'黃〇凱','資工系',8,203); ``` | ID | name | dept_name | rank | tot_score | | :--------: | :----: | :-------: | :--: |:--:| | 4106030323 | 〇毅 | 生技系 | 2 |200| | 4107056003 | 林〇安 | 資工系 | 1 |201| | 4107056006 | 游〇瑋 | 資工系 | 3 |202| | 4107056012 | 黃〇凱 | 資工系 | 8 |203| ```sql= /*select from格式*/ select name from students; ``` |name| |:--:| |〇毅| |林〇安| |游〇瑋| |黃〇凱| ```sql= /*distinct statement*/ /*default is select all,will not remove duplicates*/ select distinct dept_name from students; ``` |dept_name| |:--:| |生技系| |資工系| ```sql= select * from students; /*all attributes*/ select '437' /*1 col,1 row='437'*/ select '437' as FOO; /*1 col name='FOO',1 row='437'*/ select 'A' from students; ``` |N/A| |:--:| |A| |A| |A| |A| ```sql= select ID,name,score/2 from students; /*or with rename*/ select ID,name,score/2 as avg_score from students; ``` ```sql= /*where cluster*/ select name from students where dept_name = '資工系' and tot_score = 201; ``` | name | | :----: | | 林〇安 | ```sql= /*遇到同欄位會自動重命名*/ select * from students, instructor; ``` ```sql= /*Rename*/ /*Used for */ select distinct T.name from pervert as T, pervert as S where T.rank < S.rank and S.dept_name='資工系' ``` ![](https://i.imgur.com/rKxyB72.png) ```sql= /*self join*/ /*用於尋找連鎖關係*/ SELECT r1.supervisor FROM emp_super AS r1, emp_super AS r2 WHERE r1.person = 'Bob' AND r1.supervisor = r2.person; ``` ![](https://i.imgur.com/QrtOYPo.png) ![](https://i.imgur.com/XeIKmsa.png) String operation + `%` + match any sub-string, including λ-string + `_` + match any character + to use these character without special meaning in string, escape character is needed ```sql= /*String operations*/ select name from pervert where name like '林%';/*pic1*/ select name from pervert where name like '__';/*pic2*/ /*escape char example*/ select name from pervert where attendency like '100\%' escape '\'; ``` pic1: ![](https://i.imgur.com/3UD9Whj.png) pic2: ![](https://i.imgur.com/58PVOa0.png) Ordering + default is asc ```sql= select distinct name,dept_name,rank from pervert order by rank, dept_name desc,name asc ``` ![](https://i.imgur.com/ikM4pbG.png) Between + <= 和 >= 的語法糖 + `between 90000 and 10000` ≡ `>= 9000 and <= 10000` Set operation --- <style> #author{ border: 1px solid gray; border-radius: 10px; padding: 10px; width: 80%; margin: 10px auto; } </style> <div id="author"> <p>111級中興資工</p> <p>作者:<a title="瑋哥" href="https://github.com/wei-coding">游庭瑋</a>、<a href="https://github.com/liao2000">廖柏丞(作者是不會自己犠牲的)</a></p> <p>壯烈犧牲:<a title="不要以為你沒github就可以逃過一劫" href="https://www.facebook.com/profile.php?id=100002871222642">許文全</a>、<a title="永遠的噁男二號" href="https://github.com/5438arthur">楊〇</a>、<a title="a.k.a 中興噁男一號" href="https://github.com/leonardo-lin">林〇安</a>、<a title="這個馬掉還是看的出來吔" href="https://github.com/wei-coding">游〇瑋</a>、<a title="缺資料被硬塞進來的 8 號噁男" href="https://www.facebook.com/sky.ng.319">黃〇凱</a></p> </div>