# Introduction of Computer Science<br>Ch14 & Ch15 NTNU 計算機概論 ##### [Back to Note Overview](https://reurl.cc/XXeYaE) ##### [Back to Introduction of Computer Science](https://hackmd.io/@NTNUCSIE112/SyU5Uuyhr) {%hackmd @NTNUCSIE112/csabb %} {%hackmd @NTNUCSIE112/csch %} <!-- https://hackmd.io/@sophie8909/csabb https://hackmd.io/@sophie8909/csch --> ###### tags: `NTNU` `CSIE` `必修` `Introduction of Computer Sciencce` # Ch14 Databases ## 14-1 Introduction<!----> A database is a collection of related, logically coherent data used by the application programs in an organization. - Database system application - Bank - School - etc. - the method to build database system - using programming language - using specialized database software (e.g. Microsoft Access) ### Database Management Systems(DBMS) ```graphviz digraph hierarhcy { nodesep=1.0 node[color = pink, fontname=Courier,shape=box] edge [color=black,style=line] user->Application->DBMS "Database Manager"->DBMS->{Database1 Database2 Database3} } ``` ```graphviz digraph hierarhcy { nodesep=1.0 node[color = pink, fontname=Courier,shape=box] edge [color=black,style=line] DBMS->{Hardware Softare Data Users Procedures} } ``` --- ## 14-2 Database Architecture ### ANSI/SPARC structure ```graphviz digraph hierarhcy { nodesep=1.0 node[color = pink, fontname=Courier,shape=box] edge [color=black,style=line] {" user view" "user view" "user view "}->"External level" "External level"->"conceptual level"->"Internal level" "External level" } ``` + **External level** interacts directly with the user.(what the user see) + **Conceptual level** defines the logical view of data. (connection between internal and external) + **Internal level** determines where data are actually stored. --- ## 14-3 Database models + A database model defines the logical design of data. + The model also describes the relationship between different parts of data. + Three models: + Hierarchical model: obsolete + Network model: obsolete + Relational model:widely used today ### Hierarchical model - Data are organized as an upside down tree. - Each entity has only one parent but can have several children. - Advantage:easy to insert and delete structure - Disadvantage:delete parent node will both delete child nodes - Examples:IBM's IMS and DL/1 - ![](https://i.imgur.com/qxCdwCc.png) ### Network model - The entities are organized in a graph, where some entities can be accessed through several path. - Advantage:variability - Disadvantage:complex - Examples:HP's IMAGE and Honeywell's IDS - ![](https://i.imgur.com/2lf2uAA.png) --- ## 14-4 Relationship model - ![](https://i.imgur.com/sDYI0O6.png) - Data are organized in two-dimensional tables called relations(關係). | Attributes | v | v | | |:----------:|:-------------:|:------:|:------:| | No | Course-Name | Unit | | | ------- | ------- | ------ | Tuples | | CIS17 | Intro to java | 5 | **<-** | | CIS15 | Intro to C | 5 | **<-** | | CIS19 | Unix | 4 | **<-** | | CIS51 | Networking | 5 | **<-** | <!-- 這樣是對的ㄇ --> - The number of **attributes** in a relation is its **degree** - The number of **rows** in a relation is its **cardinality** - Advantage:Suitable for expressing more complex relationships - Disadvantage:This structure is inefficient when the number of random queries is small - Relational database is using widest - e.g. Microsoft SQL Server, Sybase, Informix, MySQL, PostgreSQL, Dbase III Plus, Microsoft Access --- ## 14-5 Operations on relations + Select : unary operation + Insert : unary operation + Delete : unary operation + Update : unary operation + Project : unary operation + 選取一關係圖中特定幾項元素顯示 <br> + Join : binary operation + ![](https://i.imgur.com/csNPpWw.png)加入額外元素成為新關係圖 + Union : binary operation + ![](https://i.imgur.com/Af6n2gX.png)取兩關係圖間的聯集 + Intersection : binary operation + ![](https://i.imgur.com/xa1jOb8.png)取兩關係圖間的交集 + Difference : binary operation + ![](https://i.imgur.com/rrCybv0.png)A關係圖多餘B關係圖之元素 ### SQL(Standard Query Language) SQL is the language standardized by ANSI and ISO for use on relation database SQL語言可以分為三個部分: 1. Data Definition Language(DDL):可用來建立新的或刪除已有的表格 2. Data Manipulation Language(DML):可用來對已有的表格進行新增,刪除,存取......等 3. Data Control Language(DCL):可用來控制資料存取的權限 4. Online SQL Test:[SQL Tryit Editor](https://www.w3schools.com/sql/trysql.asp?filename=trysql_op_in) 5. SQL Library:[Microsoft SQL](https://docs.microsoft.com/zh-tw/sql/sql-server/?view=sql-server-ver15) (以下示範左邊為原表格,右邊為經過指令修改/後顯示的東西) #### Data Definition Language - Create table ```SQL= CREATE TABLE member ( member_id CHAR(8), person_id CHAR(10), name VARCHAR(8), birtday DATE, ) #### Data Manipulation Language - Select Operation ```SQL= SELECT * --SQL指令大寫 註解用-- FROM relation_name WHERE criteria ``` - COURSES | No | Course-Name | Unit | | Course-Name | Unit | |:-----:|:-------------:|:----:|:------:|:-------------:|:----:| | CIS17 | Intro to java | 5 | -> | Intro to java | 5 | | CIS15 | Intro to C | 5 | select | Intro to C | 5 | | CIS19 | Unix | 4 | -> | Unix | 4 | | CIS51 | Networking | 5 | | Networking | 5 | - Insert Operation ```SQL= INSERT INTO relation_name VALUES (xx,xx,xx) ``` - COURSES | No | Course-Name | Unit | | No | Course-Name | Unit | |:-----:|:-------------:|:----:|:------:|:---------:|:-------------:|:-----:| | CIS17 | Intro to java | 5 | -> | CIS17 | Intro to java | 5 | | CIS15 | Intro to C | 5 | insert | CIS15 | Intro to C | 5 | | CIS19 | Unix | 4 | -> | CIS19 | Unix | 4 | | CIS51 | Networking | 5 | | CIS51 | Networking | 5 | | | | | | **CIS52** | **TCP/IP** | **6** | - Delete Operation ```SQL= DELETE FROM relation_name WHERE criteria ``` - COURSES | No | Course-Name | Unit | | No | Course-Name | Unit | |:--------:|:--------------:|:-----:|:------:|:-----:|:-------------:|:----:| | CIS15 | Intro to C | 5 | | CIS15 | Intro to C | 5 | | CIS17 | Intro to java | 5 | -> | CIS17 | Intro to java | 5 | | CIS19 | Unix | 4 | delete | CIS51 | Networking | 5 | | **CIS5** | **Networking** | **5** | -> | CIS52 | TCP/IP | 6 | | CIS52 | TCP/IP | 6 | | | | | - Update Operation ```SQL= UPDATA relation_name SET attribute1 = value1, attribute2 = value2... WHERE criteria ``` - COURSES | No | Course-Name | Unit | | No | Course-Name | Unit | |:-----:|:-------------:|:----:|:------:|:-----:|:-------------:|:----:| | CIS15 | Intro to C | 5 | | CIS15 | Intro to C | 5 | | CIS17 | Intro to java | 5 | -> | CIS17 | Intro to java | 5 | | CIS19 | Unix | 4 | update | CIS19 | Unix | 4 | | CIS51 | Networking | 5 | -> | CIS51 | Networking | **6** | | CIS52 | TCP/IP | 6 | | CIS52 | TCP/IP | 6 | - Project Operation ```SQL= SELECT attribute_list FROM relation_name ``` - COURSES | No | Course-Name | Unit | | No | Unit | |:-----:|:-------------:|:----:|:-------:|:-----:|:----:| | CIS15 | Intro to C | 5 | | CIS15 | 5 | | CIS17 | Intro to java | 5 | -> | CIS17 | 5 | | CIS19 | Unix | 4 | project | CIS19 | 4 | | CIS51 | Networking | 5 | -> | CIS51 | 5 | | CIS52 | TCP/IP | 6 | | CIS52 | 6 | - Join Operation ```SQL= SELECT attribute_list FROM relation1, relation2 WHERE criteria ``` - join - COURSES                | No | Course-Name | Unit | |:-----:|:-------------:|:----:| | CIS15 | Intro to C | 5 | | CIS17 | Intro to java | 5 | | CIS19 | Unix | 4 | | CIS51 | Networking | 5 | | CIS52 | TCP/IP | 6 | - TAUGHT-BY | No | Professor | |:-----:|:---------:| | CIS15 | Lee | | CIS17 | Lu | | CIS19 | Walter | | CIS51 | Lu | | CIS52 | Lee | - Result | No | Course-Name | Unit | Professor | |:-----:|:-------------:|:----:|:---------:| | CIS15 | Intro to C | 5 | Lee | | CIS17 | Intro to java | 5 | Lu | | CIS19 | Unix | 4 | Walter | | CIS51 | Networking | 5 | Lu | | CIS52 | TCP/IP | 6 | Lee | - Union Operation ```SQL= SELECT * FROM relation1 UNION SELECT * FROM relation2 ``` - Union - CIS15-Roster                 | StudentID | F-Name | L-Name | |:-----------:|:------:|:------:| | 145-67-6754 | John | Brown | | 232-56-5690 | George | Yellow | | 245-89-6580 | Anne | Green | | 459-98-6789 | Ted | Purple | - CIS52-Roster  | StudentID | F-Name | L-Name | |:-----------:|:------:|:------:| | 145-67-6754 | John | Brown | | 232-56-5690 | George | Yellow | | 342-88-9999 | Rich | White | - Result | StudentID | F-Name | L-Name | |:-----------:|:------:|:------:| | 145-67-6754 | John | Brown | | 232-56-5690 | George | Yellow | | 245-89-6580 | Anne | Green | | 459-98-6789 | Ted | Purple | | 342-88-9999 | Rich | White | - Intersection operation ```SQL= SELECT* FROM relation1 INTERSECT SELECT* FROM relation2 ``` - Intersection - CIS15-Roster | StudentID | F-Name | L-Name | |:-----------:|:------:|:------:| | 145-67-6754 | John | Brown | | 232-56-5690 | George | Yellow | | 245-89-6580 | Anne | Green | | 459-98-6789 | Ted | Purple | - CIS52-Roster | StudentID | F-Name | L-Name | |:-----------:|:------:|:------:| | 145-67-6754 | John | Brown | | 232-56-5690 | George | Yellow | | 342-88-9999 | Rich | White | - **Result** | StudentID | F-Name | L-Name | |:-----------:|:------:|:------:| | 145-67-6754 | John | Brown | | 232-56-5690 | George | Yellow | - Different Operation ```SQL= SELECT * FROM relation1 MINUS SELECT * FROM relation2 ``` - Different - CIS15-Roster | StudentID | F-Name | L-Name | |:-----------:|:------:|:------:| | 145-67-6754 | John | Brown | | 232-56-5690 | George | Yellow | | 342-88-9999 | Rich | White | - CIS52-Roster  | StudentID | F-Name | L-Name | |:-----------:|:------:|:------:| | 145-67-6754 | John | Brown | | 232-56-5690 | George | Yellow | | 245-89-6580 | Anne | Green | | 459-98-6789 | Ted | Purple | - Result | StudentID | F-Name | L-Name | |:-----------:|:------:|:------:| | 245-89-6580 | Anne | Green | | 459-98-6789 | Ted | Purple | --- ## 14-6 Database Design ```graphviz digraph{ nodesep=1 node[color = pink, fontname=Courier,shape=box] edge [color=pink,style=line] 了解顧客需求->邏輯資料設計->實體資料設計->建立資料庫 } ``` ### Entity-Relationship Diagram(E-R) ```graphviz digraph{ nodesep=1 node[color = pink, fontname=Courier,shape=box] edge [color=pink] { No, Name, Unit }->Course { s_id[flag=ID] s_name[flag=Name] s_add[flag=Address] }->Student { p_id[flag=ID] p_name[flag=Name] p_add[flag=Address] node[shape=diamond] }->Professor Student->takes->Professor Course->teaches->Professor } ``` ### Normalization Normalization is the process by which a given set of relation are transformed to a new set of relations with a more solid structure. - First normal form (1NF) - 避免資料表上有多筆資料的情況 - 分割放置在不同row - Second normal form (2NF) - 某欄位只與Primary Key 的部分有相依性 - 拆解出來建一個新表 - Also having 3NF, 4NF... --- ## 14-7 Other Database Models - 分段式資料庫 - 片段分散式資料庫(fragmented distributed database)<br> 資料局部分散在各電腦,每部電腦維護部分資料庫。 - 複製分散式資料庫(replicated distributed database)<br> 一部電腦擁有另一部電腦精確複製的資料,儲存於某一部電腦資料所駔的修改,會確切的在其他電腦重複該動作。 - 物件導向資料庫(object-oriented database) - 例:Jasmine、Alltalk、GemStone、O2 --- # Ch15 Data Compression ## 15-1 Introduction ### Objectivese - Realize the need for data compression - Differentiate between lossless and lossy compression - Understand three lossless compression encoding techniques - run-length - Huffman - Lempel Ziv - Understand two lossy compression methods - JPEG - MPEG - Understand the main idea behind the MP3 standard for compressing audio ### Data compression methods - Data compression methods are either lossless(all information is recoverable) or lossy(some information is lost) - Data compression takes the original message and reduces the number of bits that are to be transmitted ## 15-2 Lossless Compression Methods ### Run-length encoding - replace consecutive repeating occurrences of a symbol by number of occurrences - AAAAAABBBBBB -> A6B6 - for two symbols - encode one symbol which is more frequent than the other - This example only encode 0 between 1 - {00001|0100} {0001|0011}{1|0000} {0001|0011} ### Huffman encoding - Frequency ↑ bits ↓ (fewer bits represent for the letter with higher frequency) - Optimization Algorithm - 二元樹避免歧異 ### Morse Code - 1836 built by Samuel F.B. Morse - 1838 Morse improve Telegraph receiver - 1844 The USA built 60km - 1949 G.K. Zipf 統計每一英文字母出現的頻率 - 缺點: - 良好的壓縮率 - 太慢,讀取二次才可編碼 - 第一次統計各符號的數量 - 第二次對訊息編碼 - LZ解決問題 ### Lempel Ziv encoding - Dictionary-based encoding - GIF and TIFF using this - One parsed ## 15-3 Lossy Cpmpression Methods ### Format #### GIF - Compression:Lempel Ziv - 優點: - 可有效節省壓縮空間,提高壓縮比例 - 支援動畫(將多幅的圖案合為一個GIF檔案後依序顯示) - 缺點:僅支援256色 #### TIFF - Compression:LZ, Huffman Coding, Run Length Encoding - 色彩: - 單色 - 灰階 - 全彩 #### PCX影像壓縮技術 - 以Run Length Encoding 作為壓縮的核心技術 - 優點:演算法簡單、易懂且易實作 - 缺點:壓縮率會隨著影像複雜度的不同而有所改變。遇到重複性低的影像資料,PCX處理後的影像空間有時反而會不減反增 #### JPEG - 由ISO及CCITT在1991年發表 - JPEG為一種有損影像壓縮法,解壓縮後無法完全恢復為原始影像 - JPEG壓縮效果比較好,成為現今最常用的影像壓縮法之一 - 壓縮全彩或是8位元的灰階影像 - Compression:[DCT](####DCT) <br>(過程詳情請見講義) ### Animation and Video #### MPEG - [I-frames](####I-frames) - [P-frames](####P-frames) - [B-frames](####B-frames) ### Audio #### MP3 - 1987由德國Fraunhofer與Erlangen大學共同制定的數位聲音壓縮演算法 - MP3是MPEG影音壓縮演算法中針對聲音部分所制定的演算法標準 - 原理:為降低聲音失真度,MP3利用人耳聽覺的特性,去除掉人耳聽不到的過高或過低的頻率,大幅減少儲存空間 #### 聲音檔比較 | 格式 | 簡介 | 優點 | 缺點 |example| |:----:|:--------------------:|:--------:|:--------------:|:-:| | WAV | 從CD或直接錄音而來 | 原音呈現 | 占空間 |44.1kHz 16bits <br>單聲道取樣<br>22675KB 263秒| | MP3 | 聲音檔的壓縮格式之一 | 壓縮率高 | 某種程度的失真 | 24kHz 16bits <br>單聲道取樣<br>1029KB 263秒| ### Compression Methods #### DCT - F(i,j) - Uniform Gray Scale - Case 1: Uniform gray scale - Case 2: Two sections - Gradient Gray Scale - Case 3: - Quantization & Compression - Reading the table #### I-frames - It is an **independent frame** that is not related to any other frame. - JPEG encoding - They are presented at regular intervals. - 12~15 images having 1 I-frame #### P-frames - Motion compensation can be predicted from the previous I-frame and P-frame changes - MPEG:Motion compensation - only saving different part #### B-frames