owned this note changed 2 years ago
Linked with GitHub

資料庫系統概論曾新穆

ER-Mod****el

Relational Data Model ( Ted Codd )
Before = Network Data Model (Cobol ans DDL, DML)
Very contentious: Database Wars (Charlie Bachman vs. Mike Stonebraker) -> Postgres->IIIustra->Informix->IBM
1st prototypes:
Ingres -> CA
Postgres -> IIIustra -> Informix -> IBM
System R -> Oracle, DB2

Why Relations?

Math:
sets: R = {1,2,3} S = {3,4}
A relation on R, S is any subset U R x S (e.g: {(1,4), (3,4)} )
DB:
Branch = {A, B, }
Accounts = { a, b, }
=> A U B = A x B ( e.g: {(A,a) (A,b) (A,) (B,a) (B,b) (B, ) } )

  • Key

Keys and Relations

Key can map to Relations
Candidate key
=> Primary key <> Composite key
=> Super key
(Key can also check integrity constraints)

Key Reference: Wiki

Enhanced ER Modle (EER) (ch13)

  • Overview
    • 1980s
    • add semantic concepts (Superclass and Subclass)
  • Superclass and Subclass
    • Superclass
      entity type include one or more distinct subgroupings of its occurences
    • Subclass
      distinct subgrouping of occurrences of an entity type
    • Relationships
  • attribute inheritance (屬性繼承)
  • Specialization (差異最大化)
    The process of maximizing the differences between members of an entity by identifying their distinguishing characteristics
  • Generalization (最小化)
    The process of maximizing the differences between entities by identifying their common characteristics
  • Participation constraint
    • mandatory
    • optional
  • Disjoint constraint
    • and (subclass 可兼職)
    • or (subclass 不可兼職)

(img source: http://goo.gl/TyMvRM)

(img source: http://goo.gl/xfxD9K)

  • Aggregation
    • has-a
    • is-part-of
    • whole
    • part

The Relational Model (ch4)

  • Overview

    • Terminology
    • tables use to replace data
    • connection between mathematical relations and relations in the relational model
    • properties
    • candidate, primary, alternate, foreign keys
    • meaning of entity integrity and referential integrity
  • Terminology

    • Relation: a table
    • Attribute: column of relation
    • Domain: set of allowable attributes
    • Tuple: row of relation
    • Degree: number of attribute
    • Cardinality: number of tuple
    • Relation database
  • Key

    • Super key
    • Candidate key
    • Primary key
    • Foreign key
    • Alternate key (candidate key that is not primary key)

(URL : http://www.mysql.tw/2015/04/super-keycandidate-keyprimary.html )

  • Mathematical Definition
    • subset of cartesian product (D1 x D2 = (d11,d21)}) (R = {x,y x ∈ D1, y ∈ D2)
  • Relations
    • Relation schema: relation defined by set of attribute and domain name pairs
      ex. {(a,b,c,d,e)}
    • Relational database schema: set of relation schema
      R = {R1,R2,R3}
    • Relation name must distinct
    • each cell of relation is single value
    • each attribute has distinct name
    • value of attribute are from same domain
    • each tuple is distinct
    • order of attributes not important
    • order of tuple not important
  • Integrity constraints
    • NULL: attribute value is currently unknown or not applicable for tuple
      • Deals with incomplete or exceptional data
      • Represents the absence of a value and is not the same as zero or spaces(value)
      • Entity Integrity (primary key cannot be NULL)
    • Referential Integrity
      • foreign key exists in a relation
    • General Constraints
      • additional rules specified by user
  • View
    • Base relation
    • View
      • virtual relation
      • been defined as query on base relations
      • dynamic
    • Purpose
      • powerful, flexible, security (hiding parts of info from certain users)
      • customized way by different users
      • simplify
    • Update

Relational Algebra and Relational Calculus (ch5)

  1. Introduction
  • formal languages associated
  • Relational algebra: high-level procedural language
  • Relational calculus: non-procedural language
  • equivalent to one another
  • relationally complete: A language produces a relation that can be derived using relational calculus
  1. Relational Algebra
  • work on relation to define another relation without changing the original relations

  • operands and results are relations

  • nesting

  • 5 Basic operations

    • Selection, Projection, Cartesian product, Union, Set Difference
  • Other operations

    • Join, Intersection, Division
  • Selection

    • σ®
    • tuple(rows) of R satisfy specified condition
  • Projection

    • π®
    • vertical subset of R that extracting the values of specified attributes and eliminating duplicates -> 可能會有多人有同一特徵
  • Union

    • R∪S
      • R and S must union-compatible
  • Set difference

    • R-S
    • R and S union-compatible
  • Intersection

    • R∩S
    • union-compatible
    • R∩S = R - (R-S)
  • Cartesian product

    • R × S
  • Join

    • derivative of cartesian product
    • forms of join
      • Theta join

        • contains tuples satsfying the predicate F from the Cartesian product of R and S
        • degree is the sum of R and S
        • the predicate F is of the form R.ai theta S.bi where theta may be one of the operator of (> = <)
      • Equijoin (theta join with operator only is =)

      • Natural join

        • Equijoin of two relations R and S and elimate redundont attriable
      • Outer join

        • display row in the result that do not have matchine values in the join column
        • (Left) outer join is join in which tuples from R that do not have
          • (顯示沒有被join的cotition procuct)
      • Semijoin

        • defines a relation that contains the tuples of R that participate in the join of R with S
        • useful if we only want to see the full attributes of staff for the result
  • Division (/)

    • defines a relation over the attributes C that consists of set of tuples from R that match combination of every tulpel in S
    • expression of basic operation
    • 選出元素必須含有全部被除數 (有點像整除的概念)
  • Aggregate Operations

    • the number of elements in set
    • τcount
    • ex: how many properties cost more than 350 rant per month
  • Grouping Operation

    • GAτAL®
  1. Relational Calculus
  • tuple based
    • Interested in finding tuples
    • Specify range of a tuple variable S as the Staff relation as: Staff(S)
    • ex: Find details of all staff earning 10000: S Staff(S) ^ S.salary = 10000
    • 較像要找到的東西 (algebra 較像尋找方法)

SQL (ch6)

  • Objectives of SQL
    • create database and relation structures
    • insertion modification deletion
    • queries
  • command
    • reserved words, user-defined words
    • case insensitive
    • BNF notation
      • Upper-case for reserved words
    • Literals
      • non-number ' '
      • number
    • SELECT
      • SELECT [DISTINCT | ALL]
        * [columnExpression[AS newName]][,]
        FROM TableName[alias][,]
        [WHERE condition]
        [GROUP BY columnList][HAVING condition]
        [ORDER BY columnList]
      • order of the clauses can't change
      • [ ] for optional
      • { } for must
      • ; end
      • *: all column
      • DISTINCT : list without reduncicy
      • AS : rename when viewing(won't change database)
      • WHERE BETWEENAND, NOT, >=<, LIKE, %name%(squence of char), _(just one char)
      • comment IS NULL
      • % for 1 or upper number of non-define char
      • _ just 1 char
      • ORDER by
      • COUNT(*), MIN, MAX -> to numeric and non-meric
      • SUM, AVG -> numeric
        • 除了COUNT,其它的實作都會先把NULL去除
        • Aggregate function can be used only in SELECT list and in HAVING cause
        • COUNT(DISTINCT selecting_row)
      • HAVING 搭配 GROUP 用
        • ex: HAVING COUNT(staff) > 1
    • Subquery (去別的table查詢)
      • SELECT
      • can't use ORDER BY
      • must used only single column name or expression except for EXIT
    • SOME/ANY/ALL
    • Multi-Table quary
      • can used alias
      • JOIN
        • NATUAL JOIN
      • ON
      • USING
      • EXISTS
    • Union/Intersect/Difference/Except
      • operator [ALL] [CORRESPONDING [ BY {column1 [,] } ] ]
      • INSERT INTO {()}
      • VALUES()
    • UPDATE
      • SET
    • DELETE FROME

Objectives (CH7)

  • Required Data
    • position VARCHAR(10) NOT NULL
  • Domain Constraints
    • CHECK
      sex CHAR NOT NULL
      CHECK(sex IN('M','F')
    • CREATE DOMAIN:
      CREATE DOMEAIN DomainName[AS] dataType
      [DEFAULT defaultOption]
      [CHECK (searchCondition)]
      [CHECK (searchCondition)]
      CREATE DOMAIN DOMAIN SexType AS CHAR
      CHECK(VALUE IN('M','F'));
      sex SexType NOT NULL
    • PRIMARY KEY( )
    • FOREIGH KEY( ) REFERENCES branchs -> ALTER TABLE
    • UNIQUE
  • Update and Delete
    • CASCADE
    • SET NULL
    • SET DEFAULT
    • NO ACTION
    • ALTER TABLE
      • CREATE ASSERTION
        CREATE ASSERTION j bn

      • catalogs

      • schema (no include data: table,view,doman)

      • CREATE ACHEMA [Name | AUTHORIZATION Creatorld]

      • DROP SCHEMA Name [RESTRICT | CASCADE]
        CREATE DOMAIN OwnerNiumber AS VARCHAR(5)
        CHECK(VALUE IN(SELECT ownerNo FROM PrivateOwner));
        CREATE DOMAIN StaffNumber AS VARCHAR(5)

        ​​​​​​​​​​  ALTER TABLE Staff
        ​​​​​​​​​​  ALTER position DROP DEFAULT;
        ​​​​​​​​​​  ALTER TABLE
        
  • View
    • Virtual relation
    • defined as query
      CREATE VIEW ViewName [ (newColumnName[])]
      AS subselect
      [WITH [CASCADED | LOCAL] CHECK OPTION]
    • WITH CHECK OPTION
    • UPDATE SET
    • DELETE FROM (TABLE/VIEWING)
    • WITH CHECK OPTION
  • Transaction

Normalization(CH14)

  • produce a set of suitable relations
  • suitable set
    • minimal number of attribute
    • close logical relationship
    • minimal redundancy
  • purpose
    • maintain
    • assess
  • Lossless-join and dependency
  • Functional Dependency
    • full functional dependency
    • partial dependency
    • one-to-one relationship between left and right hand side
    • hold for all time
    • determinat has minimal number of attributes
  • Transitive dependency
    • a -> b and b -> c then c is functional delpency on a
  • Determinants
    • attribuate 越少越好
  • Inference rule
    • idendify a set of minima functional dependencies
    • X+ is the extend of X
    • A, B, C are subsets of the attributes of the relation R -> Armstrong's axioms:
      • Reflexivity
        • If B is subset of A, then A -> B
      • Augmentation
        • A-> B, then A,C -> B,C
      • Trans
      • Self-determination
        • A -> A
      • Deco
      • mposition
        • A -> B,C, then A -> B and A -> C
      • Union
      • Composition
    • Minimal set
      • Every dependency in X has a single attribute on its right-hand side
      • cannot replace any dependency A->B to C->B where C is a proper subset of A
      • cannot remove dependency from X and still have a set of dependencies that is equivalent to X
    • BCNF
      • A relation is in BCNF if
      • defference between 3NF and BCNF is that for a functional dependency of
      • suppose X -> A violates BCNF, thenone of the following holds
        • X is a subset of some key K
        • X is not a subset of
    • Third Normal Dorm
      • BCNF + A is part of some key for R
      • similar to that of BCNF, with the only difference being the third condition
      • a key for a relation is a minimal set of attributes that uniquely determines all other
      • X -> A cause a violation of 3NF:
        • X is a proper subset of some key K - partial dependency
        • X is not a proper subset of any key - transitive dependency
    • Decomposition
      • R1 U R2 U R3 U U Rn = R => Decomposition
      • Theorem:
        • R: a relation schema
        • F: set of functional dependencies on R
        • The decomposition of R into relations with attribute sets R1, R2 is a lossless-join decomposition iff
          • (R1 ^ R2) -> R1 in F
          • or
          • (R1 ^ R2) -> R2 in F
    • Dependency Preservation
      • s

Objectives(CH15)

  • two main techniques for query optimization
    • heuristic rules that order operations in a query
    • comparing different stategies based on relative costs, and selecting one that minimizes resource usage
  • disk assess tends to be dominant cost in query processing for centralized DBMS
  • Query Processing
    • activities involved in retrieving data from that database
    • aim:
      • transform query written in high-level language, into correct and efficient execution strategy expressed in low-level language
      • execute strategy to retrieve required data
    • reduce total execution time and response time

Query Processing(CH23)

  • analysis
    • query tree
  • normalization
    • conjunctive normal form (Product Of Sum)
    • disjunctive normal form (Sum Of Product)
  • Semantic analysis
    • relation connection graph
      • node for relation and result
      • edge for join and source of projection operations
    • normalized attribute connection graph
      • node for reference to attribute, or constant 0
      • directed edge between nodes represent to a join, and between node and constant 0 represents a selection operation
      • weight
      • if cycle, the sum must be positive
    • simplification (反正把表格變小就對了)
    • query restructuring
  • heuristical processing strategies
    • selection operations as early as possible
    • c

Transaction
ACID

Select a repo