# Relational Model
:::info
[TOC]
:::

<br/>
## Introduction
The relational model is a foundational data model in database systems that represents data using tables (relations).
It provides a simple, mathematically defined structure that supports precise querying and data integrity, and it forms the basis of SQL and modern relational databases.
1. Defines a model of data
- Specifies how data is **structured, represented, and interpreted**.
2. Data → **a set of relations**
- The entire database is viewed as a **collection of tables (relations)**, where each relation is a **set of tuples**.
<br/>
### ★ Eample Relation
Relation: ==Student(SID, Name, Major)==
| SID | Name | Major |
|-----|-------|--------|
| 1 | Alice | CS |
| 2 | Bob | EE |
- Each row is a **tuple**
- Each column is an **attribute**
- The table as a whole is a **relation**
- Duplicate rows are not allowed
<br/>
## Properties
1. Each relation has **at least one key**
→ A **key** is a set of attributes whose values are **unique** across all tuples.
2. Attributes **without duplicate values**
→ No two tuples can share the same key value.
- ✰ Example: ISBN _(International Standard Book Number)_ ++uniquely++ identifies each book.
3. Each tuple is **uniquely identifiable**
→ A key guarantees that every tuple can be distinguished from all others.
> ++Reordering attributes/tuples++ does not change the relation.
4. Attribute domain
→ The **set of all possible values** that an attribute can take.
- ✰ Example: The domain of attribute `price` is real numbers between `$0` and `$1000`.
- Each attribute value is **indivisible** (e.g., integer, string).
5. Value of an attribute cannot be a relation
→ An attribute cannot contain a table or set of values.
6. **Schema (pattern)** of a relation
→ The **meta-data** that describes the structure of a relation.
- Includes:
- ✰ Example: **Book(ISBN, title, price, year)**
- `Book`: Name of the relation
- `ISBN, title, price, year...`: Names of attributes
- `year > 0, ...`: Domains of attributes
- Constraints on attributes (e.g., key, non-key)
- Instance: _(Like an obj instance of class `List()` in Python)_
```S
[
(11, ‘AI Systems’, $102, 2001),
(22, ‘Cell’, $201, 1954)
]
```
> A schema may have **numerous instances**, just like there can be many different tables.
<br/>
## Relation Database (DB)
A relational database is a **set of relations**.
- **DB schema**: the set of **schemas of all relations** in the database
- ✰ Example:
- **Book(ISBN, title, price, year)**
- **Category(CID, name)**
- **Book-Category(CID, ISBN)**
> Equal to structure of all tables.
- **Instance of a DB schema**: a set of **instances of its relations**
→ The actual data stored in the database at a given time.
> Means the actual data in all tables.
<br/>
## Relational Languages
Relational languages are formal languages used to **query and manipulate data** in a relational database. They provide a **mathematical foundation** for operations on relations.
- Main types:
1. **Datalog**: logic-based, rule-driven language
2. **Relational algebra (RA)**: procedural, specifies *how* to retrieve
3. **Relational calculus (RC)**: declarative, specifies *what* to retrieve
| Language | Type | What/How | Example Focus |
|----------|------------|-----------|--------------------------|
| **Datalog** | Logic | What | Rules / Facts / Queries |
| **RA** | Procedural | How | Operations on tables |
| **RC** | Declarative| What | Specify conditions only |
:::info
**RA** = **Safe Nonrecursive Datalog** = **Domain-Independent RC**
:::
:::warning
- Avoid using Rules that cannot be evaluated in a finite time (unsafe rules)
- A rule is safe if every variable ++appear in at least one positive atom (body)++
:::
<br/>
### 1️⃣ Datalog (nonrecursive)
Specify facts and rules, querying database using logic.
- Each ++tuple++ in a DB is a ++**fact**++:
- ✰ Example:
```S
Movie(8,‘Godfather II’, 1974, 390)
Actor(9, ’Robert De Niro’, 1943)
```
- Each ++query++ is a ++**rule**++:
- ✰ Example ①:
```S
Q1(y):- Movie(x, y, 1998, 20)
```
> (Movies that were produced in 1998 and made $20)
- ✰ Example ②:
```S
Q2(y):- Actor(x, y, z), Plays(t, x), Movie(t, v, w, 20)
```
> (Actors who played in a movie whose total gross is $20)
- ✰ Example ③:
```S
ExpensiveBook(B):- Book(B, _, P, _), P > 150
```
> May use `=`, `<`, `>` in rules
> (`B` is an ExpensiveBook if `B` is a Book and its price `P > 150`)
- ✰ Example ④:
```S
Q4(y):- Actor(x, y, z), Plays(t, x), Movie(t, v, w, 20)
Q4(y):- Actor(x, y, z), Plays(t, x), Movie(t, v, 1990, h)
```
> ++Union of rules++
> (Actors who played in a movie with gross of $20 or a movie made in 1990)
- ✰ Example ⑤:
```S
U5(x, y, z):- Actor(x, y, z), Plays(t, x), Plays(t, f), Actor(f, 'William', g)
Q5(y):- Actor(x, y, z), not U5(x,y,z)
```
> ++Using negation++ (All actors who did not play in a movie with "William")
:::success
==Head `:-` Body== acts like implication (`←`) in logic
==Atom== is the basic fact in the **body**.
:::
<br/>
### 2️⃣ Relational Algebra (RA)
Algebraic view on representing queries, performing operations on relations and returns another relation.
- **Key operators**:
| Operation | Operator | Description |
|-|-|-|
| Selection (filter tuples) | `σ` | Select a subset of rows from relation |
| Projection (select columns) | `π` | Select desired columns from relation |
| Join | `⋈` | Combine tuples from two relations based on a condition |
| Intersection | `∩` | Tuples that appear in **both** relations |
| Union | `∪` | Tuples in reln. 1 and in reln. 2. |
| self-Difference | `-` | Tuples in reln. 1, but not in reln. 2 |
| Cartesian Product (cross-product) | `×` | Combine two relations |
<br/>
- ✰ Example ①:
$$π_{title} (σ_{price>150} (Book))$$
> (Find titles of books priced over 150)
- ✰ Example ②:
$$Actor_{⋈} (𝐴𝑐𝑡𝑜𝑟.𝑎𝑖𝑑 =𝑃𝑙𝑎𝑦𝑠.𝑎𝑖𝑑) Play$$
> Perform a **join** between `Actor` and `Play` where `Actor.aid = Plays.aid`.
> More meaningful than cross-product
:::success
**Union-Compatible**:
- Two input relations have same number of attributes
- Corresponding attributes have the same domain (type)
- The output has the same schema as input
:::
<br/>
### 3️⃣ Relational Calculus (RC)
Describe properties of the desired result without specifying computation steps and apply **first-order logic**.
1. It is a **Declarative** Query Language.
2. SQL and RA can theoretically be **mapped to RC**.
3. It doesn't describe "++how to do it++", but only "++what conditions the answer must meet.++"
> Query results = All variable assignments that make the logical formula **True**
<br/>
- **Key Predicates**:
| Relational Predicate | Meaning / Description |
|-|-|
| `Atom` | A basic fact or relation, e.g., `Actor(x, y, z)` |
| `P ∧ P` (AND) | **Conjunction**: both predicates must be true |
| `P ∨ P` (OR) | **Disjunction**: either predicate can be true |
| `P ⇒ P` | **Implication**: if first predicate is true, second must be true |
| `not(P)` | **Negation**: predicate is false |
| `∀x.P` | **Universal quantifier**: P holds for all x |
| `∃x.P` | **Existential quantifier**: P holds for at least one x |
| `Query Q(x1, …, xn)` | A query is a predicate over selected variables |
:::spoiler **Quantifier Negation**
<br/>
| Original | Equivalent | Intuitive Meaning |
|-|-|-|
| ¬∀x P(x) | ∃x ¬P(x) | **Not all satisfy P → At least one does not satisfy P** |
| ¬∃x P(x) | ∀x ¬P(x) | **No element satisfies P → All tuples do not satisfy P** |
:::
<br/>
- Two Forms of RC:
| Form | Name | Description |
|-|-|-|
| **TRC** | Tuple Relational Calculus | The variable represents a "Tuple" |
| **DRC** | Domain Relational Calculus | The variable represents the "Field Value" |
<br/>
> ⦿ DRC Syntax
> `Q(x1, x2, ..., xn) := P(x1, x2, ..., xn)`
> - `Q(...)`: Query result schema
> - `P(...)`:: Logical Conditions
<!-- $$
\{ t.title \mid Book(t) \wedge t.price > 150 \}
$$
$$
\{ t \mid \exists ISBN, price, year (Book(ISBN, t, price, year) \wedge price > 150) \}
$$
<br/> -->
- ✰ Example ①:
```S
Q(name) := ∃a, b, m, t, y (Actor(a, name, b) ∧ Plays(m, a) ∧ Movie(m, t, y, 2000))
```
> (Actors who played in a movie with total gross of $2,000)
- ✰ Example ②:
```S
Q(name) := ∃a, b (Actor(a, name, b) ∧ ∀m ( Plays(m, a) ⇒ ∃t, g (Movie(m, t, 1990, g)) ))
```
> (All the films the actor has starred in were produced in 1990)
> `∀m ( Plays(m, a) ⇒ ... )` → ++If `a` played `m`, then `m` must have been born in 1990++
- ✰ Example ③:
```S
Q(name) := ∃a, b, m (Actor(a, name, b) ∧ Plays(m, a) ∧ ¬∃a2 ((a2 ≠ a) ∧ Plays(m, a2)))
```
> (Find the actors who have acted in "one-man show films")
> `Plays(m, a)` → `a` is in movie `m`
> `(Not Exist) ¬∃a2 ≠ a ...` → There are no other actors who also acted in `m`
> Hence, `m` has only one actor.
- ✰ Example ④: **(Negation)**
> Negation must be restricted to the known relation.
```S
Q(name) := ∃a, b (Actor(a, name, b) ∧ ¬∃m (Plays(m, a) ∧ ∃f, g (Actor(f, 'Sharon', g) ∧ Plays(m, f))))
```
> (Actors who have never worked with "Sharon")
- ✰ Example ⑤: ==(Unsafe / Domain-Dependent RC)==
```S
❌ Q(x) := not Actor(x, y, z)
```
> `x` can be any value in any universe (infinite),
> Result in an infinite number of answers.
- ✰ Example ⑥: **(Correspondence between ++RC++ and ++SQL++)**
- _**RC**_
```R
∃a, b, m, t, y (Actor(a, name, b) ∧ Plays(m, a) ∧ Movie(m, t, y, 2000))
```
- _**SQL**_
```SQL
SELECT A.name
FROM Actor A, Plays P, Movie M
WHERE A.aid = P.aid AND P.mid = M.mid AND M.gross = 2000
```
> SQL WHERE == RC Predicate
<br/>
## Conclusion
- The relational model provides a strong theoretical foundation for databases.
- **SQL** is translated internally into **RA**.
- **Datalog** and **RC** are widely used for reasoning, constraints, and knowledge representation.
- **RA, safe nonrecursive Datalog, and domain-independent RC are equivalent in expressive power**.
<br/>
:::spoiler Relevant Resource
[Ramakrishnan Database Management Systems](https://raw.githubusercontent.com/pforpallav/school/master/CPSC404/Ramakrishnan%20-%20Database%20Management%20Systems%203rd%20Edition.pdf)
<!-- [SQL Tutorial](http://w3schools.com/sql/) -->
:::