<!--
> Alguns gajos que tiveram ganda 20:
> [GIL](https://github.com/pdsam/Bazooki/wiki/a01)
> [MOAS](https://github.com/MOAAS/LBAW/wiki/a1)
> [EDU](https://github.com/EduRibeiro00/NewsLab-feup-lbaw)
> [Rui E Miguel duarte](https://github.com/xRuiAlves/FEUP-LBAW)
-->
# Database Specification (EBD)
## A4: Conceptual Data Model
This artefact specifies the UML diagram for our database, which will serve as a guide when coding the actual database for our website.
### 1. Class diagram

### 2. Additional Business Rules
| Identifier | Name | Description |
| ---------- | -------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------- |
| ABR01 | Deleted User keeps Content | In UML the 0-1 association to the User mean that if the user is deleted, the Comment is kept, but displayed as "Created by deleted user" |
| ABR02 | Order Date | The date of the order must be before the date of the respective event. |
| ABR03 | Calculate Total | The total price of an order is calculated by the product of the Number of tickets and Price of the event. |
| ABR04 | Product Name and Vendor unique | For a given vendor, there can not exist two products with the same name. |
---------------------------------------------------
## A5: Relational schema, validation and schema refinement
This artifact contains the Relational Schema obtained by mapping from the Conceptual Data Model. The Relational Schema includes the relation schema, attributes, domains, primary keys, foreign keys and other integrity rules: UNIQUE, DEFAULT, NOT NULL, CHECK.
### 1. Relational Schema
| Relation Reference | Relation Compact Notation |
| ------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| R01 | user(<ins>user_id</ins>, first_name **NN**, last_name **NN**, email **NN** **UK**, password **NN**, role **IN** UserTypes, active **NN** , img_path **DF** UserImg) |
| R02 | blog_post(<ins>blog_post_id</ins> , title **NN**, img_path **NN**, content **NN**, publication_date **NN** **DF** NOW ,author **NN**, editor → user **NN** ) |
| R03 | event(<ins>event_id</ins>, start_date **NN** , end_date **NN**, price **NN** **CK** price > 0,publication_date **DF** NOW, title **NN**, img_path **NN**, description **NN**, editor_id → user **NN**, location_id → location, category → event_category , **CK** (start_date < end_date )
| R04 | category(<ins>category_id</ins>, name **NN** **UK**, img_path) |
| R05 | blog_category(<ins>blog_post_id</ins> → blog_post, <ins>category_id</ins> → category) |
| R06 | event_category(<ins>event_id</ins> → event, <ins>category_id</ins> → category) |
| R07 | vendor(<ins>vendor_id</ins>, name **NN**, job, location_id → location,img_path) |
| R08 | product(<ins>product_id</ins>,vendor_id → vendor, name **NN**, img_path **NN**, **UK**(vendor_id, name)) |
| R09 | comment(<ins>comment_id</ins>, user_id → user, content **NN**, date **NN** **DF** NOW, blog_post_id → blog **NN** ) |
| R10 | location(<ins>location_id</ins>, address **NN**, zip_code **NN**, dcp_id→district_county_parish **NN**) |
| R11 | district_county_parish(<ins>dcp_id</ins>, district **NN**, country **NN**, parish **NN**) |
| R12 | order(<ins>order_id</ins>, date **NN** **DF** NOW, code, number_tickets **NN**, event_id → event, user_id → user,total **CK** total > 0 )
### 2. Domains
| Domain Name | Domain Specification |
| -------- | -------- |
| UserTypes | ENUM('Visitor','Basic','Premium','Editor','Admin') |
| UserImg | defaultImage.png |
| NOW | DATE DEFAULT CURRENT_DATE |
### 3. Schema validation
| **TABLE R01** | user |
| -------------- | --- |
| **Keys** | {user_id}, {email} |
| **Functional Dependencies:** | |
| FD0101 | {user_id} → {first_name, last_name, email, password, role, active, img_path} |
| FD0102 | {email} → {user_id, first_name, last_name, password, role, active, img_path} |
| **NORMAL FORM** | BCNF |
| **TABLE R02** | blog_post |
| -------------- | --- |
| **Keys** | {blog_post_id} |
| **Functional Dependencies:** | |
| FD0201 | {blog_post_id} → {title, img_path, content, publication_date, author, editor_id, category} |
| **NORMAL FORM** | BCNF |
| **TABLE R03** | event |
| -------------- | --- |
| **Keys** | {event_id} |
| **Functional Dependencies:** | |
| FD0301 | {event_id} → {start_date, end_date, price, title, location, img_path, description, publication_date, editor, category} |
| **NORMAL FORM** | BCNF |
| **TABLE R04** | category |
| -------------- | --- |
| **Keys** | {category_id} |
| **Functional Dependencies:** | |
| FD0401 | {category_id} → {name,img_path} |
| FD0402 | {name} → {category_id,img_path} |
| **NORMAL FORM** | BCNF |
| **TABLE R05** | blog_category |
| -------------- | --- |
| **Keys** | {blog_post, blog_category_id} |
| **Functional Dependencies:** | |
| - | - |
| **NORMAL FORM** | BCNF |
| **TABLE R06** | event_category |
| -------------- | --- |
| **Keys** | {event_id, event_category_id} |
| **Functional Dependencies:** | |
| - | - |
| **NORMAL FORM** | BCNF |
| **TABLE R07** | vendor |
| -------------- | --- |
| **Keys** | {vendor_id} |
| **Functional Dependencies:** | |
| FD0701 | vendor_id → {name, job, location_id, image_path} |
| **NORMAL FORM** | BCNF |
| **TABLE R08** | product |
| -------------- | --- |
| **Keys** | {product_id}, {vendor_id, name}|
| **Functional Dependencies:** | |
| FD0801 | {product_id} → {vendor_id,name, img_path} |
|FD0802 | {vendor_id, name} → {product_id, img_path} |
| **NORMAL FORM** | BCNF |
| **TABLE R09** | comment |
| -------------- | --- |
| **Keys** | {comment_id} |
| **Functional Dependencies:** | |
| FD0901 | {comment_id} → {user_id, content, date, blog_post_id}|
| **NORMAL FORM** | BCNF |
| **TABLE R10** | location |
| -------------- | --- |
| **Keys** | {location_id} |
| **Functional Dependencies:** | |
| FD1001 | {location_id} →{address, zip_code, dcp_id} |
| **NORMAL FORM** | BCNF |
| **TABLE R11** | district_county_parish |
| -------------- | --- |
| **Keys** | {dcp_id} |
| **Functional Dependencies:** | |
| FD1101 | {dcp_id}→{district, county, parish} |
| **NORMAL FORM** | BCNF |
| **TABLE R12** | order |
| -------------- | --- |
| **Keys** | {order_id} |
| **Functional Dependencies:** | |
| FD1201 | {order_id} → {date, code, number_tickets, event_id, user_id, total} |
| FD1202 | {code} → {date, order_id, number_tickets, event_id, user_id, total} |
| **NORMAL FORM** | BCNF |
For a table to satisfy the Boyce-Codd Normal Form, it should satisfy the following two conditions:
* It should be in the Third Normal Form.
* And, for any dependency A → B, A must be a super key.
Since all the relations are in the Boyce-Codd Normal Form the relation scheme is also in the BCNF.
### 4. SQL Code
```sql
DROP TYPE IF EXISTS role CASCADE;
DROP TABLE IF EXISTS "district_county_parish" CASCADE;
DROP TABLE IF EXISTS "location" CASCADE;
DROP TABLE IF EXISTS "user" CASCADE;
DROP TABLE IF EXISTS "blog_post" CASCADE;
DROP TABLE IF EXISTS "event" CASCADE;
DROP TABLE IF EXISTS "category" CASCADE;
DROP TABLE IF EXISTS "blog_category" CASCADE;
DROP TABLE IF EXISTS "event_category" CASCADE;
DROP TABLE IF EXISTS "vendor" CASCADE;
DROP TABLE IF EXISTS "product" CASCADE;
DROP TABLE IF EXISTS "comment" CASCADE;
DROP TABLE IF EXISTS "order" CASCADE;
CREATE TYPE role AS ENUM ('Basic', 'Premium', 'Editor', 'Admin');
CREATE TABLE "district_county_parish" ( --2882
dcp_id SERIAL PRIMARY KEY,
district TEXT NOT NULL,
county TEXT NOT NULL,
parish TEXT NOT NULL
);
CREATE TABLE "location" ( --100
location_id SERIAL PRIMARY KEY,
address TEXT NOT NULL,
zip_code TEXT NOT NULL,
dcp_id integer NOT NULL REFERENCES "district_county_parish"(dcp_id)
);
CREATE TABLE "user" ( --100
user_id SERIAL PRIMARY KEY,
first_name text NOT NULL,
last_name text NOT NULL,
email text NOT NULL CONSTRAINT user_email_uk UNIQUE,
password text NOT NULL,
"role" role NOT NULL DEFAULT 'Basic',
active BOOLEAN NOT NULL DEFAULT true,
image_path text,
user_search TSVECTOR
);
CREATE TABLE "blog_post" ( --30
blog_post_id SERIAL PRIMARY KEY,
title text NOT NULL,
image_path text NOT NULL,
content text NOT NULL,
publication_date timestamp NOT NULL DEFAULT now(),
author text NOT NULL,
editor integer NOT NULL REFERENCES "user"(user_id),
blog_search TSVECTOR
);
CREATE TABLE "event" ( --30
event_id SERIAL PRIMARY KEY,
start_date timestamp NOT NULL,
end_date timestamp NOT NULL,
publication_date timestamp NOT NULL DEFAULT now(),
price money NOT NULL CHECK (price > money(0.0)),
name text NOT NULL,
image_path text,
description text NOT NULL,
editor integer NOT NULL REFERENCES "user"(user_id),
location integer NOT NULL REFERENCES "location"(location_id),
event_search TSVECTOR,
CHECK (end_date > start_date)
);
CREATE TABLE "category" ( --15
category_id SERIAL PRIMARY KEY,
name text NOT NULL UNIQUE,
image_path text NOT NULL
);
CREATE TABLE "blog_category" ( --18
blog_category_id integer REFERENCES "category"(category_id) ON DELETE CASCADE,
blog_post_id integer REFERENCES "blog_post"(blog_post_id) ON DELETE CASCADE,
PRIMARY KEY(blog_category_id, blog_post_id)
);
CREATE TABLE "event_category" ( --18
event_category_id integer REFERENCES "category"(category_id) ON DELETE CASCADE,
event_id integer REFERENCES "event"(event_id) ON DELETE CASCADE,
PRIMARY KEY(event_category_id, event_id)
);
CREATE TABLE "vendor" ( --30
vendor_id SERIAL PRIMARY KEY,
"name" text NOT NULL,
job text,
location_id integer REFERENCES "location"(location_id),
description text,
image_path text NOT NULL,
vendor_search TSVECTOR
);
CREATE TABLE "product" ( --100
product_id SERIAL PRIMARY KEY,
vendor_id integer NOT NULL REFERENCES "vendor"(vendor_id) ON DELETE CASCADE,
"name" text NOT NULL,
image_path text NOT NULL,
UNIQUE(vendor_id, "name")
);
CREATE TABLE "comment" ( --100
comment_id SERIAL PRIMARY KEY,
blog_post_id integer NOT NULL REFERENCES "blog_post"(blog_post_id) ON DELETE CASCADE,
user_id integer NOT NULL REFERENCES "user"(user_id),
"content" text NOT NULL,
date timestamp NOT NULL DEFAULT now()
);
CREATE TABLE "order" ( --30
order_id SERIAL PRIMARY KEY,
event_id integer NOT NULL REFERENCES "event"(event_id),
user_id integer NOT NULL REFERENCES "user"(user_id),
date timestamp NOT NULL DEFAULT now(),
code text,
number_tickets integer NOT NULL,
total money NOT NULL DEFAULT 0 CHECK (total >= money(0.0))
);
```
---
## A6: Indexes, triggers, user functions, transactions and population
This document starts by studying the predicted database load, more specifically the dimension of each tuple (and their estimated growth) and the most important queries/updates (and their frequency). From this analysis, the most relevant indexes are identified and proposed.
The triggers and user-defined functions that support the data integrity are then described, along with the database transactions, clearly justified, required in the system to maintain the data's integrity on concurrent accesses.
The SQL code necessary to create the tables, triggers, UDFs, and indexes, as well as populate the database, is included in the end.
### 1. Database Workload
A study of the predicted system load (database load), organized in subsections.
#### 1.1. Tuple Estimation
| **Relation reference** | **Relation Name** | **Order of magnitude** | **Estimated growth** |
| ------------------ | ------------- | ------------------------- | -------- |
| R01 | user | thousands| units per day|
| R02 | blog_post | hundreds | dozens per month |
| R03 | event | hundreds | units per day |
| R04 | category | dozens | no growth |
| R05 | blog_category | dozens | no growth|
| R06 | event_category | dozens | no growth |
| R07 | vendor | hundreds | units per month |
| R08 | product | thousands | units per day |
| R09 | comment | thousands | units per day |
| R10 | location | thousands | units per day |
| R11 | district_county_parish | thousands | no growth |
| R12 | order | tens of thousands | dozens per day |
#### 1.2. Frequent Queries
| **Query**| SELECT01|
| ---| ---|
| **Description** | User login |
| **Frequency** | hundreds per day |
```sql
SELECT user_id, email, password
FROM "user"
WHERE email = $email
AND password = $password AND active=true;
```
| **Query**| SELECT02|
| ---| ---|
| **Description** | User profile |
| **Frequency** | dozens per day |
```sql
SELECT first_name, last_name, role, image_path
FROM "user"
WHERE user_id = $user_id;
```
| **Query**| SELECT03|
| ---| ---|
| **Description** | Main Blog posts |
| **Frequency** | hundreds per day |
```sql
SELECT * FROM blog_post
ORDER BY publication_date DESC
LIMIT 10 --OFFSET 10;
```
| **Query**| SELECT04|
| ---| ---|
| **Description** | Main Events |
| **Frequency** | hundreds per day |
```sql
SELECT * FROM "event"
ORDER BY start_date DESC
LIMIT 6 --OFFSET 6;
```
| **Query**| SELECT05|
| ---| ---|
| **Description** | Blog post comments |
| **Frequency** | hundreds per day |
```sql
SELECT user_id, comment_id, first_name, last_name, content, date, image_path
FROM comment NATURAL JOIN "user"
WHERE blog_post_id = $blog_post_id;
```
| **Query**| SELECT06|
| ---| ---|
| **Description** | Filter Blog posts by category |
| **Frequency** | dozens per day |
```sql
SELECT category_id, category AS category_name,
blog_post.blog_post.id, publication_date, title, content, blog_post.image_path, blog_post.author, blog_post.editor
FROM blog_post
JOIN blog_category ON blog_post.blog_post_id = blog_category.blog_post_id
JOIN category ON category_id = blog_category_id
WHERE category_id = $category_id;
```
| **Query**| SELECT07|
| ---| ---|
| **Description** | Filtered Events by category |
| **Frequency** | dozens per day |
```sql
SELECT category_id, category AS category_name,
event.event_id AS event_id,Event.start_date, end_date, publication_date, price, Event.name, Event.image_path, Event.description, location
FROM event
JOIN event_category ON event.event_id = event_category.event_id
JOIN category ON category_id = event_category_id
WHERE category_id = $category_id;
```
| **Query**| SELECT08|
| ---| ---|
| **Description** | Get Vendor |
| **Frequency** | hundreds per day |
```sql
SELECT *
FROM vendor NATURAL
JOIN location NATURAL
JOIN district_county_parish
WHERE vendor_id = $vendor_id;
```
| **Query**| SELECT09|
| ---| ---|
| **Description** | Vendor Products |
| **Frequency** | hundreds per day |
```sql
SELECT *
FROM product
JOIN vendor
ON vendor.vendor_id = product.vendor_id
WHERE vendor.vendor_id = $vendor_id;
```
| **Query**| SELECT10|
| ---| ---|
| **Description** | Current week events |
| **Frequency** | hundreds per day |
```sql
SELECT * FROM event
WHERE start_date
BETWEEN
now() AND
current_date + 7
ORDER BY start_date;
```
| **Query**| SELECT11|
| ---| ---|
| **Description** | Next week events |
| **Frequency** | hundreds per day |
```sql
SELECT * FROM event
WHERE start_date
BETWEEN
current_date + 7 AND
current_date + 14
ORDER BY start_date;
```
| **Query**| SELECT12|
| ---| ---|
| **Description** | Current month events |
| **Frequency** | hundreds per day |
```sql
SELECT * FROM event
WHERE start_date
BETWEEN
now() AND
current_date + 31
ORDER BY start_date;
```
| **Query**| SELECT13|
| ---| ---|
| **Description** | Counting number of orders of an Event |
| **Frequency** | hundreds per day |
```sql
SELECT COUNT(*)
FROM "order"
WHERE event_id = $event_id;
```
| **Query**| SELECT14|
| ---| ---|
| **Description** | Events from past month |
| **Frequency** | hundreds per day |
```sql
SELECT * FROM "blog_post"
WHERE publication_date
BETWEEN
current_date - 31 AND
now()
ORDER BY publication_date;
```
| **Query**| SELECT15|
| ---| ---|
| **Description** | Select all locations in district|
| **Frequency** | hundreds per day |
```sql
SELECT location_id, address, zip_code,
district, county, parish,
district_county_parish.dcp_id
FROM "location" NATURAL JOIN district_county_parish
WHERE district = $district
order by location_id;
```
| **Query**| SELECT16|
| ---| ---|
| **Description** | Select user from type|
| **Frequency** | dozens per day |
```sql
SELECT * FROM "user"
WHERE "role" = $role;
```
| **Query**| SELECT17|
| ---| ---|
| **Description** | Searching for event |
| **Frequency** | hundreds per day |
```sql
SELECT event_id,name,start_date,end_date,price, location,description, ts_rank_cd(event_search, plainto_tsquery('english', $query)) AS rank
FROM event
WHERE event_search @@ plainto_tsquery('english', $query)
AND start_date >= current_date
ORDER BY rank DESC;
```
| **Query**| SELECT18|
| ---| ---|
| **Description** | Searching for a blog post |
| **Frequency** | hundreds per day |
```sql
SELECT blog_post_id, title, content, publication_date, author, editor, image_path, ts_rank_cd(blog_search, plainto_tsquery('english', $query)) AS rank
FROM event
WHERE blog_search @@ plainto_tsquery('english', $query)
ORDER BY rank DESC;
```
| **Query**| SELECT19|
| ---| ---|
| **Description** | Searching for a vendor |
| **Frequency** | hundreds per day |
```sql
SELECT vendor_id, name, job, location_id, description, image_path, ts_rank_cd(vendor_search, plainto_tsquery('english', $query)) AS rank
FROM "vendor"
WHERE vendor_search @@ plainto_tsquery('english', $query)
ORDER BY rank DESC;
```
| **Query**| SELECT20|
| ---| ---|
| **Description** | Searching for a user |
| **Frequency** | hundreds per day |
```sql
SELECT user_id, first_name, last_name, email, "role", image_path, ts_rank_cd(user_search, plainto_tsquery('english', $query)) AS rank
FROM "user"
WHERE user_search @@ plainto_tsquery('english', $query)
ORDER BY rank DESC;
```
#### 1.3. Frequent Updates
| **Query**| UPDATE01|
| ---| ---|
| **Description** | Update comment |
| **Frequency** | hundreds per day |
```sql
UPDATE comment
SET content=$content
WHERE comment_id=$comment_id;
```
| **Query**| UPDATE02|
| ---| ---|
| **Description** | Update event |
| **Frequency** | dozens per day |
```sql
UPDATE event
SET name = $name,
image_path = $image_path,
location = $location,
price = $price
WHERE event_id = $event_id;
```
| **Query** | UPDATE03 |
| --- | --- |
| **Description** | Update Vendor|
| **Frequency** | units per month |
```sql
UPDATE "vendor"
SET name= $name,
job = $job,
location_id = $location_id,
description = $description,
image_path = $image_path
WHERE vendor_id = $vendor_id;
```
| **Query** | UPDATE04 |
| --- | --- |
| **Description** | Update Product|
| **Frequency** | units per month |
```sql
UPDATE "product"
SET vendor_id = $vendor_id,
"name"= $name,
image_path = $image_path
WHERE product_id = $product_id;
```
| **Query**| INSERT01|
| ---| ---|
| **Description** | Create user |
| **Frequency** | dozens per day |
```sql
INSERT INTO "user" (first_name, last_name, email, password, role, image_path)
VALUES ($first_name, $last_name, $email, $password, $role, $image_path)
```
| **Query**| INSERT02 |
| ---| ---|
| **Description** | Create Blog Post |
| **Frequency** | dozens per day |
```sql
INSERT INTO blog_post (title, image_path, content, author, editor)
VALUES ($title, $image_path, $content, $author, $editor);
```
| **Query**| INSERT03|
| ---| ---|
| **Description** | Create Comment|
| **Frequency** | hundreds per day |
```sql
INSERT INTO "comment"(blog_post_id, user_id, "content")
VALUES ($blog_post_id, $user_id, $content);
```
| **Query**| INSERT04|
| ---| ---|
| **Description** | Create Vendor|
| **Frequency** | dozens per day |
```sql
INSERT INTO "vendor" ("name", job, location_id, description, image_path)
VALUES ($name, $job, $location_id, $description, $image_path)
```
| **Query**| INSERT05|
| ---| ---|
| **Description** | Create order|
| **Frequency** | dozens per day |
```sql
INSERT INTO "order" (event_id, user_id, date, code, number_tickets, total)
VALUES ($event_id, $user_id, $date, $code, $number_tickets, $total);
```
| **Query** | INSERT06 |
| ---| ---|
| **Description** | Create event|
| **Frequency** | dozens per day |
```sql
INSERT INTO event
(start_date, end_date,
price, name, image_path,
description, editor, location)
VALUES (
$start_date, $end_date, $price,
$name, $image_path, $description,
$editor, $location);
```
| **Query**| INSERT07|
| ---| ---|
| **Description** | Create Product |
| **Frequency** | dozens per day |
```sql
INSERT INTO "product" (first_name, last_name, email, password, role, image_path)
VALUES ($first_name, $last_name, $email, $password, $role, $image_path)
```
| **Query**| DELETE01|
| ---| ---|
| **Description** | Delete User|
| **Frequency** | units per month |
```sql
DELETE FROM "user"
WHERE user_id=$user_id;
```
| **Query**| DELETE02|
| ---| ---|
| **Description** | Delete Blog Post |
| **Frequency** | units per month |
```sql
DELETE FROM "blog_post"
WHERE blog_post_id=$blog_post_id;
```
| **Query**| DELETE03|
| ---| ---|
| **Description** | Delete Comment|
| **Frequency** | dozens per day |
```sql
DELETE FROM "comment"
WHERE comment_id=$comment_id;
```
| **Query**| DELETE04|
| ---| ---|
| **Description** | Delete Order|
| **Frequency** | units per month |
```sql
DELETE FROM "order"
WHERE order_id=$order_id;
```
| **Query**| DELETE05|
| ---| ---|
| **Description** | Delete Event|
| **Frequency** | units per month |
```sql
DELETE FROM event
WHERE event_id = $id;
```
| **Query** | DELETE06 |
| --- | --- |
| **Description** | Delete Vendor|
| **Frequency** | units per month |
```sql
DELETE FROM "vendor"
WHERE vendor_id=$vendor_id;
```
| **Query** | DELETE07 |
| --- | --- |
| **Description** | Delete Product|
| **Frequency** | units per week |
```sql
DELETE FROM "product"
WHERE product_id=$product_id;
```
### 2. Proposed Indices
#### 2.1. Performance Indices
| **Index** | IDX01 |
| --- | --- |
| **Related queries** | SELECT04, SELECT10,SELECT11,SELECT12 |
| **Relation** | event |
| **Attribute** | start_date |
| **Type** | B-tree |
| **Cardinality** | High |
| **Clustering** | No |
| **Justification** | Both the event post searching and the event posts on the event page are crucial features that will be run very often, and they always use the date for ordering or condition. The table is also large, and this index will improve performance on those queries. B-tree is chosen as inequality operators are used. Cardinality is high because there will be almost no events at the same time and day, and consequently it has no potential for clustering. |
```sql
CREATE INDEX event_date ON event USING btree(start_date);
```
<!--| **Index** | IDX02 |
| --- | --- |
| **Related queries** | SELECT07 |
| **Relation** | event |
| **Attribute** | price |
| **Type** | Hash |
| **Cardinality** | Medium |
| **Clustering** | No |
| **Justification** | The event table is very large and the price will be very requested when viewing the event and while ordering. The Hash was chosen because the website does not implement a system to order by price. Cardinallity is medium, since some events have the same price, being a good choice for clustering. |
```sql
CREATE INDEX event_price ON event USING hash(price);
```
| **Index** | IDX03 |
| --- | --- |
| **Related queries** | SELECT09 |
| **Relation** | product |
| **Attribute** | vendor_id |
| **Type** | Hash |
| **Cardinality** |Medium |
| **Clustering** | Yes |
| **Justification** | The products will run very frequently. The table is large, and this index will improve performance on these queries. |
```sql
CREATE INDEX vendor_products ON "product" USING btree(vendor_id);
``` -->
| **Index** | IDX02 |
| --- | --- |
| **Related queries** | SELECT05 |
| **Relation** | "comment" |
| **Attribute** | blog_post_id |
| **Type** | Hash |
| **Cardinality** | Medium |
| **Clustering** | Yes |
| **Justification** | The comment table is very large and a post's comments must be quickly queried. Cardinality is medium and the post id is the main attribute we're looking for, so it's a great candidate for clustering. |
```sql
CREATE INDEX comments_idx ON "comment" USING btree(blog_post_id);
```
| **Index** | IDX03 |
| --- | --- |
| **Related queries** | SELECT06 |
| **Relation** | "blog_post" |
| **Attribute** | publication_date |
| **Type** | B-tree |
| **Cardinality** | Medium |
| **Clustering** | No |
| **Justification** | Both the blog post searching and the event posts on the event Blog page are crucial features that will be run very often, and they always use the date for ordering or condition. The table is also large, and this index will improve performance on those queries. B-tree is chosen as inequality operators are used. Cardinality is High because there will be only one blog post at a time, not being a good candidate for clustering.
```sql
CREATE INDEX blog_post_date ON "blog_post" USING btree(publication_date);
```
| **Index** | IDX04 |
| --- | --- |
| **Related queries** | SELECT02 |
| **Relation** | "user" |
| **Attribute** | "role" |
| **Type** | Hash |
| **Cardinality** | Low |
| **Clustering** | No |
| **Justification** | The "role" of the user is a great choice for an index, since it will be requested for many reasons, including to see the user permissions when doing certain actions. Since this value is an enumeration, the values will have to be exact, the hash being the obvious choice. Cardinality is low since all the users can only have 4 types of values.
```sql
CREATE INDEX user_role_idx ON "user" USING hash("role");
```
| **Index** | IDX05 |
| --- | --- |
| **Related queries** | SELECT15 |
| **Relation** | district_county_parish |
| **Attribute** | district |
| **Type** | Hash |
| **Cardinality** | Medium |
| **Clustering** | Yes |
| **Justification** | The location table along with the district_county_parish are very large, having thousands of values that are ofter queried. Cardinality is medium, so district is a good candidate for clustering. |
```sql
CREATE INDEX dcp_district_idx ON "district_county_parish" USING hash(district);
```
| **Index** | IDX06 |
| --- | --- |
| **Related queries** | SELECT15 |
| **Relation** | district_county_parish |
| **Attribute** | county |
| **Type** | Hash |
| **Cardinality** | Medium |
| **Clustering** | Yes |
| **Justification** | The district county table is a very large table with thousands of entries, and oftenly queried since the events and the vendors have locations. Cardinality is medium, so it's a great candidate for clustering. |
```sql
CREATE INDEX dcp_county_idx ON "district_county_parish" USING hash(county);
```
| **Index** | IDX07 |
| --- | --- |
| **Related queries** | SELECT06 |
| **Relation** | blog_category |
| **Attribute** | blog_category_id |
| **Type** | Hash |
| **Cardinality** | Low |
| **Clustering** | Yes |
| **Justification** | The blog has posts with different categories so we can cluster them. Cardinality is low, so it's a great candidate for clustering. |
```sql
CREATE INDEX blog_category_idx ON blog_category USING hash(blog_category_id);
```
| **Index** | IDX08 |
| --- | --- |
| **Related queries** | SELECT07 |
| **Relation** | event_category |
| **Attribute** | event_category_id |
| **Type** | Hash |
| **Cardinality** | Low |
| **Clustering** | Yes |
| **Justification** | Events have different categories so we can cluster them. Cardinality is low, so it's a great candidate for clustering. |
```sql
CREATE INDEX event_category_idx ON event_category USING hash(event_category_id);
```
#### 2.2. Full-text Search Indices
| **Index** | IDX09 |
| --- | --- |
| **Related queries** | SELECT17 |
| **Relation** | event |
| **Attribute** | event_search (name, description) |
| **Type** | GiST |
| **Clustering** | No |
| **Justification** | To improve the performance of full text searches on event posts, while searching them by their name and description. GiST because it's better for dynamic data. |
```sql
CREATE INDEX search_event_idx ON event USING GIST(event_search);
```
| **Index** | IDX10 |
| --- | --- |
| **Related queries** | SELECT18 |
| **Relation** | blog_post |
| **Attribute** | blog_search (title, content) |
| **Type** | GiST |
| **Clustering** | No |
| **Justification** | To improve the performance of full text searches on blog posts, while searching them by their title and content. GiST because it's better for dynamic data. |
```sql
CREATE INDEX search_blog_idx ON "blog_post" USING GIST(blog_search);
```
| **Index** | IDX11 |
| --- | --- |
| **Related queries** | SELECT19 |
| **Relation** | vendor |
| **Attribute** | vendor_search (name, job, description) |
| **Type** | GiST |
| **Clustering** | No |
| **Justification** | To improve the performance of full text searches on vendor, while searching them by their name, job and description. GiST because it's better for dynamic data. |
```sql
CREATE INDEX search_vendor_idx ON "vendor" USING GIST(vendor_search);
```
| **Index** | IDX12 |
| --- | --- |
| **Related queries** | SELECT20 |
| **Relation** | user |
| **Attribute** | user_search (first_name, last_name, email) |
| **Type** | GiST |
| **Clustering** | No |
| **Justification** | To improve the performance of full text searches on users, while searching them by their first_name, last_name and email. GiST because it's better for dynamic data. |
```sql
CREATE INDEX search_user_idx ON "user" USING GIST(user_search);
```
### 3. Triggers
| **Trigger** | TRIGGER01 |
| --- | --- |
| **Description** | Calculate total on new order |
```sql
DROP TRIGGER IF EXISTS calculate_total ON "order";
CREATE OR REPLACE FUNCTION calculate_total() RETURNS TRIGGER AS
$BODY$
BEGIN
NEW.total = (SELECT price
FROM event
WHERE event.event_id = NEW.event_id) * NEW.number_tickets;
RETURN NEW;
END
$BODY$
LANGUAGE plpgsql;
CREATE TRIGGER calculate_total
BEFORE INSERT OR UPDATE ON "order"
FOR EACH ROW
EXECUTE PROCEDURE calculate_total();
```
| **Trigger** | TRIGGER02 |
| --- | --- |
| **Description** | Calculate total when updating event |
```sql
DROP TRIGGER IF EXISTS update_total ON "event";
CREATE OR REPLACE FUNCTION update_total() RETURNS TRIGGER AS
$BODY$
BEGIN
UPDATE "order"
SET total = (NEW.price) * "order".number_tickets
WHERE "order".event_id = NEW.event_id;
RETURN NEW;
END
$BODY$
LANGUAGE plpgsql;
CREATE TRIGGER update_total
AFTER UPDATE ON event
FOR EACH ROW
EXECUTE PROCEDURE update_total();
```
| **Trigger** | TRIGGER03 |
| --- | --- |
| **Description** | Start Date Check when inserting event |
```sql
DROP TRIGGER IF EXISTS event_date ON "event";
CREATE OR REPLACE FUNCTION event_date() RETURNS TRIGGER AS
$BODY$
BEGIN
IF (NEW.start_date < current_date) THEN
RAISE EXCEPTION 'Start Date must be in the future on event %', NEW.event_id;
END IF;
RETURN NEW;
END
$BODY$
LANGUAGE plpgsql;
CREATE TRIGGER event_date
BEFORE INSERT OR UPDATE ON event
FOR EACH ROW
EXECUTE PROCEDURE event_date();
```
| **Trigger** | TRIGGER04 |
| --- | --- |
| **Description** | Start date check when inserting order |
```sql
DROP TRIGGER IF EXISTS order_start_date ON "order";
CREATE OR REPLACE FUNCTION order_start_date() RETURNS TRIGGER AS
$BODY$
BEGIN
IF (NEW.date > (SELECT start_date
FROM event
WHERE event.event_id = NEW.event_id)) THEN
RAISE EXCEPTION 'Order date must be before Event date. Order_id: %; Event_id: %', NEW.order_id, NEW.event_id ;
END IF;
RETURN NEW;
END
$BODY$
LANGUAGE plpgsql;
CREATE TRIGGER order_start_date
BEFORE INSERT OR UPDATE ON "order"
FOR EACH ROW
EXECUTE PROCEDURE order_start_date();
```
| **Trigger** | TRIGGER05 |
| --- | --- |
| **Description** | Check if it was an editor doing the blog and event posts |
```sql
--Verify if it was an editor doing the post
CREATE OR REPLACE FUNCTION verify_editor() RETURNS TRIGGER AS
$BODY$
BEGIN
IF ('Editor' != (SELECT "role"
FROM "user"
WHERE NEW.editor = "user".user_id)) THEN
RAISE EXCEPTION 'Only Users with Editor permissions';
END IF;
RETURN NEW;
END
$BODY$
LANGUAGE plpgsql;
--Verify if it was an editor in the blog_post
DROP TRIGGER IF EXISTS verify_editor_blog ON blog_post;
CREATE TRIGGER verify_editor_blog
BEFORE INSERT OR UPDATE ON blog_post
FOR EACH ROW
EXECUTE PROCEDURE verify_editor();
--Verify if it was an editor in the event_post
DROP TRIGGER IF EXISTS verify_editor_event ON event;
CREATE TRIGGER verify_editor_event
BEFORE INSERT OR UPDATE ON event
FOR EACH ROW
EXECUTE PROCEDURE verify_editor();
```
| **Trigger** | TRIGGER06 |
| --- | --- |
| **Description** | When an event's info is changed/added, search tsvectors are properly updated. |
```sql
CREATE OR REPLACE FUNCTION event_search_update() RETURNS TRIGGER AS
$BODY$
BEGIN
NEW.event_search = setweight(to_tsvector('english', NEW.name), 'A') || setweight(to_tsvector('english', NEW.description), 'B');
RETURN NEW;
END
$BODY$
LANGUAGE plpgsql;
DROP TRIGGER IF EXISTS event_search_update ON event;
CREATE TRIGGER event_search_update
BEFORE INSERT OR UPDATE ON event
FOR EACH ROW
EXECUTE PROCEDURE event_search_update();
```
| **Trigger** | TRIGGER07 |
| --- | --- |
| **Description** | When a blog post info is changed/added, search tsvectors are properly updated. |
```sql
CREATE OR REPLACE FUNCTION blog_search_update() RETURNS TRIGGER AS
$BODY$
BEGIN
NEW.blog_search = setweight(to_tsvector('english', NEW.title), 'A') || setweight(to_tsvector('english', NEW.content), 'B');
RETURN NEW;
END
$BODY$
LANGUAGE plpgsql;
DROP TRIGGER IF EXISTS blog_search_update ON event;
CREATE TRIGGER blog_search_update
BEFORE INSERT OR UPDATE ON "blog_post"
FOR EACH ROW
EXECUTE PROCEDURE blog_search_update();
```
| **Trigger** | TRIGGER08 |
| --- | --- |
| **Description** | When a vendor's info is changed/added, search tsvectors are properly updated. |
```sql
CREATE OR REPLACE FUNCTION vendor_search_update() RETURNS TRIGGER AS
$BODY$
BEGIN
NEW.vendor_search = setweight(to_tsvector('english', NEW.name), 'A') || setweight(to_tsvector('english', NEW.job), 'B') || setweight(to_tsvector('english', NEW.description), 'C');
RETURN NEW;
END
$BODY$
LANGUAGE plpgsql;
DROP TRIGGER IF EXISTS vendor_search_update ON event;
CREATE TRIGGER vendor_search_update
BEFORE INSERT OR UPDATE ON "vendor"
FOR EACH ROW
EXECUTE PROCEDURE vendor_search_update();
```
| **Trigger** | TRIGGER09 |
| --- | --- |
| **Description** | When user's info is changed/added, search tsvectors are properly updated. |
```sql
CREATE OR REPLACE FUNCTION user_search_update() RETURNS TRIGGER AS
$BODY$
BEGIN
NEW.user_search = setweight(to_tsvector('english', NEW.first_name), 'A') ||
setweight(to_tsvector('english', NEW.last_name), 'A') ||
setweight(to_tsvector('english', NEW.email), 'B');
RETURN NEW;
END
$BODY$
LANGUAGE plpgsql;
DROP TRIGGER IF EXISTS user_search_update ON "user";
CREATE TRIGGER user_search_update
BEFORE INSERT OR UPDATE ON "user"
FOR EACH ROW
EXECUTE PROCEDURE user_search_update();
```
### 4. Transactions
| SQL Reference | Transaction Name |
| --------------- | ----------------------------------- |
| Justification | This transaction exists in order to maintain consistency, since the products are only updated and inserted when the vendor is created. If there is an error, a ROOLBACK is issued, maintaining both tables consistent. The isolation level is Repeatable Read, so that it doesn't update vendor_id during the transaction. |
| Isolation level | REPEATABLE READ|
```sql
BEGIN TRANSACTION;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
INSERT INTO "vendor" (name, job, location_id, description, image_path)
VALUES ($name, $job, $location_id, $description, $image_path);
INSERT INTO "product" (vendor_id, name, image_path)
VALUES (currval('vendor_vendor_id_seq'), $name', $image_path);
COMMIT;
```
### 5. Complete SQL Code
#### 5.1. Database schema
```sql
```
#### 5.2. Database population
```sql
```
---
## Revision history
Changes made to the first submission:
1. (None)
***
GROUP2144, 23/04/2021
* Diogo André Barbosa Nunes, up201808546@fe.up.pt (editor)
* Marina Tostões Fernandes Leitão Dias, up201806787@fe.up.pt
* Rodrigo Campos Reis, up201806534@fe.up.pt
* Tiago Alexandre Pinto de Faria Ferreira Alves, up201603820@fe.up.pt