# A6: Indexes, triggers, user functions, transactions and population
This artefact contains the physical schema of the database, the identification and characterisation of the indexes, the support of data integrity rules with triggers and the definition of the database user-defined functions.
This artefact shows the database transactions needed to assure the integrity of the data in the presence of concurrent accesses. For each transaction, the isolation level is explicitly stated and justified and read-only transactions to improve global performance are identified and justified.
This artefact also contains the database's workload as well as the complete database creation script, including all SQL necessary to define all integrity constraints, indexes and triggers.
### 1. Database Workload
Understanding the nature of the workload for the application, and the performance goals, is essential to developing a good database design. The workload includes:
- the most important queries (SELECT) and how often they arise
- the most important modifications (UPDATE, DELETE) and how often they arise
- the desired performance for these queries and updates
- an estimate of the number of tuples for each relation
#### 1.1. Tuple Estimation
| **Relation reference** | **Relation Name** | **Order of magnitude** | **Estimated growth** |
| ---- | ------------------ | ----------------- | ---------------- |
| R01 | shopper | thousands | tens per day |
| R02 | client | thousands | tens per day |
| R03 | supplier | hundreds | units per day |
| R04 | purchase | hundreds | tens per day |
| R05 | item | thousands | tens per day |
| R06 | image | tens of thousands | tens per day |
| R07 | item_info | thousands | tens per day |
| R08 | tag | hundreds | units per day |
| R09 | tag_item | thousands | hundreds per day |
| R10 | coupon | hundreds | units per day |
| R11 | product | thousands | tens per day |
| R12 | bundle_product | hundreds | tens per day |
| R13 | ship_detail | thousands | tens per day |
| R14 | credit_card | thousands | tens per day |
| R15 | review | hundreds | tens per day |
| R16 | favorite | hundreds | tens per day |
| R17 | cart | thousands | tens per day |
| R18 | product_image | tens of thousands | hundreds per day |
#### 1.2. Frequent Queries
| **Query** | SELECT01 |
| --- | --- |
| **Description** | Log-in of user |
| **Frequency** | hundreds per day |
```sql
SELECT shopper_id
FROM shopper
WHERE shopper.email = $email
AND shopper.password = $password;
```
| **Query** | SELECT02 |
| --- | --- |
| **Description** | See favorites |
| **Frequency** | hundreds per day |
```sql
SELECT item.name, item.description, item.price
FROM favorite, client, item
WHERE client.client_id = favorite.id_client
AND item.item_id = favorite.id_item
AND client.client_id = $id_client;
```
| **Query** | SELECT03 |
| --- | --- |
| **Description** | See cart |
| **Frequency** | hundreds per day |
```sql
SELECT item.name, item.description, item.price
FROM cart, client, item
WHERE client.client_id = cart.id_client
AND item.item_id = cart.id_item
AND client.client_id = $id_client;
```
| **Query** | SELECT04 |
| --- | --- |
| **Description** | See supplier/client profile information |
| **Frequency** | tens per day |
```sql
-- Supplier
-- Supplier
SELECT name, address, post_code, city, description, path
FROM supplier, image
WHERE supplier.id_image = image.image_id
AND supplier.supplier_id = $id_supplier;
-- Client
SELECT client.name, image.path
FROM client, image
WHERE client.id_image = image.image_id
AND client.client_id = $id_client;
```
| **Query** | SELECT05 |
| --- | --- |
| **Description** | See reviews for item |
| **Frequency** | thousands per day |
```sql
SELECT item.item_id, item.name, review.rating, review.description, client.name
FROM review, item, client
WHERE review.id_item = item.item_id
AND review.id_client = client.client_id
AND review.id_item = $item_id;
```
| **Query** | SELECT06 |
| --- | --- |
| **Description** | Search items, tags and suppliers |
| **Frequency** | thousands per day |
```sql
SELECT *, ts_rank(text_search, to_tsquery('simple', $user_search)) as "rank"
FROM fts_view_weights
WHERE text_search @@ to_tsquery('simple', $user_search))
ORDER BY "rank" DESC;
```
| **Query** | SELECT07 |
| --- | --- |
| **Description** | Filtered Search |
| **Frequency** | thousands per day |
```sql
-- Search only items by name
SELECT *, ts_rank(search, to_tsquery('simple', $user_search)) as "rank"
FROM item
WHERE search @@ to_tsquery('simple', $user_search)
ORDER BY "rank" DESC;
-- Search only items by name and tags
SELECT *, ts_rank(search_query.text_search, to_tsquery('simple', $user_search)) FROM
(SELECT item.item_id as item_id,
string_agg(value, ' ') as tags,
(setweight(to_tsvector('simple', item.name), 'A') ||
setweight(to_tsvector('simple', string_agg(value, ' ')), 'B')
) as text_search
FROM item
JOIN tag_item ON (item.item_id = tag_item.id_item)
JOIN tag ON (tag_item.id_tag = tag.tag_id)
GROUP BY item_id) AS search_query
WHERE search_query.text_search @@ to_tsquery('simple', $user_search)
ORDER BY search_query.item_id;
```
#### 1.3. Frequent Updates
> Most important updates (INSERT, UPDATE, DELETE) and their frequency.
| **Query** | INSERT01 |
| --- | --- |
| **Description** | Create review |
| **Frequency** | units per day |
```sql
INSERT INTO review (id_client, id_item, rating, description)
VALUES ($client_id, $id_item, $rating, $description);
```
| **Query** | INSERT02 |
| --- | --- |
| **Description** | Upload image |
| **Frequency** | dozens per day |
```sql
INSERT INTO image (path) VALUES ($path);
```
| **Query** | INSERT03 |
| --- | --- |
| **Description** | Create Ship Detail |
| **Frequency** | dozens per day |
```sql
INSERT INTO ship_detail (first_name, last_name, address, door_n, post_code, district, city, country, phone_n, id_client)
VALUES ($first_name, $last_name, $address, $door_n, $zip_code, $district, $city, $country, $phone_n, $id_client);
```
| **Query** | INSERT04 |
| --- | --- |
| **Description** | Create Credit Card |
| **Frequency** | units per day |
```sql
INSERT INTO credit_card (card_n, expiration, cvv, holder, id_client)
VALUES ($card_n, $expiration, $cvv, $holder, $id_client);
```
<br><br>
| **Query** | UPDATE01 |
| --- | --- |
| **Description** | Update coupon expiration date |
| **Frequency** | Units per week |
```sql
UPDATE coupon
SET expiration = $new_date
WHERE coupon_id = $coupon_id;
```
| **Query** | UPDATE02 |
| --- | --- |
| **Description** | Update user information |
| **Frequency** | tens per day |
```sql
-- Update email
UPDATE shopper
SET email = $email
WHERE shopper_id = $shopper_id;
-- Update password
UPDATE shopper
SET password = $password
WHERE shopper_id = $shopper_id;
```
| **Query** | UPDATE03 |
| --- | --- |
| **Description** | Update product visability |
| **Frequency** | tens per day |
```sql
UPDATE item
SET active = $active
WHERE item_id = $item_id;
```
| **Query** | UPDATE04 |
| --- | --- |
| **Description** | Update supplier approval |
| **Frequency** | units per day |
```sql
UPDATE supplier
SET accepted = 'true'
WHERE supplier_id = $supplier_id;
```
<br><br>
| **Query** | DELETE01 |
| --- | --- |
| **Description** | Delete review |
| **Frequency** | tens per day |
```sql
DELETE FROM review
WHERE id_client = $id_client
AND id_item = $id_item;
```
| **Query** | DELETE02 |
| --- | --- |
| **Description** | Delete image |
| **Frequency** | tens per day |
```sql
DELETE FROM image
WHERE image_id = $id;
```
### 2. Proposed Indices
Indexes are used to enhance database performance by allowing the database server to find and retrieve specific rows much faster. An index defined on a column that is part of a join condition can also significantly speed up queries with joins. Moreover, indexes can also benefit UPDATE and DELETE commands with search conditions.
After an index is created, the system has to keep it synchronised with the table, which adds overhead to data manipulation operations. As indexes add overhead to the database system as a whole, they are used sensibly. The indexes proposed in the next section took in good consideration the impact on updates in the workload. The indexes benefits at least one query and none have negative impact in the updates or inserts
#### 2.1. Performance Indices
| **Index** | IDX01 |
| --- | --- |
| **Related queries** | SELECT02 |
| **Relation** | favorite |
| **Attribute** | id_client |
| **Type** | Hash |
| **Cardinality** | Medium |
| **Clustering** | yes |
| **Justification** | Table can become very large; query SELECT02 is used to search the user's favorite items and has to be fast because it's executed many times and represents items the user want to access fast; doesn't need range query support; cardinality is medium so it's a good candidate for clustering. |
```sql
CREATE INDEX favorite_client ON favorite USING hash (id_client);
```
| **Index** | IDX02 |
| --- | --- |
| **Related queries** | INSERT04 |
| **Relation** | credit_card |
| **Attribute** | id_client |
| **Type** | Hash |
| **Cardinality** | High |
| **Clustering** | yes |
| **Justification** | Table is large; query INSERT04 is used to insert a new credit_card associated with the user; Subsequent SELECT queries need to be fast because they'll be executed many times and can be executed during critical operations such as check-out; doesn't need range query support; cardinality is high so it's a good candidate for clustering. |
```sql
CREATE INDEX credit_card_client ON credit_card USING hash (id_client);
```
#### 2.2. Full-text Search Indices
| **Index** | IDX01 |
| --- | --- |
| **Related queries** | SELECT06, SELECT07 |
| **Relation** | fts_view_weights |
| **Attribute** | text_search |
| **Type** | GIST |
| **Clustering** | No |
| **Justification** | FTS tables need indexes on their search column for faster access and to retrieve results quicker |
```sql
CREATE INDEX search_weight_idx ON fts_view_weights USING GIST (text_search);
```
| **Index** | IDX02 |
| --- | --- |
| **Related queries** | SELECT06, SELECT07 |
| **Relation** | item |
| **Attribute** | search |
| **Type** | GIST |
| **Clustering** | No |
| **Justification** | To be able to utilize FTS to its fullest, there needs to be a index on the search column |
```sql
CREATE INDEX search_product_idx ON item USING GIST (search);
```
| **Index** | IDX03 |
| --- | --- |
| **Related queries** | SELECT06, SELECT07 |
| **Relation** | supplier |
| **Attribute** | search |
| **Type** | GIST |
| **Clustering** | No |
| **Justification** | To be able to utilize FTS to its fullest, there needs to be a index on the search column |
```sql
CREATE INDEX search_supplier_idx ON supplier USING GIST (search);
```
| **Index** | IDX04 |
| --- | --- |
| **Related queries** | SELECT06, SELECT07 |
| **Relation** | tag |
| **Attribute** | search |
| **Type** | GIST |
| **Clustering** | No |
| **Justification** | To be able to utilize FTS to its fullest, there needs to be a index on the search column |
```sql
CREATE INDEX search_tag_idx ON tag USING GIST (search);
```
### 3. Triggers
| **Trigger** | TRIGGER01 |
| --- | --- |
| **Description** | A coupon is deleted when the expiration date arrives |
| **Justification**| When a coupon date expires, it needs to be removed from the database|
```sql
CREATE OR REPLACE FUNCTION expired_coupon() RETURNS TRIGGER AS
$BODY$
BEGIN
IF EXISTS
(SELECT *
FROM coupon
WHERE expiration = now())
THEN
DELETE FROM coupon
WHERE coupon_id = OLD.coupon_id;
END IF;
RETURN NEW;
END
$BODY$
LANGUAGE plpgsql;
CREATE TRIGGER expired_coupon
BEFORE INSERT OR UPDATE ON coupon
FOR EACH ROW
EXECUTE PROCEDURE expired_coupon();
```
| **Trigger** | TRIGGER02 |
| --- | --- |
| **Description** | A item becomes inactive when the supplier deletes the account|
| **Justification** | When the supplier deletes the account, his items become inactive so that clients can't search for them anymore|
```sql
CREATE OR REPLACE FUNCTION inactive_item() RETURNS TRIGGER AS
$BODY$
BEGIN
IF EXISTS
(SELECT *
FROM item, supplier
WHERE item.id_supplier = supplier.supplier_id)
THEN
UPDATE item
SET active = FALSE
WHERE supplier_id = OLD.supplier_id;
END IF;
RETURN NEW;
END
$BODY$
LANGUAGE plpgsql;
CREATE TRIGGER inactive_item
BEFORE DELETE ON supplier
FOR EACH ROW
EXECUTE PROCEDURE inactive_item();
```
| **Trigger** | TRIGGER03 |
| --- | --- |
| **Description** | When a item receives a review, calculate the rating of that item |
| **Justification**| The trigger is needed to ensure the item rating is always up-to-date|
```sql
CREATE OR REPLACE FUNCTION update_rating() RETURNS TRIGGER AS
$BODY$
BEGIN
UPDATE item
SET rating = (SELECT AVG(review.rating) FROM review WHERE review.id_item = NEW.id_item)
WHERE item.item_id = NEW.id_item;
RETURN NULL;
END
$BODY$
LANGUAGE plpgsql;
CREATE TRIGGER update_rating
AFTER INSERT OR UPDATE ON review
FOR EACH ROW
EXECUTE PROCEDURE update_rating();
```
| **Trigger** | TRIGGER04 |
| --- | --- |
| **Description** | A review only can be made on purchased items |
| **Justification**| This trigger enforces an existing business rule |
```sql
CREATE OR REPLACE FUNCTION item_review() RETURNS TRIGGER AS
$BODY$
BEGIN
IF NOT EXISTS
(SELECT * FROM item_info, purchase
WHERE NEW.id_client = purchase.id_client
AND item_info.id_purchase = purchase.purchase_id
AND item_info.id_item = NEW.id_item)
THEN
RAISE EXCEPTION
'A client cannot leave a review on a not purchased item: id_client: % | id_item: %', NEW.id_client, NEW.id_item;
END IF;
RETURN NEW;
END
$BODY$
LANGUAGE plpgsql;
CREATE TRIGGER item_review
BEFORE INSERT OR UPDATE ON review
FOR EACH ROW
EXECUTE PROCEDURE item_review();
```
| **Trigger** | TRIGGER05 |
| --- | --- |
| **Description** | When there's an update to an item, supplier or tag, the materialized view needs to be refreshed |
| **Justification**| By Refreshin the view, the search without filters is always up-to-date |
```sql
CREATE OR REPLACE FUNCTION search_update() RETURNS TRIGGER AS
$BODY$
BEGIN
REFRESH MATERIALIZED VIEW fts_view_weights;
RETURN NULL;
END
$BODY$
LANGUAGE plpgsql;
CREATE TRIGGER tag_item_search_update
BEFORE INSERT OR UPDATE ON tag_item
FOR EACH ROW
EXECUTE PROCEDURE search_update();
```
| **Trigger** | TRIGGER06 |
| --- | --- |
| **Description** | When a supplier creates or changes an account, calculate the tsvector for that supplier|
| **Justification**| By calculating the tsvector upon insertion and update only, we are reducing the overhead|
```sql
CREATE OR REPLACE FUNCTION supplier_search_update() RETURNS TRIGGER AS
$BODY$
BEGIN
IF TG_OP = 'INSERT'
THEN
NEW.search = setweight(to_tsvector('english', NEW.name), 'B');
-- || setweight(to_tsvector('english', NEW.description), 'C');
END IF;
IF TG_OP = 'UPDATE'
THEN
IF NEW.name <> OLD.name
THEN
NEW.search = setweight(to_tsvector('english', NEW.name), 'B');
-- || setweight(to_tsvector('english', NEW.description), 'C');
END IF;
END IF;
REFRESH MATERIALIZED VIEW fts_view_weights;
RETURN NEW;
END
$BODY$
LANGUAGE plpgsql;
CREATE TRIGGER supplier_search_update
BEFORE INSERT OR UPDATE ON supplier
FOR EACH ROW
EXECUTE PROCEDURE supplier_search_update();
```
| **Trigger** | TRIGGER07 |
| --- | --- |
| **Description** | When a supplier adds an item, calculate the ts_vector for that item |
| **Justification**| By calculating the tsvector upon insertion and update only, we are reducing the overhead|
```sql
CREATE OR REPLACE FUNCTION item_search_update() RETURNS TRIGGER AS
$BODY$
BEGIN
IF TG_OP = 'INSERT' THEN
NEW.search = setweight(to_tsvector('english', NEW.name), 'A');
END IF;
IF TG_OP = 'UPDATE'
THEN
IF NEW.name <> OLD.name
THEN
NEW.search = setweight(to_tsvector('english', new.name),'A');
END IF;
END IF;
REFRESH MATERIALIZED VIEW fts_view_weights;
RETURN NEW;
END
$BODY$
LANGUAGE plpgsql;
CREATE TRIGGER item_search_update
BEFORE INSERT OR UPDATE ON item
FOR EACH ROW
EXECUTE PROCEDURE item_search_update();
```
| **Trigger** | TRIGGER08 |
| --- | --- |
| **Description** | When a supplier adds a tag, calculate the ts_vector for that tag |
| **Justification**| By calculating the tsvector upon insertion and update only, we are reducing the overhead|
```sql
CREATE OR REPLACE FUNCTION tag_search_update() RETURNS TRIGGER AS
$BODY$
BEGIN
IF TG_OP = 'INSERT'
THEN
NEW.search = setweight(to_tsvector('english', NEW.value),'C');
END IF;
IF TG_OP = 'UPDATE'
THEN
IF NEW.name <> OLD.name
THEN
NEW.search = setweight(to_tsvector('english', NEW.value),'C');
END IF;
END IF;
REFRESH MATERIALIZED VIEW fts_view_weights;
RETURN NEW;
END
$BODY$
LANGUAGE plpgsql;
CREATE TRIGGER tag_search_update
BEFORE INSERT OR UPDATE ON tag
FOR EACH ROW
EXECUTE PROCEDURE tag_search_update();
```
### 4. Transactions
| T01 | Buying the products of the cart |
| --------------- | ----------------------------------- |
| Justification | When a client decides to buy the products of the cart, they can be out of stock because someone bought them first. If this happens, the all the purchase must fail.|
| Isolation level | SERIALIZABLE |
```sql
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
DO $$
DECLARE
out_of_stock int;
itm record;
BEGIN
SELECT COUNT(*) INTO out_of_stock
FROM cart, item, item_info
WHERE cart.id_item = item.item_id AND cart.id_client = $client_id
AND item.stock - item_info.amount < 0;
IF out_of_stock > 0 THEN
RAISE NOTICE 'Attempted to buy out of stock items';
ELSE
FOR itm IN SELECT item.item_id, item_info.amount
FROM cart, item, item_info
WHERE cart.id_item = item.item_id AND cart.id_client = $id_client
LOOP
UPDATE item
SET item.stock = item.stock - itm.amount
WHERE item.item_id = itm.item_id;
END LOOP;
INSERT INTO purchase (id_client, paid, purchase_date, type)
VALUES ($client_id, $paid, $purchase_date, $type);
DELETE FROM cart
WHERE id_client = $id_client;
COMMIT;
END IF;
END$$;
```
| T02 | Creating a new client/supplier |
| --------------- | ----------------------------------- |
| Justification | When a client creates an account, a system failure in the right moment can make that an entry in the table shopper is created, but not in the table client. In this scenario, the client couldn't login into his account or create a new one with the same email. The same goes for the supplier. The isolation level is Repeatable Read, since any insertion would disrupt the behaviour of the currval function|
| Isolation level | REPEATABLE READ |
```sql
--Client
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
DO $$
DECLARE new_id integer;
BEGIN
INSERT INTO shopper (email, password, is_admin)
VALUES ($email, $password, 'false') RETURNING shopper_id INTO new_id;
INSERT INTO client (client_id, name, id_image)
VALUES (new_id, $name, $id_image);
END $$;
--Supplier
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
DO $$
DECLARE new_id integer;
BEGIN
INSERT INTO shopper (email, password, is_admin)
VALUES ($email, $password, 'false') RETURNING shopper_id INTO new_id;
INSERT INTO supplier (supplier_id, name, address, post_code, city, description, accepted, id_image)
VALUES (new_id, $name, $address, $post_code, $city, $description, 'false', $id_image);
END $$;
```
| T03 | Creating a new product |
| --------------- | ----------------------------------- |
| Justification | When creating a product, two entries are created in two different tables. To guarantee consistency, a transaction is needed. The isolation level is Repeatable Read, since any insertion would disrupt the behaviour of the currval function |
| Isolation level | REPEATABLE READ |
```sql
--Product
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
DO $$
DECLARE new_id integer;
BEGIN
INSERT INTO item (id_supplier, name, price, stock, description, active, rating, is_bundle)
VALUES ($id_supplier, $name, $price, $stock, $description, $active, $rating, $is_bundle) RETURNING item_id INTO new_id;
INSERT INTO product(product_id, type)
VALUES (new_id, 'Kg');
END $$;
```
### 5. Complete SQL Code
#### 5.1. Database schema
[Link to the Database schema creation script]()
#### 5.2. Database population
[Link to the Database schema creation script]()