# A6: Indexes, triggers, user functions, transactions and population This artefact contains the physical schema of the database, the identification and characterisation of the indexes, the support of data integrity rules with triggers and the definition of the database user-defined functions. This artefact shows the database transactions needed to assure the integrity of the data in the presence of concurrent accesses. For each transaction, the isolation level is explicitly stated and justified and read-only transactions to improve global performance are identified and justified. This artefact also contains the database's workload as well as the complete database creation script, including all SQL necessary to define all integrity constraints, indexes and triggers. ### 1. Database Workload Understanding the nature of the workload for the application, and the performance goals, is essential to developing a good database design. The workload includes: - the most important queries (SELECT) and how often they arise - the most important modifications (UPDATE, DELETE) and how often they arise - the desired performance for these queries and updates - an estimate of the number of tuples for each relation #### 1.1. Tuple Estimation | **Relation reference** | **Relation Name** | **Order of magnitude** | **Estimated growth** | | ---- | ------------------ | ----------------- | ---------------- | | R01 | shopper | thousands | tens per day | | R02 | client | thousands | tens per day | | R03 | supplier | hundreds | units per day | | R04 | purchase | hundreds | tens per day | | R05 | item | thousands | tens per day | | R06 | image | tens of thousands | tens per day | | R07 | item_info | thousands | tens per day | | R08 | tag | hundreds | units per day | | R09 | tag_item | thousands | hundreds per day | | R10 | coupon | hundreds | units per day | | R11 | product | thousands | tens per day | | R12 | bundle_product | hundreds | tens per day | | R13 | ship_detail | thousands | tens per day | | R14 | credit_card | thousands | tens per day | | R15 | review | hundreds | tens per day | | R16 | favorite | hundreds | tens per day | | R17 | cart | thousands | tens per day | | R18 | product_image | tens of thousands | hundreds per day | #### 1.2. Frequent Queries | **Query** | SELECT01 | | --- | --- | | **Description** | Log-in of user | | **Frequency** | hundreds per day | ```sql SELECT shopper_id FROM shopper WHERE shopper.email = $email AND shopper.password = $password; ``` | **Query** | SELECT02 | | --- | --- | | **Description** | See favorites | | **Frequency** | hundreds per day | ```sql SELECT item.name, item.description, item.price FROM favorite, client, item WHERE client.client_id = favorite.id_client AND item.item_id = favorite.id_item AND client.client_id = $id_client; ``` | **Query** | SELECT03 | | --- | --- | | **Description** | See cart | | **Frequency** | hundreds per day | ```sql SELECT item.name, item.description, item.price FROM cart, client, item WHERE client.client_id = cart.id_client AND item.item_id = cart.id_item AND client.client_id = $id_client; ``` | **Query** | SELECT04 | | --- | --- | | **Description** | See supplier/client profile information | | **Frequency** | tens per day | ```sql -- Supplier -- Supplier SELECT name, address, post_code, city, description, path FROM supplier, image WHERE supplier.id_image = image.image_id AND supplier.supplier_id = $id_supplier; -- Client SELECT client.name, image.path FROM client, image WHERE client.id_image = image.image_id AND client.client_id = $id_client; ``` | **Query** | SELECT05 | | --- | --- | | **Description** | See reviews for item | | **Frequency** | thousands per day | ```sql SELECT item.item_id, item.name, review.rating, review.description, client.name FROM review, item, client WHERE review.id_item = item.item_id AND review.id_client = client.client_id AND review.id_item = $item_id; ``` | **Query** | SELECT06 | | --- | --- | | **Description** | Search items, tags and suppliers | | **Frequency** | thousands per day | ```sql SELECT *, ts_rank(text_search, to_tsquery('simple', $user_search)) as "rank" FROM fts_view_weights WHERE text_search @@ to_tsquery('simple', $user_search)) ORDER BY "rank" DESC; ``` | **Query** | SELECT07 | | --- | --- | | **Description** | Filtered Search | | **Frequency** | thousands per day | ```sql -- Search only items by name SELECT *, ts_rank(search, to_tsquery('simple', $user_search)) as "rank" FROM item WHERE search @@ to_tsquery('simple', $user_search) ORDER BY "rank" DESC; -- Search only items by name and tags SELECT *, ts_rank(search_query.text_search, to_tsquery('simple', $user_search)) FROM (SELECT item.item_id as item_id, string_agg(value, ' ') as tags, (setweight(to_tsvector('simple', item.name), 'A') || setweight(to_tsvector('simple', string_agg(value, ' ')), 'B') ) as text_search FROM item JOIN tag_item ON (item.item_id = tag_item.id_item) JOIN tag ON (tag_item.id_tag = tag.tag_id) GROUP BY item_id) AS search_query WHERE search_query.text_search @@ to_tsquery('simple', $user_search) ORDER BY search_query.item_id; ``` #### 1.3. Frequent Updates > Most important updates (INSERT, UPDATE, DELETE) and their frequency. | **Query** | INSERT01 | | --- | --- | | **Description** | Create review | | **Frequency** | units per day | ```sql INSERT INTO review (id_client, id_item, rating, description) VALUES ($client_id, $id_item, $rating, $description); ``` | **Query** | INSERT02 | | --- | --- | | **Description** | Upload image | | **Frequency** | dozens per day | ```sql INSERT INTO image (path) VALUES ($path); ``` | **Query** | INSERT03 | | --- | --- | | **Description** | Create Ship Detail | | **Frequency** | dozens per day | ```sql INSERT INTO ship_detail (first_name, last_name, address, door_n, post_code, district, city, country, phone_n, id_client) VALUES ($first_name, $last_name, $address, $door_n, $zip_code, $district, $city, $country, $phone_n, $id_client); ``` | **Query** | INSERT04 | | --- | --- | | **Description** | Create Credit Card | | **Frequency** | units per day | ```sql INSERT INTO credit_card (card_n, expiration, cvv, holder, id_client) VALUES ($card_n, $expiration, $cvv, $holder, $id_client); ``` <br><br> | **Query** | UPDATE01 | | --- | --- | | **Description** | Update coupon expiration date | | **Frequency** | Units per week | ```sql UPDATE coupon SET expiration = $new_date WHERE coupon_id = $coupon_id; ``` | **Query** | UPDATE02 | | --- | --- | | **Description** | Update user information | | **Frequency** | tens per day | ```sql -- Update email UPDATE shopper SET email = $email WHERE shopper_id = $shopper_id; -- Update password UPDATE shopper SET password = $password WHERE shopper_id = $shopper_id; ``` | **Query** | UPDATE03 | | --- | --- | | **Description** | Update product visability | | **Frequency** | tens per day | ```sql UPDATE item SET active = $active WHERE item_id = $item_id; ``` | **Query** | UPDATE04 | | --- | --- | | **Description** | Update supplier approval | | **Frequency** | units per day | ```sql UPDATE supplier SET accepted = 'true' WHERE supplier_id = $supplier_id; ``` <br><br> | **Query** | DELETE01 | | --- | --- | | **Description** | Delete review | | **Frequency** | tens per day | ```sql DELETE FROM review WHERE id_client = $id_client AND id_item = $id_item; ``` | **Query** | DELETE02 | | --- | --- | | **Description** | Delete image | | **Frequency** | tens per day | ```sql DELETE FROM image WHERE image_id = $id; ``` ### 2. Proposed Indices Indexes are used to enhance database performance by allowing the database server to find and retrieve specific rows much faster. An index defined on a column that is part of a join condition can also significantly speed up queries with joins. Moreover, indexes can also benefit UPDATE and DELETE commands with search conditions. After an index is created, the system has to keep it synchronised with the table, which adds overhead to data manipulation operations. As indexes add overhead to the database system as a whole, they are used sensibly. The indexes proposed in the next section took in good consideration the impact on updates in the workload. The indexes benefits at least one query and none have negative impact in the updates or inserts #### 2.1. Performance Indices | **Index** | IDX01 | | --- | --- | | **Related queries** | SELECT02 | | **Relation** | favorite | | **Attribute** | id_client | | **Type** | Hash | | **Cardinality** | Medium | | **Clustering** | yes | | **Justification** | Table can become very large; query SELECT02 is used to search the user's favorite items and has to be fast because it's executed many times and represents items the user want to access fast; doesn't need range query support; cardinality is medium so it's a good candidate for clustering. | ```sql CREATE INDEX favorite_client ON favorite USING hash (id_client); ``` | **Index** | IDX02 | | --- | --- | | **Related queries** | INSERT04 | | **Relation** | credit_card | | **Attribute** | id_client | | **Type** | Hash | | **Cardinality** | High | | **Clustering** | yes | | **Justification** | Table is large; query INSERT04 is used to insert a new credit_card associated with the user; Subsequent SELECT queries need to be fast because they'll be executed many times and can be executed during critical operations such as check-out; doesn't need range query support; cardinality is high so it's a good candidate for clustering. | ```sql CREATE INDEX credit_card_client ON credit_card USING hash (id_client); ``` #### 2.2. Full-text Search Indices | **Index** | IDX01 | | --- | --- | | **Related queries** | SELECT06, SELECT07 | | **Relation** | fts_view_weights | | **Attribute** | text_search | | **Type** | GIST | | **Clustering** | No | | **Justification** | FTS tables need indexes on their search column for faster access and to retrieve results quicker | ```sql CREATE INDEX search_weight_idx ON fts_view_weights USING GIST (text_search); ``` | **Index** | IDX02 | | --- | --- | | **Related queries** | SELECT06, SELECT07 | | **Relation** | item | | **Attribute** | search | | **Type** | GIST | | **Clustering** | No | | **Justification** | To be able to utilize FTS to its fullest, there needs to be a index on the search column | ```sql CREATE INDEX search_product_idx ON item USING GIST (search); ``` | **Index** | IDX03 | | --- | --- | | **Related queries** | SELECT06, SELECT07 | | **Relation** | supplier | | **Attribute** | search | | **Type** | GIST | | **Clustering** | No | | **Justification** | To be able to utilize FTS to its fullest, there needs to be a index on the search column | ```sql CREATE INDEX search_supplier_idx ON supplier USING GIST (search); ``` | **Index** | IDX04 | | --- | --- | | **Related queries** | SELECT06, SELECT07 | | **Relation** | tag | | **Attribute** | search | | **Type** | GIST | | **Clustering** | No | | **Justification** | To be able to utilize FTS to its fullest, there needs to be a index on the search column | ```sql CREATE INDEX search_tag_idx ON tag USING GIST (search); ``` ### 3. Triggers | **Trigger** | TRIGGER01 | | --- | --- | | **Description** | A coupon is deleted when the expiration date arrives | | **Justification**| When a coupon date expires, it needs to be removed from the database| ```sql CREATE OR REPLACE FUNCTION expired_coupon() RETURNS TRIGGER AS $BODY$ BEGIN IF EXISTS (SELECT * FROM coupon WHERE expiration = now()) THEN DELETE FROM coupon WHERE coupon_id = OLD.coupon_id; END IF; RETURN NEW; END $BODY$ LANGUAGE plpgsql; CREATE TRIGGER expired_coupon BEFORE INSERT OR UPDATE ON coupon FOR EACH ROW EXECUTE PROCEDURE expired_coupon(); ``` | **Trigger** | TRIGGER02 | | --- | --- | | **Description** | A item becomes inactive when the supplier deletes the account| | **Justification** | When the supplier deletes the account, his items become inactive so that clients can't search for them anymore| ```sql CREATE OR REPLACE FUNCTION inactive_item() RETURNS TRIGGER AS $BODY$ BEGIN IF EXISTS (SELECT * FROM item, supplier WHERE item.id_supplier = supplier.supplier_id) THEN UPDATE item SET active = FALSE WHERE supplier_id = OLD.supplier_id; END IF; RETURN NEW; END $BODY$ LANGUAGE plpgsql; CREATE TRIGGER inactive_item BEFORE DELETE ON supplier FOR EACH ROW EXECUTE PROCEDURE inactive_item(); ``` | **Trigger** | TRIGGER03 | | --- | --- | | **Description** | When a item receives a review, calculate the rating of that item | | **Justification**| The trigger is needed to ensure the item rating is always up-to-date| ```sql CREATE OR REPLACE FUNCTION update_rating() RETURNS TRIGGER AS $BODY$ BEGIN UPDATE item SET rating = (SELECT AVG(review.rating) FROM review WHERE review.id_item = NEW.id_item) WHERE item.item_id = NEW.id_item; RETURN NULL; END $BODY$ LANGUAGE plpgsql; CREATE TRIGGER update_rating AFTER INSERT OR UPDATE ON review FOR EACH ROW EXECUTE PROCEDURE update_rating(); ``` | **Trigger** | TRIGGER04 | | --- | --- | | **Description** | A review only can be made on purchased items | | **Justification**| This trigger enforces an existing business rule | ```sql CREATE OR REPLACE FUNCTION item_review() RETURNS TRIGGER AS $BODY$ BEGIN IF NOT EXISTS (SELECT * FROM item_info, purchase WHERE NEW.id_client = purchase.id_client AND item_info.id_purchase = purchase.purchase_id AND item_info.id_item = NEW.id_item) THEN RAISE EXCEPTION 'A client cannot leave a review on a not purchased item: id_client: % | id_item: %', NEW.id_client, NEW.id_item; END IF; RETURN NEW; END $BODY$ LANGUAGE plpgsql; CREATE TRIGGER item_review BEFORE INSERT OR UPDATE ON review FOR EACH ROW EXECUTE PROCEDURE item_review(); ``` | **Trigger** | TRIGGER05 | | --- | --- | | **Description** | When there's an update to an item, supplier or tag, the materialized view needs to be refreshed | | **Justification**| By Refreshin the view, the search without filters is always up-to-date | ```sql CREATE OR REPLACE FUNCTION search_update() RETURNS TRIGGER AS $BODY$ BEGIN REFRESH MATERIALIZED VIEW fts_view_weights; RETURN NULL; END $BODY$ LANGUAGE plpgsql; CREATE TRIGGER tag_item_search_update BEFORE INSERT OR UPDATE ON tag_item FOR EACH ROW EXECUTE PROCEDURE search_update(); ``` | **Trigger** | TRIGGER06 | | --- | --- | | **Description** | When a supplier creates or changes an account, calculate the tsvector for that supplier| | **Justification**| By calculating the tsvector upon insertion and update only, we are reducing the overhead| ```sql CREATE OR REPLACE FUNCTION supplier_search_update() RETURNS TRIGGER AS $BODY$ BEGIN IF TG_OP = 'INSERT' THEN NEW.search = setweight(to_tsvector('english', NEW.name), 'B'); -- || setweight(to_tsvector('english', NEW.description), 'C'); END IF; IF TG_OP = 'UPDATE' THEN IF NEW.name <> OLD.name THEN NEW.search = setweight(to_tsvector('english', NEW.name), 'B'); -- || setweight(to_tsvector('english', NEW.description), 'C'); END IF; END IF; REFRESH MATERIALIZED VIEW fts_view_weights; RETURN NEW; END $BODY$ LANGUAGE plpgsql; CREATE TRIGGER supplier_search_update BEFORE INSERT OR UPDATE ON supplier FOR EACH ROW EXECUTE PROCEDURE supplier_search_update(); ``` | **Trigger** | TRIGGER07 | | --- | --- | | **Description** | When a supplier adds an item, calculate the ts_vector for that item | | **Justification**| By calculating the tsvector upon insertion and update only, we are reducing the overhead| ```sql CREATE OR REPLACE FUNCTION item_search_update() RETURNS TRIGGER AS $BODY$ BEGIN IF TG_OP = 'INSERT' THEN NEW.search = setweight(to_tsvector('english', NEW.name), 'A'); END IF; IF TG_OP = 'UPDATE' THEN IF NEW.name <> OLD.name THEN NEW.search = setweight(to_tsvector('english', new.name),'A'); END IF; END IF; REFRESH MATERIALIZED VIEW fts_view_weights; RETURN NEW; END $BODY$ LANGUAGE plpgsql; CREATE TRIGGER item_search_update BEFORE INSERT OR UPDATE ON item FOR EACH ROW EXECUTE PROCEDURE item_search_update(); ``` | **Trigger** | TRIGGER08 | | --- | --- | | **Description** | When a supplier adds a tag, calculate the ts_vector for that tag | | **Justification**| By calculating the tsvector upon insertion and update only, we are reducing the overhead| ```sql CREATE OR REPLACE FUNCTION tag_search_update() RETURNS TRIGGER AS $BODY$ BEGIN IF TG_OP = 'INSERT' THEN NEW.search = setweight(to_tsvector('english', NEW.value),'C'); END IF; IF TG_OP = 'UPDATE' THEN IF NEW.name <> OLD.name THEN NEW.search = setweight(to_tsvector('english', NEW.value),'C'); END IF; END IF; REFRESH MATERIALIZED VIEW fts_view_weights; RETURN NEW; END $BODY$ LANGUAGE plpgsql; CREATE TRIGGER tag_search_update BEFORE INSERT OR UPDATE ON tag FOR EACH ROW EXECUTE PROCEDURE tag_search_update(); ``` ### 4. Transactions | T01 | Buying the products of the cart | | --------------- | ----------------------------------- | | Justification | When a client decides to buy the products of the cart, they can be out of stock because someone bought them first. If this happens, the all the purchase must fail.| | Isolation level | SERIALIZABLE | ```sql SET TRANSACTION ISOLATION LEVEL SERIALIZABLE; DO $$ DECLARE out_of_stock int; itm record; BEGIN SELECT COUNT(*) INTO out_of_stock FROM cart, item, item_info WHERE cart.id_item = item.item_id AND cart.id_client = $client_id AND item.stock - item_info.amount < 0; IF out_of_stock > 0 THEN RAISE NOTICE 'Attempted to buy out of stock items'; ELSE FOR itm IN SELECT item.item_id, item_info.amount FROM cart, item, item_info WHERE cart.id_item = item.item_id AND cart.id_client = $id_client LOOP UPDATE item SET item.stock = item.stock - itm.amount WHERE item.item_id = itm.item_id; END LOOP; INSERT INTO purchase (id_client, paid, purchase_date, type) VALUES ($client_id, $paid, $purchase_date, $type); DELETE FROM cart WHERE id_client = $id_client; COMMIT; END IF; END$$; ``` | T02 | Creating a new client/supplier | | --------------- | ----------------------------------- | | Justification | When a client creates an account, a system failure in the right moment can make that an entry in the table shopper is created, but not in the table client. In this scenario, the client couldn't login into his account or create a new one with the same email. The same goes for the supplier. The isolation level is Repeatable Read, since any insertion would disrupt the behaviour of the currval function| | Isolation level | REPEATABLE READ | ```sql --Client SET TRANSACTION ISOLATION LEVEL REPEATABLE READ; DO $$ DECLARE new_id integer; BEGIN INSERT INTO shopper (email, password, is_admin) VALUES ($email, $password, 'false') RETURNING shopper_id INTO new_id; INSERT INTO client (client_id, name, id_image) VALUES (new_id, $name, $id_image); END $$; --Supplier SET TRANSACTION ISOLATION LEVEL REPEATABLE READ; DO $$ DECLARE new_id integer; BEGIN INSERT INTO shopper (email, password, is_admin) VALUES ($email, $password, 'false') RETURNING shopper_id INTO new_id; INSERT INTO supplier (supplier_id, name, address, post_code, city, description, accepted, id_image) VALUES (new_id, $name, $address, $post_code, $city, $description, 'false', $id_image); END $$; ``` | T03 | Creating a new product | | --------------- | ----------------------------------- | | Justification | When creating a product, two entries are created in two different tables. To guarantee consistency, a transaction is needed. The isolation level is Repeatable Read, since any insertion would disrupt the behaviour of the currval function | | Isolation level | REPEATABLE READ | ```sql --Product SET TRANSACTION ISOLATION LEVEL REPEATABLE READ; DO $$ DECLARE new_id integer; BEGIN INSERT INTO item (id_supplier, name, price, stock, description, active, rating, is_bundle) VALUES ($id_supplier, $name, $price, $stock, $description, $active, $rating, $is_bundle) RETURNING item_id INTO new_id; INSERT INTO product(product_id, type) VALUES (new_id, 'Kg'); END $$; ``` ### 5. Complete SQL Code #### 5.1. Database schema [Link to the Database schema creation script]() #### 5.2. Database population [Link to the Database schema creation script]()