Database Systems Pre-Lab 5

--- title: Database Systems Pre-Lab 5 --- <h1 style='border: none'><center>Database Systems Pre-Lab 5</center></h1> <h2 style='border: none'><center>Intermediate SQL</center></h2> <h5><center>The Islamic University of Gaza Engineering Faculty Department of Computer Engineering</center></h5> <h6>Authors: Usama R. Al Zayan & Rasha E. Kahil2023/03/16</h6> --- ## Expecteed outcomes * To be introduced to Database Authorization commands and concepts. * To learn how to define, alter, and use Views and Materialized Views in PostgreSQL. * To practice working with dates in SQL. * To learn how to import CSV file into a PostgreSQL table. ## Lab 5: Time and Plan | Tasks | Timing | | -------- | -------- | | Quiz 5 | 10 min | | Task 1 | 10 min | | Task 2 | 40 min | | Task 3 | 15 min | | Task 4 | 20 min | ## Table of Content [toc] # Introduction In the last couple of labs, we had the chance to work with SQL and define many queries with a fair amount of complexity. We had the chance to look at topics like aggregate functions, joins, string functions, nested subqueries and many more. In this lab, we continue our SQL journey by exploring 3 of the most important topics in practice. Authorization, Views, and working with Dates or temporal data in general. We start by introducing a nice new statement that will be useful in out quest. ## The EXPLAIN Statement The explain statement is a special statement in PostgreSQL that -by itself- does not introduce any changes to the database or the query. Explain returns a description of the execution plan the query planner generates for a given statement, it also shows how the tables involved in the statement will be scanned by index scanned or sequential scan, etc. Furthermore, one of the most important and useful information that the explain statement gives is start-cost before the first row can be returned and the total cost to return the complete result set. The following shows the syntax of the explain statement: <center> ```sql explain [ (option [, ...]) ] sql_statement; ``` </center> Where option can be one of many options but we are now interested in: * ANALYZE * VERBOSE ### ANALYZE The analyze option causes the SQL statement to be executed first and then actual run-time statistics is returned, including total elapsed time expended within each phase and the number of rows that phase actually returned. The analyze statement actually executes the SQL statement, therefore, if you want to analyze any DML statement such as insert, update, or delete without affecting the data, you should wrap the explain analyze in a transaction like the following example: ```sql= begin; explain analyse update instructor i set salary = salary + (select count(*) from teaches t where t.id = i.id) * 1000; rollback; ``` Note: you can write analyse or analyze. Both will work. Yes, PostgreSQL is a respectable DBMS 😁 ### VERBOSE The verbose parameter allows you to show additional information regarding the plan. This parameter sets to false by default. Try adding it after analyze and observe the extra details you get. We will use `explain analyse` many times in this lab and to following one to be able to note in details what is actually happening when we execute a statement. # Simple Database Authorization Database authorization is a security mechanism used to determine access levels or user privileges related to database resources and objects. Our database security system is based on a two-step process. The first step is authentication, which verifies the user identity. The second stage is authorization, which is the process of the database system determining which database operations the user can perform on which objects. Our database security system is role-based, which allows the user to use his various privileges on resources and objects based on his given role/roles. ## Roles PostgreSQL uses the concept of roles to manage database access permissions. A role can be considered as a "user", a "group", or both, depending on how the role is set up. A role that has login right is called user. A role may also be a member of other roles, which are known as groups. Roles can own database objects {relations, functions, views, etc.} and have database privileges. A role can control who has access to which object by assign privileges on this object to other roles. Furthermore, it is possible to grant membership in a role to another role. ```sql= --To list out all existing roles on postgresql server Select * from pg_roles; ``` In psql command line, `\du` command is useful for listing the existing roles. ### Creating roles ```sql= --To define a new database role CREATE ROLE rolename [ [ WITH ] [options ... ] ]; ``` Only superusers and users with CREATEROLE privilege can create new roles. However, if you wish to create a new superuser, you must connect as a superuser, CREATEROLE privilege is not enough in this case. The default user “postgres” is a superuser. Being a superuser implies the ability to bypass all access permission checks within the database, so superuserdom should not be granted lightly! CREATE USER is the same as CREATE ROLE except that it implies LOGIN. Other than that, CREATE GROUP and CREATE USER are aliases for CREATE ROLE. ### Role attributes Attributes of a database role define role’s privileges including login, superuser, database creation, role creation, password, etc. ```sql= --To creates a role that has login privilege, password, and valid date CREATE ROLE user1 WITH LOGIN PASSWORD 'pg@pass1999' VALID UNTIL '2023-04-01'; --To creates a superuser which means this role can bypass all authorization checks CREATE ROLE super_admin SUPERUSER; --To create a role with database and role creation privilege CREATE ROLE admin CREATEDB CREATEROLE; ``` **Important Note:** if you skip the clause `PASSWORD` when crating a role, PostgreSQL will assign that role a `password = NULL`. A role cannot log in when it has a NULL password so make sure you include it. ### Role membership Managing roles as groups allows us to grant or revoke privileges from a group as a whole. In PostgreSQL, you create a role that represents a group, and then grant membership in the group role to individual user roles. By convention, a group role does not have LOGIN privilege. To create a group role use the CREATE ROLE statement as follows: <center> ```sql CREATE ROLE group_role; ``` </center> ```sql= --To add user roles or group roles to a group role GRANT group_role TO user_role1, user_role2,..., group_role1; --And to remove a user role from a group role REVOKE group_role FROM user_role1, user_role2, ...; ``` PostgreSQL does not allow the circular membership loops, in which a role is the member of another role and vice versa. So be aware not to create membership chains. A member of a group role can use the privileges of the role **in two ways:** #### SET Role The member temporarily “becomes” the group role. In this state, the member database session has access to the privileges of the group role rather than the original login role, and any database objects created are considered owned by the group role not the original member role. #### Role inheritance The member role that has the INHERIT attribute automatically has use of the privileges of roles of which he is member, including any privileges inherited by those roles. Carefully follow the example below: ```sql= CREATE ROLE Ahmad LOGIN INHERIT; CREATE ROLE sales NOINHERIT; CREATE ROLE marketing; GRANT sales to Ahmad; GRANT marketing to sales; ``` If you connect to PostgreSQL as `Ahmad`, you will have privileges of `Ahmad`, plus privileges granted to `sales`, because Ahmad user role has the INHERIT attribute. However, you do not have privileges of `marketing` because the `NOINHERIT` attribute is defined for the sales user role. When you are connected as `Ahmad`, and after executing the following statement: ```sql= SET ROLE marketing; ``` You will have only those privileges granted to `marketing`, not the ones granted to `Ahmad`. To restore the original privileges, you can use the following statement: ```sql= RESET ROLE; ``` **Notice** that only privileges on database objects are inheritable. `LOGIN`, `SUPERUSER`, `CREATEROLE`, and `CREATEDB` are special roles that cannot be inherited as an ordinary privilege. #### Drop roles NOTE: Before removing a role, you must reassign or remove all objects it owns and revoke its privileges. If you remove a group role, PostgreSQL revokes all memberships in a group automatically. The user roles of the group are not affected. <center> ```sql DROP ROLE role_name; ``` </center> ## Authorization Specification When a database object is created, it is assigned an owner. For most kinds of objects, the initial state is that only the owner (or a superuser) can do anything with the object. The owner could, however, choose to revoke some of their own privileges for safety or security reasons. To allow other roles to use the object, privileges must be granted. A privilege is a right to execute a particular type of SQL statement or to access another user's object. There are different kinds of privileges: `SELECT`, `INSERT`, `UPDATE`, `DELETE`, `REFERENCES`, `CREATE`, and `CONNECT`. ### The GRANT Command Grant gives privileges and confers authorizations on database objects. The grantor of the privilege must be the owner, superuser, or must hold the privilege on the specified item with the explicit permeation to give them to others ‘with grant option’. ```sql GRANT {{privileges list} | ALL PRIVILEGES ( column_name [, ...] ) {privileges list} | ALL PRIVILEGES ( column_name [, ...] ) ... } ON { [ TABLE ] table_or_view_name [, ...] | ALL TABLES IN SCHEMA schema_name [, ...] } -- ON SCHEMA schema_name [, ...] -- ON DATABASE database_name [, ...] TO role_specification [, ...] [ WITH GRANT OPTION ] ``` 1. `ALL [PRIVILEGES]` -- Grant all of the privileges available for the object's type. The `PRIVILEGES` keyword is optional in PostgreSQL. 2. `role_specification` -- Can be: `role_name` | `PUBLIC` | `CURRENT_USER` | `SESSION_USER`. 3. `PUBLIC` can be thought of as an implicitly defined group that always includes all roles. Any particular role will have the sum of privileges granted directly to it, privileges granted to any role it is presently a member of, and privileges granted to `PUBLIC`. ### The REVOKE Command Revoke is used to strip previously granted privileges from one or more roles. ```sql REVOKE {{privileges list} | ALL PRIVILEGES ( column_name [, ...] ) {privileges list} | ALL PRIVILEGES ( column_name [, ...] ) ... } ON { [ TABLE ] table_or_view_name [, ...] | ALL TABLES IN SCHEMA schema_name [, ...] } -- ON SCHEMA schema_name [, ...] -- ON DATABASE database_name [, ...] FROM role_specification [, ...] [ cascade | restrict ] ``` Note: Revoking a privilege from `PUBLIC` does not necessarily mean that all roles have lost that privilege on the object; those who have it granted directly or via another role will still have it! # Views In PostgreSQL As we have seen in our lecture, a view is a database object that is in reality just a stored query. We have seen that a view can be accessed as a virtual table. In other words, we can issue select statements on it, we can join it with other tables or views, or use it anyway we use a real table. Keep in mind that a view does not store data physically, each time we use the view the DBMS runs the stored query and constructs the table to be used. This is the case for all view except for a materialized view which we will explore later. A view can be very useful in many cases. It helps simplify the complexity of a query because we can query a view -which can be based on a complex query- using a simple select statement. Also, like a table; you can grant permission to users through a view that contains specific data those users are authorized to see. To create a view, we use create view statement like the following example: ```sql= create view student_credits as select name, tot_cred from student; ``` To make simple changes to the defining query of a view, we can use the same create statement with or replace addition as follows: ```sql= create or replace view student_credits as select name, tot_cred from student where tot_cred is not null; ``` Note that in PostgreSQL using the create or replace clause you cannot remove an existing column in the view. :::warning :warning: If you try to do it, you will get an error message: “[Err] ERROR: cannot drop columns from view”. ::: The new query must generate the same columns that were generated when the view was created. To be more specific, the new columns must have the same names, same data types, and in the same order as they were created. However, you can append additional columns at the end of the column list. To be able to drastically change the defining query of a view, your only good option is to drop that view and create it again. You can do both the drop of the old view and the creation of the new one in a single transaction to be safe. But things get more complicated if your view has dependent objects. This is beyond the scope of this lab but you are encouraged to read about it. However, if you wish to make simple changes to the view such as changing its name or giving its columns a default value (that can be used in the insert) you can use the alter view statement. For example: ```sql= alter view a rename to b; create table base_table (id int, ts timestamptz); create view a_view as select * from base_table; alter view a_view alter column ts set default now(); insert into base_table(id) values(1); -- ts will receive a null insert into a_view(id) values(2); -- ts will receive the current time ``` :::info For your vast Knowledge: For historical reasons, alter table can be used with views too! ::: It is also worth noting that a view is not only a stored query, PostgreSQL stores additional information and parameters such as the schema of the view when created. You can test that😁. Views created with <code>select * from</code> wouldn't "automatically inherit" the new column, because * gets expanded into a column-list during view creation. Yet if you dropped and re-created the view, the column would appear. You can follow up on the pervious example by executing the following statements and observing the results! ```sql= select * from base_table; select * from a_view; alter table base_table add new_column varchar(20); select * from base_table; select * from a_view; ``` Finally, I guess I don’t have to tell you that you can drop a view use drop view. Anyway, let’s include that example for the sake of completion. ```sql= drop view if exists a_view; ``` ## View Usage Example As discussed in Section 4.4.8 in our textbook, we expect the constraint “an instructor cannot teach sections in two different classrooms in a semester in the same time slot” to hold. Let’s use a view to help us list all (instructor, section) combinations that violate this constraint. First, we create a view showing the ID, name, and all section information for each section thought by any instructor in our university. ```sql= create or replace view all_sections as select ID, "name" as instructor_name , course_id, sec_id, semester, "year", time_slot_id, building , room_number from instructor natural join teaches natural join "section"; ``` We can use this view to answer many queries such as * How many sections did an instructor teach in a certain year? * How many sessions does every room serve? * What is the course with the largest number of different instructors throughout time? * And so on. So now we can use this view to write our main query looking for instructors teaching sections in two different classrooms in a semester in the same time slot. The query will be as simple as: ```sql= select ID from all_sections group by (ID, instructor_name, sec_id, semester, "year", time_slot_id) having count( (building , room_number) ) > 1 ``` ## Materialized Views Materialized views extend the view concept to allow views to store data physically. A materialized view can cache the result of a complex or expensive query and allow us to read those results repeatedly without having to execute the query every time. We can still run the query when we want to refresh this result. Materialized views are useful in many cases that require fast data access therefore they are often used in data warehouses or business intelligent applications. To create a materialized view, you use the create materialized view statement ```sql create materialized view view_name as <<SQL query>> with [no] data; ``` If we want to load data into the materialized view at the creation time, we use with data option, otherwise we put with no data. In case we use with no data, the view is flagged as unreadable. It means that we cannot query data from the view until you load data into it. To load data into a materialized view, we use the refresh materialized view statement. For example, if we wish to materialize the results of our view “all sections” we can use: ```sql= create materialized view mat_all_sections as select ID, "name" as instructor_name , course_id, sec_id, semester, "year", time_slot_id, building , room_number from instructor natural join teaches natural join "section" with no data; ``` If we then try to query this view before refreshing it, **we get an error**. Try it and note the hint. # Dates Dates hold a special place in terms of value. Sometimes the date and time an event occurs are the most important information about it, and for that SQL gives date and time the special treatment they deserve. In addition to providing date, time, timestamp, and interval datatypes. SQL provides a rich collection of date functions we can use to achieve the exact results we need. We will first look into dates’ datatypes then we will explore the functions. ## Dates Datatypes ### DATE PostgreSQL uses 4 bytes to store a date value, so the lowest and highest values a date can have are 4713 BC and 5874897 AD. When storing a date value, PostgreSQL -like all database systems-uses the <code>yyyy-mm-dd</code> format. It also uses this format for inserting data into a date column. If you create a table that has a date column and you want to use the current date as the default value for the column, you can use the CURRENT_DATE after the default keyword. Also, if you wish to use it in your query you can write select CURRENT_DATE; To extracting day, month, or year from a date value, we use the extract function as follows: ```sql= Select CURRENT_DATE, extract (day from CURRENT_DATE) as day, extract (month from CURRENT_DATE) as month, extract (year from CURRENT_DATE) as year; ``` ### TIME The TIME data type that allows you to store the time of day values. A time value may have a precision up to 1 microsecond. The TIME data type requires 8 bytes and its allowed range is from 00:00:00 to 24:00:00. Besides the TIME data type, PostgreSQL provides the ‘TIME with time zone’ data type that allows you to store and manipulate the time of day with time zone. To get the current time with time zone, you use CURRENT_TIME. You can also use in your query and specify the precession. To get the local time, you use LOCALTIME. To extracting hours, minutes, seconds from a time value, we use the extract function as follows: ```sql= Select LOCALTIME, extract (hour from LOCALTIME) as hour, extract (minute from LOCALTIME) as minute, extract (second from LOCALTIME) as second, extract (milliseconds from LOCALTIME) as milliseconds, extract (microseconds from LOCALTIME) as microseconds; ``` ### Timestamp As we have seen in our lecture, timestamp data type allows us to store both date and time. PostgreSQL provides two temporal data types for handling timestamp, one without time zone (timestamp) and one with time zone (timestamptz). PostgreSQL stores the timestamptz in UTC value. When you insert a value into a timestamptz column, PostgreSQL converts the timestamptz value into a UTC value and stores the UTC value in the table. When you query timestamptz from the database, PostgreSQL converts the UTC value back to the time value of the time zone set by the database server, the user, or the current database connection. Both timestamp and timestamptz uses 8 bytes for storing the timestamp value. You can get the time zone of your server by issuing the command: `show TIMEZONE;` To set the time zone of your database server you can sue `set timezone = 'your new t z';` To get the current timestamp you can use the now () function for example: `select now ();` To get the time of day in the string format, you can use the `timeofday()` function. Similar to date and time, to extracting hours, minutes, seconds, day, month, year from a timestamp value, we use the extract function exactly as mentioned above; ### Interval The interval data type allows us to store and manipulate a period of time in years, months, days, hours, minutes, seconds, etc. An interval value requires 16 bytes storage size that can store a period in the range from -178,000,000 years to 178,000,000 years. In addition, an interval value can have an optional precision value p with the permitted range is from 0 to 6. The precision p is the number of fraction digits retained in the second fields. The following examples show some interval values: ```sql= select interval '2 weeks ago'; select interval '2 hours 10 minutes'; select interval '2 years 1 month 3 days 6 hours 12 minutes 30 seconds'; ``` You can apply the arithmetic operators (+, -, *, /) to the interval values, for examples: ```sql= select interval '2h 50m' + interval '10m' -- is 03:00:00 select interval '2h 50m' - interval '50m' -- is 02:00:00 select interval '2h 20m' / 2 -- is 01:10:00 select 600 * interval '1 minute' -- is 10:00:00 ``` Similar to date, time, and timestamp, to extracting hours, minutes, seconds, day, month, year from an interval value, we use the extract function. Note that the returned value here is equivalent to ‘casting’ the part of this interval as another smaller interval. It will not do time conversion for you. Discuss the following examples with your colleague: ```sql= select extract (day from interval '2 weeks ago'); -- is -14 select extract (hour from interval '2 weeks ago'); -- is 0 select extract (hour from interval '2.5 days ago'); -- is -12 select extract (year from interval '6 month'); -- is 0 select extract (year from interval '30 month'); -- is 2 select extract (month from interval '30 month'); -- is 6 ``` ### Special Date/Time Inputs | Input String | Valid Types | Description | | ------------ | ----------- | ----------- | | infinity | date, timestamp | later than all other time stamps | | -infinity | date, timestamp | earlier than all other time stamps | | Now | date, time, timestamp | current transaction's start time | | Today | date, timestamp | midnight (00:00) today | | tomorrow | date, timestamp | midnight (00:00) tomorrow | | yesterday | date, timestamp | midnight (00:00) yesterday | | allballs | time | 00:00:00.00 UTC | ## Date Functions ### Age() As the name suggests, age() function will calculate the age for you. If you pass one date value to the age() function, it will subtract that date value from the current date. If you pass two arguments to the age() function, it will subtract the second argument from the first argument. ### date_part(text, timestamp) This function is exactly equivalent to extract function we explored above. It appears in many contexts so when you see it now you know :smile:. ### make_date(year int, month int, day int) It simply makes a date. You can also write: date ‘2020-03-07’ ### date_trunc(‘field’, timestamp/interval) The function date_trunc is conceptually similar to truncating numbers. In the function you can specify a field that tells to which precision to truncate the input value. The return value is likewise of type timestamp, or interval, and it has all fields that are less significant than the selected one set to zero (or one, for day and month). Valid values for field are: `{microseconds, milliseconds, second, minute, hour, day, week, month, quarter, year, decade, century, millennium}`. Examples: ```sql= select date_trunc('hour', timestamp '2001-02-16 20:38:40'); --result: 2001-02-16 20:00:00 select date_trunc('year', timestamp '2001-02-16 20:38:40'); --result: 2001-01-01 00:00:00 select date_trunc('day', timestamp '2001-02-16 20:38:40'); --result: 2001-02-16 00:00:00 select date_trunc('hour', interval '3 days 02:47:33'); --result: 3 days 02:00:00 ``` ### SQL overlaps operator <center> ``` (start1, end1) OVERLAPS (start2, end2) or (start1, length1) OVERLAPS (start2, length2) ``` </center> This expression yields true when two time periods (defined by their endpoints) overlap, false when they do not overlap. The endpoints can be specified as pairs of dates, times, or time stamps; or as a date, time, or time stamp followed by an interval. When a pair of values is provided, either the start or the end can be written first; overlaps automatically takes the earlier value of the pair as the start. Each time period is considered to represent the half-open interval start <= time < end, unless start and end are equal in which case it represents that single time instant. This means for instance that 5-6 and 6-7 do not overlap. Try the following example: ```sql= select (date '2001-02-16', date '2001-12-21') overlaps (date '2001-10-30', date '2002-10-30'); -- True select (date '2001-02-16', interval '100 days') overlaps (date '2001-10-30', date '2002-10-30'); -- False select (date '2001-10-29', date '2001-10-30') overlaps (date '2001-10-30', date '2001-10-31'); -- False select (date '2001-10-30', date '2001-10-30') overlaps (date '2001-10-30', date '2001-10-31'); -- True ``` **Unfortunately**, our university Database contains no dates to experiment with. Don’t worry, in the next segment we will import a dataset that will allow us to experiment and do some exercises with dates. # More pre-Lab ## Importing data from a CSV file PostgreSQL provides a nice way of importing data into tables even from files such as csv files that can be exported from almost any program. PostgreSQL uses a streaming protocol called the copy protocol that reads data and inserts it to the table. If you are using PostgreSQL from the shell, you can use the <code>\copy</code> command. If you use DBeaver or pgAdmin, both of them provide a GUI to perform the import. We wish to import the file named `web_events.csv` uploaded for you [here](https://drive.google.com/file/d/1zG2i0YFzRwJsgCkS6t-i1Oae5YmG4KHD/view). But first we must create the table that will house the data. ```sql= create table web_events ( event_time timestamptz, event_type text, product_id text, category_id text, category_code text, brand text, price numeric, user_id text, user_session text ); ``` After creating the table and refreshing your schema you can 1. Navigate to the table inside your Database Navigator menu 2. Right click on the table and choose ‘import Data’ 3. From the newly opened menu choose ‘CSV import from CSV file(s)’ 4. Add ‘web_events.csv’ that you have downloaded as a resource. 5. In ‘Importer settings’ section, make sure the option ‘set empty strings to NULL’ is active. 6. Click Next, Next, Next, and start. After this process is completed you will have all new data imported to your new table `web_events`. This data is a subset of events happening on an online store for the entire month of October 2019. You can explore the data by running any select statements you find useful. Some useful ones I will think of are: ```sql= select * from web_events limit 10; select count(*) from web_events ; select count(distinct user_session) from web_events; select count(distinct event_time) from web_events; select count(distinct product_id) from web_events; select count(distinct category_id) from web_events; select count(distinct user_id) from web_events; select distinct event_type from web_events; ``` Try executing those quires and any others you think of. Write down your notes about his dataset and try to discuss this with your colleague. Exploring large datasets and making sense of them using SQL commands can prove to be a very important skill in your arsenal, so make sure you practice this whenever you get a chance. ## Queries with date values and functions We can now use this dataset to answer the following queries. 1. Show the number of `purchase` events that happened every day of this month ordered descending from the day with the most events to the least. ```sql= select date_trunc('day' , event_time) as day, count(*) as Number_of_purchase_events from web_events where event_type = 'purchase' group by 1 order by 2 desc; ``` Note how we used column numbers (in the order they appear in the select clause) in both the group by and order by classes. This makes writing the query easier. Be carful when you do that while adding or removing columns from your query as the order of the columns now matter. 2. Show the day of the week that contains the most purchase events (you can add limit 1 if you like). ```sql= select to_char( event_time, 'Day' ) as day_of_week, count(*) as Number_of_purchase_events from web_events where event_type = 'purchase' group by 1 order by 2 desc; ``` 3. Show the hour of the day with the least events (you can add limit 1 if you like). ```sql= select extract(hour from event_time) as hour_of_day, count(*) as Number_of_events from web_events group by 1 order by 2; ``` 4. Show the daily total sales throughout the month, ordered descending from the day with the most sales (by value) to the day with the least. ```sql= Select date_trunc('day' , event_time) as day, sum(price) as totoal_sales from web_events where event_type = 'purchase' group by 1 order by 2 desc; ``` 5. Show the day(s) that has the maximum number of purchase events. ```sql= with daily_purchase as ( select date_trunc('day' , event_time) as day, count(*) as Number_of_purchase_events from web_events where event_type = 'purchase' group by 1 ), max_purchases as ( select max(Number_of_purchase_events) as value from daily_purchase ) select * from daily_purchase, max_purchases where max_purchases.value = daily_purchase.Number_of_purchase_events; ``` 6. Show the day that has the maximum number of different visitors. ```sql= select date_trunc('day' , event_time) as day, count (distinct user_id) as visitors from web_events group by 1 order by 2 desc; ``` 7. For each week, show the trending category, the category has products that are viewed the most. ```sql= with weekly_cat_report as ( select extract(week from event_time) as week, category_code, count(*) as number_of_view_events from web_events where event_type = 'view' group by 1,2 ) select * from weekly_cat_report r1 where r1.number_of_view_events >= all ( select r2.number_of_view_events from weekly_cat_report r2 where r1.week = r2.week); ``` 8. Show the minimum, maximum, and average time interval between the view of a product, and adding it to cart by buyers. ```sql= with time_of_first_view as ( select user_id , product_id , min (event_time) as first_view from web_events where event_type = 'view' group by 1,2 ), time_of_first_cart as ( select user_id , product_id , min (event_time) as first_cart from web_events where event_type = 'cart' group by 1,2 ) select min(first_cart - first_view), max(first_cart - first_view), avg(first_cart - first_view) from time_of_first_view natural join time_of_first_cart where first_cart > first_view;-- because our dataset is not complete! ``` 9. For each week, show the product that is the most frequently viewed but not purchased. ```sql= with product_views as ( select extract(week from event_time) as week, product_id , count(*) as view_count from web_events where event_type = 'view' group by 1,2 ), purchased_products as ( select distinct product_id from web_events where event_type = 'purchase' ), weekly_views_of_not_purchased as( select week, product_id, view_count from product_views where product_id not in (select product_id from purchased_products) ) select * from weekly_views_of_not_purchased v1 where v1.view_count >= all (select view_count from weekly_views_of_not_purchased v2 where v1.week = v2.week ) ``` ###### tags: `Database Systems` `Pre-Lab` `IUG` `Computer Engineering` <center>End Of Pre-Lab 5</center>