---
title: Agenda
description: Agenda of this lecture
duration: 220
card_type: cue_card
---
## Agenda
- Compound Joins
- Types of Joins
- Cross Join
- USING
- NATURAL
- IMPLICIT JOIN
- Join with WHERE vs ON
- UNION
---
title: Compound Joins
description:
duration: 1080
card_type: cue_card
---
## Compound Joins
Till now, whenever we did a join, we joined based on only 1 condition. Like in where clause we can combine multiple conditions, in Joins as well, we can have multiple conditions on multiple columns.
Let's see an example. For every film, name all the films that were released in the range of 2 years before or after that film and there rental rate was more than the rate of the movie.
```sql
SELECT f1.name, f2.name
FROM film f1
JOIN film f2
ON (f2.year BETWEEN f1.year - 2 AND f1.year + 2) AND f2.rental > f1.rental;
```
> Note:
> 1. Join does not need to happen on equality of columns always.
> 2. Join can also have multiple conditions.
A Compound Join is one where Join has multiple conditions on different columns.
---
title: Types of Joins
description:
duration: 1620
card_type: cue_card
---
## Types of Joins
While we have pretty much discussed everything that is mostly important to know about joins, there are a few nitty gritties that we should know about.
Let's take the join query we had written a bit earlier:
```sql
SELECT s1.name, s2.name
FROM students s1
JOIN students s2
ON s1.buddy_id = s2.id;
```
Let's say there is a student that does not have a buddy, i.e., their `buddy_id` is null. What will happen in this case? Will the student be printed?
> NOTE: Give hints to get someone to say `no` and explain why.
If you remember what we discussed about CRUD , is NULL equal to anything? Nope. Thus, the row will never match with anything and not get printed. The join that we discussed earlier is also called inner join. You could have also written that as:
```sql
SELECT s1.name, s2.name
FROM students s1
INNER JOIN students s2
ON s1.buddy_id = s2.id
```
The keyword INNER is optional. By default a join is INNER join.
As you see, an INNER JOIN doesn't include a row that didn't match the condition for any combination.
Opposite of INNER JOIN is OUTER JOIN. Outer Join will include all rows, even if they don't match the condition. There are 3 types of outer joins:
- Left Join
- Right Join
- Full Join
---
title: Left Joins
description:
duration: 1620
card_type: cue_card
---
#### Please use macro for demo: https://docs.google.com/spreadsheets/d/1RHZ4T7Kmibk5InZUrsTiBGQFAbZoOyVdaYH5hbmMcTw/edit?usp=sharing
As the names convey, `left join` will include all rows from the `left` table, and include rows from `right table` which matches join condition. If there is any row for which there is no match on right side then it will be considered as `Null`.
Let's take an example to understand these well:
Assume we have 2 tables: students and batches with following data:
`batches`
| batch_id | batch_name |
|----------|------------|
| 1 | Batch A |
| 2 | Batch B |
| 3 | Batch C |
`students`
| student_id | first_name | last_name | batch_id |
|------------|------------|-----------|----------|
| 1 | John | Doe | 1 |
| 2 | Jane | Doe | 1 |
| 3 | Jim | Brown | null |
| 4 | Jenny | Smith | null |
| 5 | Jack | Johnson | 2 |
Now, let's write query for Left join:
```sql
SELECT *
FROM students s
LEFT JOIN batches b
ON s.batch_id = b.batch_id;
-- Same as
SELECT *
FROM students s
LEFT OUTER JOIN batches b
ON s.batch_id = b.batch_id;
```
---
title: Quiz 1
description:
duration: 45
card_type: quiz_card
---
# Question
In a LEFT JOIN, if there are no matching rows in the right table:
# Choices
- [ ] Those rows are excluded from the result.
- [x] Those rows are included with NULL values for columns from the right table.
- [ ] The query returns an error.
- [ ] Those rows are included with default values for columns from the right table.
---
title: Quiz 2
description:
duration: 45
card_type: quiz_card
---
# Question
Which of the following is the correct syntax for a LEFT JOIN on table1 with table2?
# Choices
- [ ] SELECT * FROM table1 JOIN table2 ON table1.id = table2.id
- [ ] SELECT * FROM table2 LEFT JOIN table1 ON table1.id = table2.id
- [x] SELECT * FROM table1 LEFT JOIN table2 ON table1.id = table2.id
- [ ] None of the above
---
title: Quiz 3
description:
duration: 45
card_type: quiz_card
---
# Question
In a LEFT JOIN, if a row in the left table matches multiple rows in the right table:
# Choices
- [x] All matching rows are included in the result.
- [ ] Only the first matching row is included in the result.
- [ ] Only the last matching row is included in the result.
- [ ] An error is returned.
---
title: Quiz 4
description:
duration: 45
card_type: quiz_card
---
# Question
In a LEFT JOIN, if there are no matching rows in the right table:
# Choices
- [ ] Those rows are excluded from the result.
- [X] Those rows are included with NULL values for columns from the right table.
- [ ] The query returns an error.
- [ ] Those rows are included with default values for columns from the right table.
---
title: Quiz 5
description:
duration: 45
card_type: quiz_card
---
# Question
Which statement is true about the order of tables in a LEFT JOIN?
# Choices
- [ ] It does not matter which table is specified first.
- [X] The left table must always be specified first.
- [ ] The right table must always be specified first..
- [ ] The tables must be listed in alphabetical order.
---
title: Right Joins
description:
duration: 1620
card_type: cue_card
---
As the names convey, `right join` will include all rows from the left table, and include rows from `left table` which matches join condition. If there is any row for which there is no match on left side then it will be considered as `Null value`.
```sql
SELECT *
FROM students s
RIGHT JOIN batches b
ON s.batch_id = b.batch_id;
-- Same as
SELECT *
FROM students s
RIGHT OUTER JOIN batches b
ON s.batch_id = b.batch_id;
```
---
title: Quiz 6
description:
duration: 45
card_type: quiz_card
---
# Question
In a RIGHT JOIN, if there is no match in the left table, what will be the value of columns from the left table in the result set?
# Choices
- [x] NULL
- [ ] 0
- [ ] Infinity
- [ ] None of the above
---
title: Quiz 7
description:
duration: 45
card_type: quiz_card
---
# Question
When using a RIGHT JOIN, the order of tables in the FROM clause matters.
# Choices
- [x] Yes
- [ ] No
---
title: Full Outer Join
description:
duration: 1620
card_type: cue_card
---
As the names convey, `Full join` will include all rows from the left table as well as right table, If there is any row for which there is no match on either of the sides then it will be considered as `Null value`.
```sql
SELECT *
FROM students s
FULL OUTER JOIN batches b
ON s.batch_id = b.batch_id;
```
Now let's use different types of joins and tell me which row do you think will not be a part of the join.
> NOTE: Explain each type of join again after seeing students' answers.
---
title: Quiz 8
description:
duration: 45
card_type: quiz_card
---
# Question
Which of the following rows will NOT be a part of the result set in a LEFT JOIN of the students table on the batches table on batch_id?
Columns in students table are student_id, first and last name, batch_id.
# Choices
- [ ] [1, John, Doe, 1]
- [ ] [3, Jim, Brown, null]
- [ ] [5, Jack, Johnson, 2]
- [x] None of the above
---
title: Quiz Explanation
description:
duration: 600
card_type: cue_card
---
Since the student's table is on the left side, and in the case of left joins, all rows from the left table are included, none of the provided options accurately represents this situation. Therefore, the correct answer is none of the above.
---
title: Quiz 9
description:
duration: 45
card_type: quiz_card
---
# Question
If we perform a RIGHT JOIN of the students table on the batches table on batch_id, which row from the students table will NOT be included in the result set?
Columns in students table are student_id, first and last name, batch_id.
# Choices
- [ ] [1, John, Doe, 1]
- [x] [3, Jim, Brown, null]
- [ ] [5, Jack, Johnson, 2]
- [ ] None of the above
---
title: Quiz Explanation
description:
duration: 600
card_type: cue_card
---
Since the student's table is on the left side, in the case of right joins, all rows from the right table are included, and only matching rows from the left side are included.
---
title: Quiz 10
description:
duration: 45
card_type: quiz_card
---
# Question
For an INNER JOIN of the students table on the batches table on batch_id, which of the following rows will NOT be included in the resulting set?
# Choices
- [ ] [1, John, Doe, 1]
- [x] [3, Jim, Brown, null]
- [ ] [5, Jack, Johnson, 2]
- [ ] None of the above
---
title: Quiz Explanation
description:
duration: 600
card_type: cue_card
---
In case of inner joins only matching condition rows are included in the answer. Since option B have Null value corresponding to student_id column so it will not be a part of the answer.
---
title: Quiz 11
description:
duration: 45
card_type: quiz_card
---
# Question
Which row will NOT appear in the resulting set when we perform a FULL OUTER JOIN of the students table on the batches table on batch_id?
# Choices
- [ ] [1, John, Doe, 1]
- [ ] [3, Jim, Brown, null]
- [ ] [5, Jack, Johnson, 2]
- [x] None of the above
---
title: Quiz Explanation
description:
duration: 600
card_type: cue_card
---
In case of full outer join all the rows from both the tables are included.
---
title: CROSS JOIN
description:
duration: 600
card_type: cue_card
---
## CROSS JOIN
There is one more type of join that we haven't discussed yet. It is called cross join. Cross join is a special type of join that doesn't have any condition. It just combines every row of the first table with every row of the second table. Let's see an example:
```sql
SELECT *
FROM students s
CROSS JOIN batches b;
```
Now you may wonder why might someone need this join? For example, in a clothing store's database, one table might have a list of colors, and another table might have a list of sizes. A cross join can generate all possible combinations of color and size.
Cross join produces a table where every row of one table is joined with all rows of the other table. So, the resulting table has `N*M` rows given that the two tables have N and M rows.
> NOTE: Show a demo on a table in Sakila database.
In the Sakila database, a cross join between the film and actor tables can be performed to generate all possible combinations of films and actors. Here's an example SQL query:
```sql=
SELECT f.film_id, f.title AS film_title, a.actor_id, a.first_name AS actor_first_name, a.last_name AS actor_last_name
FROM film AS f
CROSS JOIN actor AS a;
```
That's pretty much all different kind of joins that exist. There are a few more syntactic sugars that we can use to write joins. Let's see them:
---
title: USING
description:
duration: 420
card_type: cue_card
---
## USING
Let's say we want to join 2 tables on a column that has the same name in both the tables. For example, in the students and batches table, we want to join on the column `batch_id`. We can write the join as:
```sql
SELECT *
FROM students s
JOIN batches b
ON s.batch_id = b.batch_id;
```
But there is a shorter way to write this. We can write this as:
```sql
SELECT *
FROM students s
JOIN batches b
USING (batch_id);
```
---
title: NATURAL JOIN
description:
duration: 420
card_type: cue_card
---
## NATURAL JOIN
Many times it happens that when you are joining 2 tables, they are mostly on the columns with same name. If we want to join 2 tables on all the columns that have the same name, we can use NATURAL JOIN. For example, if we want to join students and batches table on all the columns that have the same name, we can write:
```sql
SELECT *
FROM students s
NATURAL JOIN batches b;
```
---
title: IMPLICIT JOIN
description:
duration: 180
card_type: cue_card
---
## IMPLICIT JOIN
There is one more way to write joins. It is called implicit join. In this, we don't use the JOIN keyword. Instead, we just write the table names and the condition. For example, if we want to write the join query that we wrote earlier as implicit join, we can write:
```sql
SELECT *
FROM students s, batches b;
```
Behind the scenes, this is same as a cross join.
---
title: Join with WHERE vs ON
description:
duration: 780
card_type: cue_card
---
## Join with WHERE v/s ON
Let's take an example to discuss this. If we consider a simple query:
```sql
SELECT *
FROM A
JOIN B
ON A.id = B.id;
```
In pseudocode, it will look like:
```python3
ans = []
for row1 in A:
for row2 in B:
if (ON condition matches):
ans.add(row1 + row2)
for row in ans:
print(row.id, row.id)
```
Here, the size of intermediary table (`ans`) will be less than `n*m` because some rows are filtered.
We can also write the above query in this way:
```sql
SELECT *
FROM A, B
WHERE A.id = B.id;
```
The above query is nothing but a CROSS JOIN behind the scenes which can be written as:
```sql
SELECT *
FROM A
CROSS JOIN B
WHERE A.id = B.id;
```
Here, the intermediary table `A CROSS JOIN B` is formed before going to WHERE condition.
In pseudocode, it will look like:
```python3
ans = []
for row1 in A:
for row2 in B:
ans.add(row1 + row2)
for row in ans:
if (WHERE condition matches):
print(row.id, row.id)
```
The size of `ans` is always `n*m` because table has cross join of A and B. The filtering (WHERE condition) happens after the table is formed.
From this example, we can see that:
1. The size of the intermediary table (`ans`) is always greater or equal when using WHERE compared to using the ON condition. Therefore, joining with ON uses less internal space.
2. The number of iterations on `ans` is higher when using WHERE compared to using ON. Therefore, joining with ON is more time efficient.
In conclusion,
1. The ON condition is applied during the creation of the intermediary table, resulting in lower memory usage and better performance.
2. The WHERE condition is applied during the final printing stage, requiring additional memory and resulting in slower performance.
3. Unless you want to create all possible pairs, avoid using CROSS JOINS.
---
title: UNION
description:
duration: 1080
card_type: cue_card
---
## UNION
Sometimes, we want to print the combination of results of multiple queries. Let's take an example of the following tables:
`students`
| id | name |
|----|------|
`employees`
| id | name |
|----|------|
`investors`
| id | name |
|----|------|
You are asked to print the names of everyone associated with Scaler. So, in the result we will have one column with all the names.
We can't have 3 SELECT name queries because it will not produce this singular column. We basically need SUM of such 3 queries. Join is used to stitch or combine rows, here we need to add the rows of one query after the other to create final result.
UNION allows you to combine the output of multiple queries one after the other.
```sql
SELECT name FROM students
UNION
SELECT name FROM employees
UNION
SELECT name FROM investors;
```
Now, as the output is added one after the other, there is a constraint: Each of these individual queries should output the same number of columns.
Note that, you can't use ORDER BY for the combined result because each of these queries are executed independently.
UNION outputs distinct values of the combined result. It stores the output of individual queries in a set and then outputs those values in final result. Hence, we get distinct values. But if we want to keep all the values, we can use UNION ALL. It stores the output of individual queries in a list and gives the output, so we get all the duplicate values.
That's all about Union and Joins! See you next time. Thanks.
---
title: Announcements:
description:
duration: 150
card_type: cue_card
---
Note: Please check out these notes for revision: https://drive.google.com/file/d/1WqKKZhZSUaKiD4FWeOVyDIrEiv5zClk9/view?usp=drive_link
Hey everyone, this session contains some good question on outer join concept. Get your hands dirty to get best out of this topic.