---
title: Lecture Agenda
description:
duration: 5400
card_type: cue_card
---
## Tableau Lecture 7: Operations on Dataset
#### Agenda
* Joins
* Building a map visualization
* Cross-database join
* Relationship
* Unions
* Blend
---
title: Joins
description:
duration: 5400
card_type: cue_card
---
### Joins
* The data that you want to analyze is often made up of a collection of tables that are related by specific fields.
* Joining is a method for combining the related data on those common fields.
* The result of combining data using a join is a table that’s typically extended horizontally by adding fields of data.
<img src= https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/046/547/original/joins.png?1693934401 height=500 width=500>
#### Business problem 1:
Find per country CO2 KT and per capita emission and filter top 10 countries by CO2 KT.
Dataset : [World_bank_CO2.xlsx](https://docs.google.com/spreadsheets/d/1zUMnxp9aPdV5kYWJSoCkVqTEurJvd86d/edit?usp=share_link&ouid=100659516601446935794&rtpof=true&sd=true)
**Steps**
1. Creating Join
* Drag the 'CO2 (kt) Pivoted' table to the Canvas
* Double click on 'CO2 (kt) Pivoted' box to get the physical layer.
* Drag 'CO2 per capita (pivoted)' to the canvas.
* Click on the join symbol
* Change key field for both tables to country name.
<img src= https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/046/552/original/co2_join.png?1693935881 height=400 width=700>
2. Visualization
* Drag Latitude to rows shelf
* Drag longitude to columns shelf
* Drag country name to label in marks card
* Drag CO2 (kt) to size in marks card
* Drag CO2 Per Capita to color in marks card
* In the marks card, change visualisation type to map.
<img src= https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/046/555/original/sheet_join.png?1693936329 height=400 width=700>
3. Applying Filter
* Drag country name to filters card
* Go to 'Top' tab, choose the CO2 (kt) field, click ok.
<img src= https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/046/556/original/filter.png?1693936962 height=400 width=700>
---
title: Quiz-1
description:
duration: 30
card_type: quiz_card
---
# Question
Can you join more than 2 tables in Tableau?
# Choices
- [x] Yes
- [ ] No
---
title: Geographic Data Type
description:
duration: 5400
card_type: cue_card
---
### Geographic Data Type
#### Format geographic data in Tableau
* Depending on the type of map you want to create, you must assign certain data types, data roles, and geographic roles to your fields (or columns).
* For example, in most cases, your latitude and longitude fields should have a data type of number (decimal), a data role of measure, and be assigned the Latitude and Longitude geographic roles.
* All other geographic fields should have a data type of string, a data role of dimension, and be assigned the appropriate geographic roles.
#### Change the data type of a column
* Tableau might incorrectly assign a Postal Code column a data type of Number (whole). To create map views, your Postal Code data must have a data type of String.
### Assign geographic roles to your geographic data
#### How does tableau know where to plot on map?
When you assign the correct geographic role to a field, Tableau assigns latitude and longitude values to each location in that field by finding a match that is already built in to the installed geocoding database.
This is how Tableau knows where to plot your locations on the map.
**Geographic roles** are sometimes **automatically assigned** to your data. You can tell a geographic role has been assigned to your data because the column includes a **globe icon**.
If a **geographic role** is not automatically assigned, you can **manually assign one to your field**.
<img src= https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/046/565/original/geo.png?1693937628 height=300 width=500>
### Importance of Maps (for instructor ref)
* You have some location data in your data source
* You think that it would enhance getting better insight from data compared to other visualization
* It makes representing data in much more beautiful manner
* To answer spatial questions like
* What is the state wise sales?
* Which states has the highest profit?
* In which state we had the highest sales for a particular product?
### Few things to keep in mind when creating a map view in Tableau
* If your data source has **latitude and longitude fields** then they should have a **data type of numeric(decimal)** and a **data role of measure** should be assigned to them
* And other **geographical fields like country, city, state** should be assigned **string data type** and a **data role of dimension**
* Tableau will automatically assign geographic roles based on the field name and values present in the data
* **If our data source does not contains latitude and longitude values we can still create maps using the geographical fields available in the data source**
### Pan, Zoom and Select in Tableau (for instructor ref)
<img src= https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/046/578/original/pan.png?1693938122 height=200 width=600>
---
title: Quiz-2
description:
duration: 60
card_type: quiz_card
---
# Question
What's the benefit of assigning a geographic role to a field?
# Choices
- [x] Tableau can assign latitude and longitude values to each location, which are needed for plotting a map
- [ ] We can't perform any analysis without assigning a geographic role to all the fields
---
title: Cross Database Joins
description:
duration: 5400
card_type: cue_card
---
### Cross Database Joins
* In Tableau, a cross-database join is when you join two or more tables from different databases using a common field. This creates a single table in Tableau that can be used to create visualizations.
* Cross-database joins require a multi-connection data source—that is, you create a new connection to each database before you join the tables.
* We’ll try to join tables from a CSV file and an Excel file
* Dataset:
* [Retail_store_order_details.xlsx](https://docs.google.com/spreadsheets/d/1Sa5n50BOhbW0O3YDXTnL51IZj9vSj4Lk/edit?usp=share_link&ouid=100659516601446935794&rtpof=true&sd=true)
* [Retail_store_product_details.csv](https://drive.google.com/file/d/1V69qfTOTadbBfK6BOq8hJAFPvMjDJCXY/view?usp=share_link)
#### Business problem 2:
Identify subcategories that made the highest and lowest sales.
**Steps**
1. Loading the two different data sources and adding an inner join between the two tables.
Note: One is the `orders` table and the other is the `products` table.
<img src= https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/046/603/original/cross.png?1693966455 height=400 width=700>
2. Solving the business problem
* Go to a worksheet
* Drag Sub-Category to Columns
* Drag Sales to Rows
<img src= https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/046/604/original/cross_2.png?1693966758 height=400 width=700>
---
title: Quiz-3
description:
duration: 45
card_type: quiz_card
---
# Question
Which of these are necessary for creating a cross-database join?
# Choices
- [ ] A new connection to each database
- [ ] A common field between the tables
- [x] Both
- [ ] None of these
---
title: Relationships
description:
duration: 5400
card_type: cue_card
---
### Relationships
Relationships are a dynamic, flexible way to combine data from multiple tables for analysis.
A relationship describes how two tables relate to each other, based on common fields, but does not merge the tables together.
When a relationship is created between tables, the tables remain separate, maintaining their individual level of detail and domains.
Think of a relationship as a contract between two tables. When you are building a viz with fields from these tables, Tableau brings in data from these tables using that contract to build a query with the appropriate joins.
[Advantages of Relationships](https://help.tableau.com/current/pro/desktop/en-us/relate_tables.htm#:~:text=Relationships%20provide%20several%20advantages%20over%20using%20joins%20for%20multi%2Dtable%20data%3A)
[Relationships vs Joins](https://help.tableau.com/current/server/en-us/datasource_relationships_learnmorepage.htm)
**Instructor Note:**
* We’ll just demonstrate the concept and explain how it is different from Join.
* We’ll be demonstrating the same example as we saw for Joins but instead of Joins here we’ll be building relationships for understanding the topic refer to Varma's lecture [here](https://www.scaler.com/meetings/i/beg-tableau-misc-topic-3-2/archive) [Timestamp 1:19:40 to 1:27:50]
**Steps**
* Add the World Bank CO2 dataset
* Drag CO2 (kt) Pivoted on the canvas
* Drag CO2 Per Capita Pivoted on the canvas
* In clause, set field for both tables to country name.
<img src= https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/046/612/original/rel.png?1693979362 height=400 width=700>
---
title: Union
description:
duration: 5400
card_type: cue_card
---
### Union
Though union is not a type of join, union is another method for combining two or more tables by appending rows of data from one table to another.
Make sure the tables you union have the same number of fields, the same field names, and the fields are the same data type.
Data: [World_bank_CO2.xlsx](https://docs.google.com/spreadsheets/d/1zUMnxp9aPdV5kYWJSoCkVqTEurJvd86d/edit?usp=share_link&ouid=100659516601446935794&rtpof=true&sd=true)
#### Perform Union on table CO2 (kt) Pivoted and New CO2 (kt) Pivoted.
**Steps**
* Drag CO2 (kt) Pivoted to the canvas
* Double click 'CO2 (kt) Pivoted' to open physical layer.
* Drag New CO2 (kt) Pivoted just below 'CO2 (kt) Pivoted' on the canvas.
* Close the physical layer.
<img src= https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/046/613/original/union.png?1693979713 height=400 width=700>
---
title: Quiz-4
description:
duration: 45
card_type: quiz_card
---
# Question
Which of these is NOT TRUE with respect to Union of tables?
# Choices
- [ ] The tables should have the same fields.
- [ ] The fields in the tables should be of same data type
- [x] The first table is extended horizontally on doing a union.
---
title: Blending
description:
duration: 5400
card_type: cue_card
---
### Blending
Data Blending is a method for combining data from multiple sources.
Data Blending brings in additional information from a Secondary data source and displays it with data from the Primary data source directly in the view.
**Example:**
* Consider the Sales data is present in a relational database and Sales Target data in an Excel spreadsheet.
* Now, to compare actual sales to target sales, you can blend the data based on common dimensions to get access to the Sales Target measure.
* The two sources involved in data blending are referred as **Primary** and **Secondary** data sources.
**Note:** A left join is created between the primary data source and the secondary data source.
#### Business problem 3:
For each country find the total number of olympic medals won and CO2 per capita (metric tons) emission till 2008.
Dataset : [Modified_Summer_Olympic_medallists_1896-2008.xlsx](https://docs.google.com/spreadsheets/d/1nD7DYtJlsnbKQEAvHfY13xzvBsuL3pM1/edit?usp=share_link&ouid=100659516601446935794&rtpof=true&sd=true) & [World_bank_CO2.xlsx](https://docs.google.com/spreadsheets/d/1zUMnxp9aPdV5kYWJSoCkVqTEurJvd86d/edit?usp=share_link&ouid=100659516601446935794&rtpof=true&sd=true)
**Note:**
* Use table Team event fixed from Modified_Summer_Olympic_medallists_1896-2008
* Use table CO2 per capita (Pivoted) from World_bank_CO2 dataset
**Steps**
1. Loading 2 Data sources in a Single workbook-
* Since world bank co2 dataset is already connected, drag CO2 (kt) Pivoted to the canvas
* Go to worksheet
* In Data menu, click new data source and select modified summer olympics excel file.
* Go to Data Source and from modified summer olympics file, drag the team events fixed table to the canvas.
<img src= https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/046/619/original/blend.png?1693980977 height=300 width=500>
2. Edit Blend Relationship
* Click on CO2 Per Capita in Data panel
* Go to Data menu, click Edit Blend Relationships...
* Select Custom and click on Add...
* Choose country name from Primary Data Source field, click ok on both dialog boxes.
<img src= https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/046/645/original/blend_2.png?1693985635 height=400 width=700>
3. Edit Alias
If you drag country name from CO2 data to rows shelf, an orange link appears next to 'country' in Olympics data, saying - 'Stop using country as linking field'
This is because name of some countries (like USA) is written differently in the to files. We can set an alias of a country if needed using the following steps -
* In CO2 data, from country name dropdown in the data panel, click Aliases...
* Set the alias for Unites States as USA.
* Drag country name from CO2 data and country from Olympics data to rows shelf.
* Observe that the entries for 'USA' are next two each other.
<img src= https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/046/656/original/alias.png?1693986725 height=400 width=700>
**Note:** If the same error pops up when bringing country name from one table just click on the orange link that appears.
4. Solving the business problem
* Drop country name to rows
* Drop CO2 Per Capita to columns
* Drop Total from Olympics data to columns
* Click on '>149 nulls' in bottom right corner of the sheet and select Filter data
* Drag Year in CO2 data to filters and set maximum value to 2008.
<img src= https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/046/705/original/blend_4.png?1693999914 height=400 width=700>
### Joins vs Blend
<img src= https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/046/706/original/j_v_b.png?1694000081 height=300 width=500>
### Cross-database Join vs Blending
<img src= https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/046/707/original/cdj.png?1694000287 height=250 width=600>
\
<span style="color: red;">**Important:**</span> Please inform the learners that they'll get the reference notes for today's lecture as a post-lecture content on their dashboard.
---
title: Quiz-5
description:
duration: 30
card_type: quiz_card
---
# Question
What kind of Join is created in a Blend?
# Choices
- [x] Left Join
- [ ] Right Join
- [ ] Inner Join
- [ ] Outer Join
---
title: Additional information for instructor's reference
description:
duration: 5400
card_type: cue_card
---
### Additional information for instructor's reference
#### Alias
You can create aliases (alternate names) for members in a dimension so that their labels appear differently in the view.
Aliases can be created for the members of discrete dimensions only. They cannot be created for continuous dimensions, dates, or measures.
#### Options to Combine Data
<img src= https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/046/708/original/comb.png?1694000486 height=400 width=700>