Piyush Ranjan
---
title: Introduction
description:
duration: 200
card_type: cue_card
---

### **DataViz-Lecture 02-VideoGames (1 hour 30 minutes)**

#### **Content**

- Quizzes
    - Quiz 1 (Scatterplot)
    - Quiz 2 (Barplot)
- Bivariate
    - Continuous-Continuous
        - Line plot
        - Scatterplot
    - Categorical-Categorical
        - Dodged countplot
        - Stacked countplot
    - Categorical-Continuous
        - Multiple BoxPlots
        - Barplots
- Subplots

---
title: Bivariate Data Visualisation intro, Line plot
description:
duration: 1200
card_type: cue_card
---

#### **Importing the data**

Code:
```python=
!gdown https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/021/299/original/final_vg1_-_final_vg_%281%29.csv?1670840166 -O vgsales.csv
```

> Output:

```
Downloading...
From: https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/021/299/original/final_vg1_-_final_vg_%281%29.csv?1670840166
To: /content/vgsales.csv
100% 2.04M/2.04M [00:01<00:00, 1.76MB/s]
```

Code:
```python=
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
```

Code:
```python=
data = pd.read_csv('vgsales.csv')
data.head()
```

> Output:

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/932/original/1.png?1695752105" width="700" height="150">

Code:
```python=
data.describe()
```

> Output:

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/933/original/2.png?1695752161" width="700" height="250">

### **Bivariate Data Visualization**

#### **Continuous-Continuous** (30-40 Minutes)

So far we have been analyzing only a single feature.

But what if we want to visualize two features at once?

#### What kind of questions can we ask regarding a continuous-continuous pair of features?

- Maybe show the relation between two features, like **how do the sales vary over the years**?
- Or show **how the features are associated, positively or negatively**?

...and so on.

Let's go back to the line plot we plotted at the very beginning.

#### **Line Plot**

- A line chart in data visualization is a type of **graph** that displays data points as connected line segments.
- It is commonly used to show **trends**, **patterns**, or changes in data over time or across categories, with the x-axis typically representing **time or categories** and the y-axis representing **values or quantities**.
- Line charts are useful for visualizing continuous data and making it easier to understand how variables relate to each other.

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/322/original/Line.png?1695495606" width="400" height="300">

#### How can we plot the sales trend over the years for the longest running game?

First, let's find the longest running game.

Code:
```python=
data['Name'].value_counts()
```

> Output:

```
Ice Hockey                                       41
Baseball                                         17
Need for Speed: Most Wanted                      12
Ratatouille                                       9
FIFA 14                                           9
                                                 ..
Indy 500                                          1
Indy Racing 2000                                  1
Indycar Series 2005                               1
inFAMOUS                                          1
Zyuden Sentai Kyoryuger: Game de Gaburincho!!     1
Name: Name, Length: 11493, dtype: int64
```

Great, so `Ice Hockey` has the most entries (41) of any title, which we take to mean it is the longest running game.

Let's try to find its sales trend in North America across the years.

Code:
```python=
ih = data.loc[data['Name']=='Ice Hockey']
sns.lineplot(x='Year', y='NA_Sales', data=ih)
```

> Output:

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/945/original/4.png?1695752926" width="400" height="300">

#### What can we infer from this graph?
- The sales across North America seem to have been boosted in the years 1995-2005
- Post 2010 though, the sales seem to have taken a dip

Line plots are great for representing trends such as the above over time.

#### Style and Labelling

We already learnt with the barplot how to add a **title, x-label and y-label**.

Let's add the same here.

Code:
```python=
plt.title('Ice Hockey Sales Trend')
plt.xlabel('Year')
plt.ylabel('Sales')
sns.lineplot(x='Year', y='NA_Sales', data=ih)
plt.show()
```

> Output:

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/946/original/5.png?1695753007" width="400" height="300">

- The **labels** give meaning to the values on the x and y axes
- The **title** states the purpose of the plot

#### Now what if we want to change the colour of the curve?

`sns.lineplot()` has an argument **color**

- It takes a matplotlib color

OR

- a string for some predefined colours like:
    - black: `k`/`black`
    - red: `r`/`red` etc

**But which colours can we use?**

Matplotlib provides a lot of colours.

Check the documentation for more colours: <https://matplotlib.org/2.0.2/api/colors_api.html>

Code:
```python=
plt.title('Ice Hockey Sales Trend')
plt.xlabel('Year')
plt.ylabel('Sales')
sns.lineplot(x='Year', y='NA_Sales', data=ih, color='r')
plt.show()
```

> Output:

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/947/original/6.png?1695753295" width="400" height="300">

Now, let's say we only want to show the values for a particular range of years, here 1995-2010.

#### How can we limit our plot to a specific range of years?

This requires changing the range of the x-axis.

#### But how can we change the range of an axis in matplotlib?

We can use:

- `plt.xlim()`: x-axis
- `plt.ylim()`: y-axis

Both functions take 2 args marking the two ends of the range. For `plt.xlim()` they are:

1. `left`: Starting point of range
2. `right`: End point of range

For `plt.ylim()` the corresponding names are `bottom` and `top`.

Code:
```python=
plt.title('Ice Hockey Sales Trend')
plt.xlabel('Year')
plt.ylabel('NA Sales')
plt.xlim(left=1995,right=2010)
sns.lineplot(x='Year', y='NA_Sales', data=ih)
plt.show()
```

> Output:

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/948/original/7.png?1695753488" width="400" height="300">

So far we have visualised a single plot to understand it.

**What if we want to compare it with some other plot?**

Say, we want to compare the same sales trend between two games

- Ice Hockey
- Baseball

Let's first plot the trend for "Baseball".

Code:
```python=
baseball = data.loc[data['Name']=='Baseball']
sns.lineplot(x='Year', y='NA_Sales', data=baseball)
```

> Output:

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/949/original/8.png?1695753579" width="400" height="300">

Now, to compare these, we will have to draw both plots in the same figure.

#### How can we plot multiple plots in the same figure?

Code:
```python=
sns.lineplot(x='Year', y='NA_Sales', data=ih)
sns.lineplot(x='Year', y='NA_Sales', data=baseball)
```

> Output:

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/950/original/9.png?1695753641" width="400" height="300">

We can simply call `sns.lineplot()` multiple times.

Observe: Seaborn automatically drew the 2 lines in **different colors**

#### However, how can we know which colour belongs to which plot?
- `sns.lineplot()` has another argument **label** to do so
- We can simply set the label of each plot

Code:
```python=
sns.lineplot(x='Year', y='NA_Sales', data=ih, label='Ice Hockey')
sns.lineplot(x='Year', y='NA_Sales', data=baseball, label='Baseball')
```

> Output:

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/951/original/10.png?1695753784" width="400" height="300">

We can also pass these labels to `plt.legend()` as a list, in the order in which the plots were drawn.

Code:
```python=
sns.lineplot(x='Year', y='NA_Sales', data=ih)
sns.lineplot(x='Year', y='NA_Sales', data=baseball)
plt.legend(['Ice Hockey','Baseball'])
plt.show()
```

> Output:

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/952/original/11.png?1695754010" width="400" height="300">

#### Now can we change the position of the legend, say, to the bottom-right corner?

- Matplotlib automatically decides the best position for the legend
- But we can also change it using the `loc` parameter
- `loc` takes one of the following strings as input:
    - upper center
    - upper left
    - upper right
    - lower right etc

Code:
```python=
sns.lineplot(x='Year', y='NA_Sales', data=ih)
sns.lineplot(x='Year', y='NA_Sales', data=baseball)
plt.legend(['Ice Hockey','Baseball'], loc='lower right')
plt.show()
```

> Output:

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/953/original/12.png?1695754113" width="400" height="300">

#### Now what if we want the legend to be outside the plot?
Maybe the plot is too congested to show the legend.

We can use the same `loc` parameter for this too.

Code:
```python=
sns.lineplot(x='Year', y='NA_Sales', data=ih)
sns.lineplot(x='Year', y='NA_Sales', data=baseball)
plt.legend(['Ice Hockey','Baseball'], loc=(-0.5,0.5))
plt.show()
```

> Output:

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/954/original/13.png?1695754238" width="500" height="300">

The pair of floats gives the (x, y) coordinates of the legend.

==> From this we can conclude that `loc` takes **two types of arguments**:

- The location in the **form of a string**
- The location in the **form of coordinates**

#### What if we want to add other stylings to legends?

For example:

- Specify the **number of columns**
    - Use the parameter `ncols` for this (spelled `ncol` in older Matplotlib versions, as in the code below)
    - The number of **rows is decided automatically**
- Decide whether the box around the legend is displayed
    - Use the bool param `frameon`

and so on.

Code:
```python=
sns.lineplot(x='Year', y='NA_Sales', data=ih)
sns.lineplot(x='Year', y='NA_Sales', data=baseball)
plt.legend(['Ice Hockey','Baseball'], loc='lower right', ncol = 2, frameon = False)
plt.show()
```

> Output:

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/955/original/14.png?1695754327" width="400" height="300">

Now say we want to highlight a point on our curve. For example:

#### How can we highlight the maximum "Ice Hockey" sales across all years?

Let's first find this point.

Code:
```python=
print(max(ih['NA_Sales']))
```

> Output:

```
0.9
```

---
title: Scatter Plot
description:
duration: 400
card_type: cue_card
---

#### **Scatter Plot**

- A scatter plot in data visualization is a graph that displays individual data points as dots on a two-dimensional plane.
- It helps show **how two variables are related** or how they vary together, with one variable plotted on the horizontal **(x-axis)** and the other on the vertical **(y-axis)**.
- This type of chart is useful for **identifying patterns, trends, or correlations** in data.

> Output:

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/323/original/diag2.png?1695495919" width="400" height="350">

<br />

Now suppose we want to find the relation between `Rank` and `Sales` of all games.

#### Are `Rank` and `Sales` positively or negatively correlated?

In this case, unlike the line plot, there may be multiple points on the y-axis for each point on the x-axis.

```python=
data.head()
```

> Output:

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/956/original/16.png?1695754815" width="600" height="150">

#### How can we plot the relation between `Rank` and `Global Sales`?

Can we use a lineplot? Let's try it out.

```python=
sns.lineplot(data=data, x='Rank', y='Global_Sales')
```

> Output:

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/957/original/17.png?1695754945" width="400" height="300">

The plot itself looks very messy and it's hard to find any patterns in it.

#### Is there any other way we can visualize this relation?

Use a scatter plot.

Code:
```python=
sns.scatterplot(data=data, x='Rank', y='Global_Sales')
```

> Output:

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/958/original/18.png?1695755106" width="400" height="300">

Compared to the lineplot, we are able to see the patterns and points more distinctly now!

Notice,

- The two variables are negatively correlated with each other
- As the rank number increases, the sales tend to go down, i.e. games with a numerically lower rank (closer to 1) have higher sales overall!
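
The negative association read off the scatter plot can also be checked numerically. The following is a minimal sketch, not part of the original lecture: it assumes a small synthetic frame named `toy` (hypothetical) as a stand-in for `data`, with made-up sales that fall as the rank number grows. With the real dataset the same two `corr` calls would be applied to `data['Rank']` and `data['Global_Sales']`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the lecture's `data` frame:
# sales shrink as the rank number grows, plus a little noise.
rng = np.random.default_rng(0)
rank = np.arange(1, 501)
toy = pd.DataFrame({
    "Rank": rank,
    "Global_Sales": 40 / rank + rng.uniform(0, 0.05, size=rank.size),
})

# Pearson correlation measures linear association.
pearson = toy["Rank"].corr(toy["Global_Sales"])

# Spearman correlation is the Pearson correlation of the ranked values,
# so it captures any monotonic (not just linear) association.
spearman = toy["Rank"].rank().corr(toy["Global_Sales"].rank())

print(f"Pearson: {pearson:.2f}, Spearman: {spearman:.2f}")
```

Both coefficients come out negative for this toy data. Because the relation is curved rather than a straight line, the two values can differ noticeably, which is one reason to look at the scatter plot before trusting a single number.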
Scatter plots help us visualize these relations and find any patterns in the data.

Key Takeaways:

- For **Continuous-Continuous Data** => **Scatter Plot**, **Line Plot**

Sometimes, people also like to display the linear trend between two variables using a regression plot; do check that out.

---
title: Quiz-1
description:
duration: 60
card_type: quiz_card
---

# Question

Apple wants to conduct an analysis to find the relationship between the price and the number of units sold for its products. Which of the following plots will we prefer?

# Choices

- [x] Scatter Plot
- [ ] Pie Chart
- [ ] Boxplot
- [ ] Line Plot

---
title: Quiz-1 explanation, Categorical categorical
description:
duration: 1200
card_type: cue_card
---

#### Quiz-1 explanation

Since we are comparing two numerical variables (price and units sold) to find the pattern in their relationship, we will use a scatterplot.

### **Categorical-Categorical** (20 Minutes)

Earlier we saw how to work with a continuous-continuous pair of data.

Now let's come to the second type of pair of data: **Categorical-Categorical**

#### What questions come to your mind when we say categorical-categorical pair?

Questions related to the distribution of a category within another category

- What is the **distribution of genres for top-3 publishers**?
- Which **platforms do these top publishers use?**

#### Which plot can we use to show the distribution of one category with respect to another?

-> We can **have multiple bars for each category**

- These multiple bars can be stacked together - **Stacked Countplot**

Or

- Can be placed next to each other - **Dodged Countplot**

#### **Dodged Count Plot**

- A **Dodged Count Plot** in data visualization is a chart that displays the **frequency of different categories** within two or more groups **side by side**, making it easy to compare the distribution of data across these groups.
- Each category is represented by a **separate set of bars or columns**, with each group's data visually separated for clarity.
- It's commonly used to show how categorical variables are distributed across different conditions or categories.

> Output:

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/325/original/diag3.png?1695496338" width="400" height="300">

#### How can we compare the top 3 platforms these publishers use?

We can use a dodged countplot in this case.

Code:
```python=
plt.figure(figsize=(10,8))
sns.countplot(x='Publisher',hue='Platform',data=top3_data)
plt.ylabel('Count of Games')
```

> Output:

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/959/original/20.png?1695755420" width="400" height="300">

#### What can we infer from the dodged countplot?

- EA releases far more PS2 games than any other publisher, and more than it releases on any other platform!
- Activision has almost the same count of games for all 3 platforms
- EA is leading in PS3 and PS2, but Namco leads when it comes to the DS platform

#### **Stacked Countplot**

- A stacked count plot in data visualization is a chart that displays the count of different categories or groups in a dataset, with each category represented as a separate bar or column.
- The bars are stacked on top of each other, showing the total count while also highlighting the distribution of counts within each category.
- This type of plot is useful for comparing the composition of data across multiple categories or subgroups.

> Output:

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/328/original/Dav5.png?1695499000" width="400" height="300">

#### How can we visualize the distribution of genres for top-3 publishers?
We can use a `stacked countplot`

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/021/545/original/download_%281%29.png?1671006217" width="400" height="300">

But stacked countplots can be misleading.

Some readers may find it hard to tell whether a segment is measured from the baseline or from the top of the segment below it.

#### How do we decide between a Stacked countplot and a Dodged countplot?

- Stacked countplots are a good way to represent totals
- Dodged countplots help us compare values between various categories, and within a category itself too

---
title: Continuous categorical
description:
duration: 600
card_type: cue_card
---

### **Continuous-Categorical** (10 Minutes)

Now let's look at our 3rd type of data pair.

#### What kind of questions may we have regarding a continuous-categorical pair?

- We might want to calculate some numbers category-wise
    - Like **What is the average sales for every genre?**
- Or we might be interested in checking the distribution of the data category-wise
    - **What is the distribution of sales for the top3 publishers?**

#### What kind of plot can we make for every category?

-> Either a KDE plot or a Box Plot per category

#### **Boxplot**

- A box plot, also known as a box-and-whisker plot, is a simple and effective way to visualize the distribution of a dataset.
- It displays the median, quartiles, and potential outliers of the data in a box-like graph.

#### Box plots show the five-number summary of data:

1. Minimum score
2. First (lower) quartile
3. Median
4. Third (upper) quartile
5. Maximum score

#### **Diagram**

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/043/original/Box_Plot.png?1695311672" width="600" height="300">

#### What is the distribution of sales for the top3 publishers?
Code:
```python=
sns.boxplot(x='Publisher', y='Global_Sales', data=top3_data)
plt.xticks(rotation=90,fontsize=12)
plt.yticks(fontsize=12)
plt.title('Sales for top3 publisher', fontsize=15)
plt.show()
```

> Output:

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/961/original/22.png?1695755725" width="400" height="350">

#### What can we infer from this plot?

- The overall sales of EA are higher, with a much larger spread than the other publishers
- Activision doesn't have many outliers, and if you notice, even though its spread is smaller than EA's, the median is almost the same

#### **Barplot**

What if we want to compare the sales between the genres?

We have to use:

- Genre (categorical)
- Mean of global sales per genre (numerical)

#### How to visualize which genres bring higher average global sales?

Code:
```python=
sns.barplot(data=top3_data, x="Genre", y="Global_Sales", estimator=np.mean)
```

> Output:

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/962/original/23.png?1695755818" width="400" height="300">

If you remember, we had earlier seen that EA had a larger market share of sales.

Along with this, the majority of the games EA made were sports games.

This is consistent with Sports having a high market share in the industry, as shown in the bar chart.

---
title: Quiz-2
description:
duration: 60
card_type: quiz_card
---

# Question

For the company "Toyota", we want to find which type of vehicle has made the maximum sales. Which plot will we prefer to use here?

# Choices

- [x] Bar Plot
- [ ] Pie Chart
- [ ] Boxplot
- [ ] Line Plot

---
title: Quiz-2 explanation, Subplots
description:
duration: 1200
card_type: cue_card
---

#### Quiz-2 explanation

We are comparing a numerical (sales) and a categorical (type of vehicle) variable.
Hence we will use a barplot here.

### **Subplots (15-20 Minutes)**

So far we have **shown only 1 plot** using `plt.show()`

Say, we want to plot the trend of NA and every other region separately in a single figure.

#### How can we plot multiple smaller plots at the same time?

We will use **subplots**, i.e., **divide the figure into smaller plots**

We will be using `plt.subplots()`

It takes mainly 2 arguments:

1. **No. of rows** we want to **divide our figure** into
2. **No. of columns** we want to **divide our figure** into

It returns 2 things:

- Figure
- Numpy array of subplots (axes)

Code:
```python=
fig = plt.figure(figsize=(15,10))
# x and y must be passed by keyword in recent seaborn versions
sns.scatterplot(x=top3_data['NA_Sales'], y=top3_data['EU_Sales'])

fig.suptitle('Main title')
plt.show()
```

> Output:

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/963/original/24.png?1695756046" width="600" height="500">

Code:
```python=
fig = plt.figure(figsize=(15,10))
plt.subplot(2, 3, 1)
sns.scatterplot(x='NA_Sales', y='EU_Sales', data=top3_data)

plt.subplot(2, 3, 3)
sns.scatterplot(x='NA_Sales', y='JP_Sales', data=top3_data, color='red')

fig.suptitle('Main title')
```

> Output:

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/964/original/25.png?1695756128" width="600" height="400">

Code:
```python=
fig, ax = plt.subplots(2, 2, figsize=(15,10))

ax[0,0].scatter(top3_data['NA_Sales'], top3_data['EU_Sales'])
ax[0,1].scatter(top3_data['NA_Sales'], top3_data['JP_Sales'])
ax[1,0].scatter(top3_data['NA_Sales'], top3_data['Other_Sales'])
ax[1,1].scatter(top3_data['NA_Sales'], top3_data['Global_Sales'])
```

> Output:

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/965/original/26.png?1695756295" width="600" height="400">

Notice, we are using 2 numbers for each plot.

Think of the subplots as a 2x2 grid, with the two numbers denoting the `row, column` index of each subplot.

#### What is this `ax` object exactly?
Code:
```python=
print(ax)
```

> Output:

```
[[<matplotlib.axes._subplots.AxesSubplot object at 0x7f5aad891850>
  <matplotlib.axes._subplots.AxesSubplot object at 0x7f5aad82b340>]
 [<matplotlib.axes._subplots.AxesSubplot object at 0x7f5aaddefa60>
  <matplotlib.axes._subplots.AxesSubplot object at 0x7f5aade221c0>]]
```

Notice,

- It's a 2x2 matrix of axes objects

We draw each plot on a single `axes` object. Hence, we use a 2D index to access each grid cell/axes object of the subplot.

Instead of accessing the individual axes using `ax[0, 0]`, `ax[1, 0]`, there is another method we can use too.

Code:
```python=
import matplotlib.pyplot as plt
import numpy as np

plt.figure(figsize=(20,12)).suptitle("NA Sales vs regions",fontsize=20)

# Using a 2x3 subplot
plt.subplot(2, 3, 1)
sns.scatterplot(x='NA_Sales', y='EU_Sales', data=top3_data)

plt.subplot(2, 3, 3)
sns.scatterplot(x='NA_Sales', y='JP_Sales', data=top3_data, color='red')

plt.subplot(2, 3, 4)
sns.scatterplot(x='NA_Sales', y='Other_Sales', data=top3_data, color='green')

plt.subplot(2, 3, 6)
sns.scatterplot(x='NA_Sales', y='Global_Sales', data=top3_data, color='orange')

plt.show()
```

> Output:

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/966/original/27.png?1695756574" width="600" height="400">

`suptitle` adds a title to the whole figure.

#### We need to observe a few things here

1. The 3rd parameter defines the position of the plot
2. The position/numbering starts from 1
3. It goes on row-wise, from the start of a row to its finish
4. Empty subplots don't show any axes

#### But how do we know which plot belongs to which category?
Basically, we need the context of each plot.

We can use `title`, `x/y label` and every other functionality for the subplots too.

Code:
```python=
plt.figure(figsize=(20,12)).suptitle("NA Sales vs regions",fontsize=20)

# Using a 2x3 subplot
plt.subplot(2, 3, 1)
sns.scatterplot(x='NA_Sales', y='EU_Sales', data=top3_data)
plt.title('NA vs EU Sales', fontsize=12)
plt.xlabel('NA', fontsize=12)
plt.ylabel('EU', fontsize=12)

plt.subplot(2, 3, 3)
sns.scatterplot(x='NA_Sales', y='JP_Sales', data=top3_data, color='red')
plt.title('NA vs JP Sales', fontsize=12)
plt.xlabel('NA', fontsize=12)
plt.ylabel('JP', fontsize=12)

plt.subplot(2, 3, 4)
sns.scatterplot(x='NA_Sales', y='Other_Sales', data=top3_data, color='green')
plt.title('NA vs Other Region Sales', fontsize=12)
plt.xlabel('NA', fontsize=12)
plt.ylabel('Other', fontsize=12)

plt.subplot(2, 3, 6)
sns.scatterplot(x='NA_Sales', y='Global_Sales', data=top3_data, color='orange')
plt.title('NA vs Global Sales', fontsize=12)
plt.xlabel('NA', fontsize=12)
plt.ylabel('Global', fontsize=12)

plt.show()
```

> Output:

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/967/original/28.png?1695756654" width="600" height="400">

#### What if we want to span a plot across the full height of the figure?
Think of this in **terms of a grid.**

Currently we are **dividing our figure into 2 rows and 3 columns**

But we want one plot to run down the middle column, covering grid cells 2 and 5.

Together these two cells form a **single column**.

So, this problem can be simplified to drawing the plot in the **second column of a 1 row, 3 column subplot**

Code:
```python=
plt.figure(figsize=(20,12)).suptitle("Video Games Sales Dashboard",fontsize=20)

# Using a 2x3 subplot
plt.subplot(2, 3, 1)
sns.scatterplot(x='NA_Sales', y='EU_Sales', data=top3_data)
plt.title('NA vs EU Sales', fontsize=12)
plt.xlabel('NA', fontsize=12)
plt.ylabel('EU', fontsize=12)

plt.subplot(2, 3, 3)
sns.scatterplot(x='NA_Sales', y='JP_Sales', data=top3_data, color='red')
plt.title('NA vs JP Sales', fontsize=12)
plt.xlabel('NA', fontsize=12)
plt.ylabel('JP', fontsize=12)

# Countplot of publishers, spanning the middle column (1x3 grid, slot 2)
plt.subplot(1,3,2)
sns.countplot(x='Publisher', data=top3_data)
plt.title('Count of games by each Publisher', fontsize=12)
plt.xlabel('Publisher', fontsize=12)
plt.ylabel('Count of games', fontsize=12)

plt.subplot(2, 3, 4)
sns.scatterplot(x='NA_Sales', y='Other_Sales', data=top3_data, color='green')
plt.title('NA vs Other Region Sales', fontsize=12)
plt.xlabel('NA', fontsize=12)
plt.ylabel('Other', fontsize=12)

plt.subplot(2, 3, 6)
sns.scatterplot(x='NA_Sales', y='Global_Sales', data=top3_data, color='orange')
plt.title('NA vs Global Sales', fontsize=12)
plt.xlabel('NA', fontsize=12)
plt.ylabel('Global', fontsize=12)

plt.show()
```

> Output:

<img src="https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/050/968/original/29.png?1695756726" width="600" height="400">
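
The snippets in this lecture use `top3_data` without defining it here. The sketch below is a self-contained illustration of the spanning trick under stated assumptions: `data` is a tiny synthetic table (hypothetical values, column names borrowed from the lecture), and `top3_data` is assumed to mean "rows belonging to the three most frequent publishers", which may differ from how it was built in the earlier lecture. Plain Matplotlib calls are used instead of seaborn only to keep the sketch light on dependencies.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical miniature of the video-game table.
rng = np.random.default_rng(1)
n = 60
data = pd.DataFrame({
    "Publisher": rng.choice(["EA", "Activision", "Namco", "Ubisoft", "Sega"], size=n),
    "NA_Sales": rng.uniform(0, 2, size=n),
    "EU_Sales": rng.uniform(0, 1, size=n),
})

# Assumed definition of top3_data: rows from the three most frequent publishers.
top3 = data["Publisher"].value_counts().index[:3]
top3_data = data.loc[data["Publisher"].isin(top3)]

fig = plt.figure(figsize=(12, 6))
fig.suptitle("Spanning a subplot across both rows")

# Slot 1 of a 2x3 grid: the top-left cell.
ax_left = plt.subplot(2, 3, 1)
ax_left.scatter(top3_data["NA_Sales"], top3_data["EU_Sales"])

# Slot 2 of a 1x3 grid: the whole middle column,
# i.e. cells 2 and 5 of the 2x3 grid taken together.
ax_mid = plt.subplot(1, 3, 2)
counts = top3_data["Publisher"].value_counts()
ax_mid.bar(counts.index, counts.values)

# Slot 6 of the 2x3 grid: the bottom-right cell.
ax_right = plt.subplot(2, 3, 6)
ax_right.scatter(top3_data["NA_Sales"], top3_data["EU_Sales"], color="orange")

# Render to an in-memory PNG; in a notebook, plt.show() would be used instead.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```

Mixing grid shapes works here because the 1x3 middle column does not overlap the cells taken from the 2x3 grid. For layouts where cells must span in less regular ways, Matplotlib's `GridSpec` is the more general tool.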
