---
title: Virgil - Descriptive Statistics - S41 Correlation Heatmap
tags: Virgil, LearnWorld, DescriptiveStatistics
---

<a target="_blank" href="https://colab.research.google.com/drive/1Yn86m2WWdwos15Na42LhKFW6WMjibEh8"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
```

# CORRELATION: Scatter Plot & Heatmap

```python
df = pd.read_csv('https://raw.githubusercontent.com/dhminh1024/practice_datasets/master/titanic.csv')
df.head()
```

<div>
<style scoped>
    .dataframe tbody tr th:only-of-type { vertical-align: middle; }
    .dataframe tbody tr th { vertical-align: top; }
    .dataframe thead th { text-align: right; }
</style>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;"> <th></th> <th>PassengerId</th> <th>Survived</th> <th>Pclass</th> <th>Name</th> <th>Sex</th> <th>Age</th> <th>SibSp</th> <th>Parch</th> <th>Ticket</th> <th>Fare</th> <th>Cabin</th> <th>Embarked</th> </tr>
  </thead>
  <tbody>
    <tr> <th>0</th> <td>1</td> <td>0</td> <td>3</td> <td>Braund, Mr. Owen Harris</td> <td>male</td> <td>22.0</td> <td>1</td> <td>0</td> <td>A/5 21171</td> <td>7.2500</td> <td>NaN</td> <td>S</td> </tr>
    <tr> <th>1</th> <td>2</td> <td>1</td> <td>1</td> <td>Cumings, Mrs. John Bradley (Florence Briggs Th...</td> <td>female</td> <td>38.0</td> <td>1</td> <td>0</td> <td>PC 17599</td> <td>71.2833</td> <td>C85</td> <td>C</td> </tr>
    <tr> <th>2</th> <td>3</td> <td>1</td> <td>3</td> <td>Heikkinen, Miss. Laina</td> <td>female</td> <td>26.0</td> <td>0</td> <td>0</td> <td>STON/O2. 3101282</td> <td>7.9250</td> <td>NaN</td> <td>S</td> </tr>
    <tr> <th>3</th> <td>4</td> <td>1</td> <td>1</td> <td>Futrelle, Mrs. Jacques Heath (Lily May Peel)</td> <td>female</td> <td>35.0</td> <td>1</td> <td>0</td> <td>113803</td> <td>53.1000</td> <td>C123</td> <td>S</td> </tr>
    <tr> <th>4</th> <td>5</td> <td>0</td> <td>3</td> <td>Allen, Mr. William Henry</td> <td>male</td> <td>35.0</td> <td>0</td> <td>0</td> <td>373450</td> <td>8.0500</td> <td>NaN</td> <td>S</td> </tr>
  </tbody>
</table>
</div>

```python
# Keep only the numeric columns, then compute the pairwise correlation matrix
num = df[['Survived', 'Pclass', 'Age', 'SibSp', 'Parch', 'Fare']]
num.corr()
```

<div>
<style scoped>
    .dataframe tbody tr th:only-of-type { vertical-align: middle; }
    .dataframe tbody tr th { vertical-align: top; }
    .dataframe thead th { text-align: right; }
</style>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;"> <th></th> <th>Survived</th> <th>Pclass</th> <th>Age</th> <th>SibSp</th> <th>Parch</th> <th>Fare</th> </tr>
  </thead>
  <tbody>
    <tr> <th>Survived</th> <td>1.000000</td> <td>-0.338481</td> <td>-0.077221</td> <td>-0.035322</td> <td>0.081629</td> <td>0.257307</td> </tr>
    <tr> <th>Pclass</th> <td>-0.338481</td> <td>1.000000</td> <td>-0.369226</td> <td>0.083081</td> <td>0.018443</td> <td>-0.549500</td> </tr>
    <tr> <th>Age</th> <td>-0.077221</td> <td>-0.369226</td> <td>1.000000</td> <td>-0.308247</td> <td>-0.189119</td> <td>0.096067</td> </tr>
    <tr> <th>SibSp</th> <td>-0.035322</td> <td>0.083081</td> <td>-0.308247</td> <td>1.000000</td> <td>0.414838</td> <td>0.159651</td> </tr>
    <tr> <th>Parch</th> <td>0.081629</td> <td>0.018443</td> <td>-0.189119</td> <td>0.414838</td> <td>1.000000</td> <td>0.216225</td> </tr>
    <tr> <th>Fare</th> <td>0.257307</td> <td>-0.549500</td> <td>0.096067</td> <td>0.159651</td> <td>0.216225</td> <td>1.000000</td> </tr>
  </tbody>
</table>
</div>

```python
# Heatmap using Pandas style module
num.corr().style.background_gradient(cmap='Reds')
```

```python
# Pairwise scatter plots; corner=True draws only the lower triangle
sns.pairplot(data=num, corner=True);
```

[MORE AMAZING EXAMPLES USING .STYLE](https://pandas.pydata.org/pandas-docs/stable/user_guide/style.html)

```python
# Heatmap using Seaborn
sns.heatmap(data=num.corr(), annot=True, fmt='.2f');
```
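
The coefficients in the `num.corr()` table are Pearson correlations, which is the default method of `DataFrame.corr()`; pandas also excludes missing values when computing each pair. As a rough illustration of what one cell of that matrix measures, the sketch below computes Pearson's r by hand. The helper name `pearson_r` is hypothetical (it is not part of pandas or seaborn), and the inputs are only the `Fare` and `Pclass` values from the five rows shown by `df.head()`, so the result will not match the full-dataset value of -0.549500.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Hypothetical helper for illustration only; in practice use DataFrame.corr().
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2 values)")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Sum of products of deviations (covariance up to a constant factor)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (variances up to the same constant factor)
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)

    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")

    return cov / math.sqrt(ss_x * ss_y)


# Fare and Pclass from the five rows displayed by df.head() (not the full dataset)
fare = [7.25, 71.2833, 7.925, 53.1, 8.05]
pclass = [3, 1, 3, 1, 3]

r = pearson_r(fare, pclass)
print(f"Pearson r between Fare and Pclass (5 rows only): {r:.3f}")
```

On these five rows the coefficient comes out strongly negative, the same direction as the `Fare`/`Pclass` cell in the heatmap: a higher class number (a lower class) goes with a cheaper fare. With so few rows the magnitude is not meaningful by itself.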