---
title: Virgil - Intro To Pandas Seaborn - S51 Filter and Sort
tags: Virgil, LearnWorld, IntroPandasSeaborn
---
<a target="_blank" href="https://colab.research.google.com/drive/1TgLNnnTRUAKtI7TZdtNnvkmQ432krELK"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
## Filter
```python
# Filter data using one condition
# Select all countries with a birth rate greater than 20
df[df['Birth rate'] > 20]
```
<div>
<style scoped>
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
</style>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th></th>
<th>Country Code</th>
<th>Birth rate</th>
<th>Internet users</th>
<th>Income Group</th>
</tr>
<tr>
<th>Country Name</th>
<th></th>
<th></th>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<th>Afghanistan</th>
<td>AFG</td>
<td>35.253</td>
<td>5.9</td>
<td>Low income</td>
</tr>
<tr>
<th>Angola</th>
<td>AGO</td>
<td>45.985</td>
<td>19.1</td>
<td>Upper middle income</td>
</tr>
<tr>
<th>Burundi</th>
<td>BDI</td>
<td>44.151</td>
<td>1.3</td>
<td>Low income</td>
</tr>
<tr>
<th>Benin</th>
<td>BEN</td>
<td>36.440</td>
<td>4.9</td>
<td>Low income</td>
</tr>
<tr>
<th>Burkina Faso</th>
<td>BFA</td>
<td>40.551</td>
<td>9.1</td>
<td>Low income</td>
</tr>
<tr>
<th>...</th>
<td>...</td>
<td>...</td>
<td>...</td>
<td>...</td>
</tr>
<tr>
<th>Yemen, Rep.</th>
<td>YEM</td>
<td>32.947</td>
<td>20.0</td>
<td>Lower middle income</td>
</tr>
<tr>
<th>South Africa</th>
<td>ZAF</td>
<td>20.850</td>
<td>46.5</td>
<td>Upper middle income</td>
</tr>
<tr>
<th>Congo, Dem. Rep.</th>
<td>COD</td>
<td>42.394</td>
<td>2.2</td>
<td>Low income</td>
</tr>
<tr>
<th>Zambia</th>
<td>ZMB</td>
<td>40.471</td>
<td>15.4</td>
<td>Lower middle income</td>
</tr>
<tr>
<th>Zimbabwe</th>
<td>ZWE</td>
<td>35.715</td>
<td>18.5</td>
<td>Low income</td>
</tr>
</tbody>
</table>
<p>95 rows × 4 columns</p>
</div>
```python
# Exercise: select all rows where Internet users is less than 40
df[df['Internet users'] < 40]
```
<div>
<style scoped>
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
</style>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th></th>
<th>Country Code</th>
<th>Birth rate</th>
<th>Internet users</th>
<th>Income Group</th>
</tr>
<tr>
<th>Country Name</th>
<th></th>
<th></th>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<th>Afghanistan</th>
<td>AFG</td>
<td>35.253</td>
<td>5.9</td>
<td>Low income</td>
</tr>
<tr>
<th>Angola</th>
<td>AGO</td>
<td>45.985</td>
<td>19.1</td>
<td>Upper middle income</td>
</tr>
<tr>
<th>Burundi</th>
<td>BDI</td>
<td>44.151</td>
<td>1.3</td>
<td>Low income</td>
</tr>
<tr>
<th>Benin</th>
<td>BEN</td>
<td>36.440</td>
<td>4.9</td>
<td>Low income</td>
</tr>
<tr>
<th>Burkina Faso</th>
<td>BFA</td>
<td>40.551</td>
<td>9.1</td>
<td>Low income</td>
</tr>
<tr>
<th>...</th>
<td>...</td>
<td>...</td>
<td>...</td>
<td>...</td>
</tr>
<tr>
<th>Samoa</th>
<td>WSM</td>
<td>26.172</td>
<td>15.3</td>
<td>Lower middle income</td>
</tr>
<tr>
<th>Yemen, Rep.</th>
<td>YEM</td>
<td>32.947</td>
<td>20.0</td>
<td>Lower middle income</td>
</tr>
<tr>
<th>Congo, Dem. Rep.</th>
<td>COD</td>
<td>42.394</td>
<td>2.2</td>
<td>Low income</td>
</tr>
<tr>
<th>Zambia</th>
<td>ZMB</td>
<td>40.471</td>
<td>15.4</td>
<td>Lower middle income</td>
</tr>
<tr>
<th>Zimbabwe</th>
<td>ZWE</td>
<td>35.715</td>
<td>18.5</td>
<td>Low income</td>
</tr>
</tbody>
</table>
<p>95 rows × 4 columns</p>
</div>
***Comparison operators in Python:***
```
equal: ==
not equal: !=
greater than: >
less than: <
greater than or equal: >=
less than or equal: <=
```
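To see what these operators do on a column, here is a minimal sketch. It assumes a small hypothetical DataFrame named `sample` (not the notebook's `df`), built from a few rows that appear in this notebook's outputs. A comparison on a column returns a boolean Series, and indexing with that Series keeps only the rows marked `True`.

```python
import pandas as pd

# Hypothetical sample data; values copied from rows shown in this notebook's outputs.
sample = pd.DataFrame(
    {
        "Birth rate": [35.253, 10.244, 20.850],
        "Internet users": [5.9, 78.9, 46.5],
    },
    index=pd.Index(["Afghanistan", "Aruba", "South Africa"], name="Country Name"),
)

# A comparison on a column returns a boolean Series (a "mask"), one value per row.
mask = sample["Birth rate"] > 20
print(mask.tolist())

# Indexing with the mask keeps only the rows where the mask is True.
high_birth = sample[mask]
print(high_birth.index.tolist())
```

Writing `df[df['Birth rate'] > 20]` is the same two steps combined into one expression.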
```python
# Example: average Internet users of all the countries in the High income group.
df[df['Income Group'] == 'High income']['Internet users'].mean()
```
74.23168417462685
```python
# Select with multiple conditions
# Note 1: use & and | instead of and / or
# Note 2: each condition must be wrapped in parentheses ()
df[(df['Internet users'] > 20) & (df['Birth rate'] < 50)]
```
<div>
<style scoped>
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
</style>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th></th>
<th>Country Code</th>
<th>Birth rate</th>
<th>Internet users</th>
<th>Income Group</th>
</tr>
<tr>
<th>Country Name</th>
<th></th>
<th></th>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<th>Aruba</th>
<td>ABW</td>
<td>10.244</td>
<td>78.9</td>
<td>High income</td>
</tr>
<tr>
<th>Albania</th>
<td>ALB</td>
<td>12.877</td>
<td>57.2</td>
<td>Upper middle income</td>
</tr>
<tr>
<th>United Arab Emirates</th>
<td>ARE</td>
<td>11.044</td>
<td>88.0</td>
<td>High income</td>
</tr>
<tr>
<th>Argentina</th>
<td>ARG</td>
<td>17.716</td>
<td>59.9</td>
<td>High income</td>
</tr>
<tr>
<th>Armenia</th>
<td>ARM</td>
<td>13.308</td>
<td>41.9</td>
<td>Lower middle income</td>
</tr>
<tr>
<th>...</th>
<td>...</td>
<td>...</td>
<td>...</td>
<td>...</td>
</tr>
<tr>
<th>Venezuela, RB</th>
<td>VEN</td>
<td>19.842</td>
<td>54.9</td>
<td>High income</td>
</tr>
<tr>
<th>Virgin Islands (U.S.)</th>
<td>VIR</td>
<td>10.700</td>
<td>45.3</td>
<td>High income</td>
</tr>
<tr>
<th>Vietnam</th>
<td>VNM</td>
<td>15.537</td>
<td>43.9</td>
<td>Lower middle income</td>
</tr>
<tr>
<th>West Bank and Gaza</th>
<td>PSE</td>
<td>30.394</td>
<td>46.6</td>
<td>Lower middle income</td>
</tr>
<tr>
<th>South Africa</th>
<td>ZAF</td>
<td>20.850</td>
<td>46.5</td>
<td>Upper middle income</td>
</tr>
</tbody>
</table>
<p>129 rows × 4 columns</p>
</div>
```python
# Filter data using multiple conditions
# Remember to wrap each condition in parentheses
# AND: both conditions must hold
df[(df['Birth rate'] > 20) & (df['Internet users'] < 50)]
# OR: at least one condition must hold
df[(df['Birth rate'] > 20) | (df['Internet users'] < 50)]
```
```
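The parentheses matter because of operator precedence: in Python, `&` and `|` bind more tightly than `>` and `<`. The sketch below uses a hypothetical `sample` DataFrame (values taken from rows shown in this notebook's outputs) to compare the two forms. The exact exception type raised by the unparenthesized form may depend on the column dtype and pandas version, so the code catches both `TypeError` and `ValueError`.

```python
import pandas as pd

# Hypothetical sample data, not the notebook's df.
sample = pd.DataFrame(
    {
        "Birth rate": [35.253, 10.244, 15.537, 20.850],
        "Internet users": [5.9, 78.9, 43.9, 46.5],
    },
    index=["Afghanistan", "Aruba", "Vietnam", "South Africa"],
)

# With parentheses, each comparison is evaluated first, then combined element-wise.
both = sample[(sample["Birth rate"] > 20) & (sample["Internet users"] < 50)]
either = sample[(sample["Birth rate"] > 20) | (sample["Internet users"] < 50)]
print(both.index.tolist())
print(either.index.tolist())

# Without parentheses, Python evaluates 20 & sample["Internet users"] first,
# which is not the intended comparison, and the expression raises an error.
try:
    sample[sample["Birth rate"] > 20 & sample["Internet users"] < 50]
    failed = False
except (TypeError, ValueError):
    failed = True
print(failed)
```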
## Sort
```python
# Sort values in ascending order (the default)
df.sort_values('Birth rate')
# Sort in descending order
df.sort_values('Birth rate', ascending=False)
```
<div>
<style scoped>
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
</style>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th></th>
<th>Country Code</th>
<th>Birth rate</th>
<th>Internet users</th>
<th>Income Group</th>
</tr>
<tr>
<th>Country Name</th>
<th></th>
<th></th>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<th>Niger</th>
<td>NER</td>
<td>49.661</td>
<td>1.7000</td>
<td>Low income</td>
</tr>
<tr>
<th>Angola</th>
<td>AGO</td>
<td>45.985</td>
<td>19.1000</td>
<td>Upper middle income</td>
</tr>
<tr>
<th>Chad</th>
<td>TCD</td>
<td>45.745</td>
<td>2.3000</td>
<td>Low income</td>
</tr>
<tr>
<th>Burundi</th>
<td>BDI</td>
<td>44.151</td>
<td>1.3000</td>
<td>Low income</td>
</tr>
<tr>
<th>Mali</th>
<td>MLI</td>
<td>44.138</td>
<td>3.5000</td>
<td>Low income</td>
</tr>
<tr>
<th>...</th>
<td>...</td>
<td>...</td>
<td>...</td>
<td>...</td>
</tr>
<tr>
<th>Germany</th>
<td>DEU</td>
<td>8.500</td>
<td>84.1700</td>
<td>High income</td>
</tr>
<tr>
<th>Italy</th>
<td>ITA</td>
<td>8.500</td>
<td>58.4593</td>
<td>High income</td>
</tr>
<tr>
<th>Japan</th>
<td>JPN</td>
<td>8.200</td>
<td>89.7100</td>
<td>High income</td>
</tr>
<tr>
<th>Portugal</th>
<td>PRT</td>
<td>7.900</td>
<td>62.0956</td>
<td>High income</td>
</tr>
<tr>
<th>Hong Kong SAR, China</th>
<td>HKG</td>
<td>7.900</td>
<td>74.2000</td>
<td>High income</td>
</tr>
</tbody>
</table>
<p>195 rows × 4 columns</p>
</div>
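One detail worth keeping in mind: by default `sort_values` returns a new, sorted DataFrame and leaves the original untouched, so the result has to be assigned to a name if it is needed later. A minimal sketch, assuming a hypothetical `sample` DataFrame with three rows from the output above:

```python
import pandas as pd

# Hypothetical sample data; values copied from rows shown in this notebook's outputs.
sample = pd.DataFrame(
    {"Birth rate": [8.2, 49.661, 20.850]},
    index=["Japan", "Niger", "South Africa"],
)

# sort_values returns a new DataFrame; assign it to keep the sorted result.
sorted_desc = sample.sort_values("Birth rate", ascending=False)

print(sorted_desc.index.tolist())  # sorted copy, highest birth rate first
print(sample.index.tolist())       # original row order is unchanged
```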
```python
# Find the top 5 countries by birth rate
df.sort_values('Birth rate', ascending=False).head(5)
```
<div>
<style scoped>
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
</style>
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th></th>
<th>Country Code</th>
<th>Birth rate</th>
<th>Internet users</th>
<th>Income Group</th>
</tr>
<tr>
<th>Country Name</th>
<th></th>
<th></th>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<th>Niger</th>
<td>NER</td>
<td>49.661</td>
<td>1.7</td>
<td>Low income</td>
</tr>
<tr>
<th>Angola</th>
<td>AGO</td>
<td>45.985</td>
<td>19.1</td>
<td>Upper middle income</td>
</tr>
<tr>
<th>Chad</th>
<td>TCD</td>
<td>45.745</td>
<td>2.3</td>
<td>Low income</td>
</tr>
<tr>
<th>Burundi</th>
<td>BDI</td>
<td>44.151</td>
<td>1.3</td>
<td>Low income</td>
</tr>
<tr>
<th>Mali</th>
<td>MLI</td>
<td>44.138</td>
<td>3.5</td>
<td>Low income</td>
</tr>
</tbody>
</table>
</div>
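As an alternative to sorting and then taking the head, pandas also provides `DataFrame.nlargest`. The sketch below, on a hypothetical `sample` DataFrame (values from rows shown in this notebook's outputs), checks that both approaches select the same rows when there are no ties in the column.

```python
import pandas as pd

# Hypothetical sample data, not the notebook's df.
sample = pd.DataFrame(
    {"Birth rate": [49.661, 8.2, 45.985, 20.850, 45.745]},
    index=["Niger", "Japan", "Angola", "South Africa", "Chad"],
)

# Approach used in this notebook: sort descending, then keep the first rows.
top3_sorted = sample.sort_values("Birth rate", ascending=False).head(3)

# Alternative: ask directly for the rows with the largest values.
top3_nlargest = sample.nlargest(3, "Birth rate")

print(top3_sorted.index.tolist())
print(top3_nlargest.index.tolist())
```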