# Kaggle course
###### tags: `learn` `AI` `course`
## Intermediate Machine Learning course
(sklearn)
https://www.kaggle.com/maskerli05155/exercise-categorical-variables/edit
### [encoding](https://hackmd.io/8yG4c23HS4KvednEYO0A0w)
https://www.facebook.com/PythonDataScience/posts/321499564961676/
### [pipelines](https://hackmd.io/0hOiVC-7RgaDQvz0b5KD6A)
### [cross-validation](https://hackmd.io/Pcg5BUjKTt2kavV7-2o84w)
### [XGBoost (extreme gradient boosting -> gradient boosting)](https://hackmd.io/T8ca-WISSPWKDzYLRUvOvw)
### [data leakage](https://hackmd.io/gtQ7ViKdScCVbItUBMGzyg)
## Pandas
### Pandas vs SQL
https://towardsdatascience.com/pandas-vs-sql-compared-with-examples-3f14db65c06f#_=_
Both Pandas and SQL operate on tabular data, so most operations or queries (filtering, grouping, aggregating, sorting, joining) can be done with either.
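A minimal sketch of that overlap: the same filter, group, aggregate, and sort done once in SQL (through the standard-library `sqlite3` module) and once in Pandas. The `orders` table and its values are made up for illustration and do not come from the linked article.

```python
import sqlite3

import pandas as pd

# Hypothetical toy table, invented for illustration only
orders = pd.DataFrame({
    "customer": ["ann", "bob", "ann", "cid", "bob"],
    "amount": [10, 25, 5, 30, 15],
})

# SQL version: load the table into an in-memory SQLite database and query it
conn = sqlite3.connect(":memory:")
orders.to_sql("orders", conn, index=False)
sql_result = pd.read_sql_query(
    """
    SELECT customer, SUM(amount) AS total
    FROM orders
    WHERE amount >= 10
    GROUP BY customer
    ORDER BY total DESC
    """,
    conn,
)
conn.close()

# Pandas version of the same query
pandas_result = (
    orders[orders["amount"] >= 10]                        # WHERE amount >= 10
    .groupby("customer", as_index=False)["amount"].sum()  # GROUP BY + SUM
    .rename(columns={"amount": "total"})                  # AS total
    .sort_values("total", ascending=False)                # ORDER BY total DESC
    .reset_index(drop=True)
)

print(sql_result)
print(pandas_result)
```

Both results should hold the same three rows, with customers ordered by total.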
### [Getting started](https://hackmd.io/jMLnZxmwRf-pP_JdGgEJew)
### [Indexing, Selecting & Assigning](https://hackmd.io/T4yaL3kGTimdB0pfgrDhZQ)
### [Summary Functions and Maps](https://hackmd.io/bgOoixlVTh2J_aDFT6MKIw)
### [Grouping and Sorting](https://hackmd.io/GHW6WJGlSNyVVc7U_2wIqw)
### [Data Types and Missing Values](https://hackmd.io/_tmLlZISS1-WW0mdl5U_yw)
### [Renaming and Combining](https://hackmd.io/BupgA5a2Tp-pvf5ZHK6-pQ)
## Deep Learning
### [What is Deep Learning?](https://hackmd.io/K1M91LxMSdKUp0uu1Usrbg)
### [Deep Neural Networks](https://hackmd.io/oxG9zn6rScuMHkL_vSdP-A)
### [Stochastic (random) Gradient Descent](https://hackmd.io/a3hc46Q4ShCsuyBlAw-wMA)
### [Overfitting and Underfitting](https://hackmd.io/pqv_vUuuR3SS_RtF05GvHg)
### [Dropout and Batch Normalization](https://hackmd.io/i-xNB8PuSkWk8bhlTRpHyg)
### [Binary Classification](https://hackmd.io/cHgMLK6vSTyy-6wPbJ6wrQ)
## Feature Engineering!
### [The Goal of Feature Engineering](https://hackmd.io/p91R2GirSZyoPHS-AOdK0Q)
### [Creating Features](https://hackmd.io/LSLHay2GR7qNYRfaQ8E-_w)
### [Clustering With K-Means](https://hackmd.io/F-T4-eByTvednGgJL-b0CA)
### [Principal Component Analysis](https://hackmd.io/F2T_jgKbQa60-bA1khc49Q)
### [target encoding](https://hackmd.io/6vLPvPP0RBaKB1K3r0-J7A)
## Visualization
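Setup:
```
import pandas as pd
# Register pandas date converters so matplotlib can plot datetime axes
pd.plotting.register_matplotlib_converters()
import matplotlib.pyplot as plt
# Notebook-only magic: render figures inline (remove when running as a plain script)
%matplotlib inline
import seaborn as sns
print("Setup Complete")
```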
Plotting:
```
# Set the width and height of the figure
plt.figure(figsize=(12,6))
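# Line chart showing the number of visitors to each museum over time
# (museum_data is assumed to be an already-loaded DataFrame indexed by date, one column per museum)
sns.lineplot(data=museum_data)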
# Add title
plt.title("Monthly Visitors to Los Angeles City Museums")
```
### Assess seasonality