# Kaggle course
###### tags: `learn` `AI` `course`

## Intermediate Machine Learning course (sklearn)
https://www.kaggle.com/maskerli05155/exercise-categorical-variables/edit
### [encoding](https://hackmd.io/8yG4c23HS4KvednEYO0A0w)
https://www.facebook.com/PythonDataScience/posts/321499564961676/
### [pipelines](https://hackmd.io/0hOiVC-7RgaDQvz0b5KD6A)
### [cross-validation](https://hackmd.io/Pcg5BUjKTt2kavV7-2o84w)
### [XGBoost (extreme gradient boosting -> gradient boosting)](https://hackmd.io/T8ca-WISSPWKDzYLRUvOvw)
### [data leakage](https://hackmd.io/gtQ7ViKdScCVbItUBMGzyg)

## Pandas
### Pandas vs SQL
https://towardsdatascience.com/pandas-vs-sql-compared-with-examples-3f14db65c06f#_=_

Both pandas and SQL operate on tabular data, so similar operations or queries can be done with either.
### [Getting started](https://hackmd.io/jMLnZxmwRf-pP_JdGgEJew)
### [Indexing, Selecting & Assigning](https://hackmd.io/T4yaL3kGTimdB0pfgrDhZQ)
### [Summary Functions and Maps](https://hackmd.io/bgOoixlVTh2J_aDFT6MKIw)
### [Grouping and Sorting](https://hackmd.io/GHW6WJGlSNyVVc7U_2wIqw)
### [Data Types and Missing Values](https://hackmd.io/_tmLlZISS1-WW0mdl5U_yw)
### [Renaming and Combining](https://hackmd.io/BupgA5a2Tp-pvf5ZHK6-pQ)

## Deep Learning
### [What is Deep Learning?](https://hackmd.io/K1M91LxMSdKUp0uu1Usrbg)
### [Deep Neural Networks](https://hackmd.io/oxG9zn6rScuMHkL_vSdP-A)
### [Stochastic (random) Gradient Descent](https://hackmd.io/a3hc46Q4ShCsuyBlAw-wMA)
### [Overfitting and Underfitting](https://hackmd.io/pqv_vUuuR3SS_RtF05GvHg)
### [Dropout and Batch Normalization](https://hackmd.io/i-xNB8PuSkWk8bhlTRpHyg)
### [Binary Classification](https://hackmd.io/cHgMLK6vSTyy-6wPbJ6wrQ)

## Feature Engineering!
### [The Goal of Feature Engineering](https://hackmd.io/p91R2GirSZyoPHS-AOdK0Q)
### [Creating Features](https://hackmd.io/LSLHay2GR7qNYRfaQ8E-_w)
### [Clustering With K-Means](https://hackmd.io/F-T4-eByTvednGgJL-b0CA)
### [Principal Component Analysis](https://hackmd.io/F2T_jgKbQa60-bA1khc49Q)
### [target encoding](https://hackmd.io/6vLPvPP0RBaKB1K3r0-J7A)

## Visualization
Setup:
```
import pandas as pd
pd.plotting.register_matplotlib_converters()
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
print("Setup Complete")
```
Plotting:
```
# Assumes museum_data is a DataFrame that has already been loaded
# Set the width and height of the figure
plt.figure(figsize=(12,6))

# Line chart showing the number of visitors to each museum over time
sns.lineplot(data=museum_data)

# Add title
plt.title("Monthly Visitors to Los Angeles City Museums")
```
### Assess seasonality
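One way to check for seasonality numerically, alongside reading it off the line chart, is to average the values by calendar month and see whether some months are consistently higher. The sketch below is only an illustration: the column name `Example Museum` and all visitor counts are made up, and it assumes the data is indexed by month-start dates.

```python
import pandas as pd

# Hypothetical monthly visitor counts (made-up numbers, not the course data).
dates = pd.date_range("2014-01-01", periods=24, freq="MS")
visitors = [100, 110, 150, 160, 200, 260, 300, 290, 210, 170, 120, 105] * 2
museum_data = pd.DataFrame({"Example Museum": visitors}, index=dates)

# Average visitors for each calendar month (1 = January, ..., 12 = December).
monthly_mean = museum_data["Example Museum"].groupby(museum_data.index.month).mean()

# Months with the highest and lowest average attendance.
busiest_month = monthly_mean.idxmax()
quietest_month = monthly_mean.idxmin()

print(monthly_mean)
print("Busiest month:", busiest_month, "Quietest month:", quietest_month)
```

If the monthly averages differ a lot and the same months peak every year, the series is likely seasonal. The visual counterpart would be a line chart of the single column, for example `sns.lineplot(data=museum_data["Example Museum"])`.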