# ml2019fall-hw1 Linear Regression

## Data Preprocessing

1. `str.extract('(Regular Expression)')`
2. Concatenate (see the preprocessing sketch at the end of this README)

## Feature Selection

1. Covariance matrix
2. L1 norm regularization => implies feature selection => sparse weights (see the Lasso sketch below)
    * Seaborn heat map => a weight $\approx$ 0.0 marks a useless feature, and keeping it hurts the accuracy.
    * ![](https://i.imgur.com/4MHPw2h.png)
3. L2 norm regularization => restricts the model to a feasible function set
    * [Feature Selection](https://taweihuang.hpd.io/2016/09/12/讀者提問:多元迴歸分析的變數選擇/)

## Improve

### To address the overfitting problem

1. Regularization
    * Elastic Net
2. Data augmentation
3. Early stopping (see the sketch below)
4. Loss function
5. Adam optimizer arguments

## Debug

1. Seaborn heat map of the covariance matrix and the learned weights
2. Train loss vs. validation loss
3. Look up background knowledge (e.g. statistics, the R language, PM 2.5, GitHub, ...)
4. Trial and error

## Score

1. Public
    - Rank: 45
    - Score: 5.58173
2. Private
    - Rank: 2
    - Score: 5.24210
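
## Code Sketches

First, the preprocessing step: a minimal sketch of pulling the numeric part out of raw string cells with `str.extract` and concatenating the cleaned pieces with `pd.concat`. The column names, raw values, and the zero fill are made-up stand-ins for illustration, not the repo's actual data or settings.

```python
import pandas as pd

# Made-up raw tables standing in for the real CSV contents (assumption).
raw_jan = pd.DataFrame({"PM2.5": ["36", "41#", "NR"], "RAINFALL": ["NR", "0.5", "1.2*"]})
raw_feb = pd.DataFrame({"PM2.5": ["28", "x", "55"], "RAINFALL": ["NR", "NR", "3.0"]})

def clean(df):
    out = pd.DataFrame(index=df.index)
    for col in df.columns:
        # str.extract with a regular expression keeps only the numeric part of
        # each cell ("41#" -> 41.0) and yields NaN where nothing matches ("NR").
        out[col] = df[col].astype(str).str.extract(r"(-?\d+\.?\d*)", expand=False).astype(float)
    return out

# Concatenate the cleaned monthly tables into one training frame.
data = pd.concat([clean(raw_jan), clean(raw_feb)], ignore_index=True).fillna(0.0)
print(data)
```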
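
Next, the L1 feature-selection idea: fit an L1-regularized model and inspect the covariance matrix and the weight vector with Seaborn heat maps, treating near-zero weights as candidates to drop. The synthetic data, the `alpha` value, and the plot layout are assumptions; the sketch only shows where the sparsity comes from.

```python
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.linear_model import Lasso

# Placeholder data (assumption): 18 features, only one of which is informative.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 18))
y = X[:, 9] * 3.0 + rng.normal(size=500)

# L1 regularization drives the weights of useless features toward 0,
# which is why the learned weight vector is sparse.
model = Lasso(alpha=0.1).fit(X, y)
print("non-zero weights:", np.flatnonzero(model.coef_))

# Heat maps used for inspection: the feature covariance matrix and the weights.
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
sns.heatmap(np.cov(X, rowvar=False), ax=axes[0], cmap="coolwarm")
sns.heatmap(model.coef_.reshape(1, -1), ax=axes[1], cmap="coolwarm")
axes[0].set_title("covariance matrix")
axes[1].set_title("Lasso weights (near zero = candidate to drop)")
plt.tight_layout()
plt.show()
```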
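
Finally, early stopping driven by the train-loss vs. validation-loss comparison mentioned under Improve and Debug. This is a plain gradient-descent sketch on synthetic data; the split ratio, learning rate, and patience are arbitrary choices, and the repo's own optimizer (e.g. Adam) would slot in at the weight-update line.

```python
import numpy as np

# Synthetic regression problem (assumption), 18 features like a small sensor table.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 18))
y = X @ rng.normal(size=18) + rng.normal(scale=0.5, size=600)

# Hold out a validation set so train loss and validation loss can be compared.
X_tr, X_val, y_tr, y_val = X[:480], X[480:], y[:480], y[480:]

w = np.zeros(X.shape[1])
lr, patience = 1e-2, 20
best_val, best_w, bad_epochs = np.inf, w.copy(), 0

for epoch in range(5000):
    grad = 2 * X_tr.T @ (X_tr @ w - y_tr) / len(y_tr)   # gradient of the MSE loss
    w -= lr * grad                                       # an Adam update could replace this line

    train_loss = np.mean((X_tr @ w - y_tr) ** 2)
    val_loss = np.mean((X_val @ w - y_val) ** 2)
    if val_loss < best_val - 1e-6:
        best_val, best_w, bad_epochs = val_loss, w.copy(), 0
    else:
        bad_epochs += 1                                  # validation loss stopped improving
    if bad_epochs >= patience:                           # stop before the model overfits further
        print(f"stopped at epoch {epoch}: train {train_loss:.4f}, val {best_val:.4f}")
        break

w = best_w
```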