# IIIEDU-BDSE11-ML-DL-with-Python

Link to this page: https://bit.ly/2o0RK8L

## Downloads

* [Lecture notes](https://www.dropbox.com/s/oikm7bgxuyhvcr4/BDSE11ML-and-DL.pdf?dl=0)
* [Notebooks & Datasets](https://www.dropbox.com/s/rlnepy45epue397/bdse11_20190928.tar.gz?dl=0)

## Spark example

https://www.dropbox.com/s/a62cnm7ove8rrdr/sparkMachineLearning.html?dl=0

## Environment

On Windows (no GPU), install Anaconda first, then add the following packages:

* Basic packages
```bash
pip install pandas matplotlib seaborn numpy scipy
```
* Machine learning packages
```bash
pip install tensorflow scikit-learn xgboost==0.90
```
* Other packages
```bash
pip install pillow graphviz
pip install mlxtend
pip install numexpr
conda install -y shapely
pip install imgaug opencv-python==3.4.7.28
pip install jupyterlab
```

## Class examples

* 00-quick-review [[html]](http://211.75.15.16:60000/bdse11-00-quick-review.html) [[ipynb]](http://211.75.15.16:60000/bdse11-00-quick-review.ipynb)
* 01-TFGradientTapeAndLinearRegression [[html]](http://211.75.15.16:60000/bdse11-01-TFGradientTapeAndLinearRegression.html) [[ipynb]](http://211.75.15.16:60000/bdse11-01-TFGradientTapeAndLinearRegression.ipynb)
* 02-SklearnLinearRegression [[html]](http://211.75.15.16:60000/bdse11-02-SklearnLinearRegression.html) [[ipynb]](http://211.75.15.16:60000/bdse11-02-SklearnLinearRegression.ipynb)
* 03-SklearnQuickTour [[html]](http://211.75.15.16:60000/bdse11-03-SklearnQuickTour.html) [[ipynb]](http://211.75.15.16:60000/bdse11-03-SklearnQuickTour.ipynb)
* 04-TitanicSurvivalEDA [[html]](http://211.75.15.16:60000/bdse11-04-TitanicSurvivalEDA.html) [[ipynb]](http://211.75.15.16:60000/bdse11-04-TitanicSurvivalEDA.ipynb)
* 05-TitanicSurvivalAnalysis [[html]](http://211.75.15.16:60000/bdse11-05-TitanicSurvivalAnalysis.html) [[ipynb]](http://211.75.15.16:60000/bdse11-05-TitanicSurvivalAnalysis.ipynb)
* 06-XGBoostBlFriday [[html]](http://211.75.15.16:60000/bdse11-06-XGBoostBlFriday.html) [[ipynb]](http://211.75.15.16:60000/bdse11-06-XGBoostBlFriday.ipynb)
* 07-SklearnPCA [[html]](http://211.75.15.16:60000/bdse11-07-SklearnPCA.html) [[ipynb]](http://211.75.15.16:60000/bdse11-07-SklearnPCA.ipynb)
* 08-SklearnDBSCAN [[html]](http://211.75.15.16:60000/bdse11-08-SklearnDBSCAN.html) [[ipynb]](http://211.75.15.16:60000/bdse11-08-SklearnDBSCAN.ipynb)
* 09-KerasLayersIO [[html]](http://211.75.15.16:60000/bdse11-09-KerasLayersIO.html) [[ipynb]](http://211.75.15.16:60000/bdse11-09-KerasLayersIO.ipynb)

---

* 15-KerasSimpleRNNManyToOne [[html]](http://211.75.15.16:60000/bdse11-15-KerasSimpleRNNManyToOne.html) [[ipynb]](http://211.75.15.16:60000/bdse11-15-KerasSimpleRNNManyToOne.ipynb)
* 16-KerasSimpleRNNManyToMany [[html]](http://211.75.15.16:60000/bdse11-16-KerasSimpleRNNManyToMany.html) [[ipynb]](http://211.75.15.16:60000/bdse11-16-KerasSimpleRNNManyToMany.ipynb)
* 17-KerasLSTMSurvivalPredICU [[html]](http://211.75.15.16:60000/bdse11-17-KerasLSTMSurvivalPredICU.html) [[ipynb]](http://211.75.15.16:60000/bdse11-17-KerasLSTMSurvivalPredICU.ipynb)

---

* 10-KerasMNIST [[html]](http://211.75.15.16:60000/bdse11-10-KerasMNIST.html) [[ipynb]](http://211.75.15.16:60000/bdse11-10-KerasMNIST.ipynb)
* 11-KerasInceptionAndVGG [[html]](http://211.75.15.16:60000/bdse11-11-KerasInceptionAndVGG.html) [[ipynb]](http://211.75.15.16:60000/bdse11-11-KerasInceptionAndVGG.ipynb)
* 12-KerasResNetAndDenseNet [[html]](http://211.75.15.16:60000/bdse11-12-KerasResNetAndDenseNet.html) [[ipynb]](http://211.75.15.16:60000/bdse11-12-KerasResNetAndDenseNet.ipynb)

---

## Supplementary links

* [Notebooks for the Python Machine Learning textbook](https://github.com/rasbt/python-machine-learning-book-2nd-edition)
* [Pandas commands](https://studylib.net/doc/25268801/pandas-cheat-sheet)
* [With many dummy variables, consider using H2O for random forests/decision trees instead](https://roamanalytics.com/2016/10/28/are-categorical-variables-getting-lost-in-your-random-forests/) (Scikit-learn's built-in trees are not well suited to large numbers of dummy variables)
* [XGBoost parameters](https://xgboost.readthedocs.io/en/latest/parameter.html#learning-task-parameters)

## Homework

* **hw1**: Linear regression: outliers make the model fit worse. Try removing the outliers, train the model again, and check whether $R^2$ improves.
* **hw2**: XGBoost: vary the strength of the L1/L2 regularization terms and see whether the model's accuracy changes.

## Supplementary slides

* 9/29
  ![](https://i.imgur.com/thu4C2i.jpg)
* 10/5
  ![](https://i.imgur.com/o6ETX6J.jpg)
* 10/6
  ![](https://i.imgur.com/Zlrg9m2.jpg)
  ![](https://i.imgur.com/FsQsZBT.jpg)
  ![](https://i.imgur.com/KK1rl2r.jpg)
  ![](https://i.imgur.com/5wNa7u6.jpg)
  * The example above turns one categorical column into one hundred dummy columns. Splitting on any single one of these hundred columns can lower the overall impurity by only about 1%.
  * As a result, this categorical column tends to be ignored by decision trees/forests, whether or not it is actually important.
  * When building decision trees/forests, it is advisable not to convert categorical columns into dummy columns. Unfortunately, Scikit-learn currently still requires categorical columns to be dummy-encoded. Consider using XGBoost or H2O for tree-based algorithms instead.
* 10/16
  ![](https://i.imgur.com/E1gyrc9.jpg)
  ![](https://i.imgur.com/7IIDo2k.jpg)
* 10/28
  ![](https://i.imgur.com/f6JUwE9.jpg)
  ![](https://i.imgur.com/3UkJ9CM.jpg)
  ![](https://i.imgur.com/KbJM4sz.jpg)

## GPU environment

* [01](http://192.168.20.89:32601/?token=152218be33c6ca86813d11ccb55061579a2e74fa6971a014)
* [02](http://192.168.20.88:30252/?token=a210e456b53fdbea1e6680cc8e3e65cbaf87a21e3d4aae1a)
* [03](http://192.168.20.88:31595/?token=f9cf12f0b1d57820c664c1c46e59b84e98e9755dbbb099b9)
* [04](http://192.168.20.89:30821/?token=a778e986dd5043c53277e74457f64d13ec41f3e38e75bb61)
* [05](http://192.168.20.89:31706/?token=cfe294e8fd7f8a03f6096b6e618b200d79602a3650f6b80f)

---

翁啟閎 chi-hung.weng@gmx.de
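
The 10/6 note says that splitting on one of a hundred dummy columns lowers the overall impurity by only about 1%. This can be checked with a small calculation. The sketch below does not reproduce the slide's exact example: it assumes a hypothetical column with 100 equally frequent levels, half of which are purely positive and half purely negative, and uses Gini impurity as the impurity measure. The helper names `gini` and `impurity_drop` are made up for this illustration.

```python
def gini(pos, neg):
    """Gini impurity of a node holding `pos` positive and `neg` negative samples."""
    total = pos + neg
    if total == 0:
        return 0.0
    p = pos / total
    return 2.0 * p * (1.0 - p)


def impurity_drop(parent, left, right):
    """Fraction of the parent's Gini impurity removed by a binary split.

    Each argument is a (pos, neg) pair of sample counts.
    """
    n = sum(parent)
    weighted = (sum(left) * gini(*left) + sum(right) * gini(*right)) / n
    return (gini(*parent) - weighted) / gini(*parent)


# Assumed toy data: 100 levels with 100 samples each.
# Levels 0-49 are all positive, levels 50-99 are all negative.
n_levels, per_level = 100, 100
half = n_levels // 2
parent = (half * per_level, half * per_level)

# Split on a single dummy column, e.g. "level == 0" (a purely positive level).
left = (per_level, 0)
right = (parent[0] - per_level, parent[1])
one_dummy = impurity_drop(parent, left, right)

# Split on the categorical column itself: positive levels vs. negative levels.
grouped = impurity_drop(parent, (parent[0], 0), (0, parent[1]))

print(f"one dummy column:   {one_dummy:.2%}")
print(f"grouped categories: {grouped:.2%}")
```

Under these assumed numbers, a single dummy column removes roughly 1% of the impurity, while a split that groups the levels of the original column removes all of it. A learner that only sees the dummy columns never gets to consider the grouped split.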