---
title: Project
tags: teach:MF
---

# 202109 Machine Learning and FinTech: Final Project Reports

Please read the [instructions](https://hackmd.io/RyNu7V3nToaT6Qb2WQujjw?both) carefully: objectives, format, and data sources. Let us go far by going together, so that each of us is better today than yesterday!

- You may do the project individually.
- You may also do a team project, with at most 3 members per team.
- By default, every team member is assumed to have contributed, so all members receive the same grade.
- Oral presentations are given per project.

## A project list

- :star: marks the class representative, who helps confirm that course matters proceed as planned.
- :apple: denotes that the student has consulted with Prof. Teng. If I missed your apples, please let me know and I will correct it.
- :melon: indicates a project that will be adapted as a sample project for future courses.

| Index | Name | BackUp | Files Uploaded | Bonus |
|---|---|---|----|---|
| 00 | 童秉儀 | [Final] | | |
| 01 | [李奕萱](https://hackmd.io/@erika-lee1217/BkKSijRXK) | [Final](https://hackmd.io/09ijIxufTLi2S7spoL3CIQ) | | :apple: |
| 02 | [杜知翰](https://hackmd.io/DzxDjLWJQES1L5rTgGEpAg) | [Final](https://hackmd.io/8jp0u91QTYOn5VLwWYxWXg) | | |
| 03 | [吳念主](https://hackmd.io/@UoGDAvQGTKOsX47tonoryQ/BJgnjjC7F) (Physics M.S.) | [Final](https://hackmd.io/mmU9B_8DTLmh1E0WGXORLQ) | | |
| 04 | [陳敬文](https://hackmd.io/@6Lf2KfC_RVW8b9cdz3zI6A/SJxtn507K) (Mechanical Engineering M.S.) | [Final](https://hackmd.io/7-iK3QNRTE-yaGRIz5p4dg) | | |
| 05 | [陳以衡](https://hackmd.io/@dpmDf1pyTiGWgt0D6wf0vQ/SJcioj07t) | [Final](https://hackmd.io/TUJpLxWqQ4iXMkuu0JJfww) | | |
| 06 | [何建緯](https://hackmd.io/@1VjYHUs1Tu6Uv4YAP9aB-w/Byy6sjAQt) (Management Science) | [Final](https://hackmd.io/RCnBCOOTRoCnoiL2R1jC3A) | | |
| 07 | [盧逸凡](https://hackmd.io/@YI-FAN/HJXJ_h6fF) | [Final](https://hackmd.io/EeEOC4YzRLi-LdnddbxmAA) | | :melon: |
| 08 | [徐培修](https://hackmd.io/@BuckHsu/BJMsVjCQK) | [Final](https://hackmd.io/IXuOme-3RPOxVA5Sy4Imtg) | | :star::melon: |
| 09 | [許智超](https://hackmd.io/@witty27818a/S1VL2o07t) (Institute of Data Science) | [Final](https://hackmd.io/flSHSXUPRSmwy3OH8MDPDQ) | | |
| 10 | [李柏毅](https://hackmd.io/@x2LAQ7HQR8i8O7THsvn_DA/HkIgt50mY) | [Final](https://hackmd.io/4LCDOXHbTcehQVTp-EPmCw?both) | | :apple::apple: |
| 11 | [王聖晴](https://hackmd.io/@9mY24h7eRkuRXfYJ2chYkQ/SyrS9oCmY) | [Final](https://hackmd.io/CKMW8bs7R8ikcJs_6Ew-RA) | | :melon: |
| 12 | [陳姸榛](https://hackmd.io/@AT99JQAdQtuMAbv_MM1l3g/rkfF69CmK) | [Final](https://hackmd.io/KiHyY-ByQI6ncAYYFenK2A) | | |
| 13 | [黃子軒](https://hackmd.io/Hi_ea2JTSK-bLjy7DOnfFg?view) | [Final](https://hackmd.io/7IIJykvDQzCtqQjp8BsnVw) | | |
| 14 | [宋昱奇](https://hackmd.io/Bnb3BJmJStW6RqQTY58fow?view) | [Final](https://hackmd.io/fK5MiB8UQSW6aBzo4RWUKg) | | |
| 15 | [杜翔新](https://hackmd.io/@XfanakqjQ_KN5wuNvrJe-A/BJ652oA7t) | Missing | | |
| 16 | [尤皇倫](https://hackmd.io/@XVCGNox-St26K0lNOoVo4Q/Byhhij0mY) | [Final](https://hackmd.io/xreYeG5nRVa_kKkmqLtq9A) | | :apple: |
| 17 | [胡藝馨](https://hackmd.io/@YIHSINHU/H1VStsCXY) | [Final](https://hackmd.io/JVTMJ2-EQ-edXAP53GezuQ) | | |
| 18 | [黎國俊](https://hackmd.io/vJrtGV08S1GR5ejgViGAnw?view) | [Final](https://hackmd.io/Io6ZSO80QXeyYzOt_DrYBw) | | :melon: |
| 19 | [李彥璋](https://hackmd.io/@MZrHyxdqRfydBJLzIahm4A/HJ4ZnoCQt) (electrical engineering background) | [Final](https://hackmd.io/qNs0Q2c7RaeBcjkZS07knQ) | | |
| 20 | [賴冠諭, 陳廷威, 宋忻祖](https://hackmd.io/mh_qDMc2R0ishDnDII6yiw?view) | [Final](https://hackmd.io/2vBaqwsxSg2T10EI7MzjeA) | | :apple: 玉山 (E.SUN) |
| 21 | [王怡茹, 官顥, 薛育鴻](https://hackmd.io/@osmSEd5USX-Sq8Qy6tUDNg/S1yhCw7dY) | [Final](https://hackmd.io/YOeeSf__Q_-Crvzqv0qujA) | | :apple::apple::melon: 玉山 (E.SUN) |
| 22 | [杜承遠, 郭家妤, 孫崇棨](https://hackmd.io/A7_33uLITjeMVZbnDRXq9w?view) | [Final](https://hackmd.io/_WkNZ59vTCehQ45k8dEzNg) | | :apple::apple: 玉山 (E.SUN) |

## Talk 1 (9/27): Motivations

In this talk, please describe your motivations clearly. I was happy that almost every student did this part very well!

## Talk 2 (11/22): EDA and Proposal

### EDA

1. Refresh your motivations: update the title and keywords.
2. Data visualization (EDA)
    - Feature descriptions: explain each feature.
    - Data visualization: provide a title and a description for each figure or table.
    - Select the meaningful results for your HackMD page. Treat the remaining tedious details as supplementary material and link to them from an Appendix.

### Proposal

Here, you must clearly state:

1. Benchmark method
2. Measures for comparison

### 1. Benchmark method

Simply put, model building means finding a function $f$ that connects the explanatory variable $x$ to the response variable $y$:

$$y = f(x,\theta),$$

where $x$ is the set of explanatory variables, $f$ is the form of the model, and $\theta$ denotes the model parameters. So, briefly speaking, you need to decide:

1. which $x$ to use?
2. which $f$ to use?

In your project, you can compare which feature-engineering choice outperforms the others. In this case:

- benchmark: features without any transformation
- features with a log-transformation
- features after PCA

Then compare the results of these three cases.

Or, if you are focusing on supervised learning, the benchmark model could be one of the following:

- MLR: multiple linear regression
- LR: logistic regression
- [Multinomial Logistic Regression With Python](https://machinelearningmastery.com/multinomial-logistic-regression-with-python/)

### 2. Measures for comparison

In both scenarios above, you must clearly state what your benchmark model is and how you use machine learning methods to refine it, so that the models are improved and compared systematically.

- You must clearly state which measures you compare and list the numerical results of each model. Write mathematical notation directly in HackMD using LaTeX. Please do not paste snapshots, because that is unprofessional! For binary classification, common measures include accuracy, precision, recall, F1 score, AUC, etc. For trading or portfolio management, common measures include annualized return, standard deviation, Sharpe ratio, maximum drawdown, etc. Which measures to use depends closely on domain knowledge.
- You must state specifically whether you did a train-test split, and whether the results are in-sample or out-of-sample. For example, in binary classification we use an 80-20 train-test split, whereas for trading and asset-allocation problems we use a rolling-window approach. How to make out-of-sample comparisons again depends closely on domain knowledge.

## Talk 3 (12/27): Final presentations

Machine learning (in particular, supervised learning) provides a wide range of flexible and non-linear models. In your final presentation, you have to show explicitly whether your machine learning methods beat the benchmark method.

<!---
| Index | Name|Comments (12/13) | keywords |
|---|---|---|----|
| 01 | [李奕萱](https://hackmd.io/@erika-lee1217/BkKSijRXK)| benchmark? measures? 
| credit default risk, binary classification, logistic regression |
| 02 | [杜知翰](https://hackmd.io/DzxDjLWJQES1L5rTgGEpAg) | EDA? You should summarize the results in HackMD (adding a Colab link is not okay!) | Stock Prediction, Machine Learning Portfolio, LSTM Model |
| 05 | [吳念主](https://hackmd.io/@UoGDAvQGTKOsX47tonoryQ/BJgnjjC7F) | Benchmark? (1/N portfolio) Measures for performance? | South Sea Bubble, stock price prediction |
| 06 | [陳敬文](https://hackmd.io/@6Lf2KfC_RVW8b9cdz3zI6A/SJxtn507K) | benchmark? measures? | Stock price forecast, Predictive model, Explainable model, COVID-19 |
| 07 | [陳以衡](https://hackmd.io/@dpmDf1pyTiGWgt0D6wf0vQ/SJcioj07t) | Benchmark model: Multiple linear regression. ML: neural network. :question: | Stock price prediction model, TensorFlow, Keras, PyTorch |
| 08 | [何建緯](https://hackmd.io/@1VjYHUs1Tu6Uv4YAP9aB-w/Byy6sjAQt) | benchmark? measures? | active stock selection, stock price prediction |
| 09 | [盧逸凡](https://hackmd.io/@YI-FAN/HJXJ_h6fF) | benchmark? measures? | Customer Transaction, LightGBM |
| 10 | [徐培修](https://hackmd.io/@BuckHsu/BJMsVjCQK) | benchmark? measures? | securities trading, stock prices, chip (position) analysis, brokerage branches |
| 11 | [許智超](https://hackmd.io/@witty27818a/S1VL2o07t) | Almost done. (Do not use "spoiler"! It is difficult for the audience to read.) | fraud detection, binary classification, audit risk |
| 12 | [李柏毅](https://hackmd.io/@x2LAQ7HQR8i8O7THsvn_DA/HkIgt50mY) | Need to provide a benchmark! | ML/DL, Financial Market, Automated Trading |
| 13 | [王聖晴](https://hackmd.io/@9mY24h7eRkuRXfYJ2chYkQ/SyrS9oCmY) | Need to provide a logistic regression as a benchmark. | credit card fraud detection, binary classification |
| 14 | [陳姸榛](https://hackmd.io/@AT99JQAdQtuMAbv_MM1l3g/rkfF69CmK) | But what is the stochastic gradient descent regression formula? | cryptocurrency prediction, trading, machine learning, cryptocurrency |
| 15 | [黃子軒](https://hackmd.io/Hi_ea2JTSK-bLjy7DOnfFg?view) | benchmark? measures? | microloans, default risk assessment |
| 16 | [宋昱奇](https://hackmd.io/Bnb3BJmJStW6RqQTY58fow?view) | benchmark? measures? | hotel cancellations, LR property, Machine Learning |
| 17 | [杜翔新](https://hackmd.io/@XfanakqjQ_KN5wuNvrJe-A/BJ652oA7t) | Many materials are missing. benchmark? measures? | |
| 18 | [尤皇倫](https://hackmd.io/@XVCGNox-St26K0lNOoVo4Q/Byhhij0mY) | Benchmark method? | big data analysis, Machine picking, Financial risk management, return forecast |
| 19 | [胡藝馨](https://hackmd.io/@YIHSINHU/H1VStsCXY) | Benchmark? (A benchmark model is not exactly the same as a benchmark result. We need to identify the benchmark model so that we can see how to improve the models.) | IPO companies |
| 20 | [黎國俊](https://hackmd.io/vJrtGV08S1GR5ejgViGAnw?view) | You are almost done (with an academic paper flavor)! Merge Tables 3 and 4. | Financial safety, Investment Bank, Early detection, Capital adequacy, Overall operation |
| 21 | [李彥璋](https://hackmd.io/@MZrHyxdqRfydBJLzIahm4A/HJ4ZnoCQt) | Don't paste snapshots of formulas; type them directly in LaTeX. Momentum portfolios for comparisons. | machine learning, future data, slidewindow, degree of freedom |
| 22, 23, 24 | [賴冠諭, 陳廷威, 宋忻祖](https://hackmd.io/mh_qDMc2R0ishDnDII6yiw?view) | benchmark? measures? | 玉山人工智慧公開挑戰賽 (E.SUN AI Open Challenge), Group 1 |
| 25, 26, 27 | [王怡茹, 官顥, 薛育鴻](https://hackmd.io/@osmSEd5USX-Sq8Qy6tUDNg/S1yhCw7dY) | EDA? (Can you paste the important information in HackMD directly and leave the less important information as supplementary material behind a link?) Problem formulation? Benchmark model? You should write out the measures rather than paste snapshots. | 玉山人工智慧公開挑戰賽 (E.SUN AI Open Challenge), Group 2. benchmark? |
| 28, 29, 30 | [杜承遠, 郭家妤, 孫崇棨](https://hackmd.io/A7_33uLITjeMVZbnDRXq9w?view) | benchmark? measures? | 玉山人工智慧公開挑戰賽 (E.SUN AI Open Challenge), Group 3 |
-->
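
For a binary-classification project, the benchmark-versus-ML comparison asked for in the final presentation could be tabulated roughly as follows. This is only a minimal sketch using the Python standard library: the helper names `train_test_split` and `classification_measures`, the labels, and both prediction vectors are hypothetical and made up for illustration, not taken from any project above. AUC is left out because it needs predicted scores rather than hard 0/1 labels.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of `rows` and hold out `test_fraction` of them as the test set."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) - int(round(len(shuffled) * test_fraction))
    return shuffled[:cut], shuffled[cut:]


def classification_measures(y_true, y_pred):
    """Accuracy, precision, recall and F1 for 0/1 labels (1 is the positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# 80-20 split of ten hypothetical sample indices: 8 for training, 2 for testing.
train_idx, test_idx = train_test_split(range(10), test_fraction=0.2)
print(len(train_idx), len(test_idx))

# Hypothetical out-of-sample labels and predictions, for illustration only.
y_test = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
predictions = {
    "benchmark (e.g. LR)": [1, 0, 0, 1, 0, 1, 1, 0, 0, 0],
    "candidate ML model": [1, 0, 1, 1, 0, 1, 1, 0, 0, 0],
}
for name, y_pred in predictions.items():
    measures = classification_measures(y_test, y_pred)
    print(name, {key: round(value, 3) for key, value in measures.items()})
```

Reporting every model on the same held-out test set, with the same measures, is what makes the comparison against the benchmark meaningful.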
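
For trading or portfolio projects, the measures and the rolling-window evaluation mentioned in the proposal guidelines might be computed along these lines. This is a sketch under stated assumptions: returns are simple per-period returns, there are 252 trading periods per year, the risk-free rate defaults to zero, and the function names (`annualized_return`, `sharpe_ratio`, `max_drawdown`, `rolling_windows`) are hypothetical helpers rather than part of any required toolkit.

```python
import math
import statistics


def annualized_return(returns, periods_per_year=252):
    """Geometric (compounded) return scaled to one year."""
    growth = math.prod(1 + r for r in returns)
    return growth ** (periods_per_year / len(returns)) - 1


def annualized_volatility(returns, periods_per_year=252):
    """Sample standard deviation of per-period returns, scaled to one year."""
    return statistics.stdev(returns) * math.sqrt(periods_per_year)


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio; the annual risk-free rate is split evenly per period (assumption)."""
    excess = [r - risk_free_rate / periods_per_year for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(periods_per_year)


def max_drawdown(returns):
    """Largest peak-to-trough loss of cumulative wealth, as a positive fraction."""
    wealth, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        wealth *= 1 + r
        peak = max(peak, wealth)
        worst = max(worst, 1 - wealth / peak)
    return worst


def rolling_windows(n_obs, train_size, test_size):
    """Yield (train, test) index ranges; each test block directly follows its training block."""
    start = 0
    while start + train_size + test_size <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size


# Hypothetical daily returns, for illustration only.
daily_returns = [0.01, -0.02, 0.015, 0.003, -0.007, 0.012, -0.004, 0.006]
print("annualized return:", round(annualized_return(daily_returns), 4))
print("annualized volatility:", round(annualized_volatility(daily_returns), 4))
print("Sharpe ratio:", round(sharpe_ratio(daily_returns), 4))
print("maximum drawdown:", round(max_drawdown(daily_returns), 4))
for train, test in rolling_windows(len(daily_returns), train_size=4, test_size=2):
    print("fit on", list(train), "evaluate on", list(test))
```

Because each test block lies strictly after its training block, every evaluation in the loop is out-of-sample, which is the point of the rolling-window approach.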