---
title: Project16
tags: teach:MF
---

---
title: ML&F
tags: teaching
---

# ML and FinTech: Project by 尤皇倫

### Keywords

Machine picking, financial risk management, return forecast

---

## 1. Motivations

---

### Motivation and why

I want to use historical market price data and other fundamental information about the stock market to analyze and predict the future return and risk of individual stocks, and then recommend a portfolio based on the risk an investor is willing to bear and the return the investor requires.

Recently, people have been encouraged to build the habit of investing, but not everyone has enough knowledge to invest on their own. This program can quickly screen for the stocks an investor needs, giving them investment options beyond mutual funds and time deposits.

## 2. EDA

### Link of data

https://www.tpex.org.tw/web/index.php?l=zh-tw

https://www.twse.com.tw/zh/

https://mops.twse.com.tw/mops/web/index

I collected data on Taiwanese stocks for 2018, 2019, 2020, and 2021.

### Data description

#### Explanatory variables

| No | Variable | Explanation |
| -------- | -------- | -------- |
| $x_{1}$ | Return | annual |
| $x_2$ | Volatility | annual |
| $x_3$ | EPS | earnings per share |
| $x_4$ | P/E ratio | price / EPS |
| $x_5$ | Profit margin | net income / sales |
| $x_6$ | Market value | |

#### Output

Expected return over the next year

Expected volatility over the next year

### Missing values

2018

![](https://i.imgur.com/Idfd1Nw.png)

2019

![](https://i.imgur.com/cIB3RzS.png)

2020

![](https://i.imgur.com/ltzC75b.png)

### Distribution of return and volatility

2018

![](https://i.imgur.com/Utc4KPk.png)

2019

![](https://i.imgur.com/wZkAFqN.png)

2020

![](https://i.imgur.com/0WqkCBd.png)

2021

![](https://i.imgur.com/4gb0IzS.png)

### Correlation heatmap

2018

![](https://i.imgur.com/n3MPdA1.png)

2019

![](https://i.imgur.com/TPQD6AU.png)

2020

![](https://i.imgur.com/WRdzhlJ.png)

## 3. Problem formulation

#### Model setting

TensorFlow: Keras

Loss function: MSE

Optimizer: Adam

Activation function: ReLU

![](https://i.imgur.com/9anviX6.png)

#### Benchmark model

We consider a pooled model with multiple linear regression:

$$r_{t+1} = \beta_0+\beta_1r_{t}+\beta_2\sigma_{t}+\beta_3 x_{3,t}+\beta_4 x_{4,t} +\beta_5 x_{5,t}+\varepsilon_{t},\;\;t=1, 2$$

$$\sigma_{t+1} = \beta_0+\beta_1r_{t}+\beta_2\sigma_{t}+\beta_3 x_{3,t}+\beta_4 x_{4,t} +\beta_5 x_{5,t}+\varepsilon_{t},\;\;t=1, 2$$

#### Process

We first use the 2018 data as inputs and the 2019 return and volatility as targets to train an ANN model and an LR model.

![](https://i.imgur.com/ggQNHlz.png)

Performance of ANN model I

![](https://i.imgur.com/ZuTAobp.png)

Performance of LR model I (benchmark)

![](https://i.imgur.com/ZVdogjb.png)

Then we use the 2019 data as inputs and the 2020 return and volatility as targets to train a second ANN model and a second LR model.

![](https://i.imgur.com/O8RHPDl.png)

Performance of ANN model II

![](https://i.imgur.com/rQp2fQv.png)

Performance of LR model II (benchmark)

![](https://i.imgur.com/s25rS7c.png)

### Backtest

We then use each model to make predictions and compare them with the actual results as a backtest.

Performance of ANN model I

![](https://i.imgur.com/xrtfi6s.png)

Performance of LR model I (benchmark)

![](https://i.imgur.com/1CoBJAm.png)

Result of ANN model I

![](https://i.imgur.com/ciGtvYn.png)

Result of LR model I (benchmark)

![](https://i.imgur.com/qSYgIqg.png)

Actual result I

![](https://i.imgur.com/DYdYXI0.png)

Performance of ANN model II

![](https://i.imgur.com/JuUdEyK.png)

Performance of LR model II (benchmark)

![](https://i.imgur.com/V8OQpzT.png)

Result of ANN model II

![](https://i.imgur.com/NPe8xRD.png)

Result of LR model II (benchmark)

![](https://i.imgur.com/PGOzjFO.png)

Actual result II

![](https://i.imgur.com/8BYhxBl.png)

### Build portfolio

I use each model's predicted results to build a portfolio.
I choose the stocks whose predicted volatility is below the mean and whose predicted return is above the mean.

![](https://i.imgur.com/nWuUSYr.png)

![](https://i.imgur.com/YZYmFz3.png)

ANN model I

![](https://i.imgur.com/liHAsFe.png)

return = 0.166, mean of volatility = 0.4123, portfolio volatility = 0.16682866

LR model I (benchmark)

![](https://i.imgur.com/VWsuMN5.png)

return = 0.054, mean of volatility = 0.4532, portfolio volatility = 0.0793308

ANN model II

![](https://i.imgur.com/yFskQ85.png)

return = 0.24, mean of volatility = 0.369, portfolio volatility = 0.12767422

LR model II (benchmark)

![](https://i.imgur.com/jbGI6dG.png)

return = -0.029, mean of volatility = 0.282, portfolio volatility = 0.12128712

## 4. Analysis and Conclusion

1. In terms of MSE, the ANN model does not outperform the LR model.
2. The portfolios built by the ANN have higher returns than those built by the LR model.
3. The portfolios built by the ANN have higher portfolio volatility than those built by the LR model. In every case, however, diversification brings the portfolio volatility well below the mean volatility of the individual stocks.
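
To illustrate the selection rule and the diversification effect described above, here is a minimal Python sketch. It is not the code used in this project. The tickers, predicted values, and correlation matrix are made-up placeholders. The equal weighting and the formula $\sigma_p=\sqrt{w^\top\Sigma w}$ are assumptions, since the note does not state how portfolio volatility was computed.

```python
from math import sqrt
from statistics import mean


def select_stocks(pred_return, pred_vol):
    """Keep tickers with predicted return above the cross-sectional mean
    and predicted volatility below the cross-sectional mean."""
    r_bar = mean(pred_return.values())
    v_bar = mean(pred_vol.values())
    return [t for t in pred_return
            if pred_return[t] > r_bar and pred_vol[t] < v_bar]


def portfolio_stats(returns, vols, corr):
    """Equal-weight portfolio (an assumption): return, mean individual
    volatility, and portfolio volatility sqrt(w' * Sigma * w)."""
    n = len(returns)
    w = 1.0 / n
    port_ret = sum(w * r for r in returns)
    mean_vol = mean(vols)
    # Sigma[i][j] = vol_i * vol_j * corr[i][j]
    variance = sum(w * w * vols[i] * vols[j] * corr[i][j]
                   for i in range(n) for j in range(n))
    return port_ret, mean_vol, sqrt(variance)


# Hypothetical predictions for four placeholder tickers.
pred_return = {"A": 0.20, "B": 0.05, "C": 0.15, "D": -0.10}
pred_vol = {"A": 0.30, "B": 0.25, "C": 0.35, "D": 0.60}

selected = select_stocks(pred_return, pred_vol)

# Hypothetical correlation matrix for the selected stocks.
corr = [[1.0, 0.2],
        [0.2, 1.0]]

port_ret, mean_vol, port_vol = portfolio_stats(
    [pred_return[t] for t in selected],
    [pred_vol[t] for t in selected],
    corr,
)
print(selected, port_ret, mean_vol, port_vol)
```

With imperfectly correlated stocks, the portfolio volatility computed this way is lower than the mean of the individual volatilities, which is the pattern seen in the four portfolios reported above.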