# 凱利法則的運用
###### tags: `school`
> **組員:**
- 楊舒雅 07155205
- 洪珮嵐 07155219
- 鄭敦齊 07155236
- 莊秉勳 07155145
> **貢獻度:**
- 莊秉勳 :smirk:
- 楊舒雅 洪珮嵐 鄭敦齊 :hear_no_evil:
## 目標
利用凱利法則篩選買賣的結果與該股票的股價的差別,那我們探討[DJIA](https://zh.wikipedia.org/wiki/%E9%81%93%E7%90%BC%E6%96%AF%E5%B7%A5%E4%B8%9A%E5%B9%B3%E5%9D%87%E6%8C%87%E6%95%B0)(道瓊工業平均指數),[NDX](https://zh.wikipedia.org/wiki/%E7%B4%8D%E6%96%AF%E9%81%94%E5%85%8B100%E6%8C%87%E6%95%B8)(納茲達克100),[dax](https://zh.wikipedia.org/wiki/%E5%BE%B7%E5%9B%BDDAX%E6%8C%87%E6%95%B0)(法蘭克福指數),以及NASDAQ, AMEX and NYSE,在這裡以us_stock表示
## 需要的東西
>環境的套件:arrow_right: [**環境套件**](https://drive.google.com/file/d/1ijI44b0VaWtx7BlfN-wgISoAuiiqEeVB/view?usp=sharing)
python版本:arrow_right: **3.8**
需要的資料(.csv) :arrow_right:[**各類股價**](https://drive.google.com/drive/folders/11Pbe-OuomyYJDge14oLqxcwmyIPTH5xs?usp=sharing)
#### 用到的套件
```python=
import warnings
warnings.filterwarnings('ignore')
%matplotlib inline
from pathlib import Path
import requests
from io import BytesIO
from zipfile import ZipFile, BadZipFile
import numpy as np
from numpy.linalg import inv
from numpy.random import dirichlet
import pandas as pd
import pandas_datareader.data as web
from sklearn.datasets import fetch_openml
from sympy import symbols, solve, log, diff
from scipy.optimize import minimize_scalar, newton, minimize
from scipy.integrate import quad
from scipy.stats import norm
import matplotlib.pyplot as plt
import seaborn as sns
```
## 將各種指標加入assets.h5
將[sp500](https://zh.wikipedia.org/wiki/%E6%A0%87%E5%87%86%E6%99%AE%E5%B0%94500%E6%8C%87%E6%95%B0)加入assets.h5,其中含美國的500家上市公司,,並把不需要的行刪除。
```python=
sp500_stooq = (pd.read_csv('^sp500_d.csv', index_col=0,parse_dates=True).loc['1950':'2019'].rename(columns=str.lower))
with pd.HDFStore(DATA_STORE) as store:
store.put('sp500/stooq', sp500_stooq)
url = 'https://en.wikipedia.org/wiki/List_of_S%26P_500_companies'
df = pd.read_html(url, header=0)[0]
df.columns = ['ticker', 'name', 'sec_filings', 'gics_sector', 'gics_sub_industry',
'location', 'first_added', 'cik', 'founded']
df = df.drop('sec_filings', axis=1).set_index('ticker')
from pathlib import Path
with pd.HDFStore(DATA_STORE) as store:
store.put('sp500/stocks', df)
```
將[DJIA](https://zh.wikipedia.org/wiki/%E9%81%93%E7%90%BC%E6%96%AF%E5%B7%A5%E4%B8%9A%E5%B9%B3%E5%9D%87%E6%8C%87%E6%95%B0)(道瓊工業平均指數)加入assets.h5,其中包括美國最大、最知名的三十家上市公司,並把不需要的行刪除。
```python=
dow_jones = (pd.read_csv('^dji_d.csv', index_col=0,parse_dates=True).loc['1950':'2019'].rename(columns=str.lower))
with pd.HDFStore(DATA_STORE) as store:
store.put('dji/stooq', dow_jones)
url='https://en.wikipedia.org/wiki/Dow_Jones_Industrial_Average'
df = pd.read_html(url, header=0)[1]
df.columns =['name','exchange','ticker','industry','added','notes','index_weighting']
df = df.drop('notes',axis=1).set_index('ticker')
with pd.HDFStore(DATA_STORE) as store:store.put('dji/stooq', df)
```
將[NDX](https://zh.wikipedia.org/wiki/%E7%B4%8D%E6%96%AF%E9%81%94%E5%85%8B100%E6%8C%87%E6%95%B8)(納茲達克100)加入assets.h5,為美國納斯達克100支最大型本地及國際非金融類上市公司組成的股市指數,並把不需要的行刪除。
```python=
nasdaq = (pd.read_csv('^ndx_d.csv', index_col=0,parse_dates=True).loc['1950':'2019'].rename(columns=str.lower))
with pd.HDFStore(DATA_STORE) as store:
store.put('ndx/stooq', nasdaq)
url = 'https://en.wikipedia.org/wiki/Nasdaq-100'
df = pd.read_html(url, header=0)[3]
df.columns = [ 'name','ticker','sector','industry']
df = df.set_index('ticker')
with pd.HDFStore(DATA_STORE) as store:
store.put('ndx/stooq', df)
```
將[dax](https://zh.wikipedia.org/wiki/%E5%BE%B7%E5%9B%BDDAX%E6%8C%87%E6%95%B0)(法蘭克福指數)加入assets.h5,包含有30家主要的德國公司,並把不需要的行刪除。
```python=
with pd.HDFStore(DATA_STORE) as store:
store.put('dji/stooq', df)
dax = (pd.read_csv('^dax_d.csv', index_col=0,parse_dates=True).loc['1950':'2019'].rename(columns=str.lower))
with pd.HDFStore(DATA_STORE) as store:
store.put('dax/stooq', dax)
url = 'https://en.wikipedia.org/wiki/DAX'
df = pd.read_html(url, header=0)[3]
df.columns = ['Unnamed','name','sector','ticker','Index_weighting','employee','founded']
remove = ['Unnamed','employee']
df= df[df.columns.difference(remove)]
df = df.set_index('ticker')
with pd.HDFStore(DATA_STORE) as store:
store.put('dax/stooq', df)
```
## 執行凱利法則
```python=
with pd.HDFStore(DATA_STORE) as store:
sp500_stocks = store['sp500/stocks'].index
a = store['us_equities/stocks'].index
c = store['dji/stooq'].index
d = store['dax/stooq'].index
e = store['ndx/stooq'].index
prices = store['quandl/wiki/prices'].adj_close.unstack('ticker').filter(e)
```
將資料讀進來,其中us_equities/stocks原作者就有放入assets.h5中了
```python=
monthly_returns = prices.loc['1988':'2017'].resample('M').last().pct_change().dropna(how='all').dropna(axis=0)
stocks = monthly_returns.columns
```
設定讀取範圍以及刪除空缺值
```python=
cov = monthly_returns.cov()
precision_matrix = pd.DataFrame(inv(cov), index=stocks, columns=stocks)
kelly_allocation = monthly_returns.mean().dot(precision_matrix)
```
設定矩陣的行以及列,縱軸橫軸,算出在凱利法下每個上市公司的股價
```python=
kelly_allocation.describe()
kelly_allocation.sum()
list(kelly_allocation)
```
輸出結果
```python=
kelly_allocation[kelly_allocation.abs()>5].sort_values(ascending=False).plot.barh(figsize=(20, 5))
plt.yticks(fontsize=12)
sns.despine()
plt.tight_layout();
```
以值方圖表示權重絕對值大於5倍的原權重的公司
```python=
ax = monthly_returns.loc['2010':].mul(kelly_allocation.div(kelly_allocation.sum())).sum(1).to_frame('Kelly').add(1).cumprod().sub(1).plot(figsize=(14,4));
sp500.filter(monthly_returns.loc['2010':].index).pct_change().add(1).cumprod().sub(1).to_frame('').plot(ax=ax, legend=True)
plt.tight_layout()
sns.despine();
```
畫出公司的股價以及有經過凱利公式進行買賣的結果
## 結果
<font color = orange size=4>sp500結果</font>


<font color = orange size=4>us_equities結果</font>


<font color =orange size = 4> DJI結果</font>


<font color =orange size = 4> NDX結果</font>

<font color =orange size = 4> DAX結果</font>

可以看到他的值為空的,回頭看程式碼
```python=
with pd.HDFStore(DATA_STORE) as store:
sp500_stocks = store['sp500/stocks'].index
a = store['us_equities/stocks'].index
c = store['dji/stooq'].index
d = store['dax/stooq'].index
e = store['ndx/stooq'].index
prices = store['quandl/wiki/prices'].adj_close.unstack('ticker').filter(d)
```
我們發現原來資料是從asset.h5中的<font color = blue>**quandl/wiki/prices**</font>找出他的歷年股票價格,但它裡面只有3000+檔美股資料而已,沒有德國的,那我們試著在assets.h5中加入其他國家的股票價格
-

我們在[quandl](https://www.quandl.com/)中尋找,找到一個從[WVB Dossier](https://wvb.com/dossier.htm)(WVB Global Fundamentals Dossier)抓取全球超過55,000+公司的股價,但很可惜需要訂閱才能下載,所以沒能成功
# 遇到的問題
```python=
with pd.HDFStore(DATA_STORE) as store:
sp500_stocks = store['sp500/stocks'].index
a = store['us_equities/stocks'].index
c = store['dji/stooq'].index
d = store['dax/stooq'].index
e = store['ndx/stooq'].index
prices = store['quandl/wiki/prices'].adj_close.unstack('ticker').filter(c)
```
再換資料的時候sp500,a是沒問題的,但當將資料換為c(dji),e(ndx)時就會跑出以下結果

當在c(dji)時,嘗試將
```python=
monthly_returns = prices.loc['1988':'2017'].resample('M').last().pct_change().dropna(how='all').dropna(axis=0)
stocks = monthly_returns.columns
monthly_returns.info()
```
中的axis=0改為1,也就是如果遇到缺控值時,原本會刪除包含缺失值的行,改為刪除包含缺失值的列,就可以跑出結果了

但當資料改為e(ndx)時,怎麼都沒辦法跑出結果,覺得非常的困擾