Autoencoder asset pricing models

###### tags: `畢專` `paper`

# Autoencoder asset pricing models

[TOC]

---

https://github.com/dkyol/Asset-Pricing-Model

## Paper Reference

[Autoencoder_paper](https://poseidon01.ssrn.com/delivery.php?ID=342110087101018125106084088099111006002092063023032070066028028098081080119101007117121123020058055102054114067082093014000119049055017040015001068101104122068122059053095086101069117101089079000075099083084098002024097120009022096020099073003118124&EXT=pdf&INDEX=TRUE)
[KPS_paper](http://utahwfc.org/uploads/2018_paper_02b.pdf)
[GKX_paper](https://dachxiu.chicagobooth.edu/download/ML.pdf)

---

* Autoencoder Asset Pricing Models
    * Shihao Gu
        * University of Chicago Booth School of Business
    * Bryan Kelly
        * Yale University, AQR Capital Management, and NBER
    * Dacheng Xiu
        * University of Chicago Booth School of Business

---

## Abstract

* our model allows for latent factors and factor exposures that depend on covariates such as asset characteristics.
    > * An exposure factor is a percentage: the share of an asset's value that is lost when the asset is hit by a given risk. [name=林鈺恩]
* model factor exposures as a flexible nonlinear function of covariates.
* retrofits the workhorse unsupervised dimension reduction device from the machine learning literature: autoencoder neural networks
* This delivers estimates of nonlinear conditional exposures and the associated latent factors
* imposes the economic restriction of no-arbitrage

## 1 Introduction

* A recent asset pricing literature has emerged challenging the “anomaly” view of characteristic-based asset return prediction.
* The anomaly view suggests that certain asset attributes have the power to forecast returns above and beyond the expected return variation warranted as compensation for aggregate risk exposures.
    * Put differently, under the anomaly view some asset attributes forecast returns beyond what compensation for aggregate risk exposure would justify.
* KPS provide empirical evidence that these so-called anomaly asset characteristics in fact proxy for unobservable and time-varying exposures to risk factors, and show that characteristics contain little (if any) anomalous return predictability once their explanatory power for factor exposures has been accounted for.
    * **In other words, characteristics can predict returns because they help identify compensated aggregate risk exposures.**
* In KPS (Kelly, Pruitt, and Su), asset returns $r_{i,t}$ follow a $K$-factor structure:
    > $r_{i,t} = \beta(z_{i, t-1})'f_t+u_{i,t}$
    >> * The factors $f_t$ are latent (unobserved)
    >> * $\beta(z_{i, t-1})$ is a $K\times 1$ vector of conditional factor exposures
    >> * The asset characteristics $z_{i, t-1}$ form a $P\times 1$ vector; $P$ is high-dimensional and strictly larger than $K$
    >
    > $\beta(z_{i, t-1})' = z_{i,t-1}' \ \Gamma$
    >> * This is only a brief introduction to the KPS formula; details are skipped here
* KPS compresses the $P$ characteristics into $K$ exposures; because $\beta$ is linear, the whole model stays simple
* PCA and the autoencoder are counterpart models
    * Both are unsupervised methods
    * Neither uses covariate information for dimension reduction
* IPCA does use covariate information for dimension reduction, but it is still a linear model
* This paper applies a new conditional autoencoder to individual stock returns
    * Like IPCA, it uses covariates for dimension reduction
    * Stock characteristics can affect factor exposures nonlinearly and through interactions
    * factors -> portfolios
* nonlinear conditional asset pricing model
    * The nonlinearity comes from a neural network that maps covariates to beta

| | paper | IPCA | Fama-French |
|:---:|:---:|:---:|:---:|
| monthly total $R^2$ | 12.6% | 13.3% | 3.4% |
| predictive $R^2$ | 0.50% | 0.23% | negative |
| portfolios (Sharpe ratio), equal-weighted | 2.16 | 1.26 | -0.40 |
| portfolios (Sharpe ratio), value-weighted | 0.92 | 0.59 | -0.69 |

* This paper is an extension of KPS
    * KPS is a linear latent factor APT model
    * This paper allows nonlinearity

## 2 Methodology

### 2.1 Standard Autoencoder

* $l$: layer index
* $K^{(l)}$: number of neurons in layer $l$
* $r_k^{(l)}$: output of the $k$-th neuron in layer $l$
* $r^{(l)}$: output of layer $l$ (vector)
* nonlinear activation function is `ReLU`
* $$r^{(l)} = g\left(b^{(l-1)} + W^{(l-1)} r^{(l-1)}\right)$$

#### 2.1.1 Static Linear Factor Models as a Special Case

* leverages covariates as conditioning information
* Compared with PCA, this model allows nonlinear compression of the inputs
* When the autoencoder's hidden layer uses a linear activation function, it is equivalent to PCA
* In the finance literature, the most commonly studied models of asset returns are linear latent factor specifications with factor loadings

### 2.2 Extending the Autoencoder Model to Include Covariates

> ![](https://i.imgur.com/ZxTXEP8.jpg)
> * $P$ is the number of stock characteristics (94)
> * The left side (beta model) describes the factor loadings $\beta_{t-1}$ at time $t-1$
>> * The left half of the network models factor loadings as a nonlinear function of covariates (e.g. asset characteristics)
>> * The right half of the network models factors as portfolios of individual stocks
> * At the top level the model can be written as: $r_{i,t} = \beta_{i, t-1}'f_t+u_{i,t}$
>> * $\beta_{i, t-1}$ as a neural network model of lagged firm characteristics ($K\times 1$ vector), $z_{i,t−1}$
>>
>>> Left half of the model (beta model)
>>> * $z_{i, t-1}^{(0)} = z_{i,t-1}$: initialize the network with the raw characteristic data $z_{i,t-1}$
>>> * $z_{i, t-1}^{(l)} = g(b^{(l-1)} +W^{(l-1)} z_{i, t-1}^{(l-1)}), \ l = 1, \dots, L_\beta$: nonlinear transformation of the characteristics in the hidden layers
>>> * $\beta_{i,t-1} = b^{(L_\beta)} + W^{(L_\beta)}z_{i, t-1}^{(L_\beta)}$: the output layer returns a $K$-dimensional vector of factor betas
>>
>>> Right half of the model (factor model)
>>> * $r_t^{(0)} = r_t$: initialize with the vector of individual asset returns (one entry per asset)
>>> * $r_t^{(l)} = \tilde{g}(\tilde{b}^{(l-1) }+\tilde{W}^{(l-1)}r_t^{(l-1)}), \ l=1, \dots, L_f$: the hidden layers transform the returns and reduce their dimension
>>> * $f_t = \tilde{b}^{(L_f)} + \tilde{W}^{(L_f)}r_t^{(L_f)}$: the final output is the set of $K$ factors
>>
>> * The output of the factor network (right half) is effectively a set of portfolios, which gives it an economic interpretation
>> * Finally, the output of the beta network ($N\times K$) and the output of the factor network are combined by a dot product, which gives the fitted asset returns
>
> * Problems with the factor model:
> 1. The number of firms is very large, about 30,000, over 60 x 12 = 720 months
> 2. On average only about 6,000 stocks exist in any given month, so the weights for the other stocks would have to be estimated from short histories
> * The factor model is therefore modified
>> * The right side (factor model) says that at time $t$ the factor output layer is a weighted combination of the input layer
>> * $P$ characteristic-managed portfolios
>> * The initialization becomes $r_t^{(0)} = x_t,$
>> $x_t = (Z_{t-1}'Z_{t-1})^{-1}Z_{t-1}'r_t$
>> * or $N$ individual asset returns
>> * The intent is for the output layer to stay close to the input layer

* The static linear factor model has been an extremely productive tool for studying asset returns, but recent research highlights a number of its limitations
    * The distribution of asset returns varies over time
    * Static factor models abstract away from a large amount of relevant conditioning information
* KPS show that covariates improve both the estimation of loadings and the estimates of the latent factors
* The autoencoder and PCA share a drawback: neither can use conditioning variables to identify the factor structure, and both rely only on the returns themselves
* This paper therefore combines the autoencoder with covariates to design a new neural network structure

#### 2.2.1 Conditional Linear Factor Models as a Special Case

### 2.3 Regularized Autoencoder Learning

* In short, they design the training strategy carefully and use regularization extensively to combat overfitting

#### 2.3.1 Training, Validation, and Testing

* The sample is split into 3 periods by time
    * Training
        * 18 years (1957 - 1974)
        * Used to estimate the model subject to a given set of hyperparameters
    * Validation
        * 12 years (1975 - 1986)
        * Used to tune the hyperparameters
        * fitted value
            * Fitted values on the validation sample are computed from the model estimated on the training sample
        * Gives an indication of out-of-sample performance, but not a pure one, because the sample is used for tuning
    * Testing
        * 30 years (1987 - 2016)
        * Used purely to evaluate out-of-sample performance

#### 2.3.2 Regularization Techniques

* A penalty is usually added to the objective function to prevent overfitting
* 3 regularization techniques
    * 1 LASSO ($l_1$)
        > ![](https://i.imgur.com/IZiPDEb.png)
        > * $L$ is the objective function
        > * $\theta$ summarizes the weight parameters in the formulas above
        > * $\phi(\theta)$ is the penalty function used for regularization
        > * They use LASSO, i.e. the $l_1$ penalty ![](https://i.imgur.com/2sZvvbV.png)
        > * It highlights the important features; weights on unimportant ones are shrunk to 0
    * 2 early stopping
        * Start from an initial parameter guess that imposes a parsimonious parameterization
        * At each step the parameter guess is updated to reduce the fitting errors
        * After each update the error on the validation sample is also computed
        * The optimization stops when the validation sample errors begin to rise
        > By ending the parameter search early, parameters are shrunken toward the initial guess, and this is how early stopping regularizes against overfit.
It is a popular substitute to “l2”-penalization of θ parameters because it achieves regularization at a much lower computational cost.
    * 3 ensemble approach
        * Initialize the neural network with multiple random seeds
        * Average the fitted values across all networks to construct the model prediction
        > This enhances the stability of the results because the stochastic nature of the optimization can cause different seeds to settle at different optima.

#### 2.3.3 Optimization Algorithms

* Background
    * The problem is highly nonlinear and nonconvex
    * The model is heavily parameterized
    * Brute-force optimization would take very long
* Solution: train with stochastic gradient descent (SGD)
    * Standard gradient descent computes the gradient on the entire training sample
    * SGD computes it on a small random subset
    * This sacrifices accuracy in exchange for a large speedup
* The most important tuning parameter in SGD is the `learning rate`
    * As the gradient approaches zero, the learning rate must also shrink toward zero; otherwise noise in the gradient estimate starts to dominate its directional signal
* batch normalization
    * Controls the variability of predictors across different regions of the network and across different datasets
    > For each hidden layer in each training step (a “batch”), the algorithm cross-sectionally de-means and variance standardizes the batch inputs to restore the representation power of the unit

## 3 An Empirical Study of US Equity

### 3.1 Data

* From the Center for Research in Security Prices (CRSP)
    * 3 main exchanges
        * https://www.nasdaq.com/market-activity/stocks/screener
        * NYSE
        * AMEX
        * NASDAQ
* Our sample begins in `March 1957` (the start date of the S&P 500) and ends in `December 2016`, totaling `60` years
* we build a large collection of stock-level predictive characteristics based on the cross section of stock returns literature
    * 94 characteristics
        * 61 updated annually
            * delayed at most 6 months
        * 13 updated quarterly
            * delayed at most 4 months
        * 20 updated monthly
            * delayed at most 1 month
    * If a characteristic is missing, it is replaced by that month's average of the characteristic
    * Each month's characteristics are normalized to (-1, 1)
* Earlier studies typically filter stocks by share code or share price, because they have difficulty fitting the returns of low-priced or rarely traded stocks
    * With a rich feature set, this paper does not face that difficulty
* total sample = 30000 stocks
* average number of stocks per month > 6200

### 3.2 Models Comparison Set

* **PCA** (principal components analysis)
    * linear functional form
    * constant beta
    * no conditioning information
* **IPCA** (instrumented principal components analysis)
    * follows KPS
    * linear factor structure
    * conditional betas
* **CA** (conditional autoencoder)
    * $CA_0$
        * 1 layer beta + factor network
        * similar to IPCA
    * $CA_1$
        * beta network
            * add a hidden layer with 32 neurons
    * $CA_2$
        * beta network
            * add a second hidden layer with 16 neurons
    * $CA_3$
        * beta network
            * add a third hidden layer with 8 neurons
    * The factor side always has a single layer, but its number of neurons changes with the number of factors (1 to 6 allowed)
* 6 factors
    * [Carhart 4 factor](#小筆記)
    1. excess market return
    2. add SMB
    3. add HML
    4. add UMD
    5. market, SMB, HML, CMA, and RMW
    6. market, SMB, HML, CMA, and RMW, UMD

:::info
* SMB
    * size factor
* HML
    * value factor
* UMD
    * momentum factor
* CMA
    * investment factor
* RMW
    * profitability factor
* Market, SMB, HML, CMA, RMW, and UMD factor returns are from Ken French’s website
:::

### 3.3 Statistical Performance Evaluation

* Total $R^2$
    * The total R2 summarizes how well the systematic factor risk in a given model specification describes the realized riskiness in the panel of individual stocks.
* Predictive $R^2$
    * When Γα = 0 is imposed, the predictive R2 summarizes the model’s ability to describe risk compensation solely through exposure to systematic risk.
For the unrestricted model, the predictive R2 describes how well characteristics explain expected returns in any form, be it through loadings or through anomaly intercepts
* $\lambda$

### in total $R^2$

>![](https://i.imgur.com/NI394ya.jpg)
>![](https://i.imgur.com/ptjE7KM.jpg)

* IPCA 6-factor(14.5%) $\leftarrow$ $CA_1$ 6-factor(14.3%)
* Why FF performs poorly
    * The model parameters are rarely re-estimated
* IPCA is really strong and very stable

### in predictive $R^2$

>![](https://i.imgur.com/aFX5OTa.jpg)
>![](https://i.imgur.com/LPqdUgq.jpg)

* For individual stock returns $r_t$, CA is almost 2 times IPCA
* PCA and FF have essentially no predictive power

### 3.4 Economic Performance Evaluation

>* Out-of-Sample Sharpe Ratios of Long-Short Portfolios
>![](https://i.imgur.com/VdI6uiP.jpg)
> * $CA_2 \leftarrow CA_1, CA_3 \leftarrow IPCA \leftarrow CA_0$
> * This corroborates the $R^2$ finding that FF and PCA perform poorly
>* Out-of-Sample Factor Tangency Portfolio Sharpe Ratios
>![](https://i.imgur.com/elCVd8F.jpg)
> * The strongest is the 5-factor $CA_3$
> * This table is still useful as a quantitative comparison of the models' mean-variance efficiency from an economic perspective

* we evaluate how return predictions from each model translate into Sharpe ratios for portfolios formed based on those predictions.
* construct a zero-net-investment portfolio that buys the highest expected return stocks (decile 10) and sells the lowest (decile 1). We rebalance portfolios each month, and consider both equal-weighted and value-weighted portfolios.
* The tangency portfolio return for a set of factors is constructed on a purely out-of-sample basis by using the mean and covariance matrix of estimated factors through t and tracking the post-formation t + 1 return.

### 3.5 Risk Premia vs. Mispricing

> * Out-of-sample Pricing Errors Across Models
>![](https://i.imgur.com/1VqOjnn.jpg)
>![](https://i.imgur.com/9Abv6cc.jpg)
>![](https://i.imgur.com/akA46I0.jpg)
> - 95 characteristic-managed portfolios $x_t$
> - $x_t$ is easier to explain than $r_t$ because its dimension has already been reduced
> - These pricing errors can be interpreted as average returns on hedged portfolios whose exposure to every systematic factor is zero; under no-arbitrage, a zero-risk asset should earn zero excess return

* Applying the GKX model (another paper) to this paper's data gives a monthly stock-level predictive $R^2$ of 0.58%
    * GKX is a pure return-prediction model, so it cannot tell whether the predictability comes through risk exposure
    * Even with the no-arbitrage restriction imposed, $CA_2$ reaches about the same value as GKX, roughly 0.58%
* it suggests that stock characteristics predict returns not because they capture “anomalous” compensation without risk, but rather because the characteristics proxy for (and help identify) compensated factor risk exposures
* Hence, under zero intercept and no arbitrage, the pricing errors computed from the data should be close to 0
* alpha measures the pricing error
    * $\alpha_i := E(u_{i,t})= E(r_{i,t})-E(\beta_{i, t-1}'f_t)$
    * expected stock return - expected return implied by the model = pricing error

### 3.6 Characteristics Importance

> * Top Twenty Characteristics by Variable Importance
>![](https://i.imgur.com/9GaArO1.jpg)
>![](https://i.imgur.com/He2SVYW.jpg)
> * The top 20 characteristics account for 80% of total importance in $CA_0$ and 90% in $CA_1$ to $CA_3$
> * The characteristics fall into three categories
>     * The most influential is the price trend category
>         * mom1m (short-term reversal): stocks with relatively low returns over the past month or week earn positive abnormal returns over the following month or week, while stocks with high returns earn negative abnormal returns
>         * mom12m (stock momentum)
>         * chmom (momentum change)
>         * indmom (industry momentum)
>         * maxret (recent maximum return)
>         * mom36m (long-term reversal)
>     * The second category is liquidity variables
>         * turn (turnover)
>         * std_turn (turnover volatility)
>         * mvel1 (log market equity)
>         * dolvol (dollar volume)
>         * ill (Amihud illiquidity)
>         * zerotrade (number of zero trading days)
>         * baspread (bid-ask spread)
>     * The third category is risk measures
>         * retvol (total return volatility)
>         * idiovol (idiosyncratic return volatility)
>         * beta (market beta)
>         * betasq (beta-squared)

* Variable importance is measured by setting all values of a given characteristic to zero while holding the remaining model estimates fixed, and recording the resulting drop in fit

### 3.7 Robustness Check

> * Using Subsamples of Stocks Split by Odd or Even Permnos
>![](https://i.imgur.com/nmAlQIW.jpg)
> * permno: stock identifier
> * N = 29892, odd = 14984, even = 14908
> * Computed for the 5-factor $CA_2$
> * Top panel: total $R^2$
> * Bottom panel: return per unit of risk

* Even when the data is split this way, the results remain stable

Last but not least, we demonstrate the robustness with respect to the choice of assets in the training and testing samples. In particular, we re-train the $CA_2$ model using subsamples of stocks comprised of odd or even permnos, respectively. We report the out-of-sample total $R^2$(%), predictive $R^2$(%), equal-weight and value-weight Sharpe ratios for the subsamples in Table 5. Throughout, the $CA_2$ model performs almost equally well, even when the assets used in the training and testing samples are completely non-overlapped.

## 4 Monte Carlo Simulations

> * (a) Linear factor loadings
> ![](https://i.imgur.com/HER0uwz.jpg)
> * (b) Non-linear factor loadings
> ![](https://i.imgur.com/Z4UfVsG.jpg)

* sample
    * Split by time into subsamples of equal length
        * training, validation, testing
    * For PCA and IPCA the training and validation samples are merged, because they have no tuning parameters
* Average OOS total $R^2$ and predictive $R^2$
    * 100 Monte Carlo repetitions
* Results
    * model (a)
        * IPCA is best in both total and predictive $R^2$
            * because the input covariates enter sparsely and linearly
        * $CA_1$ to $CA_3$ overfit, so they score slightly lower (possibly a sign that the models are too flexible for this setting)
    * model (b)
        * The CAs beat IPCA
            * because IPCA cannot capture the nonlinearity in the model
    * PCA always overfits, so it performs poorly
    * This illustrates the tradeoff between model flexibility and implementation difficulty
        * shallower conditional autoencoders are **strong**
    * CA succeeds in both the linear and the nonlinear setting
        * because it uses a more flexible functional form

## 5 Conclusion

>We propose a new approach to latent factor modeling for asset pricing that draws on autoencoder methods from the machine learning literature. We adapt the standard autoencoder to allow latent factors and factor exposures to depend on asset characteristic conditioning variables.
>
>The result is a nonlinear conditional asset pricing model that embeds the economic restriction of no-arbitrage within a broader neural network framework.
>
>In the empirical context of monthly US stock returns, our conditional autoencoder model dominates competing asset pricing models, including Fama-French models, PCA methods, and linear conditioning methods such as IPCA.
>
>long-short decile spread portfolios sorted on stock return predictions from our preferred autoencoder produces an annualized value-weighted Sharpe ratio of 1.53, beating the next closest competitor (IPCA, with Sharpe ratio 0.96) by a wide margin, and on a purely out-of-sample basis.
>
> Finally, the pricing errors in our model (likewise measured on an out-of-sample basis) are a fraction of the magnitude of those from traditional Fama-French factor models.

---

# 小筆記

***Exposure factor***: a percentage value representing the loss rate of an asset when it is hit by a given risk.

Covariate : in a study, a variable (= a quantity that can change) that may affect the result of what is being studied

Exposure factor (EF) : a subjective value that the person assessing the risk must define.

arbitrage : when a real or financial asset has two prices (in the same market or in different markets), buying at the lower price and selling at the higher price to earn a low-risk profit.

factor loading : the correlation between an individual variable and a factor, ranging from -1 to 1; it indicates how much weight the variable carries in the factor, or how close the variable is to the factor.

Lag :
1. a late payment
2. in a quantitative model, the length of past time considered when forecasting a future variable

firm characteristic : demographic and managerial variables of a firm, i.e. information that describes the company, e.g. size, sales, turnover

Net Asset : a firm's assets minus its liabilities

Price-to-Book Ratio : market price per share / net assets per share

Factor model : used to find the relationship between a given factor and returns

idiosyncratic risks : also called unsystematic risk; the risk of price changes caused by circumstances unique to a particular security. Spreading funds across many assets can greatly reduce some of this risk but cannot eliminate all risk.

idiosyncratic error : unobserved factors that affect the dependent variable, e.g. in panel data, those that vary over time and across units (individuals, firms, cities, etc.)

Long position : holding a purchased financial product without selling it. It usually **assumes the market will rise: buy low, sell high**
* A long position is a portfolio of securities expected to rise in value

Short position (short selling) : when an investor expects a stock's price to fall, they borrow the stock from a broker and sell it; once the price has fallen, they buy it back at the lower price and return it to the broker, earning the difference.
* A short position is a portfolio of securities expected to fall in value.

demeaned returns : values with the mean subtracted ($x - \mu$)
> Demeaned returns are the stream of returns over a measurement period after subtracting the mean return over the period. Demeaned returns are used for the calculation of variance, standard deviation, covariance and correlation

alpha : an alpha of 1% means that over the chosen period the investment's return was 1% higher than the market's over the same period

Frobenius norm : the square root of the sum of squared matrix entries, $\|A\|_F = \sqrt{\sum_{i,j} a_{ij}^2}$

Carhart four-factor model :
* The Carhart four-factor model adjusts raw returns to control for the effect of systematic risk on stocks, yielding abnormal returns net of the risk factors.
* Based on Fama-French
* [Reference 1](https://wiki.mbalib.com/zh-tw/Carhart%E5%9B%9B%E5%9B%A0%E7%B4%A0%E6%A8%A1%E5%9E%8B)
* [Reference 2](https://en.wikipedia.org/wiki/Carhart_four-factor_model)
* [Reference 3](https://ah.nccu.edu.tw/bitstream/140.119/34088/9/57009109.pdf)

Sharpe Ratios : the extra return earned per additional unit of risk borne by the investor

Risk Premium : higher risk, higher reward
* [wiki](https://zh.wikipedia.org/wiki/%E9%A2%A8%E9%9A%AA%E6%BA%A2%E5%83%B9)

t-dist : used to estimate the mean of a normally distributed population with unknown standard deviation from a small sample. If the population standard deviation is known, or the sample is large enough (asymptotically normal by the Central Limit Theorem), the normal distribution should be used instead.

LASSO : selects the most useful features and shrinks the weights of the rest

{%youtube tgn1jdt2I9M %}

exposure : the amount an investor may lose on an investment.

coefficient of determination : [wiki](https://zh.wikipedia.org/wiki/%E5%86%B3%E5%AE%9A%E7%B3%BB%E6%95%B0) [Reference](https://zhuanlan.zhihu.com/p/99403775)

# Open Questions

:::danger
[Autoencoder_paper](https://poseidon01.ssrn.com/delivery.php?ID=342110087101018125106084088099111006002092063023032070066028028098081080119101007117121123020058055102054114067082093014000119049055017040015001068101104122068122059053095086101069117101089079000075099083084098002024097120009022096020099073003118124&EXT=pdf&INDEX=TRUE)
* characteristic-managed portfolios
* conditional information
* P.10: it is unclear what $Z_{t-1}$ in equation 16 is used for
    * This preprocessing step in (16) can be viewed as adding a new initial layer to the factor neural network that dynamically (as a function of $Z_{t−1}$) collapses the $N$ returns, $r_t$, down to $P$ neurons, $x_t$, before proceeding with the rest of the network propagation.
    * In the figure on page 8, both the beta model and the factor model use N and P, where N refers to stocks or firms and P to characteristics
    * The factor model compresses the N returns into P values; it is unclear what it means to turn firms into characteristics
* P.13
    * To avoid a forward-looking bias, we assume that monthly characteristics are delayed by at most one month, quarterly releases are delayed with a four month lag, and annual releases with a six month lag. Thus, we match realized returns at month t with the most recent monthly characteristics at the end of month t − 1, the most recent quarterly data as of t − 4, and most recent annual data as of t − 6.
        * Do "delay" and "lag" here both mean a reporting delay?
        * If so, why is month t matched with t-1, t-4, and t-6?
    * missing characteristic
        * It seems to be replaced by that month's average of the characteristic?
* P.14
    * Section 3.2, paragraph 2, line 2: for $CA_0$ the paper says `uses a single linear layer`. Does that mean there is exactly one hidden layer, or that the input layer feeds directly into the output layer?
* P.19
    * Section 3.5, line 8: That is, the conditional autoencoder models are all specified without an `intercept`, thus they impose the economic restriction of no-arbitrage
        * What does the intercept refer to here?
* P.20
    * alpha (error rate?)
        * t > 3
        * t < 3
        * It is unclear what these mean
:::

## Content Still Needed

* The role of time
* The meaning of the variables
* The data
* Evidence of effectiveness
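
As a starting point for the variable-meaning item, and for the question about $Z_{t-1}$ in equation (16), the shapes involved can be checked with a small NumPy sketch. This is only an illustration under stated assumptions, not the authors' implementation: the helper names `managed_portfolios` and `ca1_forward`, the random weights, the toy size `N = 200`, and the bias term on the factor side are all hypothetical. It follows the $CA_1$ description in 3.2 (one hidden layer of 32 ReLU neurons on the beta side, a single linear layer on the factor side).

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def managed_portfolios(Z_lag, r):
    """x_t = (Z'Z)^{-1} Z' r_t: collapse N stock returns into P portfolio returns."""
    # lstsq solves the same least-squares problem without forming an explicit inverse
    x, *_ = np.linalg.lstsq(Z_lag, r, rcond=None)
    return x

def ca1_forward(Z_lag, r, params):
    """One-month forward pass of a CA_1-style conditional autoencoder (sketch)."""
    W0, b0, W1, b1, W_f, b_f = params
    hidden = relu(Z_lag @ W0 + b0)    # beta side, hidden layer: (N, 32)
    beta = hidden @ W1 + b1           # beta side, output: (N, K) factor loadings
    x = managed_portfolios(Z_lag, r)  # factor side, input layer: (P,)
    f = x @ W_f + b_f                 # factor side, single linear layer: (K,)
    r_hat = beta @ f                  # fitted returns beta' f for each stock: (N,)
    return beta, f, r_hat

# Toy sizes (assumed): N stocks, P characteristics, K factors, H hidden neurons
rng = np.random.default_rng(0)
N, P, K, H = 200, 94, 5, 32
Z_lag = rng.uniform(-1, 1, size=(N, P))  # characteristics at t-1, scaled to (-1, 1)
r = rng.normal(0, 0.1, size=N)           # returns at t
params = (
    rng.normal(0, 0.1, (P, H)), np.zeros(H),
    rng.normal(0, 0.1, (H, K)), np.zeros(K),
    rng.normal(0, 0.1, (P, K)), np.zeros(K),
)
beta, f, r_hat = ca1_forward(Z_lag, r, params)
print(beta.shape, f.shape, r_hat.shape)  # (200, 5) (5,) (200,)
```

In this sketch $Z_{t-1}$ appears twice: as the input to the beta network, and as the regressor matrix that turns the $N$ returns into $P$ characteristic-managed portfolio returns. Because the factor side only sees the $P$-vector $x_t$, its weights do not depend on how many stocks exist in a given month.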