---
tags: finance, investment, time series analysis, statistics
---

# Notes on Financial Analytics

## Instructor

* Prof. [繆維中](http://www.fn.ntust.edu.tw/03_08.htm)

## Lecture notes

### 20190429

* Seminar (speaker: Dr. Max Chen)
* Black-Litterman Model
    * [https://zhuanlan.zhihu.com/p/38282835](https://zhuanlan.zhihu.com/p/38282835)
    * Books by Litterman
        * [Modern Investment Management: An Equilibrium Approach](https://www.amazon.com/Modern-Investment-Management-Equilibrium-Approach/dp/0471124109)
        * [Global Asset Allocation: New Methods and Applications](https://www.amazon.com/Global-Asset-Allocation-Methods-Applications/dp/0471264261)
* Why study theory?
    * Level 1: explain phenomena, $r_t = f(x_t) + \epsilon_t$
    * Level 2: predict phenomena, $r_{t + 1} = f'(x_{t}) + \epsilon_{t + 1}$ ==Empirically, $f$ and $f'$ turn out to be very close; the real difficulty is the noise term $\epsilon$. Sometimes (that is, when you lose money) $\epsilon$ is simply not on your side.==
    * Question: most financial models have a very low $R^2$. Why?
* Within an assumed world (the choice of model), look for the gap between the market price and the theoretical price; after accounting for risk and transaction costs, arbitrage until the two prices converge. ==Does everything converge to one?==
    > "All models are wrong, but some are useful." (George Box)
    * The father of time series analysis: [George Box](https://en.wikipedia.org/wiki/George_E._P._Box)
* What if the model is wrong? Face the market with humility and correct mistakes promptly. ==Stop-loss strategy? Hedging strategy?==
* Pricing, hedging, arbitrage ==School only taught the first two.==
* BARRA: multi-factor model
    * Have practitioners in China already built such models? ==Check the online platforms.==
* Some famous firms:
    * [Morningstar](https://www.morningstar.com/): a rating firm
    * [https://www.simonsfoundation.org/](https://www.simonsfoundation.org/)
    * [D. E. Shaw group](https://www.deshaw.com/)
* Check [currency overlay](https://en.wikipedia.org/wiki/Currency_overlay).
* The framework of option pricing was built by [Fischer Black](https://en.wikipedia.org/wiki/Fischer_Black) & [Myron Scholes](https://en.wikipedia.org/wiki/Myron_Scholes), and Merton (1973).
    * Only Scholes and Merton received the economics prize.
    * Black was a theoretical physicist, Scholes was the programmer, and Merton was the brilliant mathematician. Black kept describing economic phenomena with physical intuition, which is why the economics community of the time could not accept his work. He made the world remember his contribution in his own way. ==Read the three landmark papers and the autobiography.==
* Reading materials
    * [原則：生活和工作](https://www.momoshop.com.tw/goods/GoodsDetail.jsp?i_code=5582099) (Principles: Life and Work)
    * [Ray Dalio的橋水基金《原則 Principles》：被下載三百多萬次的驚人秘訣](https://redhouse.statementdog.com/archives/4344)
    * [他是賭神，更是股神：從賭城連贏到華爾街的天才數學家，關於風險、財富和人生的第一手告白](https://www.momoshop.com.tw/goods/GoodsDetail.jsp?i_code=5874194)
    * [用數學，交易選擇權](https://www.momoshop.com.tw/goods/GoodsDetail.jsp?i_code=6146356)
* [Fama–French three-factor model](https://en.wikipedia.org/wiki/Fama%E2%80%93French_three-factor_model)
    * Eugene Fama and Kenneth French (1993).
    * In 2013, Fama shared the Nobel Memorial Prize in Economic Sciences.
    * The capital asset pricing model (CAPM) uses only one variable, the return of the market as a whole, to describe the returns of a portfolio or stock.
    * Fama and French use 3 variables: $$R_p - R_f \sim \beta (R_m - R_f) + b_s \mbox{SMB} + b_v \mbox{HML} + \alpha,$$ where SMB refers to **S**mall **M**inus **B**ig (market cap.), and HML refers to **H**igh **M**inus **L**ow (book-to-market ratio).
* Question: what is the beta of an option?

### 20190506

* [Stationary process](https://en.wikipedia.org/wiki/Stationary_process)
    * strongly stationary
    * weakly stationary
    * N-th order stationarity
* Autoregressive and Moving-Average Model (ARMA)
    * $\mathbf{E}(r_t)$: the mean of return rates follows an ARMA process of the form $$y_t = \phi_0 + \sum_{i = 1} ^ p \phi_i y_{t - i} + \epsilon_{t} + \sum_{i = 1} ^ q \theta_i \epsilon_{t - i},$$ where $\epsilon_{t}$ is [white noise](https://en.wikipedia.org/wiki/White_noise): iid for all $t$, with $\mathbf{E}(\epsilon_t) = 0$ and finite variance. ==Why MA? Besides yesterday's own value ($y_{t - 1}$), $y_t$ also carries a part we cannot control, which comes from $\epsilon_{t - 1}$ (or more terms if necessary).==
        > The moving-average model should not be confused with the moving average, a distinct concept despite some similarities.
        (See [moving-average model](https://en.wikipedia.org/wiki/Moving-average_model) from Wikipedia.)
    * ==For financial applications, we focus on the zero-mean process because we are dealing with **RETURN**s.==
* Condition for stationarity of AR(1): $|\,\phi_1\,| < 1.$ For AR($p$), all roots of the AR characteristic equation must lie outside the unit circle.
* Unit root: not a stationary process, but a random walk!
    * The variance of a random walk grows without bound over time.
    * How to test whether a time series has a unit root? See [unit root test](https://en.wikipedia.org/wiki/Unit_root_test).
* Duality
    * A stationary AR($p$) -> MA($\infty$) for any $p$.
    * An invertible MA($q$) -> AR($\infty$) for any $q$.
* Autocorrelation function (ACF)
    * To find the autocorrelations at all lags, solve the [Yule-Walker equation](https://en.wikipedia.org/wiki/Autoregressive_model#Yule%E2%80%93Walker_equations) (by computer).
    * For an AR(1) process, the autocorrelation decays exponentially in the lag $l$ (i.e., $\rho_l = \phi_1^l$).
    * To model a periodic process (such as the business cycle), use an AR(2) process.
    * When the AR(2) characteristic roots are complex ($\phi_1^2 + 4\phi_2 < 0$), the average length of the pseudo-cycle of the time series is $$K = \dfrac{2 \pi}{\cos^{-1}\left(\frac{\phi_{1}}{2\sqrt{-\phi_{2}}}\right)}.$$
* Partial autocorrelation function (PACF)
    * The PACF at lag $l$ is the last coefficient of a fitted AR($l$) model, i.e., the contribution of $y_{t - l}$ that the shorter lags cannot explain. ==That is why **PARTIAL**.==
    * For PACF, $$\phi_{k, k} = \mathbf{Corr}(y_t ,y_{t - k}\,|\, y_{t - 1}, y_{t - 2}, \ldots, y_{t- k + 1}).$$
    * For ACF, $$\rho_{k} = \mathbf{Corr}(y_t ,y_{t - k}).$$
    * For an AR($p$) process, $\phi_{p + 1, p + 1} = 0$, and likewise for all higher lags. ==Cutoff after lag $p$==
* (Review) **Statistical inference** is generally regarded as comprising **hypothesis testing** and **estimation**. Hypothesis testing can be done via AIC (see the AIC links in these notes). Regarding estimation, there are two types: **point estimation** and **interval estimation**.
* How to determine $p$ and $q$?
    * In practice, we follow the parsimony principle.
    * [Box-Jenkins method](https://en.wikipedia.org/wiki/Box%E2%80%93Jenkins_method#Identify_p_and_q)
    * AIC and BIC
        * [Akaike Information Criterion](https://en.wikipedia.org/wiki/Akaike_information_criterion)
            > AIC deals with the trade-off between the goodness of fit of the model and the simplicity of the model.
            > AIC deals with both the risk of overfitting and the risk of underfitting.
            > AICc is AIC with a correction for small sample sizes.
            * $\text{AIC} = 2k - 2\ln\hat{L}$: a goodness-of-fit term plus a penalty on the number of parameters $k$.
        * [Bayesian Information Criterion](https://en.wikipedia.org/wiki/Bayesian_information_criterion)
            * $\text{BIC} = k \ln n - 2\ln\hat{L}$: the same goodness-of-fit term, with a penalty that also grows with the log of the sample size $n$.
            * Gideon E. Schwarz (1978).
            * The BIC suffers from two main limitations:
                * the approximation behind it is only valid for a sample size $n$ much larger than the number $k$ of parameters in the model;
                * the BIC cannot handle complex collections of models, as in the variable selection (or feature selection) problem in high dimensions.
    * MLE and Bayesian estimator
        * Read [Maximum Likelihood Estimation v.s. Bayesian Estimation](https://medium.com/datadriveninvestor/maximum-likelihood-estimation-v-s-bayesian-estimation-bfac171a8b85).
    * [Comparisons with other model selection methods](https://en.wikipedia.org/wiki/Akaike_information_criterion#Comparisons_with_other_model_selection_methods)
* Model checking
    * Let $\text{SS}_{tot}$ be the total sum of squares (proportional to the variance of the data), and $\text{SS}_{res}$ be the sum of squared residuals.
    * The coefficient of determination, $$R ^ 2 = 1 - \dfrac{\text{SS}_{res}}{\text{SS}_{tot}},$$ is the proportion of the variance in the dependent variable that is predictable from the independent variable(s).
    * https://www.statisticshowto.datasciencecentral.com/adjusted-r2/
    * Usually we assume that the residual is white noise following a normal distribution.
* Construction of ARMA model
    * Stage 1: unit root test
        * If the unit-root null is rejected (the series is stationary), inspect the ACF/PACF.
        * If it cannot be rejected, difference the series and use ARIMA.
    * Stage 2: estimation
        * Model selection criteria: try all combinations of the lags with relatively large autocorrelations.
    * Stage 3: ?
* $\mathbf{Var}(r_t)$: the variance of return rates follows a GARCH (generalized autoregressive conditional heteroskedasticity) process.
    * ARCH was introduced by Robert F. Engle (1982), a winner of the Nobel Memorial Prize in Economic Sciences (2003); Tim Bollerslev (1986) generalized it to GARCH.
    * Goal: explain **volatility clustering**?
    * Construction?
        > The first is to estimate a best-fitting autoregressive model. The second is to compute autocorrelations of the error term. The third step is to test for significance. Two other widely used approaches to estimating and predicting financial volatility are the classic historical volatility (VolSD) method and the exponentially weighted moving average volatility (VolEWMA) method.
* AR($p$) vs. MA($q$)
    * ACF / PACF ![](https://i.imgur.com/NiGDRVX.jpg =400x)
    * We prefer AR($p$). ==Do you really believe your returns come from a pile of past noise???==
    * MA($q$) is always stationary.

### 20190513: seminar

* Instructor info:
    * Dr. 董夢雲
    * Lecture notes: [github](https://github.com/andydong1209)
* Programming capability ==Is the price you quote based on labor cost or on market value?==
    * [C++ Boost](https://en.wikipedia.org/wiki/Boost_(C%2B%2B_libraries))
        * [https://www.boost.org/](https://www.boost.org/)
        * [A Short Introduction to Selected Classes of the Boost C++ Library](https://www.quantlib.org/slides/dima-boost-intro.pdf)
        * [boost: Getting Started on Unix Variants](https://www.boost.org/doc/libs/1_70_0/more/getting_started/unix-variants.html)
        * Online examples from [Boost C++ Application Development Cookbook](https://github.com/apolukhin/Boost-Cookbook)
    * [QuantLib](https://www.quantlib.org/)
        * C++
            * [How I discovered the C++ algorithm library and learned not to reinvent the wheel](https://medium.freecodecamp.org/how-i-discovered-the-c-algorithm-library-and-learned-not-to-reinvent-the-wheel-2398a34e23e3)
        * Python (by [SWIG](http://swig.org/))
* Additional remarks by Prof. Miao
    * Major classes of financial products
        * Equity, FX
        * Credit
        * Interest rate
            * [Hull-White model](https://en.wikipedia.org/wiki/Hull%E2%80%93White_model), which is a short rate model
            * [LIBOR market model](https://en.wikipedia.org/wiki/LIBOR_market_model), which is a forward rate model
        * Structured product?
    * Notable people
        * [Damiano Brigo](https://www.imperial.ac.uk/people/damiano.brigo)
        * [Mark S. Joshi](https://en.wikipedia.org/wiki/Mark_S._Joshi)
* Random number generator
    * [Sobol sequence](https://en.wikipedia.org/wiki/Sobol_sequence)
        * [Construction and Comparison of High-Dimensional Sobol's Generators](http://www.broda.co.uk/doc/HD_SobolGenerator.pdf)
    * [Mersenne Twister](https://en.wikipedia.org/wiki/Mersenne_Twister)
        * MT19937, with period $2^{19937} - 1.$
    * LCG (linear congruential generator)
        > Any Monte Carlo simulation should use an LCG with a modulus greater and preferably much greater than the cube of the number of random samples which are required. This means, for example, that a (good) 32-bit LCG can be used to obtain about a thousand random numbers; a 64-bit LCG is good for about $2^{21}$ random samples which is a little over two million, etc. For this reason, LCGs are in practice not suitable for large scale Monte Carlo simulations.
        > LCGs are not intended, and must not be used, for cryptographic applications; use a cryptographically secure pseudorandom number generator for such applications.
    * How to test the quality of random number sequences? [Randomness tests](https://en.wikipedia.org/wiki/Randomness_tests).
* How do banks in Taiwan derive the discount curve?
    * [Taiwan TAIBOR: Fixing Rate: Month End: 3 Months](https://www.ceicdata.com/en/taiwan/taibor-and-interbank-call-loan-rates/taibor-fixing-rate-month-end-3-months)
    * [Interest rate swap](https://en.wikipedia.org/wiki/Interest_rate_swap)
    * TAIBOR + IRS curve + credit ranking?
        * Taiwan uses nine credit grades.
    * Most banks and brokers are on the buy side; a few large banks are on the sell side. Besides quoting prices, the sell side also has to hedge intraday, which requires a lot of computing power.
        * 兆豐 (Mega Bank): positions of about six trillion

### 20190520

* Review: what is stationary?
    * At any time, the mean and the variance are both constant (independent of time).
    * For any lag, the autocorrelation depends only on the lag, not on time.
* Review: necessary and sufficient condition for an AR$(p)$ process to be stationary?
    * $|\,$ roots of AR characteristic equation $\,| > 1$
    * Note that the *root* of the characteristic equation is the reciprocal of the characteristic root. ==This is the convention.==
    * If not? Divergent!
* Why is an MA$(q)$ process always stationary?
    * MA: a finite weighted sum of iid terms (white noise!!)
* **Invertibility**: MA$(q) \rightarrow$ AR$(\infty)$
    * Concept: the distant past of the series has little influence on its present value.
    * Assume that $u_t = y_t + \theta_1 u_{t - 1}$.
    * Then we have $u_t = y_t + \sum_{i = 1}^{\infty} \theta_1^i y_{t - i}.$
    * So $u_t$ is bounded iff $|\, \theta_1 \,| < 1$.
* Necessary and sufficient condition for an MA$(q)$ process to be invertible?
    * $|\,$ roots of MA characteristic equation $\,| > 1$ ==Duality between MA and AR: the invertibility condition of MA vs. the stationarity condition of AR.==
* ARMA$(1, 1)$
    * $y_t - \phi_1 y_{t - 1} = u_t + \theta_{1} u_{t - 1}.$
    * $(1 - \phi_1 L)y_{t} = (1 + \theta_{1} L) u_t.$
* **Forecasting**
    * $\hat{r}_h(l):$ the forecast of the return at time $h + l$, made at time $h$.
    * Long-term prediction is trivial because this is a stationary (**mean-reverting**) process: the forecast converges to the unconditional mean.
    * So the model is useful for short-term prediction. The question is, **how long?**
* AR$(p)$, MA($q$), or ARMA($p, q$)?
    * You can always convert a stationary, invertible ARMA$(p, q)$ into a pure AR process or a pure MA process.
    * In the MA form, the coefficients are the magnitudes of the impulses.
    * **Impulse response function (IRF)**
        * See https://www.mathworks.com/help/econ/armairf.html
    * Application
        * https://onlinelibrary.wiley.com/doi/abs/10.1111/ecin.12053
        * The paper never mentions the ARMA model.
        * It takes an MA-style approach in which the MA components are built from oil-sensitive stock prices that are unrelated to the oil price itself: $$y_t = \alpha + \sum_h \beta_{h} x_{t - h} + u_t.$$ ==ARMA is a framework; using it requires bringing in economic or financial insight.==
* **G**eneralized **A**uto**r**egressive **C**onditional **H**eteroskedasticity (GARCH) model
    * ARCH was proposed by Engle (1982) and generalized to GARCH by Bollerslev (1986). Engle and Granger shared the Nobel prize in 2003.
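    * Engle: **time-varying volatility**
    * Granger: **cointegration**
    * Recall that the ARMA process describes the mean of the time series but does not describe its variance.
    * Consider $y_t - \boxed{\mbox{mean equation}} = u_t$, where under constant variance $u_t \sim N(0, \sigma^2).$
    * Instead take $u_t = \sigma_t v_t$, where $v_t \sim N(0, 1)$ and $$\sigma_t^2 = \mathbf{Var}(u_t\,|\, \mathbb{F}_{t - 1}).$$
    * Special case: AR(1)-GARCH(1, 1)
        * $\sigma_t ^ 2 = \boxed{\alpha_0 + \alpha_1 u_{t - 1} ^ 2 }+ \beta_1 \sigma_{t - 1} ^ 2.$
        * The box indicates the ARCH part.
        * The condition for stationarity is $\alpha_1 + \beta_1 < 1$ (equivalently, $1 - \alpha_1 - \beta_1 > 0$).
* Two models used in practice to estimate risk:
    * EWMA: exponentially-weighted moving-average $$\sigma_t ^ 2 = (1 - \lambda) u_{t - 1} ^2 + \lambda \sigma_{t - 1} ^2$$
        * Not stationary (it has a unit root), but very useful. ==example?==
    * GARCH(1, 1)
        * Engle assumes that the variance is stationary. ==example?==
* Why GARCH rather than ARCH?
    * With ARCH alone, many lags are needed.
    * Using the previous period's variance estimate already gives a decent result!!! ==It is said that financial data are basically AR(0)-GARCH(1, 1), meaning that the market is extremely efficient and past data do not help at all.==

### 20190527

* **Mean equation** vs. **variance equation**
    * In $\mathbf{E}(\boxed{y_t}\,|\,\mathbb{F}_{t - 1}) = \mu_t$ the protagonist is $y_t$, and the last term of $y_t = \phi_0 + \phi_{1} y_{t - 1} + \boxed{\epsilon_t}$ only becomes known at time $t$. ==The protagonist is unknown.==
    * $\mathbf{Var}(y_t\,|\,\mathbb{F}_{t - 1}) = \boxed{\sigma_t^2} = \alpha_0 + \alpha_{1} u_{t - 1}^2 + \beta_1 \sigma_{t -1} ^ 2$ needs no residual term, because what we want is exactly the estimate of $\sigma_t ^ 2$ based on past information. ==Miao: know which equation needs a residual term!==
* Kurtosis of ARCH(1)
    * The kurtosis of ARCH(1) is greater than three, which lets ARCH reproduce fat tails.
* Maximum likelihood estimation (MLE)
    * In the regression $y_t = \boxed{\alpha + \beta x_t} + u_t$ we assume $u_t \sim \text{norm}(0, \sigma^2)$, and the boxed part is the mean equation, so $y_t \sim \text{norm}(\alpha + \beta x_t, \sigma^2)$. ==This is also why we talk about the normal distribution most of the time!!!==
* Difference between EWMA and GARCH:
    * EWMA: $$\sigma_t^2 = (1 - \lambda) u_{t - 1} ^2 + \lambda \sigma_{t - 1} ^ 2$$
    * GARCH: $$\begin{aligned}\sigma_t ^ 2 &= \alpha_0 + \alpha_1 u_{t - 1}^2 + \beta_1 \sigma_{t - 1} ^ 2 \\ &= \boxed{(1 - \alpha_1 - \beta_1)}\, \sigma^2 + \alpha_1 u_{t - 1}^2 + \beta_1 \sigma_{t - 1} ^ 2,\end{aligned}$$ where $\sigma^2$ is the long-term variance!!!
        * Recall that the unconditional variance is $\mathbf{Var}(u_t) = \sigma^2$, so $\alpha_0 = (1 - \alpha_1 - \beta_1)\sigma^2.$ ==Conceptually, GARCH is a weighted average of three terms (long-run variance, last squared shock, last variance), whereas EWMA is a weighted average of the last period's terms only.== ![](https://i.imgur.com/Vh305UJ.jpg =400x)
* On construction of ARMA model ==Draw the flowchart yourself.==
    * Miao: $R^2 < 0.05$ is to be expected.
* Model checking: **what counts as a good fit?**
    * If the ARMA model fits well, the residuals should be iid, i.e., their ACF/PACF show no autocorrelation.
    * If the GARCH model fits well, the **standardized** residuals should also be iid, because the job of GARCH is to describe the heteroskedastic variance properly.
    * Recall: GARCH describes the variance of the residual term of the ARMA model.
    * The Ljung-Box test has to be run twice:
        * on the standardized residuals (checks the mean equation)
        * on the squared standardized residuals (checks the variance equation) ![](https://i.imgur.com/GTfjIRj.jpg =400x)
* Summary
    * The final form of a time series analysis should come back to a conclusion that people can feel directly, for example VaR. ![](https://i.imgur.com/vkTi4wo.jpg =400x)

### 20190603

* Report 1
    * For the unit root test, $$\Delta y_t = y_{t} - y_{t - 1} = \ldots + \boxed{\phi} y_{t - 1} + \ldots,$$ and we choose $H_0: \phi = 0$ (has a unit root).
        * $H_1: \phi < 0$ (no unit root).
        * If $H_0$ is rejected, then the time series $y_t$ is stationary.
    * For LB test, ?
    * For model diagnosis, remember to **check if the confidence interval covers the zero point**.
    * [Theil-Sen estimator](https://en.wikipedia.org/wiki/Theil%E2%80%93Sen_estimator)
        * In non-parametric statistics, the Theil-Sen estimator is a method for robustly fitting a line to sample points in the plane (simple linear regression) by **choosing the median of the slopes of all lines through pairs of points**.
* Report 2
    * White noise: no autocorrelation
    * Using MLE already assumes that the residuals are normally distributed, so afterwards this has to be checked, e.g., with the KS test.
* Report 3
    * Stationary? If there is no obvious trend, check whether the standard deviation is uneven over time; if it is, add a GARCH model to the series to describe the variance equation. ==Differencing twice or more is very rare (Miao: I have never seen it).==
* How to describe the relationship between two time series?
    * correlation? But it carries no notion of time!
    * pairs trading: **cointegration**
* Order of integration
    * $I(0)$: stationary process (no unit root)
    * $I(1)$: a series that becomes stationary after differencing once. For example, stock prices turn into returns after one difference, so we say that prices are an $I(1)$ series (and returns are $I(0)$).
    * $I(2)$: stationary only after differencing twice. ==Miao: a series that only behaves after being hit twice.== ![](https://i.imgur.com/f0qzau9.jpg =400x)
* Stationary process: mean-reverting (MR)
* Engle-Granger 2-step procedure (cointegration test)
    * Step 1: regress $\log(P_t^Y)$ on $\log(P_t^X)$ and consider the residual series $$\hat{u}_t = \log(P_t^Y) - \left[\alpha + \beta \log(P_t^X)\right].$$
    * Step 2: test $\hat{u}_t$ for a unit root; if $\hat{u}_t$ is stationary, the two series are cointegrated. ![](https://i.imgur.com/3FvdBho.jpg =400x) ![](https://i.imgur.com/md9HQrC.jpg =400x)
* A time-series version of the correlation coefficient ==Miao: making good use of the spirit of GARCH helps to build models for many different time series.== ![](https://i.imgur.com/LDH8iWG.jpg =400x)
    * CCC-GARCH
        * https://www.value-at-risk.net/ccc-garch/
        * http://sfb649.wiwi.hu-berlin.de/fedc_homepage/xplore/tutorials/sfehtmlnode68.html
    * DCC-GARCH
        * https://core.ac.uk/download/pdf/52106361.pdf

### 20190610

* Dr. 曾凱逸: empirical applications of spillover-effect GARCH and VAR
    * Fund manager
* Vector autoregression (VAR)
    * https://en.wikipedia.org/wiki/Vector_autoregression
* After the 2008 financial crisis came the European debt crisis of 2009 (PIGS)
    * https://en.wikipedia.org/wiki/European_debt_crisis ![](https://upload.wikimedia.org/wikipedia/commons/c/c3/Long-term_interest_rates_%28eurozone%29.png =400x)
* Macroeconomic factors
    * Gold price vs. USD
    * Crude oil price vs. USD
    * Commodity spot index vs. USD
    * MSCI vs. USD
    * US 10-year Treasury yield vs. USD
    * S&P 500 vs. USD
* US Dollar Index
    * https://en.wikipedia.org/wiki/U.S._Dollar_Index ![](https://upload.wikimedia.org/wikipedia/commons/5/54/US_Dollar_Index_from_Stooq_dot_com.png =500x)
* Granger causality test
    * https://en.wikipedia.org/wiki/Granger_causality
* Global stock markets can be regarded as a single market; commodity markets follow supply and demand more closely and behave differently from stock markets.
    * See the literature listed in the slides, concentrated in 2011~2012.
    * Financialization phenomena: increasing homogeneity makes Markowitz's portfolio theory less effective.
* The **Lehman Brothers Bankruptcy** marks an important point in time in the academic literature; for example, the behavior of commodity futures markets (e.g. VIX, CDS) started to resemble that of the stock market.
    * Sovereign CDS
        * http://www.worldgovernmentbonds.com/sovereign-cds/ ![](https://i.imgur.com/wsNxUqp.png =600x)
    * The **commercial paper market** began to show liquidity risk before the Lehman Brothers bankruptcy.
        * Commercial paper is used to raise short-term funds. (See http://www.tbfa.org.tw/business/short_main_01.html)
* Miao's questions
    * Finance people and traders prefer intraday time series / high frequency data.
    * Economists and macro researchers prefer daily, weekly, or even longer scales.
    * Prof. 陳旭昇 says that high-frequency data are very **dirty**.
* Volatility Spillover Effect
    * VAR -> Granger's causality -> IRF -> variance decomposition
    * Spillover effect
* My question: why not S&P500?

### 20190617

* Demonstration of final reports
    * Using GARCH/EWMA to estimate VaR and evaluating how well it works ==Why estimating VaR? Risk management.==
        * Try TAIEX futures options (台指期選擇權).
        * Miao's comment: besides GARCH, extreme value theory is another approach.
        * The attitude expected of an invited discussant: how to criticize a work properly
    * It is possible for a stock price series to be stationary.
        * It depends on the length of the observation window and the character of the stock.
    * The relationship between the S&P500 VIX and the TXF VIX
        * How the VIX is compiled
        * The economic meaning of the VIX
            * VIX > 40%: the beginning of pessimism?
            * VIX < 15%: excessive optimism
        * The VIX index can be captured by an ARMA(2, 2)-GARCH(1, 2) model! ==Miao: the risk of risk?!==
    * Pairs trading
    * Fama-French 5-Factor Model
        * https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/Data_Library/f-f_5_factors_2x3.html
* misc
    * Sophisticated, automated derivatives trading: https://www.itiviti.com/orc
    * minP FWER
        * Family-wise error rate: https://en.wikipedia.org/wiki/Family-wise_error_rate

### Quiz 3 Take Home Exam

#### ARMA

* Summarize the formulas for the unconditional mean, unconditional variance, and ACF of AR(1), and explain how to derive them.
* Summarize and derive the unconditional mean of AR($p$), and explain why the PACF cuts off after lag $p$.
* Summarize the formulas for the unconditional mean, unconditional variance, and ACF of MA(1), and explain how to derive them.
* Summarize and derive the unconditional mean of MA($q$), and explain why the ACF cuts off after lag $q$.
* Summarize the formulas for the unconditional mean, unconditional variance, and ACF of ARMA$(1, 1)$, and explain how to derive them.
* Explain the meaning of stationarity, and explain why an MA process is always stationary.
* Explain the meaning of invertibility, and explain why an AR process is always invertible.
* How is the IRF (impulse response function) defined?

#### GARCH

* What is the formula for the unconditional variance of ARCH(1)? Derive it.
* What is the formula for the unconditional kurtosis of ARCH(1)? Derive it.
* What restrictions on the parameters of ARCH(1) do the two derivations above imply?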
* When estimating the parameters of ARCH(1), we use MLE. Following Tsay's textbook, p.189, write down the likelihood function and explain the idea behind MLE.
* Following the derivation in Tsay's textbook, p.200, show that returns under GARCH(1, 1) also have fat tails (i.e., kurtosis > 3).
* What is the formula for the unconditional variance of GARCH(1, 1)?
* Explain the advantages of GARCH(1, 1) over ARCH(1).
* Explain the difference between GARCH(1, 1) and EWMA.

### MISC

* Pseudo inverse
    * $ABA = A$, where $B$ is a pseudo (generalized) inverse of $A$; $B$ is a left inverse of $A$ if $BA = I$ and a right inverse of $A$ if $AB = I$.
    * $B$ is not unique (unless all four Moore-Penrose conditions are imposed)!
* Estimation of sample size in cointegration
    * $n$ degrees of freedom lead to $x ^ n$, where $x$ is the sample size needed for estimation on a single variable.
    * Degrees of freedom: assets, lags, hypotheses (five?)

## Notes of Time Series Analysis

### References

* [http://www.phdeconomics.sssup.it/documents/Lesson1.pdf](http://www.phdeconomics.sssup.it/documents/Lesson1.pdf) (lesson 1 to 19)

### Summary

* Lesson 1-2
    * Data Generating Process (DGP) ![](https://i.imgur.com/irm9ZOc.png =300x)
    * The major task of time series analysis is **to construct a model of the DGP**. ![](https://i.imgur.com/Tlhyev9.png =300x)
    * **Yule**'s research led to the notion of the **autoregressive** scheme.
    * **Slutsky**'s research led to the notion of a **moving average** scheme.
* In Lesson 3, the author reviews stochastic processes (a quick review of probability measures, etc.). Skipped for now; come back for the math later.
* Lesson 4-5
    * Stationarity is a rather intuitive concept: the statistical properties of the process do not change over time.
    * **Strong stationarity** concerns the shift-invariance (in time) of its finite-dimensional distributions.
        * **An iid process is a strongly stationary process.**
            * So the knowledge of the past has no value for predicting the future.
            * An iid process is unpredictable.
            * For example, the stock price change is an iid process under the efficient capital market hypothesis.
    * **Weak stationarity** only concerns the shift-invariance (in time) of the first and second moments of a process. ==I recall a theorem: a.s. convergence $\rightarrow$ convergence in probability $\rightarrow$ convergence in distribution? Double-check.==
        * The process $\{x_t; t \in \mathbb{Z}^+\}$ is weakly stationary, or covariance-stationary, if
            1. $\mathbf{E}|\,x_t\,|^2 < \infty$ for all $t$;
            2. $\mathbf{E}(x_t) = \mu$ for all $t$;
            3. $\textbf{Cov}(x_{t_1}, x_{t_2}) = \textbf{Cov}(x_{t_1 + h}, x_{t_2 + h})$ for all $t_1, t_2, h$.
        * For example, the white noise process (zero mean, finite variance, zero covariance). ==White noise is not necessarily iid!! Only Gaussian white noise is guaranteed to be an iid process!!!! An iid process (with zero mean and finite variance) is white noise!!!==
        * Recall that **independence $\rightarrow$ zero correlation**, but the reverse does not hold.
    * **Strong stationarity does not necessarily imply weak stationarity.**
        * For example, an iid process following the Cauchy distribution is strongly but not weakly stationary, because the Cauchy distribution has no finite variance.
        * **Strong stationarity with finite variance $\rightarrow$ weak stationarity.**
    * **A weakly stationary process is not necessarily strongly stationary.** ==Example? Intuition?==
    * **For a Gaussian stochastic process, weak stationarity and strong stationarity coincide.**
    * Example of a non-stationary process: random walk ($y_t = y_{t - 1} + \epsilon_t$). ==check its first two moments, especially the variance==
* Lesson 6 ==very very important!!!==
    * Estimation of mean/autocovariance function
        * **Averaging over the realizations** is hard.
        * Averaging along the time axis is good **if the stationary process is ergodic**. ![](https://i.imgur.com/2TPF1mG.png =500x)
        * **Covariance-ergodicity implies mean-ergodicity, but not the reverse.**
        * **Not all stationary processes are ergodic.**
        * **Ergodicity is a matter of the information contained in a single realization of a long duration of the process.** (see p. 21.)
            * If the process is not too persistent (ergodicity), so that each element of the realization $x_1, x_2, \ldots, x_T$ contains some information not available from the other elements, then a single realization of a long duration is sufficient to obtain a good estimate of its moments.
    * Ergodic Theorems
        * Slutsky's Theorem: A stationary process $x_t$ with mean $\mu$ and autocovariance function $\gamma_x(k)$ is mean-ergodic if and only if $$\lim_{T \rightarrow \infty} \frac{1}{T}\sum_{k = 0}^{T - 1}\gamma_x(k) = 0.$$
        * .... ==Come back and finish reading this when there is time.==
    * Conclusions
        * For stationary ergodic processes, we do not need to observe separate independent realizations of the process in order to obtain a consistent estimate of its mean value or other moments.
        * A good estimate of the moments of the process can be obtained considering only one sufficiently long realization of the process. ==In plain words: as long as the process is stationary & ergodic, the estimates from a single realization are good enough once the observed data are long enough.==
* Lesson 7: Estimation of Autocorrelation and Partial Autocorrelation Function
    * ACF
    * PACF
* Lesson 8: Hypothesis Testing
* Lesson 9: ARMA Model
* Lesson 10: Covariances of Causal ARMA Processes
* Lesson 11: The Wold Decomposition Theorem
* Lesson 12: Parameter Estimation of ARMA Model
* Lesson 13: Box-Jenkins Modeling Strategy for Building ARMA Model
    * Classic text ==Box, G.E.P. and G.M. Jenkins (1970) Time series analysis: Forecasting and control, San Francisco: Holden-Day==
    * Model identification
        * The Box-Jenkins methodology requires the ARMA($p, q$) process used to describe the DGP to be both **stationary** and **invertible**.
            * Hint: no trend, no systematic change in variance (Box-Cox transformation? GARCH?), no periodic variation
        * Box-Cox transformation ![](https://i.imgur.com/lzE3Mi2.png =400x)
            * If the variance of the series appears to increase quadratically with the mean, the logarithmic transformation ($\lambda = 0$) is appropriate;
            * If the variance increases linearly with the mean, we should use $\lambda = 0.5$, that is, the square-root transformation.
        * The guidelines for the choice of $p$ and $q$ come from the shape of two sample functions:
            * SACF
            * SPACF ==The selection rule is the table Prof. Miao summarized in class.==
        * **Akaike Information Criterion** (AIC) and **Bayesian Information Criterion** (BIC)
            * **BIC is consistent in the sense that the probability of selecting the true model approaches 1** (if the true model is in the candidate list), but AIC is not.
    * Model estimation
    * Model checking (next chapter)
* Lesson 14: Model Checking
    * If $p$ and $q$ are well specified (the model chosen is correct), and if the estimated parameters are close to the actual values, then **the residuals should be a realization of a white noise**. ![](https://i.imgur.com/funLEtM.png =500x)
* Lesson 15: Examples
* Lesson 16: Forecasting Stationary Time Series
* Lesson 17: Vector AutoRegressive Models
* Lesson 18: Building Vector Autoregressive Model
* Lesson 19: Comparing Predictive Accuracy of two Forecasts: The Diebold-Mariano Test

## Cointegrated VAR

* Pioneered by Engle and Granger (1987) and Johansen (1988).
* Cointegration is not correlation.
* Cointegration requires stationarity: some linear combination of the nonstationary series must be stationary.
* Example of a nonstationary time series: Brownian motion
    * $\mathbf{Var}(X_t) = t$ & $\mathbf{cov}(X_t, X_{t - s}) = t - s$ for $0 \le s \le t$

### Vector Error Correction (VEC) Model

### Application: Pairs Trading

* Pioneered by Gerry Bamberger and Nunzio Tartaglia of the quantitative group at Morgan Stanley in the 1980s.
* D.E. Shaw & Co. is famous for this strategy.
* Pairs trading is
    * a **market-neutral** trading strategy;
    * one approach to statistical arbitrage.
    > **Arbitrage**: free lunch; earning extra profit without taking additional risk.
    > **Statistical arbitrage**: attempts to profit from the likelihood that prices will trend together toward a historical norm. **Unlike arbitrage, statistical arbitrage is not risk-free.**
* Stable vs. stationary?
    * Stable: the time series is stationary no matter which point it starts from.

### Reference

* https://medium.com/auquan
* https://www.sta.cuhk.edu.hk/hywong/
* Cointegration in Matlab: https://www.mathworks.com/matlabcentral/fileexchange/31060-cointegration-and-pairs-trading-with-econometrics-toolbox
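
The EWMA and GARCH(1, 1) variance equations compared in the 20190520 and 20190527 notes can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the starting variance `var0`, and the parameter values in the usage note are made up for the example, and the shocks $u_t$ are taken as given rather than estimated from a mean equation.

```python
def ewma_variance(shocks, lam, var0):
    """sigma_t^2 = (1 - lam) * u_{t-1}^2 + lam * sigma_{t-1}^2, started at var0."""
    variances = [var0]
    for u in shocks[:-1]:
        variances.append((1.0 - lam) * u * u + lam * variances[-1])
    return variances


def garch11_variance(shocks, alpha0, alpha1, beta1, var0):
    """sigma_t^2 = alpha0 + alpha1 * u_{t-1}^2 + beta1 * sigma_{t-1}^2, started at var0."""
    variances = [var0]
    for u in shocks[:-1]:
        variances.append(alpha0 + alpha1 * u * u + beta1 * variances[-1])
    return variances


def garch11_long_run_variance(alpha0, alpha1, beta1):
    """Long-term variance sigma^2 = alpha0 / (1 - alpha1 - beta1)."""
    persistence = alpha1 + beta1
    if persistence >= 1.0:
        # EWMA is the boundary case alpha0 = 0, alpha1 + beta1 = 1 (unit root).
        raise ValueError("no finite long-run variance when alpha1 + beta1 >= 1")
    return alpha0 / (1.0 - persistence)
```

Calling `garch11_variance(shocks, 0.0, 1.0 - lam, lam, var0)` reproduces `ewma_variance(shocks, lam, var0)`, which shows EWMA as a GARCH(1, 1) without the long-run variance term; that is the "three terms vs. two terms" difference in the notes.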
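
For the impulse response function mentioned in the 20190520 notes, the MA($\infty$) weights $\psi_j$ of an ARMA($p, q$) process can be computed recursively. The sketch below assumes the sign convention of the ARMA$(1, 1)$ equation in those notes, $y_t = \sum_i \phi_i y_{t - i} + u_t + \sum_i \theta_i u_{t - i}$; the function name is hypothetical and this is not the algorithm of any particular toolbox.

```python
def arma_impulse_response(phi, theta, horizon):
    """Return psi_0..psi_horizon, the response of y_{t+j} to a unit shock u_t.

    Recursion: psi_0 = 1 and
    psi_j = theta_j + sum_{i=1..min(j, p)} phi_i * psi_{j-i},  with theta_j = 0 for j > q.
    """
    psi = [1.0]
    for j in range(1, horizon + 1):
        value = theta[j - 1] if j <= len(theta) else 0.0
        for i in range(1, min(j, len(phi)) + 1):
            value += phi[i - 1] * psi[j - i]
        psi.append(value)
    return psi
```

For example, `arma_impulse_response([0.5], [0.4], 3)` gives weights that start at 1, jump to $\phi_1 + \theta_1$, and then decay by a factor $\phi_1$ per step, while a pure MA(1) (`phi=[]`) has a response that is exactly zero after lag 1.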
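
The Engle-Granger two-step procedure from the 20190603 notes, which underlies the pairs-trading application, might be sketched as follows. The helper names are hypothetical, the Dickey-Fuller regression here has no constant and no lagged differences (a simplifying assumption), and the resulting t-statistic has to be compared with Engle-Granger critical values from a table, which this sketch does not include.

```python
import math


def ols_fit(x, y):
    """Simple OLS of y on x with intercept; returns (alpha, beta)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    return mean_y - beta * mean_x, beta


def log_price_spread(prices_x, prices_y):
    """Step 1: u_t = log(P_t^Y) - [alpha + beta * log(P_t^X)]."""
    log_x = [math.log(p) for p in prices_x]
    log_y = [math.log(p) for p in prices_y]
    alpha, beta = ols_fit(log_x, log_y)
    residuals = [ly - (alpha + beta * lx) for lx, ly in zip(log_x, log_y)]
    return alpha, beta, residuals


def dickey_fuller_tstat(u):
    """Step 2: estimate phi in  delta u_t = phi * u_{t-1} + e_t  and its t-statistic."""
    lagged = u[:-1]
    delta = [b - a for a, b in zip(u[:-1], u[1:])]
    sxx = sum(v * v for v in lagged)
    phi = sum(v * d for v, d in zip(lagged, delta)) / sxx
    resid = [d - phi * v for v, d in zip(lagged, delta)]
    sigma2 = sum(e * e for e in resid) / (len(resid) - 1)
    return phi, phi / math.sqrt(sigma2 / sxx)
```

A strongly negative t-statistic on the spread rejects the unit-root null, which is the evidence of cointegration (a mean-reverting spread) that a pairs trade relies on.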
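
The Theil-Sen estimator from Report 1 in the 20190603 notes is short enough to write out directly. The slope follows the definition quoted there (median of all pairwise slopes); taking the intercept as the median of $y_i - \text{slope} \cdot x_i$ is one common choice and is an assumption of this sketch, as is the function name.

```python
from itertools import combinations
from statistics import median


def theil_sen(x, y):
    """Robust line fit: slope is the median of the slopes through all pairs of points."""
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(x, y), 2)
        if x2 != x1  # skip vertical pairs, whose slope is undefined
    ]
    slope = median(slopes)
    # Assumed intercept rule: median of the residual offsets.
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept
```

With `x = [0, 1, 2, 3, 4]` and `y = [1, 3, 5, 7, 100]`, the single outlier changes only four of the ten pairwise slopes, so the fitted line stays at the one through the other four points, whereas ordinary least squares would be pulled toward the outlier.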