# Covariance Predictors in Financial Portfolio Management
## Introduction
Accurate prediction of covariance structures is essential in financial portfolio management, impacting risk assessment and asset allocation decisions. This note focuses on three prominent models:
* **Exponentially Weighted Moving Average (EWMA):** Emphasizes recent data, adapting to dynamic market conditions.
* **Iterated EWMA (IEWMA):** Refines EWMA through a two-step iterative process for enhanced stability and precision.
* **Combined Model IEWMA (CM-IEWMA):** Combines multiple IEWMA predictors with different half-lives, selecting combination weights that maximize predictive log-likelihood.
By understanding these advanced models, portfolio managers can gain a robust foundation for navigating changing market conditions and enhancing overall portfolio performance.
## Covariance Predictors
Covariance predictors are statistical models or algorithms used to forecast the covariance matrix of asset returns. The covariance matrix is a critical input in many financial applications, most notably in portfolio optimization and risk management. It quantifies the relationships between asset returns, providing insights into how they move together (or independently).
### Mathematical Setup
1. **Return Vector $r_t$:**
* An $n$-dimensional vector representing the returns of $n$ assets over a specific time period.
* Denoted as $r_t \in \mathbb{R}^n$.
2. **Covariance Matrix $\Sigma_t$:**
* A symmetric positive definite matrix capturing the covariance of asset returns.
* Denoted as $\Sigma_t \in S_n^{++}$ (the set of symmetric positive definite matrices).
* Mathematically defined as $$\Sigma_t = \text{cov}(r_t) = \mathbb{E}[r_t r_t^T] - \mathbb{E}[r_t]\,\mathbb{E}[r_t]^T$$
3. **Predicted Volatilities $\hat{\sigma}_t$:**
* Derived from the diagonal elements of the estimated covariance matrix $\text{diag}(\hat{\Sigma}_t)$, which represent variances.
* Calculated as $$\hat{\sigma}_t = \text{diag}(\hat{\Sigma}_t)^{1/2}$$
4. **Predicted Correlations $\hat{R}_t$:**
* Obtained by normalizing the estimated covariance matrix with the inverse of the predicted volatilities.
* Calculated as $$\hat{R}_t = \text{diag}(\hat{\sigma}_t)^{-1} \hat{\Sigma}_t \text{diag}(\hat{\sigma}_t)^{-1}$$
5. **Log-Likelihood Function $l_t(\hat{\Sigma}_t)$**:
* Assumes returns follow a multivariate Gaussian distribution.
* Used to assess the quality of the estimated covariance matrix.
* Higher log-likelihood values indicate better model fit.
* Defined as: $$l_t(\hat{\Sigma}_t) = \frac{1}{2} \left( -n \log(2\pi) - \log \det \hat{\Sigma}_t - r_t^T \hat{\Sigma}_t^{-1} r_t \right)$$
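As a quick illustration of these quantities, here is a minimal sketch (with an illustrative covariance matrix `Sigma_hat` and return vector `r`) that computes the predicted volatilities, correlations, and the Gaussian log-likelihood:
```python=
import numpy as np

# Hypothetical inputs: a predicted covariance matrix and a realized return vector
Sigma_hat = np.array([[0.04, 0.006],
                      [0.006, 0.01]])
r = np.array([0.012, -0.004])

# Predicted volatilities: square roots of the diagonal entries
sigma_hat = np.sqrt(np.diag(Sigma_hat))

# Predicted correlation matrix: diag(sigma)^{-1} Sigma diag(sigma)^{-1}
R_hat = Sigma_hat / np.outer(sigma_hat, sigma_hat)

# Gaussian log-likelihood of the realized return under the prediction
n = len(r)
_, logdet = np.linalg.slogdet(Sigma_hat)
log_lik = 0.5 * (-n * np.log(2 * np.pi) - logdet - r @ np.linalg.solve(Sigma_hat, r))
```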
### Rolling Window
The rolling window predictor, with a window length or memory $M$, estimates the covariance matrix $\hat{\Sigma}_t$ using the **average** of the last $M \geq n$ outer products. The formula for $\hat{\Sigma}_t$ is given by:
$$ \hat{\Sigma}_t = \alpha_t \sum_{\tau=\max\{1,\,t-M\}}^{t-1} r_{\tau} r_{\tau}^T, \quad t = 2, 3, \ldots $$
where
* $\hat{\Sigma}_t$: The estimated covariance matrix at time $t$.
* $\alpha_t = \frac{1}{\min \{t-1,M\}}$: A normalization factor or weighting coefficient applied at time $t$.
* $\sum_{\tau=\max\{1,\,t-M\}}^{t-1}$: Summation over a moving window of at most $M$ periods, from $\max\{1, t-M\}$ to $t-1$.
* $r_\tau$: The return vector at time $\tau$.
* $r_\tau r_\tau^T$: The outer product of the return vector $r_\tau$ with itself, a rank-one single-period covariance estimate.
```python=
import numpy as np
import pandas as pd

def rolling_window(returns, memory, min_periods=20):
    """Rolling-window covariance estimates from a DataFrame of returns."""
    min_periods = max(min_periods, 1)
    times = returns.index
    assets = returns.columns
    returns = returns.values
    Sigmas = np.zeros((returns.shape[0], returns.shape[1], returns.shape[1]))
    # Sigmas[t] averages the outer products of the returns observed up to time t
    Sigmas[0] = np.outer(returns[0], returns[0])
    for t in range(1, returns.shape[0]):
        # normalization factors reflect the number of outer products included
        alpha_old = 1 / min(t, memory)
        alpha_new = 1 / min(t + 1, memory)
        if t >= memory:
            # window is full: add the newest outer product, drop the oldest
            Sigmas[t] = alpha_new / alpha_old * Sigmas[t - 1] + alpha_new * (
                np.outer(returns[t], returns[t])
                - np.outer(returns[t - memory], returns[t - memory])
            )
        else:
            # expanding window during the burn-in period
            Sigmas[t] = alpha_new / alpha_old * Sigmas[t - 1] + alpha_new * np.outer(
                returns[t], returns[t]
            )
    # discard the first estimates, which are based on too few observations
    Sigmas = Sigmas[min_periods - 1 :]
    times = times[min_periods - 1 :]
    return {
        times[t]: pd.DataFrame(Sigmas[t], index=assets, columns=assets)
        for t in range(len(times))
    }
```
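A minimal usage sketch, assuming `returns` is a pandas DataFrame of daily asset returns (dates as index, assets as columns):
```python=
# Hypothetical usage of the rolling-window predictor defined above
Sigmas_rw = rolling_window(returns, memory=250, min_periods=20)

# Each entry maps a date to the covariance estimate available on that date
last_date = list(Sigmas_rw.keys())[-1]
print(Sigmas_rw[last_date].round(6))
```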
### Multivariate GARCH
<font color=red>Add contents and code implementation</font>
### EWMA Covariance Predictors
The Exponentially Weighted Moving Average (EWMA) model is a widely used method to estimate volatility and covariance. It gives more weight to recent observations, making it adaptable to changing market conditions.
* **Formula:** $$\hat{\Sigma}_t = \alpha_t \sum_{\tau=1}^{t-1} \beta^{t-1-\tau} r_\tau r_\tau^T$$ where
* $\hat{\Sigma}_t$: The estimated covariance matrix at time $t$.
  * $\alpha_t$: A normalization constant ensuring the weights sum to one; with the weights above, $\alpha_t = \frac{1-\beta}{1-\beta^{t-1}}$.
  * $\beta$: The forgetting factor $0 < \beta < 1$, controlling the rate of weight decay. Lower $\beta$ decays weights faster and emphasizes recent observations; higher $\beta$ gives a longer memory.
* **Half-Life $H$:** The number of periods for the weight of an observation to halve. It's related to the forgetting factor by: $$H = - \frac{\log 2}{\log \beta}$$
#### Interpretation and Example
Consider a forgetting factor of $\beta = 0.94$. This corresponds to a half-life of approximately 11 days:
$$
H = \frac{-\log(2)}{\log(0.94)} \approx 11 \text{ days}
$$
This means that the influence of returns on the covariance estimate is halved every 11 days, highlighting the model's emphasis on recent data.
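The half-life/forgetting-factor relationship is easy to verify numerically; a small sketch:
```python=
import numpy as np

beta = 0.94
halflife = -np.log(2) / np.log(beta)   # approximately 11.2 periods
print(halflife)

# Going the other way: the forgetting factor implied by a chosen half-life
beta_from_halflife = np.exp(-np.log(2) / halflife)
print(beta_from_halflife)              # approximately 0.94

# After one half-life the weight on an observation has halved
print(beta ** halflife)                # 0.5
```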

#### Code Implementation
```python=
def _general(y, times, halflife=None, alpha=None, fct=lambda x: x, min_periods=0, clip_at=None):
    """
    Calculates EWMA for a given function (fct) applied to input data (y).
    """
    def f(k):
        if k < min_periods - 1:
            return times[k], _ewma * np.nan
        return times[k], _ewma

    if halflife:
        alpha = 1 - np.exp(-np.log(2) / halflife)
    beta = 1 - alpha

    # first row, important to initialize the _ewma variable
    _ewma = fct(y[0])
    yield f(k=0)

    # iterate through all the remaining rows of y, starting in the 2nd row
    for n, row in enumerate(y[1:], start=1):
        # apply fct to the new observation
        next_val = fct(row)
        # update the moving average (with bias correction in the denominator)
        if clip_at and n >= min_periods + 1:
            _ewma = _ewma + (1 - beta) * (
                np.clip(next_val, -clip_at * _ewma, clip_at * _ewma) - _ewma
            ) / (1 - np.power(beta, n + 1))
        else:
            _ewma = _ewma + (1 - beta) * (next_val - _ewma) / (
                1 - np.power(beta, n + 1)
            )
        yield f(k=n)

def _ewma_cov(data, halflife, min_periods=0):
    """
    Calculates the EWMA covariance matrix.
    """
    for t, ewma in _general(
        data.values,
        times=data.index,
        halflife=halflife,
        fct=lambda x: np.outer(x, x),
        min_periods=min_periods,
    ):
        if not np.isnan(ewma).all():
            yield t, pd.DataFrame(index=data.columns, columns=data.columns, data=ewma)
```
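A minimal usage sketch of `_ewma_cov`, assuming `returns` is a pandas DataFrame of asset returns:
```python=
# Hypothetical usage: collect the EWMA covariance estimates into a dictionary
ewma_covs = dict(_ewma_cov(returns, halflife=125, min_periods=20))

last_date = list(ewma_covs.keys())[-1]
print(ewma_covs[last_date])  # covariance estimate for the most recent date
```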
### IEWMA Covariance Predictor
The Iterated Exponentially Weighted Moving Average (IEWMA) model enhances the traditional EWMA by incorporating a two-step iterative refinement process. This approach improves the accuracy of covariance matrix estimation, particularly the correlation structure.
1. **Volatility Estimation:**
* Volatilities (standard deviations) are estimated from the diagonal elements of the initial EWMA covariance matrix, denoted as $\hat{\Sigma}_t^{(1)}$.
* Mathematically: $$\hat{\sigma}_t = \text{diag}(\hat{\Sigma}_t^{(1)})^{1/2}$$ This yields the vector $\hat{\sigma}_t$ of estimated volatilities, one per asset.
2. **Correlation Matrix Estimation:**
* Returns are "whitened" by normalizing them with the estimated volatilities from the first step: $$\tilde{r}_t = (\hat{\Sigma}_t^{(1)})^{-1/2} r_t$$
* A new covariance matrix, $\hat{\Sigma}_t^{(2)}$, is calculated using these whitened returns. This matrix captures the correlations without being influenced by the magnitudes of the volatilities.
3. **Reconstruction of Covariance Matrix:**
* The final covariance matrix, $\hat{\Sigma}_t$, is reconstructed by rescaling the refined correlation structure $\hat{\Sigma}_t^{(2)}$ (with its diagonal normalized to one) by the initial volatility estimates $\hat{\sigma}_t$, as in the sketch below: $$\hat{\Sigma}_t = \text{diag}(\hat{\sigma}_t)\, \hat{\Sigma}_t^{(2)}\, \text{diag}(\hat{\sigma}_t)$$
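To make the three steps concrete, here is a minimal single-date sketch; the two EWMA estimates are assumed to be given as numpy arrays (the full streaming implementation follows below):
```python=
import numpy as np

# Hypothetical inputs: first-pass EWMA covariance and EWMA of whitened returns
Sigma1 = np.array([[0.04, 0.006],
                   [0.006, 0.01]])   # Sigma_t^(1)
Sigma2 = np.array([[1.02, 0.3],
                   [0.3, 0.97]])     # Sigma_t^(2), from whitened returns

# Step 1: volatilities from the diagonal of the first estimate
sigma = np.sqrt(np.diag(Sigma1))

# Step 2: normalize the second estimate so its diagonal is exactly one (correlations)
d = np.sqrt(np.diag(Sigma2))
R = Sigma2 / np.outer(d, d)

# Step 3: reassemble the covariance matrix as diag(sigma) R diag(sigma)
Sigma_iewma = np.outer(sigma, sigma) * R
print(Sigma_iewma)
```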
#### Why Whitening Returns is Important
Whitening is a critical step in IEWMA due to the following reasons:
* **Normalization:** It standardizes the returns to have unit variance, ensuring that correlation estimates are not biased by differences in the scales of asset volatilities.
* **Improved Accuracy:** Whitening allows for the estimation of pure correlations in the second step, leading to more accurate and reliable covariance matrix estimates.
* **Modeling Flexibility:** The standardized returns can be used in a wider range of statistical models and applications.
#### Code Implementation
```python=
def iterated_ewma(
    returns,
    vola_halflife,
    cov_halflife,
    min_periods_vola=20,
    min_periods_cov=20,
    mean=False,
    mu_halflife1=None,
    mu_halflife2=None,
    clip_at=None,
    nan_to_num=True,
):
    """
    param returns: pandas dataframe with returns for each asset
    param vola_halflife: half life for volatility
    param cov_halflife: half life for covariance
    param min_periods_vola: minimum number of periods to use for volatility
        ewma estimate
    param min_periods_cov: minimum number of periods to use for covariance
        ewma estimate
    param mean: whether to estimate mean; if False, assumes zero mean data
    param mu_halflife1: half life for the first mean adjustment; if None, it is
        set to vola_halflife; only applies if mean=True
    param mu_halflife2: half life for the second mean adjustment; if None, it is
        set to cov_halflife; only applies if mean=True
    param clip_at: winsorizes ewma update at +-clip_at*(current ewma) in ewma;
        if None, no winsorization is performed
    param nan_to_num: if True, replace NaNs in returns with 0.0

    Note: the helpers `center`, `volatility`, `clip` and the `IEWMA` named tuple
    are assumed to be available (as in the accompanying cov_pred_finance code).
    """
    mu_halflife1 = mu_halflife1 or vola_halflife
    mu_halflife2 = mu_halflife2 or cov_halflife

    def scale_cov(vola, matrix):
        index = matrix.index
        columns = matrix.columns
        matrix = matrix.values
        # Convert (covariance) matrix to correlation matrix
        diag = np.diagonal(matrix).copy()
        one_inds = np.where(diag == 0)[0]
        if one_inds.size > 0:
            diag[one_inds] = 1  # temporarily, to avoid division by zero
        v = 1 / np.sqrt(diag).reshape(-1, 1)
        v[one_inds] = np.nan
        matrix = v * matrix * v.T
        # Rescale the correlation matrix by the estimated volatilities
        cov = vola.reshape(-1, 1) * matrix * vola.reshape(1, -1)
        return pd.DataFrame(cov, index=index, columns=columns)

    def scale_mean(vola, vec1, vec2):
        return vec1 + vola * vec2

    if nan_to_num:
        if returns.isna().any().any():
            returns = returns.fillna(0.0)

    # compute the moving mean of the returns
    returns, returns_mean = center(
        returns=returns, halflife=mu_halflife1, min_periods=0, mean_adj=mean
    )

    # estimate the volatility, clip some returns before they enter the estimation
    vola = volatility(
        returns=returns,
        halflife=vola_halflife,
        min_periods=min_periods_vola,
        clip_at=clip_at,
    )

    # whiten the returns; make sure adj is NaN where vola is zero or returns are
    # NaN (a constant return series has vola = 0)
    adj = clip((returns / vola), clip_at=clip_at)  # if vola is zero, adj is NaN
    adj = adj.fillna(0.0)

    # center the whitened returns again
    adj, adj_mean = center(adj, halflife=mu_halflife2, min_periods=0, mean_adj=mean)

    for t, matrix in _ewma_cov(
        data=adj,
        halflife=cov_halflife,
        min_periods=min_periods_cov,
    ):
        if mean:
            m = scale_mean(
                vola=vola.loc[t].values, vec1=returns_mean.loc[t], vec2=adj_mean.loc[t]
            )
        else:
            m = pd.Series(np.zeros(returns.shape[1]), index=returns.columns)
        yield IEWMA(
            time=t,
            mean=m,
            covariance=scale_cov(vola=vola.loc[t].values, matrix=matrix),
            volatility=vola.loc[t],
        )
```
### CM-IEWMA: Combining Multiple Covariance Predictors
The Combined Model IEWMA (CM-IEWMA) leverages the strengths of multiple IEWMA predictors to create a more robust and accurate covariance matrix estimation. This approach recognizes that different IEWMA models, with varying parameters (like half-lives), might capture different aspects of the underlying market dynamics.
#### Setup
The CM-IEWMA starts with $K$ different IEWMA covariance predictors, denoted $\hat{\Sigma}_t^{(k)}$ for $k = 1, 2, \dots, K$. Each predictor can have distinct half-lives for volatility and covariance estimation, allowing for a diversified set of predictions.
The goal is to find the optimal weights $w_{k,t}$ for each predictor at time $t$ to create a combined covariance matrix:
$$
\hat{\Sigma}_t = \sum_{k=1}^K w_{k,t} \hat{\Sigma}_t^{(k)}
$$
#### Cholesky Factorization and Optimization
To achieve a stable and efficient combination of these predictors, the CM-IEWMA employs a method based on Cholesky factorization and convex optimization. Here's the breakdown:
1. **Cholesky Factorization of Precision Matrices:** Each individual precision matrix (inverse covariance matrix), denoted as $(\hat{\Sigma}_t^{(k)})^{-1}$, is factorized into a lower triangular matrix $\hat{L}_t^{(k)}$ and its transpose: $$\hat{\Sigma}_t^{(k)} = (\hat{L}_t^{(k)} (\hat{L}_t^{(k)})^T)^{-1}$$
2. **Combining Cholesky Factors:** The Cholesky factors $\hat{L}_t^{(k)}$ are combined using a set of non-negative weights $\pi_k$ (with $\sum_k \pi_k = 1$). This weighted sum forms a new lower triangular matrix $\hat{L}_t$: $$\hat{L}_t = \sum_{k=1}^K \pi_k \hat{L}_t^{(k)}$$
3. **Recovering the Combined Predictor:** The final combined covariance matrix is obtained by inverting the product of the combined Cholesky factor and its transpose: $$\hat{\Sigma}_t = (\hat{L}_t \hat{L}_t^T)^{-1}$$
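A minimal sketch of this combination for fixed weights, assuming a list of covariance predictions `Sigmas` and weights `pi` are given (the weight selection itself is discussed next):
```python=
import numpy as np

# Hypothetical inputs: K covariance predictions and fixed combination weights
Sigmas = [np.array([[0.04, 0.006], [0.006, 0.01]]),
          np.array([[0.05, 0.004], [0.004, 0.012]])]
pi = np.array([0.6, 0.4])

# Cholesky factors of the precision matrices: Sigma^{-1} = L L^T
Ls = [np.linalg.cholesky(np.linalg.inv(S)) for S in Sigmas]

# Weighted sum of the Cholesky factors (still lower triangular)
L = sum(w * Lk for w, Lk in zip(pi, Ls))

# Combined covariance prediction: Sigma = (L L^T)^{-1}
Sigma_comb = np.linalg.inv(L @ L.T)
print(Sigma_comb)
```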
#### Why Use Cholesky Factorization?
* **Numerical Stability:** Cholesky factorization ensures that the combined covariance matrix is always positive definite, which is essential for valid statistical properties and applications like portfolio optimization.
* **Computational Efficiency:** Working with lower triangular matrices simplifies calculations and improves computational speed compared to direct matrix inversion.
* **Flexibility:** The model can be easily adapted to new data or changing market conditions by updating the weights $\pi_k$.
#### Log-Likelihood and Weight Selection
To find the optimal weights $\pi_k$, the CM-IEWMA uses a log-likelihood function derived from the Cholesky decomposition. This function measures how well the combined precision matrix explains the observed returns. The weights are then selected by solving a convex optimization problem that maximizes the aggregate log-likelihood over a look-back window of $N$ periods:
$$
\begin{split}
\max_{\pi} \quad & \sum_{j=1}^N \left( \sum_{i=1}^n \log \hat{L}_{t-j,ii} - \frac{1}{2} \|\hat{L}_{t-j}^T r_{t-j}\|_2^2 \right) \\
\text{subject to} \quad & \hat{L}_{t-j} = \sum_{k=1}^K \pi_k \hat{L}_{t-j}^{(k)}, \quad j = 1, \dots, N, \\
& \pi \geq 0, \quad \mathbf{1}^T \pi = 1
\end{split}
$$
#### Code Implementation
```python=
def _cholesky_precision(cov):
    """
    Computes the Cholesky factor of the inverse of each covariance matrix in cov
    param cov: dictionary of covariance matrices {time: Sigma}
    """
    return {
        time: np.linalg.cholesky(np.linalg.inv(item.values))
        for time, item in cov.items()
    }

def _diag_part(cholesky):
    return {key: np.diag(L) for key, L in cholesky.items()}
```
```python=
import cvxpy as cvx

class _CombinationProblem:
    def __init__(self, keys, n, window):
        self.keys = keys
        K = len(keys)
        self._weight = cvx.Variable(len(keys), name="weights")
        self.A_param = cvx.Parameter((n * window, K))
        self.P_chol_param = cvx.Parameter((K, K))

    @property
    def _constraints(self):
        return [cvx.sum(self._weight) == 1, self._weight >= 0]

    @property
    def _objective(self):
        return cvx.sum(cvx.log(self.A_param @ self._weight)) - 0.5 * cvx.sum_squares(
            self.P_chol_param.T @ self._weight
        )

    def _construct_problem(self):
        self.prob = cvx.Problem(cvx.Maximize(self._objective), self._constraints)

    def solve(self, **kwargs):
        return self.prob.solve(**kwargs)

    @property
    def weights(self):
        return pd.Series(index=self.keys, data=self._weight.value)

    @property
    def status(self):
        return self.prob.status
```
```python=
class _CovarianceCombination:
    def __init__(self, sigmas, returns, means=None):
        """
        Computes the covariance combination of a set of covariance matrices

        param sigmas: dictionary of covariance matrices {key: {time: sigma}}
        param returns: pandas DataFrame of returns
        param means: dictionary of means {key: {time: mu}}, optional

        Note: sigmas[key][time] is a covariance matrix prediction for time+1
        Note: the helpers _nu, _B_t_col, _A and the Result named tuple are assumed
        to be available (as in the accompanying cov_pred_finance code).
        """
        n = returns.shape[1]
        if means is not None:
            for key, sigma in sigmas.items():
                # Assert sigmas and means have same keys
                assert (
                    sigma.keys() == means[key].keys()
                ), "sigmas and means must have same keys"
        else:
            # Set means to zero if not provided
            means = {
                k: {time: np.zeros(n) for time in sigma.keys()}
                for k, sigma in sigmas.items()
            }
        self.__means = means
        self.__sigmas = sigmas
        self.__returns = returns

        # all these quantities do not depend on the window size
        self.__Ls = pd.DataFrame(
            {k: _cholesky_precision(sigma) for k, sigma in self.sigmas.items()}
        )
        self.__Ls_shifted = self.__Ls.shift(1).dropna()
        self.__nus = pd.DataFrame(
            {key: _nu(Ls, self.means[key]) for key, Ls in self.__Ls.items()}
        )
        self.__nus_shifted = self.__nus.shift(1).dropna()

    @property
    def sigmas(self):
        return self.__sigmas

    @property
    def means(self):
        return self.__means

    @property
    def returns(self):
        return self.__returns

    @property
    def K(self):
        """
        Returns the number of expert predictors
        """
        return len(self.sigmas)

    @property
    def assets(self):
        """
        Returns the assets in the covariance combination problem
        """
        return self.returns.columns

    def solve(self, window=None, times=None, **kwargs):
        """
        Solves the covariance combination problem at the time steps given in times.

        The size of the window is needed to specify the size of the parameters
        for the cvxpy problem; hence these computations are not in __init__.

        param window: number of previous time steps to use in the covariance
            combination problem
        param times: list of time steps to solve the problem at; if None, solve
            at all available time steps
        """
        # If window is None, use all available data; cap window at length of data
        window = window or len(self.__Ls_shifted)
        window = min(window, len(self.__Ls_shifted))

        # Compute P matrix and its Cholesky factor
        Lts_at_r = pd.DataFrame(
            {
                key: _B_t_col(Ls, self.__nus_shifted[key], self.returns)
                for key, Ls in self.__Ls_shifted.items()
            }
        )

        # time steps to solve the problem at
        all_available_times = Lts_at_r.index
        if times is None:
            times = list(Lts_at_r.index)
        else:
            times = list(times)
            # assert times are consecutive in all_available_times
            zero_index = all_available_times.get_loc(times[0])
            assert list(
                all_available_times[zero_index : zero_index + len(times)]
            ) == list(times), "times must be consecutive"
            # add window - 1 previous time steps to times
            times = (
                list(all_available_times[zero_index - window + 1 : zero_index]) + times
            )

        Bs = {time: np.column_stack(Lts_at_r.loc[time]) for time in times}
        prod_Bs = pd.Series({time: B.T @ B for time, B in Bs.items()})
        P = {
            times[i]: sum(prod_Bs.loc[times[i - window + 1] : times[i]])
            for i in range(window - 1, len(times))
        }
        P_chol = {time: np.linalg.cholesky(matrix) for time, matrix in P.items()}

        # Compute A matrix
        Ls_diag = pd.DataFrame({k: _diag_part(L) for k, L in self.__Ls_shifted.items()})
        A = {
            times[i]: _A(
                Ls_diag.truncate(before=times[i - window + 1], after=times[i]), self.K
            )
            for i in range(window - 1, len(times))
        }

        problem = _CombinationProblem(
            keys=self.sigmas.keys(), n=len(self.assets), window=window
        )
        problem._construct_problem()

        for time, AA in A.items():
            problem.A_param.value = AA
            problem.P_chol_param.value = P_chol[time]
            try:
                yield self._solve(time=time, problem=problem, **kwargs)
            except cvx.SolverError:
                print(f"Solver did not converge at time {time}")
                yield None

    def _solve(self, time, problem, **kwargs):
        """
        Solves the covariance combination problem at a given time t
        """
        problem.solve(**kwargs)
        weights = problem.weights

        # Get non-shifted L
        L = sum(self.__Ls.loc[time] * weights.values)  # prediction for time+1
        nu = sum(self.__nus.loc[time] * weights.values)  # prediction for time+1

        mean = pd.Series(index=self.assets, data=np.linalg.inv(L.T) @ nu)
        sigma = pd.DataFrame(
            index=self.assets, columns=self.assets, data=np.linalg.inv(L @ L.T)
        )
        return Result(time=time, mean=mean, covariance=sigma, weights=weights)
```
## Evaluating Covariance Predictors
### Log-Likelihood as a Performance Metric
A straightforward way to assess the performance of a covariance predictor is by examining its average log-likelihood on realized returns. The log-likelihood function, under the assumption of multivariate Gaussian returns, is given by:
$$
\frac{1}{2T} \sum_{t=1}^{T} \left( -n \log(2\pi) - \log \det \hat{\Sigma}_t - r_t^T \hat{\Sigma}_t^{-1} r_t \right).
$$
Higher values indicate a better fit of the predicted covariance matrix $\hat{\Sigma}_t$ to the observed returns $r_t$. This metric allows us to compare different predictors and identify periods of varying performance.
#### Code Implementation
```python=
def log_likelihood(r_t, sigma_t):
    """Gaussian log-likelihood of the return vector r_t under covariance sigma_t."""
    n = len(r_t)
    # slogdet and solve are numerically more stable than det and explicit inversion
    _, log_det_sigma = np.linalg.slogdet(sigma_t)
    mahalanobis_dist = r_t.T @ np.linalg.solve(sigma_t, r_t)
    return -0.5 * (n * np.log(2 * np.pi) + log_det_sigma + mahalanobis_dist)
```
```python=
# log_likelihoods maps predictor names to their log-likelihood series;
# "PRESCIENT" denotes the benchmark predictor
regrets = {}
for name in log_likelihoods:
    regrets[name] = log_likelihoods["PRESCIENT"] - log_likelihoods[name]
```
### Log-Likelihood Regret: A Benchmark Comparison
To gain deeper insights into a predictor's performance, we can introduce the concept of log-likelihood regret, which compares the predictor's average log-likelihood to that of the best possible constant predictor (a fixed covariance matrix) over the entire dataset. The best constant predictor is the empirical sample covariance matrix $\Sigma^{\text{emp}}$, whose average log-likelihood is
$$
\frac{1}{2} \left( -n \left( \log(2\pi) + 1 \right) - \log \det \Sigma^{\text{emp}} \right).
$$
The average log-likelihood regret is then the difference between this baseline and the predictor's average log-likelihood:
$$
\text{Regret} = \frac{1}{2} \left( -n \left( \log(2\pi) + 1 \right) - \log \det \Sigma^{\text{emp}} \right) - \frac{1}{2T} \sum_{t=1}^{T} \left( -n \log(2\pi) - \log \det \hat{\Sigma}_t - r_t^T \hat{\Sigma}_t^{-1} r_t \right).
$$
The regret quantifies how much the predictor underperforms compared to the best constant predictor. Minimizing regret indicates that the predictor adapts well to changing market conditions.
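A minimal sketch of the average log-likelihood and the regret against the empirical-covariance baseline described above, assuming a DataFrame `returns` and a dictionary `Sigmas` mapping dates to covariance predictions (names are illustrative):
```python=
import numpy as np

def average_log_likelihood(returns, Sigmas):
    """Average Gaussian log-likelihood of realized returns under the predictions."""
    vals = []
    for t, Sigma in Sigmas.items():
        r = returns.loc[t].values
        S = np.asarray(Sigma)
        n = len(r)
        _, logdet = np.linalg.slogdet(S)
        vals.append(0.5 * (-n * np.log(2 * np.pi) - logdet - r @ np.linalg.solve(S, r)))
    return np.mean(vals)

def best_constant_log_likelihood(returns):
    """Average log-likelihood of the best constant (empirical covariance) predictor."""
    n = returns.shape[1]
    Sigma_emp = returns.values.T @ returns.values / len(returns)  # zero-mean convention
    _, logdet = np.linalg.slogdet(Sigma_emp)
    return 0.5 * (-n * (np.log(2 * np.pi) + 1) - logdet)

# regret = best_constant_log_likelihood(returns) - average_log_likelihood(returns, Sigmas)
```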
### Example: Stock Portfolio Covariance Prediction
#### Data Setup
```python=
import yfinance as yf

def get_stock_price(ticker, startd, endd):
    prices = yf.download(ticker, start=startd, end=endd)
    prices = prices["Adj Close"].dropna(how="all")  # drop rows where all values are missing
    return prices
```
```python=
START_DATE = '2000-07-01'
END_DATE = '2023-12-31'
ticker = ['XOM','WMT','AAPL','PG','JNJ','CHL','IBM','SBC','GE','CHV',
'PFE','NOB','NCB','KO','ORCL','HWP','INTC','MRK','PEP','BEL',
'ABT','SLB','P','PA','MCD']
```

#### Half-Life Parameters and Covariance Predictors
```python=
# half-life parameters
ewma_halflife = 125
iewma_pair = (63, 125)
cm_iewma_pairs = [(10, 21), (21, 63), (63, 125), (125, 250), (250, 500)]

# EWMA
ewma = dict(_ewma_cov(returns, halflife=ewma_halflife))

# IEWMA
iewma = list(iterated_ewma(returns, vola_halflife=iewma_pair[0], cov_halflife=iewma_pair[1]))
iewma = {iterate.time: iterate.covariance for iterate in iewma}

# CM-IEWMA: one IEWMA predictor per half-life pair
iewmas = {
    f"{pair[0]}-{pair[1]}": list(
        iterated_ewma(
            returns,
            vola_halflife=pair[0],
            cov_halflife=pair[1],
            min_periods_vola=63,
            min_periods_cov=63,
        )
    )
    for pair in cm_iewma_pairs
}
Sigmas = {key: {item.time: item.covariance for item in iewma} for key, iewma in iewmas.items()}

# Regularize the fastest predictor's covariance matrices
# (add_to_diagonal and from_sigmas are helpers from the accompanying cov_pred_finance code)
fast = cm_iewma_pairs[0]
fast = f"{fast[0]}-{fast[1]}"
Sigmas[fast] = add_to_diagonal(Sigmas[fast], lamda=0.05)

results = list(from_sigmas(Sigmas, returns, means=None).solve(window=10))
cm_iewma = {result.time: result.covariance for result in results}
weights = {result.time: result.weights for result in results}
```
<font color=red>Add graph</font>


#### Change of Weights for Different Half-Life Parameters in CM-IEWMA
```python=
import matplotlib.pyplot as plt

# average the combination weights by year and plot them as a stacked area chart
annual_weights = pd.DataFrame(weights).T.resample("Y").mean().loc[START_DATE:END_DATE]
plt.stackplot(
    annual_weights.index,
    annual_weights.values.T,
    labels=[f"{pair[0]}-{pair[1]}" for pair in cm_iewma_pairs],
)
plt.legend(bbox_to_anchor=(1.05, 1), loc="upper left", borderaxespad=0.0);
```

#### Log Likelihood and Log-likelihood Regret
**Log Likelihood**
The experimental results indicate that the PRESCIENT predictor consistently outperforms the other methods, providing the best fit to the observed data. The other predictors, while generally close to one another in performance, show more variability and are less robust during periods of high market volatility.
**Log-likelihood Regret**
The experimental results indicate that all predictors show comparable performance in terms of log-likelihood regret, with lower values indicating better performance. The spike in regret around 2020 highlights the challenges these predictors face during periods of high market volatility.
## Portfolio Experiments
A practical way to evaluate covariance predictors is to assess their impact on real-world portfolio construction and performance. In this section, we'll conduct experiments using a portfolio of 16 stocks, comparing the results obtained using different covariance prediction methods.
### Portfolio Construction Strategies
We'll consider four common portfolio construction strategies and evaluate how different covariance predictors affect their performance:
1. **Equal Weight Portfolio:**
* **Strategy:** Allocates equal weights to all assets in the portfolio, i.e., $w_t = \frac{1}{n} \mathbf{1}$.
* **Pros:** Simple and intuitive to implement.
* **Cons:** May not be optimal in terms of risk-return tradeoff, as it does not consider individual asset risk or correlations.
2. **Minimum Variance Portfolio:**
* **Strategy:** Aims to minimize the portfolio's variance $w^T \hat{\Sigma}_t w$, subject to constraints:
* **Full investment:** $w^T \mathbf{1} = 1$
* **Leverage limit:** $\|w\|_1 \leq L_{\max}$
* **Individual weight bounds:** $w_{\min} \leq w \leq w_{\max}$
* **Pros:** Focuses on risk reduction, potentially leading to a more stable portfolio.
* **Cons:** May miss out on potential returns by overweighting low-volatility assets and underweighting high-return assets.
3. **Risk Parity Portfolio:**
* **Strategy:** Seeks to equalize the risk contribution of each asset to the overall portfolio volatility. The fractional risk contribution of asset $i$ is defined as: $$w_i \frac{\partial \log \sigma(w)}{\partial w_i} = \frac{w_i (\hat{\Sigma}_t w)_i}{w^T \hat{\Sigma}_t w}$$ where $\sigma(w) = (w^T \hat{\Sigma}_t w)^{1/2}$.
* **Optimization:** This is achieved by solving the following convex optimization problem: $$\min_{x} \frac{1}{2} x^T \hat{\Sigma}_t x - \sum_{i=1}^n \frac{1}{n} \log x_i$$ where the portfolio weights are then obtained as $w_t = \frac{x^\star}{1^T x^\star}$.
* **Pros:** Offers a diversified approach to risk management, as it avoids concentration of risk in a few assets.
* **Cons:** Requires estimation of the covariance matrix and may be sensitive to changes in correlations.
4. **Maximum Diversification Portfolio:**
* **Strategy:** Maximizes the diversification ratio, balancing the individual asset volatilities $\hat{\sigma}_t$ against the covariance matrix $\hat{\Sigma}_t$: $$D(w) = \frac{\hat{\sigma}_t^T w}{(w^T \hat{\Sigma}_t w)^{1/2}}$$
* **Optimization:** This is achieved by solving the following convex optimization problem: \begin{split} \min_x& \quad x^T \hat{\Sigma}_t x \\ \text{subject to}& \quad \hat{\sigma}_t^T x = 1, \quad x \geq 0 \end{split} The portfolio weights are then obtained as $w_t = \frac{x^\star}{1^T x^\star}$.
* **Pros:** Aims for high diversification and can potentially enhance risk-adjusted returns.
* **Cons:** May lead to unintended biases towards certain assets or sectors, depending on the covariance matrix estimation.
5. **Mean-Variance Portfolio**
* **Strategy**: Maximizes the expected return for a given level of risk (volatility) or minimizes the risk for a given level of expected return by balancing individual asset returns and covariances.
* **Optimization**: This is achieved by solving the following convex optimization problem:
$$
\begin{aligned}
\text{maximize} \quad & \hat{r}_t^T w \\
\text{subject to} \quad & \| \hat{\Sigma}_t^{1/2} w \|_2 \leq \sigma^{\text{tar}}, \\
& \mathbf{1}^T w + c = 1, \\
& \| w \|_1 \leq L_{\max}, \\
& w_{\min} \leq w \leq w_{\max}, \\
& c_{\min} \leq c \leq c_{\max}
\end{aligned}
$$
#### Code Implementation
<font color=red>Make each portfolio strategy a single function</font>
```python=
import numpy as np
import pandas as pd
import cvxpy as cp
from scipy.optimize import minimize
from scipy.optimize import LinearConstraint, NonlinearConstraint

def equal_weight_portfolio(n_assets):
    return np.ones(n_assets) / n_assets

def min_variance_portfolio(cov_matrix, L_max=1, w_min=0, w_max=1):
    n = cov_matrix.shape[0]
    # Define variables
    w = cp.Variable(n)
    # Define objective
    objective = cp.Minimize(cp.quad_form(w, cov_matrix))
    # Define constraints
    constraints = [
        cp.sum(w) == 1,
        cp.norm(w, 1) <= L_max,
        w >= w_min,
        w <= w_max,
    ]
    # Solve the problem
    problem = cp.Problem(objective, constraints)
    problem.solve()
    return w.value

def risk_parity_portfolio(cov_matrix):
    n = cov_matrix.shape[0]
    # Define variables
    x = cp.Variable(n)
    # Define objective function
    objective = cp.Minimize((1 / 2) * cp.quad_form(x, cov_matrix) - cp.sum(cp.log(x)) / n)
    # The log term already forces x > 0; the constraint is kept for clarity
    constraints = [x >= 0]
    problem = cp.Problem(objective, constraints)
    problem.solve()
    # Normalize the weights
    return x.value / np.sum(x.value)

def max_diversification_portfolio(cov_matrix):
    """
    Calculate the maximum diversification portfolio.

    Parameters:
        cov_matrix (pd.DataFrame): Covariance matrix of asset returns.

    Returns:
        pd.Series: Weights of the maximum diversification portfolio.
    """
    # Predicted volatilities from the diagonal of the covariance matrix
    volatilities = pd.Series(np.sqrt(np.diag(cov_matrix.values)), index=cov_matrix.columns)
    n = len(volatilities)
    # Define the optimization variable
    x = cp.Variable(n, nonneg=True)
    # Minimize portfolio variance subject to unit weighted-volatility exposure
    obj = cp.Minimize(cp.quad_form(x, cov_matrix.values))
    constraints = [volatilities.values @ x == 1]
    prob = cp.Problem(obj, constraints)
    prob.solve()
    # Normalize to obtain portfolio weights
    optimal_weights = x.value / np.sum(x.value)
    return pd.Series(data=optimal_weights, index=cov_matrix.columns)

def mean_variance_portfolio(r_hat, Sigma_hat, sigma_tar, L_max, w_min, w_max, c_min, c_max):
    # Decision vector is [w, c]: asset weights plus a cash position
    # Objective: maximize expected return (minimize its negative)
    def objective(w):
        return -np.dot(r_hat, w[:-1])  # exclude the cash component c

    # Risk constraint: portfolio volatility must not exceed the target
    def risk_constraint(w):
        risk_value = np.dot(np.dot(w[:-1].T, Sigma_hat), w[:-1])
        return sigma_tar - np.sqrt(max(risk_value, 0.0))

    # Leverage constraint: sum of absolute asset weights at most L_max
    def leverage_constraint(w):
        return L_max - np.sum(np.abs(w[:-1]))

    # Bounds for asset weights and the cash position
    bounds = [(w_min, w_max)] * len(r_hat) + [(c_min, c_max)]

    # Constraints: risk, full-investment budget (weights plus cash sum to 1), leverage
    cons = [
        NonlinearConstraint(risk_constraint, 0, np.inf),
        LinearConstraint(np.ones(len(r_hat) + 1), 1, 1),
        NonlinearConstraint(leverage_constraint, 0, np.inf),
    ]

    # Initial guess
    initial_guess = np.ones(len(r_hat) + 1) / (len(r_hat) + 1)

    # Perform optimization
    result = minimize(
        objective,
        initial_guess,
        method="SLSQP",
        bounds=bounds,
        constraints=cons,
        options={"maxiter": 1000, "disp": True},
    )
    if not result.success:
        raise ValueError("Optimization did not converge: " + result.message)
    return result.x[:-1], result.x[-1]  # asset weights, cash weight
```
### Volatility Control with Cash
To ensure a fair comparison across strategies and protect against excessive risk-taking, we'll implement a volatility control mechanism. This involves mixing each portfolio with cash to achieve a target level of ex-ante volatility, denoted as $\sigma_{\text{tar}}$.
The adjustment is calculated as follows:
$$
\theta = \frac{\sigma_{\text{tar}}}{\sigma_t}
$$
where $\sigma_t$ is the ex-ante volatility of the portfolio before adjustment. The resulting portfolio weights, including the cash component, are then:
$$
\begin{bmatrix}
\theta w_t \\
1 - \theta
\end{bmatrix}
$$
where $w_t$ is the original portfolio weight vector.
#### Code Implementation
```python=
import numpy as np

def calculate_portfolio_volatility(weights, covariance_matrix):
    """Calculates portfolio volatility."""
    return np.sqrt(np.dot(weights.T, np.dot(covariance_matrix, weights)))

def adjust_weights_for_target_volatility(weights, current_volatility, target_volatility):
    """Adjusts weights to achieve target volatility."""
    theta = target_volatility / current_volatility
    if theta > 1:
        return np.append(weights, 0)  # No cash if leveraging would be needed
    cash_weight = 1 - theta
    adjusted_weights = theta * weights
    return np.append(adjusted_weights, cash_weight)
```
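A small usage sketch of these helpers, scaling an equal-weight portfolio to a hypothetical daily volatility target (all numbers illustrative):
```python=
import numpy as np

# Hypothetical inputs: covariance prediction for the rebalance date and raw weights
Sigma_hat = np.array([[0.0004, 0.0001],
                      [0.0001, 0.0002]])
w = np.array([0.5, 0.5])             # e.g., an equal-weight portfolio

sigma_t = calculate_portfolio_volatility(w, Sigma_hat)
target = 0.01                        # assumed 1% daily volatility target
w_adjusted = adjust_weights_for_target_volatility(w, sigma_t, target)
print(w_adjusted)                    # [theta * w, cash weight]
```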
### Experiments
**Equal Weight Portfolio**

```
RW : {'Realized Return': 0.09051058690162095, 'Volatility': 0.009367880343162886, 'Sharpe Ratio': 0.042714514810121486, 'Maximum Drawdown': -0.3241418755710692, 'Realized Volatility': 0.009375755099804763}
EWMA : {'Realized Return': 0.09042090361372868, 'Volatility': 0.009304508141763411, 'Sharpe Ratio': 0.04290583707745277, 'Maximum Drawdown': -0.3241418755710687, 'Realized Volatility': 0.009312407365382204}
IEWMA : {'Realized Return': 0.09203422850195375, 'Volatility': 0.009298029160324685, 'Sharpe Ratio': 0.04358964555801548, 'Maximum Drawdown': -0.32414187557106866, 'Realized Volatility': 0.009306199975786173}
CM-IEWMA : {'Realized Return': 0.0963989330832491, 'Volatility': 0.009217517445246839, 'Sharpe Ratio': 0.04568726919960968, 'Maximum Drawdown': -0.3241418755710684, 'Realized Volatility': 0.009226487972777862}
```
**Minimum Variance Portfolio**

```
RW : {'Realized Return': 0.07965078199555364, 'Volatility': 0.008719742748481836, 'Sharpe Ratio': 0.04044971693789385, 'Maximum Drawdown': -0.2894583490404754, 'Realized Volatility': 0.0087262545051118}
EWMA : {'Realized Return': 0.08127456041267855, 'Volatility': 0.009088956491420368, 'Sharpe Ratio': 0.039856316346716915, 'Maximum Drawdown': -0.31984752685494927, 'Realized Volatility': 0.009095519900858644}
IEWMA : {'Realized Return': 0.08134744743512856, 'Volatility': 0.009062169808763027, 'Sharpe Ratio': 0.039978210753052176, 'Maximum Drawdown': -0.3214050983516874, 'Realized Volatility': 0.009068758730534941}
CM-IEWMA : {'Realized Return': 0.08535426032504478, 'Volatility': 0.009023962138143231, 'Sharpe Ratio': 0.04180928675161649, 'Maximum Drawdown': -0.36103429719446584, 'Realized Volatility': 0.009031205082306539}
```
**Risk Parity Portfolio**

```
EWMA : {'Realized Return': 0.089771857327134, 'Volatility': 0.009147979841862839, 'Sharpe Ratio': 0.04320887881979153, 'Maximum Drawdown': -0.309067953484985, 'Realized Volatility': 0.009155868844420226}
IEWMA : {'Realized Return': 0.09144190074254399, 'Volatility': 0.009079586688271862, 'Sharpe Ratio': 0.04416639367503873, 'Maximum Drawdown': -0.30406983995964537, 'Realized Volatility': 0.009087800425460799}
CM-IEWMA : {'Realized Return': 0.0957830981595369, 'Volatility': 0.008903372740689319, 'Sharpe Ratio': 0.04671467936073269, 'Maximum Drawdown': -0.2978792517080981, 'Realized Volatility': 0.008912468313557692}
```
**Maximum Diversification Portfolio**

```
RW : {'Realized Return': 0.09270796273922777, 'Volatility': 0.009802008739481045, 'Sharpe Ratio': 0.0421025339561631, 'Maximum Drawdown': -0.35892271232984374, 'Realized Volatility': 0.009809985636449779}
EWMA : {'Realized Return': 0.08954898130510691, 'Volatility': 0.00988393872185069, 'Sharpe Ratio': 0.04062234382228661, 'Maximum Drawdown': -0.37772924430542043, 'Realized Volatility': 0.00989137157069012}
IEWMA : {'Realized Return': 0.09618565249728706, 'Volatility': 0.009560498345906087, 'Sharpe Ratio': 0.04430282440330771, 'Maximum Drawdown': -0.3629176794461102, 'Realized Volatility': 0.009569197443174894}
CM-IEWMA : {'Realized Return': 0.10262782834252171, 'Volatility': 0.009476868285735272, 'Sharpe Ratio': 0.04717517278978665, 'Maximum Drawdown': -0.3404822194817961, 'Realized Volatility': 0.009486745902075076}
```
**Mean Variance Portfolio**

```
RW : {'Realized Return': 0.06865750433300621, 'Volatility': 0.010877773378834986, 'Sharpe Ratio': 0.03047614765287506, 'Maximum Drawdown': -0.42980477878244217, 'Realized Volatility': 0.01088198522788498}
EWMA : {'Realized Return': 0.07092473841308511, 'Volatility': 0.010546457872318919, 'Sharpe Ratio': 0.03192935843888231, 'Maximum Drawdown': -0.4287854912795296, 'Realized Volatility': 0.010551027359283649}
IEWMA : {'Realized Return': 0.07474781207675285, 'Volatility': 0.010628921708819399, 'Sharpe Ratio': 0.03315378982817656, 'Maximum Drawdown': -0.39656380460297874, 'Realized Volatility': 0.010633953453707467}
CM-IEWMA : {'Realized Return': 0.09497297359672063, 'Volatility': 0.011061210069100803, 'Sharpe Ratio': 0.039277469618947854, 'Maximum Drawdown': -0.4090265777568538, 'Realized Volatility': 0.011068913834063873}
```
## Smooth Covariance Predictions
The main idea is to smooth the prediction $\hat{\Sigma}_t$ to obtain a smoother version, $\hat{\Sigma}_t^{sm}$. This is achieved by making $\hat{\Sigma}_t^{sm}$ an Exponentially Weighted Moving Average (EWMA) of $\hat{\Sigma}_t$, balancing smoother predictions and performance through a selected half-life.
The approach involves minimizing the following objective:
$$ \|\hat{\Sigma}_t^{sm} - \hat{\Sigma}_t\|_F^2 + \lambda \|\hat{\Sigma}_t^{sm} - \hat{\Sigma}_{t-1}^{sm}\|_F^2 $$
Here, $\lambda$ is a positive regularization parameter that adjusts the balance between smoothness and performance, similar to controlling the half-life in EWMA post-processing. The first term represents the loss, while the second term is a regularizer promoting smooth covariance predictions.
Additionally, more sophisticated smoothing methods can be developed by modifying the loss or the regularizer. For instance, using the Kullback-Leibler (KL) divergence as a loss yields a piecewise constant prediction, updating predictions only when necessary. This constitutes a convex optimization problem solvable efficiently.
1. **First Term**: $\|\hat{\Sigma}_{t}^{sm} - \hat{\Sigma}_{t}\|_{F}^{2}$
- $\hat{\Sigma}_{t}^{sm}$: This is the smoothed version of the covariance matrix at time $t$.
- $\hat{\Sigma}_{t}$: This is the original covariance matrix at time $t$.
- $\|\cdot\|_{F}^{2}$: This denotes the Frobenius norm squared, which measures the difference between two matrices. It sums the squares of all the entries in the matrix difference.
The first term measures how close the smoothed covariance matrix is to the original covariance matrix. Minimizing this term ensures that $\hat{\Sigma}_{t}^{sm}$ closely follows $\hat{\Sigma}_{t}$.
2. **Second Term**: $\lambda\|\hat{\Sigma}_{t}^{sm} - \hat{\Sigma}_{t-1}^{sm}\|_{F}^{2}$
- $\hat{\Sigma}_{t-1}^{sm}$: This is the smoothed version of the covariance matrix at time $t-1$.
- $\lambda$: This is a positive regularization parameter. It controls the trade-off between the smoothness of the predictions and how closely they follow the original covariance matrices.
The second term measures the difference between the smoothed covariance matrix at time $t$ and the smoothed covariance matrix at time $t-1$. Minimizing this term ensures that $\hat{\Sigma}_{t}^{sm}$ changes smoothly over time, avoiding abrupt changes. Setting the gradient of the objective to zero gives the closed-form update $\hat{\Sigma}_t^{sm} = \frac{1}{1+\lambda}\hat{\Sigma}_t + \frac{\lambda}{1+\lambda}\hat{\Sigma}_{t-1}^{sm}$, which is exactly an EWMA of the predictions with factor $\lambda/(1+\lambda)$.
```python=
import numpy as np
from scipy.optimize import minimize

def smooth_covariance(Sigma_pred, Sigma_sm_prev, lam):
    """
    Smooth the predicted covariance matrix using the specified objective function.

    Parameters:
    - Sigma_pred: The predicted covariance matrix at time t.
    - Sigma_sm_prev: The smoothed covariance matrix at time t-1.
    - lam: The regularization parameter lambda.

    Returns:
    - Sigma_sm: The smoothed covariance matrix at time t.
    """
    # Ensure Sigma_pred and Sigma_sm_prev are numpy arrays
    Sigma_pred = np.asarray(Sigma_pred)
    Sigma_sm_prev = np.asarray(Sigma_sm_prev)

    # Define the objective function: loss plus smoothness regularizer
    def objective(Sigma_sm_flat):
        Sigma_sm = Sigma_sm_flat.reshape(Sigma_pred.shape)
        loss = np.linalg.norm(Sigma_sm - Sigma_pred, "fro") ** 2
        regularizer = lam * np.linalg.norm(Sigma_sm - Sigma_sm_prev, "fro") ** 2
        return loss + regularizer

    # Initial guess for Sigma_sm is the predicted covariance matrix
    initial_guess = Sigma_pred.flatten()

    # Perform optimization
    result = minimize(objective, initial_guess, method="L-BFGS-B")
    if not result.success:
        raise ValueError("Optimization did not converge: " + result.message)

    # Reshape the result back to the original matrix shape
    return result.x.reshape(Sigma_pred.shape)
```
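Because the objective is quadratic in $\hat{\Sigma}_t^{sm}$, the minimizer has the closed form given above, so the numerical solution can be cross-checked against a one-line EWMA-style update; a small sketch with illustrative inputs:
```python=
import numpy as np

# Hypothetical inputs
Sigma_pred = np.array([[0.04, 0.006], [0.006, 0.01]])
Sigma_sm_prev = np.array([[0.05, 0.004], [0.004, 0.012]])
lam = 3.0

# Closed-form minimizer of ||S - Sigma_pred||_F^2 + lam * ||S - Sigma_sm_prev||_F^2
Sigma_closed = (Sigma_pred + lam * Sigma_sm_prev) / (1 + lam)

# Should agree (up to solver tolerance) with the numerical result above
Sigma_numeric = smooth_covariance(Sigma_pred, Sigma_sm_prev, lam)
print(np.allclose(Sigma_closed, Sigma_numeric, atol=1e-5))
```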
**Experiment with Smooth Covariance Predictions**
We select the CM-IEWMA covariance predictor in the mean-variance portfolio to test the effect of using smoothed covariance predictions.

## Reference
* Johansson, K., Ogut, M. G., Pelger, M., Schmelzer, T., & Boyd, S. (2023). A simple method for predicting covariance matrices of financial returns. Foundations and Trends in Econometrics, 12(4), 324–407.
* https://github.com/cvxgrp/cov_pred_finance