# Covariance Predictors in Financial Portfolio Management
## Introduction
Accurate prediction of covariance structures is essential in financial portfolio management, impacting risk assessment and asset allocation decisions. This note focuses on three prominent models:
* **Exponentially Weighted Moving Average (EWMA):** Emphasizes recent data, adapting to dynamic market conditions.
* **Iterated EWMA (IEWMA):** Refines EWMA through a two-step iterative process for enhanced stability and precision.
* **Combined Model IEWMA (CM-IEWMA):** Combines multiple IEWMA predictors with different half-lives, selecting combination weights that maximize predictive log-likelihood.
By understanding these advanced models, portfolio managers can gain a robust foundation for navigating changing market conditions and enhancing overall portfolio performance.
## Covariance Predictors
Covariance predictors are statistical models or algorithms used to forecast the covariance matrix of asset returns. The covariance matrix is a critical input in many financial applications, most notably in portfolio optimization and risk management. It quantifies the relationships between asset returns, providing insights into how they move together (or independently).
### Mathematical Setup
1. **Return Vector $r_t$:**
* An $n$-dimensional vector representing the returns of $n$ assets over a specific time period.
* Denoted as $r_t \in \mathbb{R}^n$.
2. **Covariance Matrix $\Sigma_t$:**
* A symmetric positive definite matrix capturing the covariance of asset returns.
* Denoted as $\Sigma_t \in S_n^{++}$ (the set of symmetric positive definite matrices).
* Mathematically defined as $$\Sigma_t = \text{cov}(r_t) = \mathbb{E}[r_t r_t^T] - \mathbb{E}[r_t]\,\mathbb{E}[r_t]^T$$
3. **Predicted Volatilities $\hat{\sigma}_t$:**
* Derived from the diagonal elements of the estimated covariance matrix $\text{diag}(\hat{\Sigma}_t)$, which represent variances.
* Calculated as $$\hat{\sigma}_t = \text{diag}(\hat{\Sigma}_t)^{1/2}$$
4. **Predicted Correlations $\hat{R}_t$:**
* Obtained by normalizing the estimated covariance matrix with the inverse of the predicted volatilities.
* Calculated as $$\hat{R}_t = \text{diag}(\hat{\sigma}_t)^{-1} \hat{\Sigma}_t \text{diag}(\hat{\sigma}_t)^{-1}$$
5. **Log-Likelihood Function $l_t(\hat{\Sigma}_t)$**:
* Assumes returns follow a multivariate Gaussian distribution.
* Used to assess the quality of the estimated covariance matrix.
* Higher log-likelihood values indicate better model fit.
* Defined as: $$l_t(\hat{\Sigma}_t) = \frac{1}{2} \left( -n \log(2\pi) - \log \det \hat{\Sigma}_t - r_t^T \hat{\Sigma}_t^{-1} r_t \right)$$
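As a quick illustration of these quantities, here is a minimal sketch (with an illustrative covariance matrix `Sigma_hat` and return vector `r`) that computes the predicted volatilities, correlations, and the Gaussian log-likelihood:
```python=
import numpy as np

# Hypothetical inputs: a predicted covariance matrix and a realized return vector
Sigma_hat = np.array([[0.04, 0.006],
                      [0.006, 0.01]])
r = np.array([0.012, -0.004])

# Predicted volatilities: square roots of the diagonal entries
sigma_hat = np.sqrt(np.diag(Sigma_hat))

# Predicted correlation matrix: diag(sigma)^{-1} Sigma diag(sigma)^{-1}
R_hat = Sigma_hat / np.outer(sigma_hat, sigma_hat)

# Gaussian log-likelihood of the realized return under the prediction
n = len(r)
_, logdet = np.linalg.slogdet(Sigma_hat)
log_lik = 0.5 * (-n * np.log(2 * np.pi) - logdet - r @ np.linalg.solve(Sigma_hat, r))
```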
### Rolling Window
The rolling window predictor, with a window length or memory $M$, estimates the covariance matrix $\hat{\Sigma}_t$ using the **average** of the last $M \geq n$ outer products. The formula for $\hat{\Sigma}_t$ is given by:
$$ \hat{\Sigma}_t = \alpha_t \sum_{\tau=\max\{1,\,t-M\}}^{t-1} r_{\tau} r_{\tau}^T, \quad t = 2, 3, \ldots $$
where
* $\hat{\Sigma}_t$: The estimated covariance matrix at time $t$.
* $\alpha_t = \frac{1}{\min \{t-1,M\}}$: A normalization factor or weighting coefficient applied at time $t$.
* $\sum_{\tau=\max\{1,\,t-M\}}^{t-1}$: Summation over a moving window of at most $M$ periods, from $\max\{1, t-M\}$ to $t-1$.
* $r_\tau$: The return vector at time $\tau$.
* $r_\tau r_\tau^T$: The outer product of the return vector $r_\tau$ with itself, a rank-one single-period covariance estimate.
```python=
import numpy as np
import pandas as pd

def rolling_window(returns, memory, min_periods=20):
    """Rolling-window covariance estimates from a DataFrame of returns."""
    min_periods = max(min_periods, 1)
    times = returns.index
    assets = returns.columns
    returns = returns.values
    Sigmas = np.zeros((returns.shape[0], returns.shape[1], returns.shape[1]))
    # Sigmas[t] averages the outer products of the returns observed up to time t
    Sigmas[0] = np.outer(returns[0], returns[0])
    for t in range(1, returns.shape[0]):
        # normalization factors reflect the number of outer products included
        alpha_old = 1 / min(t, memory)
        alpha_new = 1 / min(t + 1, memory)
        if t >= memory:
            # window is full: add the newest outer product, drop the oldest
            Sigmas[t] = alpha_new / alpha_old * Sigmas[t - 1] + alpha_new * (
                np.outer(returns[t], returns[t])
                - np.outer(returns[t - memory], returns[t - memory])
            )
        else:
            # expanding window during the burn-in period
            Sigmas[t] = alpha_new / alpha_old * Sigmas[t - 1] + alpha_new * np.outer(
                returns[t], returns[t]
            )
    # discard the first estimates, which are based on too few observations
    Sigmas = Sigmas[min_periods - 1 :]
    times = times[min_periods - 1 :]
    return {
        times[t]: pd.DataFrame(Sigmas[t], index=assets, columns=assets)
        for t in range(len(times))
    }
```
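A minimal usage sketch, assuming `returns` is a pandas DataFrame of daily asset returns (dates as index, assets as columns):
```python=
# Hypothetical usage of the rolling-window predictor defined above
Sigmas_rw = rolling_window(returns, memory=250, min_periods=20)

# Each entry maps a date to the covariance estimate available on that date
last_date = list(Sigmas_rw.keys())[-1]
print(Sigmas_rw[last_date].round(6))
```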
### Multivariate GARCH
<font color=red>Add contents and code implementation</font>
### EWMA Covariance Predictors
The Exponentially Weighted Moving Average (EWMA) model is a widely used method to estimate volatility and covariance. It gives more weight to recent observations, making it adaptable to changing market conditions.
* **Formula:** $$\hat{\Sigma}_t = \alpha_t \sum_{\tau=1}^{t-1} \beta^{t-1-\tau} r_\tau r_\tau^T$$ where
* $\hat{\Sigma}_t$: The estimated covariance matrix at time $t$.
  * $\alpha_t$: A normalization constant ensuring the weights sum to one; with the weights above, $\alpha_t = \frac{1-\beta}{1-\beta^{t-1}}$.
  * $\beta$: The forgetting factor $0 < \beta < 1$, controlling the rate of weight decay. Lower $\beta$ decays weights faster and emphasizes recent observations; higher $\beta$ gives a longer memory.
* **Half-Life $H$:** The number of periods for the weight of an observation to halve. It's related to the forgetting factor by: $$H = - \frac{\log 2}{\log \beta}$$
#### Interpretation and Example
Consider a forgetting factor of $\beta = 0.94$. This corresponds to a half-life of approximately 11 days:
$$
H = \frac{-\log(2)}{\log(0.94)} \approx 11 \text{ days}
$$
This means that the influence of returns on the covariance estimate is halved every 11 days, highlighting the model's emphasis on recent data.
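The half-life/forgetting-factor relationship is easy to verify numerically; a small sketch:
```python=
import numpy as np

beta = 0.94
halflife = -np.log(2) / np.log(beta)   # approximately 11.2 periods
print(halflife)

# Going the other way: the forgetting factor implied by a chosen half-life
beta_from_halflife = np.exp(-np.log(2) / halflife)
print(beta_from_halflife)              # approximately 0.94

# After one half-life the weight on an observation has halved
print(beta ** halflife)                # 0.5
```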

#### Code Implementation
```python=
def _general(y, times, halflife=None, alpha=None, fct=lambda x: x, min_periods=0, clip_at=None):
    """
    Calculates EWMA for a given function (fct) applied to input data (y).
    """
    def f(k):
        if k < min_periods - 1:
            return times[k], _ewma * np.nan
        return times[k], _ewma

    if halflife:
        alpha = 1 - np.exp(-np.log(2) / halflife)
    beta = 1 - alpha

    # first row, important to initialize the _ewma variable
    _ewma = fct(y[0])
    yield f(k=0)

    # iterate through all the remaining rows of y, starting in the 2nd row
    for n, row in enumerate(y[1:], start=1):
        # apply fct to the new observation
        next_val = fct(row)
        # update the moving average (with bias correction in the denominator)
        if clip_at and n >= min_periods + 1:
            _ewma = _ewma + (1 - beta) * (
                np.clip(next_val, -clip_at * _ewma, clip_at * _ewma) - _ewma
            ) / (1 - np.power(beta, n + 1))
        else:
            _ewma = _ewma + (1 - beta) * (next_val - _ewma) / (
                1 - np.power(beta, n + 1)
            )
        yield f(k=n)

def _ewma_cov(data, halflife, min_periods=0):
    """
    Calculates the EWMA covariance matrix.
    """
    for t, ewma in _general(
        data.values,
        times=data.index,
        halflife=halflife,
        fct=lambda x: np.outer(x, x),
        min_periods=min_periods,
    ):
        if not np.isnan(ewma).all():
            yield t, pd.DataFrame(index=data.columns, columns=data.columns, data=ewma)
```
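A minimal usage sketch of `_ewma_cov`, assuming `returns` is a pandas DataFrame of asset returns:
```python=
# Hypothetical usage: collect the EWMA covariance estimates into a dictionary
ewma_covs = dict(_ewma_cov(returns, halflife=125, min_periods=20))

last_date = list(ewma_covs.keys())[-1]
print(ewma_covs[last_date])  # covariance estimate for the most recent date
```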
### IEWMA Covariance Predictor
The Iterated Exponentially Weighted Moving Average (IEWMA) model enhances the traditional EWMA by incorporating a two-step iterative refinement process. This approach improves the accuracy of covariance matrix estimation, particularly the correlation structure.
1. **Volatility Estimation:**
* Volatilities (standard deviations) are estimated from the diagonal elements of the initial EWMA covariance matrix, denoted as $\hat{\Sigma}_t^{(1)}$.
* Mathematically: $$\hat{\sigma}_t = \text{diag}(\hat{\Sigma}_t^{(1)})^{1/2}$$ This yields the vector $\hat{\sigma}_t$ of estimated volatilities, one per asset.
2. **Correlation Matrix Estimation:**
* Returns are "whitened" by normalizing them with the estimated volatilities from the first step: $$\tilde{r}_t = (\hat{\Sigma}_t^{(1)})^{-1/2} r_t$$
* A new covariance matrix, $\hat{\Sigma}_t^{(2)}$, is calculated using these whitened returns. This matrix captures the correlations without being influenced by the magnitudes of the volatilities.
3. **Reconstruction of Covariance Matrix:**
* The final covariance matrix, $\hat{\Sigma}_t$, is reconstructed by rescaling the refined correlation structure $\hat{\Sigma}_t^{(2)}$ (with its diagonal normalized to one) by the initial volatility estimates $\hat{\sigma}_t$, as in the sketch below: $$\hat{\Sigma}_t = \text{diag}(\hat{\sigma}_t)\, \hat{\Sigma}_t^{(2)}\, \text{diag}(\hat{\sigma}_t)$$
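To make the three steps concrete, here is a minimal single-date sketch; the two EWMA estimates are assumed to be given as numpy arrays (the full streaming implementation follows below):
```python=
import numpy as np

# Hypothetical inputs: first-pass EWMA covariance and EWMA of whitened returns
Sigma1 = np.array([[0.04, 0.006],
                   [0.006, 0.01]])   # Sigma_t^(1)
Sigma2 = np.array([[1.02, 0.3],
                   [0.3, 0.97]])     # Sigma_t^(2), from whitened returns

# Step 1: volatilities from the diagonal of the first estimate
sigma = np.sqrt(np.diag(Sigma1))

# Step 2: normalize the second estimate so its diagonal is exactly one (correlations)
d = np.sqrt(np.diag(Sigma2))
R = Sigma2 / np.outer(d, d)

# Step 3: reassemble the covariance matrix as diag(sigma) R diag(sigma)
Sigma_iewma = np.outer(sigma, sigma) * R
print(Sigma_iewma)
```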
#### Why Whitening Returns is Important
Whitening is a critical step in IEWMA due to the following reasons:
* **Normalization:** It standardizes the returns to have unit variance, ensuring that correlation estimates are not biased by differences in the scales of asset volatilities.
* **Improved Accuracy:** Whitening allows for the estimation of pure correlations in the second step, leading to more accurate and reliable covariance matrix estimates.
* **Modeling Flexibility:** The standardized returns can be used in a wider range of statistical models and applications.
#### Code Implementation
```python=
def iterated_ewma(
    returns,
    vola_halflife,
    cov_halflife,
    min_periods_vola=20,
    min_periods_cov=20,
    mean=False,
    mu_halflife1=None,
    mu_halflife2=None,
    clip_at=None,
    nan_to_num=True,
):
    """
    param returns: pandas dataframe with returns for each asset
    param vola_halflife: half life for volatility
    param cov_halflife: half life for covariance
    param min_periods_vola: minimum number of periods to use for volatility
        ewma estimate
    param min_periods_cov: minimum number of periods to use for covariance
        ewma estimate
    param mean: whether to estimate mean; if False, assumes zero mean data
    param mu_halflife1: half life for the first mean adjustment; if None, it is
        set to vola_halflife; only applies if mean=True
    param mu_halflife2: half life for the second mean adjustment; if None, it is
        set to cov_halflife; only applies if mean=True
    param clip_at: winsorizes ewma update at +-clip_at*(current ewma) in ewma;
        if None, no winsorization is performed
    param nan_to_num: if True, replace NaNs in returns with 0.0

    Note: the helpers `center`, `volatility`, `clip` and the `IEWMA` named tuple
    are assumed to be available (as in the accompanying cov_pred_finance code).
    """
    mu_halflife1 = mu_halflife1 or vola_halflife
    mu_halflife2 = mu_halflife2 or cov_halflife

    def scale_cov(vola, matrix):
        index = matrix.index
        columns = matrix.columns
        matrix = matrix.values
        # Convert (covariance) matrix to correlation matrix
        diag = np.diagonal(matrix).copy()
        one_inds = np.where(diag == 0)[0]
        if one_inds.size > 0:
            diag[one_inds] = 1  # temporarily, to avoid division by zero
        v = 1 / np.sqrt(diag).reshape(-1, 1)
        v[one_inds] = np.nan
        matrix = v * matrix * v.T
        # Rescale the correlation matrix by the estimated volatilities
        cov = vola.reshape(-1, 1) * matrix * vola.reshape(1, -1)
        return pd.DataFrame(cov, index=index, columns=columns)

    def scale_mean(vola, vec1, vec2):
        return vec1 + vola * vec2

    if nan_to_num:
        if returns.isna().any().any():
            returns = returns.fillna(0.0)

    # compute the moving mean of the returns
    returns, returns_mean = center(
        returns=returns, halflife=mu_halflife1, min_periods=0, mean_adj=mean
    )

    # estimate the volatility, clip some returns before they enter the estimation
    vola = volatility(
        returns=returns,
        halflife=vola_halflife,
        min_periods=min_periods_vola,
        clip_at=clip_at,
    )

    # whiten the returns; make sure adj is NaN where vola is zero or returns are
    # NaN (a constant return series has vola = 0)
    adj = clip((returns / vola), clip_at=clip_at)  # if vola is zero, adj is NaN
    adj = adj.fillna(0.0)

    # center the whitened returns again
    adj, adj_mean = center(adj, halflife=mu_halflife2, min_periods=0, mean_adj=mean)

    for t, matrix in _ewma_cov(
        data=adj,
        halflife=cov_halflife,
        min_periods=min_periods_cov,
    ):
        if mean:
            m = scale_mean(
                vola=vola.loc[t].values, vec1=returns_mean.loc[t], vec2=adj_mean.loc[t]
            )
        else:
            m = pd.Series(np.zeros(returns.shape[1]), index=returns.columns)
        yield IEWMA(
            time=t,
            mean=m,
            covariance=scale_cov(vola=vola.loc[t].values, matrix=matrix),
            volatility=vola.loc[t],
        )
```
### CM-IEWMA: Combining Multiple Covariance Predictors
The Combined Model IEWMA (CM-IEWMA) leverages the strengths of multiple IEWMA predictors to create a more robust and accurate covariance matrix estimation. This approach recognizes that different IEWMA models, with varying parameters (like half-lives), might capture different aspects of the underlying market dynamics.
#### Setup
The CM-IEWMA starts with $K$ different IEWMA covariance predictors, denoted $\hat{\Sigma}_t^{(k)}$ for $k = 1, 2, \dots, K$. Each predictor can have distinct half-lives for volatility and covariance estimation, allowing for a diversified set of predictions.
The goal is to find the optimal weights $w_{k,t}$ for each predictor at time $t$ to create a combined covariance matrix:
$$
\hat{\Sigma}_t = \sum_{k=1}^K w_{k,t} \hat{\Sigma}_t^{(k)}
$$
#### Cholesky Factorization and Optimization
To achieve a stable and efficient combination of these predictors, the CM-IEWMA employs a method based on Cholesky factorization and convex optimization. Here's the breakdown:
1. **Cholesky Factorization of Precision Matrices:** Each individual precision matrix (inverse covariance matrix), denoted as $(\hat{\Sigma}_t^{(k)})^{-1}$, is factorized into a lower triangular matrix $\hat{L}_t^{(k)}$ and its transpose: $$\hat{\Sigma}_t^{(k)} = (\hat{L}_t^{(k)} (\hat{L}_t^{(k)})^T)^{-1}$$
2. **Combining Cholesky Factors:** The Cholesky factors $\hat{L}_t^{(k)}$ are combined using a set of non-negative weights $\pi_k$ (with $\sum_k \pi_k = 1$). This weighted sum forms a new lower triangular matrix $\hat{L}_t$: $$\hat{L}_t = \sum_{k=1}^K \pi_k \hat{L}_t^{(k)}$$
3. **Recovering the Combined Predictor:** The final combined covariance matrix is obtained by inverting the product of the combined Cholesky factor and its transpose: $$\hat{\Sigma}_t = (\hat{L}_t \hat{L}_t^T)^{-1}$$
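A minimal sketch of this combination for fixed weights, assuming a list of covariance predictions `Sigmas` and weights `pi` are given (the weight selection itself is discussed next):
```python=
import numpy as np

# Hypothetical inputs: K covariance predictions and fixed combination weights
Sigmas = [np.array([[0.04, 0.006], [0.006, 0.01]]),
          np.array([[0.05, 0.004], [0.004, 0.012]])]
pi = np.array([0.6, 0.4])

# Cholesky factors of the precision matrices: Sigma^{-1} = L L^T
Ls = [np.linalg.cholesky(np.linalg.inv(S)) for S in Sigmas]

# Weighted sum of the Cholesky factors (still lower triangular)
L = sum(w * Lk for w, Lk in zip(pi, Ls))

# Combined covariance prediction: Sigma = (L L^T)^{-1}
Sigma_comb = np.linalg.inv(L @ L.T)
print(Sigma_comb)
```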
#### Why Use Cholesky Factorization?
* **Numerical Stability:** Cholesky factorization ensures that the combined covariance matrix is always positive definite, which is essential for valid statistical properties and applications like portfolio optimization.
* **Computational Efficiency:** Working with lower triangular matrices simplifies calculations and improves computational speed compared to direct matrix inversion.
* **Flexibility:** The model can be easily adapted to new data or changing market conditions by updating the weights $\pi_k$.
#### Log-Likelihood and Weight Selection
To find the optimal weights $\pi_k$, the CM-IEWMA uses a log-likelihood function derived from the Cholesky decomposition. This function measures how well the combined precision matrix explains the observed returns. The weights are then selected by solving a convex optimization problem that maximizes the aggregate log-likelihood over a look-back window of $N$ periods:
$$
\begin{split}
\max_{\pi} \quad & \sum_{j=1}^N \left( \sum_{i=1}^n \log \hat{L}_{t-j,ii} - \frac{1}{2} \|\hat{L}_{t-j}^T r_{t-j}\|_2^2 \right) \\
\text{subject to} \quad & \hat{L}_{t-j} = \sum_{k=1}^K \pi_k \hat{L}_{t-j}^{(k)}, \quad j = 1, \dots, N, \\
& \pi \geq 0, \quad \mathbf{1}^T \pi = 1
\end{split}
$$
#### Code Implementation
```python=
def _cholesky_precision(cov):
    """
    Computes the Cholesky factor of the inverse of each covariance matrix in cov
    param cov: dictionary of covariance matrices {time: Sigma}
    """
    return {
        time: np.linalg.cholesky(np.linalg.inv(item.values))
        for time, item in cov.items()
    }

def _diag_part(cholesky):
    return {key: np.diag(L) for key, L in cholesky.items()}
```
```python=
import cvxpy as cvx

class _CombinationProblem:
    def __init__(self, keys, n, window):
        self.keys = keys
        K = len(keys)
        self._weight = cvx.Variable(len(keys), name="weights")
        self.A_param = cvx.Parameter((n * window, K))
        self.P_chol_param = cvx.Parameter((K, K))

    @property
    def _constraints(self):
        return [cvx.sum(self._weight) == 1, self._weight >= 0]

    @property
    def _objective(self):
        return cvx.sum(cvx.log(self.A_param @ self._weight)) - 0.5 * cvx.sum_squares(
            self.P_chol_param.T @ self._weight
        )

    def _construct_problem(self):
        self.prob = cvx.Problem(cvx.Maximize(self._objective), self._constraints)

    def solve(self, **kwargs):
        return self.prob.solve(**kwargs)

    @property
    def weights(self):
        return pd.Series(index=self.keys, data=self._weight.value)

    @property
    def status(self):
        return self.prob.status
```
```python=
class _CovarianceCombination:
    def __init__(self, sigmas, returns, means=None):
        """
        Computes the covariance combination of a set of covariance matrices

        param sigmas: dictionary of covariance matrices {key: {time: sigma}}
        param returns: pandas DataFrame of returns
        param means: dictionary of means {key: {time: mu}}, optional

        Note: sigmas[key][time] is a covariance matrix prediction for time+1
        Note: the helpers _nu, _B_t_col, _A and the Result named tuple are assumed
        to be available (as in the accompanying cov_pred_finance code).
        """
        n = returns.shape[1]
        if means is not None:
            for key, sigma in sigmas.items():
                # Assert sigmas and means have same keys
                assert (
                    sigma.keys() == means[key].keys()
                ), "sigmas and means must have same keys"
        else:
            # Set means to zero if not provided
            means = {
                k: {time: np.zeros(n) for time in sigma.keys()}
                for k, sigma in sigmas.items()
            }
        self.__means = means
        self.__sigmas = sigmas
        self.__returns = returns

        # all these quantities do not depend on the window size
        self.__Ls = pd.DataFrame(
            {k: _cholesky_precision(sigma) for k, sigma in self.sigmas.items()}
        )
        self.__Ls_shifted = self.__Ls.shift(1).dropna()
        self.__nus = pd.DataFrame(
            {key: _nu(Ls, self.means[key]) for key, Ls in self.__Ls.items()}
        )
        self.__nus_shifted = self.__nus.shift(1).dropna()

    @property
    def sigmas(self):
        return self.__sigmas

    @property
    def means(self):
        return self.__means

    @property
    def returns(self):
        return self.__returns

    @property
    def K(self):
        """
        Returns the number of expert predictors
        """
        return len(self.sigmas)

    @property
    def assets(self):
        """
        Returns the assets in the covariance combination problem
        """
        return self.returns.columns

    def solve(self, window=None, times=None, **kwargs):
        """
        Solves the covariance combination problem at the time steps given in times.

        The size of the window is needed to specify the size of the parameters
        for the cvxpy problem; hence these computations are not in __init__.

        param window: number of previous time steps to use in the covariance
            combination problem
        param times: list of time steps to solve the problem at; if None, solve
            at all available time steps
        """
        # If window is None, use all available data; cap window at length of data
        window = window or len(self.__Ls_shifted)
        window = min(window, len(self.__Ls_shifted))

        # Compute P matrix and its Cholesky factor
        Lts_at_r = pd.DataFrame(
            {
                key: _B_t_col(Ls, self.__nus_shifted[key], self.returns)
                for key, Ls in self.__Ls_shifted.items()
            }
        )

        # time steps to solve the problem at
        all_available_times = Lts_at_r.index
        if times is None:
            times = list(Lts_at_r.index)
        else:
            times = list(times)
            # assert times are consecutive in all_available_times
            zero_index = all_available_times.get_loc(times[0])
            assert list(
                all_available_times[zero_index : zero_index + len(times)]
            ) == list(times), "times must be consecutive"
            # add window - 1 previous time steps to times
            times = (
                list(all_available_times[zero_index - window + 1 : zero_index]) + times
            )

        Bs = {time: np.column_stack(Lts_at_r.loc[time]) for time in times}
        prod_Bs = pd.Series({time: B.T @ B for time, B in Bs.items()})
        P = {
            times[i]: sum(prod_Bs.loc[times[i - window + 1] : times[i]])
            for i in range(window - 1, len(times))
        }
        P_chol = {time: np.linalg.cholesky(matrix) for time, matrix in P.items()}

        # Compute A matrix
        Ls_diag = pd.DataFrame({k: _diag_part(L) for k, L in self.__Ls_shifted.items()})
        A = {
            times[i]: _A(
                Ls_diag.truncate(before=times[i - window + 1], after=times[i]), self.K
            )
            for i in range(window - 1, len(times))
        }

        problem = _CombinationProblem(
            keys=self.sigmas.keys(), n=len(self.assets), window=window
        )
        problem._construct_problem()

        for time, AA in A.items():
            problem.A_param.value = AA
            problem.P_chol_param.value = P_chol[time]
            try:
                yield self._solve(time=time, problem=problem, **kwargs)
            except cvx.SolverError:
                print(f"Solver did not converge at time {time}")
                yield None

    def _solve(self, time, problem, **kwargs):
        """
        Solves the covariance combination problem at a given time t
        """
        problem.solve(**kwargs)
        weights = problem.weights

        # Get non-shifted L
        L = sum(self.__Ls.loc[time] * weights.values)  # prediction for time+1
        nu = sum(self.__nus.loc[time] * weights.values)  # prediction for time+1

        mean = pd.Series(index=self.assets, data=np.linalg.inv(L.T) @ nu)
        sigma = pd.DataFrame(
            index=self.assets, columns=self.assets, data=np.linalg.inv(L @ L.T)
        )
        return Result(time=time, mean=mean, covariance=sigma, weights=weights)
```
## Evaluating Covariance Predictors
### Log-Likelihood as a Performance Metric
A straightforward way to assess the performance of a covariance predictor is by examining its average log-likelihood on realized returns. The log-likelihood function, under the assumption of multivariate Gaussian returns, is given by:
$$
\frac{1}{2T} \sum_{t=1}^{T} \left( -n \log(2\pi) - \log \det \hat{\Sigma}_t - r_t^T \hat{\Sigma}_t^{-1} r_t \right).
$$
Higher values indicate a better fit of the predicted covariance matrix $\hat{\Sigma}_t$ to the observed returns $r_t$. This metric allows us to compare different predictors and identify periods of varying performance.
#### Code Implementation
```python=
def log_likelihood(r_t, sigma_t):
    """Gaussian log-likelihood of the return vector r_t under covariance sigma_t."""
    n = len(r_t)
    # slogdet and solve are numerically more stable than det and explicit inversion
    _, log_det_sigma = np.linalg.slogdet(sigma_t)
    mahalanobis_dist = r_t.T @ np.linalg.solve(sigma_t, r_t)
    return -0.5 * (n * np.log(2 * np.pi) + log_det_sigma + mahalanobis_dist)
```
```python=
# log_likelihoods maps predictor names to their log-likelihood series;
# "PRESCIENT" denotes the benchmark predictor
regrets = {}
for name in log_likelihoods:
    regrets[name] = log_likelihoods["PRESCIENT"] - log_likelihoods[name]
```
### Log-Likelihood Regret: A Benchmark Comparison
To gain deeper insights into a predictor's performance, we can introduce the concept of log-likelihood regret, which compares the predictor's average log-likelihood to that of the best possible constant predictor (a fixed covariance matrix) over the entire dataset. The best constant predictor is the empirical sample covariance matrix $\Sigma^{\text{emp}}$, whose average log-likelihood is
$$
\frac{1}{2} \left( -n \left( \log(2\pi) + 1 \right) - \log \det \Sigma^{\text{emp}} \right).
$$
The average log-likelihood regret is then the difference between this baseline and the predictor's average log-likelihood:
$$
\text{Regret} = \frac{1}{2} \left( -n \left( \log(2\pi) + 1 \right) - \log \det \Sigma^{\text{emp}} \right) - \frac{1}{2T} \sum_{t=1}^{T} \left( -n \log(2\pi) - \log \det \hat{\Sigma}_t - r_t^T \hat{\Sigma}_t^{-1} r_t \right).
$$
The regret quantifies how much the predictor underperforms compared to the best constant predictor. Minimizing regret indicates that the predictor adapts well to changing market conditions.
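A minimal sketch of the average log-likelihood and the regret against the empirical-covariance baseline described above, assuming a DataFrame `returns` and a dictionary `Sigmas` mapping dates to covariance predictions (names are illustrative):
```python=
import numpy as np

def average_log_likelihood(returns, Sigmas):
    """Average Gaussian log-likelihood of realized returns under the predictions."""
    vals = []
    for t, Sigma in Sigmas.items():
        r = returns.loc[t].values
        S = np.asarray(Sigma)
        n = len(r)
        _, logdet = np.linalg.slogdet(S)
        vals.append(0.5 * (-n * np.log(2 * np.pi) - logdet - r @ np.linalg.solve(S, r)))
    return np.mean(vals)

def best_constant_log_likelihood(returns):
    """Average log-likelihood of the best constant (empirical covariance) predictor."""
    n = returns.shape[1]
    Sigma_emp = returns.values.T @ returns.values / len(returns)  # zero-mean convention
    _, logdet = np.linalg.slogdet(Sigma_emp)
    return 0.5 * (-n * (np.log(2 * np.pi) + 1) - logdet)

# regret = best_constant_log_likelihood(returns) - average_log_likelihood(returns, Sigmas)
```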
### Example: Stock Portfolio Covariance Prediction
#### Data Setup
```python=
import yfinance as yf

def get_stock_price(ticker, startd, endd):
    prices = yf.download(ticker, start=startd, end=endd)
    prices = prices["Adj Close"].dropna(how="all")  # drop rows where all values are missing
    return prices
```
```python=
START_DATE = '2000-07-01'
END_DATE = '2023-12-31'
ticker = ['XOM','WMT','AAPL','PG','JNJ','CHL','IBM','SBC','GE','CHV',
'PFE','NOB','NCB','KO','ORCL','HWP','INTC','MRK','PEP','BEL',
'ABT','SLB','P','PA','MCD']
```

#### Half-Life Parameters and Covariance Predictors
```python=
# half-life parameters
ewma_halflife = 125
iewma_pair = (63, 125)
cm_iewma_pairs = [(10, 21), (21, 63), (63, 125), (125, 250), (250, 500)]

# EWMA
ewma = dict(_ewma_cov(returns, halflife=ewma_halflife))

# IEWMA
iewma = list(iterated_ewma(returns, vola_halflife=iewma_pair[0], cov_halflife=iewma_pair[1]))
iewma = {iterate.time: iterate.covariance for iterate in iewma}

# CM-IEWMA: one IEWMA predictor per half-life pair
iewmas = {
    f"{pair[0]}-{pair[1]}": list(
        iterated_ewma(
            returns,
            vola_halflife=pair[0],
            cov_halflife=pair[1],
            min_periods_vola=63,
            min_periods_cov=63,
        )
    )
    for pair in cm_iewma_pairs
}
Sigmas = {key: {item.time: item.covariance for item in iewma} for key, iewma in iewmas.items()}

# Regularize the fastest predictor's covariance matrices
# (add_to_diagonal and from_sigmas are helpers from the accompanying cov_pred_finance code)
fast = cm_iewma_pairs[0]
fast = f"{fast[0]}-{fast[1]}"
Sigmas[fast] = add_to_diagonal(Sigmas[fast], lamda=0.05)

results = list(from_sigmas(Sigmas, returns, means=None).solve(window=10))
cm_iewma = {result.time: result.covariance for result in results}
weights = {result.time: result.weights for result in results}
```
<font color=red>Add graph</font>


#### Change of Weights for Different Half-Life Parameters in CM-IEWMA
```python=
import matplotlib.pyplot as plt

# average the combination weights by year and plot them as a stacked area chart
annual_weights = pd.DataFrame(weights).T.resample("Y").mean().loc[START_DATE:END_DATE]
plt.stackplot(
    annual_weights.index,
    annual_weights.values.T,
    labels=[f"{pair[0]}-{pair[1]}" for pair in cm_iewma_pairs],
)
plt.legend(bbox_to_anchor=(1.05, 1), loc="upper left", borderaxespad=0.0);
```

#### Log Likelihood and Log-likelihood Regret
**Log Likelihood**
The experimental results indicate that the PRESCIENT predictor consistently outperforms the other methods, providing the best fit to the observed data. The other predictors, while generally close to one another in performance, show more variability and are less robust during periods of high market volatility.
**Log-likelihood Regret**
The experimental results indicate that all predictors show comparable performance in terms of log-likelihood regret, with lower values indicating better performance. The spike in regret around 2020 highlights the challenges these predictors face during periods of high market volatility.
## Portfolio Experiments
A practical way to evaluate covariance predictors is to assess their impact on real-world portfolio construction and performance. In this section, we'll conduct experiments using a portfolio of 16 stocks, comparing the results obtained using different covariance prediction methods.
### Portfolio Construction Strategies
We'll consider four common portfolio construction strategies and evaluate how different covariance predictors affect their performance:
1. **Equal Weight Portfolio:**
* **Strategy:** Allocates equal weights to all assets in the portfolio, i.e., $w_t = \frac{1}{n} \mathbf{1}$.
* **Pros:** Simple and intuitive to implement.
* **Cons:** May not be optimal in terms of risk-return tradeoff, as it does not consider individual asset risk or correlations.
2. **Minimum Variance Portfolio:**
* **Strategy:** Aims to minimize the portfolio's variance $w^T \hat{\Sigma}_t w$, subject to constraints:
* **Full investment:** $w^T \mathbf{1} = 1$
* **Leverage limit:** $\|w\|_1 \leq L_{\max}$
* **Individual weight bounds:** $w_{\min} \leq w \leq w_{\max}$
* **Pros:** Focuses on risk reduction, potentially leading to a more stable portfolio.
* **Cons:** May miss out on potential returns by overweighting low-volatility assets and underweighting high-return assets.
3. **Risk Parity Portfolio:**
* **Strategy:** Seeks to equalize the risk contribution of each asset to the overall portfolio volatility. The fractional risk contribution of asset $i$ is defined as: $$w_i \frac{\partial \log \sigma(w)}{\partial w_i} = \frac{w_i (\hat{\Sigma}_t w)_i}{w^T \hat{\Sigma}_t w}$$ where $\sigma(w) = (w^T \hat{\Sigma}_t w)^{1/2}$.
* **Optimization:** This is achieved by solving the following convex optimization problem: $$\min_{x} \frac{1}{2} x^T \hat{\Sigma}_t x - \sum_{i=1}^n \frac{1}{n} \log x_i$$ where the portfolio weights are then obtained as $w_t = \frac{x^\star}{1^T x^\star}$.
* **Pros:** Offers a diversified approach to risk management, as it avoids concentration of risk in a few assets.
* **Cons:** Requires estimation of the covariance matrix and may be sensitive to changes in correlations.
4. **Maximum Diversification Portfolio:**
* **Strategy:** Maximizes the diversification ratio, balancing the individual asset volatilities $\hat{\sigma}_t$ against the covariance matrix $\hat{\Sigma}_t$: $$D(w) = \frac{\hat{\sigma}_t^T w}{(w^T \hat{\Sigma}_t w)^{1/2}}$$
* **Optimization:** This is achieved by solving the following convex optimization problem: \begin{split} \min_x& \quad x^T \hat{\Sigma}_t x \\ \text{subject to}& \quad \hat{\sigma}_t^T x = 1, \quad x \geq 0 \end{split} The portfolio weights are then obtained as $w_t = \frac{x^\star}{1^T x^\star}$.
* **Pros:** Aims for high diversification and can potentially enhance risk-adjusted returns.
* **Cons:** May lead to unintended biases towards certain assets or sectors, depending on the covariance matrix estimation.
5. **Mean-Variance Portfolio**
* **Strategy**: Maximizes the expected return for a given level of risk (volatility) or minimizes the risk for a given level of expected return by balancing individual asset returns and covariances.
* **Optimization**: This is achieved by solving the following convex optimization problem:
$$
\begin{aligned}
\text{maximize} \quad & \hat{r}_t^T w \\
\text{subject to} \quad & \| \hat{\Sigma}_t^{1/2} w \|_2 \leq \sigma^{\text{tar}}, \\
& \mathbf{1}^T w + c = 1, \\
& \| w \|_1 \leq L_{\max}, \\
& w_{\min} \leq w \leq w_{\max}, \\
& c_{\min} \leq c \leq c_{\max}
\end{aligned}
$$
#### Code Implementation
<font color=red>Make each portfolio strategy a single function</font>
```python=
import numpy as np
import pandas as pd
import cvxpy as cp
from scipy.optimize import minimize
from scipy.optimize import LinearConstraint, NonlinearConstraint

def equal_weight_portfolio(n_assets):
    return np.ones(n_assets) / n_assets

def min_variance_portfolio(cov_matrix, L_max=1, w_min=0, w_max=1):
    n = cov_matrix.shape[0]
    # Define variables
    w = cp.Variable(n)
    # Define objective
    objective = cp.Minimize(cp.quad_form(w, cov_matrix))
    # Define constraints
    constraints = [
        cp.sum(w) == 1,
        cp.norm(w, 1) <= L_max,
        w >= w_min,
        w <= w_max,
    ]
    # Solve the problem
    problem = cp.Problem(objective, constraints)
    problem.solve()
    return w.value

def risk_parity_portfolio(cov_matrix):
    n = cov_matrix.shape[0]
    # Define variables
    x = cp.Variable(n)
    # Define objective function
    objective = cp.Minimize((1 / 2) * cp.quad_form(x, cov_matrix) - cp.sum(cp.log(x)) / n)
    # The log term already forces x > 0; the constraint is kept for clarity
    constraints = [x >= 0]
    problem = cp.Problem(objective, constraints)
    problem.solve()
    # Normalize the weights
    return x.value / np.sum(x.value)

def max_diversification_portfolio(cov_matrix):
    """
    Calculate the maximum diversification portfolio.

    Parameters:
        cov_matrix (pd.DataFrame): Covariance matrix of asset returns.

    Returns:
        pd.Series: Weights of the maximum diversification portfolio.
    """
    # Predicted volatilities from the diagonal of the covariance matrix
    volatilities = pd.Series(np.sqrt(np.diag(cov_matrix.values)), index=cov_matrix.columns)
    n = len(volatilities)
    # Define the optimization variable
    x = cp.Variable(n, nonneg=True)
    # Minimize portfolio variance subject to unit weighted-volatility exposure
    obj = cp.Minimize(cp.quad_form(x, cov_matrix.values))
    constraints = [volatilities.values @ x == 1]
    prob = cp.Problem(obj, constraints)
    prob.solve()
    # Normalize to obtain portfolio weights
    optimal_weights = x.value / np.sum(x.value)
    return pd.Series(data=optimal_weights, index=cov_matrix.columns)

def mean_variance_portfolio(r_hat, Sigma_hat, sigma_tar, L_max, w_min, w_max, c_min, c_max):
    # Decision vector is [w, c]: asset weights plus a cash position
    # Objective: maximize expected return (minimize its negative)
    def objective(w):
        return -np.dot(r_hat, w[:-1])  # exclude the cash component c

    # Risk constraint: portfolio volatility must not exceed the target
    def risk_constraint(w):
        risk_value = np.dot(np.dot(w[:-1].T, Sigma_hat), w[:-1])
        return sigma_tar - np.sqrt(max(risk_value, 0.0))

    # Leverage constraint: sum of absolute asset weights at most L_max
    def leverage_constraint(w):
        return L_max - np.sum(np.abs(w[:-1]))

    # Bounds for asset weights and the cash position
    bounds = [(w_min, w_max)] * len(r_hat) + [(c_min, c_max)]

    # Constraints: risk, full-investment budget (weights plus cash sum to 1), leverage
    cons = [
        NonlinearConstraint(risk_constraint, 0, np.inf),
        LinearConstraint(np.ones(len(r_hat) + 1), 1, 1),
        NonlinearConstraint(leverage_constraint, 0, np.inf),
    ]

    # Initial guess
    initial_guess = np.ones(len(r_hat) + 1) / (len(r_hat) + 1)

    # Perform optimization
    result = minimize(
        objective,
        initial_guess,
        method="SLSQP",
        bounds=bounds,
        constraints=cons,
        options={"maxiter": 1000, "disp": True},
    )
    if not result.success:
        raise ValueError("Optimization did not converge: " + result.message)
    return result.x[:-1], result.x[-1]  # asset weights, cash weight
```
### Volatility Control with Cash
To ensure a fair comparison across strategies and protect against excessive risk-taking, we'll implement a volatility control mechanism. This involves mixing each portfolio with cash to achieve a target level of ex-ante volatility, denoted as $\sigma_{\text{tar}}$.
The adjustment is calculated as follows:
$$
\theta = \frac{\sigma_{\text{tar}}}{\sigma_t}
$$
where $\sigma_t$ is the ex-ante volatility of the portfolio before adjustment. The resulting portfolio weights, including the cash component, are then:
$$
\begin{bmatrix}
\theta w_t \\
1 - \theta
\end{bmatrix}
$$
where $w_t$ is the original portfolio weight vector.
#### Code Implementation
```python=
import numpy as np

def calculate_portfolio_volatility(weights, covariance_matrix):
    """Calculates portfolio volatility."""
    return np.sqrt(np.dot(weights.T, np.dot(covariance_matrix, weights)))

def adjust_weights_for_target_volatility(weights, current_volatility, target_volatility):
    """Adjusts weights to achieve target volatility."""
    theta = target_volatility / current_volatility
    if theta > 1:
        return np.append(weights, 0)  # No cash if leveraging would be needed
    cash_weight = 1 - theta
    adjusted_weights = theta * weights
    return np.append(adjusted_weights, cash_weight)
```
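A small usage sketch of these helpers, scaling an equal-weight portfolio to a hypothetical daily volatility target (all numbers illustrative):
```python=
import numpy as np

# Hypothetical inputs: covariance prediction for the rebalance date and raw weights
Sigma_hat = np.array([[0.0004, 0.0001],
                      [0.0001, 0.0002]])
w = np.array([0.5, 0.5])             # e.g., an equal-weight portfolio

sigma_t = calculate_portfolio_volatility(w, Sigma_hat)
target = 0.01                        # assumed 1% daily volatility target
w_adjusted = adjust_weights_for_target_volatility(w, sigma_t, target)
print(w_adjusted)                    # [theta * w, cash weight]
```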
### Experiments
**Equal Weight Portfolio**

```
RW : {'Realized Return': 0.09051058690162095, 'Volatility': 0.009367880343162886, 'Sharpe Ratio': 0.042714514810121486, 'Maximum Drawdown': -0.3241418755710692, 'Realized Volatility': 0.009375755099804763}
EWMA : {'Realized Return': 0.09042090361372868, 'Volatility': 0.009304508141763411, 'Sharpe Ratio': 0.04290583707745277, 'Maximum Drawdown': -0.3241418755710687, 'Realized Volatility': 0.009312407365382204}
IEWMA : {'Realized Return': 0.09203422850195375, 'Volatility': 0.009298029160324685, 'Sharpe Ratio': 0.04358964555801548, 'Maximum Drawdown': -0.32414187557106866, 'Realized Volatility': 0.009306199975786173}
CM-IEWMA : {'Realized Return': 0.0963989330832491, 'Volatility': 0.009217517445246839, 'Sharpe Ratio': 0.04568726919960968, 'Maximum Drawdown': -0.3241418755710684, 'Realized Volatility': 0.009226487972777862}
```
**Minimum Variance Portfolio**

```
RW : {'Realized Return': 0.07965078199555364, 'Volatility': 0.008719742748481836, 'Sharpe Ratio': 0.04044971693789385, 'Maximum Drawdown': -0.2894583490404754, 'Realized Volatility': 0.0087262545051118}
EWMA : {'Realized Return': 0.08127456041267855, 'Volatility': 0.009088956491420368, 'Sharpe Ratio': 0.039856316346716915, 'Maximum Drawdown': -0.31984752685494927, 'Realized Volatility': 0.009095519900858644}
IEWMA : {'Realized Return': 0.08134744743512856, 'Volatility': 0.009062169808763027, 'Sharpe Ratio': 0.039978210753052176, 'Maximum Drawdown': -0.3214050983516874, 'Realized Volatility': 0.009068758730534941}
CM-IEWMA : {'Realized Return': 0.08535426032504478, 'Volatility': 0.009023962138143231, 'Sharpe Ratio': 0.04180928675161649, 'Maximum Drawdown': -0.36103429719446584, 'Realized Volatility': 0.009031205082306539}
```
**Risk Parity Portfolio**

```
EWMA : {'Realized Return': 0.089771857327134, 'Volatility': 0.009147979841862839, 'Sharpe Ratio': 0.04320887881979153, 'Maximum Drawdown': -0.309067953484985, 'Realized Volatility': 0.009155868844420226}
IEWMA : {'Realized Return': 0.09144190074254399, 'Volatility': 0.009079586688271862, 'Sharpe Ratio': 0.04416639367503873, 'Maximum Drawdown': -0.30406983995964537, 'Realized Volatility': 0.009087800425460799}
CM-IEWMA : {'Realized Return': 0.0957830981595369, 'Volatility': 0.008903372740689319, 'Sharpe Ratio': 0.04671467936073269, 'Maximum Drawdown': -0.2978792517080981, 'Realized Volatility': 0.008912468313557692}
```
**Maximum Diversification Portfolio**

```
RW : {'Realized Return': 0.09270796273922777, 'Volatility': 0.009802008739481045, 'Sharpe Ratio': 0.0421025339561631, 'Maximum Drawdown': -0.35892271232984374, 'Realized Volatility': 0.009809985636449779}
EWMA : {'Realized Return': 0.08954898130510691, 'Volatility': 0.00988393872185069, 'Sharpe Ratio': 0.04062234382228661, 'Maximum Drawdown': -0.37772924430542043, 'Realized Volatility': 0.00989137157069012}
IEWMA : {'Realized Return': 0.09618565249728706, 'Volatility': 0.009560498345906087, 'Sharpe Ratio': 0.04430282440330771, 'Maximum Drawdown': -0.3629176794461102, 'Realized Volatility': 0.009569197443174894}
CM-IEWMA : {'Realized Return': 0.10262782834252171, 'Volatility': 0.009476868285735272, 'Sharpe Ratio': 0.04717517278978665, 'Maximum Drawdown': -0.3404822194817961, 'Realized Volatility': 0.009486745902075076}
```
**Mean Variance Portfolio**

```
RW : {'Realized Return': 0.06865750433300621, 'Volatility': 0.010877773378834986, 'Sharpe Ratio': 0.03047614765287506, 'Maximum Drawdown': -0.42980477878244217, 'Realized Volatility': 0.01088198522788498}
EWMA : {'Realized Return': 0.07092473841308511, 'Volatility': 0.010546457872318919, 'Sharpe Ratio': 0.03192935843888231, 'Maximum Drawdown': -0.4287854912795296, 'Realized Volatility': 0.010551027359283649}
IEWMA : {'Realized Return': 0.07474781207675285, 'Volatility': 0.010628921708819399, 'Sharpe Ratio': 0.03315378982817656, 'Maximum Drawdown': -0.39656380460297874, 'Realized Volatility': 0.010633953453707467}
CM-IEWMA : {'Realized Return': 0.09497297359672063, 'Volatility': 0.011061210069100803, 'Sharpe Ratio': 0.039277469618947854, 'Maximum Drawdown': -0.4090265777568538, 'Realized Volatility': 0.011068913834063873}
```
## Smooth Covariance Predictions
The main idea is to smooth the prediction $\hat{\Sigma}_t$ to obtain a smoother version, $\hat{\Sigma}_t^{sm}$. This is achieved by making $\hat{\Sigma}_t^{sm}$ an Exponentially Weighted Moving Average (EWMA) of $\hat{\Sigma}_t$, balancing smoother predictions and performance through a selected half-life.
The approach involves minimizing the following objective:
$$ \|\hat{\Sigma}_t^{sm} - \hat{\Sigma}_t\|_F^2 + \lambda \|\hat{\Sigma}_t^{sm} - \hat{\Sigma}_{t-1}^{sm}\|_F^2 $$
Here, $\lambda$ is a positive regularization parameter that adjusts the balance between smoothness and performance, similar to controlling the half-life in EWMA post-processing. The first term represents the loss, while the second term is a regularizer promoting smooth covariance predictions.
Additionally, more sophisticated smoothing methods can be developed by modifying the loss or the regularizer. For instance, using the Kullback-Leibler (KL) divergence as a loss yields a piecewise constant prediction, updating predictions only when necessary. This constitutes a convex optimization problem solvable efficiently.
1. **First Term**: $\|\hat{\Sigma}_{t}^{sm} - \hat{\Sigma}_{t}\|_{F}^{2}$
- $\hat{\Sigma}_{t}^{sm}$: This is the smoothed version of the covariance matrix at time $t$.
- $\hat{\Sigma}_{t}$: This is the original covariance matrix at time $t$.
- $\|\cdot\|_{F}^{2}$: This denotes the Frobenius norm squared, which measures the difference between two matrices. It sums the squares of all the entries in the matrix difference.
The first term measures how close the smoothed covariance matrix is to the original covariance matrix. Minimizing this term ensures that $\hat{\Sigma}_{t}^{sm}$ closely follows $\hat{\Sigma}_{t}$.
2. **Second Term**: $\lambda\|\hat{\Sigma}_{t}^{sm} - \hat{\Sigma}_{t-1}^{sm}\|_{F}^{2}$
- $\hat{\Sigma}_{t-1}^{sm}$: This is the smoothed version of the covariance matrix at time $t-1$.
- $\lambda$: This is a positive regularization parameter. It controls the trade-off between the smoothness of the predictions and how closely they follow the original covariance matrices.
The second term measures the difference between the smoothed covariance matrix at time $t$ and the smoothed covariance matrix at time $t-1$. Minimizing this term ensures that $\hat{\Sigma}_{t}^{sm}$ changes smoothly over time, avoiding abrupt changes. Setting the gradient of the objective to zero gives the closed-form update $\hat{\Sigma}_t^{sm} = \frac{1}{1+\lambda}\hat{\Sigma}_t + \frac{\lambda}{1+\lambda}\hat{\Sigma}_{t-1}^{sm}$, which is exactly an EWMA of the predictions with factor $\lambda/(1+\lambda)$.
```python=
import numpy as np
from scipy.optimize import minimize

def smooth_covariance(Sigma_pred, Sigma_sm_prev, lam):
    """
    Smooth the predicted covariance matrix using the specified objective function.

    Parameters:
    - Sigma_pred: The predicted covariance matrix at time t.
    - Sigma_sm_prev: The smoothed covariance matrix at time t-1.
    - lam: The regularization parameter lambda.

    Returns:
    - Sigma_sm: The smoothed covariance matrix at time t.
    """
    # Ensure Sigma_pred and Sigma_sm_prev are numpy arrays
    Sigma_pred = np.asarray(Sigma_pred)
    Sigma_sm_prev = np.asarray(Sigma_sm_prev)

    # Define the objective function: loss plus smoothness regularizer
    def objective(Sigma_sm_flat):
        Sigma_sm = Sigma_sm_flat.reshape(Sigma_pred.shape)
        loss = np.linalg.norm(Sigma_sm - Sigma_pred, "fro") ** 2
        regularizer = lam * np.linalg.norm(Sigma_sm - Sigma_sm_prev, "fro") ** 2
        return loss + regularizer

    # Initial guess for Sigma_sm is the predicted covariance matrix
    initial_guess = Sigma_pred.flatten()

    # Perform optimization
    result = minimize(objective, initial_guess, method="L-BFGS-B")
    if not result.success:
        raise ValueError("Optimization did not converge: " + result.message)

    # Reshape the result back to the original matrix shape
    return result.x.reshape(Sigma_pred.shape)
```
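Because the objective is quadratic in $\hat{\Sigma}_t^{sm}$, the minimizer has the closed form given above, so the numerical solution can be cross-checked against a one-line EWMA-style update; a small sketch with illustrative inputs:
```python=
import numpy as np

# Hypothetical inputs
Sigma_pred = np.array([[0.04, 0.006], [0.006, 0.01]])
Sigma_sm_prev = np.array([[0.05, 0.004], [0.004, 0.012]])
lam = 3.0

# Closed-form minimizer of ||S - Sigma_pred||_F^2 + lam * ||S - Sigma_sm_prev||_F^2
Sigma_closed = (Sigma_pred + lam * Sigma_sm_prev) / (1 + lam)

# Should agree (up to solver tolerance) with the numerical result above
Sigma_numeric = smooth_covariance(Sigma_pred, Sigma_sm_prev, lam)
print(np.allclose(Sigma_closed, Sigma_numeric, atol=1e-5))
```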
**Experiment with Smooth Covariance Predictions**
We select the CM-IEWMA covariance predictor in the mean-variance portfolio to test the effect of using smoothed covariance predictions.

## Reference
* Johansson, K., Ogut, M. G., Pelger, M., Schmelzer, T., & Boyd, S. (2023). A simple method for predicting covariance matrices of financial returns. Foundations and Trends in Econometrics, 12(4), 324–407.
* https://github.com/cvxgrp/cov_pred_finance