# Questions
## Question set №1
#### Is the local correlation function the same as the Pearson correlation? In which case is the local correlation formula equal to the Pearson correlation?
In general the local correlation function is not the same as the Pearson correlation. The only case where the local correlation formula is equal to the Pearson correlation is when the horizon $\sigma \in (0; 1)$ (assuming integer-spaced time points).
$$ w_{i,k} = \begin{cases} \left[ 1 - {\left( \frac{t_i - t_k}{\sigma} \right)} ^2 \right] ^ 2, & \mbox{if } \left| \frac{t_i - t_k}{\sigma} \right| \leq 1 \\ 0, & \mbox{otherwise} \end{cases} $$
If $\sigma \in (0; 1)$ and the time points are integer-spaced, the formula above reduces to:
$$ w_{i,k} = \begin{cases} 1 , & \mbox{if } t_i = t_k \\ 0, & \mbox{otherwise} \end{cases} $$
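A quick numerical check of this degenerate case, assuming integer-spaced time points (the helper name here is illustrative, not from the text):

```python
import numpy as np

def biweight_weights(t, t_k, sigma):
    """Biweight kernel weights w_{i,k} around reference time t_k with horizon sigma."""
    u = (t - t_k) / sigma
    w = (1.0 - u**2) ** 2
    w[np.abs(u) > 1] = 0.0      # zero weight outside the window |t_i - t_k| <= sigma
    return w

t = np.arange(10, dtype=float)                # integer-spaced time points
w = biweight_weights(t, t_k=4.0, sigma=0.5)   # sigma in (0, 1)
# only the point with t_i = t_k keeps weight 1; all others are zero
```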
#### What is biweight kernel function
Biweight kernel function $w_{i,k}$ --- is the function used to construct the weights. It produces smooth, well-behaved weights whose effect on the result is easy to interpret.
$$ w_{i,k} = \begin{cases} \left[ 1 - {\left( \frac{t_i - t_k}{\sigma} \right)} ^2 \right] ^ 2, & \mbox{if } \left| \frac{t_i - t_k}{\sigma} \right| \leq 1 \\ 0, & \mbox{otherwise} \end{cases} $$
#### What is the meaning of $\sigma$? What if $\sigma$ is small, and what if $\sigma$ is big? What if $\sigma$ tends to infinity?
$\sigma$ --- is the time horizon considered in the local correlation. It defines the width of the window around time $t_k$ within which neighbours' weights are non-zero, and it damps the effect of the time difference between $t_i$ and $t_k$.
The larger $\sigma$ is, the lower the degree of localness: more time points receive non-zero weights.
If $\sigma$ is small, $w_{i,k}$ is non-zero only for the nearest neighbours of $t_k$. If it is very large, a huge number of time points contribute to the result (since their weights are non-zero) and the curves in the correlation map become smoother. If $\sigma$ tends to infinity, all time points contribute to the result, although each individual contribution becomes small.
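The widening of the window with $\sigma$ can be checked directly (integer-spaced time points assumed; the values of $\sigma$ are illustrative):

```python
import numpy as np

# Count the time points with non-zero biweight weight as the horizon sigma grows.
t = np.arange(100, dtype=float)
t_k = 50.0
counts = {}
for sigma in (0.5, 2.0, 10.0, 50.0):
    u = np.abs(t - t_k) / sigma
    counts[sigma] = int(np.sum(u <= 1))   # points inside the support |t_i - t_k| <= sigma
print(counts)   # the window widens monotonically with sigma
```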
#### Why do we use $\log$ scale?
$\sigma$ appears in the denominator of $w_{i,k}$. As $\sigma$ grows, its contribution to the result increases; to compensate for this, we use a $\log$ scale.
#### What does the correlation map show?
The scale-space correlation map shows how the correlation changes over time and at which scales the correlation occurs.
## Question set №3
#### What is the multivariate t-distribution?
The **multivariate t-distribution** is a natural generalization of the univariate Student t-distribution. It is a viable alternative to the usual multivariate normal distribution, and results obtained under normality can be checked against it for robustness.

**Parameters** of the multivariate t-distribution are:
- $\mu = [\mu_{1}, \dots, \mu_{p}]^T$ --- location (real $p\times 1$ vector);
- $\Sigma$ --- shape matrix (positive-definite real $p\times p$);
- $\nu$ --- degrees of freedom.
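A minimal sampler for the multivariate t, built from the standard normal/chi-squared construction (the function name and example parameters are illustrative assumptions, not from the text):

```python
import numpy as np

def sample_multivariate_t(mu, Sigma, nu, size, seed=None):
    """Draw from a multivariate t: x = mu + z / sqrt(w / nu),
    with z ~ N(0, Sigma) and w ~ chi^2_nu (the standard construction)."""
    rng = np.random.default_rng(seed)
    p = len(mu)
    z = rng.multivariate_normal(np.zeros(p), Sigma, size=size)
    w = rng.chisquare(nu, size=size)
    return mu + z / np.sqrt(w / nu)[:, None]

mu = np.array([0.0, 1.0])
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])
x = sample_multivariate_t(mu, Sigma, nu=5, size=10000, seed=0)
# for nu > 2 the covariance is nu/(nu-2) * Sigma: heavier tails than the normal
```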
#### What is the meaning of $\nu_{x0}$, $\nu_{y0}$, $\sigma_{x0}^{2}$, $\sigma_{y0}^{2}$?
$\nu_{x0}$ and $\nu_{y0}$ are the degrees of freedom in the multivariate t-distribution.
$\sigma_{x0}^{2}$ and $\sigma_{y0}^{2}$ are the prior scale parameters used when assessing the correlation.
#### How to choose $\nu_{x0}$, $\nu_{y0}$, $\sigma_{x0}^{2}$, $\sigma_{y0}^{2}$?
There are two ways to choose these parameters: maximum likelihood estimation (MLE) and generalised cross-validation (GCV).
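A sketch of the GCV route for choosing $\lambda$, assuming the linear smoother $S_{\lambda} = (I + \lambda C^{T}C)^{-1}$ used in the proofs below; the second-difference penalty matrix $C$, the test signal, and the candidate grid are illustrative assumptions:

```python
import numpy as np

def gcv_score(x, C, lam):
    """GCV(lambda) = (1/n) ||(I - S_lambda) x||^2 / (tr(I - S_lambda) / n)^2."""
    n = len(x)
    S = np.linalg.inv(np.eye(n) + lam * C.T @ C)   # smoother matrix S_lambda
    resid = x - S @ x
    denom = (np.trace(np.eye(n) - S) / n) ** 2
    return float(resid @ resid / n) / denom

n = 50
C = np.diff(np.eye(n), n=2, axis=0)        # (n-2) x n second-difference penalty
rng = np.random.default_rng(0)
x = np.sin(np.linspace(0.0, 3.0, n)) + 0.1 * rng.standard_normal(n)
lams = 10.0 ** np.arange(-4.0, 4.0)        # candidate grid on a log scale
best = min(lams, key=lambda lam: gcv_score(x, C, lam))
```

The same grid-search pattern applies to the other hyperparameters; only the score function changes.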
### Proofs
#### Proof 1
> Prove that if $\lambda \to \infty$ then $S_{\lambda}x$ converges to linear regression
$S_{\lambda}x = (v^{T}_{1}x)v_{1} + (v^{T}_{2}x)v_{2} + \sum_{i=3}^N (1 + \lambda \gamma_{i}) ^ {-1} (v_{i} ^ {T} x)v_{i}$,
where $v_{1}, v_{2}, \dots, v_{N}$ are the orthonormal eigenvectors of $C^{T}C$ with the corresponding eigenvalues $\gamma_{1}, \gamma_{2}, \dots, \gamma_{N}$ (with $\gamma_{1} = \gamma_{2} = 0$).
As $\lambda \to \infty$, each factor $(1 + \lambda \gamma_{i})^{-1} \to 0$ for $\gamma_{i} > 0$, so $S_{\lambda}x$ tends to $(v^{T}_{1}x)v_{1} + (v^{T}_{2}x)v_{2}$, the projection onto the null space of $C^{T}C$, which is exactly the linear regression of the $x_{i}$'s.
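A numerical sanity check of this limit, assuming a second-difference penalty matrix $C$ (so the null space of $C^{T}C$ is exactly the constant-plus-linear-trend space); the matrix and signal are illustrative:

```python
import numpy as np

n = 40
C = np.diff(np.eye(n), n=2, axis=0)        # second-difference penalty matrix
rng = np.random.default_rng(1)
x = rng.standard_normal(n)

gam, V = np.linalg.eigh(C.T @ C)           # gamma_i and v_i from the proof
gam[gam < 1e-10] = 0.0                     # treat numerically-zero eigenvalues as exact zeros

def S(lam, x):
    # sum_i (1 + lam * gamma_i)^{-1} (v_i^T x) v_i
    return V @ ((V.T @ x) / (1.0 + lam * gam))

t = np.arange(n, dtype=float)
X = np.column_stack([np.ones(n), t])       # design matrix of the linear regression
fit = X @ np.linalg.lstsq(X, x, rcond=None)[0]

err = np.linalg.norm(S(1e12, x) - fit)     # shrinks toward 0 as lambda grows
```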
#### Proof 2
> Prove that $D_{\lambda}x$ converges to $-\lambda (I + \lambda C^{T}C)^{-1}C^{T}C(I + \lambda C^{T}C)^{-1}x$
$$
\begin{aligned}
D_{\lambda} &= \lim_{\lambda' \to \lambda}\frac{S_{\lambda'} - S_{\lambda}}{\log{\lambda'} - \log{\lambda}}
= \lim_{\lambda' \to \lambda}\frac{(I + \lambda'C^TC)^{-1} - (I + \lambda C^TC)^{-1}}{\log{\lambda'} - \log{\lambda}} \\
&= \lim_{\lambda' \to \lambda}\frac{(I + \lambda'C^TC)^{-1}\left[(I + \lambda C^TC) - (I + \lambda'C^TC)\right](I + \lambda C^TC)^{-1}}{\log{\lambda'} - \log{\lambda}} \\
&= \lim_{\lambda' \to \lambda}(I + \lambda'C^TC)^{-1}\,C^TC\,(I + \lambda C^TC)^{-1}\cdot\frac{\lambda - \lambda'}{\log{\lambda'} - \log{\lambda}} \\
&= (I + \lambda C^TC)^{-1}C^TC(I + \lambda C^TC)^{-1}\cdot(-\lambda) \\
&= -\lambda(I + \lambda C^TC)^{-1}C^TC(I + \lambda C^TC)^{-1},
\end{aligned}
$$
using the identity $A^{-1} - B^{-1} = A^{-1}(B - A)B^{-1}$ and $\lim_{\lambda' \to \lambda}\frac{\lambda - \lambda'}{\log{\lambda'} - \log{\lambda}} = -\lambda$.
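The closed form can be checked numerically with a finite difference in $\log\lambda$ (the penalty matrix $C$ and the value of $\lambda$ here are illustrative choices):

```python
import numpy as np

n = 20
C = np.diff(np.eye(n), n=2, axis=0)        # illustrative penalty matrix
A = C.T @ C
S = lambda lam: np.linalg.inv(np.eye(n) + lam * A)

lam, eps = 0.7, 1e-6
lam2 = lam * np.exp(eps)                   # so that log(lam2) - log(lam) = eps
numeric = (S(lam2) - S(lam)) / eps         # finite difference in log(lambda)
closed = -lam * S(lam) @ A @ S(lam)        # the closed form from the proof
err = np.abs(numeric - closed).max()       # small, of order eps
```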
#### Proof 3
> Prove that $\sigma_{x0}^{2} = (8/10) \sigma_{x}^{2}$
The prior on $\sigma_{x}^{2}$ is a scaled inverse chi-squared distribution, $\sigma_{x}^{2} \sim \text{Inv-}\chi^{2}(\nu_{x0}, \sigma_{x0}^{2})$, whose mean is
$$ E\left[\text{Inv-}\chi^{2}(\nu, r^{2})\right] = \frac{\nu}{\nu - 2}\,r^{2}. $$
Matching the prior mean to the sample variance gives $\frac{\nu_{x0}}{\nu_{x0} - 2}\,\sigma_{x0}^{2} = \sigma_{x}^{2}$, and with $\nu_{x0} = 10$ this yields $\sigma_{x0}^{2} = \frac{10 - 2}{10}\,\sigma_{x}^{2} = \frac{8}{10}\,\sigma_{x}^{2}$.
#### Proof 4
> MLE formula for the *multivariate t* and proof of the formulas for the local minima of $\lambda$ and $\sigma$
PDF of the multivariate t-distribution with location $\mu$, shape $\Sigma$, and $\nu$ degrees of freedom ($p$ is the dimension):
$$ f(x) = \frac{\Gamma\left(\frac{\nu + p}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right)\,(\nu\pi)^{p/2}\,|\Sigma|^{1/2}}\left[1 + \frac{1}{\nu}(x - \mu)^{T}\Sigma^{-1}(x - \mu)\right]^{-\frac{\nu + p}{2}} $$
Paper: [link](https://arxiv.org/pdf/1707.01130.pdf)
$A = S_{\lambda}x = (I + \lambda C^TC)^{-1}x$
$x = (I + \lambda C^TC)A = A + \lambda C^TC A$
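A quick numerical check of the rearranged identity (the penalty matrix $C$ and value of $\lambda$ are illustrative):

```python
import numpy as np

n = 15
C = np.diff(np.eye(n), n=2, axis=0)        # illustrative penalty matrix
lam = 2.5
rng = np.random.default_rng(2)
x = rng.standard_normal(n)

A = np.linalg.solve(np.eye(n) + lam * C.T @ C, x)   # A = S_lambda x
recovered = A + lam * (C.T @ C) @ A                 # x = A + lambda C^T C A
err = np.abs(recovered - x).max()                   # ~ machine precision
```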