written by @marc_lelarge
We start with real random variables (r.v.).
1- Why is variance positive?
Recall that $\mathrm{Var}(X) = \mathbb{E}[X^2] - \mathbb{E}[X]^2$, so $\mathrm{Var}(X)\geq 0$ means that $\mathbb{E}[X^2] \geq \mathbb{E}[X]^2$.
answer: Start with
$$\mathrm{Var}(X) = \mathbb{E}\left[(X - \mathbb{E}[X])^2\right] \geq 0,$$
and expand the square: $\mathbb{E}\left[(X - \mathbb{E}[X])^2\right] = \mathbb{E}[X^2] - 2\mathbb{E}[X]^2 + \mathbb{E}[X]^2 = \mathbb{E}[X^2] - \mathbb{E}[X]^2$.
Similarly, we have for the covariance of the random variables $X$ and $Y$:
$$\mathrm{Cov}(X,Y) = \mathbb{E}\left[(X - \mathbb{E}[X])(Y - \mathbb{E}[Y])\right] = \mathbb{E}[XY] - \mathbb{E}[X]\mathbb{E}[Y].$$
Note that $\mathrm{Cov}(X,X) = \mathrm{Var}(X)$.
We have for $a, b \in \mathbb{R}$, $\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$; note that we use the standard notation where capital letters denote random variables and lowercase letters denote constants or parameters.
We have
$$\mathrm{Var}(X+Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X,Y).$$
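As a quick sanity check (not part of the original note), here is a small numpy snippet verifying these two identities on simulated data; the distributions and constants are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Arbitrary correlated pair (X, Y): X standard normal, Y a noisy copy of X.
X = rng.standard_normal(n)
Y = 0.5 * X + rng.standard_normal(n)

a, b = 3.0, -2.0

# Var(aX + b) should match a^2 Var(X).
print(np.var(a * X + b), a**2 * np.var(X))

# Var(X + Y) should match Var(X) + Var(Y) + 2 Cov(X, Y).
cov_xy = np.cov(X, Y, bias=True)[0, 1]
print(np.var(X + Y), np.var(X) + np.var(Y) + 2 * cov_xy)
```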
2- How to compute moments?
We start with a remark: if a random variable $X$ has a symmetric density function, i.e. $p(x) = p(-x)$ for all $x$, then its odd moments are zero: $\mathbb{E}[X^{2k+1}] = 0$ for all $k\in\mathbb{N}$.
answer: thanks to the moment generating function $M_X(t) = \mathbb{E}\left[e^{tX}\right]$, so that we get:
$$\mathbb{E}[X^k] = \frac{d^k}{dt^k} M_X(t)\Big|_{t=0}.$$
To understand why this is true, we can write $e^{tX} = \sum_{k\geq 0}\frac{t^k X^k}{k!}$, so that we have:
$$M_X(t) = \sum_{k\geq 0}\frac{t^k}{k!}\mathbb{E}[X^k].$$
Let's apply this method to the normalized Gaussian random variable $X\sim\mathcal{N}(0,1)$. We have
$$M_X(t) = \mathbb{E}\left[e^{tX}\right] = e^{t^2/2}.$$
In particular, we have
$$M_X(t) = e^{t^2/2} = \sum_{k\geq 0}\frac{t^{2k}}{2^k k!},$$
so that, matching coefficients, we have $\mathbb{E}[X^{2k}] = \frac{(2k)!}{2^k k!}$, i.e. $\mathbb{E}[X^2] = 1$, $\mathbb{E}[X^4] = 3$, $\mathbb{E}[X^6] = 15$, …
Note that we already knew that the odd moments are zero but if you need the fourth moment, you need to compute the fourth derivative anyway…
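If you want to play with this recipe, here is a small symbolic sketch using sympy (my addition, not part of the original note): differentiate the Gaussian MGF and evaluate at $t=0$.

```python
import sympy as sp

t = sp.symbols('t')
M = sp.exp(t**2 / 2)           # MGF of a standard Gaussian

# k-th moment = k-th derivative of the MGF at t = 0.
for k in range(1, 7):
    moment = sp.diff(M, t, k).subs(t, 0)
    print(k, moment)            # 0, 1, 0, 3, 0, 15
```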
Note that for simple distributions, the direct computation might be easier. For example, for the uniform distribution on the interval $[-a,a]$ with $a>0$, we have for $k$ even:
$$\mathbb{E}[X^k] = \frac{1}{2a}\int_{-a}^{a} x^k\,dx = \frac{a^k}{k+1},$$
and $\mathbb{E}[X^k] = 0$ for $k$ odd,
so that $\mathrm{Var}(X) = \mathbb{E}[X^2] = \frac{a^2}{3}$.
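Assuming the symmetric interval $[-a,a]$ as above, a quick Monte Carlo check (again my addition) of these formulas:

```python
import numpy as np

rng = np.random.default_rng(0)
a = 2.0
X = rng.uniform(-a, a, size=1_000_000)

# Even moments should be close to a^k / (k+1), odd moments close to 0.
for k in range(1, 5):
    exact = a**k / (k + 1) if k % 2 == 0 else 0.0
    print(k, round(np.mean(X**k), 3), exact)

print("Var(X) =", np.var(X), "vs a^2/3 =", a**2 / 3)
```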
3- Independence implies null covariance but null covariance does not imply independence!
Here is a simple example: consider the random variables $(X,Y)$ equal to $(0,1)$, $(1,0)$ or $(-1,0)$ with equal probability.
We clearly have $\mathbb{E}[XY] = 0$ and $\mathbb{E}[X] = 0$, so that $\mathrm{Cov}(X,Y) = \mathbb{E}[XY] - \mathbb{E}[X]\mathbb{E}[Y] = 0$, but $X$ and $Y$ are not independent as knowing $X$ determines $Y$. More formally, we have for example $\mathbb{P}(X=1, Y=1) = 0$ and $\mathbb{P}(X=1)\mathbb{P}(Y=1) = \frac{1}{9}$.
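Since the example is finite, the claim can be verified by direct enumeration; here is a small Python sketch of that check (my addition):

```python
from itertools import product

# The three equally likely outcomes of (X, Y).
outcomes = [(0, 1), (1, 0), (-1, 0)]
p = 1 / len(outcomes)

E_X  = sum(p * x for x, _ in outcomes)
E_XY = sum(p * x * y for x, y in outcomes)
E_Y  = sum(p * y for _, y in outcomes)
print("Cov(X, Y) =", E_XY - E_X * E_Y)   # 0.0

# Independence would require P(X=x, Y=y) = P(X=x) P(Y=y) for all pairs.
for x, y in product([-1, 0, 1], [0, 1]):
    p_joint = sum(p for o in outcomes if o == (x, y))
    p_x = sum(p for o in outcomes if o[0] == x)
    p_y = sum(p for o in outcomes if o[1] == y)
    if abs(p_joint - p_x * p_y) > 1e-12:
        print(f"P(X={x}, Y={y}) = {p_joint} != P(X={x})P(Y={y}) = {p_x * p_y}")
```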
4- If $X$ is a Gaussian r.v. and $Y$ is another Gaussian r.v. such that $Y$ is independent of $X$, then $(X,Y)$ is a Gaussian vector.
5- If $(X,Y)$ is a Gaussian vector then the independence of $X$ and $Y$ is equivalent to $\mathrm{Cov}(X,Y) = 0$.
But even if $X\sim\mathcal{N}(0,1)$, $Y\sim\mathcal{N}(0,1)$ and $\mathrm{Cov}(X,Y) = 0$, this does not imply that $(X,Y)$ is a Gaussian vector. Here is a simple counter-example: take $X\sim\mathcal{N}(0,1)$ and define for $c\geq 0$:
$$Y = \begin{cases} X & \text{if } |X|\leq c,\\ -X & \text{if } |X| > c.\end{cases}$$
It is easy to see that $Y\sim\mathcal{N}(0,1)$ (by symmetry of the Gaussian density); moreover, we have
$$\mathrm{Cov}(X,Y) = \mathbb{E}[XY] = \mathbb{E}\left[X^2\mathbf{1}(|X|\leq c)\right] - \mathbb{E}\left[X^2\mathbf{1}(|X|>c)\right].$$
We see that for $c = 0$, we have $\mathrm{Cov}(X,Y) = -1$ and when $c\to+\infty$, we have $\mathrm{Cov}(X,Y)\to 1$, so by continuity there exists a value of $c$ for which $\mathrm{Cov}(X,Y) = 0$.
But $X+Y$ is never a Gaussian r.v. as
$$X+Y = 2X\,\mathbf{1}(|X|\leq c)$$
is bounded by $2c$ (a non-degenerate Gaussian r.v. is unbounded), so $(X,Y)$ cannot be a Gaussian vector.
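A numerical illustration of this counter-example (my own sketch, assuming the construction above): estimate $\mathrm{Cov}(X,Y)$ as a function of $c$, locate the zero crossing, and observe that $X+Y$ stays bounded.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

# Cov(X, Y) = E[X^2 1{|X|<=c}] - E[X^2 1{|X|>c}] for X ~ N(0,1),
# computed here by simple numerical integration on a grid.
def cov_xy(c, n_grid=100_001):
    x = np.linspace(-8, 8, n_grid)
    w = norm.pdf(x) * (x[1] - x[0])
    inside = np.abs(x) <= c
    return np.sum(x**2 * w * inside) - np.sum(x**2 * w * ~inside)

print(cov_xy(0.0), cov_xy(5.0))        # close to -1 and +1
c_star = brentq(cov_xy, 0.1, 3.0)      # value of c with zero covariance
print("c* =", c_star)

# At c*, X and Y are uncorrelated standard Gaussians, but X + Y is bounded:
rng = np.random.default_rng(0)
X = rng.standard_normal(1_000_000)
Y = np.where(np.abs(X) <= c_star, X, -X)
print("max |X + Y| =", np.abs(X + Y).max(), "<= 2c* =", 2 * c_star)
```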
6- Moments of Gaussian r.v.
We have for $X\sim\mathcal{N}(0,\sigma^2)$:
$$\mathbb{E}\left[e^{tX}\right] = e^{\sigma^2 t^2/2} = \sum_{k\geq 0}\frac{\sigma^{2k} t^{2k}}{2^k k!}.$$
Hence the moments for $X\sim\mathcal{N}(0,\sigma^2)$ are given by $\mathbb{E}[X^{2k+1}] = 0$ and
$$\mathbb{E}[X^{2k}] = \frac{(2k)!}{2^k k!}\,\sigma^{2k}.$$
In general we have $\mathbb{E}[X^{2k}] = (2k-1)!!\,\sigma^{2k}$, where $(2k-1)!! = (2k-1)(2k-3)\cdots 3\cdot 1$ is the product of the odd numbers up to $2k-1$.
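A quick Monte Carlo check of the even-moment formula (my addition; the value of $\sigma$ is arbitrary):

```python
import numpy as np
from math import factorial

rng = np.random.default_rng(0)
sigma = 1.5
X = sigma * rng.standard_normal(2_000_000)

# Empirical even moments vs (2k)! / (2^k k!) * sigma^(2k).
for k in range(1, 4):
    empirical = np.mean(X**(2 * k))
    exact = factorial(2 * k) / (2**k * factorial(k)) * sigma**(2 * k)
    print(2 * k, round(empirical, 3), round(exact, 3))
```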
7- Partitioned Gaussians
We consider now a Gaussian vector $x\sim\mathcal{N}(\mu,\Sigma)$ that we decompose as $x = \begin{pmatrix} x_a\\ x_b\end{pmatrix}$. We consider the same decomposition for the parameters:
$$\mu = \begin{pmatrix}\mu_a\\ \mu_b\end{pmatrix},\qquad \Sigma = \begin{pmatrix}\Sigma_{aa} & \Sigma_{ab}\\ \Sigma_{ba} & \Sigma_{bb}\end{pmatrix}.$$
Note that $\Sigma_{ba} = \Sigma_{ab}^T$ since $\Sigma$ is symmetric.
We also introduce the precision matrix $\Lambda = \Sigma^{-1}$ and decompose it as:
$$\Lambda = \begin{pmatrix}\Lambda_{aa} & \Lambda_{ab}\\ \Lambda_{ba} & \Lambda_{bb}\end{pmatrix}.$$
Note that $\Lambda_{aa} \neq \Sigma_{aa}^{-1}$ in general; indeed we can use the following formula for the inverse of a partitioned matrix:
$$\begin{pmatrix}A & B\\ C & D\end{pmatrix}^{-1} = \begin{pmatrix}M & -MBD^{-1}\\ -D^{-1}CM & D^{-1} + D^{-1}CMBD^{-1}\end{pmatrix},$$
where $M = (A - BD^{-1}C)^{-1}$.
Hence we see that $\Lambda_{aa} = \left(\Sigma_{aa} - \Sigma_{ab}\Sigma_{bb}^{-1}\Sigma_{ba}\right)^{-1}$.
Conditional distributions:
$$p(x_a|x_b) = \mathcal{N}\left(x_a\,|\,\mu_{a|b},\ \Lambda_{aa}^{-1}\right),$$
with
$$\mu_{a|b} = \mu_a - \Lambda_{aa}^{-1}\Lambda_{ab}(x_b - \mu_b) = \mu_a + \Sigma_{ab}\Sigma_{bb}^{-1}(x_b - \mu_b).$$
Marginal distribution:
$$p(x_a) = \mathcal{N}\left(x_a\,|\,\mu_a,\ \Sigma_{aa}\right).$$
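Here is a small numerical check (my addition) of the precision-block identity and of the two equivalent expressions for the conditional mean, on a randomly generated covariance matrix:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random 4-dimensional covariance matrix, split into blocks a = first 2 dims, b = last 2.
A = rng.standard_normal((4, 4))
Sigma = A @ A.T + 4 * np.eye(4)          # symmetric positive definite
Saa, Sab = Sigma[:2, :2], Sigma[:2, 2:]
Sba, Sbb = Sigma[2:, :2], Sigma[2:, 2:]

# Precision matrix and its top-left blocks.
Lam = np.linalg.inv(Sigma)
Laa, Lab = Lam[:2, :2], Lam[:2, 2:]

# Check Lambda_aa = (Sigma_aa - Sigma_ab Sigma_bb^{-1} Sigma_ba)^{-1}.
schur = Saa - Sab @ np.linalg.inv(Sbb) @ Sba
print(np.allclose(Laa, np.linalg.inv(schur)))   # True

# Check the two expressions for the conditional mean agree.
mu = rng.standard_normal(4)
x_b = rng.standard_normal(2)
m1 = mu[:2] - np.linalg.inv(Laa) @ Lab @ (x_b - mu[2:])
m2 = mu[:2] + Sab @ np.linalg.inv(Sbb) @ (x_b - mu[2:])
print(np.allclose(m1, m2))                       # True
```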
8- Marginal and Conditional Gaussians
Consider a Gaussian vector:
$$p(x) = \mathcal{N}\left(x\,|\,\mu,\ \Lambda^{-1}\right),$$
and a linear Gaussian model:
$$p(y|x) = \mathcal{N}\left(y\,|\,Ax + b,\ L^{-1}\right),$$
where $A$, $b$ and $\mu$ are parameters governing the means, and $\Lambda$ and $L$ are precision matrices. Then $(x,y)$ is a Gaussian vector and we have
$$p(y) = \mathcal{N}\left(y\,|\,A\mu + b,\ L^{-1} + A\Lambda^{-1}A^T\right),$$
$$p(x|y) = \mathcal{N}\left(x\,|\,\Sigma\left(A^T L(y-b) + \Lambda\mu\right),\ \Sigma\right),\qquad \text{where } \Sigma = \left(\Lambda + A^T L A\right)^{-1}.$$
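As a sanity check (my addition, with arbitrary parameter values), one can sample from this linear Gaussian model and compare the empirical mean and covariance of $y$ with the formulas above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical small instance: x is 2-dimensional, y is 3-dimensional.
mu = np.array([1.0, -1.0])
Lam = np.array([[2.0, 0.5], [0.5, 1.0]])       # precision of x
A = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
b = np.array([0.5, 0.0, -0.5])
L = 4.0 * np.eye(3)                             # precision of y given x

n = 500_000
x = rng.multivariate_normal(mu, np.linalg.inv(Lam), size=n)
y = x @ A.T + b + rng.multivariate_normal(np.zeros(3), np.linalg.inv(L), size=n)

# Marginal of y: mean A mu + b, covariance L^{-1} + A Lam^{-1} A^T.
print(y.mean(axis=0), A @ mu + b)
print(np.cov(y.T))
print(np.linalg.inv(L) + A @ np.linalg.inv(Lam) @ A.T)
```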