---
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage{amsfonts}
\usepackage{amsmath}
\usepackage{mathtools}
\usepackage{fullpage}
\usepackage{amssymb}
\usepackage{amsthm}
\usepackage{comment}
\usepackage{color}
---

$%standard maths
\newcommand{\R}{\mathbb{R}}
%dot products
\newcommand{\fdotp}[1]{\langle #1 \rangle}
\newcommand{\dotp}[1]{\left( #1 \right)}
%brackets
\newcommand{\BK}[1]{\left( #1 \right)}
\newcommand{\cBK}[1]{\left\{ #1 \right\}}
\newcommand{\sBK}[1]{\left[ #1 \right]}
% Probabilistic operators
\newcommand{\E}{\mathbb{E}}
\newcommand{\EE}[1]{\mathbb{E}\sBK{ #1 }}
\renewcommand{\P}{\mathbb{P}}
\newcommand{\cov}{\mathbf{Cov}}
\newcommand{\var}{\mathbf{Var}}
\newcommand{\Normal}{\mathcal{N}}
%looks nicer
\renewcommand{\phi}{\varphi}
\renewcommand{\epsilon}{\varepsilon}
%this text
\newcommand{\e}{\mathbf{e}}$

# White Noise Process

Here is a short tutorial on how to generate white noise, which serves as the source of stochasticity in the context of inverse problems.

## Notation

In this discussion we adopt the following notation for inner products. For two functions $a,b \in \mathbb{L}^2(\Omega)$ we set
$$ \fdotp{ a,b } = \int_\Omega a(x) \, b(x) \, dx, $$
while for two finite-dimensional vectors $a,b \in \R^d$ we set
$$ \dotp{a,b} = \sum_{i=1}^d a_i \, b_i. $$

## Preliminaries

### Covariance

The covariance between two random variables, whose computation will arise often in this discussion, is defined as
$$ \cov (X,Y) = \E[X \, Y^T] - \E[X] \, \E[Y]^T.$$
Assuming $X$ and $Y$ both have mean zero, the expression simplifies to
$$ \cov (X,Y) = \E[X \, Y^T].$$
The variance of a scalar random variable with mean zero is a special case,
$$\cov (X,X) = \var (X) = \E[X^2].$$

### Finite Element Method (FEM)

In the finite element method, a function $f(x)$ that belongs to a Hilbert space $\mathcal{H}$ is represented as a weighted sum of $N$ basis functions $\e_1, \ldots ,\e_N \in \mathcal{H}$,
$$ f(x) = \sum_{i=1}^N \alpha_i \, \e_i(x) . $$
Hence, once a set of basis functions has been chosen, the finite element approximation of any function is determined by its coefficient vector $\{\alpha_i\}_{i=1}^N$.

## White Noise

### Vectors in $\R^d$

Let $u,v \in \R^d$ be two vectors and $Z \sim \Normal(0,I_d)$ be a standard Gaussian vector in $\R^d$. We define the two random variables
\begin{align*}
X_u = (u,Z) \in \R \qquad \textrm{and} \qquad X_v = (v,Z) \in \R.
\end{align*}
The mean, variance and covariance of $X_u$ and $X_v$ are as follows,
\begin{align*}
\EE{X_u} &= \E\Big[ \sum_i u_i \, Z_i \Big] =\sum_i \EE{ u_i \, Z_i } =\sum_i \, u_i \, \EE{ Z_i } = 0,
\end{align*}
\begin{align*}
\var(X_u) &= \E[X_u^2] = \E\Big[\Big(\sum_i u_i \, Z_i\Big)^2 \Big] = \E\Big[\sum_{i,j} u_i \, u_j \, Z_i \, Z_j \Big] = \sum_{i,j} u_i u_j \, \E[Z_i \, Z_j]\\
&= \sum_{i} u_i^2 \, \E[Z_i^2] = \sum_{i} u_i^2 = \| u \| ^2,
\end{align*}
where we used that $\E[Z_i \, Z_j] = \delta_{ij}$ for the components of a standard Gaussian vector. For the covariance, a similar computation yields
\begin{align*}
\cov(X_u, X_v) &= (u,v).
\end{align*}
The random vector $Z \sim \Normal(0,I_d)$ is often called **white noise** and denoted by $W$. It has the property that, given two test vectors $u$ and $v$, the random variables $X_u = (W,u)$ and $X_v = (W,v)$ satisfy the following conditions,
\begin{align*}
\E[X_u] = \E[X_v] = 0 \qquad \textrm{and} \qquad \cov(X_u,X_v) = (u,v).
\end{align*}
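Before moving on, here is a minimal NumPy sketch verifying these identities empirically; it is a sanity check rather than part of the derivation, and the dimension, sample size, and test vectors are arbitrary illustrative choices.

```python
import numpy as np

# Monte Carlo check: for white noise Z ~ N(0, I_d), the random variables
# X_u = (u, Z) and X_v = (v, Z) should be centred with Cov(X_u, X_v) = (u, v).
rng = np.random.default_rng(0)
d, n_samples = 5, 200_000

u = rng.normal(size=d)          # arbitrary fixed test vectors
v = rng.normal(size=d)

Z = rng.standard_normal((n_samples, d))   # each row is one draw of Z ~ N(0, I_d)
X_u = Z @ u                               # samples of X_u = (u, Z)
X_v = Z @ v                               # samples of X_v = (v, Z)

print("empirical mean of X_u   :", X_u.mean())
print("empirical Cov(X_u, X_v) :", np.mean(X_u * X_v))
print("theoretical (u, v)      :", u @ v)
```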
We now generalize the notion of white noise to function spaces.

### Functions in $\mathbb{L}^2$

The Hilbert space of square-integrable functions on a domain $\Omega \subset \R^d$ is denoted by $\mathbb{L}^2(\Omega)$. Its inner product is
$$\fdotp{ f, g } = \int_\Omega f(x) \, g(x) \, dx, $$
which is finite for any $f,g \in \mathbb{L}^2(\Omega)$. Let $Z$ be a random function and let $u, v$ be test functions. We define the random variables
\begin{align*}
X_u = \fdotp{ u,Z } \qquad \textrm{and} \qquad X_v = \fdotp{ v,Z}.
\end{align*}
Since $Z$ is a random function, $X_u$ and $X_v$ are random as well. Recall that in FEM, functions are represented as linear combinations of basis functions $\{\e_i\}_{i=1}^N$. Hence we may express $Z,u,v$ as
\begin{align*}
Z = \sum_i W_i \, \e_i, \quad u = \sum_i \alpha_i \, \e_i, \quad \textrm{and} \quad v = \sum_i \beta_i \, \e_i,
\end{align*}
where we choose a random vector $W$ such that $W \sim \Normal(0,\Gamma)$ for a covariance matrix $\Gamma \in \R^{N \times N}$ to be determined below.

The goal is now to find conditions under which our random function $Z$ qualifies as white noise. This problem translates into finding a covariance structure $\Gamma$ such that, given any two test functions $u,v \in \mathbb{L}^2(\Omega)$, the random variables $X_u = \fdotp{ u,Z }$ and $X_v = \fdotp{ v,Z }$ are centred and satisfy the identity $\cov (X_u, X_v) = \fdotp{ u,v }$.

Let us examine more closely the inner product of functions in the FEM framework. We have that
\begin{align*}
\fdotp{ u,v } &= \fdotp{ \sum_i \alpha_i \e_i, \sum_j \beta_j \e_j } = \sum_{i,j} \alpha_i \beta_j \fdotp{ \e_i, \e_j } = \dotp{ \alpha, M \beta},
\end{align*}
where the mass matrix $M \in \R^{N \times N}$ is symmetric and defined as $M_{ij} = M_{ji} = \fdotp{ \e_i, \e_j }$. The same computation gives $X_u = \fdotp{ u,Z } = \dotp{ \alpha, M \, W}$ and $X_v = \fdotp{ v,Z } = \dotp{ \beta, M \, W}$, so the covariance between $X_u$ and $X_v$ reads
\begin{align*}
\cov(X_u, X_v) &= \E[X_u \, X_v] = \E[ \dotp{ \alpha, M \, W} \, \dotp{ \beta, M \, W} ]\\
&= \E[\alpha^T \, M \, W \, W^T \, M \, \beta] = \alpha^T \, M \, \E[W \, W^T] \, M \, \beta \\
&= \alpha^T \, M \, \Gamma \, M \, \beta = \dotp{ \alpha, M \, \Gamma \, M \beta}.
\end{align*}
Equating the two expressions of the covariance, i.e. $\dotp{ \alpha, M \, \beta} = \dotp{ \alpha, M \, \Gamma \, M \beta}$ for any vectors $\alpha, \beta \in \R^N$, shows that $M = M \, \Gamma \, M$, hence the covariance matrix must be
$$\Gamma = M^{-1}.$$
To summarize, a white noise field in $\mathbb{L}^2(\Omega)$ can be approximated using the finite element method by defining
\begin{align*}
Z = \sum_{i=1}^N \, W_i \, \e_i
\end{align*}
with random coefficient vector $W$ distributed as a centred Gaussian with covariance $\Gamma = M^{-1}$. When all is said and done, our job is to generate a random vector $W$ from $\Normal(0,M^{-1})$ so that the random function $Z = \sum_i W_i \, \e_i$ qualifies as white noise.

## Generating Gaussian random variables

We can readily sample random numbers from the standard normal $\Normal(0,I)$. To generate a Gaussian random vector with mean $\mu$ and covariance $\Gamma$, one can use the fact that
$$ \mu + L \, \xi \sim \Normal(\mu, \Gamma) $$
if $\xi \sim \Normal(0,I)$ and $L$ is obtained from the Cholesky decomposition of $\Gamma$,
$$\Gamma = L \, L^T.$$
We can verify that $\mu + L \, \xi$ has mean $\mu$ and, as claimed, its covariance matrix checks out to be
$$\E[ L \, \xi \, \xi^T \, L^T] = L\,\E[\xi \, \xi^T]\,L^T = L \, L^T = \Gamma.$$
In our case, we would like to generate the random vector $W \sim \Normal(0,M^{-1})$, i.e. with $\mu = 0$ and $\Gamma = M^{-1}$.
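Here is a minimal NumPy sketch of this sampling recipe; the mean vector and covariance matrix below are arbitrary illustrative choices, not the mass matrix of the previous section.

```python
import numpy as np

# Sample from N(mu, Gamma) via the Cholesky factor Gamma = L L^T.
rng = np.random.default_rng(1)

mu = np.array([1.0, -2.0])               # arbitrary mean
Gamma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])           # arbitrary symmetric positive definite covariance

L = np.linalg.cholesky(Gamma)            # lower-triangular factor, Gamma = L @ L.T

n_samples = 100_000
xi = rng.standard_normal((2, n_samples))       # xi ~ N(0, I)
samples = mu[:, None] + L @ xi                 # each column ~ N(mu, Gamma)

print("empirical mean      :", samples.mean(axis=1))
print("empirical covariance:\n", np.cov(samples))
```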
To obtain such a factorization of $M^{-1}$, we start by decomposing $M = L \, L^T$ and observe that $M^{-1} = (L^T)^{-1} \, L^{-1} = (L^{-T}) \, (L^{-T})^T$. It then follows that $W = (L^T)^{-1} \xi \sim \Normal(0,M^{-1})$ for a standard Gaussian random vector $\xi \sim \Normal(0,I_N)$. To obtain $W$, instead of inverting $L^T$, which can be expensive ($\mathcal{O}(N^3)$) especially when $N$ is large, we solve the triangular linear system
$$ L^T \, W = \xi. $$
Note that $W$ is random due to the stochastic nature of $\xi$.

## Conclusion

In short, to generate a (random) white noise function,

1. Choose a set of basis functions $\e_1, \ldots, \e_N \in \mathcal{H}$ and compute the mass matrix $M_{ij} = \fdotp{ \e_i, \e_j}$.
2. Perform the Cholesky decomposition of the mass matrix, i.e. $M = L \, L^T$.
3. Solve the linear system $L^T \, W = \xi$ for a standard Gaussian random vector $\xi \sim \Normal(0,I_N)$.
4. Define the Gaussian white noise field by
\begin{align*}
Z = \sum_i W_i \, \e_i.
\end{align*}

To scale this methodology to large mesh sizes $N \gg 1$, it is important to exploit the sparsity of the mass matrix $M$, for instance through a sparse Cholesky factorization.
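To make the recipe concrete, here is a minimal NumPy/SciPy sketch of the four steps. It assumes piecewise-linear (P1) hat basis functions on a uniform mesh of $[0,1]$, for which the mass matrix is tridiagonal; the mesh size, sample size, and test coefficients are arbitrary illustrative choices, and a dense Cholesky factorization is used for simplicity (a sparse factorization would be needed for large $N$).

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular

rng = np.random.default_rng(2)
N = 50                          # number of nodes / basis functions (illustrative)
h = 1.0 / (N - 1)               # uniform mesh spacing on [0, 1]

# Step 1: mass matrix M_ij = <e_i, e_j> for P1 hat functions (tridiagonal).
M = np.zeros((N, N))
M[np.arange(N), np.arange(N)] = 2 * h / 3
M[0, 0] = M[-1, -1] = h / 3
M[np.arange(N - 1), np.arange(1, N)] = h / 6
M[np.arange(1, N), np.arange(N - 1)] = h / 6

# Step 2: Cholesky decomposition M = L L^T (dense here, sparse for large N).
L = cholesky(M, lower=True)

# Steps 3-4: draw xi ~ N(0, I_N) and solve L^T W = xi, so that W ~ N(0, M^{-1})
# and Z = sum_i W_i e_i is an FEM approximation of white noise.
n_samples = 20_000
xi = rng.standard_normal((N, n_samples))
W = solve_triangular(L, xi, lower=True, trans='T')   # each column ~ N(0, M^{-1})

# Check: for u = sum_i alpha_i e_i and v = sum_i beta_i e_i, we have
# X_u = (alpha, M W) and X_v = (beta, M W), with Cov(X_u, X_v) = (alpha, M beta) = <u, v>.
alpha = rng.normal(size=N)
beta = rng.normal(size=N)
X_u = alpha @ (M @ W)
X_v = beta @ (M @ W)

print("empirical Cov(X_u, X_v) :", np.mean(X_u * X_v))
print("theoretical <u, v>      :", alpha @ M @ beta)
```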