---
tags: Digital Image Processing
disqus: hackmd
---

# Part 9

## KL Transform

In the other transforms the kernel is fixed; the KL transform, by contrast, derives its kernel from the data itself. Consider an $n$-dimensional vector,
\begin{equation}
X = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ . \\ . \\ x_n \end{bmatrix}
\end{equation}
Its mean is $\mu_X = E[X]$ and its covariance matrix is $C_X = E[(X - \mu_X)(X - \mu_X)^T]$. The vector $X$ has dimension $n \times 1$, so $C_X$ has dimension $n \times n$. The diagonal element $C_{ii}$ is the variance of $x_i$, while the off-diagonal element $C_{ij}$ is the covariance of $x_i$ and $x_j$. The matrix $C_X$ is real and symmetric.

Let $e_i$ be an eigenvector of $C_X$ corresponding to the eigenvalue $\lambda_i$, with the eigenvalues arranged so that $\lambda_j \geq \lambda_{j + 1}, j = 1,2,...,n-1$. Construct a matrix $A$ whose first row is the eigenvector corresponding to the largest eigenvalue and whose last row is the eigenvector corresponding to the smallest eigenvalue, i.e., the eigenvectors are stacked from top to bottom in decreasing order of their eigenvalues. With this, a transformation can be defined as $Y = A(X - \mu_X)$. Some of the properties of $Y$ are,
1. $\mu_Y = 0$.
2. $C_Y = AC_XA^T$, which is a diagonal matrix with the eigenvalues of $C_X$ as its diagonal elements. The off-diagonal elements are zero, indicating that the elements of $Y$ are uncorrelated.

Consider the situation,

![](https://i.imgur.com/uaIzAw1.png)

The sample vectors are,
\begin{equation}
X = \{(3,4),(4,3),(4,4),(4,5),(5,4),(5,5),(5,6),(6,5)\}
\end{equation}
The mean vector in this case is $\mu_X = (4.5,4.5)$. To find the covariance matrix, we average $(X - \mu_X)(X - \mu_X)^T$ over the samples, which gives
\begin{equation}
C_X = \begin{bmatrix} 0.75 & 0.375 \\ 0.375 & 0.75 \end{bmatrix}
\end{equation}
The eigenvalues of $C_X$ are $\lambda = 0.75 \pm 0.375$, so $\lambda_1 = 1.125$ and $\lambda_2 = 0.375$. Substituting into $C_Xz = \lambda z$, we get the two eigenvectors,
\begin{equation}
e_1 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad e_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -1 \end{bmatrix}
\end{equation}
So,
\begin{equation}
A = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}
\end{equation}
Now, $Y = A(X - \mu_X)$. This relation amounts to establishing a new coordinate system, as shown below.

![](https://i.imgur.com/QFkpr3E.png)

Since the rows of $A$ are orthonormal eigenvectors, $A^{-1} = A^T$, so the original data can be recovered as $X = A^TY + \mu_X$. If only the first $k$ largest eigenvalues are retained, with the corresponding eigenvectors collected into $A_k$, then we get the approximate reconstruction $\hat{X} = A_k^TY + \mu_X$. Note that $A_k$ has order $k \times n$ and $Y$ has order $k \times 1$, so $\hat{X}$ retains the original dimension $n \times 1$, but it is only an approximation. The mean square error can be written as,
\begin{equation}
e_{ms} = \sum_{j = 1}^{n}\lambda_j - \sum_{i = 1}^{k}\lambda_i = \sum_{j = k + 1}^{n}\lambda_j
\end{equation}
So, the error is the sum of the eigenvalues not used to build the transformation matrix. Since these are the smallest eigenvalues, the MSE is small too! This is quite useful in data compression.

![](https://i.imgur.com/QdO3nSX.png)

This is the image along with the columns taken as sample vectors. Their mean yields $\mu_X$, and $C_X$ can be obtained similarly.
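As a quick check of the worked example above, here is a minimal NumPy sketch (the variable names are my own, not from the notes) that computes $\mu_X$, $C_X$, the transform $Y = A(X - \mu_X)$, and the reconstruction error when only $k = 1$ eigenvector is kept.

```python
import numpy as np

# Sample vectors from the worked example, one per row
X = np.array([(3, 4), (4, 3), (4, 4), (4, 5),
              (5, 4), (5, 5), (5, 6), (6, 5)], dtype=float)

mu = X.mean(axis=0)                  # mean vector -> [4.5, 4.5]
D = X - mu                           # deviations from the mean
C = (D.T @ D) / len(X)               # covariance C_X -> [[0.75, 0.375], [0.375, 0.75]]

lam, vecs = np.linalg.eigh(C)        # eigenvalues ascending, eigenvectors as columns
order = np.argsort(lam)[::-1]        # reorder descending: 1.125, 0.375
lam, vecs = lam[order], vecs[:, order]

A = vecs.T                           # rows of A are the eigenvectors of C_X
Y = (A @ D.T).T                      # KL transform of every sample

k = 1                                # keep only the dominant eigenvector
X_hat = Y[:, :k] @ A[:k, :] + mu     # approximate reconstruction from k components

mse = np.mean(np.sum((X - X_hat) ** 2, axis=1))
print(mse, lam[k:].sum())            # both equal lambda_2 = 0.375
```

The printed values agree: the mean square reconstruction error is exactly the discarded eigenvalue $\lambda_2$, as the error formula above predicts.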
Depending on the value of $k$, compressed images of different quality can be obtained. The energy compaction of the KL transform is higher than that of any other transform; however, its computational complexity is also high, since the basis has to be computed from the data itself, and so it is used less often in practice.
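To connect this to the image case, the sketch below (my own illustration, assuming a grayscale image held in a 2-D NumPy array `img`; the function name is hypothetical) treats every column of the image as one sample vector and reconstructs the image from only its first $k$ KL components.

```python
import numpy as np

def kl_compress(img: np.ndarray, k: int) -> np.ndarray:
    """Reconstruct a grayscale image from its first k KL components,
    treating each column of the image as one sample vector."""
    X = img.T.astype(float)          # samples as rows: one row per image column
    mu = X.mean(axis=0)
    D = X - mu
    C = (D.T @ D) / X.shape[0]       # covariance of the column vectors

    lam, vecs = np.linalg.eigh(C)    # eigenvalues in ascending order
    A = vecs[:, ::-1].T              # rows = eigenvectors, largest eigenvalue first
    A_k = A[:k, :]                   # keep only the k dominant eigenvectors

    Y = A_k @ D.T                    # k coefficients per column (the compressed data)
    X_hat = (A_k.T @ Y).T + mu       # approximate reconstruction of every column
    return X_hat.T                   # back to image layout (rows x columns)

# Larger k -> better reconstruction; the MSE equals the sum of the discarded eigenvalues.
# for k in (2, 8, 32):
#     approx = kl_compress(img, k)
```

Storing only the $k$ coefficient rows of `Y`, the $k$ eigenvectors, and the mean is what gives the compression; increasing $k$ trades storage for reconstruction quality.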