---
tags: Digital Image Processing
disqus: hackmd
---

# Part 2

Continuing from the discussion of signal processing on 1D signals, the same concepts can be applied to 2D signals, or images in this case. An image is a function of two variables, $x$ and $y$. For a bandlimited 1D signal, $X(\omega) = 0$ for $|\omega| > \omega_0$. Similarly, a 2D signal is bandlimited if $F(\omega_x, \omega_y) = 0$ for $|\omega_x| > \omega_{x0}$ or $|\omega_y| > \omega_{y0}$.

\begin{equation} f_s(x,y) = f(x,y)\,comb(x,y;\Delta x, \Delta y) \end{equation}

\begin{equation} f_s(x,y) = \sum_{m = -\infty}^{+\infty} \sum_{n = -\infty}^{+\infty}f(m\Delta x, n\Delta y)\,\delta(x - m\Delta x, y - n\Delta y) \end{equation}

which can be represented as,

![](https://i.imgur.com/r7Zyelk.png)

As with 1D signals, Fourier analysis can be carried out for 2D signals as discussed before,

\begin{equation} F_s(\omega_x,\omega_y) = F(\omega_x,\omega_y) * COMB(\omega_x,\omega_y) \end{equation}

where $*$ is the convolution operator. The 2D region over which the spectrum is nonzero is known as the *region of support* (denoted $R$ later).

![](https://i.imgur.com/CD6iKlK.png)

For image reconstruction, the Nyquist-Shannon theorem must be satisfied, as discussed before. Consider one of the peaks from the Fourier spectrum and let its region of support be $R$. To recover the baseband spectrum, we can use a lowpass filter with the frequency response

\begin{equation} H(\omega_x,\omega_y) = \frac{1}{\omega_{xs}\, \omega_{ys}}, \quad (\omega_x,\omega_y) \in R \end{equation}

and $H(\omega_x,\omega_y) = 0$ elsewhere. (If one is not familiar with the concept of filters: the filter's impulse response is convolved with the sampled signal, so that only the spectral patch inside $R$ reaches the output.) In practical applications, the resolution of the reconstructed image is often expressed in 'dpi', or dots per inch. The higher the dpi, the clearer the image, as shown.
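The Nyquist condition for 2D sampling can be checked numerically. The sketch below (a minimal illustration; the frequencies `fx, fy` and grid sizes are assumed for the example, not taken from the text) samples a 2D sinusoid on a uniform grid, which plays the role of the comb function, and locates the strongest spectral peak with the FFT. When the sampling rate exceeds twice the highest frequency, the peak sits at the true frequency; below that rate, the $y$ frequency aliases.

```python
import numpy as np

# Illustrative 2D pattern: f(x, y) = cos(2*pi*(fx*x + fy*y)),
# bandlimited to (fx, fy) cycles per unit.
fx, fy = 3.0, 5.0

def sampled_peak(n):
    # Sample f on an n x n grid over the unit square (spacing 1/n, i.e.
    # n samples per unit) and return the magnitude of the frequencies
    # at the strongest spectral peak.
    x = np.arange(n) / n
    X, Y = np.meshgrid(x, x, indexing="ij")
    f = np.cos(2 * np.pi * (fx * X + fy * Y))
    F = np.abs(np.fft.fft2(f))
    i, j = np.unravel_index(np.argmax(F), F.shape)
    w = np.fft.fftfreq(n, d=1.0 / n)      # frequency axis in cycles per unit
    return abs(w[i]), abs(w[j])

print(sampled_peak(32))   # 32 > 2*5: Nyquist satisfied, peak at (3, 5)
print(sampled_peak(8))    # 8 < 2*5: fy = 5 aliases to |5 - 8| = 3
```

The aliased run is exactly the 2D analogue of the spectral replicas overlapping when the sampling rate is too low.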
![](https://i.imgur.com/tW6G9R0.png)

One might then think that the dpi should be increased as much as possible to get a clearer image. However, there is no point in increasing it once legibility is sufficient: the file size grows, but the image quality remains the same. So, one may choose to increase the dpi until the Nyquist rate condition is satisfied; after that, there will be no change.

## Quantization

The second phase in the steps mentioned before is quantization, applied once we have the sampled image. By definition, quantization is the mapping of a continuous variable to a discrete form. Now that we have the samples, we face another problem: the signal is still not in digital format, since each sample can take any value from a continuum of possibilities. Quantization makes it possible to represent each pixel location with one of a finite set of numerical values. The algorithm for quantization can be defined as,

1. Define a set of transition levels $\{t_k, k = 0,1,...,L - 1\}$, where $L$ is the total number of levels.
2. For the quantized sample $u'$, set $u' = r_k$ if $t_k < u < t_{k+1}$.

An example of a quantized 1D signal is shown here,

![](https://i.imgur.com/8Xxiojw.png)

*Note:* The y axis is in binary format.

Here, the analogue sinusoidal signal is first sampled and then converted into this staircase format. One can notice that instead of infinitely many possible function outputs, there is now only a discrete set of possibilities. How many levels are required? That can be set by the user, of course. Some might also have noticed the deviation of the values from the original signal due to quantization. This is known as quantization error.
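The two-step algorithm above can be sketched in a few lines of Python. This is a minimal illustration with assumed names (`t`, `r`, `quantize`) and an assumed choice of uniformly spaced transition levels with midpoint reconstruction levels:

```python
import numpy as np

# Illustrative setup: L = 8 levels over the range [-1, 1].
L = 8
t = np.linspace(-1.0, 1.0, L + 1)    # transition levels t_0 .. t_L
r = (t[:-1] + t[1:]) / 2             # reconstruction level r_k for each cell

def quantize(u):
    # Map each sample u to r_k, where k is the cell with t_k <= u < t_{k+1}.
    k = np.clip(np.digitize(u, t) - 1, 0, L - 1)
    return r[k]

u = np.sin(2 * np.pi * np.linspace(0, 1, 50))  # sampled analogue signal
uq = quantize(u)                               # staircase (quantized) version
# uq takes at most L distinct values, and |u - uq| never exceeds half a step.
```

The difference `u - uq` is exactly the quantization error discussed above.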
To explain it a bit more, refer to the image shown below,

![](https://i.imgur.com/WtVFocB.png)

### Lloyd-Max Quantizer

Now that we have the quantization error in hand, it is essential to minimize its mean squared value for a good quantizer. Here some concepts from random variables come into play. Let $u$ be a scalar random variable with continuous probability density function $p_u(u)$. We want to find the optimum $L$-level quantizer, i.e., the decision and reconstruction levels that minimize the mean squared error. The mean squared error can be evaluated from the PDF as,

\begin{equation} \xi = E[(u - u')^2] = \int_{t_1}^{t_{L+1}} (u - u')^2 p_u(u)\, du \end{equation}

This function is minimized with respect to the decision and reconstruction levels,

\begin{equation} \frac{\partial \xi}{\partial t_k} = 0, \quad \frac{\partial \xi}{\partial r_k} = 0 \end{equation}

After some calculation and simplification, we get,

\begin{equation} t_k = \frac{r_k + r_{k - 1}}{2}, \quad r_k = E[u \mid u \in T_k] \end{equation}

where $T_k = [t_k, t_{k+1})$. Therefore, the optimum transition level lies midway between the two adjacent optimum reconstruction levels, and each reconstruction level lies at the centre of mass of the probability density over its interval. These are nonlinear equations, which can be solved using iterative methods such as the Newton-Raphson method.

![](https://i.imgur.com/Fy0UDfR.png)

This is the final piecewise constant function obtained, which here looks somewhat like a Gaussian. Using this, an approximate solution for the decision levels is,

\begin{equation} t_{k+1} = t_{1} + \frac{A\int_{t_1}^{z_k + t_1}[p_u(u)]^{\frac{-1}{3}}du}{\int_{t_1}^{t_{L + 1}}[p_u(u)]^{\frac{-1}{3}}du} \end{equation}

where $A = t_{L + 1} - t_1$ and $z_k = \frac{k}{L}A$, with $k = 1,2,...,L$.
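Instead of Newton-Raphson, the two optimality conditions can also be solved by simple fixed-point iteration: alternately recompute the centroids $r_k = E[u \mid u \in T_k]$ and the midpoints $t_k = (r_{k-1} + r_k)/2$. The sketch below (an illustration, not the text's own method; the $\pm 4\sigma$ overload points and grid resolution are assumed) does this for a standard Gaussian PDF with $L = 4$:

```python
import numpy as np

# Lloyd-Max fixed-point iteration for a standard Gaussian source, L = 4.
# Overload points t_1 and t_{L+1} are assumed to be -4 and +4 (in sigmas).
L = 4
u = np.linspace(-4, 4, 20001)                  # dense grid over [t_1, t_{L+1}]
p = np.exp(-u**2 / 2) / np.sqrt(2 * np.pi)     # Gaussian PDF samples

t = np.linspace(-4, 4, L + 1)                  # initial decision levels
for _ in range(200):
    r = np.empty(L)
    for k in range(L):
        m = (u >= t[k]) & (u < t[k + 1])
        r[k] = (u[m] * p[m]).sum() / p[m].sum()  # centroid E[u | u in T_k]
    t[1:-1] = (r[:-1] + r[1:]) / 2               # midpoint condition

print(np.round(r, 3))   # converges near the known optimum +/-0.453, +/-1.510
```

The converged reconstruction levels agree with the tabulated Lloyd-Max values for a Gaussian source at $L = 4$.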
$A$ is the dynamic range, and $t_1$ and $t_{L + 1}$ are the overload points, which must be finite in order to solve the equation. They have to be assumed prior to the assignment of decision and reconstruction values. The mean squared error for this quantizer can then be approximated as,

\begin{equation} \xi \approx \frac{1}{12L^2}\biggl(\int_{t_1}^{t_{L+1}} [p_u(u)]^{\frac{1}{3}}du\biggr)^{3} \end{equation}

Two probability density functions commonly used for images are,

1. Gaussian \begin{equation} p_u(u) = \frac{1}{\sqrt{2\pi\sigma^2}}e^{\frac{-(u - \mu)^2}{2\sigma^2}} \end{equation}
2. Laplacian \begin{equation} p_u(u) = \frac{\alpha}{2}e^{-\alpha|u - \mu|} \end{equation}

where $\mu$ is the mean and $\sigma^2$ is the variance. For the Laplacian density, the variance is related to $\alpha$ by $\sigma^2 = \frac{2}{\alpha^2}$.

### Uniform Distribution Quantizer

Here,

\begin{equation} p_u(u) = \begin{cases} \frac{1}{t_{L + 1} - t_1} & t_1 \leq u \leq t_{L + 1} \\ 0 & \text{otherwise} \end{cases} \end{equation}

Given this, applying the Lloyd-Max quantizer equations gives,

\begin{equation} r_k = \frac{t_{k + 1}+t_k}{2} \end{equation}

and,

\begin{equation} t_k = \frac{t_{k - 1}+t_{k+1}}{2} \end{equation}

Therefore the width of each level is constant and is known as the quantization step $q$. So, $q = t_{k+1} - t_k = \text{constant}$, and

\begin{equation} q = \frac{t_{L + 1}- t_1}{L} \end{equation}

where $L$ is the number of quantizer levels. In this case, the MSE evaluates to,

\begin{equation} \xi = \frac{q^2}{12} \end{equation}

Therefore, for a uniform distribution the Lloyd-Max quantizer is linear, and hence it is known as a linear quantizer. Consider $B$ bits allocated to the quantizer, so that $L = 2^B$. Then,

\begin{equation} \frac{\xi}{\sigma_u^2} = 2^{-2B} \end{equation}

The signal-to-noise ratio (SNR) is $10\log_{10}2^{2B} \approx 6B \text{ dB}$. So, the SNR for the uniform quantizer is about $6\,\text{dB}$ per bit. This ends the digitization process.
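Both the $\xi = q^2/12$ result and the 6 dB-per-bit rule are easy to verify empirically. The sketch below (a minimal check with an assumed uniform source on $[0, 1]$; the variable names are illustrative) quantizes a large batch of uniform samples with a $B$-bit linear quantizer and compares the measured MSE and SNR against the formulas:

```python
import numpy as np

# Empirical check of xi = q^2/12 and SNR ~ 6B dB for a uniform source.
rng = np.random.default_rng(0)
u = rng.uniform(0.0, 1.0, 1_000_000)           # uniform source on [0, 1)

for B in (2, 4, 6):
    L = 2 ** B                                 # number of levels
    q = 1.0 / L                                # quantization step
    uq = (np.floor(u / q) + 0.5) * q           # linear (midpoint) quantizer
    xi = np.mean((u - uq) ** 2)                # empirical mean squared error
    snr_db = 10 * np.log10(np.var(u) / xi)
    print(B, round(xi / (q * q / 12), 3), round(snr_db, 2))
```

The measured MSE matches $q^2/12$ to within sampling noise, and each extra bit adds close to $10\log_{10}4 \approx 6.02$ dB of SNR.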