---
tags: NSYSU, 影像處理, DIP, 期中考, 期末考
---

DIP
===

[TOC]

# CH1

## Sampling
Spatial resolution

## Quantization
Intensity resolution

## Aliasing
+ Band-limited function: the highest frequency is finite and the function is of unlimited duration
+ Shannon sampling theorem
  If a band-limited function is sampled at a rate equal to or greater than twice its highest frequency, it is possible to recover the original function completely
+ Under-sampling → aliasing

## Zooming
oversampling

## Shrinking
undersampling

## Image Interpolation
+ Nearest neighbor interpolation
+ Bilinear interpolation
+ Bicubic interpolation

## Linear Operation
Let $H[f(x,y)] = g(x,y)$

$H$ is a <em>linear operator</em> if:
\begin{align}
H[a_if_i(x,y) + a_jf_j(x,y)] &= a_iH[f_i(x,y)] + a_jH[f_j(x,y)] \\
&=a_ig_i(x,y) + a_jg_j(x,y)
\end{align}

## Spatial Coordinate Transformations
$(x, y) = T(v, w)$

| Transformation Name | Affine matrix | Coordinate equations |
| -------- | -------- | :--------: |
| Identity | \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} | $x=v\\ y=w$ |
| Scaling | \begin{bmatrix} c_x & 0 & 0 \\ 0 & c_y & 0 \\ 0 & 0 & 1 \end{bmatrix} | $x=c_xv\\ y=c_yw$ |
| Rotation | \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} | $x=v\cos\theta - w\sin\theta\\ y=v\sin\theta + w\cos\theta$ |

# CH2

## Image Enhancement

### Spatial domain
Manipulate pixels directly in an image

### Frequency domain
Modify the Fourier transform of an image

## Negative Image
formula
$$
g(x, y) = 255 - f(x, y)
$$

## Log transformation
formula
$$
s = c\log(1+|r|)
$$

## Power-law transformation
formula
$$
s = cr^\gamma
$$
<em>Note:</em> with $c = 1$ and $r$ normalized as $x/255$, the output is $s = 255(x/255)^\gamma$, so for every nonzero $x$:
$$
\lim_{\gamma \rightarrow 0} 255(x/255)^\gamma = 255
$$

## Bitplane slicing
```python=
# Extract bit plane `planeNo` (0 = LSB) and stretch it to {0, 255}
img = self.cv2_img >> planeNo
self.enhanced_img = cv2.bitwise_and(img.astype(np.uint8), 1) * 255
```

## Histogram Processing

### Unnormalized histogram
$$
h(r_k) = n_k, \quad k=0,1,2,\dots,L-1
$$
$n_k$: the number of pixels in $f$ with intensity $r_k$
$M$: the number of image rows
$N$: the number of image columns

### Normalized Histogram
$$
p(r_k) = \frac{h(r_k)}{MN} = \frac{n_k}{MN}
$$
* The sum of $p(r_k)$ for all values of $k$ is always 1.
* The components of $p(r_k)$ are estimates of the probabilities of intensity levels occurring in an image.

### Histogram Equalization
Goal: spread the intensities of an image with an uneven distribution so that the equalized distribution is approximately uniform
$$
s = T(r), \quad 0 \le r \le L-1
$$
$r$: denotes the intensities of the image to be processed.

Assumptions
1. $T(r)$ is a monotonically increasing function in the interval $0 \le r \le L-1$
2. $0 \le T(r) \le L-1$ for $0 \le r \le L-1$

* Inverse transformation
  $r = T^{-1}(s), \quad 0 \le s \le L-1$
* Random variables
  $p_r(r)$ denotes the PDF of the intensity values $r$ in the input image; $p_s(s)$ denotes the PDF of the intensity values $s$ in the output image. Since $r$ and $s$ belong to two different images, $p_r$ and $p_s$ are different functions.
+ If $p_r(r)$ and $T(r)$ are known, and $T(r)$ is continuous and differentiable over the range of values of interest, then the PDF of the transformed (mapped) variable $s$ can be obtained as:
$$
p_s(s) = p_r(r)\left|\frac{dr}{ds}\right|
$$
* Transformation function
$$
s = T(r) = (L-1)\int^r_0 p_r(w)\,dw
$$
$w$: dummy variable of integration.
* Calculate $p_s(s)$
  According to Leibniz's rule
\begin{align}
\frac{ds}{dr} &= \frac{dT(r)}{dr} \\
&=(L-1)\frac{d}{dr}\left[\int^r_0 p_r(w)\,dw\right]\\
&=(L-1)p_r(r)
\end{align}
such that
\begin{align}
p_s(s) &= p_r(r)\left|\frac{dr}{ds}\right|\\
&=p_r(r)\left|\frac{1}{(L-1)p_r(r)}\right|\\
&=\frac{1}{L-1}, \quad 0 \le s \le L-1
\end{align}
i.e., the equalized intensities follow a uniform PDF.

## Spatial Filtering

### Smoothing (low-pass) filters
1. reduces additive noise
2. blurs the image
3. sharpness details are lost

1.
Median filter
    + Replace $f(x,y)$ with $\text{median}[f(x', y')]$ over a neighborhood
    + Useful in eliminating intensity spikes (salt & pepper noise)
    + Better at preserving edges
2. Average filter
$$
g(x, y) = \frac{1}{MN}\sum^{M-1}_{i=0}\sum^{N-1}_{j=0}f(i, j)
$$

### Sharpening Filter
Purposes:
1. Enhance finer image details (such as edges)
2. Detect region or object boundaries

1. Laplacian Based Edge Detectors
$$
g(x,y) = f(x,y) + c[\nabla^2f(x,y)]
$$
$f(x,y)$ is the input image; $g(x,y)$ is the sharpened image.
    + Rotationally symmetric, linear operator
    + Second derivatives ⟹ sensitive to noise
    + Increases the contrast at the locations of gray-level discontinuities
2. Unsharp Masking
    Mask
$$
g_{mask}(x,y) = f(x,y) - \bar{f}(x,y)
$$
$\bar{f}(x, y)$ is a lowpass-filtered version of the original image.
    Final
$$
g(x,y) = f(x,y) + k \cdot g_{mask}(x,y), \quad k \ge 0
$$
$k=1$: unsharp masking
$k>1$: highboost filtering
3. The Gradient
    + Edge detection
    + Constant or slowly varying shades are eliminated
    + Automated inspection
    + Segmentation

# CH3
## Different w
![](https://i.imgur.com/iDydYKm.png)![](https://i.imgur.com/6XBscDy.png)![](https://i.imgur.com/b66KnvN.png)

# CH4 - Image Enhancement in the Frequency Domain

## Sampling theorem
<div class="definition" style="background-color:PaleGoldenRod ;">
<p class="def">
A continuous, band-limited function can be recovered completely from a set of its samples if the samples are acquired at a rate exceeding twice the highest frequency content of the function.
</p>
</div>

$$
\frac{1}{\Delta T} > 2 \mu_{max}
$$

### Nyquist rate
<div class="definition" style="background-color:PaleGoldenRod ;">
<p class="def">
A sampling rate exactly equal to twice the highest frequency is called the <em>Nyquist rate</em>
</p>
</div>

### Over-sampled
<div class="definition" style="background-color:PaleGoldenRod ;">
<p class="def">
Sampling above the Nyquist rate. The periods of the transform are separated, so a filter function can isolate a single period of the transform and reconstruct the band-limited function from its samples.
</p>
</div>

#### Minimum sampling rate
The Nyquist rate is the theoretical minimum sampling rate; in practice the sampling rate should exceed it.

### Under-sampled
<div class="definition" style="background-color:PaleGoldenRod ;">
<p class="def">
A sampling rate that is too coarse: if the sampling were refined, more and more differences would be revealed in the sampled signals.
</p>
</div>

### Aliasing
<div class="definition" style="background-color:PaleGoldenRod ;">
<h4><ins>In sampling theorem</ins></h4>
<p class="def">
Aliasing refers to sampling phenomena that cause different signals to become indistinguishable from one another after sampling or, viewed another way, for one signal to "masquerade" as another.
</p>
</div>

**Aliasing in images**
1. Spatial aliasing
   Spatial aliasing is caused by under-sampling and tends to be more visible in images with repetitive patterns.
2. Temporal aliasing
   Temporal aliasing is related to the time intervals between images in a sequence of dynamic images.

|wagon wheel effect|Description|
|:------------:|---------------|
|![](https://media.giphy.com/media/lJEklktAKM3MQ/giphy.gif "wagon wheel")|The wheel appears to rotate backwards because the frame rate is too low with respect to the speed of wheel rotation in the sequence.|

### Moiré Patterns
<div class="definition" style="background-color:PaleGoldenRod ;">
<p class="def">
A Moiré pattern is a secondary, visual phenomenon produced, for example, by superimposing two gratings of approximately equal spacing.
</p>
</div>

![](https://i.imgur.com/LOs4A8C.png)

### Periodicity
![](https://i.imgur.com/af3DEUf.png)

## Frequency manipulation

### Basic scheme of filtering
![](https://i.imgur.com/rknnkjP.png)

### Apply Percentage of Power to image
1. Standard cutoff frequency loci: use circles that enclose specified amounts of the total image power $P_T$
$$
P_T = \sum^{P-1}_{u=0}\sum^{Q-1}_{v=0}P(u, v)
$$
2. Center the DFT
3. Calculate the percentage $\alpha$
$$
\alpha = 100\left[\sum_u\sum_v\frac{P(u, v)}{P_T}\right]
$$

### Ideal Lowpass Filter
$D_0$: cutoff frequency
$$
H(u, v) =
\begin{cases}
1 & \text{if } D(u, v) \le D_0 \\
0 & \text{if } D(u, v) \gt D_0
\end{cases}
$$
![](https://i.imgur.com/SOhBQv9.png)
* Has ringing effect

### Gaussian Lowpass Filter
\begin{align}
H(u, v) &= e^{\frac{-D^2 (u,v)}{2\sigma^2}} \\
&= e^{\frac{-D^2(u,v)}{2{D_0}^2}}
\end{align}
* No ringing effect

### Butterworth Lowpass Filter
$$
H(u, v) = \frac{1}{1+(\frac{D(u,v)}{D_0})^{2n}}
$$
* Less ringing effect

![](https://i.imgur.com/7bOanVn.png)
![](https://i.imgur.com/NMiNSz7.png)

### Ideal Highpass Filter
$D_0$: cutoff frequency
$$
H(u, v) =
\begin{cases}
1 & \text{if } D(u, v) \gt D_0 \\
0 & \text{if } D(u, v) \le D_0
\end{cases}
$$

### Butterworth Highpass Filter
$$
H(u, v) = \frac{1}{1+(\frac{D_0}{D(u,v)})^{2n}}
$$
+ Approaches the characteristics of the IHPF for higher values of $n$
+ Approaches the GHPF for lower values of $n$

![](https://i.imgur.com/x0aeshW.png)
![](https://i.imgur.com/7Qb7Z4K.png)

### Homomorphic Filtering
$f(x,y)$: image
$i(x,y)$: illumination <mark>(slow spatial variations)</mark>
$r(x,y)$: reflectance <mark>(varies abruptly, particularly at the junctions of dissimilar objects)</mark>
$$
f(x,y) = i(x,y)r(x,y)
$$

**Frequency domain**
\begin{align}
z(x,y) &= \ln f(x,y)\\
&= \ln i(x,y) + \ln r(x,y)
\end{align}
$\Im$: Fourier transform
\begin{align}
\Im[z(x,y)] &= \Im[\ln f(x,y)] \\
&= \Im[\ln i(x,y)] + \Im[\ln r(x,y)]
\end{align}
![](https://i.imgur.com/uhnlSRH.png)
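The pipeline above can be sketched in a few lines of NumPy: take the log, go to the frequency domain, apply a Gaussian-shaped high-emphasis filter that attenuates low frequencies (illumination) and boosts high frequencies (reflectance), then invert. The parameter values `gamma_L`, `gamma_H`, and `D0` below are illustrative assumptions, not values from these notes:

```python
import numpy as np

def homomorphic_filter(f, gamma_L=0.5, gamma_H=2.0, D0=30.0):
    """Attenuate illumination (low frequencies) and boost
    reflectance (high frequencies) via the log of the image."""
    M, N = f.shape
    z = np.log1p(f.astype(np.float64))            # z = ln(f + 1)
    Z = np.fft.fftshift(np.fft.fft2(z))           # centered spectrum
    u = np.arange(M) - M / 2
    v = np.arange(N) - N / 2
    D2 = u[:, None] ** 2 + v[None, :] ** 2        # D^2(u, v) from the center
    # High-emphasis filter: gamma_L at D = 0, rising toward gamma_H
    H = (gamma_H - gamma_L) * (1 - np.exp(-D2 / (2 * D0 ** 2))) + gamma_L
    s = np.real(np.fft.ifft2(np.fft.ifftshift(H * Z)))
    g = np.expm1(s)                               # undo the log
    return np.clip(g, 0, 255).astype(np.uint8)
```

With `gamma_L < 1 < gamma_H` the filter simultaneously compresses the dynamic range (illumination) and enhances contrast (reflectance), which is the point of operating on $\ln f = \ln i + \ln r$.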
![](https://i.imgur.com/XDtDb8X.png)

# CH5 - Image Restoration

## Introduction
|Enhancement|Restoration|
|-----------|-----------|
|"Better" visual representation|Remove effects of sensing environment|
|Subjective|Objective|
|No quantitative measures|Mathematical, model-dependent quantitative measures|

### Motivation
Remove, or at least reduce, blur and noise in a digital image
Quantitative goal for spatial filtering
![](https://i.imgur.com/YLWJApr.png)

## Degradation model
$$
g(x,y) = h(x, y) \star f(x,y) + n(x,y)
$$

## Noise model
Properties:
1. White noise: the Fourier spectrum is constant
2. Random noise: assumed to be independent of the spatial coordinates
3. Noise is described by probability models

### Noise Probability Density Functions
1. Gaussian Noise
2. Rayleigh Noise
3. Gamma Noise
4. Exponential Noise
5. Uniform Noise
6. Impulse (salt-and-pepper) Noise

![](https://i.imgur.com/ayOy5VT.png)
![](https://i.imgur.com/B1OJ5pP.png)
![](https://i.imgur.com/EbwPyld.png)

## Restoration in the presence of noise only - spatial filtering

### Mean filters
1. Arithmetic Mean Filter
    + Same as the box filter
    + A mean filter smooths local variations in an image, and noise is reduced as a result of blurring
    + Suited for Gaussian or uniform noise
$$
\hat{f}(x,y)=\frac{1}{mn}\sum_{(r,c)\in S_{xy}}g(r,c)
$$
2. Geometric Mean Filter
    + Achieves smoothing comparable to an arithmetic mean filter
    + Tends to lose less image detail in the process
    + Suited for Gaussian or uniform noise
$$
\hat{f}(x,y)=\left[\prod_{(r,c)\in S_{xy}}g(r,c)\right]^\frac{1}{mn}
$$
$\prod$: indicates multiplication
3. Harmonic Mean Filter
    + Works well for salt noise and Gaussian noise
    + Fails for pepper noise
$$
\hat{f}(x,y) = \frac{mn}{\sum_{(r,c)\in S_{xy}}\frac{1}{g(r,c)}}
$$
4.
Contraharmonic Mean Filter
    + $Q$ is called the order of the filter
    + Reduces or virtually eliminates the effects of salt-and-pepper noise
    + Positive $Q$: eliminates pepper noise
    + Negative $Q$: eliminates salt noise
    + Cannot eliminate salt and pepper noise simultaneously
    + $Q = 0$: arithmetic mean filter
    + $Q = -1$: harmonic mean filter
$$
\hat{f}(x,y) = \frac{\sum_{(r,c)\in S_{xy}}g(r,c)^{Q+1}}{\sum_{(r,c)\in S_{xy}}g(r,c)^Q}
$$

# CH6 - Color Image Processing

## Introduction
1. Pseudo-color processing
   (False) colors are assigned to a monochrome image.
2. Full-color processing
   Images are acquired with full-color sensors/cameras.

White light passed through a prism splits into a spectrum of six broad color regions: violet, blue, green, yellow, orange, and red
![](https://i.imgur.com/Ai6MH4B.png)

Light whose only attribute is intensity (amount of light) is achromatic (devoid of color).
Chromatic light spans the electromagnetic (EM) spectrum from approximately 400 nm to 700 nm.

| Term | Definition | Unit |
| -------- | -------- | -------- |
| Radiance | the total amount of light that flows from a light source | watts |
| Luminance | a measure of the amount of energy an observer perceives from a light source | lumens |
| Brightness | a subjective descriptor that is impossible to measure | none |

### Primary and secondary colors

#### Mixture of light (additive primaries)
Secondary colors of light:
- Magenta (red + blue)
- Cyan (green + blue)
- Yellow (red + green)

\begin{align}
white &= red + blue + green \\
&= magenta + green \\
&= cyan + red \\
&= yellow + blue
\end{align}
Example: color television

#### Mixture of pigments
A primary color of pigment is defined as one that subtracts or absorbs a primary color of light and reflects or transmits the other two.
Example: printer
![](https://i.imgur.com/jCY078G.png)

## Tristimulus Values
The amounts of red, green, and blue needed to form any particular color, denoted by X, Y, and Z, respectively.
Trichromatic coefficients:
\begin{align}
x = \frac{X}{X+Y+Z} \quad y = \frac{Y}{X+Y+Z} \quad z = \frac{Z}{X+Y+Z}
\end{align}
$$
x + y + z = 1
$$

## Chromaticity Diagram
Color composition: x (red), y (green), and z = 1 - (x+y)

The positions of the various spectrum colors (completely saturated or "pure" colors) are indicated along the boundary of the tongue-shaped chromaticity diagram. Points inside this region represent some mixture of the pure colors.

The point of equal energy corresponds to equal fractions of the three primary colors and has zero saturation. As a point leaves the boundary and moves towards the center, more white light is added to the color and it becomes less saturated.

The chromaticity diagram can be used for color mixing: a line joining two points in the diagram represents all the colors that can be obtained by mixing the two colors additively. This is consistent with the earlier remark that the three pure primary colors by themselves cannot produce all colors (unless the wavelengths are changed as well).
![](https://i.imgur.com/EVrBp9W.png)

## Color models

### Introduction
Purpose: to facilitate the specification of color in some standard fashion

Properties: a color model is a specification of a 3-D coordinate system and a subspace within that system where each color is represented by a single point.

|Color system | Definition|
|-----|-----------|
|RGB|mainly in color monitors and video cameras.|
|CMYK|used in printing devices.|
|HSI|Based on the way humans describe and interpret color. <br>It also helps in separating the color and grayscale information in an image.|

### RGB Color model
Each color appears in its primary spectral components of red (R), green (G), and blue (B).
Mainly used for hardware such as color monitors and color video cameras.
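The additive-mixing relations listed under *Primary and secondary colors* can be checked directly with 24-bit RGB triplets (a minimal sketch using the standard 8-bit-per-channel conventions):

```python
import numpy as np

# 24-bit RGB: each primary component ranges over [0, 255]
red   = np.array([255, 0, 0])
green = np.array([0, 255, 0])
blue  = np.array([0, 0, 255])

# Secondary colors of light are pairwise sums of the primaries
magenta = red + blue     # red + blue
cyan    = green + blue   # green + blue
yellow  = red + green    # red + green

# White is the sum of all three primaries, or equivalently any
# primary plus its complementary secondary color
white = red + green + blue
assert (white == magenta + green).all()
assert (white == cyan + red).all()
assert (white == yellow + blue).all()
```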
Pixel depth: the number of bits used to represent each pixel in RGB space.
8 bits for each primary component → depth of 24 bits

<em style="background-color:rgb(255, 255, 51)">RGB model is suited for image color generation.</em>

#### Implementation
Based on a Cartesian coordinate system.
![](https://i.imgur.com/IgGplzb.png)
Primary colors: red, green, and blue
Secondary colors: cyan, magenta, and yellow
<em style="background-color:rgb(255, 255, 51);">Grayscale (monochrome) is represented by the diagonal joining black to white.</em>

#### Variation: Safe RGB
Safe RGB colors = all-systems-safe colors = safe Web colors = safe browser colors

Purpose: some systems/applications cannot display the full 24-bit color range, so an RGB subset is defined that all systems/applications can display correctly.

##### Implementation
1. Assuming 256 distinct colors as the minimum capability of any color display device, a standard notation to refer to these "safe" colors is necessary.
2. 40 of these 256 colors are known to be processed differently by various operating systems, leaving <mark>216 colors that are common to most systems.</mark>
3. These 216 colors are formed by combinations of RGB values.
4. Each component is restricted to one of six possible values in the set {0, 51, 102, 153, 204, 255}, or in hexadecimal notation {00, 33, 66, 99, CC, FF}. <em>Note that all the values are divisible by 3.</em>
![](https://i.imgur.com/8fwfshY.png)

### CMY Color model
Each color is represented by the three secondary colors — cyan (C), magenta (M), and yellow (Y).
It is mainly used in devices such as color printers that deposit color pigments.
It is related to the RGB color model (with components normalized to [0, 1]):
$$
\begin{bmatrix} C\\ M\\ Y \end{bmatrix}=
\begin{bmatrix} 1\\ 1\\ 1 \end{bmatrix}-
\begin{bmatrix} R\\ G\\ B \end{bmatrix}
$$

### HSI / HSV color model
|Term | Definition |
|-----|------------|
|Brightness (intensity, value)|embodies the achromatic notion of intensity.|
|Hue|associated with the dominant wavelength in a mixture of light waves.|
|Saturation|refers to the relative purity or the amount of white light mixed with a hue.|

Hue and saturation together are called chromaticity. A color can be described in terms of its brightness and chromaticity.
![](https://i.imgur.com/fHw54bX.png)
Hues opposite one another in a color circle are called complements.
![](https://i.imgur.com/oDt8OO0.png)

Advantages:
+ Chrominance (H, S) and luminance (I) components are decoupled.
+ Hue and saturation are intimately related to the way the human visual system perceives color.
+ <em style="background-color:rgb(255, 255, 51);">HSI model is suited for image color description.</em>

### Manipulation of HSI Components
![](https://i.imgur.com/gkagZoT.png)

## Pseudo Coloring
Assign colors to monochrome images, based on various properties of their gray-level content.
![](https://i.imgur.com/RCMqR1B.png)

## Color Transformation
RGB color vector
\begin{bmatrix} R \quad G \quad B \end{bmatrix}
HSI color vector
\begin{bmatrix} H \quad S \quad I \end{bmatrix}
Transformation
$$
g(m, n) = T_i \times f(m, n), \quad where\ i=1,2,\dots,n
$$

### Color slicing
Color slicing is similar to intensity slicing

### Modifying Intensity
$$
g(m, n) = kf(m, n), \quad 0 \lt k \lt 1
$$
HSI space
$$
\begin{bmatrix} S_1 \\ S_2 \\ S_3 \end{bmatrix}=
\begin{bmatrix} r_1 \\ r_2 \\ kr_3 \end{bmatrix}
$$
RGB space
$$
\begin{bmatrix} S_1 \\ S_2 \\ S_3 \end{bmatrix}=
\begin{bmatrix} kr_1 \\ kr_2 \\ kr_3 \end{bmatrix}
$$
CMY space
$$
\begin{bmatrix} S_1 \\ S_2 \\ S_3 \end{bmatrix}=
\begin{bmatrix} kr_1 \\ kr_2 \\ kr_3 \end{bmatrix}+(1-k)
$$

## Intensity Slicing
View an image as a 2-D intensity function.
Slice the intensity (or density) function with a plane parallel to the coordinate axes. Pixels with gray values above the plane are coded with one color and those below are coded with a different color.
![](https://i.imgur.com/Bhc29km.png)
+ This gives a two-color image, similar to thresholding but with colors.
+ The technique can easily be extended to more than one plane.

## Gray Level to Color Transformations
+ Perform three independent transformations on the gray level of an input monochrome image.
+ The outputs of the three transformations are fed to the red, green, and blue channels of a color monitor.
![](https://i.imgur.com/Hepf6Vt.png)

## Color complement
Hues opposite one another in a color circle are called complements. This is analogous to grayscale negatives.
+ This transformation is useful in enhancing details embedded in dark portions of a color image.
+ Complementation can easily be implemented in the RGB space.
+ However, there is no simple equivalent in the HSI space; an approximation is possible.

## Smoothing of Color Images

### RGB domain
+ All three color components are individually transformed by an appropriate smoothing mask
$$
\frac{1}{9}\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}
$$
<mark>This approach has the potential of introducing colors not present in the original image, because the average of two colors is a color intermediate between the two</mark>

### HSI domain
+ Only the intensity component is transformed by a spatial smoothing mask, leaving the H and S components unchanged.
<mark>This approach does not introduce colors that are not present in the original image</mark>

## Color Image Sharpening
Sharpening of color images can be performed in a manner analogous to smoothing
![](https://i.imgur.com/JuzWEau.png)

## Color Histogram Equalization
![](https://i.imgur.com/spqHDwU.png)
![](https://i.imgur.com/2MOo5Xy.png)![](https://i.imgur.com/Vi1gABM.png)

## Noise in Color Images
Typically, noise affects all three color components.
+ Across the three color channels, the noise is often independent, with identical statistical characteristics.
+ Due to different illumination conditions or selective malfunction of camera hardware in a particular channel, this may not be the case.
+ Noise filtering by simple averaging can be accomplished by performing the operation independently on the R, G, and B channels and combining the results.
+ More complicated filters like the median filter are not as straightforward to formulate in the color domain and will not be pursued here.

<style type = "text/css">
.theorem {
  display: block;
  margin: 12px 0;
  font-style: italic;
}
.theorem:before {
  content: "Theorem.";
  font-weight: bold;
  font-style: normal;
}
.lemma {
  display: block;
  margin: 12px 0;
  font-style: italic;
}
.lemma:before {
  content: "Lemma.";
  font-weight: bold;
  font-style: normal;
}
.proof {
  display: block;
  margin: 12px 0;
  font-style: normal;
}
.proof:before {
  content: "Proof.";
  font-style: italic;
}
.proof:after {
  content: "\25FC";
  float: right;
}
.definition {
  display: block;
  margin: 12px 0;
  font-style: normal;
}
.definition:before {
  content: "Definition";
  font-weight: bold;
  font-style: normal;
}
p.def {
  margin: 0px auto;
  text-align: justify;
  flex-direction: center;
  align-items: center;
  width: 100%;
}
</style>