OpenCV with Python -- 2

---
tags: OpenCV
---

# OpenCV with Python -- 2

## Introduction

The last article covered some fundamentals of images. To move on to the next level, we'll introduce convolution and see how to process an image to get a better result.

## Convolution

### What is convolution?

Recall that an image is an array whose type is ==float in [0, 1]== or ==uint8 in [0, 255]==. The idea of convolution is to slide another small matrix, known as the ==kernel==, over the image and, at each position, take the **dot product** of the kernel with the image patch beneath it (multiply element-wise, then sum). See the image below. It sounds weird, but we'll see how it works later!

<center><img src=https://www.researchgate.net/profile/Chaim_Baskin/publication/318849314/figure/fig1/AS:614287726870532@1523469015098/Image-convolution-with-an-input-image-of-size-7-7-and-a-filter-kernel-of-size-3-3.png width=600></img></center>
<center>Image to show convolution</center>

---

### Time to try out!

```python=
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import cv2
import numpy as np

# Read in the image
img = cv2.imread("food.jpg")

'''
kernel = [[1/9, 1/9, 1/9],
          [1/9, 1/9, 1/9],
          [1/9, 1/9, 1/9]]
'''
kernel = np.ones((3, 3)) / 9

# Filter the image (-1 keeps the same depth as the input)
result = cv2.filter2D(img, -1, kernel=kernel)

# Original image
cv2.imshow("Original image", img)
# New image
cv2.imshow("Filter image", result)

cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.waitKey(1)
```

**Note**: The image on the left is the original and the one on the right is the filtered result. We can see that the image looks blurrier than before!

![](https://i.imgur.com/8vtuzXI.jpg)

---

### **Implementing convolution by hand**

Let's make our own simple convolution function! Before we write it, we should first know what **padding** is. Notice that in the convolution figure shown [above](https://hackmd.io/_9KDIbpySLid-pIEQW7DAA?both#What-is-convolution), the output image becomes smaller than the input.
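
To make the shrinkage concrete, here is a minimal sketch (not part of the original tutorial code) that slides a kernel only over the positions where it fits entirely inside the image. The helper name `convolve_without_padding` and the toy 7 x 7 input are assumptions chosen for illustration. Like the rest of this article, it does not flip the kernel, which makes no difference for a symmetric kernel such as the averaging one.

```python
import numpy as np


def convolve_without_padding(image, kernel):
    """Apply a square kernel to a 2-D image without any padding (hypothetical helper)."""
    k = kernel.shape[0]
    # The kernel only fits at (size - k + 1) positions along each axis
    out_h = image.shape[0] - k + 1
    out_w = image.shape[1] - k + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the patch and the kernel, then sum
            out[i, j] = np.sum(image[i:i + k, j:j + k] * kernel)
    return out


image = np.arange(49, dtype=float).reshape(7, 7)  # toy 7 x 7 "image"
kernel = np.ones((3, 3)) / 9                      # 3 x 3 averaging kernel
shrunk = convolve_without_padding(image, kernel)
print(image.shape, "->", shrunk.shape)            # (7, 7) -> (5, 5)
```

A 3 x 3 kernel trims one row and one column from every side, which is why the 7 x 7 input comes out as 5 x 5.
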
Therefore, we need additional blocks around the original matrix to deal with this problem. Adding these extra blocks is called ==padding==.

Knowing what padding is, we come across another question: "How many additional blocks do we need?" During convolution, the new value of each pixel is computed with the kernel centered on that pixel. Hence, every pixel in the original image must be able to sit under the center of the kernel. Namely, for an $n \times n$ kernel with odd $n$, we should pad $\frac{n - 1}{2}$ rows and columns on each side.

```python=
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import numpy as np


def simple_convolution(img, kernel):
    '''
    Function to do the simple convolution
    img: Input image with shape (height, width, channels)
    kernel: Square kernel matrix, the size of kernel matrix should be odd
    '''
    # Size for padding (same as (n - 1) / 2 for an odd kernel size n)
    pad_size = kernel.shape[0] // 2
    # Use np.pad() to pad rows and columns with zeros (the channel axis is not padded)
    padding_img = np.pad(img, ((pad_size, pad_size), (pad_size, pad_size), (0, 0)),
                         'constant', constant_values=(0, 0))

    # Start and end (exclusive) of the kernel center in the padded image
    start_x, start_y = pad_size, pad_size
    end_x, end_y = img.shape[0] + pad_size, img.shape[1] + pad_size

    # Result image (float, so the sums are not truncated)
    result = np.zeros(img.shape, dtype=np.float64)

    # Add a channel axis so the (n, n) kernel broadcasts over an (n, n, channels) patch
    kernel_3d = kernel[:, :, np.newaxis]

    # Start convolution
    for x in range(start_x, end_x):
        for y in range(start_y, end_y):
            # Patch centered at (x, y), shape (n, n, channels)
            patch = padding_img[x - pad_size: x + pad_size + 1,
                                y - pad_size: y + pad_size + 1]
            # Sum over the first and second axes (keep the channel axis)
            result[x - pad_size, y - pad_size] = np.multiply(patch, kernel_3d).sum(axis=(0, 1))

    print("Origin shape: {}\nNew shape: {}".format(img.shape, result.shape))

    # Ensure that the result is in type uint8
    return np.clip(result, 0, 255).astype(np.uint8)
```

## HSV

So far, we have introduced **RGB** and **grayscale** images. There is another way to represent an **RGB** image: **HSV**.
**HSV** stands for three components: ==Hue, Saturation, Value==. Below is a picture of the color palette in Keynote. In both PowerPoint and Keynote, we pick colors by **HSV** instead of RGB. The arrows show what each movement changes: moving around the circle changes the **Hue**, moving along the radius changes the **Saturation**, and the bar below changes the **Value**.

<center><img src=https://i.imgur.com/4DR8FVa.png width=500></center>

Let's look more closely at what hue, saturation and value are.

* **Hue**
  Hue is the **color portion** of the model.
* **Saturation**
  Saturation describes the **amount of gray** in a particular color.
* **Value**
  Value works in conjunction with saturation and describes the **brightness or intensity of the color**.

In OpenCV, for 8-bit images, the **hue** range is [0, 179], the **saturation** range is [0, 255], and the **value** range is [0, 255]. Below is how OpenCV converts **RGB** to **HSV**, with R, G and B first scaled to [0, 1].

$$
V = \max(R, G, B) \\
S = \begin{cases}
\frac{V - \min(R, G, B)}{V} & \quad \text{if} \ V \neq 0 \\
0 & \quad \text{otherwise}
\end{cases} \\
H = \begin{cases}
60(G - B)/(V - \min(R, G, B)) & \quad \text{if} \ V = R \\
120 + 60(B - R)/(V - \min(R, G, B)) & \quad \text{if} \ V = G \\
240 + 60(R - G)/(V - \min(R, G, B)) & \quad \text{if} \ V = B \\
0 & \quad \text{if} \ R = G = B
\end{cases}
$$

If $H < 0$ then $H = H + 360$. At this point $0 \le V \le 1$, $0 \le S \le 1$, $0 \le H \le 360$. For 8-bit images the output is then scaled: $V = 255 \cdot V$, $S = 255 \cdot S$, $H = H/2$ (so that each value fits in [0, 255]).

**Note**: **R, G, B** here are the channel values of a single pixel. Therefore, when writing our own **HSV** conversion, we have to process the image pixel by pixel. Below is how we compute it.
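
As a quick check of the formulas, the sketch below applies them to a single 8-bit pixel in plain Python. The function name `bgr_pixel_to_hsv` is hypothetical, the arguments are taken in BGR order to match how OpenCV stores pixels, and the gray case (R = G = B) is set to H = 0 to avoid dividing by zero.

```python
def bgr_pixel_to_hsv(b, g, r):
    """Convert one 8-bit BGR pixel to 8-bit HSV (hypothetical helper)."""
    # Scale the channels to [0, 1]
    b, g, r = b / 255, g / 255, r / 255
    v = max(r, g, b)
    delta = v - min(r, g, b)
    # Saturation is 0 for black, otherwise the relative spread of the channels
    s = delta / v if v != 0 else 0
    if delta == 0:
        h = 0  # gray pixel: hue is undefined, use 0
    elif v == r:
        h = 60 * (g - b) / delta
    elif v == g:
        h = 120 + 60 * (b - r) / delta
    else:
        h = 240 + 60 * (r - g) / delta
    if h < 0:
        h += 360
    # Scale to the 8-bit ranges: H in [0, 179], S and V in [0, 255]
    return int(round(h / 2)), int(round(s * 255)), int(round(v * 255))


print(bgr_pixel_to_hsv(0, 0, 255))      # pure red  -> (0, 255, 255)
print(bgr_pixel_to_hsv(255, 0, 0))      # pure blue -> (120, 255, 255)
print(bgr_pixel_to_hsv(128, 128, 128))  # gray      -> (0, 0, 128)
```

Pure blue has a hue of 240 degrees, which is stored as 120 after halving.
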
![](https://i.imgur.com/reBRvLz.jpg)
<center> How to get V </center>

#### Let's try out

* **Built-in function**

```python=
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import cv2
import numpy as np

img = cv2.imread("car.jpg")

# Convert BGR (the channel order used by cv2.imread) to HSV using cv2.cvtColor()
img_hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Show the image
cv2.imshow("Original image", img)
cv2.imshow("HSV image", img_hsv)

cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.waitKey(1)
```

<center><img src=https://i.imgur.com/m1V00Gr.jpg width=600></center>
<center>Output</center>

* **By hand** (Just for reference)

```python=
import numpy as np


# Helper function: convert one BGR pixel to HSV
def convert_channels(channels):
    # Convert to [0, 1]
    channels = channels / 255
    # Find extreme value
    max_value = np.max(channels)
    min_value = np.min(channels)

    ### Calculate V
    V = max_value

    ### Calculate S
    if V != 0:
        S = (V - min_value) / V
    else:
        S = 0

    ### Calculate H (channels are in BGR order: index 0 = B, 1 = G, 2 = R)
    if V == min_value:
        # Gray pixel (R = G = B): hue is undefined, use 0 to avoid dividing by zero
        H = 0
    elif V == channels[2]:
        H = 60 * (channels[1] - channels[0]) / (V - min_value)
    elif V == channels[1]:
        H = 120 + 60 * (channels[0] - channels[2]) / (V - min_value)
    else:
        H = 240 + 60 * (channels[2] - channels[1]) / (V - min_value)

    if H < 0:
        H += 360

    # Stack the result and return
    result = np.array([np.round(H / 2), np.round(S * 255), np.round(V * 255)])
    return result.astype(np.uint8)


def rgb_to_hsv(img):
    # Despite the name, img is expected in BGR order, as returned by cv2.imread()
    # Copy the original image
    result = img.copy()
    # Go through each pixel
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            result[i, j] = convert_channels(img[i, j])
    return result
```

You may wonder why we convert an image to **HSV** at all, since the output looks weird! (It looks odd because `cv2.imshow` displays the H, S and V channels as if they were B, G and R.) However, we will soon see its power in color filtering!

## Basic function

### Blurring

To recognize shapes or lines in an image, we want to retain the important features and drop the noise. Blurring helps by smoothing the colors, which makes later image processing easier. There are many ways to blur an image. However, I'll introduce only three:
**Average blurring**, **median blurring**, and **Gaussian blurring**.

* **Average filter**
  Average blurring uses a matrix of all ones divided by the number of its entries, so each output pixel is the mean of its neighborhood. The intuitive idea is that each pixel shares its information with its neighbors, so isolated noise is averaged out.

$$
3 \times 3 \ \text{average blur:} \quad
\begin{bmatrix}
\frac{1}{9} & \frac{1}{9} & \frac{1}{9} \\
\frac{1}{9} & \frac{1}{9} & \frac{1}{9} \\
\frac{1}{9} & \frac{1}{9} & \frac{1}{9}
\end{bmatrix}
$$

* **Median filter**
  Median blurring replaces each pixel with the median of the values in its neighborhood. There are no kernel weights here: the window only selects which pixels go into the median (you can think of it as an all-ones matrix), so what matters is the image itself.

* **Gaussian filter**
  Gaussian blur assigns a different weight to each entry, controlled by the sigma of the **Gaussian function**. The advantage of the **Gaussian filter** is that edge information is better retained than with the average filter, so the photo looks more true to life. The weight at the center of the matrix is normally the highest, so the result stays closer to the original image.
$$
3 \times 3 \ \text{Gaussian blur with} \ \sigma_{x} = \sigma_{y} = 0: \quad
\frac{1}{16}
\begin{bmatrix}
1 & 2 & 1 \\
2 & 4 & 2 \\
1 & 2 & 1
\end{bmatrix}
$$

  (In OpenCV, passing $\sigma = 0$ tells `cv2.GaussianBlur` to compute sigma from the kernel size.)

* **Function in OpenCV**

```python=
import numpy as np
import cv2

img = cv2.imread("resources/food.jpg")

# Add noise to the image to see how blurring works
gaussian = np.random.normal(0, 0.3, img.shape) * 255
img_noisy = img + gaussian
img_noisy = np.clip(img_noisy, 0, 255).astype(np.uint8)

# Convert the image to gray scale
img_noisy_gray = cv2.cvtColor(img_noisy, cv2.COLOR_BGR2GRAY)

'''
Blur the image
'''
# Average blur
img_average = cv2.blur(img_noisy_gray, (3, 3))
# Median blur
img_median = cv2.medianBlur(img_noisy_gray, 3)
# Gaussian blur (sigma = 0: computed from the kernel size)
img_gaussian = cv2.GaussianBlur(img_noisy_gray, (3, 3), 0)

# Stack all images (stackImages is a user-defined function that will be introduced later)
result = stackImages(0.5, [[img, img_noisy, img_noisy_gray],
                           [img_average, img_median, img_gaussian]])

cv2.imshow("result", result)
cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.waitKey(1)
```

![](https://i.imgur.com/gcQ7eFx.jpg)

The top-right image is the noisy grayscale image that the filters are applied to. The images in the second row are the **average filter**, **median filter** and **Gaussian filter** results, from left to right.

## Edge detection

In image processing, we often need to find the contours in an image to tell shapes apart. To find the contours, we have to detect the edges first. We're going to introduce two ways to do edge detection: **Canny** and **Laplacian**. Before explaining what edge detection is, let's look at the result first.

![](https://i.imgur.com/ALCG7In.jpg)

From the image above, we can see that **edges** are where ==two colors differ==, that is, where the pixel values change sharply.

* **Canny**
  The Canny edge detector is based on the **first-order derivative** of the image.
  We won't go into the math behind it, but here is **Sofiane Sahir**'s [article](https://towardsdatascience.com/canny-edge-detection-step-by-step-in-python-computer-vision-b49c3a2d8123) on how to implement the Canny detector. I highly recommend reading it.
* **Laplacian**
  The Laplacian edge detector is based on the **second-order derivative** of the image. Here is **Alan Saberi**'s [video](https://www.youtube.com/watch?v=1b3Sr2MGLFg) about the LoG (Laplacian of Gaussian) filter.
* **Function in OpenCV**
  Documentation for [Canny](https://docs.opencv.org/2.4/modules/imgproc/doc/feature_detection.html?highlight=canny#canny) and [Laplacian](https://docs.opencv.org/2.4/modules/imgproc/doc/filtering.html?highlight=laplacian#laplacian).

```python=
import cv2
import numpy as np

img = cv2.imread("resources/car.jpg")

# Change the image to gray scale
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Blur the image to reduce noise
img_blur = cv2.GaussianBlur(img_gray, (3, 3), 0)

'''
# Edge detection
'''
# Canny detector (lower and upper hysteresis thresholds)
img_canny = cv2.Canny(img_blur, 100, 100)

# Laplacian detector (applied to img_gray, because the edges from img_blur are less apparent)
'''
Using cv2.CV_16S as the output depth to prevent overflow,
then converting back to uint8 for display
'''
img_log = cv2.Laplacian(img_gray, cv2.CV_16S)
img_log = cv2.convertScaleAbs(img_log)

# Stack the images (stackImages is a user-defined function that will be introduced later)
result = stackImages(0.5, [img, img_canny, img_log])

cv2.imshow("Result", result)
cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.waitKey(1)
```

![](https://i.imgur.com/7SwY0jt.jpg)

The images above are the **original**, **Canny** and **Laplacian** results, from left to right.

## Summary

In today's article, we learned about convolution, the **HSV** format, and the importance of both **blurring and edge detection**. In the next tutorial, we're going to see how to crop an image and change its perspective to get a different result.

## Reference

1. [LEARN OPENCV in 3 HOURS with Python (2020)](https://youtu.be/WQeoO7MI0Bs)
2. [The HSV Color Model in Graphic Design](https://www.lifewire.com/what-is-hsv-in-design-1078068)
3. [OpenCV documentation of Color Conversion](https://docs.opencv.org/3.4/de/d25/imgproc_color_conversions.html)
4. [Concept of Blurring](https://www.tutorialspoint.com/dip/concept_of_blurring.htm)
5. [How to add gaussian noise in an image in Python using PyMorph](https://stackoverflow.com/questions/43699326/how-to-add-gaussian-noise-in-an-image-in-python-using-pymorph)
6. [Canny Edge Detection and LoG difference](https://stackoverflow.com/questions/13429134/canny-edge-detection-and-log-difference)