# CV 1
RGB is additive.
CMY is subtractive.
Both have 3 channels; you convert between them by subtracting from the maximum (they are complements).
HSV is a cylindrical representation.
RGB is like a cube, seen from the white perspective.
CMYK adds a 4th (black) channel, seen from the black perspective.
If there is no CMYK ink you get white.
---
H is hue (the color code).
S is saturation (color intensity).
V is value (brightness).
---
We need different image domains (color spaces) for compression; the same image in different formats has a different size.
---
Set up the environment.
First, from RGB, separate every channel.
Then we change it into the CMYK domain (see the sketch at the end of this block).
Imagine an image.
An image is the combination of reflectance * illumination.
Oriented gradients.
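A rough sketch (my own, not from the lecture) of separating the RGB channels and converting to the CMYK domain with NumPy; OpenCV has no built-in CMYK conversion, so this uses the standard complement formula, and `rgb_to_cmyk` is just a name I picked.

```python
import numpy as np

def rgb_to_cmyk(rgb):
    """Convert an RGB image (uint8, HxWx3) to CMYK floats in [0, 1]."""
    rgb = rgb.astype(np.float64) / 255.0
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]   # separate every channel
    k = 1.0 - rgb.max(axis=-1)                        # black = 1 - brightest channel
    denom = np.maximum(1.0 - k, 1e-8)                 # avoid divide-by-zero on pure black
    c = (1.0 - r - k) / denom
    m = (1.0 - g - k) / denom
    y = (1.0 - b - k) / denom
    return np.stack([c, m, y, k], axis=-1)

img = np.random.randint(0, 256, (4, 4, 3), dtype=np.uint8)  # toy RGB image
print(rgb_to_cmyk(img).shape)  # (4, 4, 4): one CMYK value per pixel
```

Note that a pure white pixel maps to all-zero CMYK, matching "no CMYK ink gives white".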
---
# CV 2
R = min Σ_i |I1(i) - I2(i)|  (sum of absolute errors)
This is the L1 norm.
It is good when the information is ~70% good and ~30% clutter (robust to outliers).
min sqrt( Σ_i (I1(i) - I2(i))² )
This is the L2 norm.
It is good when there is a bit of noise.
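A minimal NumPy sketch (mine, assuming two grayscale arrays of equal size) of the two error sums above:

```python
import numpy as np

def l1_distance(i1, i2):
    """Sum of absolute differences (more robust to clutter/outliers)."""
    return np.abs(i1.astype(np.float64) - i2.astype(np.float64)).sum()

def l2_distance(i1, i2):
    """Root of the sum of squared differences (good for small noise)."""
    return np.sqrt(((i1.astype(np.float64) - i2.astype(np.float64)) ** 2).sum())

a = np.random.randint(0, 256, (32, 32), dtype=np.uint8)
b = np.random.randint(0, 256, (32, 32), dtype=np.uint8)
print(l1_distance(a, b), l2_distance(a, b))
```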
---
The maximum difference between neighbouring pixels is used to find edges.
An averaging template reduces file size but causes a blurry image.
b[i, j] = mean(a[i:i+7, j:j+7])
Deblur / sharpen:
(1 / len(A)) · Σ_a (A_a · I_a)   (a weighted average over the window A)
The main thing is designing the A matrix.
The matrix which helps us fetch stuff from an image is called a filter or kernel. This operation is convolution.
There are unique results for each filter.
If we subtract the blurred image from the original image, we get the details (sketch below).
Convolution in cv.
Identity kernel does nothing
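A short sketch (mine, assuming OpenCV and a grayscale float image) of the 7x7 averaging template and of getting details back by subtracting the blurred image:

```python
import numpy as np
import cv2

img = np.random.randint(0, 256, (64, 64)).astype(np.float32)  # stand-in grayscale image

blur = cv2.blur(img, (7, 7))                 # mean over a 7x7 window -> blurry image
details = img - blur                         # original - blurred = details
sharpened = np.clip(img + details, 0, 255)   # adding the details back sharpens
```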
---
# CV 3
Sobel kernels help compress the information in an image (down to its edges). We have to do this because even a simple image can have millions of pixels.
Min size, max info
---
The objective function is the main function we optimize to implement something.
Pixel subtraction had issues.
L1 distance is:
|image 1 - image 2|, taken pixel-wise, then add everything up.
The distance metric falls apart with little variations in the image.
Two quite different images can end up with the same distance.
It is not really solid.
We minimize the distance.
~15-17% accurate.
---
We humans look at variance and change.
Covariance is the measure of the relationship between two random variables and to what extent they change together.
Here we maximize this covariance.
(Formula in the image in the gallery.)
Even this is not so good.
~23% accuracy. Could be better.
This is good for changes like pixels shifted a bit to one side. It can also help with color intensities, and with little changes in the image, like boxes drawn at random points.
It would fall apart for rotated objects or zoomed-in images.
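A sketch (mine) of the covariance/correlation idea with NumPy: the Pearson correlation between two images, flattened, which is the same r_xy formula written out later in these notes.

```python
import numpy as np

def correlation(x, y):
    """Pearson correlation between two images (flattened)."""
    x = x.astype(np.float64).ravel()
    y = y.astype(np.float64).ravel()
    xc, yc = x - x.mean(), y - y.mean()
    return (xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum())

a = np.random.randint(0, 256, (32, 32), dtype=np.uint8)
print(correlation(a, a))          # 1.0: identical images
print(correlation(a, 255 - a))    # -1.0: inverted image
```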
---
CIFAR-10: 10 classes, 60k images.
---
# CV 4
Pixel-wise operations: similarity index via correlation, and distance-based (Manhattan and Euclidean).
Edge detection through simple derivative-based methods.
And image color space transformation.
---
Feature-wise operations:
We feed them to some form of ML.
An advanced version of the distance formulas.
Features that have been degraded, we have to extract those too. Basic issues relate to shadows.
We will also extract features from occluded images.
How to achieve clarity in a picture: when we remove a shadow from an image, we cannot just work with the image as raw pixels.
We take images to the image histogram.
---
**Feature extraction**
- histogram: the most basic feature extractor. We check the count of 0s and 1s. We can even enhance the degraded form and choose which features to keep, like corners. To get corners we can use edge detection. We reduce the image from 500x1 to 2x1.
For grayscale images the levels are 0-255, which would give a 256x1 histogram, but we can get rid of redundant detail by counting each range of 10 levels as a single entity. That reduces it to 26x1. The width of each group is called the bin size.
A histogram mainly tells us about brightness.
For RGB:
26x3 - a different histogram for each channel.
For now we have only extracted features, but how will we improve on shadows?
We enhance the brightness of a specific region or chunk of the picture.
When we increase brightness in an image, we use this.
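A sketch (mine, using NumPy) of the 26-bin histogram idea for grayscale and per-channel RGB; the bin count 26 comes from grouping roughly 10 gray levels per bin.

```python
import numpy as np

gray = np.random.randint(0, 256, (100, 100), dtype=np.uint8)
hist26, _ = np.histogram(gray, bins=26, range=(0, 256))      # 26x1 binned histogram
print(hist26.shape)                                          # (26,)

rgb = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
hist_rgb = np.stack([np.histogram(rgb[..., c], bins=26, range=(0, 256))[0]
                     for c in range(3)])                     # one histogram per channel
print(hist_rgb.shape)                                        # (3, 26)
```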
---
# CV 5
**Histogram equalization**
A histogram has two main things:
brightness and contrast.
Add for brightness and subtract for contrast.
Incrementally add up all the PMF frequencies to get the CDF values.
**Histogram features**
We'll use these features to check what we should feed our network.
Positive examples are the ones that are good for training our network.
Adversarial examples are the ones that are kind of bad. Vision has more adversarial examples.
We have to filter and clean our images first.
The mean alone is not a good approach.
We must extract features.
---
The histogram features that we will consider are statistics-based. We can extract statistical features:
N(g) - number of pixels at gray level g
M - total number of pixels (image size)
P(g) - probability of gray level g. This is called the PMF.
P(g) = N(g) / M
Its value will never exceed 1, because it is a probability.
E.g. in a 3x3 image where 255 occurs 3 times, P(255) = 3/9 = 1/3.
The sum of P(g) is always going to be 1; its running (cumulative) sum gives the CDF.
The features based on the first-order histogram are:
Mean - the average intensity of the image
Standard deviation
Skew
Energy
And entropy
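A sketch (my own; the formulas beyond the mean are the standard first-order-histogram definitions, not copied from the lecture, and `histogram_features` is just my name) of computing these features from the PMF:

```python
import numpy as np

def histogram_features(gray, levels=256):
    """First-order histogram features: mean, std, skew, energy, entropy."""
    n_g, _ = np.histogram(gray, bins=levels, range=(0, levels))
    p = n_g / gray.size                      # P(g) = N(g) / M, the PMF
    g = np.arange(levels)
    mean = (g * p).sum()                     # average intensity
    std = np.sqrt((((g - mean) ** 2) * p).sum())
    skew = (((g - mean) ** 3) * p).sum() / (std ** 3 + 1e-12)
    energy = (p ** 2).sum()
    entropy = -(p[p > 0] * np.log2(p[p > 0])).sum()
    return np.array([mean, std, skew, energy, entropy])

img = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
print(histogram_features(img))
```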
----
We will extract and concatenate these features into the CSV file.
The mean by itself is useless; we must combine it with other features to use it.
---
**Kernels**
To reduce image noise, we can take the average of the image.
Convolution is denoted *.
Applying a filter/kernel:
O(1,1) = I(1,1)·F(1,1) + I(1,2)·F(1,2) + ... + I(3,3)·F(3,3)
where I is the input and F is the filter.
Input * Filter = Output
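A tiny sketch (mine) of applying a 3x3 filter exactly as the O(1,1) expansion above describes (cross-correlation, no padding):

```python
import numpy as np

def apply_filter(I, F):
    """Slide a 3x3 filter F over image I; each output is the window-filter dot product."""
    h, w = I.shape
    out = np.zeros((h - 2, w - 2))
    for r in range(h - 2):
        for c in range(w - 2):
            out[r, c] = (I[r:r + 3, c:c + 3] * F).sum()
    return out

I = np.random.rand(8, 8)
F = np.full((3, 3), 1 / 9.0)       # averaging filter
print(apply_filter(I, F).shape)    # (6, 6): without padding the output shrinks
```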
---
Zero padding is one way of dealing with edges.
To control the size of the output we need to use padding.
Wrap around: the image is treated as if it tiles, so the border wraps around to the opposite side.
Copy edge: replicate the edge pixels outward to approximate the border.
(Built-in options in OpenCV.)
Reflect across edge: we mirror the image backwards at the border. Image stats stay the same.
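The corresponding OpenCV border options (a minimal sketch; the toy image is mine):

```python
import numpy as np
import cv2

img = np.random.randint(0, 256, (5, 5), dtype=np.uint8)

zero_pad  = cv2.copyMakeBorder(img, 1, 1, 1, 1, cv2.BORDER_CONSTANT, value=0)  # 0 padding
wrap      = cv2.copyMakeBorder(img, 1, 1, 1, 1, cv2.BORDER_WRAP)       # wrap around
copy_edge = cv2.copyMakeBorder(img, 1, 1, 1, 1, cv2.BORDER_REPLICATE)  # copy edge
reflect   = cv2.copyMakeBorder(img, 1, 1, 1, 1, cv2.BORDER_REFLECT)    # reflect across edge
```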
---
Properties: **linearity**
filter(I, f1 + f2) = filter(I, f1) + filter(I, f2)
We can just add filters together. One could be for smoothing, one could be for edge detection.
Convolution is good because things stack additively in it.
The same goes if we want to apply one filter to two different images.
We can also scale the image by multiplying with k before filtering or after filtering.
The filter can be scaled as well by multiplying by k.
Multiply k with the image or multiply k with the filter: the output is the same.
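A quick numeric check of these properties (mine), using scipy.signal.convolve2d as the filtering operation:

```python
import numpy as np
from scipy.signal import convolve2d

I = np.random.rand(16, 16)
f1 = np.full((3, 3), 1 / 9.0)                          # smoothing filter
f2 = np.array([[0, -1, 0], [-1, 4, -1], [0, -1, 0]])   # edge (high-pass) filter
k = 2.5

lhs = convolve2d(I, f1 + f2, mode='same')
rhs = convolve2d(I, f1, mode='same') + convolve2d(I, f2, mode='same')
print(np.allclose(lhs, rhs))   # True: filter(I, f1+f2) == filter(I, f1) + filter(I, f2)

print(np.allclose(convolve2d(k * I, f1, mode='same'),
                  convolve2d(I, k * f1, mode='same')))  # True: scale image or filter, same output
```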
---
Filtering vs Convolution
Filtering is cross correlation
We find similarity using filtering.
Similarity is a type of filter.
Cross-correlation with a flipped filter is called convolution (OpenCV will do the flipping for us).
With convolution we can do segmentation, find edges, lines, and everything.
---
A convolution mask is a matrix, usually of size 1x1, 3x3, 5x5, or 7x7 (an odd number).
Flip the image or the filter (the filter, mostly, because of its small size).
Place the mask on the image.
We just have to know what kind of mask to design.
Convolution can achieve blurring, edge detection, sharpening, noise reduction, corner finding, and boundary finding.
Everything can be done with that mask.
**The filter's sum must always be 1 (for smoothing/averaging filters).**
---
Blurring loses the sharpness and detail of the image.
(1/9) × [1 1 1; 1 1 1; 1 1 1] = blurry image
Real image - blurry image = sharpness (the details).
Magnify (amplify) the image by applying a filter of
[0 0 0; 0 2 0; 0 0 0]
---
Original - smoothed = details
Original + detail = sharpened
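A sketch (mine, assuming OpenCV's filter2D) showing that "original + detail" can be folded into a single kernel: twice the identity minus the box filter.

```python
import numpy as np
import cv2

img = np.random.randint(0, 256, (64, 64)).astype(np.float32)  # stand-in image

identity = np.zeros((3, 3), dtype=np.float32)
identity[1, 1] = 1.0
box = np.full((3, 3), 1 / 9.0, dtype=np.float32)

sharpen_kernel = 2 * identity - box        # original + (original - smoothed), in one kernel
sharpened = cv2.filter2D(img, -1, sharpen_kernel)
```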
---
A box filter really messes with the minority content in an image, which is usually the edges.
We have to blur the image while retaining edge information.
Use a Gaussian filter to eliminate edge effects and achieve Gaussian blur.
In this the filter values spread out smoothly from the center.
A box filter is entirely uniform (white), while a Gaussian is like a spot whose values fade towards the corners.
In a 50x50 filter, sigma should cover roughly 60-70% of the area.
Sigma ≈ filter width / 6 (the filter spans about ±3 sigma).
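A sketch (mine) with OpenCV, picking sigma from the filter width with the width/6 rule above:

```python
import numpy as np
import cv2

img = np.random.randint(0, 256, (128, 128), dtype=np.uint8)

ksize = 25                      # filter width (must be odd)
sigma = ksize / 6.0             # rule of thumb: the kernel spans about +/- 3 sigma
blurred = cv2.GaussianBlur(img, (ksize, ksize), sigma)
```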
---
Optical zoom vs. digital zoom: digital zoom introduces artifacts in the image.
---
Gaussian is a low pass filter (removes high frequencies)
Gaussian filtering is appropriate for additive zero mean noise
---
**Non-linear filtering**
The linearity properties no longer hold; it is computationally expensive but has its own benefits.
A median filter sorts the values under the window and then takes the middle one.
We don't use averaging because it can give a value that may not exist in the image.
A median filter will clean the image of artifacts.
But first you have to sort, then search, then replace.
The range the median replaces over (the window size) is decided by us.
It fills the image using values from the same image.
Median filters cannot be stacked together; we have to compute the median twice if we want to apply it twice.
Last time we could just add two filters to apply two filters at once.
No need to handle edges here; it isn't applied on the outer boundary of the image.
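A sketch (mine) of median filtering with OpenCV on an image with salt-and-pepper artifacts:

```python
import numpy as np
import cv2

img = np.random.randint(0, 256, (64, 64), dtype=np.uint8)

noisy = img.copy()                      # add salt-and-pepper artifacts
mask = np.random.rand(*img.shape)
noisy[mask < 0.05] = 0
noisy[mask > 0.95] = 255

cleaned = cv2.medianBlur(noisy, 5)      # 5x5 window: sort, take the middle value, replace
```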
---
A high-pass kernel finds edges.
Sobel kernels: edges and so on.
---
Box filters & Gaussian & median.
The median filter takes an argument (the window size).
The median filter has its own various branches:
Mode filter
Minimum filter - segmentation
Max filter
Correlation
Convolution
Filtering
Resizing
Histograms
RGB, HSV, CMYK
---
# CV MID
What info can we get from this image?
Geometric, semantic, vision for action
---
Resizing, exposure, segmentation, edges, oriented gradients, segments.
---
Objects under different light have different colors.
We perceive color as:
illumination * reflectance
Know how to multiply graphs
---
Case study: what does our skin look like under different colors of light?
---
Image interpolation: upsizing and downsizing.
We can do it through triangles (linear interpolation),
squares,
bilinear (which is pretty good for grids),
finding rectangle areas (area-based),
and cubic interpolation.
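A sketch (mine) of the OpenCV interpolation options for upsizing and downsizing:

```python
import numpy as np
import cv2

img = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)

up_linear = cv2.resize(img, None, fx=2, fy=2, interpolation=cv2.INTER_LINEAR)    # bilinear
up_cubic  = cv2.resize(img, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)     # cubic
down_area = cv2.resize(img, None, fx=0.5, fy=0.5, interpolation=cv2.INTER_AREA)  # area-based, good for shrinking
```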
---
Image shrinking:
Pixels blow up (alias) when shrinking an image naively.
First smooth the image, then scale down; then it won't be ruined.
---
Smooth the image through box filters and Gaussian filters.
A box filter has ones in it.
A box filter eats up and ruins the edges.
A Gaussian filter is like a spot with 1s fading near the edges.
2D Gaussian formula on the cheat sheet.
Two operators
Convolution is filtering. It has a kernel or filter; we give it an image and the filter is applied to it. To apply convolution, we flip the filter vertically and horizontally. Then we do point-wise multiplication (and sum).
A convolutional neural network can do the flipping stuff on its own.
A high-pass kernel also finds edges:
 0  -1   0
-1   4  -1
 0  -1   0
Ultimately, edges, similarities, and correlation are all features.
To find similarities we can use the distance formula to find the distance. This is deterministic correlation.
The other is stochastic correlation.
How to apply different matrices to images.
The objective function this kernel should be minimizing:
r_xy = Σ(xi - x̄)(yi - ȳ) / sqrt( Σ(xi - x̄)² · Σ(yi - ȳ)² )
---
Different histograms
To preserve the most information, we
map (threshold) the histogram at the midpoint.
Gray to binary conversion
To enhance brightness, shift the histogram left or right.
Histogram equalization.
**Deterministic formula to increase contrast**
Contrast stretching
It won't scale if a pixel is already at 255; its contrast won't be stretched further. That's why we moved to the probabilistic approach.
f(x,y) is the pixel; f_min is the min intensity and f_max is the max intensity.
g(x,y) = (f(x,y) - f_min) / (f_max - f_min) × (2^bpp - 1)
For 8-bit encoding, 2^bpp - 1 = 255; 10-bit gives 1024 levels.
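A sketch (mine) of that deterministic contrast-stretching formula with NumPy; `contrast_stretch` is just my name for it.

```python
import numpy as np

def contrast_stretch(f, bpp=8):
    """g(x,y) = (f - f_min) / (f_max - f_min) * (2**bpp - 1)."""
    f = f.astype(np.float64)
    f_min, f_max = f.min(), f.max()
    scale = (2 ** bpp - 1) / max(f_max - f_min, 1e-8)   # guard against a flat image
    return ((f - f_min) * scale).astype(np.uint8)

low_contrast = np.random.randint(100, 150, (64, 64), dtype=np.uint8)
stretched = contrast_stretch(low_contrast)
print(stretched.min(), stretched.max())   # 0 255
```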
---
Histogram equalization:
h(v) = round( (cdf(v) - cdf_min) / (M - cdf_min) × (L - 1) )
v is the pixel value we're trying to transform.
CDF is the cumulative distribution function.
It's basically counting how many times each pixel value occurred (the frequency of pixel occurrence), accumulated.
Then apply the formula.
We look at the pixel we're trying to transform.
Transforming pixel values
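A sketch (mine) of the h(v) mapping via the CDF, next to OpenCV's built-in equalizeHist for comparison:

```python
import numpy as np
import cv2

img = np.random.randint(50, 180, (64, 64), dtype=np.uint8)    # low-contrast stand-in image

hist, _ = np.histogram(img, bins=256, range=(0, 256))
cdf = hist.cumsum()                                            # cumulative frequency of pixel values
cdf_min = cdf[cdf > 0].min()
lut = np.clip(np.round((cdf - cdf_min) / (img.size - cdf_min) * 255), 0, 255).astype(np.uint8)
equalized = lut[img]                                           # h(v) applied to every pixel

equalized_cv = cv2.equalizeHist(img)                           # OpenCV's built-in version
```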
---
Gray level transformation
**Contrast managing methods**
Deterministic and stochastic
s = T(r)
They are predefined functions that our phones have
Inverting gives the negative image.
Log transformation.
To make a negative of an image, convert black to white and vice versa.
r represents the current value.
s is the modified value.
L = number of gray levels (2^bits per pixel).
s = L - 1 - r
Log transformation: if one object is very bright in an image, then passing it through the log transformation highlights all the features.
s = c log (1+r)
We take the original image pixel, add 1, and then take the log of it. c is a constant which can vary; we can tune it to test various outputs.
For instance, it will suppress huge lights like the sun or car headlights.
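A sketch (mine) of the negative and log transformations; the scaling constant c here is chosen so the output stays in [0, 255].

```python
import numpy as np

img = np.random.randint(0, 256, (64, 64), dtype=np.uint8).astype(np.float64)
L = 256                                     # number of gray levels

negative = (L - 1) - img                    # s = L - 1 - r

c = (L - 1) / np.log(L)                     # scale so the output stays in [0, 255]
log_transformed = c * np.log(1 + img)       # s = c * log(1 + r): compresses bright values
```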
---
Power-law transformation:
Gamma correction.
s = c·r^gamma
Different gammas produce different outputs.
If gamma is low (< 1), it highlights the dark parts.
If gamma is high (> 1), it highlights the light features.
We can select which features to highlight with this, e.g. depending on whether the image was taken at night or during the day.
Gamma-corrected images of an aerial area.
Gamma can go up to infinity.
It's used heavily in biomedical images, and in general preprocessing is required.
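A sketch (mine) of gamma correction, normalizing r to [0, 1] and using c = 255; the two gamma values are arbitrary examples.

```python
import numpy as np

img = np.random.randint(0, 256, (64, 64), dtype=np.uint8) / 255.0   # r normalized to [0, 1]

dark_boost  = (255 * img ** 0.4).astype(np.uint8)   # gamma < 1: highlights dark parts
bright_emph = (255 * img ** 2.5).astype(np.uint8)   # gamma > 1: highlights light features
```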
---
The output image is always smaller, because the kernel cannot fully overlap the pixels at the border.
To control the size of the output image we use padding.
---
Convolution is flipping the filter or kernel, then point-wise multiplication. Simple.
---
Filter: subtract the mean and divide by the standard deviation (normalization).
Then do point-wise multiplication with it.
x + (x - f(x))
= 2x - f(x)
x is the original image.
2 is a constant.
f(x) is the box- or Gaussian-filtered image.
---
Original - smoothed = details
Original + details = sharpened
We can also do this in one step (with a single kernel).
---
Applying the mean messes up all the pixels.
The median does not.
That is the very reason the median filter output looks so good.
Median filtering is non-linear.
---
Convolution, correlation.
How do filters cascade?
---