# CV 1
RGB is additive.
CMY is subtractive.
Both have 3 channels; you convert between them by subtracting from the maximum (they are complements).
HSV is a cylindrical representation.
RGB is like a cube, seen from the white perspective.
CMYK adds a 4th (black) channel, seen from the black perspective.
If there is no CMYK ink you get white.
---
H is hue (the color code).
S is saturation (color intensity).
V is value (brightness).
---
We need different image domains (color spaces) for compression; the same image in different formats has a different size.
---
Set up the environment.
First, from RGB, separate every channel.
Then we change it into the CMYK domain (see the sketch at the end of this block).
Imagine an image.
An image is the combination of reflectance * illumination.
Oriented gradients.
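A rough sketch (my own, not from the lecture) of separating the RGB channels and converting to the CMYK domain with NumPy; OpenCV has no built-in CMYK conversion, so this uses the standard complement formula, and `rgb_to_cmyk` is just a name I picked.

```python
import numpy as np

def rgb_to_cmyk(rgb):
    """Convert an RGB image (uint8, HxWx3) to CMYK floats in [0, 1]."""
    rgb = rgb.astype(np.float64) / 255.0
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]   # separate every channel
    k = 1.0 - rgb.max(axis=-1)                        # black = 1 - brightest channel
    denom = np.maximum(1.0 - k, 1e-8)                 # avoid divide-by-zero on pure black
    c = (1.0 - r - k) / denom
    m = (1.0 - g - k) / denom
    y = (1.0 - b - k) / denom
    return np.stack([c, m, y, k], axis=-1)

img = np.random.randint(0, 256, (4, 4, 3), dtype=np.uint8)  # toy RGB image
print(rgb_to_cmyk(img).shape)  # (4, 4, 4): one CMYK value per pixel
```

Note that a pure white pixel maps to all-zero CMYK, matching "no CMYK ink gives white".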
---
# CV 2
R = min Σ_i |I1(i) - I2(i)|  (sum of absolute errors)
This is the L1 norm.
It is good when the information is ~70% good and ~30% clutter (robust to outliers).
min sqrt( Σ_i (I1(i) - I2(i))² )
This is the L2 norm.
It is good when there is a bit of noise.
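A minimal NumPy sketch (mine, assuming two grayscale arrays of equal size) of the two error sums above:

```python
import numpy as np

def l1_distance(i1, i2):
    """Sum of absolute differences (more robust to clutter/outliers)."""
    return np.abs(i1.astype(np.float64) - i2.astype(np.float64)).sum()

def l2_distance(i1, i2):
    """Root of the sum of squared differences (good for small noise)."""
    return np.sqrt(((i1.astype(np.float64) - i2.astype(np.float64)) ** 2).sum())

a = np.random.randint(0, 256, (32, 32), dtype=np.uint8)
b = np.random.randint(0, 256, (32, 32), dtype=np.uint8)
print(l1_distance(a, b), l2_distance(a, b))
```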
---
The maximum difference between neighbouring pixels is used to find edges.
An averaging template reduces file size but causes a blurry image.
b[i, j] = mean(a[i:i+7, j:j+7])
Deblur / sharpen:
(1 / len(A)) · Σ_a (A_a · I_a)   (a weighted average over the window A)
The main thing is designing the A matrix.
The matrix which helps us fetch stuff from an image is called a filter or kernel. This operation is convolution.
There are unique results for each filter.
If we subtract the blurred image from the original image, we get the details (sketch below).
Convolution in cv.
Identity kernel does nothing
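A short sketch (mine, assuming OpenCV and a grayscale float image) of the 7x7 averaging template and of getting details back by subtracting the blurred image:

```python
import numpy as np
import cv2

img = np.random.randint(0, 256, (64, 64)).astype(np.float32)  # stand-in grayscale image

blur = cv2.blur(img, (7, 7))                 # mean over a 7x7 window -> blurry image
details = img - blur                         # original - blurred = details
sharpened = np.clip(img + details, 0, 255)   # adding the details back sharpens
```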
---
# CV 3
Sobel kernels help compress the information in an image (down to its edges). We have to do this because even a simple image can have millions of pixels.
Min size, max info
---
The objective function is the main function we optimize to implement something.
Pixel subtraction had issues.
L1 distance is:
|image 1 - image 2|, taken pixel-wise, then add everything up.
The distance metric falls apart with little variations in the image.
Two quite different images can end up with the same distance.
It is not really solid.
We minimize the distance.
~15-17% accurate.
---
We humans look at variance and change.
Covariance is the measure of the relationship between two random variables and to what extent they change together.
Here we maximize this covariance.
(Formula in the image in the gallery.)
Even this is not so good.
~23% accuracy. Could be better.
This is good for changes like pixels shifted a bit to one side. It can also help with color intensities, and with little changes in the image, like boxes drawn at random points.
It would fall apart for rotated objects or zoomed-in images.
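A sketch (mine) of the covariance/correlation idea with NumPy: the Pearson correlation between two images, flattened, which is the same r_xy formula written out later in these notes.

```python
import numpy as np

def correlation(x, y):
    """Pearson correlation between two images (flattened)."""
    x = x.astype(np.float64).ravel()
    y = y.astype(np.float64).ravel()
    xc, yc = x - x.mean(), y - y.mean()
    return (xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum())

a = np.random.randint(0, 256, (32, 32), dtype=np.uint8)
print(correlation(a, a))          # 1.0: identical images
print(correlation(a, 255 - a))    # -1.0: inverted image
```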
---
CIFAR-10: 10 classes, 60k images.
---
# CV 4
Pixel-wise operations: similarity index via correlation, and distance-based (Manhattan and Euclidean).
Edge detection through simple derivative-based methods.
And image color space transformation.
---
Feature-wise operations:
We feed them to some form of ML.
An advanced version of the distance formulas.
Features that have been degraded, we have to extract those too. Basic issues relate to shadows.
We will also extract features from occluded images.
How to achieve clarity in a picture: when we remove a shadow from an image, we cannot just work with the image as raw pixels.
We take images to the image histogram.
---
**Feature extraction**
- histogram: the most basic feature extractor. We check the count of 0s and 1s. We can even enhance the degraded form and choose which features to keep, like corners. To get corners we can use edge detection. We reduce the image from 500x1 to 2x1.
For grayscale images the levels are 0-255, which would give a 256x1 histogram, but we can get rid of redundant detail by counting each range of 10 levels as a single entity. That reduces it to 26x1. The width of each group is called the bin size.
A histogram mainly tells us about brightness.
For RGB:
26x3 - a different histogram for each channel.
For now we have only extracted features, but how will we improve on shadows?
We enhance the brightness of a specific region or chunk of the picture.
When we increase brightness in an image, we use this.
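A sketch (mine, using NumPy) of the 26-bin histogram idea for grayscale and per-channel RGB; the bin count 26 comes from grouping roughly 10 gray levels per bin.

```python
import numpy as np

gray = np.random.randint(0, 256, (100, 100), dtype=np.uint8)
hist26, _ = np.histogram(gray, bins=26, range=(0, 256))      # 26x1 binned histogram
print(hist26.shape)                                          # (26,)

rgb = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
hist_rgb = np.stack([np.histogram(rgb[..., c], bins=26, range=(0, 256))[0]
                     for c in range(3)])                     # one histogram per channel
print(hist_rgb.shape)                                        # (3, 26)
```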
---
# CV 5
**Histogram equalization**
A histogram has two main things:
brightness and contrast.
Add for brightness and subtract for contrast.
Incrementally add up all the PMF frequencies to get the CDF values.
**Histogram features**
We'll use these features to check what we should feed our network.
Positive examples are the ones that are good for training our network.
Adversarial examples are the ones that are kind of bad. Vision has more adversarial examples.
We have to filter and clean our images first.
The mean alone is not a good approach.
We must extract features.
---
The histogram features that we will consider are statistics-based. We can extract statistical features:
N(g) - number of pixels at gray level g
M - total number of pixels (image size)
P(g) - probability of gray level g. This is called the PMF.
P(g) = N(g) / M
Its value will never exceed 1, because it is a probability.
E.g. in a 3x3 image where 255 occurs 3 times, P(255) = 3/9 = 1/3.
The sum of P(g) is always going to be 1; its running (cumulative) sum gives the CDF.
The features based on the first-order histogram are:
Mean - the average intensity of the image
Standard deviation
Skew
Energy
And entropy
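A sketch (my own; the formulas beyond the mean are the standard first-order-histogram definitions, not copied from the lecture, and `histogram_features` is just my name) of computing these features from the PMF:

```python
import numpy as np

def histogram_features(gray, levels=256):
    """First-order histogram features: mean, std, skew, energy, entropy."""
    n_g, _ = np.histogram(gray, bins=levels, range=(0, levels))
    p = n_g / gray.size                      # P(g) = N(g) / M, the PMF
    g = np.arange(levels)
    mean = (g * p).sum()                     # average intensity
    std = np.sqrt((((g - mean) ** 2) * p).sum())
    skew = (((g - mean) ** 3) * p).sum() / (std ** 3 + 1e-12)
    energy = (p ** 2).sum()
    entropy = -(p[p > 0] * np.log2(p[p > 0])).sum()
    return np.array([mean, std, skew, energy, entropy])

img = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
print(histogram_features(img))
```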
----
We will extract and concatenate these features into the CSV file.
The mean by itself is useless; we must combine it with other features to use it.
---
**Kernels**
To reduce image noise, we can take the average of the image.
Convolution is denoted *.
Applying a filter/kernel:
O(1,1) = I(1,1)·F(1,1) + I(1,2)·F(1,2) + ... + I(3,3)·F(3,3)
where I is the input and F is the filter.
Input * Filter = Output
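A tiny sketch (mine) of applying a 3x3 filter exactly as the O(1,1) expansion above describes (cross-correlation, no padding):

```python
import numpy as np

def apply_filter(I, F):
    """Slide a 3x3 filter F over image I; each output is the window-filter dot product."""
    h, w = I.shape
    out = np.zeros((h - 2, w - 2))
    for r in range(h - 2):
        for c in range(w - 2):
            out[r, c] = (I[r:r + 3, c:c + 3] * F).sum()
    return out

I = np.random.rand(8, 8)
F = np.full((3, 3), 1 / 9.0)       # averaging filter
print(apply_filter(I, F).shape)    # (6, 6): without padding the output shrinks
```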
---
Zero padding is one way of dealing with edges.
To control the size of the output we need to use padding.
Wrap around: the image is treated as if it tiles, so the border wraps around to the opposite side.
Copy edge: replicate the edge pixels outward to approximate the border.
(Built-in options in OpenCV.)
Reflect across edge: we mirror the image backwards at the border. Image stats stay the same.
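The corresponding OpenCV border options (a minimal sketch; the toy image is mine):

```python
import numpy as np
import cv2

img = np.random.randint(0, 256, (5, 5), dtype=np.uint8)

zero_pad  = cv2.copyMakeBorder(img, 1, 1, 1, 1, cv2.BORDER_CONSTANT, value=0)  # 0 padding
wrap      = cv2.copyMakeBorder(img, 1, 1, 1, 1, cv2.BORDER_WRAP)       # wrap around
copy_edge = cv2.copyMakeBorder(img, 1, 1, 1, 1, cv2.BORDER_REPLICATE)  # copy edge
reflect   = cv2.copyMakeBorder(img, 1, 1, 1, 1, cv2.BORDER_REFLECT)    # reflect across edge
```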
---
Properties: **linearity**
filter(I, f1 + f2) = filter(I, f1) + filter(I, f2)
We can just add filters together. One could be for smoothing, one could be for edge detection.
Convolution is good because things stack additively in it.
The same goes if we want to apply one filter to two different images.
We can also scale the image by multiplying with k before filtering or after filtering.
The filter can be scaled as well by multiplying by k.
Multiply k with the image or multiply k with the filter: the output is the same.
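A quick numeric check of these properties (mine), using scipy.signal.convolve2d as the filtering operation:

```python
import numpy as np
from scipy.signal import convolve2d

I = np.random.rand(16, 16)
f1 = np.full((3, 3), 1 / 9.0)                          # smoothing filter
f2 = np.array([[0, -1, 0], [-1, 4, -1], [0, -1, 0]])   # edge (high-pass) filter
k = 2.5

lhs = convolve2d(I, f1 + f2, mode='same')
rhs = convolve2d(I, f1, mode='same') + convolve2d(I, f2, mode='same')
print(np.allclose(lhs, rhs))   # True: filter(I, f1+f2) == filter(I, f1) + filter(I, f2)

print(np.allclose(convolve2d(k * I, f1, mode='same'),
                  convolve2d(I, k * f1, mode='same')))  # True: scale image or filter, same output
```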
---
Filtering vs Convolution
Filtering is cross correlation
We find similarity using filtering.
Similarity is a type of filter.
Cross-correlation with a flipped filter is called convolution (OpenCV will do the flipping for us).
With convolution we can do segmentation, find edges, lines, and everything.
---
A convolution mask is a matrix, usually of size 1x1, 3x3, 5x5, or 7x7 (an odd number).
Flip the image or the filter (the filter, mostly, because of its small size).
Place the mask on the image.
We just have to know what kind of mask to design.
Convolution can achieve blurring, edge detection, sharpening, noise reduction, corner finding, and boundary finding.
Everything can be done with that mask.
**The filter's sum must always be 1 (for smoothing/averaging filters).**
---
Blurring loses the sharpness and detail of the image.
(1/9) × [1 1 1; 1 1 1; 1 1 1] = blurry image
Real image - blurry image = sharpness (the details).
Magnify (amplify) the image by applying a filter of
[0 0 0; 0 2 0; 0 0 0]
---
Original - smoothed = details
Original + detail = sharpened
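A sketch (mine, assuming OpenCV's filter2D) showing that "original + detail" can be folded into a single kernel: twice the identity minus the box filter.

```python
import numpy as np
import cv2

img = np.random.randint(0, 256, (64, 64)).astype(np.float32)  # stand-in image

identity = np.zeros((3, 3), dtype=np.float32)
identity[1, 1] = 1.0
box = np.full((3, 3), 1 / 9.0, dtype=np.float32)

sharpen_kernel = 2 * identity - box        # original + (original - smoothed), in one kernel
sharpened = cv2.filter2D(img, -1, sharpen_kernel)
```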
---
A box filter really messes with the minority content in an image, which is usually the edges.
We have to blur the image while retaining edge information.
Use a Gaussian filter to eliminate edge effects and achieve Gaussian blur.
In this the filter values spread out smoothly from the center.
A box filter is entirely uniform (white), while a Gaussian is like a spot whose values fade towards the corners.
In a 50x50 filter, sigma should cover roughly 60-70% of the area.
Sigma ≈ filter width / 6 (the filter spans about ±3 sigma).
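A sketch (mine) with OpenCV, picking sigma from the filter width with the width/6 rule above:

```python
import numpy as np
import cv2

img = np.random.randint(0, 256, (128, 128), dtype=np.uint8)

ksize = 25                      # filter width (must be odd)
sigma = ksize / 6.0             # rule of thumb: the kernel spans about +/- 3 sigma
blurred = cv2.GaussianBlur(img, (ksize, ksize), sigma)
```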
---
Optical zoom vs. digital zoom: digital zoom introduces artifacts in the image.
---
Gaussian is a low pass filter (removes high frequencies)
Gaussian filtering is appropriate for additive zero mean noise
---
**Non-linear filtering**
The linearity properties no longer hold; it is computationally expensive but has its own benefits.
A median filter sorts the values under the window and then takes the middle one.
We don't use averaging because it can give a value that may not exist in the image.
A median filter will clean the image of artifacts.
But first you have to sort, then search, then replace.
The range the median replaces over (the window size) is decided by us.
It fills the image using values from the same image.
Median filters cannot be stacked together; we have to compute the median twice if we want to apply it twice.
Last time we could just add two filters to apply two filters at once.
No need to handle edges here; it isn't applied on the outer boundary of the image.
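A sketch (mine) of median filtering with OpenCV on an image with salt-and-pepper artifacts:

```python
import numpy as np
import cv2

img = np.random.randint(0, 256, (64, 64), dtype=np.uint8)

noisy = img.copy()                      # add salt-and-pepper artifacts
mask = np.random.rand(*img.shape)
noisy[mask < 0.05] = 0
noisy[mask > 0.95] = 255

cleaned = cv2.medianBlur(noisy, 5)      # 5x5 window: sort, take the middle value, replace
```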
---
A high-pass kernel finds edges.
Sobel kernels: edges and so on.
---
Box filters & Gaussian & median.
The median filter takes an argument (the window size).
The median filter has its own various branches:
Mode filter
Minimum filter - segmentation
Max filter
Correlation
Convolution
Filtering
Resizing
Histograms
RGB, HSV, CMYK
---
# CV MID
What info can we get from this image?
Geometric, semantic, vision for action
---
Resizing, exposure, segmentation, edges, oriented gradients, segments.
---
Objects under different light have different colors.
We perceive color as:
illumination * reflectance
Know how to multiply graphs
---
Case study: what does our skin look like under different colors of light?
---
Image interpolation: upsizing and downsizing.
We can do it through triangles (linear interpolation),
squares,
bilinear (which is pretty good for grids),
finding rectangle areas (area-based),
and cubic interpolation.
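A sketch (mine) of the OpenCV interpolation options for upsizing and downsizing:

```python
import numpy as np
import cv2

img = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)

up_linear = cv2.resize(img, None, fx=2, fy=2, interpolation=cv2.INTER_LINEAR)    # bilinear
up_cubic  = cv2.resize(img, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)     # cubic
down_area = cv2.resize(img, None, fx=0.5, fy=0.5, interpolation=cv2.INTER_AREA)  # area-based, good for shrinking
```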
---
Image shrinking:
Pixels blow up (alias) when shrinking an image naively.
First smooth the image, then scale down; then it won't be ruined.
---
Smooth the image through box filters and Gaussian filters.
A box filter has ones in it.
A box filter eats up and ruins the edges.
A Gaussian filter is like a spot with 1s fading near the edges.
2D Gaussian formula on the cheat sheet.
Two operators
Convolution is filtering. It has a kernel or filter; we give it an image and the filter is applied to it. To apply convolution, we flip the filter vertically and horizontally. Then we do point-wise multiplication (and sum).
A convolutional neural network can do the flipping stuff on its own.
A high-pass kernel also finds edges:
 0  -1   0
-1   4  -1
 0  -1   0
Ultimately, edges, similarities, and correlation are all features.
To find similarities we can use the distance formula to find the distance. This is deterministic correlation.
The other is stochastic correlation.
How to apply different matrices to images.
The objective function this kernel should be minimizing:
r_xy = Σ(xi - x̄)(yi - ȳ) / sqrt( Σ(xi - x̄)² · Σ(yi - ȳ)² )
---
Different histograms
To preserve the most information, we
map (threshold) the histogram at the midpoint.
Gray to binary conversion
To enhance brightness, shift the histogram left or right.
Histogram equalization.
**Deterministic formula to increase contrast**
Contrast stretching
It won't scale if a pixel is already at 255; its contrast won't be stretched further. That's why we moved to the probabilistic approach.
f(x,y) is the pixel; f_min is the min intensity and f_max is the max intensity.
g(x,y) = (f(x,y) - f_min) / (f_max - f_min) × (2^bpp - 1)
For 8-bit encoding, 2^bpp - 1 = 255; 10-bit gives 1024 levels.
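A sketch (mine) of that deterministic contrast-stretching formula with NumPy; `contrast_stretch` is just my name for it.

```python
import numpy as np

def contrast_stretch(f, bpp=8):
    """g(x,y) = (f - f_min) / (f_max - f_min) * (2**bpp - 1)."""
    f = f.astype(np.float64)
    f_min, f_max = f.min(), f.max()
    scale = (2 ** bpp - 1) / max(f_max - f_min, 1e-8)   # guard against a flat image
    return ((f - f_min) * scale).astype(np.uint8)

low_contrast = np.random.randint(100, 150, (64, 64), dtype=np.uint8)
stretched = contrast_stretch(low_contrast)
print(stretched.min(), stretched.max())   # 0 255
```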
---
Histogram equalization:
h(v) = round( (cdf(v) - cdf_min) / (M - cdf_min) × (L - 1) )
v is the pixel value we're trying to transform.
CDF is the cumulative distribution function.
It's basically counting how many times each pixel value occurred (the frequency of pixel occurrence), accumulated.
Then apply the formula.
We look at the pixel we're trying to transform.
Transforming pixel values
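A sketch (mine) of the h(v) mapping via the CDF, next to OpenCV's built-in equalizeHist for comparison:

```python
import numpy as np
import cv2

img = np.random.randint(50, 180, (64, 64), dtype=np.uint8)    # low-contrast stand-in image

hist, _ = np.histogram(img, bins=256, range=(0, 256))
cdf = hist.cumsum()                                            # cumulative frequency of pixel values
cdf_min = cdf[cdf > 0].min()
lut = np.clip(np.round((cdf - cdf_min) / (img.size - cdf_min) * 255), 0, 255).astype(np.uint8)
equalized = lut[img]                                           # h(v) applied to every pixel

equalized_cv = cv2.equalizeHist(img)                           # OpenCV's built-in version
```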
---
Gray level transformation
**Contrast managing methods**
Deterministic and stochastic
s = T(r)
They are predefined functions that our phones have
Inverting gives the negative image.
Log transformation.
To make a negative of an image, convert black to white and vice versa.
r represents the current value.
s is the modified value.
L = number of gray levels (2^bits per pixel).
s = L - 1 - r
Log transformation: if one object is very bright in an image, then passing it through the log transformation highlights all the features.
s = c log (1+r)
We take the original image pixel, add 1, and then take the log of it. c is a constant which can vary; we can tune it to test various outputs.
For instance, it will suppress huge lights like the sun or car headlights.
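A sketch (mine) of the negative and log transformations; the scaling constant c here is chosen so the output stays in [0, 255].

```python
import numpy as np

img = np.random.randint(0, 256, (64, 64), dtype=np.uint8).astype(np.float64)
L = 256                                     # number of gray levels

negative = (L - 1) - img                    # s = L - 1 - r

c = (L - 1) / np.log(L)                     # scale so the output stays in [0, 255]
log_transformed = c * np.log(1 + img)       # s = c * log(1 + r): compresses bright values
```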
---
Power-law transformation:
Gamma correction.
s = c·r^gamma
Different gammas produce different outputs.
If gamma is low (< 1), it highlights the dark parts.
If gamma is high (> 1), it highlights the light features.
We can select which features to highlight with this, e.g. depending on whether the image was taken at night or during the day.
Gamma-corrected images of an aerial area.
Gamma can go up to infinity.
It's used heavily in biomedical images, and in general preprocessing is required.
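A sketch (mine) of gamma correction, normalizing r to [0, 1] and using c = 255; the two gamma values are arbitrary examples.

```python
import numpy as np

img = np.random.randint(0, 256, (64, 64), dtype=np.uint8) / 255.0   # r normalized to [0, 1]

dark_boost  = (255 * img ** 0.4).astype(np.uint8)   # gamma < 1: highlights dark parts
bright_emph = (255 * img ** 2.5).astype(np.uint8)   # gamma > 1: highlights light features
```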
---
The output image is always smaller, because the kernel cannot fully overlap the pixels at the border.
To control the size of the output image we use padding.
---
Convolution is flipping the filter or kernel, then point-wise multiplication. Simple.
---
Filter: subtract the mean and divide by the standard deviation (normalization).
Then do point-wise multiplication with it.
x + (x - f(x))
= 2x - f(x)
x is the original image.
2 is a constant.
f(x) is the box- or Gaussian-filtered image.
---
Original - smoothed = details
Original + details = sharpened
We can also do this in one step (with a single kernel).
---
Applying the mean messes up all the pixels.
The median does not.
That is the very reason the median filter output looks so good.
Median filtering is non-linear.
---
Convolution, correlation.
How do filters cascade?
---