MLRF: Lecture 03
Agenda for lecture 3
- Introduction
- Finish lecture about local feature detectors
- Local feature descriptors
- Descriptor matching and indexing
- Projective transformations
- Homography estimation
A word about Divise clustering
HAC is bottom-up, divisive clustering is performed top-down
Classical approach:
- Start with all data
- Apply flat clustering
- Recursively apply the approach on each cluster until some termination
Pros: can have more than 2 sub-trees
Summary of last lecture
Global image descriptors
- Color histogram
- Limited descriptive power
- Which distance function ?
Clustering
- K-means
- Hierarchical Agglomerative Clustering
Local feature detectors
- Image gradients
- Edge detector: Sobel, Canny
- Corner detector: Harris
- Large image gradient in 2 directions
- Corner detector: FAST
- Corner detectors: LoG, DoG, DoH
- Blob detector: MSER
Nest practice sessions
Compute and match descriptors for max. 1 hour (from practice session 2)
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
Play with ORB keypoint mathcing to implement a simple AR technique (practice session 3)
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
Introduction
How are panorama pcitures created from multiple pictures?
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
- Detect small parts invariant under viewpoint change: keypoints
- Find pairs of mathcing keypoints using a description of their neighborhood
- Compute the most likely transformation to blend images together
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
Local feature descriptors
Harris & Stephen conclusion
Harris-Stephens trick to avoid computing eigenvalues:
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
Nowadays linear algebra is cheap, so compute the real eigenvalues.
Thein filter using , being a threshold
This is the Shi-Tomasi variant
Build your own edge/corner detector
We need the eigenvalues and of the structure tensor (hessian matrix with block-wise summing)
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
dst = cv2.cornerEigenValsAndVecs(src, neighborhood_size, sobel_aperture)
dst = cv2.cornerMinEigenVal(src, neighborhood_size, sobel_aperture)
Harris summary
Pros
Translation invariant
- Large gradients in both directions = stable points
Cons
Not so fast
- Avoid to compute all those derivatives
Not scale invariant
- Detect corners at different scales
Corner detectors, binary tests FAST
Features from accelerated segment test (FAST)
Keypoint detector used by ORB
Segment test
Compare pixel intensity with surrounding pixels (circle of 16 pixels)
If contiguous pixels are either:
- all darker than
- all brighter than
then is detected as a corner
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
Tricks
- Cascading
- Machine learning
- How to perform non-maximal suppression
FAST summary
Pros
Very fast
- 20 times faster than Harris
- 40 times faster than DoG
Very robust to transformations (perspective in particular)
Cons
Very sensitive to blur
Corner detectors at different scales LoG, DoG, DoH
Laplacian of Gaussian (LoG)
The theoretical, slow way
Band-pass filter
It detects objects of a certain size
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
Laplacian = second derivative
Like sobel with 1 more derivation
Taylor, again:
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
New filter
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
Laplacian filter
Edge detector, like Sobel but with second derivatives
Image Not Showing
Possible Reasons
- The image file may be corrupted
- The server hosting the image is unavailable
- The image path is incorrect
- The image format is not supported
Learn More →
Laplacian of Gaussian
mexican hat

LoG = detector of circular shapes
Detector of circular shapes:

LoG filter extrema locates "blobs"
- maxima = dark blobs on light background
- minima = light blobs on dark background
Detecting corners/blobs
Build a scale space representation: stack of images (3D) with increasing

Difference of Gaussian (DoG)
Fast approximation of LoG (Used by SIFT)
LoG can be approximate by a Difference of 2 Gaussians (DoG) at different scales

DoG filter
It is a band-pass filter


L'idee est : "est-ce que mes bosses sont comprises entre telles ou telles longueurs d'onde"
DoG filter
Intuition
- Gaussian (g) is a low pass filter

DoG computation in practice
Take an image

Blur it

Take the difference

DoG scale generation trick
DoG computation use "octaves"
- "Octave" because frequency doubles/halves between octaves
- If , then levels per octave
- Downsample images for next octave (like double sized kernel)
- Compute the DoG between images

DoG: Corner selection
Throw out weak responses and edges
Estimate gradients
- Similar to harris, look at nearby responses
- Not whole image, only a few points! faster
- Throw out weak responses
Find cornery things
- Same deal, structure matrix, use dete and trace info (SIFT variant)

Determination of Hessian (DoH)


LoG vs DoG vs DoH

On prefere un "Laplacian of Gaussian" pour detecter les petites etoiles
LoG, DoG DoH summary
Pros
Very robust to transformations
Adjustable size detector
Cons
Slow
Blob detectors MSER
Maximally Stable Extremal Regions (MSER)
Detects regions which are stable over thresholds
- Create min & max-tree of the image
- Tree of included components when thresholdinf the image ar each possible level

- Select most stable regions
- between and
- is maximally stable iif as local minimum at

Summary
Pros
Very robust to transformations
- Affine transformations
- Intensity changes
Quite fast
Cons
Does not support blur
Local fetaure detectors: Conclusion
- Harris Stephens: can be very stable when combined with DoG
- Shi-Tomasi: Assumes affine transformation (avoid it with perspective)
- DoG: slow but very robust (perspective, blur, illumination)
- DoH: faster than DoG, misses small elements, better with perspective
- FAST: very fast, robust to perspective change (like DoG), but blur quickly kills it
- MSER: fast, very stable, good choice when no blur

Introduction
Given som keypoints in image 1, what are the more similar ones in image 2 ?

This is a nearest neighbor problem in descriptor space
This is also a geometrical problem in coordinate space
Matching
Matching problem
Goal: given 2 sets of descriptors, find the best matching pairs
Need a distance/norm: depends on the descriptor
- Distribution (histogram)? Stats?
- Data type ?
- Float, integers: Euclidean, cosine
- Binary: Hamming
1-way matching
For each in the set of descriptors , find the closest element in



Symmetry test aka cross check aka 2-way matching
For each in the set of descriptor , find the closest element in such as is also the closest element to


Ratio test
For each in , find the 2 closest elements and in

Calibrate the ratio
Adjust it on a training set !
For each correct/incorrect match in your annotated database, plot the next to next closest distance PDF.
What is a good ratio in D. Lowe's experiment ?

Geometric validation

Summary

Indexing
Indexing pipeline
Use case: We have a database of images and we want to find an object from it

Bruteforce matching aka linear matching
Simply scan all data and keep the closest elements
Does not scale to large databases, but can be faster on small ones
kD-Trees
binary tree in which every leaf node is a k-dimensional point
FLANN - Efficient indexing
- Original version: hierarchical k-means
- Construction: repetitive k-means on data (then inside clusters) until minimum cluster size is reached
- Lookup: traverse the tree in a best-bin-first manner with backtrack queue, backtrack until enough point are returned
Locally Sensitive Hasing (LSH)
Hash items using family of hash function which project similar items in the same bucket with high probability
Not cryptographic hashing !


En fonction de quel cote notre separatrice se trouve par rapport au point, on met notre bit a 1 ou 0


Approximation de la distance en binaire

Locality Sensitive Hashing (LSH)
- Fast and efficient with large spaces and lot of data
- Return a "good match", maybe not the best one
- kNN can be costly
Which indexing ?
Advices for practice session:



Image Mappings Overview

Math. foundations & assumptions

- For planar surfaces, 3D to 2D perspectives projection reduces to a 2D to a 2D transformation
- This is just a change of coordinate system
- This transformation is invertible
Translation

Scale

Rotation

Notation: Partitioned matrices

Affine

Projective

- Each point in 2D is actually a vector in 3D
- Equivalent up to scaling factor
- Have to normalize to get back to 2D
- Using homogrpahy to project point
- Multiply by
Summary

Homography estimation, Geometric validation
We want to recover H from keypoint matches


How many correspondences are needed ?
Depends on the type of transform:
- How many for translation ?
- For rotation ?
- …
- For general projective transform ?
Reminded: we have 2 knowns for each match

Linear system

Use Linear Least Square to solve

Solve the system

How reliable is the estimate ?

Even worse

Is our data perfect ?

On a pas mal de bruit
Overcoming Least Square Limitations
- We need a robust estimation
RANSAC: RANdom SAmple Consensus


Algorithm


RANSAC works well with extreme noises
