# MLRF: Lecture 03
# Agenda for lecture 3
1. Introduction
2. Finish lecture about local feature **detectors**
3. Local feature descriptors
4. Descriptor matching and indexing
5. Projective transformations
6. Homography estimation
# A word about Divisive clustering
HAC is *bottom-up*; divisive clustering is performed *top-down*
Classical approach:
1. Start with all data
2. Apply flat clustering
3. Recursively apply the approach on each cluster until some termination criterion is met (see the sketch below)
Pros: each split can produce more than 2 sub-trees
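A minimal sketch of this classical approach, assuming scikit-learn is available and using k-means as the flat clustering step (the `min_size` criterion and `n_splits=2` are illustrative choices; any flat clustering with more clusters works too):

```python
import numpy as np
from sklearn.cluster import KMeans

def divisive_clustering(X, min_size=10, n_splits=2):
    """Recursively apply flat clustering (k-means) until clusters are small."""
    if len(X) <= min_size:
        return [X]  # termination criterion reached
    labels = KMeans(n_clusters=n_splits, n_init=10).fit_predict(X)
    clusters = []
    for k in range(n_splits):
        clusters += divisive_clustering(X[labels == k], min_size, n_splits)
    return clusters

parts = divisive_clustering(np.random.rand(200, 2))
```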
# Summary of last lecture
Global image descriptors
- Color histogram
- Limited descriptive power
- Which distance function?
Clustering
- K-means
- Hierarchical Agglomerative Clustering
Local feature detectors
- Image gradients
- Edge detector: Sobel, Canny
- Corner detector: Harris
- Large image gradient in 2 directions
- Corner detector: FAST
- Corner detectors: LoG, DoG, DoH
- Blob detector: MSER
# Next practice sessions
**Compute and match descriptors** for max. 1 hour (from practice session 2)

Play with **ORB keypoint matching** to implement a simple AR technique (practice session 3)

# Introduction
*How are panorama pictures created from multiple pictures?*

1. Detect small parts invariant under viewpoint change: **keypoints**
2. Find pairs of matching keypoints using a **description** of their neighborhood
3. Compute the **most likely transformation** to blend images together

# Local feature descriptors
## Harris & Stephens conclusion
Harris-Stephens trick to avoid computing eigenvalues:

:::danger
Nowadays linear algebra is cheap, so **compute the real eigenvalues**.
:::
Then filter using $\min(\lambda_1, \lambda_2) > t$, with $t$ a threshold
:::success
This is the Shi-Tomasi variant
:::
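For reference, a minimal OpenCV sketch of Shi-Tomasi filtering (this is what `cv2.goodFeaturesToTrack` implements by default; the image path is a placeholder):

```python
import cv2

img = cv2.imread("image.png", cv2.IMREAD_GRAYSCALE)
# Keep points whose min(lambda1, lambda2) exceeds qualityLevel times
# the best corner response found in the image
corners = cv2.goodFeaturesToTrack(img, maxCorners=500,
                                  qualityLevel=0.01, minDistance=10)
```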
## Build your own edge/corner detector
We need the eigenvalues $\lambda_1$ and $\lambda_2$ of the structure tensor (the second-moment matrix: gradient outer products summed over a neighborhood)

```python
# Per-pixel eigenvalues (and eigenvectors) of the structure tensor, computed
# over a neighborhood_size window with Sobel derivatives of the given aperture
eigen = cv2.cornerEigenValsAndVecs(src, neighborhood_size, sobel_aperture)
# Only min(lambda1, lambda2): all we need for Shi-Tomasi filtering
min_eigen = cv2.cornerMinEigenVal(src, neighborhood_size, sobel_aperture)
```
## Harris summary
### Pros
Translation invariant
- Large gradients in both directions = stable points
### Cons
Not so fast
- We would like to avoid computing all those derivatives
Not scale invariant
- Detect corners at different *scales*
# Corner detectors, binary tests FAST
## Features from accelerated segment test (FAST)
*Keypoint detector used by ORB*
:::info
**Segment test**
Compare pixel $P$ intensity $I_p$ with surrounding pixels (circle of 16 pixels)
:::
:::success
If $n$ contiguous pixels are either:
- all darker than $I_p-t$
- all brighter than $I_p+t$
then $P$ is detected as a corner
:::
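A minimal OpenCV usage sketch (the `threshold` parameter is the $t$ of the segment test; the image path is a placeholder):

```python
import cv2

img = cv2.imread("image.png", cv2.IMREAD_GRAYSCALE)
fast = cv2.FastFeatureDetector_create(threshold=20, nonmaxSuppression=True)
keypoints = fast.detect(img, None)  # list of cv2.KeyPoint
```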

## Tricks
1. Cascading
   - If $n=12$ ($\frac{3}{4}$ of the circle), testing a few diametrically opposed pixels first (e.g. 1, 9, 5, 13) rejects most candidates early
2. Machine learning
3. How to perform **non-maximal suppression**
## FAST summary
### Pros
Very fast
- 20 times faster than Harris
- 40 times faster than DoG
Very robust to transformations (perspective in particular)
### Cons
Very sensitive to blur
# Corner detectors at different scales LoG, DoG, DoH
## Laplacian of Gaussian (LoG)
> The theoretical, slow way
:::info
**Band-pass filter**
It detects objects of a certain size
:::

### Laplacian = second derivative
Like Sobel, but with one more derivative
Taylor, again: $I(x{+}1) - 2I(x) + I(x{-}1) \approx \frac{\partial^2 I}{\partial x^2}$

New filter: $I_{xx} = \begin{bmatrix}1 & -2 & 1\end{bmatrix} * I$

### Laplacian filter $\nabla^2I(x,y)$
Edge detector, like Sobel but with second derivatives

### Laplacian **of Gaussian**
> The "Mexican hat" filter

### LoG = detector of circular shapes
:::success
LoG filter extrema locate "blobs"
:::
- maxima = dark blobs on light background
- minima = light blobs on dark background
### Detecting corners/blobs
Build a scale space representation: *stack of images (3D) with increasing $\sigma$*
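A minimal NumPy/SciPy sketch of such a scale space, assuming a grayscale float image; the starting $\sigma$ and the number of levels are illustrative:

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

img = np.zeros((64, 64)); img[28:36, 28:36] = 1.0  # synthetic bright blob
sigmas = [1.6 * 2 ** (i / 3) for i in range(9)]
# Scale-normalize by sigma^2 so responses are comparable across scales;
# extrema of this 3D (sigma, y, x) stack locate blobs and their size
stack = np.stack([s ** 2 * gaussian_laplace(img, s) for s in sigmas])
```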

## Difference of Gaussian (DoG)
Fast approximation of LoG (Used by SIFT)
:::info
LoG can be approximated by a difference of 2 Gaussians (DoG) at different scales

:::
### DoG filter
It is a band-pass filter


> The idea is: "are my bumps contained between such and such wavelengths?"
### DoG filter
Intuition
- A Gaussian ($g$) is a low-pass filter
- Subtracting two Gaussians with different $\sigma$ keeps only the frequencies between their cut-offs: a band-pass filter

### DoG computation in practice
1. Take an image
2. Blur it
3. Take the difference
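In code, this could look like the following sketch (the file name and $\sigma$ values are placeholders):

```python
import cv2

img = cv2.imread("image.png", cv2.IMREAD_GRAYSCALE).astype("float32")
sigma, k = 1.6, 2 ** 0.5
g1 = cv2.GaussianBlur(img, (0, 0), sigma)      # low-pass at sigma
g2 = cv2.GaussianBlur(img, (0, 0), k * sigma)  # low-pass at k * sigma
dog = g1 - g2  # band-pass: keeps structures between the two scales
```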

### DoG scale generation trick
:::info
DoG computation uses "octaves"
:::
- "Octave" because frequency doubles/halves between octaves
- If $\sigma=\sqrt{2}$, then $3$ levels per octave
- Downsample images for next octave (like double sized kernel)
- Compute the DoG between images

### DoG: Corner selection
:::info
Throw out weak responses and edges
:::
Estimate gradients
- Similar to Harris, look at nearby responses
- Not the whole image, only a few points! Faster
- Throw out weak responses
Find cornery things
- Same deal: structure matrix, use determinant and trace info (SIFT variant)

## Determinant of Hessian (DoH)
:::info
Faster approximation (used by SURF, via box filters on integral images)
:::


## LoG vs DoG vs DoH

:::success
A "Laplacian of Gaussian" is preferred to detect the small stars
:::
## LoG, DoG, DoH summary
### Pros
Very robust to transformations
- Perspective
- Blur
Adjustable size detector
### Cons
Slow
# Blob detectors MSER
## Maximally Stable Extremal Regions (MSER)
> Detects regions which are stable over thresholds
1. Create min & max-tree of the image
- Tree of included components when thresholding the image at each possible level
- Illustration: the brain has a tumor
2. Select most stable regions
- between $t-\Delta$ and $t+\Delta$
- $R_{t^*}$ is maximally stable iff $q(t)=\vert R_{t+\Delta} \setminus R_{t-\Delta}\vert / \vert R_t\vert$ has a local minimum at $t^*$
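OpenCV ships an MSER implementation; a minimal sketch (parameter values are illustrative, the first one is the $\Delta$ of the stability test; the image path is a placeholder):

```python
import cv2

img = cv2.imread("image.png", cv2.IMREAD_GRAYSCALE)
mser = cv2.MSER_create(5, 60, 14400)  # delta, min region area, max region area
regions, bboxes = mser.detectRegions(img)  # pixel lists + bounding boxes
```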
### Summary
#### Pros
Very robust to transformations
- Affine transformations
- Intensity changes
Quite fast
#### Cons
Does not support blur
# Local feature detectors: Conclusion
- Harris Stephens: can be very stable when combined with DoG
- Shi-Tomasi: Assumes affine transformation (avoid it with perspective)
- DoG: slow but very robust (perspective, blur, illumination)
- DoH: faster than DoG, misses small elements, better with perspective
- FAST: very fast, robust to perspective change (like DoG), but blur quickly kills it
- MSER: fast, very stable, good choice when no blur

# Introduction
> Given some keypoints in image 1, which are the most similar ones in image 2?

:::info
This is a **nearest neighbor problem** in **descriptor space**
:::
:::warning
This is also a **geometrical problem** in **coordinate space**
:::
# Matching
## Matching problem
:::danger
Goal: given 2 sets of descriptors, find the best matching pairs
:::
Need a **distance/norm**: depends on the descriptor
- Distribution (histogram)? Stats?
- Data type?
  - Float, integers: Euclidean, cosine
  - Binary: Hamming
## 1-way matching
:::info
For each $x_i$ in the set of descriptors $D_1$, find the closest element $y_i$ in $D_2$
:::


:::success
We have a match $m(x_i, y_i)$ for each $x_i$
:::
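A minimal OpenCV sketch with ORB descriptors (image paths are placeholders); `crossCheck=False` gives plain 1-way matching:

```python
import cv2

img1 = cv2.imread("img1.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("img2.png", cv2.IMREAD_GRAYSCALE)
orb = cv2.ORB_create()
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=False)  # Hamming for binary descriptors
matches = bf.match(des1, des2)  # one best match per descriptor of des1
```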

## Symmetry test aka cross check aka 2-way matching
:::info
For each $x_i$ in the set of descriptors $D_1$, find the closest element $y_i$ in $D_2$ such that $x_i$ is **also the closest** element to $y_i$
:::
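With OpenCV's brute-force matcher this is a one-flag change (reusing `des1`/`des2` from the previous sketch):

```python
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = bf.match(des1, des2)  # only mutual nearest neighbors survive
```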


## Ratio test
:::info
For each $x_i$ in $D_1$, find the 2 closest elements $y_i$ and $y_j$ in $D_2$; keep the match only if $d(x_i, y_i)$ is much smaller than $d(x_i, y_j)$
:::
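A sketch of this ratio filtering (the 0.75 threshold is a commonly used value, to be calibrated as described next; reuses `des1`/`des2`):

```python
bf = cv2.BFMatcher(cv2.NORM_HAMMING)  # crossCheck must stay False for knnMatch
pairs = bf.knnMatch(des1, des2, k=2)  # 2 nearest neighbors per descriptor
good = [m for m, n in pairs if m.distance < 0.75 * n.distance]
```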

### Calibrate the ratio
:::warning
Adjust it on a training set !
:::
For each correct/incorrect match in your annotated database, plot the PDF of the *closest to next-closest distance* ratio.
**What is a good ratio in D. Lowe's experiment?**

## Geometric validation

## Summary

# Indexing
## Indexing pipeline
Use case: we have a database of images and we want to find which ones contain a given object

## Bruteforce matching aka linear matching
:::info
Simply scan all data and keep the closest elements
:::
Does not scale to large databases, but can be faster on small ones
## kD-Trees
> binary tree in which every leaf node is a k-dimensional point
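A minimal SciPy sketch; note that kD-trees work well in low dimension but degrade as dimensionality grows:

```python
import numpy as np
from scipy.spatial import cKDTree

data = np.random.rand(1000, 2)  # 1000 points in 2D
tree = cKDTree(data)            # build the kD-tree
dists, idx = tree.query(np.random.rand(5, 2), k=1)  # nearest neighbor per query
```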
## FLANN - Efficient indexing
- Original version: hierarchical k-means
- Construction: repeated k-means on the data (then inside each cluster) until a minimum cluster size is reached
- Lookup: traverse the tree in a best-bin-first manner with a backtracking queue; backtrack until enough points are returned
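A minimal OpenCV FLANN sketch for binary descriptors (the LSH index parameters follow values commonly suggested in OpenCV tutorials; reuses `des1`/`des2` from the matching sketches):

```python
import cv2

index_params = dict(algorithm=6,  # FLANN_INDEX_LSH, for binary descriptors
                    table_number=6, key_size=12, multi_probe_level=1)
search_params = dict(checks=50)   # backtracking budget at lookup time
flann = cv2.FlannBasedMatcher(index_params, search_params)
pairs = flann.knnMatch(des1, des2, k=2)
```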
## Locality-Sensitive Hashing (LSH)
:::info
Hash items using a family of hash functions that send similar items to the same bucket with high probability
:::
:::danger
Not cryptographic hashing !
:::


Depending on which side of the separating hyperplane the point falls, we set the corresponding bit to 1 or 0


:::success
A binary approximation of the $\cos$ distance
:::
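A minimal NumPy sketch of this random-hyperplane scheme (dimensions and bit count are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
planes = rng.standard_normal((16, 128))  # 16 random hyperplanes in a 128-D space

def lsh_signature(x):
    # One bit per hyperplane: 1 if x falls on its positive side, else 0
    return (planes @ x > 0).astype(np.uint8)

# Vectors with a small angle get mostly identical bits, so the Hamming
# distance between signatures approximates the cosine distance
sig = lsh_signature(rng.standard_normal(128))
```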

## Locality Sensitive Hashing (LSH)
- Fast and efficient with large spaces and lot of data
- Return a "good match", maybe not the best one
- kNN can be costly
## Which indexing ?
:::success
Experiment
:::
Advice for the practice session:


# Projective transformations
## A linear transformation of pixel coordinates

## Image Mappings Overview

## Math. foundations & assumptions

- For planar surfaces, the 3D-to-2D perspective projection reduces to a 2D-to-2D transformation
- This is just a change of coordinate system
- This transformation is **invertible**
## Translation

## Scale

## Rotation

### Notation: Partitioned matrices

## Affine

## Projective

## More on projective transform
- Each point in 2D is actually a vector in 3D
- Equivalent up to scaling factor
- Have to normalize to get back to 2D
- Using a homography to project a point
- Multiply $\tilde x$ by $\tilde H$
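A worked NumPy sketch (the $\tilde H$ values are arbitrary examples):

```python
import numpy as np

H = np.array([[1.2,  0.1,  30.0],
              [0.0,  1.1,  10.0],
              [1e-4, 2e-4, 1.0]])  # example homography
x = np.array([50.0, 80.0, 1.0])    # 2D point (50, 80) in homogeneous coordinates
xt = H @ x
x2d = xt[:2] / xt[2]  # normalize by the third coordinate to get back to 2D
```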
## Summary

# Homography estimation, Geometric validation
## We want to recover H from keypoint matches

## Recover the parameters of a perspective transform

## How many correspondences are needed?
Depends on the type of transform:
- How many for translation?
- For rotation?
- ...
- For general projective transform?
:::danger
Reminder: each match gives us 2 knowns (its $x$ and $y$ coordinates), hence 2 equations. A transform with $d$ degrees of freedom needs at least $\lceil d/2 \rceil$ matches (e.g. 4 for a homography's 8 DoF).
:::

### Linear system

### Use Linear Least Squares to solve $Ma=b$

### Solve the system
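A sketch for the affine case (6 unknowns, 2 equations per match, so at least 3 matches are needed); the correspondences are made-up values:

```python
import numpy as np

src = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)
dst = np.array([[2, 1], [3, 1], [2, 2], [3.1, 2.05]], dtype=float)

rows, b = [], []
for (x, y), (xp, yp) in zip(src, dst):
    rows.append([x, y, 1, 0, 0, 0]); b.append(xp)  # x' = a11*x + a12*y + tx
    rows.append([0, 0, 0, x, y, 1]); b.append(yp)  # y' = a21*x + a22*y + ty
M = np.array(rows)
a, *_ = np.linalg.lstsq(M, np.array(b), rcond=None)  # (a11, a12, tx, a21, a22, ty)
```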

## How reliable is the estimate?

### Even worse

## Is our data perfect?

We have quite a lot of noise
## Overcoming Least Square Limitations
- We need a **robust estimation**
### RANSAC: RANdom SAmple Consensus


#### Algorithm
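A minimal OpenCV sketch: `cv2.findHomography` with the RANSAC flag fits $H$ on random 4-match samples and keeps the one with the largest inlier set (reuses `kp1`, `kp2`, `good` from the matching sketches; the 3 px reprojection threshold is illustrative):

```python
import cv2
import numpy as np

pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
pts2 = np.float32([kp2[m.trainIdx].pt for m in good])
H, mask = cv2.findHomography(pts1, pts2, cv2.RANSAC, ransacReprojThreshold=3.0)
inliers = mask.ravel().astype(bool)  # which matches agree with the recovered H
```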


## RANSAC works well even with extreme noise
