###### tags: `Study notes`
# RANSAC (RANdom SAmple Consensus)
* Basic assumptions:
1. The data can be divided into **inliers** and **outliers**. The distribution of the **inliers** can be explained by some set of model parameters (a line model, a plane model, etc.). **Outliers**, on the other hand, are noise that does not fit the model (a small synthetic example is sketched after this list).
2. Given a small set of inliers, there exists a procedure that can estimate the model parameters that optimally explain or fit these data.
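A minimal sketch of these two assumptions on synthetic data: the line $y = 2x + 1$, the noise level, and the point counts are arbitrary choices for illustration, not part of RANSAC itself.
```python
import numpy as np

rng = np.random.default_rng(0)

# Inliers: points near the line y = 2x + 1 with small Gaussian noise.
x_in = rng.uniform(0, 10, size=80)
y_in = 2 * x_in + 1 + rng.normal(scale=0.2, size=80)

# Outliers: points scattered uniformly, unrelated to the line model.
x_out = rng.uniform(0, 10, size=20)
y_out = rng.uniform(-5, 25, size=20)

# Stack everything into one (N, 2) observation array.
data = np.column_stack([np.concatenate([x_in, x_out]),
                        np.concatenate([y_in, y_out])])
```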
## Overview
1. Select a random subset of the original data. Call this subset the **hypothetical inliers**.
2. Fit a model to the **hypothetical inliers**.
3. Test all other data against the fitted model. Points that fit the estimated model well, according to some model-specific **loss function**, are considered part of the consensus set.
4. Record the number of **inliers** (the size of the consensus set).
5. ***Repeat*** the above procedure.
6. Choose the model parameters with the largest number of **inliers**.
This raises three questions:
1. How many points should be sampled to fit the model?
2. How should the **loss function** or inlier **threshold** be defined?
3. How many times should the procedure be repeated? (A common rule of thumb is sketched below.)
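For question 3, a standard rule (derived in the Wikipedia article referenced below) chooses the iteration count $k$ so that, with probability $p$, at least one of the $k$ random samples of $n$ points contains only inliers, where $w$ is the fraction of inliers in the data: $k = \frac{\log(1 - p)}{\log(1 - w^n)}$. A minimal sketch of this rule; the example values are chosen only for illustration.
```python
import math

def num_iterations(p: float, w: float, n: int) -> int:
    """Number of RANSAC iterations k such that, with probability p, at least
    one of the k random samples of n points contains only inliers,
    assuming an inlier ratio w in the data."""
    return math.ceil(math.log(1 - p) / math.log(1 - w ** n))

# e.g. 95% confidence, 60% inliers, 3-point samples (curve/plane model)
print(num_iterations(p=0.95, w=0.6, n=3))  # -> 13
```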
## RANSAC Algorithm (Pseudocode):
```
Given:
    data – A set of observations.
    model – A model to explain the observed data points.
    n – Minimum number of data points required to estimate the model parameters.
    k – Maximum number of iterations allowed in the algorithm.
    t – Threshold value to determine data points that are fit well by the model.
    d – Number of close data points required to assert that the model fits well to the data.
Return:
    bestFit – Model parameters which best fit the data (or null if no good model is found).

iterations = 0
bestFit = null
bestErr = something really large

while iterations < k do
    maybeInliers := n randomly selected values from data
    maybeModel := model parameters fitted to maybeInliers
    alsoInliers := empty set
    for every point in data not in maybeInliers do
        if point fits maybeModel with an error smaller than t then
            add point to alsoInliers
        end if
    end for
    if the number of elements in alsoInliers is > d then
        // This implies that we may have found a good model;
        // now test how good it is.
        betterModel := model parameters fitted to all points in maybeInliers and alsoInliers
        thisErr := a measure of how well betterModel fits these points
        if thisErr < bestErr then
            bestFit := betterModel
            bestErr := thisErr
        end if
    end if
    increment iterations
end while

return bestFit
```
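The pseudocode maps almost line for line onto NumPy. The sketch below is my own minimal translation (not the code from the GitHub repository linked later); the `fit` and `error` callables are supplied per model, so the same loop can serve the curve and plane models in the next section.
```python
import numpy as np

def ransac(data, fit, error, n, k, t, d, seed=None):
    """Generic RANSAC loop following the pseudocode above.

    data  : (N, dim) array of observations
    fit   : callable, fit(points) -> model parameters
    error : callable, error(model, points) -> per-point error array
    n, k, t, d : as defined in the pseudocode
    """
    rng = np.random.default_rng(seed)
    best_fit, best_err = None, np.inf

    for _ in range(k):
        # 1. Random minimal sample (the hypothetical inliers).
        idx = rng.choice(len(data), size=n, replace=False)
        maybe_inliers = data[idx]
        maybe_model = fit(maybe_inliers)

        # 2. Test the remaining points against the fitted model.
        rest = np.delete(data, idx, axis=0)
        also_inliers = rest[error(maybe_model, rest) < t]

        # 3. If enough points agree, refit on the full consensus set.
        if len(also_inliers) > d:
            consensus = np.vstack([maybe_inliers, also_inliers])
            better_model = fit(consensus)
            this_err = error(better_model, consensus).mean()
            if this_err < best_err:
                best_fit, best_err = better_model, this_err

    return best_fit

# Example: robustly fit a line y = m*x + b (e.g. to the synthetic data above).
def fit_line(points):
    # Least-squares fit of a degree-1 polynomial.
    return np.polyfit(points[:, 0], points[:, 1], 1)

def line_error(model, points):
    # Vertical residual |y - (m*x + b)| for each point.
    return np.abs(points[:, 1] - np.polyval(model, points[:, 0]))

# m, b = ransac(data, fit_line, line_error, n=2, k=100, t=0.5, d=20)
```
Refitting on the full consensus set (the `betterModel` step) generally gives a more stable estimate than keeping the model fitted to the minimal sample alone.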
## Model
### Curve model:
* Curve equation: $y = ax^2 + bx + c$
* Given three points $p_1 = (x_1, y_1),\ p_2 = (x_2, y_2),\ p_3 = (x_3, y_3)$
$\begin{bmatrix}
y_1 \\
y_2 \\
y_3
\end{bmatrix} =
\begin{bmatrix}
x_1^2 & x_1 & 1 \\
x_2^2 & x_2 & 1 \\
x_3^2 & x_3 & 1
\end{bmatrix}
\begin{bmatrix}
a \\
b \\
c
\end{bmatrix}
\rightarrow
\begin{bmatrix}
a \\
b \\
c
\end{bmatrix}=
\begin{bmatrix}
x_1^2 & x_1 & 1 \\
x_2^2 & x_2 & 1 \\
x_3^2 & x_3 & 1
\end{bmatrix}^{-1}
\begin{bmatrix}
y_1 \\
y_2 \\
y_3
\end{bmatrix}$
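As a sanity check, the same 3×3 system can be solved numerically; the three points below are arbitrary examples lying on $y = x^2 - 2x + 3$.
```python
import numpy as np

def fit_parabola(p1, p2, p3):
    """Solve [y1 y2 y3]^T = A [a b c]^T for the curve y = a*x^2 + b*x + c."""
    xs = np.array([p1[0], p2[0], p3[0]])
    ys = np.array([p1[1], p2[1], p3[1]])
    A = np.column_stack([xs ** 2, xs, np.ones(3)])
    return np.linalg.solve(A, ys)   # more stable than forming A^{-1} explicitly

# Three points on y = x^2 - 2x + 3 (example values)
print(fit_parabola((0, 3), (1, 2), (2, 3)))   # -> [ 1. -2.  3.]
```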
### Plane model:
* Plane equation: $ax + by + cz + d = 0 \rightarrow$ estimate $\mathbf{n} = (a, b, c)$ and **$d$**
* Given three points $p_1 = (x_1, y_1, z_1),\ p_2 = (x_2, y_2, z_2),\ p_3 = (x_3, y_3, z_3)$
Let the vectors $\mathbf{v}$, $\mathbf{w}$ and the normal $\mathbf{n}$ be:
$\mathbf{v} = \overrightarrow{p_1 p_2} = p_2 - p_1$
$\mathbf{w} = \overrightarrow{p_1 p_3} = p_3 - p_1$
$\mathbf{n} = \mathbf{v} \times \mathbf{w} = (a, b, c)$
Then, since all three points lie on the plane,
$d = -\mathbf{n} \cdot p_1 = -\mathbf{n} \cdot p_2 = -\mathbf{n} \cdot p_3 = -(a x_1 + b y_1 + c z_1)$
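The same construction in NumPy, again as a small sketch; the three points are arbitrary example values.
```python
import numpy as np

def fit_plane(p1, p2, p3):
    """Estimate (a, b, c, d) of the plane a*x + b*y + c*z + d = 0
    from three non-collinear points."""
    p1, p2, p3 = map(np.asarray, (p1, p2, p3))
    n = np.cross(p2 - p1, p3 - p1)   # normal vector (a, b, c)
    d = -np.dot(n, p1)               # offset so the plane passes through p1
    return np.append(n, d)

# Three points on the plane z = 1, i.e. 0*x + 0*y + 1*z - 1 = 0 (example values)
print(fit_plane((0, 0, 1), (1, 0, 1), (0, 1, 1)))   # -> [ 0  0  1 -1]
```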
## Source Code (Python)
Github: https://github.com/tom13133/ransac
## Reference:
1. Wikipedia (Chinese): https://zh.wikipedia.org/wiki/%E9%9A%A8%E6%A9%9F%E6%8A%BD%E6%A8%A3%E4%B8%80%E8%87%B4
2. Wikipedia (English): https://en.wikipedia.org/wiki/Random_sample_consensus