# Triangulation
<style>
figure {
/* border: 1px #cccccc solid; */
text-align: center;
padding: 4px;
margin: auto;
}
figcaption {
background-color: black;
color: white;
font-style: italic;
padding: 1px;
text-align: center;
}
</style>
:::success
Reconstructing a 3D point from its projections in two or more images.
:::
<figure style="text-align: center">
<img src="https://hackmd.io/_uploads/SyG9QNcwa.png" width=400 />
</figure>
## Two-view Triangulation
<figure style="border: 1px #cccccc solid">
<img src="https://hackmd.io/_uploads/S1iksR8Fp.png" width=400 />
<figcaption> Figure 1
</figcaption>
</figure>
So simple! The 3D point lies at the intersection of the two projection lines.
$$
P = l \times l'
$$
Of course, things are not that easy. In most cases the two lines never intersect because of noise, so we can only find an approximate solution.
### Problem Definition
Referring to **Figure 1**, we use the left camera's coordinate system as the reference coordinate system. We can then derive the two corresponding projection points:
$$
\begin{align}
p &=MP=K[I \ 0]P \\
p' &=M'P=K'[R \ t]P
\end{align}
$$
For simplicity, suppose the two cameras have known intrinsic parameters $K$ and $K'$, respectively. We also know the relative rotation and translation $R, t$ between the cameras.
Our goal is to reconstruct the 3D point $P$ from the information above.
:::info
Given: $K, K', R, t, p, p'$
Find: $P$
:::
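As a minimal sketch of this setup, the snippet below builds $M = K[I \ 0]$ and $M' = K'[R \ t]$ with NumPy and projects a 3D point into both images. The helper names `projection_matrix` and `project`, and all numeric values, are assumptions made for illustration only.

```python
import numpy as np

def projection_matrix(K, R, t):
    """Return the 3x4 projection matrix K [R | t]."""
    return K @ np.hstack([R, t.reshape(3, 1)])

def project(M, P):
    """Project a 3D point P (length 3) to pixel coordinates (u, v)."""
    x = M @ np.append(P, 1.0)   # homogeneous image point
    return x[:2] / x[2]         # divide by the third coordinate

# Hypothetical intrinsics and relative pose, chosen only for illustration.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([-0.5, 0.0, 0.0])

M = projection_matrix(K, np.eye(3), np.zeros(3))  # left camera: K [I | 0]
M_prime = projection_matrix(K, R, t)              # right camera: K' [R | t]

P = np.array([0.2, -0.1, 4.0])
p = project(M, P)
p_prime = project(M_prime, P)
print(p, p_prime)
```

Triangulation is the inverse of this forward model: given `p`, `p_prime`, `M`, and `M_prime`, recover `P`.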
### Approach 1: Linear Method for Triangulation
Write $M$ in terms of its rows. Each image point then yields two constraints, four in total:
$$
M =
\begin{bmatrix}
m_{1} \\
m_{2} \\
m_{3}
\end{bmatrix}, \
p = (u, v), \ p' = (u', v')
$$
$$
\begin{align}
u &= \frac{m_{1}P}{m_{3}P} \Rightarrow m_{1}P - u (m_{3}P) = 0 \\
v &= \frac{m_{2}P}{m_{3}P} \Rightarrow m_{2}P - v (m_{3}P) = 0 \\
u' &= \frac{m_{1}'P}{m_{3}'P} \Rightarrow m_{1}'P - u' (m_{3}'P) = 0 \\
v' &= \frac{m_{2}'P}{m_{3}'P} \Rightarrow m_{2}'P - v' (m_{3}'P) = 0 \\
\end{align}
$$
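With these constraints, we obtain a homogeneous linear system, where $P$ is written in homogeneous coordinates because each row $m_{i}$ has four entries:
$$
AP =
\begin{bmatrix}
m_{1} - um_{3} \\
m_{2} - vm_{3} \\
m_{1}' - u'm_{3}' \\
m_{2}' - v'm_{3}'
\end{bmatrix}
\begin{bmatrix}
P_{x} \\
P_{y} \\
P_{z} \\
1
\end{bmatrix}
= 0
$$
Four constraints, three unknowns: the system is overdetermined and is solved in the least-squares sense, for example by taking the right singular vector of $A$ associated with the smallest singular value and dividing it by its last component.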
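The linear method can be sketched in a few lines of NumPy. This is a minimal illustration, not a production implementation: the function name `triangulate_linear` and the camera values are assumptions, and the solution is taken from the SVD of $A$.

```python
import numpy as np

def triangulate_linear(M, M_prime, p, p_prime):
    """Linear two-view triangulation from 3x4 matrices and pixel points."""
    u, v = p
    u2, v2 = p_prime
    # Each row is one constraint of the form m_i - (coordinate) * m_3.
    A = np.stack([
        M[0] - u * M[2],
        M[1] - v * M[2],
        M_prime[0] - u2 * M_prime[2],
        M_prime[1] - v2 * M_prime[2],
    ])
    # The right singular vector with the smallest singular value
    # minimizes ||A P|| subject to ||P|| = 1.
    _, _, Vt = np.linalg.svd(A)
    P_h = Vt[-1]
    return P_h[:3] / P_h[3]   # back to inhomogeneous coordinates

# Hypothetical cameras: same intrinsics, second camera offset along x.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
M = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
M_prime = K @ np.hstack([np.eye(3), np.array([[-0.5], [0.0], [0.0]])])

# Noise-free projections of the point (0.2, -0.1, 4.0) under these cameras.
P_est = triangulate_linear(M, M_prime, (360.0, 220.0), (260.0, 220.0))
print(P_est)
```

With noisy observations the four rows are no longer exactly consistent, and the SVD returns the algebraic least-squares estimate rather than an exact intersection.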
### Approach 2: Non-linear Method for Triangulation
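Use Gauss-Newton or Levenberg-Marquardt to minimize the squared reprojection error below, where $M\hat{P}$ is understood as the projected image point (after dividing by the third homogeneous coordinate). Please refer to the [CS231A course notes](https://web.stanford.edu/class/cs231a/course_notes/04-stereo-systems.pdf) for the detailed derivation.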
$$
\min\limits_{\hat{P}} \Vert M\hat{P}-p \Vert^{2} + \Vert M'\hat{P}-p '\Vert^{2}
$$
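The following is a rough Gauss-Newton sketch for this objective, assuming an analytic Jacobian of the pinhole projection and a fixed number of iterations. The names `residuals_and_jacobian` and `refine_gauss_newton`, the camera values, and the starting guess are all hypothetical; in practice the linear estimate would be a natural initial value.

```python
import numpy as np

def residuals_and_jacobian(P, cameras, observations):
    """Stack reprojection residuals and their Jacobians w.r.t. P."""
    r, J = [], []
    for M, obs in zip(cameras, observations):
        x = M @ np.append(P, 1.0)          # homogeneous image point
        r.append(x[:2] / x[2] - obs)       # projected minus observed pixel
        # Quotient rule applied to (x0 / x2, x1 / x2).
        J.append((M[:2, :3] * x[2] - np.outer(x[:2], M[2, :3])) / x[2] ** 2)
    return np.concatenate(r), np.vstack(J)

def refine_gauss_newton(P0, cameras, observations, iterations=10):
    """Minimize the summed squared reprojection error, starting from P0."""
    P = np.asarray(P0, dtype=float)
    for _ in range(iterations):
        r, J = residuals_and_jacobian(P, cameras, observations)
        # Gauss-Newton step: least-squares solution of J @ step = -r.
        step, *_ = np.linalg.lstsq(J, -r, rcond=None)
        P = P + step
    return P

# Hypothetical cameras and noise-free observations of (0.2, -0.1, 4.0).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
M = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
M_prime = K @ np.hstack([np.eye(3), np.array([[-0.5], [0.0], [0.0]])])
observations = [np.array([360.0, 220.0]), np.array([260.0, 220.0])]

P_refined = refine_gauss_newton([0.0, 0.0, 3.0], [M, M_prime], observations)
print(P_refined)
```

Because the loop iterates over a list of cameras, the same sketch extends to more than two views. Levenberg-Marquardt would add a damping term to the step computation, which this sketch omits.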

### In Reality
The two approaches above assume that we know $K, K', R, t$ for the cameras. The intrinsic parameters $K, K'$ are usually easy to obtain from the camera specifications. In practice, however, we often do not know the extrinsic parameters $R, t$, so an extra technique has to be employed to obtain them. For example, [obtain $R, t$ from essential matrix](https://hackmd.io/emuCyxF8QQiGdeZkvmpj3g#Goal-2-Decompose-from-essential-matrix).
:::success
- Essential Matrix
    - Given: $p, p', K, K'$
    - Solve: $P, R, t$
- Fundamental Matrix
    - Given: $p, p'$
    - Solve: $P, R, t, K, K'$
- Triangulation
    - Given: $p, p', K, K', R, t$
    - Solve: $P$
:::
## Multi-view Triangulation (Structure from Motion)
Going beyond two views to multi-view triangulation makes things much more complicated. A separate chapter, [Structure from Motion](https://hackmd.io/@jackyyeh/BJxMZtUUT/%2FDMdTrMVBSdyXhk5GsNSXtg), covers this topic and explores how to simultaneously determine both the 3D structure of the scene and the parameters of multiple cameras.
<figure style="text-align: center">
<img src="https://hackmd.io/_uploads/ry12OF496.png" width="400">
</figure>
## References
- https://web.stanford.edu/class/cs231a/course_notes/04-stereo-systems.pdf
- http://luthuli.cs.uiuc.edu/~daf/courses/CV23/Slides/lec18_sfm-daf.pdf