
# Triangulation

<style>
figure {
    /* border: 1px #cccccc solid; */
    text-align: center;
    padding: 4px;
    margin: auto;
}
figcaption {
    background-color: black;
    color: white;
    font-style: italic;
    padding: 1px;
    text-align: center;
}
</style>

:::success
Reconstructing a 3D point from its projections in two or more images.
:::

<figure style="text-align: center">
    <img src="https://hackmd.io/_uploads/SyG9QNcwa.png" width=400 />
</figure>

## Two-view Triangulation

<figure style="border: 1px #cccccc solid">
    <img src="https://hackmd.io/_uploads/S1iksR8Fp.png" width=400 />
    <figcaption> Figure 1 </figcaption>
</figure>

So simple! The 3D point lies at the intersection of the two projection lines.

$$
P = l \times l'
$$

Of course, things are not that easy. In most cases the two lines never intersect because of noise, so we can only find an approximate solution.

### Problem Definition

For simplicity, suppose we have two cameras with known intrinsic parameters $K$ and $K'$ respectively. We also know the relative orientation and offset $R, t$ of the cameras with respect to each other. Referring to **Figure 1**, we use the left camera's coordinate system as the reference coordinate system. With $P$ written in homogeneous coordinates, the two corresponding projection points are:

$$
\begin{align}
p &= MP = K[I \ 0]P \\
p' &= M'P = K'[R \ t]P
\end{align}
$$

Our goal is to reconstruct the 3D point $P$ from the information above.
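
As a minimal sketch of the projection model above, the following NumPy snippet builds $M = K[I \ 0]$ and $M' = K'[R \ t]$ and projects one 3D point into both images. The helper names (`camera_matrix`, `project`) and all numeric values are hypothetical and chosen only for illustration; the same $K$ is assumed for both cameras.

```python
import numpy as np

def camera_matrix(K, R, t):
    """Build the 3x4 camera matrix M = K [R | t]."""
    return K @ np.hstack([R, np.reshape(t, (3, 1))])

def project(M, P):
    """Project a 3D point P (length 3) to pixel coordinates (u, v)."""
    x = M @ np.append(P, 1.0)   # homogeneous image point (m1 P, m2 P, m3 P)
    return x[:2] / x[2]         # divide by the third coordinate

# Hypothetical intrinsics and relative pose, for illustration only.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)                    # assumed: no relative rotation
t = np.array([-0.5, 0.0, 0.0])   # assumed: pure sideways offset

M = camera_matrix(K, np.eye(3), np.zeros(3))   # left camera:  K [I | 0]
M_prime = camera_matrix(K, R, t)               # right camera: K' [R | t]

P = np.array([0.2, -0.1, 4.0])   # 3D point in the left camera frame
p = project(M, P)
p_prime = project(M_prime, P)
print(p, p_prime)
```

Triangulation is the inverse of this computation: given `p`, `p_prime`, and the two camera matrices, recover `P`.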
:::info
Given: $K, K', R, t, p, p'$
Find: $P$
:::

### Approach 1: Linear Method for Triangulation

Obtain four constraints from the following formulas, where $m_i$ and $m_i'$ are the rows of $M$ and $M'$:

$$
M = \begin{bmatrix} m_{1} \\ m_{2} \\ m_{3} \end{bmatrix}, \ p = (u, v), \ p' = (u', v')
$$

$$
\begin{align}
u &= \frac{m_{1}P}{m_{3}P} \Rightarrow m_{1}P - u (m_{3}P) = 0 \\
v &= \frac{m_{2}P}{m_{3}P} \Rightarrow m_{2}P - v (m_{3}P) = 0 \\
u' &= \frac{m_{1}'P}{m_{3}'P} \Rightarrow m_{1}'P - u' (m_{3}'P) = 0 \\
v' &= \frac{m_{2}'P}{m_{3}'P} \Rightarrow m_{2}'P - v' (m_{3}'P) = 0 \\
\end{align}
$$

Stacking these constraints gives a homogeneous linear system, with $P$ in homogeneous coordinates:

$$
AP =
\begin{bmatrix}
m_{1} - um_{3} \\
m_{2} - vm_{3} \\
m_{1}' - u'm_{3}' \\
m_{2}' - v'm_{3}'
\end{bmatrix}
\begin{bmatrix}
P_{x} \\ P_{y} \\ P_{z} \\ 1
\end{bmatrix}
= 0
$$

Four constraints, three unknowns, problem solved: the system is overdetermined, so we take the least-squares solution, for example via SVD.

### Approach 2: Non-linear Method for Triangulation

Use Gauss-Newton or Levenberg-Marquardt to minimize the squared reprojection error. Please refer to the [CS231A course notes](https://web.stanford.edu/class/cs231a/course_notes/04-stereo-systems.pdf) for the detailed derivation.

$$
\min\limits_{\hat{P}} \Vert M\hat{P}-p \Vert^{2} + \Vert M'\hat{P}-p' \Vert^{2}
$$

![image](https://hackmd.io/_uploads/BJfprzwFa.png)

### In Reality

The two approaches given above assume that we know $K, K', R, t$ of the cameras. The intrinsic parameters $K, K'$ are usually easy to obtain from the camera specifications. However, in practice we usually do not know the extrinsic parameters $R, t$, so an extra technique has to be employed to obtain them. For example, [obtain $R, t$ from essential matrix](https://hackmd.io/emuCyxF8QQiGdeZkvmpj3g#Goal-2-Decompose-from-essential-matrix).
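
Once $K, K', R, t$ are available, the linear method of Approach 1 can be sketched in a few lines of NumPy. This is an illustrative sketch, not a reference implementation: the function names and the synthetic camera setup are hypothetical, and the same $K$ is assumed for both cameras. It stacks the four rows of $A$, takes the right singular vector of the smallest singular value as the homogeneous solution, and then evaluates the squared reprojection error that Approach 2 would minimize.

```python
import numpy as np

def camera_matrix(K, R, t):
    """Build the 3x4 camera matrix M = K [R | t]."""
    return K @ np.hstack([R, np.reshape(t, (3, 1))])

def project(M, P):
    """Project a 3D point P (length 3) to pixel coordinates (u, v)."""
    x = M @ np.append(P, 1.0)
    return x[:2] / x[2]

def triangulate_linear(M, M_prime, p, p_prime):
    """Linear triangulation of one point from two views."""
    u, v = p
    u2, v2 = p_prime
    # One row per constraint: m1 - u*m3, m2 - v*m3, and the same for M'.
    A = np.stack([
        M[0] - u * M[2],
        M[1] - v * M[2],
        M_prime[0] - u2 * M_prime[2],
        M_prime[1] - v2 * M_prime[2],
    ])
    # The last right singular vector minimizes ||A P|| subject to ||P|| = 1.
    _, _, Vt = np.linalg.svd(A)
    P_h = Vt[-1]
    return P_h[:3] / P_h[3]      # back to inhomogeneous coordinates

def reprojection_error(M, M_prime, p, p_prime, P):
    """Sum of squared pixel errors in both images (the Approach 2 cost)."""
    return (np.sum((project(M, P) - p) ** 2)
            + np.sum((project(M_prime, P) - p_prime) ** 2))

# Hypothetical synthetic setup, for illustration only.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
M = camera_matrix(K, np.eye(3), np.zeros(3))
M_prime = camera_matrix(K, np.eye(3), np.array([-0.5, 0.0, 0.0]))

P_true = np.array([0.2, -0.1, 4.0])
p, p_prime = project(M, P_true), project(M_prime, P_true)

P_est = triangulate_linear(M, M_prime, p, p_prime)
err = reprojection_error(M, M_prime, p, p_prime, P_est)
print(P_est, err)
```

With noise-free observations the linear estimate reproduces the original point. With noisy `p` and `p_prime` it is only an approximation, and it is commonly used as the starting point for the non-linear refinement.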
:::success
- Essential Matrix
    - Given: $p, p', K, K'$
    - Solve: $P, R, t$
- Fundamental Matrix
    - Given: $p, p'$
    - Solve: $P, R, t, K, K'$
- Triangulation
    - Given: $p, p', K, K', R, t$
    - Solve: $P$
:::

## Multi-view Triangulation (Structure from Motion)

When we go beyond two views to multi-view triangulation, things become much more complicated. An entire chapter, [Structure from Motion](https://hackmd.io/@jackyyeh/BJxMZtUUT/%2FDMdTrMVBSdyXhk5GsNSXtg), is devoted to this topic. It explores how to simultaneously determine both the 3D structure of the scene and the parameters of multiple cameras.

<figure style="text-align: center">
    <img src="https://hackmd.io/_uploads/ry12OF496.png" width="400">
</figure>

## References

- https://web.stanford.edu/class/cs231a/course_notes/04-stereo-systems.pdf
- http://luthuli.cs.uiuc.edu/~daf/courses/CV23/Slides/lec18_sfm-daf.pdf