# Perspective-n-Point / PnP Problem <style> figure { /* border: 1px #cccccc solid; */ padding: 4px; margin: auto; text-align: center; } figcaption { background-color: black; color: white; font-style: italic; padding: 1px; text-align: center; } </style> ## Problem Definition <figure> <img src="https://hackmd.io/_uploads/rJGH6n-9T.png" width="500"> </figure> Perspective-n-Point is the problem of estimating the pose of a calibrated camera given a set of n 3D points in the world and their corresponding 2D points in the image. :::success Given: 1. Intrinsic matrix $K$ 2. 2D-3D correspondence points Output: 1. Camera pose $R, t$ ::: In previous [camera calibration post](https://hackmd.io/o0UqmikhQdKBHLiaY5FV-Q?both#Solve-Intrinsic--Extrinsic-Matrix-by-DLT), we introduce how to solve camera intrinsic and extrinsic Matrix by DLT altogether. How about introduce an assumption that intrinsic matrix is known, can we solve extrinsic matrix in a simpler way? Yes! And Perspective-n-Point (PnP) is one of the answer. ## P3P / P3+1P ### Why exactly three? Basically, if we are given enough correspondence points, we can mathematically recover the camera pose. But what is the definition of enough? Math says: only three correspondence points is good! Hence, ==PnP problem becomes P3P problem.== Why exactly 3? This has something to do with **Degrees of Freedom**, which is well explained in this [video](https://youtu.be/0JGC5hZYCVE?t=250). Here is the summary: - P0P: 6 DoFs - P1P: 4 DoFs - P2P: 2 DoFs - P3P: 0 DoFs ### Why P3P = P3+1P? Unfortunately, even though P3P algorithm can reduce the DoFs of solution to zero, it cannot completely remove ambiguity of solution. It still generate four solution candidates. But it is good enough because we can utilize the fourth pair to remove ambiguity. (we will discuss it later) To sum up, the minimum number of correspondence points PnP requires is 3+1. - Three pairs to generate four solution candidates. - One pair to remove ambiguity. :::warning Therefore, **PnP** is often referred to as **P3P**, and some claims **P3+1P** would fit best as a name. Don't be confused! ::: ## How to Solve P3P? Transformation from a given 2D-3D correspondences $(x_{i}, X_{w})$: $$ \overbrace{x_{i}}^\text{unknown} = \overbrace{K}^\text{known} \underbrace{[R \ t]}_\text{unknown} \overbrace{X_{w}}^\text{known} $$ <figure> <img style="border: 1px #cccccc solid;" src="https://hackmd.io/_uploads/HygjQbQ96.png" > <figcaption>tetrahedron geometry</figcaption> </figure> ### Algo - **Step 1:** Compute $\cos \langle \vec{ox_{i}}, \vec{ox_{j}} \rangle$ With intrinsic matrix $K$, we can get three line vectors $\vec{ox_{1}}, \vec{ox_{2}}, \vec{ox_{3}}$. Then compute cosine of inner product between lines by $\cos \langle \vec{ox_{1}}, \vec{ox_{2}} \rangle$, $\cos \langle \vec{ox_{2}}, \vec{ox_{3}} \rangle$, $\cos \langle \vec{ox_{1}}, \vec{ox_{3}} \rangle$. - **Step 2:** Compute distances Get the length from camera center to 3D points by [Law of cosines](https://en.wikipedia.org/wiki/Law_of_cosines) According to tetrahedron geometry, we can get the following equations: $$ \begin{align} \overline{X_{1}X_{2}}^{2} &= \overline{OX_{1}}^{2} + \overline{OX_{2}}^{2} - 2\cos \langle ox_{1}, ox_{2} \rangle \\ \overline{X_{1}X_{3}}^{2} &= \overline{OX_{1}}^{2} + \overline{OX_{3}}^{2} - 2\cos \langle ox_{1}, ox_{3} \rangle \\ \overline{X_{2}X_{3}}^{2} &= \overline{OX_{2}}^{2} + \overline{OX_{3}}^{2} - 2\cos \langle ox_{2}, ox_{3} \rangle \end{align} $$ This step generates four possible solutions as following figure shows. <figure style="text-align: center"> <img src="https://hackmd.io/_uploads/ryboFTrDp.png" width=400 /> </figure> - **Step 3:** Identify correct solution by the 4th point Solve all possible solutions and determine which one aligns most accurately with the 4th point. - **Step 4:** Recover camera pose Recover camera pose by computing coordinate transformation :::success Given: 1. $X_{1}, X_{2}, X_{3}$ camera coordinates 2. $X_{1}, X_{2}, X_{3}$ world coordinates 3. Camera pose for the camera coordinate system: $[I \ 0]$ Output: 1. Camera pose for world coordinate system: $[R \ t]$ ::: This is the problem of aligning two sets of 3D points, commonly known as 3D registration. We do not discuss in details here. Please refer to this [post](https://jingnanshi.com/blog/arun_method_for_3d_reg.html) for explanation. ## PnP with RANSAC P3P methods assume that the data is noise free, but it is nearly impossible in practice. So it is common to apply [RANSAC](https://hackmd.io/6DtvZNO9RJqXrFtQpA4n-g) with PnP method to make the solution robust to outliers in the set of point correspondences. ### Algo Assume there are $N$ correspondence points, where $N \ge 3$: 1. Select 3 points randomly 2. Computer camera pose using P3P algo 3. Count the number of points that support current hypothesis. Go back to step (1) until sufficient trials. 4. Select best solution ## PnP vs. Essential Matrix Essential matrix will be explained in [epipolar geometry post](https://hackmd.io/emuCyxF8QQiGdeZkvmpj3g). - PnP: compute camera pose by using a single image's information. - Essential Matrix: compute camera pose by using stereo image's information. Interestingly, sometimes we combine PnP and essential matrix algorithms to compute a series of camera poses. This method is used to solve 3D-2D visual odometry, which will be discussed in [visual odometry](https://hackmd.io/we7hocqTTOixuLFNE_Dy8g?view) post. ## Advanced Approaches - EPnP: The EPnP algorithm is particularly well-suited for real-time applications, such as augmented reality, where pose estimation must be performed quickly. - SQPnP: SQPnP was described by Terzakis and Lourakis in an ECCV 2020 paper. It is a non-minimal, non-polynomial solver which casts PnP as a non-linear quadratic program. ## References - [Medium post by Rashik Shrestha](https://medium.com/@rashik.shrestha/perspective-n-point-pnp-f2c7dd4ef1ed) - [Projective 3-Point Algorithm using Grunert's Method (Cyrill Stachniss)](https://www.youtube.com/watch?v=N1aCvzFll6Q) - https://github.com/polygon-software/python-visual-odometry/blob/master/Chapter%204%20-%20Camera%20Calibration.ipynb - [多视图几何(下)](https://youtu.be/NbnhJJGVPo8?list=PLFI1Cd4723_SQ-h8W0kT3igOoUCYjWRHG&t=3417)