# Applications of Homography: Reprojection × View Change
==The homography matrix only applies to pinhole imaging; the derivation below works in homogeneous coordinates, so depth need not be considered.==
==If distortion coefficients are taken into account (different angles give different coefficients), the mapping becomes a nonlinear operator and depth must then be considered.==
**Simply put: we already have an image taken from a known viewpoint; we then perform a viewpoint transformation with respect to some plane in the real world, and finally obtain that virtual image!**
Once the camera's intrinsics and extrinsics are known, reprojection means taking the image we captured at a known pose and applying a linear coordinate change to it, yielding the view from a different pose. In other words, we use the existing information to infer what the scene would look like from that viewpoint, so the result will certainly differ from an image actually captured there. Applications include the bird's-eye view in ADAS, the OpenGL LookAt function, and so on.
## Derivation
We capture an image $I_1$ at camera pose $C_1$ and want to reproject it to obtain the image $I_2$ seen from pose $C_2$, where the rigid transformation $(\sideset{^2}{_1}R,\sideset{^2}{_1}t)$ takes the $C_1$ frame to the $C_2$ frame.
<center>
<img width=35% src=https://hackmd.io/_uploads/r18PPGyfn.png>
</center>
First, $X_1$ and $X_2$ both denote the same 3-D world point $X$, expressed in two different camera frames, $C_1$ and $C_2$ respectively. This can be understood as:
$$
X_1=[\sideset{^{1}}{_w}R|\sideset{^{1}}{_w}t]X; \ \ \ \
X_2=[\sideset{^{2}}{_w}R|\sideset{^{2}}{_w}t]X;
$$
From the figure above, the relation between $X_2$ and $X_1$ is:
$$
X_2 = \sideset{^{2}}{_{1}}RX_1+\sideset{^{2}}{_{1}}t
$$
Aside: but... where do $(\sideset{^2}{_1}R , \sideset{^2}{_1}t)$ come from?
:::spoiler Detailed derivation
From [1],
The pinhole camera model consists of the intrinsic matrix $K$ and the extrinsic matrix $\sideset{^{c_1}}{_w}M$; **the extrinsics describe the transformation from world coordinates to camera coordinates**. The full camera model then describes the transformation from world coordinates to image coordinates:
\begin{align*} s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} &= \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \end{bmatrix} \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix} \\ &= \boldsymbol{K} \hspace{0.2em} ^{c_1}\textrm{M}_w \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix} \end{align*}
Rewriting $\sideset{^{c_1}}{_w}M$ in homogeneous coordinates, we have:
\begin{align*} \begin{bmatrix} X_c \\ Y_c \\ Z_c \\ 1 \end{bmatrix} &= \hspace{0.2em} ^{c_1}\textrm{M}_w \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix} \\ &= \begin{bmatrix} ^{c_1}\textrm{R}_w & ^{c_1}\textrm{t}_w \\ 0_{1\times3} & 1 \end{bmatrix} \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix} \\ &= \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix} \end{align*}
To express the 3-D world point $X$ in the $C_2$ camera frame, we need the transformation $^{c_2}\textrm{M}_{c_1}$ from the $C_1$ frame to the $C_2$ frame:
$$
^{c_2}\textrm{M}_{c_1} = \hspace{0.2em} ^{c_2}\textrm{M}_{w} \cdot \hspace{0.1em} ^{w}\textrm{M}_{c_1} = \hspace{0.2em} ^{c_2}\textrm{M}_{w} \cdot \hspace{0.1em} \left( ^{c_1}\textrm{M}_{w} \right )^{-1} = \begin{bmatrix} ^{c_2}\textrm{R}_{w} & ^{c_2}\textrm{t}_{w} \\ 0_{1 \times 3} & 1 \end{bmatrix} \cdot \begin{bmatrix} ^{c_1}\textrm{R}_{w}^T & - \hspace{0.2em} ^{c_1}\textrm{R}_{w}^T \cdot \hspace{0.2em} ^{c_1}\textrm{t}_{w} \\ 0_{1 \times 3} & 1 \end{bmatrix}=\begin{bmatrix} ^{c_2}\textrm{R}_{c_1} & ^{c_2}\textrm{t}_{c_1} \\ 0_{1 \times 3} & 1 \end{bmatrix}
$$
Finally, we obtain:
$$
\sideset{^2}{_1}R = \sideset{^{C_2}}{_{C_1}}R = \hspace{0.2em} ^{c_2}\textrm{R}_{w} \cdot \hspace{0.1em} ^{c_1}\textrm{R}_{w}^{T}
$$
$$
\sideset{^2}{_1}t = \sideset{^{C_2}}{_{C_1}}t = \hspace{0.2em} ^{c_2}\textrm{R}_{w} \cdot \left( - \hspace{0.1em} ^{c_1}\textrm{R}_{w}^{T} \cdot \hspace{0.1em} ^{c_1}\textrm{t}_{w} \right ) + \hspace{0.1em} ^{c_2}\textrm{t}_{w}
$$
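As a minimal sketch, these two formulas translate directly into OpenCV (assuming `R1`, `tvec1`, `R2`, `tvec2` are `CV_64F` `cv::Mat` world-to-camera extrinsics); this is exactly what `computeC2MC1` does in the full program below:
```c++=
// Relative pose (R_1to2, tvec_1to2) between two cameras,
// given world-to-camera extrinsics (R1, tvec1) and (R2, tvec2).
cv::Mat R_1to2 = R2 * R1.t();                       // R_1to2 = c2Rw * c1Rw^T
cv::Mat tvec_1to2 = R2 * (-R1.t() * tvec1) + tvec2; // t_1to2 = c2Rw * (-c1Rw^T * c1tw) + c2tw
```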
***
:::
<br>
We define $N_1=(n_1,n_2,n_3)^T$ as the unit normal of the plane $\pi$ in the $C_1$ camera frame, and $d_1$ as the distance from the point $C_1$ to the plane $\pi$; equivalently, the plane satisfies $n_1X+n_2Y+n_3Z-d_1=0$, which gives:
$$
n_1X+n_2Y+n_3Z=d_1 \\
\implies \frac{N_1^TX_1}{d_1}=1 , \ \ \forall X_1 \in \pi
$$
:::spoiler Distance from a point to a plane
The point $C_1$ is the origin of the $C_1$ camera coordinate frame.
<img width=80% src=https://hackmd.io/_uploads/Hyyfl71Gn.png>
:::
<br>
Next, substituting the relation above, we obtain:
$$
X_2=\sideset{^{2}}{_{1}}RX_1+\sideset{^{2}}{_{1}}t\cdot 1=\sideset{^{2}}{_{1}}RX_1+\sideset{^{2}}{_{1}}t\cdot (\frac{N_1^TX_1}{d_1}) = (\sideset{^{2}}{_{1}}R+\sideset{^{2}}{_{1}}tN_1^T/d_1)X_1
$$
Let $G=(\sideset{^{2}}{_{1}}R+\sideset{^{2}}{_{1}}tN_1^T/d_1)$. The coordinates $X_1$ and $X_2$ above are camera-frame coordinates, but since we already have the image $I_1$, what we really want is the transformation between image coordinates. Define $x_1$ and $x_2$ as the homogeneous coordinates in the image frames of $I_1$ and $I_2$, and let $K$ be the intrinsic matrix; then, with equality holding up to scale:
$$
x_1=KX_1,\\
x_2=KX_2,\\
\implies x_2=KGK^{-1}x_1 = H x_1
$$
Finally, we obtain the homography $H=K(\sideset{^{2}}{_{1}}R+\sideset{^{2}}{_{1}}tN_1^T/d_1)K^{-1}$, which transforms image $I_1$ into $I_2$.
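A minimal OpenCV sketch of this final formula (the names `K`, `R_1to2`, `tvec_1to2`, `normal1`, `d1` are placeholders for the quantities derived above):
```c++=
// H = K * (R + t * N1^T / d1) * K^-1, defined only up to scale.
cv::Mat G = R_1to2 + (tvec_1to2 * normal1.t()) / d1; // Euclidean homography (3x3)
cv::Mat H = K * G * K.inv();                         // homography in pixel coordinates
H /= H.at<double>(2, 2);                             // fix the free scale: H(2,2) = 1
```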
***
We now have the formula, but how do we actually obtain $N_1$ and $d_1$?
Recall the very beginning of this article!
**Simply put: we already have an image taken from a known viewpoint; we then perform a viewpoint transformation with respect to some plane in the real world, and finally obtain that virtual image!**
In fact, $(N_1,d_1)$ is just the equation of that real-world plane $\pi$ expressed in the $C_1$ camera frame ($N_1^TX_1-d_1=0$).
## Application: bird's-eye view
During camera calibration, we usually assume the chessboard lies on the world-frame plane $Z_w=0$. Expressing that plane in the $C_1$ camera frame ($N_1^TX_1-d_1=0$) then yields $(N_1,d_1)$:
$$
N_1=\sideset{^1}{_w}R N_w=\sideset{^1}{_w}R \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \ \ \ \ \ \ \ \
d_1=N_1^TO_1=N_1^T([\sideset{^1}{_w}R|\sideset{^1}{_w}t] \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix})
$$
where $O_1$ is the world origin expressed in the $C_1$ camera frame, and $N_w$ is the normal of the plane $Z_w=0$ in the world frame.
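A minimal sketch of these two formulas (assuming `R` and `tvec` are the world-to-camera extrinsics recovered by `solvePnP`, with `R` obtained from `rvec` via `Rodrigues`):
```c++=
// The chessboard plane Z_w = 0 expressed in the C1 camera frame.
cv::Mat N_w = (cv::Mat_<double>(3, 1) << 0, 0, 1);         // plane normal in the world frame
cv::Mat normal1 = R * N_w;                                 // N1 = c1Rw * N_w
cv::Mat origin1 = R * cv::Mat::zeros(3, 1, CV_64F) + tvec; // O1: world origin in C1, equals tvec
double d1 = normal1.dot(origin1);                          // d1 = N1^T * O1
```
The full program below keeps the reciprocal `d_inv1 = 1/d1` instead, since the homography formula divides by $d_1$.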
***
- issue: it is unclear how OpenCV defines the axis ordering of its coordinate frames — what rotation does `R_desired` actually apply, and what sign conventions do the intrinsics and extrinsics use?
<br>
- Left: input image; right: output image (viewpoint changed to a bird's-eye view).
<center>
<img width=75% src=https://hackmd.io/_uploads/rJVTt71M3.png>
</center>
- C++ code:
```c++=
// OpenCV 4.7.0, adapted from Stack Overflow [3].
#include <iostream>
#include <opencv2/calib3d.hpp>
#include <opencv2/core.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/imgproc.hpp>

using namespace cv;
using namespace std;

void ShowImageInWindow(string name, Mat& src, float ratio = 0.4f)
{
    namedWindow(name, WINDOW_NORMAL);
    imshow(name, src);
    resizeWindow(name, (int)(src.cols * ratio), (int)(src.rows * ratio));
    waitKey(0);
}

namespace
{
enum Pattern { CHESSBOARD, CIRCLES_GRID, ASYMMETRIC_CIRCLES_GRID };

void calcChessboardCorners(Size boardSize, float squareSize, vector<Point3f>& corners, Pattern patternType = CHESSBOARD)
{
    corners.resize(0);
    switch (patternType)
    {
    case CHESSBOARD:
    case CIRCLES_GRID:
        //! [compute-chessboard-object-points]
        for (int i = 0; i < boardSize.height; i++)
            for (int j = 0; j < boardSize.width; j++)
                // To center the chessboard frame, we subtract half the board size
                corners.push_back(Point3f(float((j - boardSize.width / 2) * squareSize),
                                          float((i - boardSize.height / 2) * squareSize), 0));
        //! [compute-chessboard-object-points]
        break;
    case ASYMMETRIC_CIRCLES_GRID:
        for (int i = 0; i < boardSize.height; i++)
            for (int j = 0; j < boardSize.width; j++)
                corners.push_back(Point3f(float((2 * j + i % 2) * squareSize),
                                          float(i * squareSize), 0));
        break;
    default:
        CV_Error(Error::StsBadArg, "Unknown pattern type\n");
    }
}

void computeC2MC1(const Mat& R1, const Mat& tvec1, const Mat& R2, const Mat& tvec2,
                  Mat& R_1to2, Mat& tvec_1to2)
{
    //c2Mc1 = c2Mo * oMc1 = c2Mo * c1Mo.inv()
    R_1to2 = R2 * R1.t();
    tvec_1to2 = R2 * (-R1.t() * tvec1) + tvec2;
}
} //namespace

int main()
{
    Mat img = imread("left02.jpg");
    Mat img_corners = img.clone(), img_pose = img.clone(), img_bird_eye_view = img.clone();
    vector<Point2f> corners;
    Size patternSize(9, 6);
    bool found = findChessboardCorners(img, patternSize, corners);
    drawChessboardCorners(img_corners, patternSize, corners, found);
    imshow("Chessboard corners detection", img_corners);

    vector<Point3f> objectPoints;
    float squareSize = 2.5e-2;
    calcChessboardCorners(patternSize, squareSize, objectPoints);

    FileStorage fs("left_intrinsics.yml", FileStorage::READ);
    Mat intrinsic, distCoeffs;
    fs["camera_matrix"] >> intrinsic;
    fs["distortion_coefficients"] >> distCoeffs;

    Mat rvec, tvec;
    solvePnP(objectPoints, corners, intrinsic, distCoeffs, rvec, tvec);
    drawFrameAxes(img_pose, intrinsic, distCoeffs, rvec, tvec, 2 * squareSize); // replaces the older aruco::drawAxis
    imshow("Pose", img_pose);

    // Desired pose: rotated 90 degrees about the z-axis, facing the chessboard plane head-on
    Mat R_desired = (Mat_<double>(3, 3) <<
        0, 1, 0,
        -1, 0, 0,
        0, 0, 1);
    Mat R;
    Rodrigues(rvec, R);
    cout << R << endl;

    Mat normal = (Mat_<double>(3, 1) << 0, 0, 1); // the plane normal in the world frame
    Mat normal1 = R * normal;                     // the plane normal in camera pose 1
    Mat origin(3, 1, CV_64F, Scalar(0));          // the world origin
    Mat origin1 = R * origin + tvec;              // the world origin in camera pose 1
    double d_inv1 = 1.0 / normal1.dot(origin1);   // 1/d1: inverse distance from camera 1 to the plane

    Mat R_1to2, tvec_1to2;
    Mat tvec_desired = tvec.clone();
    computeC2MC1(R, tvec, R_desired, tvec_desired, R_1to2, tvec_1to2);

    Mat H = R_1to2 + d_inv1 * tvec_1to2 * normal1.t(); // Euclidean homography G = R + t*N1^T/d1
    H = intrinsic * H * intrinsic.inv();               // pixel-space homography H = K*G*K^-1
    H = H / H.at<double>(2, 2);                        // normalize so that H(2,2) = 1
    std::cout << "H:\n" << H << std::endl;

    warpPerspective(img_pose, img_bird_eye_view, H, img.size());
    Mat compare;
    hconcat(img_pose, img_bird_eye_view, compare);
    imshow("Bird eye view", compare);
    waitKey();
    return 0;
}
```
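One caveat on the result: `warpPerspective` is applied here to the original, still-distorted image. As the note at the top of this article points out, the homography model assumes an ideal pinhole camera, so for a lens with noticeable distortion one could first undistort the input (e.g., with `cv::undistort` using the loaded `distCoeffs`) before warping.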
:::spoiler Data required by the program
- left02.jpg

- left_intrinsics.yml
```yaml=
%YAML:1.0
---
nframes: 13
image_width: 640
image_height: 480
board_width: 9
board_height: 6
square_size: 2.5000000372529030e-02
aspectRatio: 1.
flags: 2
camera_matrix: !!opencv-matrix
rows: 3
cols: 3
dt: d
data: [ 5.3591573396163199e+02, 0., 3.4228315473308373e+02, 0.,
5.3591573396163199e+02, 2.3557082909788173e+02, 0., 0., 1. ]
distortion_coefficients: !!opencv-matrix
rows: 5
cols: 1
dt: d
data: [ -2.6637260909660682e-01, -3.8588898922304653e-02,
1.7831947042852964e-03, -2.8122100441115472e-04,
2.3839153080878486e-01 ]
avg_reprojection_error: 3.9259098975581364e-01
per_view_reprojection_errors: !!opencv-matrix
rows: 13
cols: 1
dt: f
data: [ 1.92965463e-01, 1.18204820e+00, 1.73180386e-01,
1.93417311e-01, 1.59574091e-01, 1.79683909e-01, 2.30989486e-01,
2.41952404e-01, 2.96267658e-01, 1.67184874e-01, 2.02002615e-01,
3.81039530e-01, 1.74401343e-01 ]
extrinsic_parameters: !!opencv-matrix
rows: 13
cols: 6
dt: d
data: [ 1.6866673097722978e-01, 2.7567195383689680e-01,
1.3463666677617407e-02, -7.5217911266918208e-02,
-1.0895943925991841e-01, 3.9970206949907272e-01,
4.1331287656496363e-01, 6.4989015618432178e-01,
-1.3371537960145106e+00, -5.8571677080547203e-02,
8.2925805670236566e-02, 3.5381014833230601e-01,
-2.7703695013795054e-01, 1.8693309320100124e-01,
3.5485225341087834e-01, -3.9846501015652937e-02,
-1.0041611109510440e-01, 3.1815947023777164e-01,
-1.1090615673109079e-01, 2.3965970843402720e-01,
-2.1135637810781923e-03, -9.8410654744228568e-02,
-6.7330010965873974e-02, 3.3085237266887146e-01,
-2.9186914919266310e-01, 4.2838824536930098e-01,
1.3127376448141377e+00, 5.8492717894568363e-02,
-1.1531702553211766e-01, 3.1718597226747441e-01,
4.0775746983982769e-01, 3.0372749654555553e-01,
1.6490540383167107e+00, 1.6727077792571535e-01,
-6.5571043573575183e-02, 3.3646131272177648e-01,
1.7933504280050525e-01, 3.4558984092172601e-01,
1.8685292421609112e+00, 1.9533408668697443e-02,
-7.1821904367276174e-02, 3.8942937075181105e-01,
-9.0969163793927624e-02, 4.7978599772080688e-01,
1.7534054022831906e+00, 7.9050417654120575e-02,
-8.7941963150599309e-02, 3.1666076957685929e-01,
2.0297932232462285e-01, -4.2392077549829726e-01,
1.3241327935810543e-01, -6.6346241810532544e-02,
-8.1019305580944570e-02, 2.7830224494208888e-01,
-4.1905731583840156e-01, -4.9969284527936553e-01,
1.3355787183928016e+00, 4.6902734761583582e-02,
-1.1100626108196045e-01, 3.3805630488128308e-01,
-2.3853178487346252e-01, 3.4785724405059820e-01,
1.5307655926865789e+00, 5.0764487316281588e-02,
-1.0259706994505384e-01, 3.2220131320183526e-01,
4.6395663682204152e-01, -2.8347019688901215e-01,
1.2385662249906069e+00, 3.3699309698414767e-02,
-9.1617248179872074e-02, 2.9144614839683858e-01,
-1.6997848268735108e-01, -4.7116903885245226e-01,
1.3459942250907577e+00, 4.5015523494596366e-02,
-1.0817857239600029e-01, 3.1243767202759759e-01 ]
```
:::
## References
1. [OpenCV, Demo 3: Homography from the camera displacement](https://docs.opencv.org/3.4.0/d9/dab/tutorial_homography.html#tutorial_homography_Demo3)
2. [YouTube, 3D Computer Vision | Lecture 4 (Part 1): Robust homography estimation](https://www.youtube.com/watch?v=W8vgVoQdwAM)
3. [StackOverflow, Bird's eye view perspective transformation from camera calibration opencv python](https://stackoverflow.com/questions/48576087/birds-eye-view-perspective-transformation-from-camera-calibration-opencv-python)
4. [Point-to-line distance formula (in Chinese)](https://highscope.ch.ntu.edu.tw/wordpress/?p=47407)
###### tags: `Image Processing`