# Applications of Homography: Reprojection and View Change

==The homography matrix applies only to pinhole imaging. The derivation below works entirely in homogeneous coordinates, so depth never needs to be considered.==

==If the distortion coefficients (which differ with the ray angle) are modeled as a nonlinear operator, depth must be taken into account.==

**In short: we already have an image captured from some known viewpoint; we then change the viewpoint with respect to a plane in the real world, and finally obtain that virtual image!**

Once the camera's intrinsics and extrinsics are known, reprojection means taking the image captured at the current, known pose and applying a linear coordinate transformation to obtain the image seen from another viewpoint. In other words, we use the information at hand to infer what that viewpoint would see, so the result will certainly differ from an image actually captured there. Applications include the bird's-eye view in ADAS and functions such as OpenGL's LookAt.

## Derivation of the Formula

We capture an image $I_1$ at pose $C_1$ and want to reproject it into the image $I_2$ seen at pose $C_2$. The $C_1$ frame is carried to the $C_2$ frame by the rigid transformation $(\sideset{^2}{_1}R,\sideset{^2}{_1}t)$.

<center>
<img width=35% src=https://hackmd.io/_uploads/r18PPGyfn.png>
</center>

First, $X_1$ and $X_2$ are the same 3-D world point $X$ expressed in two different camera coordinate systems, $C_1$ and $C_2$ respectively:
$$
X_1=[\sideset{^{1}}{_w}R|\sideset{^{1}}{_w}t]X; \ \ \ \ X_2=[\sideset{^{2}}{_w}R|\sideset{^{2}}{_w}t]X;
$$
From the figure above, $X_2$ and $X_1$ are related by:
$$
X_2 = \sideset{^{2}}{_{1}}RX_1+\sideset{^{2}}{_{1}}t
$$

An aside: but where do $(\sideset{^2}{_1}R , \sideset{^2}{_1}t)$ come from?

:::spoiler Detailed derivation
From [1], the pinhole camera model consists of the intrinsic matrix $K$ and the extrinsic matrix $\sideset{^{c_1}}{_w}M$; **the extrinsics describe the transformation from world coordinates to camera coordinates**. The full camera model then describes the transformation from world coordinates to image coordinates:

\begin{align*}
s
\begin{bmatrix}
u \\
v \\
1
\end{bmatrix} &=
\begin{bmatrix}
f_x & 0 & c_x \\
0 & f_y & c_y \\
0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
r_{11} & r_{12} & r_{13} & t_x \\
r_{21} & r_{22} & r_{23} & t_y \\
r_{31} & r_{32} & r_{33} & t_z
\end{bmatrix}
\begin{bmatrix}
X_w \\
Y_w \\
Z_w \\
1
\end{bmatrix} \\
&= \boldsymbol{K} \hspace{0.2em} ^{c_1}\textrm{M}_w
\begin{bmatrix}
X_w \\
Y_w \\
Z_w \\
1
\end{bmatrix}
\end{align*}

Redefining $\sideset{^{c_1}}{_w}M$ on homogeneous coordinates, we have:

\begin{align*}
\begin{bmatrix}
X_c \\
Y_c \\
Z_c \\
1
\end{bmatrix} &=
\hspace{0.2em} ^{c_1}\textrm{M}_w
\begin{bmatrix}
X_w \\
Y_w \\
Z_w \\
1
\end{bmatrix} \\
&=
\begin{bmatrix}
^{c_1}\textrm{R}_w & ^{c_1}\textrm{t}_w \\
0_{1\times3} & 1
\end{bmatrix}
\begin{bmatrix}
X_w \\
Y_w \\
Z_w \\
1
\end{bmatrix} \\
&=
\begin{bmatrix}
r_{11} & r_{12} & r_{13} & t_x \\
r_{21} & r_{22} & r_{23} & t_y \\
r_{31} & r_{32} & r_{33} & t_z \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
X_w \\
Y_w \\
Z_w \\
1
\end{bmatrix}
\end{align*}

To express the 3-D world point $X$ in the $C_2$ camera frame, we need to know how to transform from the $C_1$ to the $C_2$ camera frame, i.e. $^{c_2}\textrm{M}_{c_1}$:
$$
^{c_2}\textrm{M}_{c_1} = \hspace{0.2em} ^{c_2}\textrm{M}_{w} \cdot \hspace{0.1em} ^{w}\textrm{M}_{c_1} = \hspace{0.2em} ^{c_2}\textrm{M}_{w} \cdot \hspace{0.1em} \left( ^{c_1}\textrm{M}_{w} \right )^{-1} =
\begin{bmatrix}
^{c_2}\textrm{R}_{w} & ^{c_2}\textrm{t}_{w} \\
0_{1 \times 3} & 1
\end{bmatrix} \cdot
\begin{bmatrix}
^{c_1}\textrm{R}_{w}^T & - \hspace{0.2em} ^{c_1}\textrm{R}_{w}^T \cdot \hspace{0.2em} ^{c_1}\textrm{t}_{w} \\
0_{1 \times 3} & 1
\end{bmatrix}=
\begin{bmatrix}
^{c_2}\textrm{R}_{c_1} & ^{c_2}\textrm{t}_{c_1} \\
0_{1 \times 3} & 1
\end{bmatrix}
$$
Finally we obtain:
$$
\sideset{^2}{_1}R = \sideset{^{C_2}}{_{C_1}}R = \hspace{0.2em} ^{c_2}\textrm{R}_{w} \cdot \hspace{0.1em} ^{c_1}\textrm{R}_{w}^{T}
$$
$$
\sideset{^2}{_1}t = \sideset{^{C_2}}{_{C_1}}t = \hspace{0.2em} ^{c_2}\textrm{R}_{w} \cdot \left( - \hspace{0.1em} ^{c_1}\textrm{R}_{w}^{T} \cdot \hspace{0.1em} ^{c_1}\textrm{t}_{w} \right ) + \hspace{0.1em} ^{c_2}\textrm{t}_{w}
$$
***
:::

<br>

Let $N_1=(n_1,n_2,n_3)^T$ be the unit normal of the plane $\pi$ in the $C_1$ camera frame, and let $d_1$ be the distance from the camera center $C_1$ to $\pi$; equivalently, $\pi$ satisfies the plane equation $n_1X+n_2Y+n_3Z-d_1=0$, and therefore:
$$
n_1X+n_2Y+n_3Z=d_1 \\
\implies \frac{N_1^TX_1}{d_1}=1 , \ \ \forall X_1 \in \pi
$$

:::spoiler Distance from a point to a plane
The point $C_1$ is the origin of the $C_1$ camera coordinate system. ([4] derives the analogous point-to-line distance formula.)
<img width=80% src=https://hackmd.io/_uploads/Hyyfl71Gn.png>
:::

<br>

Next, substituting the identity above into the rigid-body relation:
$$
X_2=\sideset{^{2}}{_{1}}RX_1+\sideset{^{2}}{_{1}}t\cdot 1=\sideset{^{2}}{_{1}}RX_1+\sideset{^{2}}{_{1}}t\cdot (\frac{N_1^TX_1}{d_1}) = (\sideset{^{2}}{_{1}}R+\sideset{^{2}}{_{1}}tN_1^T/d_1)X_1
$$
Let $G=(\sideset{^{2}}{_{1}}R+\sideset{^{2}}{_{1}}tN_1^T/d_1)$. Both $X_1$ and $X_2$ are camera coordinates, but since we already have the image $I_1$, what we really want is the transformation between image coordinates. Let $x_1$ and $x_2$ be the coordinates in the image frames of $I_1$ and $I_2$, and let $K$ be the intrinsic matrix (equalities between homogeneous coordinates hold up to a scale factor):
$$
x_1=KX_1,\\
x_2=KX_2,\\
\implies x_2=KGK^{-1}x_1 = H x_1
$$
Finally, we obtain the homography $H=K(\sideset{^{2}}{_{1}}R+\sideset{^{2}}{_{1}}tN_1^T/d_1)K^{-1}$ that warps image $I_1$ into $I_2$.
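To make the formula concrete, here is a minimal, self-contained sketch (not part of the original article) that computes $(\sideset{^2}{_1}R, \sideset{^2}{_1}t)$ from two world-to-camera poses and then assembles $H$ exactly as derived. All numeric values (the intrinsics, the two poses, and the plane) are invented purely for illustration.

```c++=
// Minimal sketch of the derivation above. All numeric values are hypothetical.
#include <iostream>
#include <opencv2/core.hpp>

using namespace cv;

int main()
{
    // Hypothetical intrinsic matrix K.
    Mat K = (Mat_<double>(3, 3) << 500, 0, 320,
                                     0, 500, 240,
                                     0,   0,   1);

    // Hypothetical world-to-camera extrinsics of the two poses.
    Mat R1 = Mat::eye(3, 3, CV_64F), t1 = (Mat_<double>(3, 1) << 0, 0, 1);
    Mat R2 = Mat::eye(3, 3, CV_64F), t2 = (Mat_<double>(3, 1) << 0.1, 0, 1);

    // Relative pose: 2R1 = R2 * R1^T,  2t1 = R2 * (-R1^T * t1) + t2.
    Mat R_1to2 = R2 * R1.t();
    Mat t_1to2 = R2 * (-R1.t() * t1) + t2;

    // Hypothetical plane in camera-1 coordinates: unit normal N1, distance d1.
    Mat N1 = (Mat_<double>(3, 1) << 0, 0, 1);
    double d1 = 1.0;

    // H = K (2R1 + 2t1 * N1^T / d1) K^{-1}, normalized so that H(2,2) = 1.
    Mat H = K * (R_1to2 + t_1to2 * N1.t() / d1) * K.inv();
    H = H / H.at<double>(2, 2);
    std::cout << "H =\n" << H << std::endl;
    return 0;
}
```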
***

Although we now have the formula, the question remains: where do $N_1$ and $d_1$ come from? Recall the very beginning of this article!

**In short: we already have an image captured from some known viewpoint; we then change the viewpoint with respect to a plane in the real world, and finally obtain that virtual image!**

In fact, $(N_1, d_1)$ is simply the real-world plane $\pi$ expressed in the $C_1$ camera frame as the plane equation $N_1^TX_1-d_1=0$.

## Application: Bird's-Eye View

During camera calibration we usually assume that the chessboard lies on the plane $Z_w=0$ of the world coordinate system. Rewriting that plane in the $C_1$ camera frame as $N_1^TX_1-d_1=0$ then gives $(N_1,d_1)$:
$$
N_1=\sideset{^1}{_w}R N_w=\sideset{^1}{_w}R
\begin{bmatrix}
0 \\
0 \\
1
\end{bmatrix}, \ \ \ \ \ \ \ \ d_1=N_1^TO_1=N_1^T([\sideset{^1}{_w}R|\sideset{^1}{_w}t]
\begin{bmatrix}
0 \\
0 \\
0 \\
1
\end{bmatrix})
$$
where $O_1$ is the world origin expressed in the $C_1$ camera frame and $N_w$ is the normal of the plane $Z_w=0$ in world coordinates.
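In OpenCV terms these two formulas amount to only a few lines. The fragment below is a sketch, not a complete program: `R` (the 3×3 rotation) and `tvec` are assumed to come from `solvePnP` plus `Rodrigues`, and essentially the same lines reappear in the full listing further down.

```c++=
// Fragment: compute (N1, d1) of the chessboard plane in camera frame C1.
// Assumes R and tvec were obtained from solvePnP + Rodrigues (see listing below).
Mat normal = (Mat_<double>(3, 1) << 0, 0, 1); // N_w: normal of the Z_w = 0 plane
Mat normal1 = R * normal;                     // N1 = 1Rw * N_w
Mat origin(3, 1, CV_64F, Scalar(0));          // world origin
Mat origin1 = R * origin + tvec;              // O1: world origin in frame C1
double d1 = normal1.dot(origin1);             // d1 = N1^T * O1
```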
***

- Open issue: how OpenCV defines the ordering of the coordinate axes, i.e. what angle `R_desired` actually rotates by, and the sign conventions of the intrinsic and extrinsic parameters.

<br>

- Left: the input image; right: the output image (viewpoint changed to a bird's-eye view).

<center>
<img width=75% src=https://hackmd.io/_uploads/rJVTt71M3.png>
</center>

- C++ program (OpenCV 4.7.0, adapted from Stack Overflow [3]):

```c++=
// OpenCV 4.7.0, adapted from Stack Overflow [3].
#include <iostream>
#include <string>
#include <vector>

#include <opencv2/calib3d.hpp>
#include <opencv2/core.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/imgproc.hpp>

using namespace cv;
using namespace std;

// Helper to show an image in a resizable window (not used in main()).
void ShowImageInWindow(string name, Mat& src, float ratio = 0.4f)
{
    namedWindow(name, WINDOW_NORMAL);
    imshow(name, src);
    resizeWindow(name, (int)(src.cols * ratio), (int)(src.rows * ratio));
    waitKey(0);
}

namespace
{
    enum Pattern { CHESSBOARD, CIRCLES_GRID, ASYMMETRIC_CIRCLES_GRID };

    // Generate the 3-D object points of the calibration pattern (on the Z_w = 0 plane).
    void calcChessboardCorners(Size boardSize, float squareSize, vector<Point3f>& corners,
        Pattern patternType = CHESSBOARD)
    {
        corners.resize(0);
        switch (patternType)
        {
        case CHESSBOARD:
        case CIRCLES_GRID:
            //! [compute-chessboard-object-points]
            for (int i = 0; i < boardSize.height; i++)
                for (int j = 0; j < boardSize.width; j++)
                    // To center the chessboard frame, subtract half the board size.
                    corners.push_back(Point3f(float((j - boardSize.width / 2) * squareSize),
                        float((i - boardSize.height / 2) * squareSize), 0));
            //! [compute-chessboard-object-points]
            break;
        case ASYMMETRIC_CIRCLES_GRID:
            for (int i = 0; i < boardSize.height; i++)
                for (int j = 0; j < boardSize.width; j++)
                    corners.push_back(Point3f(float((2 * j + i % 2) * squareSize),
                        float(i * squareSize), 0));
            break;
        default:
            CV_Error(Error::StsBadArg, "Unknown pattern type\n");
        }
    }

    // Relative pose: c2Mc1 = c2Mo * oMc1 = c2Mo * c1Mo.inv()
    void computeC2MC1(const Mat& R1, const Mat& tvec1, const Mat& R2, const Mat& tvec2,
        Mat& R_1to2, Mat& tvec_1to2)
    {
        R_1to2 = R2 * R1.t();
        tvec_1to2 = R2 * (-R1.t() * tvec1) + tvec2;
    }
} // namespace

int main()
{
    Mat img = imread("left02.jpg");
    Mat img_corners = img.clone(), img_pose = img.clone(), img_bird_eye_view = img.clone();

    vector<Point2f> corners;
    Size patternSize(9, 6);
    bool found = findChessboardCorners(img, patternSize, corners);
    drawChessboardCorners(img_corners, patternSize, corners, found);
    imshow("Chessboard corners detection", img_corners);

    vector<Point3f> objectPoints;
    float squareSize = 2.5e-2f;
    calcChessboardCorners(patternSize, squareSize, objectPoints);

    FileStorage fs("left_intrinsics.yml", FileStorage::READ);
    Mat intrinsic, distCoeffs;
    fs["camera_matrix"] >> intrinsic;
    fs["distortion_coefficients"] >> distCoeffs;

    // Pose of the board in camera frame C1.
    Mat rvec, tvec;
    solvePnP(objectPoints, corners, intrinsic, distCoeffs, rvec, tvec);
    drawFrameAxes(img_pose, intrinsic, distCoeffs, rvec, tvec, 2 * squareSize);
    imshow("Pose", img_pose);

    // Desired pose: the world-to-camera rotation is a pure rotation about the
    // z-axis, so the camera looks straight down at the Z_w = 0 plane (bird's-eye view).
    Mat R_desired = (Mat_<double>(3, 3) <<
        0, 1, 0,
        -1, 0, 0,
        0, 0, 1);

    Mat R;
    Rodrigues(rvec, R);
    cout << R << endl;

    Mat normal = (Mat_<double>(3, 1) << 0, 0, 1); // Plane normal in the world frame.
    Mat normal1 = R * normal;                     // Plane normal N1 in camera frame C1.

    Mat origin(3, 1, CV_64F, Scalar(0));          // World origin.
    Mat origin1 = R * origin + tvec;              // World origin O1 in camera frame C1.
    double d_inv1 = 1.0 / normal1.dot(origin1);   // 1 / d1.

    Mat R_1to2, tvec_1to2;
    Mat tvec_desired = tvec.clone();
    computeC2MC1(R, tvec, R_desired, tvec_desired, R_1to2, tvec_1to2);

    // H = K (R + t * N1^T / d1) K^{-1}, normalized so that H(2,2) = 1.
    Mat H = R_1to2 + d_inv1 * tvec_1to2 * normal1.t();
    H = intrinsic * H * intrinsic.inv();
    H = H / H.at<double>(2, 2);
    std::cout << "H:\n" << H << std::endl;

    // Note: lens distortion is ignored here; H assumes an ideal pinhole image.
    warpPerspective(img_pose, img_bird_eye_view, H, img.size());
    Mat compare;
    hconcat(img_pose, img_bird_eye_view, compare);
    imshow("Bird eye view", compare);
    waitKey();

    return 0;
}
```
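As a sanity check (this part is not in the original program), the homography computed from the camera displacement can be compared against one estimated directly from point correspondences with `findHomography`: project the board corners under both poses and fit a homography between the two point sets. The sketch below is meant to be appended at the end of `main()` above; distortion is zeroed out because the homography model assumes an ideal pinhole camera, and the two matrices should agree up to scale.

```c++=
// Sanity check (append at the end of main() above): estimate H directly from
// projected point correspondences and compare it with the displacement-based H.
vector<Point2f> pts1, pts2;
Mat rvec_desired;
Rodrigues(R_desired, rvec_desired);
// Zero distortion: the homography models an ideal pinhole camera.
projectPoints(objectPoints, rvec, tvec, intrinsic, Mat(), pts1);
projectPoints(objectPoints, rvec_desired, tvec_desired, intrinsic, Mat(), pts2);
Mat H_check = findHomography(pts1, pts2);
H_check = H_check / H_check.at<double>(2, 2);
cout << "H (from point correspondences):\n" << H_check << endl;
```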
:::spoiler Data required by the program
- left02.jpg

![](https://hackmd.io/_uploads/HkgYvm1M3.jpg)

- left_intrinsics.yml

```yaml=
%YAML:1.0
---
nframes: 13
image_width: 640
image_height: 480
board_width: 9
board_height: 6
square_size: 2.5000000372529030e-02
aspectRatio: 1.
flags: 2
camera_matrix: !!opencv-matrix
   rows: 3
   cols: 3
   dt: d
   data: [ 5.3591573396163199e+02, 0., 3.4228315473308373e+02,
       0., 5.3591573396163199e+02, 2.3557082909788173e+02,
       0., 0., 1. ]
distortion_coefficients: !!opencv-matrix
   rows: 5
   cols: 1
   dt: d
   data: [ -2.6637260909660682e-01, -3.8588898922304653e-02,
       1.7831947042852964e-03, -2.8122100441115472e-04,
       2.3839153080878486e-01 ]
avg_reprojection_error: 3.9259098975581364e-01
per_view_reprojection_errors: !!opencv-matrix
   rows: 13
   cols: 1
   dt: f
   data: [ 1.92965463e-01, 1.18204820e+00, 1.73180386e-01,
       1.93417311e-01, 1.59574091e-01, 1.79683909e-01, 2.30989486e-01,
       2.41952404e-01, 2.96267658e-01, 1.67184874e-01, 2.02002615e-01,
       3.81039530e-01, 1.74401343e-01 ]
extrinsic_parameters: !!opencv-matrix
   rows: 13
   cols: 6
   dt: d
   data: [ 1.6866673097722978e-01, 2.7567195383689680e-01,
       1.3463666677617407e-02, -7.5217911266918208e-02,
       -1.0895943925991841e-01, 3.9970206949907272e-01,
       4.1331287656496363e-01, 6.4989015618432178e-01,
       -1.3371537960145106e+00, -5.8571677080547203e-02,
       8.2925805670236566e-02, 3.5381014833230601e-01,
       -2.7703695013795054e-01, 1.8693309320100124e-01,
       3.5485225341087834e-01, -3.9846501015652937e-02,
       -1.0041611109510440e-01, 3.1815947023777164e-01,
       -1.1090615673109079e-01, 2.3965970843402720e-01,
       -2.1135637810781923e-03, -9.8410654744228568e-02,
       -6.7330010965873974e-02, 3.3085237266887146e-01,
       -2.9186914919266310e-01, 4.2838824536930098e-01,
       1.3127376448141377e+00, 5.8492717894568363e-02,
       -1.1531702553211766e-01, 3.1718597226747441e-01,
       4.0775746983982769e-01, 3.0372749654555553e-01,
       1.6490540383167107e+00, 1.6727077792571535e-01,
       -6.5571043573575183e-02, 3.3646131272177648e-01,
       1.7933504280050525e-01, 3.4558984092172601e-01,
       1.8685292421609112e+00, 1.9533408668697443e-02,
       -7.1821904367276174e-02, 3.8942937075181105e-01,
       -9.0969163793927624e-02, 4.7978599772080688e-01,
       1.7534054022831906e+00, 7.9050417654120575e-02,
       -8.7941963150599309e-02, 3.1666076957685929e-01,
       2.0297932232462285e-01, -4.2392077549829726e-01,
       1.3241327935810543e-01, -6.6346241810532544e-02,
       -8.1019305580944570e-02, 2.7830224494208888e-01,
       -4.1905731583840156e-01, -4.9969284527936553e-01,
       1.3355787183928016e+00, 4.6902734761583582e-02,
       -1.1100626108196045e-01, 3.3805630488128308e-01,
       -2.3853178487346252e-01, 3.4785724405059820e-01,
       1.5307655926865789e+00, 5.0764487316281588e-02,
       -1.0259706994505384e-01, 3.2220131320183526e-01,
       4.6395663682204152e-01, -2.8347019688901215e-01,
       1.2385662249906069e+00, 3.3699309698414767e-02,
       -9.1617248179872074e-02, 2.9144614839683858e-01,
       -1.6997848268735108e-01, -4.7116903885245226e-01,
       1.3459942250907577e+00, 4.5015523494596366e-02,
       -1.0817857239600029e-01, 3.1243767202759759e-01 ]
```
:::

## Reference
1. [OpenCV, Demo 3: Homography from the camera displacement](https://docs.opencv.org/3.4.0/d9/dab/tutorial_homography.html#tutorial_homography_Demo3)
2. [YouTube, 3D Computer Vision | Lecture 4 (Part 1): Robust homography estimation](https://www.youtube.com/watch?v=W8vgVoQdwAM)
3. [Stack Overflow, Bird's eye view perspective transformation from camera calibration opencv python](https://stackoverflow.com/questions/48576087/birds-eye-view-perspective-transformation-from-camera-calibration-opencv-python)
4. [Distance from a point to a line (formula)](https://highscope.ch.ntu.edu.tw/wordpress/?p=47407)

###### tags: `Image Processing`